- ·Future of AI·3 min read
Agents That Pay for Themselves: The Economic Loop That Will Redefine Software in 2027
Most agents lose money on every call. A handful have started covering their own inference cost by generating measurable value per run. The structural difference is worth understanding before you build the next one.
- ·RAG·2 min read
From RAG to Production — Observability, Cost Controls, and the Reality No Demo Shows
Everything the tutorials skip. The instrumentation, the kill switches, and the 3am pager habits that turn a RAG demo into something you can keep on call.
- ·Vibe Coding·3 min read
The Vibe Coder's Dilemma: When to Read the Code You Just Generated
Reading every AI-generated line is slow. Reading none is reckless. The honest answer is "it depends," and the dependency is more predictable than people admit.
- ·Spec-Driven Development·3 min read
OpenSpec in a Monorepo: Keeping AI-Generated Code Consistent Across 12 Services
Specs work great for one feature. They start to fight each other in a monorepo with a dozen services, each prompting AI tools with its own conventions. Here's how to share specs without making every team write YAML by hand.
- ·Microsoft Agent Framework·3 min read
From Prototype to Production: Deploying Microsoft Agent Framework on Azure
Your console app works on a developer laptop. Production needs auth, telemetry, secrets, scaling, and a way to deploy without `dotnet run`. Here's the smallest Azure setup that actually works.
- ·RAG·2 min read
GraphRAG in .NET — When Vector Search Can't Reason Across Documents
Vector search is great at "find me a relevant chunk." It's bad at "find me chunks that mention X and Y in a particular relationship." That's what GraphRAG is for.
- ·LLM Integration·2 min read
Caching LLM Responses: When It Saves Money and When It Silently Breaks UX
Caching LLM calls is irresistible. Same input, same output, free. Except outputs aren't always supposed to be the same, and a stale cache hit looks like a broken product.
- ·Vibe Coding·3 min read
I Vibe-Coded a SaaS in a Weekend. Here's What Broke in Week Two.
Saturday: I shipped a working SaaS with AI tools and almost no manual code. Sunday: 14 paying users. The following Wednesday: my Stripe webhook had been creating duplicate subscriptions for 36 hours.
- ·RAG·2 min read
The Hidden Cost of Re-Ranking: Benchmarking Cross-Encoders in Production RAG
Cross-encoder rerankers boost recall on paper. In production, they doubled our p95 latency and the lift didn't show up in user metrics. The benchmark we wish we'd run first.
- ·Spec-Driven Development·3 min read
Why I Stopped Writing Prompts and Started Writing Specs (with OpenSpec)
A year of writing increasingly clever prompts ended with the same problem on every feature. I could not reliably reproduce what the AI built yesterday. Specs solved it. Prompts didn't.
- ·RAG·2 min read
RAG Caching Strategies — Semantic Caching, Embedding Reuse, and the Cost Math
Three layers of caching that cut 60-80% of your LLM bill in a busy RAG system. Plus the one cache that will silently break your UX.
- ·Microsoft Agent Framework·3 min read
Building a Multi-Agent Workflow with Microsoft Agent Framework in C#
A single agent can do a lot. Two agents that hand off cleanly can do more, and you don't have to invent message routing to make it work.
- ·LLM Integration·3 min read
The Feature Flag Playbook for Rolling Out LLM Features Safely
LLM features fail differently to normal code. They get slow, they get expensive, they get weird. Three flag patterns that let you ship them without a 3am rollback.
- ·Future of AI·2 min read
Small Models, Big Impact: The Quiet Shift From Frontier Models to Specialized Ones
Two years of "use the biggest model you can afford" is ending. Smaller, specialized models are quietly matching frontier performance on the narrow tasks that pay the bills.
- ·Spec-Driven Development·3 min read
Spec-Driven Development with OpenSpec: Killing the 'It Works on My Machine' of AI Coding
AI coding tools generate code that compiles, looks right, and quietly disagrees with what you actually wanted. Specs in version control turn vibes into contracts.
- ·RAG·3 min read
From Naive RAG to Agentic RAG: A Migration Story in 5 Steps
Naive RAG works fine until your users start asking compound questions. Here's how we turned a single-shot retriever into something that plans, retrieves, and verifies. Without a rewrite.
- ·RAG·2 min read
Building an Agentic RAG with Microsoft Semantic Kernel
When single-shot retrieval isn't enough, an agent that decides whether to retrieve — and critiques its own answer — earns the extra latency. Here's how it looks in SK.
- ·Vibe Coding·3 min read
Vibe Coding Isn't a Skill Issue — It's a Verification Issue
The popular dunk on "vibe coders" is that they can't read code. The real problem is they can't verify it. Two different failures, two very different fixes.
- ·Microsoft Agent Framework·2 min read
Microsoft Agent Framework vs Semantic Kernel: What Changed and Why It Matters
If you bet on Semantic Kernel a year ago, you noticed Microsoft quietly drifting toward "Agent Framework." Here's what actually changed, what carries over, and where the migration hurts.
- ·LLM Integration·3 min read
Streaming LLM Responses Through Your Existing REST API: Patterns That Actually Work
Your frontend wants tokens as they generate. Your API gateway only speaks JSON. Here's how to bridge the two without rewriting the stack you spent two years getting stable.
- ·Future of AI·2 min read
The Post-Chat Era: Why Conversational UIs Are a Local Maximum
Every AI product has a chat box. Most of them shouldn't. Chat is the worst interface for most LLM use cases. We just defaulted there because OpenAI shipped it first.
- ·RAG·2 min read
Evaluating RAG Systems — RAGAS, Faithfulness, and Setting Up an Eval Harness in .NET
"Looks right to me" is not evaluation. Four metrics that catch regressions before users do, and how to wire them into your test suite.
- ·RAG·3 min read
Hybrid Search in RAG: When Vector Similarity Alone Isn't Enough
Pure vector search misses exact-match queries: product SKUs, error codes, function names. Hybrid search fixes that without giving up the semantic recall you actually like.
- ·LLM Integration·3 min read
Bolting an LLM Onto a Legacy .NET App Without Breaking Production
You've got a 12-year-old .NET Framework app, an SLA, and a director who wants AI features by Q3. Here's how to bolt on an LLM without touching the monolith.
- ·RAG·2 min read
Hybrid Search in RAG with Azure AI Search and BM25 — The .NET Implementation
Vector search alone misses product codes, error messages, and proper nouns. Hybrid search with Reciprocal Rank Fusion fixes that. Here's how it looks in C#.
- ·RAG·3 min read
Why Your RAG Pipeline Returns Garbage (And It's Probably Your Chunking)
Your retriever pulls the right documents and the answers still come out wrong. Nine times out of ten the problem isn't where you're looking. It's upstream.
- ·RAG·2 min read
Kernel Memory in .NET — Microsoft's Out-of-the-Box RAG Service Reviewed
Kernel Memory packages "do RAG" into a service you can run in-process or stand alone. Useful at the start. Constraining when you grow out of it.
- ·RAG·2 min read
Naive RAG vs Advanced RAG vs Agentic RAG vs GraphRAG vs Adaptive RAG — Which One Do You Actually Need?
Five RAG architectures. Each one earns its complexity in a specific situation. Most teams pick the wrong one because they read about the fanciest one first.
- ·.NET·2 min read
EF Core 10 — Named Query Filters, JSON Columns, and Bulk Updates That Actually Work
Three EF Core 10 features that quietly fix long-running pain. Plus the migration trap on each one.
- ·.NET·2 min read
The New Built-in Validation in .NET 10 Minimal APIs — Goodbye, FluentValidation Boilerplate?
.NET 10 finally wires data annotations into Minimal API endpoints with a single AddValidation call. FluentValidation still has a job — just a smaller one.
- ·RAG·2 min read
Embeddings Are Not Created Equal — Choosing the Right Model for Your RAG Domain
A legal-tech RAG needs different embeddings to a customer-support one. The MTEB leaderboard hides this. Three things to check before you commit.
- ·Career·2 min read
From Senior Engineer to AI-Augmented Senior Engineer — What 20 Years of Experience Means Now
Twenty years of code does not make you obsolete. It does change which parts of the job pay you back. An honest take from someone who's been around long enough to see a few of these waves.
- ·.NET·2 min read
Minimal APIs in .NET 10 — Everything That Actually Changed
The headline features in one place, with honest takes on which ones you should adopt now and which can wait.
- ·.NET·2 min read
Native AOT for ASP.NET Core APIs in .NET 10 — When Cold-Start Latency Actually Matters
Native AOT trades some flexibility for a 50ms cold start and a 30MB image. Worth it for serverless and edge. Not worth it for the average internal API.
- ·Architecture·2 min read
The .NET Aspire Production Reality Check
A year of Aspire in real projects. It earns its keep at dev time. The "deploy to prod" story is better than people think and not as complete as Microsoft's slides suggest.
- ·.NET·2 min read
Server-Sent Events in .NET 10 — Streaming LLM Responses Without WebSockets
.NET 10 ships SSE as a first-class Minimal API result. Streaming OpenAI tokens to a browser is now a one-line endpoint.
- ·RAG·2 min read
Why Your RAG Hallucinates — A Debugging Checklist
Ten reasons RAG systems lie, in order of how often I see them. Each one has a symptom you can spot and a fix that takes less than a day.
- ·.NET·2 min read
Building Your Own MCP Server in .NET — Exposing Your APIs to Claude, ChatGPT, and Cursor
MCP turns "my LLM client should be able to use my internal API" from a custom integration into a 50-line server. Here's how it looks in C#.
- ·C#·2 min read
C# 14 Field-Backed Properties and Null-Conditional Assignment — The Quietly Useful Upgrades
Two C# 14 features that pay for themselves in your next PR. No grand pattern, just less boilerplate where you had it.
- ·RAG·2 min read
Vector Databases Compared — Qdrant, pgvector, Azure AI Search, Pinecone, Weaviate
Five vector stores, one .NET engineer's honest take. The right choice depends on your scale, your ops appetite, and whether you already pay an Azure bill.
- ·RAG·2 min read
Chunking Strategies for RAG — Fixed-Size, Recursive, Semantic, and Document-Aware
Four ways to split documents. Each one is the right answer for some doc type, and the wrong answer for others. The mistake is using the same chunker for everything.
- ·Architecture·2 min read
Designing AI Features That Survive Real Users
Demos pass because there's one user, one prompt, and no outages. Real users break all three. The operational layer that turns an AI demo into something you can keep on call.
- ·RAG·2 min read
Reranking in RAG — When Top-K Vector Search Isn't Enough
Vector search has a precision ceiling. A cross-encoder reranker breaks through it for the cost of one extra API call. Worth the 200ms more often than you'd guess.
- ·.NET·2 min read
FastEndpoints vs Minimal APIs vs Controllers in .NET — The Honest Comparison
Three ways to write the same endpoint in 2024. Each has a real downside. Here's what they actually cost — beyond the marketing.
- ·RAG·2 min read
Building Your First RAG Pipeline in .NET with Semantic Kernel and Qdrant
A document Q&A app in C# that doesn't depend on Python and doesn't take a week. Semantic Kernel, Qdrant in Docker, Azure OpenAI (or Ollama if you're cheap).
- ·C#·2 min read
LINQ Performance in .NET 9 / 10 — Where the JIT Helps You and Where It Doesn't
The runtime quietly made LINQ a lot faster. The patterns that still cost you are the ones the JIT can't fix on your behalf.
- ·Architecture·2 min read
Microservices, Modular Monoliths, and 'Just Build a Monolith' — A Decision Framework
The pendulum has swung back. Most teams should start with a modular monolith. Here's how to tell when you actually need microservices, and when you're just cosplaying as one of the FAANGs.
- ·.NET·2 min read
Structuring Minimal APIs Without the Program.cs Monster
Minimal APIs are wonderful for the first 200 lines. Then Program.cs starts looking like a fanfic. Here's the pattern that keeps it sane.
- ·C#·2 min read
The Real Cost of async/await in C# — When You're Allocating More Than You Think
Every async method generates a state machine. Most of the time it costs nothing. The hot path where it costs you a lot is smaller than people think, and bigger than they want to admit.
- ·C#·1 min read
Source Generators in C# — Stop Writing Boilerplate, Start Generating It
Hand-writing DTO mappers and tool descriptors is a tax. Incremental source generators pay it for you, compile-time, with no reflection cost at runtime.