03 — Blog
Thoughts and lessons from the field — on AI, machine learning, and building intelligent systems.
2026-03-18 · 12 min
A practical guide to context engineering — how to design what your LLM actually sees, manage token budgets wisely, and measure what matters.
2026-03-16 · 11 min
A practical guide to building safe AI agents with input validation, output filtering, action boundaries, circuit breakers, and production-ready guardrail patterns.
2026-03-15 · 12 min
Practical lessons on building production RAG systems — chunking strategies, embedding selection, reranking, hybrid search, and keeping hallucinations in check.
2026-03-12 · 13 min
A practical guide to prompt injection attacks — how they work, why they're dangerous, and the layered defenses you need to protect your LLM-powered apps in production.
2026-03-10 · 10 min
A practical guide to production ML pipelines — feature stores, training automation, model versioning, drift monitoring, and CI/CD for ML. Lessons from real-world systems.
2026-03-08 · 11 min
A practical guide to building composable agent skills — the architecture patterns behind tools, knowledge, and workflows that turn generic AI agents into reliable domain experts.
2026-03-05 · 11 min
Practical strategies for scaling LLM applications — from token optimization and semantic caching to cost management, latency tricks, and keeping your system observable.
2026-02-28 · 14 min
Everything I've learned about fine-tuning large language models — when to do it, when not to, LoRA/QLoRA techniques, dataset prep, and getting your model into production.
2026-02-20 · 10 min
A practical comparison of Pinecone, Weaviate, Qdrant, Chroma, and Milvus — covering architecture, real-world performance, features, and what they'll cost you.
2026-02-12 · 15 min
A practical walkthrough of the Transformer — self-attention, multi-head attention, positional encoding, and a hands-on PyTorch implementation you can actually learn from.
2026-02-05 · 11 min
A hands-on look at AI agents — what separates real agents from fancy wrappers, architecture patterns like ReAct and Plan-and-Execute, tool use, memory, multi-agent orchestration, and the limitations nobody talks about.
2026-01-28 · 9 min
A practical guide to prompt engineering — from zero-shot and few-shot prompting to chain-of-thought reasoning, system prompt design, structured outputs, and how to evaluate it all.
2026-01-20 · 10 min
A practical guide to running AI models on edge devices — covering model compression, quantization, ONNX Runtime, TensorRT, mobile deployment, and the hardware trade-offs you actually need to think about.