Notes from production AI.
No vendor takes. Practical engineering writing on what actually works when you ship AI to real users — and what we've broken along the way.
20 articles · Updated Jun 2, 2026 · Subscribe via info@aiinfox.com
Latest articles
UK GDPR for AI Development: A Practical 2026 Guide
Most UK GDPR posts read like a legal essay. This one is the engineering version — DPIAs, lawful bases, Article 22, ICO guidance, SCCs — written for CTOs shipping production AI.
PIPEDA + Quebec Law 25 for AI in Canada: 2026 Compliance Checklist
PIPEDA is the federal floor. Quebec Law 25 is the strictest provincial overlay. OSFI E-23 sits on top for federally-regulated banks. Here is the engineering checklist that ties them together.
Australian Privacy Act + APPs for AI Development in 2026
The Privacy Act sets the federal floor. APRA CPS 234 and CPS 230 add the financial-services overlay. The NDB clock is unforgiving. Here is the practical engineering checklist.
RAG vs Fine-Tuning in 2026: Cost, Latency, and When to Pick Which
RAG is the default for most production AI in 2026. Fine-tuning is the right call about a third of the time it gets requested. Here is the honest cost math.
Offshore AI Development in 2026: What Actually Works and What Doesn't
Offshore AI in 2026 is not what offshore meant in 2014. The senior-only model, eval-first delivery, and the takeover audit reality have made the bench-rate-and-pyramid model obsolete.
AI Development RFP Template: 12 Questions Every Vendor Should Answer in Writing
Most AI RFPs ask the wrong questions. Here are the 12 questions that actually separate vendors who ship from vendors who pitch — with the answers good vendors give and the answers that should disqualify.
Voice Agent ROI: The Real Cost Math Behind 4,000 Calls a Day
Voice agents pencil out at 10-30 cents per call when built right and at $1.20 per call when built wrong. Here is the actual cost model behind a production deployment doing 4,000 calls a day.
AI Agent Observability in Production: What to Instrument Before Launch
Agents you cannot trace are agents you cannot debug. Here is the observability stack we instrument on every production engagement — what to log, what to dashboard, and what to alert on.
LLM Evaluation Harness 101: How to Test an LLM Before Your Users Do
Most failed LLM engagements share one missing artifact — the eval set. Here is how to build one, score against it, and gate every prompt change in CI before users see the regression.
AI Vendor Takeover Audit: 7 Signs Your Current Vendor Isn't Shipping
Most stuck AI engagements share the same seven symptoms. Here is the audit checklist we run before a takeover — and the recovery process that gets a system shipping inside 8 weeks.
How to Evaluate Offshore Senior AI Engineers (Without Falling for Resume Theater)
Most offshore AI hiring rounds optimise for the wrong signals. Here is the interview pattern that actually surfaces whether an engineer has shipped production LLM systems, and why take-home tests alone fail.
HIPAA-Compliant AI Deployment — A 12-Point Checklist
Every healthcare AI we have shipped passes the same 12 controls before a clinician sees it. BAAs, VPC isolation, audit logs, refusal layers, eval gating.
RAG Hallucination Rates — What Actually Moves the Needle
Most RAG hallucination posts treat model choice as the main lever. Here is the ranked list of what actually lowers the hallucination rate in our production RAG systems.
Top AI Companies in Mohali — 2026 Ecosystem
A look at the Mohali AI ecosystem — what's being built, who's hiring, and where the next wave of production AI work is coming from.
When LLM Fine-Tuning Actually Pays Off
A cost / quality / data-residency decision tree. We've fine-tuned 12 models across healthcare, legal, and EdTech — here's what we learned.
Building an LLM eval harness from scratch
What to evaluate, how to score it without a human in the loop on every change, and how to keep evals trustworthy as your prompts evolve.
Shipping RAG in production — what nobody tells you
Vector search is the easy part. The hard parts are chunking, re-ranking, citations, refusals, and the eval suite that gates every prompt change.
Voice Agents Under One Second — Latency Playbook
Latency budgets, streaming STT, speculative LLM responses, and TTS chunking. A practical playbook from production deployments.
Build bounded agents, not autonomous ones
Open-ended agent loops are a debugging nightmare. Bounded recursion, explicit tool whitelists, and approval gates make agentic systems shippable.
