AI IMPLEMENTATION

RAG isn’t a strategy — it’s a tactic. Here’s what to think about first.

Every other AI project we’re called into starts with the same opening line: “we’re building a RAG system.” That’s never the actual goal. The goal is something the business wants — faster support resolution, better internal knowledge access, fewer expensive lookups in a manual. RAG is just one possible mechanism. Treating it as the strategy guarantees expensive surprises.

What people actually mean by RAG

Retrieval-Augmented Generation is a technique for grounding model output in source documents the model didn’t see during training. You take a query, you find the most relevant chunks of documents, you stuff them into the prompt, and you ask the model to answer based on what it sees. It is genuinely useful and it has, in 2026, become almost too easy to spin up — which is half the problem.
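To make the three steps concrete (find relevant chunks, stuff them into the prompt, ask the model), here is a minimal sketch. It assumes a toy word-overlap score in place of the embedding search a real system would use, and it stops at building the prompt rather than calling any model. The function names and the sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (toy stand-in for vector search)."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Stuff the retrieved chunks into a prompt that asks for cited answers."""
    sources = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(context))
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )


# Hypothetical document chunks, for illustration only.
CHUNKS = [
    "Refunds are issued within 14 days of a return being received.",
    "Our office is closed on public holidays.",
    "Returns must include the original packaging and receipt.",
]

question = "How long do refunds take after returns?"
top = retrieve(question, CHUNKS, k=2)
prompt = build_prompt(question, top)
print(prompt)
```

The whole mechanism fits in about forty lines, which is exactly why it is so easy to spin up before anyone has asked whether it is the right mechanism.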

When a team says “we’re building a RAG system”, what they usually mean is: “we want the model to know about our company’s internal documents.” That is a description of a desired outcome dressed up as an architecture. The architecture comes after the outcome, after the user research, after the cost analysis, and after eliminating the simpler options.

What you should decide before choosing RAG

Three questions, in order. First: what does the user actually do today, and how slow or expensive is that process? If the answer is ‘they Slack a colleague and get an answer in two minutes’, you may be optimising the wrong workflow. Second: what does ‘wrong’ cost? Every incorrect answer that a user acts on costs something. Estimate how often the system will be wrong and what each wrong answer costs, then set that against the time the system saves, before you build the thing.
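The first two questions reduce to back-of-envelope arithmetic. The sketch below is one way to write it down, assuming you can estimate query volume, minutes saved per query, an error rate, and a cost per wrong answer. The class, its field names, and every number in it are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorkflowEstimate:
    queries_per_month: int
    minutes_saved_per_query: float  # versus what the user does today
    hourly_rate: float              # loaded cost of the person asking
    error_rate: float               # fraction of answers that are wrong
    cost_per_error: float           # cost of acting on one wrong answer

    def monthly_saving(self) -> float:
        """Value of the time saved across all queries in a month."""
        hours_saved = self.queries_per_month * self.minutes_saved_per_query / 60
        return hours_saved * self.hourly_rate

    def monthly_error_cost(self) -> float:
        """Expected cost of wrong answers in a month."""
        return self.queries_per_month * self.error_rate * self.cost_per_error

    def net_monthly_value(self) -> float:
        """Saving minus expected error cost, before build and run costs."""
        return self.monthly_saving() - self.monthly_error_cost()


# Hypothetical inputs: 2,000 queries a month, 3 minutes saved each,
# a 2% error rate, and 50 per wrong answer.
estimate = WorkflowEstimate(
    queries_per_month=2000,
    minutes_saved_per_query=3,
    hourly_rate=60.0,
    error_rate=0.02,
    cost_per_error=50.0,
)
print(estimate.monthly_saving(), estimate.monthly_error_cost(), estimate.net_monthly_value())
```

With these invented inputs the saving is 6,000 a month against 2,000 in expected error cost. Raise the cost per error to 150 and the net value drops to zero, before counting anything spent on building or running the system.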

Third: how often does the underlying source change? If it changes hourly, your retrieval index has to be live, which is a different and much more expensive system than a nightly batch job. If it changes monthly, you may not need retrieval at all — you may be better off including the relevant documents in a curated system prompt and never indexing anything.

Three alternatives to RAG (often better)

First: structured data over freeform retrieval. If your ‘documents’ are actually rows in a database with consistent fields (orders, customers, tickets), a model that calls a typed query API will outperform a RAG system on the same data. The query is exact, the latency is lower, the cost is lower, the answer is auditable.
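As a rough sketch of what a typed query API can look like, the example below exposes one function over an in-memory SQLite table. The `orders` schema, the `find_orders` function, and the status values are all hypothetical. In a real system the model would be given this function as a tool and would supply only the typed arguments.

```python
import sqlite3
from dataclasses import dataclass
from typing import Literal, get_args

# Allowed status values; the model can only pick from this closed set.
Status = Literal["open", "shipped", "refunded"]


@dataclass(frozen=True)
class Order:
    order_id: int
    customer_id: int
    status: str
    total_cents: int


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, customer_id INTEGER, "
        "status TEXT, total_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [(1, 42, "open", 1999), (2, 42, "shipped", 4500),
         (3, 42, "open", 250), (4, 7, "refunded", 1200)],
    )
    return conn


def find_orders(conn: sqlite3.Connection, customer_id: int, status: Status) -> list[Order]:
    """Exact lookup by customer and status; arguments are validated, never interpolated."""
    if status not in get_args(Status):
        raise ValueError(f"unknown status: {status!r}")
    rows = conn.execute(
        "SELECT order_id, customer_id, status, total_cents FROM orders "
        "WHERE customer_id = ? AND status = ? ORDER BY order_id",
        (customer_id, status),
    ).fetchall()
    return [Order(*row) for row in rows]


conn = make_demo_db()
for order in find_orders(conn, customer_id=42, status="open"):
    print(order)
```

Every answer traces back to a specific query with specific parameters, which is where the auditability comes from.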

Second: prompt-stuffing with curated context. If the corpus is small and stable (a brand voice guide, a set of company policies, a single product manual), paste it into the system prompt and forget retrieval. Cheaper, faster, no index to maintain.

Third: tool-calling agents that read documents on demand rather than retrieving by similarity. Modern frontier models are excellent at deciding which document to open. Often this beats vector search outright.
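Reading documents on demand can be sketched as two small tools: one that lists what exists, one that opens a document by name. The example below assumes a generic tool-call shape (a dict with `name` and `arguments`) rather than any particular vendor's API, and the documents, summaries, and function names are hypothetical.

```python
from typing import Any, Callable

# Made-up documents and one-line summaries, for illustration only.
DOCS = {
    "returns-policy": "Returns are accepted within 30 days with a receipt.",
    "shipping-faq": "Standard shipping takes 3 to 5 business days.",
}
SUMMARIES = {
    "returns-policy": "Rules for returning purchased items.",
    "shipping-faq": "Delivery times and shipping options.",
}


def list_documents() -> list[dict[str, str]]:
    """Give the model a catalogue so it can decide which document to open."""
    return [{"name": name, "summary": SUMMARIES[name]} for name in sorted(DOCS)]


def read_document(name: str) -> str:
    """Return the full text of one document, chosen by name rather than by similarity."""
    if name not in DOCS:
        raise KeyError(f"unknown document: {name}")
    return DOCS[name]


TOOLS: dict[str, Callable[..., Any]] = {
    "list_documents": list_documents,
    "read_document": read_document,
}


def handle_tool_call(call: dict[str, Any]) -> Any:
    """Dispatch one tool call of the assumed form {'name': ..., 'arguments': {...}}."""
    tool = TOOLS[call["name"]]
    return tool(**(call.get("arguments") or {}))


# Simulate the two calls a model might make: browse the catalogue, then open one document.
catalogue = handle_tool_call({"name": "list_documents"})
text = handle_tool_call({"name": "read_document", "arguments": {"name": "returns-policy"}})
print(catalogue)
print(text)
```

There is no index and no embedding step here. The model's own judgement about which document to open does the work that similarity search would otherwise do.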

When RAG genuinely is the right answer

RAG earns its keep when the corpus is large, growing, and unstructured; when answers must cite the source; when the user query is naturally expressed in natural language rather than as a structured filter; and when the cost of a wrong answer is moderate (i.e. the user can verify, and verification is cheap). Customer support knowledge bases are a near-perfect fit. Internal HR policy assistants are a decent fit. Legal advice systems are not: the cost of a wrong answer is too high.

If your use case clears all four bars, RAG is the right tactic. But notice: it is now a tactic answering a strategic question, not the strategic decision itself. That is the right relationship between the two.
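The four bars above can be written down as an explicit checklist, which makes it harder to wave one of them through. This is a minimal sketch: the class and field names are hypothetical, and the values for the two example use cases are assumptions apart from the cost bar for legal advice, which is the one the text calls out.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RagFitCheck:
    corpus_large_growing_unstructured: bool
    answers_must_cite_source: bool
    queries_in_natural_language: bool
    wrong_answer_cost_moderate: bool  # user can verify, and verification is cheap


def failed_bars(check: RagFitCheck) -> list[str]:
    """Names of the bars this use case does not clear."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]


def rag_is_right_tactic(check: RagFitCheck) -> bool:
    """RAG fits only when all four bars are cleared."""
    return not failed_bars(check)


# Assumed values for illustration.
support_kb = RagFitCheck(True, True, True, True)
legal_advice = RagFitCheck(True, True, True, False)

print(rag_is_right_tactic(support_kb))
print(failed_bars(legal_advice))
```

A use case that fails even one bar goes back to the alternatives before any index gets built.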

The strategy questions that matter

The strategic questions in any AI implementation engagement are: who is this for, what does it replace, what does ‘wrong’ cost, and what will it cost us to maintain six months in? RAG vs. structured query vs. tool-calling vs. prompt-stuffing falls out from those answers. We’ve never run a project where the architecture decision survived contact with those four questions unmodified.

If you’re at the start of an AI implementation and want a second opinion before you commit to RAG (or anything else), book a triage call. Twenty minutes is usually enough to surface the strategic questions hiding under the architectural one.

Pre-RAG sanity check?

We run 30-minute triage calls on AI implementation strategy, free of charge.