In the past two years, whenever it comes to enterprise-level AI projects, one term is almost unavoidable: "RAG" (Retrieval-Augmented Generation).
Many see it as the standard approach for implementing large models, as if simply dumping private documents into a vector database and connecting them to a model would instantly give a company an "enterprise brain." In various solutions, whitepapers, and sales pitches, RAG has almost become the default configuration.
But if you’ve actually worked on a few to-B or to-G projects or have hands-on experience with RAG, you’ve likely felt a strong sense of discomfort. It doesn’t seem as useful as advertised. In many cases, it’s just a high-cost model hallucination amplifier.
So now, when I see discussions about "whether to implement RAG," my first reaction is blunt: the question itself is meaningless.
What’s truly worth discussing is never "whether to implement RAG," but "what problem are you actually facing?"
The capabilities of vector-based RAG are actually quite clear. It’s only good at one thing: helping you find "semantically similar" snippets in a pile of text. Note the word: similar.
RAG essentially provides "clues," not "knowledge."
It prioritizes coverage over precision. This works well in scenarios like emotional chatting, personal note-taking, or brainstorming for writing inspiration. But in to-B business contexts, the core tasks often demand the opposite from models. Code, financial standards, compliance clauses, approval workflows—getting even a single word wrong in these tasks could lead to serious issues.
I often use a vivid analogy: if your private knowledge base contains mostly curse words following "Hello~," then a model connected to RAG might actually curse back when someone says "Hello~." It’s just doing probabilistic matching—it doesn’t understand context or causality. This logical blindness makes it unsuitable for rigorous production environments.
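The mechanics behind that analogy fit in a few lines. This is a deliberately tiny sketch: real systems use dense neural embeddings and approximate nearest-neighbor indexes rather than bag-of-words counts, but the ranking logic is the same, and so is the failure mode. The two "policy" snippets and the numbers in them are invented for illustration.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding". Production systems use dense neural
    # vectors, but the failure mode shown below is identical in kind.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# A tiny "knowledge base": a current rule and an obsolete one.
# The retriever has no notion of which is still valid.
docs = [
    "approval limit for travel expenses is 5000 effective this year",
    "approval limit for travel expenses is 2000 old policy",
]

def retrieve(query: str) -> str:
    # Rank purely by similarity and return the closest snippet,
    # correct or not -- that is all vector retrieval does.
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

answer = retrieve("what is the approval limit for travel expenses")
```

Run it and the shorter, obsolete snippet wins: it happens to be slightly *more* similar to the query, so the model downstream will confidently quote the stale limit. Nothing in the pipeline knows or cares which rule is current.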
Besides being imprecise, RAG has another fatal flaw that most people overlook: it misleads through staleness.
Business operations are dynamic. Rules might have changed last week, but RAG’s vector database is often a static "archive of outdated records." Due to the high cost of maintaining and reconstructing vector indexes, many systems are filled with obsolete SOPs and invalid parameters.
This mechanism can trap the model in a "knowledge poisoning" pitfall. The retrieval algorithm, ranking purely on semantic similarity, feeds the model snippets that look relevant but are in fact expired truths. The model doesn't know the business has changed; it will confidently use stale information to steer you toward wrong decisions. In fast-moving business operations, this kind of high-cost interference is a disaster.
A recent representative discussion comes from Boris Cherny, the lead engineer of Anthropic’s Claude Code.
When they were developing the coding agent, they initially followed the standard approach: vectorizing the codebase and retrieving with RAG. But they later abandoned it entirely in favor of a seemingly "retro" solution: using ls to list directories, grep to search for keywords, and plain file reads, iterating through multiple rounds of search and verification.
In short, they made the model work like an engineer using the command line. The result was a dramatic improvement in efficiency and a significant reduction in system complexity.
Why? Because code requires 100% certainty.
grep provides definitive answers—with line numbers, context, and accountability. Vector retrieval, on the other hand, is a black box. You never know why it retrieves a particular result, and debugging it is extremely difficult. When a task demands absolute correctness, fuzzy matching is just planting landmines for yourself.
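For contrast, a grep-style search is trivially auditable. Here is a minimal sketch of the idea in Python (a real agent would shell out to grep or ripgrep; the file glob and function name here are illustrative choices, not anyone's actual implementation). The point is provenance: every hit carries a file path, a line number, and the exact matching text, so a wrong answer can be traced and blamed.

```python
import re
from pathlib import Path

def grep(pattern: str, root: str, glob: str = "*.py") -> list[tuple[str, int, str]]:
    """Deterministic keyword search with full provenance."""
    rx = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob(glob)):
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if rx.search(line):
                # Every result is accountable: file, line number, exact text.
                # A vector index cannot give you this audit trail.
                hits.append((str(path), lineno, line.rstrip()))
    return hits
```

Either the pattern matches or it doesn't; run the same query twice and you get the same answer, with nothing to "debug" about why a chunk was ranked third.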
So why is RAG still so popular in to-B and to-G fields? The most practical reason is for quick implementation.
Most enterprise projects face three hard constraints: they need to ship fast, look advanced, and read well in reports to leadership. Whether the solution is sustainable in the long term often takes a backseat.
Many internal methodologies within enterprises are just comfort-zone SOPs resulting from organizational inertia and low knowledge ceilings. RAG forces models to adapt to this low-level knowledge, effectively solidifying mediocre thinking. It makes companies spend a fortune only to end up with a "digital parrot" that repeats outdated internal regulations, hindering genuine organizational progress.
If you actually undertake foundational improvements—like restructuring knowledge systems, cleaning data, streamlining processes, building knowledge graphs, implementing rule-based systems, or redesigning business architectures—these efforts are valuable. But they take time, cost more, involve higher risks, require cross-department collaboration, and nobody wants to take the blame.
On the other hand, implementing RAG only requires bundling documents, chunking them, generating embeddings, connecting to a vector database, and adding a chat UI. You can have a demo in two weeks and "deliver" the project in a month. From a project management perspective, it’s almost the only realistic solution.
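The "two-week demo" pipeline really is that mechanical. The chunking step, for instance, is usually nothing more than a fixed sliding window like the sketch below (the size and overlap defaults are arbitrary illustrative values). Note what it ignores: a fixed window is blind to sentence, clause, and table boundaries, so a compliance rule can be cut in half and embedded as two unrelated fragments.

```python
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    # Fixed-size sliding window: the default in most quick RAG builds.
    # It is blind to sentence, clause, and table boundaries, so a rule
    # like "limit is 5000 ... unless X" can be split mid-thought and
    # indexed as two unrelated fragments.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Each chunk then gets an embedding and goes into the vector store; the chat UI stitches the top hits into a prompt. That is the whole "enterprise brain."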
I call this type of technical delivery "AI sticker engineering." It looks advanced but offers no long-term compounding benefits. Worse, it creates an illusion within the organization that "we’ve already achieved AI transformation," delaying the difficult but essential digital reforms.
From this perspective, RAG can sometimes even be a barrier to progress.
Continuing to debate "whether RAG is useful" in 2026 is meaningless.
More valuable questions to ask are:
- Does this task require certainty or fuzziness?
- How high is the cost of errors?
- Is the knowledge structured?
- Do you really need a large model, or just a well-maintained Wiki?
Many companies spend millions on "intelligent Q&A" systems, only to achieve results worse than a carefully maintained documentation system (though document management is complex and also requires significant investment).
RAG is not a savior. It’s useful when you need inspiration and associative thinking. But when you need certainty and accountability, stay cautious.
Those who still sell RAG as a one-size-fits-all solution are mostly just in the business of "AI for AI’s sake."