Over the past two years, anyone working on enterprise AI projects has found one term almost unavoidable: "RAG" (Retrieval-Augmented Generation).
Many people see it as the standard way to deploy large models, as if simply dumping private documents into a vector database and connecting it to a model would instantly give a company an "enterprise brain." In solution decks, white papers, and sales pitches, RAG has almost become the default configuration.
But if you’ve actually delivered a few B2B (to-B) or government (to-G) projects, or have genuinely used RAG yourself, you’ve likely felt a strong sense of discomfort. It doesn’t seem as useful as advertised. In many cases, it’s just an expensive hallucination amplifier.
So now, when I see discussions about "whether to implement RAG," my first reaction is just two words: it’s meaningless.
What’s truly worth discussing is never "whether to implement RAG," but "what problem are you actually facing?"
The capabilities of vector-based RAG are actually very clear. It’s only good at one thing: helping you find "semantically similar" snippets in a pile of text. Pay attention to this word: similar.
RAG essentially provides "clues," not "knowledge."
It prioritizes coverage (recall), not precision. This works well in scenarios like casual companionship chat, personal note-taking, or brainstorming for writing inspiration. But in to-B business contexts, the core tasks demand the opposite. Code, financial standards, compliance clauses, approval processes: in these tasks, even a single wrong character can cause an incident.
I often use a vivid analogy: if in your private knowledge base, "Hello~" is often followed by a curse word, then a model connected to RAG might actually curse back when someone says "Hello~." That’s because it’s just doing probabilistic matching; it doesn’t understand context or causality. This logical blindness means it can’t function well in rigorous production environments.
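To make "similar, not correct" concrete, here is a minimal sketch of similarity-based retrieval. Everything in it is an assumption for illustration: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the policy sentences are invented. The point is only that the ranking is driven by overlap, so a rule with the wrong threshold scores almost as well as the right one.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query, with their scores."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]

# Hypothetical knowledge base: two near-identical rules with different limits.
chunks = [
    "Expense claims above 5000 require approval from the finance director.",
    "Expense claims above 500 require approval from the team lead.",
    "Travel bookings must be made through the internal portal.",
]

results = retrieve("Who approves expense claims above 5000?", chunks)
for score, text in results:
    print(f"{score:.2f}  {text}")
```

Both expense rules come back as context, and nothing in the retrieval step marks the second one as the wrong rule for this question. Whether the model picks the right one is left to chance.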
Besides being imprecise, RAG has another fatal flaw that most people overlook: it misleads because its knowledge lags behind the business.
Business operations are dynamic; a rule might have changed last week, but RAG’s vector database is often a static "archive of old accounts." Because maintaining and rebuilding vector indexes is expensive, many systems are filled with outdated SOPs and invalid parameters.
This mechanism can trap models in a "knowledge poisoning" pitfall. The retrieval algorithm, based on semantic similarity, feeds the model content that looks relevant but is actually an "obsolete truth." The model doesn’t know the business has changed; it will simply use a highly professional tone to steer you toward wrong decisions based on outdated information. Such costly interference is a disaster in fast-paced business operations.
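The lag can be shown in a few lines. This is a sketch under stated assumptions: the `index` dictionary stands in for a vector store, the refund rule is invented, and comparing content fingerprints is just one simple way to notice drift, not a claim about how any particular product handles it.

```python
import hashlib

def fingerprint(text):
    """Stable content hash used to detect that a source document changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Source of truth: the documents as the business maintains them today.
source_docs = {"refund_sop": "Refunds are allowed within 30 days of purchase."}

# The index stores a snapshot taken at indexing time (hypothetical structure).
index = {
    doc_id: {"text": text, "fingerprint": fingerprint(text)}
    for doc_id, text in source_docs.items()
}

# The rule changes, but nobody rebuilds the index.
source_docs["refund_sop"] = "Refunds are allowed within 7 days of purchase."

def stale_entries(index, source_docs):
    """Return ids whose indexed snapshot no longer matches the source."""
    return [
        doc_id
        for doc_id, entry in index.items()
        if fingerprint(source_docs.get(doc_id, "")) != entry["fingerprint"]
    ]

retrieved = index["refund_sop"]["text"]   # what the model would be given
stale = stale_entries(index, source_docs)

print("Retrieved:", retrieved)
print("Stale ids:", stale)
```

Retrieval still returns the 30-day rule with full confidence. Only an explicit comparison against the source reveals that the entry is out of date, and that comparison is exactly the maintenance work that tends to be skipped.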
A representative recent example comes from Boris Cherny, the lead engineer of Anthropic’s Claude Code.
When the team was developing the coding agent, they initially followed the standard route: vectorizing the codebase and retrieving from it with RAG. Later they abandoned that entirely and replaced it with a solution that sounds very "retro": using ls to list directories, grep to search for keywords, a file-read tool to open files, and iterating through multiple rounds of verification.
In short, the model uses the command line the way an engineer would. The result was a dramatic improvement in efficiency, along with a significant reduction in system complexity.
Why? Because code requires 100% certainty.
grep provides definitive answers, complete with file names, line numbers, and surrounding context, so every result can be traced and verified. Vector recall is a black box; you never know why it matched something, and it’s extremely hard to debug. When a task demands absolute correctness, fuzzy matching is just planting landmines for yourself.
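As a rough illustration of the "retro" toolset, the sketch below implements list, search, and read operations over a directory in plain Python. It is not Claude Code’s implementation; the function names `list_files`, `grep`, and `read_lines` and the sample files are hypothetical, chosen only to show that every hit carries a path and a line number that can be checked.

```python
import re
import tempfile
from pathlib import Path

def list_files(root):
    """Like `ls -R`: relative paths of all files under root, sorted."""
    return sorted(
        str(p.relative_to(root)) for p in Path(root).rglob("*") if p.is_file()
    )

def grep(root, pattern):
    """Like `grep -rn`: (relative path, line number, line) per matching line."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            if regex.search(line):
                hits.append((str(path.relative_to(root)), number, line))
    return hits

def read_lines(root, rel_path, start, end):
    """Read lines start..end (1-indexed, inclusive) of one file."""
    lines = (Path(root) / rel_path).read_text(encoding="utf-8").splitlines()
    return lines[start - 1:end]

# Build a tiny throwaway "codebase" so the example runs anywhere.
with tempfile.TemporaryDirectory() as root:
    Path(root, "billing.py").write_text(
        "TAX_RATE = 0.2\n\ndef total(net):\n    return net * (1 + TAX_RATE)\n",
        encoding="utf-8",
    )
    Path(root, "notes.txt").write_text(
        "tax rules changed last quarter\n", encoding="utf-8"
    )

    files = list_files(root)                         # step 1: look around
    hits = grep(root, r"TAX_RATE")                   # step 2: exact search
    context = read_lines(root, "billing.py", 3, 4)   # step 3: read to verify

for rel_path, number, line in hits:
    print(f"{rel_path}:{number}: {line}")
```

The search is case-sensitive and exact, so `notes.txt` is not returned even though it is "about tax." That is the trade: fewer loose associations, but every result can be explained by pointing at a line.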
So why is RAG still so popular in the to-B and to-G fields? The most practical reason is that it can be delivered quickly.
Most enterprise projects have three hard constraints: they need to be fast, look advanced, and be reportable. Whether the results are sustainable in the long term often comes later.
Many internal methodologies within enterprises are just comfort-zone SOPs, products of organizational inertia and a low ceiling of expertise. RAG forces models to adapt to this low-level knowledge, effectively cementing mediocre thinking. Companies spend a fortune only to end up with a "digital parrot" that merely repeats outdated internal regulations and hinders genuine organizational evolution.
Foundational reforms are where the real value lies: restructuring knowledge systems, cleaning data, streamlining processes, building knowledge graphs, creating rule systems, or redesigning business architectures. But they are slow, costly, and risky, they require cross-departmental collaboration, and no one wants to take the blame if they fail.
Implementing RAG, by contrast, only requires gathering documents, chunking them, computing embeddings, connecting a vector database, and adding a conversational UI. You can have a demo in two weeks and pass "acceptance" in a month. From a project management perspective, it’s almost the only realistic option.
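How little it takes to get to a demo can be seen in a sketch of that pipeline. The assumptions are loud ones: `chunk` splits by a fixed word count, `embed` is a toy bag-of-words stand-in for an embedding model, a plain Python list plays the role of the vector database, and the purchasing policy is invented.

```python
import math
import re
from collections import Counter

def chunk(text, size=8):
    """Split text into fixed-size pieces of `size` words each."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(documents, size=8):
    """'Vector database': a list of (chunk text, chunk vector) pairs."""
    return [(piece, embed(piece)) for doc in documents for piece in chunk(doc, size)]

def answer_context(index, question, k=1):
    """Return the k chunk texts most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: similarity(q, item[1]), reverse=True)
    return [piece for piece, _ in ranked[:k]]

# Hypothetical policy document.
documents = [
    "Purchase requests need manager approval. Requests above the department "
    "budget also need sign-off from finance before the order is placed.",
]

index = build_index(documents, size=8)
context = answer_context(index, "What do requests above the budget need?")
print(context)
```

The whole thing fits on one screen, which is why the demo arrives in two weeks. It also shows the cost: the fixed-size split cuts the second sentence in half, so the top-ranked chunk mentions manager approval but never reaches the finance sign-off the question was actually about.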
I call this form of technology delivery "AI sticker engineering." It looks advanced but yields no long-term compound benefits. Worse, it creates an illusion within the enterprise, making people think "we’ve already completed AI transformation," thereby delaying the truly difficult but important digital reforms.
From this perspective, RAG can sometimes even be a barrier to progress.
Continuing to debate "whether RAG is useful" in 2026 is meaningless.
The more valuable questions are:
- Does this task require certainty or fuzziness?
- How high is the cost of error?
- Is the knowledge structured?
- Do you really need a large model, or just a well-maintained Wiki?
Many companies spend millions on "intelligent Q&A" systems that perform worse than a carefully maintained documentation system (though document management is complex and also requires significant investment).
RAG is not a savior. It’s useful when you need inspiration and association. When you need certainty and accountability, stay cautious.
Those who are still selling RAG as a universal solution are mostly just in the business of "AI for AI's sake."