Discussing whether RAG is useful in 2026 is actually meaningless

Over the past two years, almost any enterprise-level AI project has inevitably encountered the term "RAG" (Retrieval-Augmented Generation).

Many see it as the standard approach for deploying large models, as if simply dumping private documents into a vector database and connecting a model will give a company an "enterprise brain" overnight. In various proposals, whitepapers, and sales pitches, RAG has become the default configuration.

But if you've actually worked on several B2B or B2G projects, or used RAG in earnest, you have probably felt a strong sense of unease. It doesn't seem as useful as advertised. In many cases, it's just an expensive hallucination amplifier for the model.

So now, when I see discussions about "whether to implement RAG," my first reaction is just two words: it's meaningless.

What's truly worth discussing is never "whether to implement RAG," but rather "what problem are you actually facing?"

The capability boundary of vector RAG is actually very clear. It excels at only one thing: finding "semantically similar" fragments within a pile of text. Note the word: similar.
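To make "similar" concrete, here is a minimal sketch in Python. It assumes a toy word-count "embedding" in place of a real embedding model, and the names embed, cosine, and retrieve, as well as the refund-policy fragments, are invented purely for illustration. The point is only that a similarity score measures shared wording, not correctness.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, fragments: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k fragments most similar to the query, highest score first."""
    q = embed(query)
    scored = [(cosine(q, embed(f)), f) for f in fragments]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Hypothetical knowledge-base fragments; the first two contradict each other.
fragments = [
    "Refunds over 500 dollars require manager approval.",
    "Refunds over 500 dollars do not require manager approval.",
    "The cafeteria opens at 8 am.",
]
top = retrieve("Do refunds over 500 dollars require manager approval?", fragments)
for score, text in top:
    print(f"{score:.2f}  {text}")
```

In this toy setup the two contradictory refund fragments both score above 0.9, and the "do not require" version ranks first simply because it shares more words with the question. A real embedding model is far more sophisticated, but it still ranks by closeness, not by which statement is true.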

RAG essentially provides "clues," not "knowledge."

It prioritizes coverage over precision. That works well for emotional-support chat, personal notes, and brainstorming for writing inspiration. But in B2B operations, the core tasks demand the opposite. In code, financial metrics, compliance clauses, and approval workflows, a single wrong word can cause an incident.

I often use a vivid analogy: if "Hello~" in your private knowledge base is mostly followed by a curse word, then when a user says "Hello~" to a model connected to RAG, it will likely curse back. It is just doing probabilistic matching, with no understanding of context or causality. This blindness to logic makes it unsuitable for rigorous production environments.

Besides inaccuracy, RAG has another fatal flaw that most people overlook: it misleads because its knowledge lags behind reality.

Business operations are dynamic; a rule may have changed last week, but RAG's vector database is often a static "old ledger." Because maintaining and rebuilding vector indexes is extremely costly, many systems end up filled with outdated SOPs and obsolete parameters.

This mechanism traps the model in a "knowledge poisoning" pitfall. The retrieval algorithm, based on semantic similarity, feeds the model "discarded truths" that look similar but are actually outdated. The model doesn't know the business has changed; it uses an extremely professional tone to mislead you into making wrong decisions with outdated information. This high-cost interference is a disaster in fast-changing business operations.
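The lag problem can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the Fragment class, its valid_until field, the word-overlap score standing in for vector similarity, and the travel-expense rules themselves. The sketch shows that a similarity-only search happily returns an expired rule, and that filtering it out requires validity metadata that someone has to maintain.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fragment:
    text: str
    valid_until: Optional[date]  # None means "still in force" (hypothetical metadata field)


def overlap(query: str, text: str) -> float:
    """Toy similarity: Jaccard overlap of lowercase words (stand-in for vector similarity)."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


def search(query, fragments, today=None):
    """Rank by similarity; if `today` is given, drop fragments that have expired."""
    pool = [
        f for f in fragments
        if today is None or f.valid_until is None or f.valid_until >= today
    ]
    return sorted(pool, key=lambda f: overlap(query, f.text), reverse=True)


# Hypothetical index: the old rule was never removed when the new one was added.
index = [
    Fragment("travel expense limit is 300 per day", valid_until=date(2025, 6, 30)),
    Fragment("daily cap for travel expenses raised to 450", valid_until=None),
]
query = "travel expense limit per day"

stale_first = search(query, index)                          # similarity only
fresh_only = search(query, index, today=date(2026, 1, 15))  # validity filter applied
print(stale_first[0].text)
print(fresh_only[0].text)
```

Without the date filter, the expired rule ranks first because its wording matches the question more closely. The filter fixes this toy case, but only because the expiry date was recorded in the first place, which is exactly the maintenance work that tends to be skipped.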

A recent representative discussion came from Boris Cherny, lead engineer of Anthropic's Claude Code.

In the early days of building their coding agent, they also followed the standard route: vectorizing the codebase and retrieving with RAG. Later they abandoned it entirely and replaced it with a seemingly "retro" approach: ls to list directories, grep to search for keywords, reading files directly, and verifying over multiple iterations.

In short, they let the model use command lines like an engineer. The result was a dramatic efficiency boost with a significant reduction in system complexity.

Why? Because code requires 100% certainty.

Grep gives deterministic answers with line numbers, context, and traceability. Vector retrieval is a black box: it is hard to tell why a fragment was hit, and extremely hard to debug. When the task demands absolute correctness, fuzzy matching is like laying landmines for yourself.
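For contrast, here is a rough Python sketch of what a grep-style lookup returns. It is not how Claude Code is implemented; the grep function, the file names, and the tiny throwaway codebase are all made up for the example. What it illustrates is that every hit comes with a file, a line number, and the surrounding lines, so anyone can check why it matched.

```python
import re
import tempfile
from pathlib import Path


def grep(root: Path, pattern: str, context: int = 1):
    """Yield (path, line_number, context_lines) for every line matching `pattern`."""
    regex = re.compile(pattern)
    for path in sorted(root.rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for i, line in enumerate(lines):
            if regex.search(line):
                lo, hi = max(0, i - context), min(len(lines), i + context + 1)
                yield path, i + 1, lines[lo:hi]


# Build a tiny throwaway "codebase" so the example is runnable anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "billing.py").write_text(
        "TAX_RATE = 0.2\n\ndef total(price):\n    return price * (1 + TAX_RATE)\n",
        encoding="utf-8",
    )
    (root / "util.py").write_text("def noop():\n    pass\n", encoding="utf-8")

    hits = [(p.name, n, ctx) for p, n, ctx in grep(root, r"TAX_RATE")]

for name, lineno, ctx in hits:
    print(f"{name}:{lineno}")
    for text in ctx:
        print(f"    {text}")
```

Running the same search twice gives the same two hits, billing.py line 1 and billing.py line 4, and a file that does not contain the string is never returned. There is no score to interpret and nothing to tune.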

So why is RAG still so popular in B2B and B2G fields? The most realistic reason is rapid deployment.

Most enterprise projects have three hard constraints: be fast, look advanced, and be reportable. Whether the effect is sustainable in the long run often comes last.

Many internal methodologies in companies are just comfort-zone SOPs born from a lack of organizational vitality and a low knowledge ceiling. RAG forces the model to adapt to this low-level knowledge, effectively solidifying mediocre thinking. Companies spend a lot of money only to end up with a "digital repeater" that recites outdated internal regulations and hinders true organizational evolution.

The real foundational work, such as restructuring knowledge, cleaning data, organizing processes, building knowledge graphs, creating rule systems, or revamping business architectures, is all valuable. But it takes time, costs a lot, carries high risk, and requires cross-departmental collaboration. No one wants to take the blame.

But with RAG, you just package the documents, chunk them, embed them, connect a vector database, and add a chat UI. You can have a demo in two weeks and get it "accepted" in a month. From a project management perspective, it's almost the only realistic option.
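To show how little engineering this pipeline needs, here is a deliberately naive sketch in Python. All of it is assumed for illustration: the fixed-size chunk function, the word-count embed placeholder, the TinyVectorStore class standing in for a vector database, and the purchasing policy text. A real deployment would use an embedding model and an actual vector database, but the shape is the same.

```python
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into fixed-size word windows (the simplest possible chunker)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Placeholder embedding: word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


class TinyVectorStore:
    """In-memory stand-in for a vector database (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.rows: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self.rows.append((embed(text), text))

    def query(self, question: str) -> str:
        q = embed(question)
        # Score by shared word count; Counter & Counter keeps the minimum of each count.
        return max(self.rows, key=lambda row: sum((row[0] & q).values()))[1]


# Hypothetical policy document.
document = (
    "Purchase requests under 1000 are approved by the team lead. "
    "Purchase requests above 1000 need sign-off from finance."
)
store = TinyVectorStore()
for piece in chunk(document):
    store.add(piece)

question = "who approves purchase requests under 1000"
context = store.query(question)
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

The whole "system" fits on one screen, which is why a demo in two weeks is plausible. Note what the naive chunker does here, though: the best-matching chunk ends mid-sentence, right before the approver is named, so the model receives a clue that looks relevant but does not contain the answer.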

I call this technical delivery model "AI screen protector engineering." It looks advanced upfront but offers no long-term compounding benefits. Worse, it creates an illusion within the company that "we've already achieved AI transformation," thereby delaying the truly difficult but important digital overhaul.

From this perspective, RAG can sometimes even be a barrier to progress.

In 2026, continuing to argue about "whether RAG is useful" is meaningless.

The more valuable questions are:

Does this task require certainty, or can it tolerate ambiguity?
Is the cost of error high?
Is the knowledge structured?
Do you really need a large model, or just a clean Wiki?

Many companies spend millions on "intelligent Q&A" systems that perform worse than a well-maintained documentation system (though documentation management is also complex and costly).

RAG is not a savior. It's useful when you need inspiration and associations. But when you need certainty and accountability, stay vigilant.

Those still peddling RAG as a universal solution are mostly just doing AI for AI's sake.

Let's talk