The RAG-Powered Voice Agent: How Retell AI Elevates Knowledge Retrieval

Leonardo Garcia-Curtis · 06/08/2025

A manufacturing company in Lower Hutt called us with a problem. 3,000+ equipment manuals spanning 30 years. Their support staff spent 40% of each shift hunting through filing cabinets and shared drives for the right spec sheet.

"How long does it take your technicians to find the right manual?" we asked.

"Anywhere from 5 minutes to an hour. Depends who you ask."

We loaded those 3,000 documents into 6 specialist knowledge bases. Linked them to a single voice agent. Now their technicians call a number, describe the issue, and get the right answer in under 10 seconds. 89% reduction in delayed responses.

Your documents aren't the problem. How your team accesses them is.

What RAG Actually Does

RAG stands for Retrieval-Augmented Generation. Sounds complex. The concept is simple.

When your caller asks a question, 3 things happen:

1. Retrieve — Your agent searches its knowledge base for content relevant to what your caller just said.

2. Augment — The retrieved content gets injected into your LLM's context alongside the conversation.

3. Generate — Your LLM crafts a response using both the conversation and the retrieved documents.

Without RAG, your agent knows what you told it in the system prompt. With RAG, your agent knows what's in your documents. That's the difference between "I don't have that information" and a correct answer that converts.
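The three steps above can be sketched in a few lines. This is a deliberately naive illustration: retrieval here is keyword overlap, where a production system (Retell's KB included) uses vector embeddings, and the LLM call is stubbed out.

```python
# Minimal sketch of the retrieve-augment-generate loop.
# Retrieval is naive word overlap; real systems use embeddings.

def retrieve(question: str, knowledge_base: list[str], top_k: int = 3) -> list[str]:
    """Score each document by word overlap with the question; return the best top_k."""
    q_words = set(question.lower().split())
    scored = sorted(
        knowledge_base,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def augment(question: str, chunks: list[str]) -> str:
    """Inject retrieved chunks into the prompt alongside the conversation."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Context:\n{context}\n\nCaller asked: {question}"

kb = [
    "Standard shipping takes 3-5 business days.",
    "Returns are accepted within 30 days of purchase.",
    "Our office is open 9am-5pm Monday to Friday.",
]
question = "How long does shipping take?"
# generate() would be the LLM call; the prompt below is what it would receive.
prompt = augment(question, retrieve(question, kb, top_k=1))
```

The key point the sketch makes: the LLM never needs the shipping answer in its system prompt, because the retrieval step puts the right document in front of it at answer time.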

[Image: knowledge base architecture. Your documents become instant answers.]

How Retell's Knowledge Base Works

Retell's KB system handles the heavy lifting. You upload your documents, configure retrieval settings, and your agent starts answering from them.

What You Can Upload

Retell accepts a wide range of formats:

  • Documents: PDF, DOCX, ODT, RTF, EPUB
  • Spreadsheets: CSV, TSV, XLS, XLSX (1,000 rows max, 50 columns)
  • Markup: Markdown, TXT, HTML, XML

Pro tip: use Markdown. Retell's chunking engine handles structured Markdown better than any other format. Headings create natural chunk boundaries, and your retrieval accuracy improves.

    The Limits You Need to Know

    Each knowledge base has hard caps:

  • 25 files (50 MB each)
  • 500 URLs (auto-crawled daily)
  • 50 text snippets (manually added)

Hit these limits? Create additional knowledge bases. Your workspace gets 10 free. Extra KBs cost roughly the same as a coffee a month.

    Retrieval Settings

    Two dials control how your agent searches:

    Chunks to retrieve (1-10, default: 3) — How much context your agent pulls per question. More chunks = more context for your LLM, but also more noise. Start at 3.

    Similarity threshold (default: 0.60) — How closely a chunk must match your caller's question. Higher = stricter, fewer results. Lower = broader, more results.

    For pricing and legal content, push your threshold to 0.75-0.80. For general FAQs, drop it to 0.45. Your domain determines the right setting.
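How the two dials interact can be sketched as a filter-then-cap step: chunks below the similarity threshold are dropped, and the survivors are capped at the chunk count. The scores below are placeholder values standing in for real embedding similarities.

```python
# Sketch of the two retrieval dials: similarity threshold, then top-k cap.

def select_chunks(scored_chunks: list[tuple[str, float]],
                  threshold: float = 0.60, top_k: int = 3) -> list[str]:
    """Keep chunks scoring at or above the threshold, best-first, at most top_k."""
    passing = [(text, score) for text, score in scored_chunks if score >= threshold]
    passing.sort(key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in passing[:top_k]]

candidates = [
    ("Refund policy: 30 days.", 0.82),
    ("Shipping rates by region.", 0.64),
    ("Company history since 1995.", 0.41),
]
# Defaults (0.60 / top 3): the weakly related history chunk is dropped.
default = select_chunks(candidates)
# Stricter threshold, as suggested for pricing/legal content: only the best match survives.
strict = select_chunks(candidates, threshold=0.75)
```

Raising the threshold trades coverage for precision; the stricter call returns one chunk where the default returns two.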

    Optimising Your Knowledge Base

    The difference between a KB that adds 50ms and one that adds 300ms+ comes down to structure. Here's what we've learned across 40+ deployments:

    Structure Your Documents for Chunking

    Retell chunks your content at ingestion. You don't control chunk size directly, but you control what those chunks look like.

    Use headings. Every H2 section becomes a natural chunk boundary. Keep each section focused on one topic. A section that covers pricing AND returns AND shipping will retrieve poorly for all three.

    One concept per document. Don't dump your entire company wiki into a single PDF. Split it by department, product line, or topic. Your retrieval precision improves immediately.
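Why headings matter becomes obvious when you see chunking as a splitting step. A rough sketch (Retell's actual chunking engine is more sophisticated, but the boundary idea is the same): split at every H2, and each section becomes one focused chunk.

```python
# Sketch: H2 headings as chunk boundaries in a Markdown document.

def chunk_by_h2(markdown: str) -> list[str]:
    """Split a Markdown document into chunks, one per '## ' section."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if line.startswith("## ") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

doc = """## Pricing
Plans start at $49/month.

## Returns
Returns are accepted within 30 days.
"""
chunks = chunk_by_h2(doc)
```

A section that mixed pricing and returns under one heading would land in a single chunk and retrieve poorly for both; split into two H2 sections, each question retrieves exactly its own chunk.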

    Assign KBs at the Node Level

    This is the feature most teams miss. In your conversation flow, you can assign a specific knowledge base to a specific node.

Your pricing node only searches the pricing KB. Your product node only searches your product catalogue. No cross-contamination. Faster retrieval. Better answers.

    Node-level KB assignment reduces retrieval latency by narrowing the search space. Instead of searching across all your documents, your agent searches only what's relevant to that stage of the conversation.
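Conceptually, node-level assignment is just a mapping from conversation node to knowledge base, with a general fallback. The node and KB names below are illustrative, not Retell identifiers.

```python
# Sketch of node-level KB assignment: each node searches only its own KB.

NODE_KB_MAP = {
    "pricing": "pricing_kb",
    "product": "product_catalogue_kb",
    "support": "support_docs_kb",
}

def kb_for_node(node: str, default_kb: str = "general_faq_kb") -> str:
    """Return the knowledge base assigned to a conversation node."""
    return NODE_KB_MAP.get(node, default_kb)

# The pricing node searches one KB instead of all four.
assigned = kb_for_node("pricing")
```

The latency win follows directly: one scoped lookup per turn instead of a search across every document you've uploaded.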

    Keep Your KBs Fresh

    Retell auto-refreshes URL sources every 24 hours. If your website content changes, your agent picks it up the next day.

    For dynamic content (product catalogues, pricing, schedules), use the auto-crawl feature. Point it at a URL path, and new pages under that path get automatically indexed.

    Set exclusion lists for navigation pages, login paths, and duplicate content.
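An exclusion list amounts to a filter over crawled URLs. A minimal sketch, with example patterns for the login, navigation, and duplicate-content cases mentioned above:

```python
# Sketch of a crawl exclusion list: skip URLs that should never be indexed.

EXCLUDED_SUBSTRINGS = ["/login", "/nav/", "?print="]

def should_index(url: str) -> bool:
    """Index a crawled URL only if it matches no exclusion pattern."""
    return not any(pattern in url for pattern in EXCLUDED_SUBSTRINGS)

crawled = [
    "https://example.com/products/widgets",
    "https://example.com/login",
    "https://example.com/products/widgets?print=1",
]
indexed = [url for url in crawled if should_index(url)]
```

Only the product page survives; the login page and the printable duplicate are skipped before they can pollute retrieval.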

    The Multi-KB Strategy

    One knowledge base rarely covers everything your agent needs. Here's how we structure multi-KB deployments:

    Segment by domain. Products in one KB. Policies in another. Technical specs in a third. Each KB stays focused, and your retrieval stays sharp.

    Segment by update frequency. Static content (company history, leadership bios) goes in one KB. Dynamic content (pricing, availability, schedules) goes in another. You update the dynamic KB without touching the static one.

    Segment by audience. Customer-facing answers in one KB. Internal operational docs in another. Your data security stays clean because your agent only accesses what it needs.

    We deployed an insurance company's voice agent with 500+ policy documents across 4 knowledge bases. Product info, claims procedures, compliance requirements, and general FAQs.

    Each KB linked to the relevant nodes in the conversation flow.

    What Changed with Knowledge Base 2.0

    Retell shipped a major KB upgrade in mid-2025. The improvements matter for your deployments:

  • 50% improvement in answer accuracy across the board
  • Retrieval now focuses on the most relevant parts of your conversation, not the full transcript
  • Better handling of structured data like phone numbers and addresses
  • Improved ranking that eliminates irrelevant content outscoring correct answers

If you built your KB before this update, test it again. Your retrieval accuracy has improved without you changing a thing.

    Testing Your Knowledge Base

    Before going live, verify your KB answers correctly. Retell's playground shows which chunks were retrieved per turn. Run through your common questions and check:

  • Does your agent pull the right chunks?
  • Are irrelevant chunks outranking relevant ones?
  • Does your similarity threshold need adjusting?

Then run your full test suite with batch simulation testing. Test every knowledge-dependent question your callers will ask.
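A batch test reduces to a simple pattern: for each common question, assert that the expected content appears somewhere in the retrieved chunks. The `retrieve()` below is a canned stand-in for whatever retrieval call your test harness actually exposes.

```python
# Sketch of a batch KB test harness. retrieve() is a placeholder that
# would hit your agent's knowledge base in a real test run.

def retrieve(question: str) -> list[str]:
    canned = {
        "what is your refund policy": ["Refunds within 30 days of purchase."],
        "what are your opening hours": ["Open 9am-5pm, Monday to Friday."],
    }
    return canned.get(question, [])

TEST_CASES = [
    ("what is your refund policy", "30 days"),
    ("what are your opening hours", "9am-5pm"),
]

def run_kb_tests() -> list[str]:
    """Return the questions whose retrieved chunks miss the expected text."""
    failures = []
    for question, expected in TEST_CASES:
        chunks = retrieve(question)
        if not any(expected in chunk for chunk in chunks):
            failures.append(question)
    return failures

failures = run_kb_tests()  # an empty list means every question retrieved correctly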

    Your documents should answer calls. We make that happen.

    Book a Strategy Call | See the Platform

    Frequently Asked Questions

    What file types does Retell's knowledge base support?

    Retell accepts PDF, DOCX, ODT, RTF, EPUB, CSV, TSV, XLS, XLSX, Markdown, TXT, HTML, and XML. For best results, use Markdown — Retell's chunking engine handles structured headings more accurately than plain text.

    Spreadsheets are limited to 1,000 rows and 50 columns per file.
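A pre-flight check against those caps is straightforward. A sketch, assuming the limits stated above (1,000 rows, 50 columns):

```python
# Sketch: check a CSV against the stated spreadsheet caps before upload.
import csv
import io

MAX_ROWS, MAX_COLS = 1000, 50

def within_limits(csv_text: str) -> bool:
    """True if the CSV fits inside the row and column caps."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return len(rows) <= MAX_ROWS and all(len(row) <= MAX_COLS for row in rows)

ok = within_limits("sku,price\nA1,19.99\nA2,24.99\n")
```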

    How does retrieval latency affect voice agent performance?

    Each knowledge base lookup adds roughly 100ms to your response time. That's negligible for a single retrieval but compounds if your agent searches across multiple KBs on every turn.

    Use node-level KB assignment to narrow the search scope and keep your total retrieval under 150ms.

    Can I use multiple knowledge bases with one agent?

    Yes. Your workspace gets 10 free knowledge bases. An agent can have multiple KBs linked at once — all are searched per retrieval.

    For better performance, assign specific KBs to specific nodes in your conversation flow. This reduces noise and improves both speed and accuracy.

    How do I know if my knowledge base is working correctly?

    Use Retell's test playground to see which chunks your agent retrieves per turn. Check that your agent pulls the right content for common questions.

    Adjust your similarity threshold (higher for precision, lower for coverage) and chunk count (3 is a good default, increase for multi-part answers). Run batch simulation tests before deploying to production.

    Leonardo Garcia-Curtis

    Founder & CEO at Waboom AI. Building voice AI agents that convert.
