Use Case 1: Debug RAG Retrieval Failures

What It Is

RAG (Retrieval-Augmented Generation) systems rely on semantic search to find relevant document chunks. When a user asks a question, the system:

  1. Embeds the user's query

  2. Searches for similar document embeddings

  3. Retrieves the top-k most similar chunks

  4. Passes them to the LLM for answer generation
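Steps 2–3 reduce to a nearest-neighbor search over embedding vectors, usually by cosine similarity. A minimal sketch with toy 3-dimensional vectors (a real embedding model produces hundreds of dimensions):

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity to the query; return top-k indices and scores."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                        # cosine similarity per document
    order = np.argsort(scores)[::-1][:k]  # highest similarity first
    return order, scores[order]

# Toy 3-d "embeddings" — illustrative only
docs = np.array([[0.9, 0.1, 0.0],   # pricing
                 [0.8, 0.2, 0.1],   # cancellation
                 [0.0, 0.1, 0.9]])  # shipping
query = np.array([0.85, 0.15, 0.05])
idx, scores = top_k(query, docs)
```

Note how the pricing and cancellation vectors point in nearly the same direction: the query lands between them, and a tiny score difference decides which one wins. That is exactly the failure mode described below.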

The Problem: When retrieval fails—returning irrelevant or incorrect documents—you have no visibility into why. You can't see which chunks are being retrieved, how close they are in embedding space, or why an irrelevant chunk scored higher than the correct one.


How It Solves the Problem

VectorBoard visualizes your entire document embedding space, making retrieval behavior visible and debuggable:

  1. Visualize Semantic Overlap

    • See where your document chunks actually cluster in the embedding space

    • Identify if similar topics (e.g., "pricing" and "cancellation") are too close together

    • Spot isolated chunks that never get retrieved

  2. Test Queries Visually

    • Enter a problematic query → see which chunks are actually retrieved

    • Compare the query embedding position vs. document embeddings

    • Identify why irrelevant chunks score highly

  3. Compare Embedding Models

    • Test different models side-by-side on your documents

    • See which model clusters your categories best

    • Pick the one that separates ambiguous docs properly

Time Saved: Hours of blind debugging → 10 minutes of visual analysis


Example Scenario

The Situation

You've built a customer support RAG system. Users are complaining that questions like "How do I cancel my subscription?" come back with pricing details instead of cancellation steps.

But you can't debug why the cancellation doc wasn't retrieved. You spend hours:

  • Adding logging to print similarity scores

  • Manually checking each document's embedding

  • Still confused about why pricing beats cancellation

The Root Cause (Revealed by VectorBoard)

After pushing your documents to VectorBoard and visualizing:

  1. You see the problem immediately:

    • The "cancellation" document mentions "$29" and "subscription"

    • The "pricing" document mentions "$29/month" and "subscription"

    • Both are very close in embedding space (they cluster together)

  2. Why it happens:

    • The embedding model treats both as semantically similar

    • When a user queries "cancel subscription", both documents are near the query

    • The pricing doc might score slightly higher due to more context

  3. The fix:

    • Add more distinctive context to cancellation doc: "Cancellation process: Step-by-step guide"

    • Or try a different embedding model that better separates billing operations from pricing

    • Re-upload to VectorBoard → verify cancellation docs now cluster separately ✓

Result: 10 minutes of visual debugging vs. hours of blind troubleshooting


How to Use

Prerequisites

  • VectorBoard running:

    • Local: docker compose up -d

    • NodeOps: Deploy from NodeOps marketplace (you'll get a deployment URL)

  • Python 3.10+ with httpx and sentence-transformers installed

  • Your RAG documents (or a representative sample)

Step-by-Step Instructions

Step 1: Start VectorBoard
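For a local setup, the Compose command from the prerequisites brings everything up:

```shell
docker compose up -d
```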

This starts VectorBoard with all services accessible through port 8501:

  • Dashboard UI: http://localhost:8501

  • API endpoints: http://localhost:8501/api/*

Note: If deploying via NodeOps, replace localhost:8501 with your NodeOps deployment URL (e.g., https://your-deployment.nodeops.network).

Verify it's running:
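A quick check with curl — if the dashboard is up, this should print 200:

```shell
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8501
```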

Step 2: Install Dependencies

Important: Use the same embedding model you use in production for accurate visualization.
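The prerequisites call for httpx and sentence-transformers; a standard install:

```shell
pip install httpx sentence-transformers
```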

Step 3: Prepare Your Documents

Create a Python script debug_rag.py:
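A sketch of what debug_rag.py can look like. The upload route (`/api/collections/my_rag_docs`) and the request body shape are assumptions, not a documented schema — check the VectorBoard API reference for the exact route; the sample documents and the all-MiniLM-L6-v2 model are placeholders:

```python
"""Embed support docs and push them to VectorBoard for inspection."""
import sys

# Representative sample of your RAG corpus, with metadata for color-coding
DOCS = [
    {"text": "Our plans start at $29/month per subscription seat.",
     "metadata": {"category": "pricing"}},
    {"text": "Cancellation process: step-by-step guide to ending a subscription.",
     "metadata": {"category": "cancellation"}},
]

def build_items(texts, vectors, metadatas):
    """Pair each chunk with its embedding and metadata for upload."""
    return [
        {"text": t, "embedding": v, "metadata": m}
        for t, v, m in zip(texts, vectors, metadatas)
    ]

def main(base_url="http://localhost:8501"):
    import httpx
    from sentence_transformers import SentenceTransformer

    # Use the same model as your production pipeline (see the note above);
    # all-MiniLM-L6-v2 is only a placeholder.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    texts = [d["text"] for d in DOCS]
    vectors = model.encode(texts).tolist()
    items = build_items(texts, vectors, [d["metadata"] for d in DOCS])

    # Hypothetical endpoint — replace with the real upload route from the API docs
    resp = httpx.post(f"{base_url}/api/collections/my_rag_docs",
                      json={"items": items}, timeout=60)
    resp.raise_for_status()
    print(f"Pushed {len(items)} chunks to {base_url}")

if __name__ == "__main__":
    try:
        main(sys.argv[1] if len(sys.argv) > 1 else "http://localhost:8501")
    except Exception as exc:  # server down or model unavailable — report, don't crash
        print(f"Upload failed: {exc}", file=sys.stderr)
```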

Step 4: Run the Script
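Run it against your VectorBoard instance (local or NodeOps):

```shell
python debug_rag.py
```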

Step 5: Visualize in VectorBoard

  1. Open the dashboard:

    • Local: http://localhost:8501

    • NodeOps: https://your-deployment.nodeops.network

  2. Select your collection:

    • Use the dropdown to select my_rag_docs (or your collection name)

  3. Choose visualization method:

    • UMAP (recommended) - Best for preserving local structure

    • t-SNE - Good for seeing clusters, slower on large datasets

    • PCA - Fast, linear projection

  4. Color-code by category:

    • Select category from the metadata color dropdown

    • You'll immediately see if categories overlap or separate

  5. Test problematic queries:

    • Use the Query Playground section

    • Enter your problematic query (e.g., "cancel subscription")

    • See which documents are closest

    • Identify why wrong docs are retrieved

  6. Analyze the results:

    • Tight, separate clusters → Good separation, model works well

    • Overlapping clusters → Documents too similar, need better context or different model

    • Isolated points → Documents that might never get retrieved

Step 6: Fix and Verify

After making changes (adding context, trying a different model):

  1. Re-run your script with updated documents

  2. Use a different collection name: my_rag_docs_v2

  3. Compare both collections in VectorBoard

  4. Verify the fix worked

Alternative: Load from Files

If your documents are in files:
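A small loader you can drop into the script above in place of a hard-coded document list — it walks a folder and reads every `.txt`/`.md` file into `{id, text}` records (the record shape matches the sketch earlier; adjust as needed):

```python
from pathlib import Path

def load_documents(folder, extensions=(".txt", ".md")):
    """Read every matching file under `folder` into {id, text} records."""
    return [
        {"id": path.name, "text": path.read_text(encoding="utf-8")}
        for path in sorted(Path(folder).rglob("*"))
        if path.suffix in extensions
    ]
```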

Alternative: Export from Existing Vector DB

If you already have embeddings in Pinecone/Qdrant/Weaviate:
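A Qdrant example, assuming the extra `qdrant-client` dependency (not in the prerequisites); Pinecone and Weaviate clients have analogous fetch/scroll APIs. The `{id, embedding, metadata}` item shape is again an assumption about VectorBoard's upload format:

```python
def points_to_items(points):
    """Normalize exported points into {id, embedding, metadata} dicts."""
    return [
        {"id": str(p["id"]),
         "embedding": list(p["vector"]),
         "metadata": dict(p.get("payload") or {})}
        for p in points
    ]

def export_from_qdrant(collection, url="http://localhost:6333", batch=1000):
    """Page through a Qdrant collection, pulling vectors and payloads."""
    from qdrant_client import QdrantClient
    client = QdrantClient(url=url)
    items, offset = [], None
    while True:
        points, offset = client.scroll(
            collection_name=collection, limit=batch,
            offset=offset, with_payload=True, with_vectors=True,
        )
        items += points_to_items(
            {"id": p.id, "vector": p.vector, "payload": p.payload} for p in points
        )
        if offset is None:
            break
    return items
```

The resulting items can then be pushed to VectorBoard the same way as freshly embedded documents.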
