Use Case 1: Debug RAG Retrieval Failures

What It Is

RAG (Retrieval-Augmented Generation) systems rely on semantic search to find relevant document chunks. When a user asks a question, the system:

  1. Embeds the user's query

  2. Searches for similar document embeddings

  3. Retrieves the top-k most similar chunks

  4. Passes them to the LLM for answer generation
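Steps 2–3 reduce to a nearest-neighbor search over embedding vectors, usually by cosine similarity. A minimal sketch with toy 3-dimensional vectors (a real embedding model produces hundreds of dimensions):

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity to the query; return top-k indices and scores."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                        # cosine similarity per document
    order = np.argsort(scores)[::-1][:k]  # highest similarity first
    return order, scores[order]

# Toy 3-d "embeddings" — illustrative only
docs = np.array([[0.9, 0.1, 0.0],   # pricing
                 [0.8, 0.2, 0.1],   # cancellation
                 [0.0, 0.1, 0.9]])  # shipping
query = np.array([0.85, 0.15, 0.05])
idx, scores = top_k(query, docs)
```

Note how the pricing and cancellation vectors point in nearly the same direction: the query lands between them, and a tiny score difference decides which one wins. That is exactly the failure mode described below.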

The Problem: When retrieval fails—returning irrelevant or incorrect documents—you have no visibility into why. You can't see which chunks are being retrieved, how close they are in embedding space, or why an irrelevant chunk scored higher than the correct one.


How It Solves the Problem

VectorBoard visualizes your entire document embedding space, making retrieval behavior visible and debuggable:

  1. Visualize Semantic Overlap

    • See where your document chunks actually cluster in the embedding space

    • Identify if similar topics (e.g., "pricing" and "cancellation") are too close together

    • Spot isolated chunks that never get retrieved

  2. Test Queries Visually

    • Enter a problematic query → see which chunks are actually retrieved

    • Compare the query embedding position vs. document embeddings

    • Identify why irrelevant chunks score highly

  3. Compare Embedding Models

    • Test different models side-by-side on your documents

    • See which model clusters your categories best

    • Pick the one that separates ambiguous docs properly

Time Saved: Hours of blind debugging → 10 minutes of visual analysis


Example Scenario

The Situation

You've built a customer support RAG system. Users are complaining that questions like "How do I cancel my subscription?" come back with pricing details instead of cancellation steps.

But you can't debug why the cancellation doc wasn't retrieved. You spend hours:

  • Adding logging to print similarity scores

  • Manually checking each document's embedding

  • Still confused about why pricing beats cancellation

The Root Cause (Revealed by VectorBoard)

After pushing your documents to VectorBoard and visualizing:

  1. You see the problem immediately:

    • The "cancellation" document mentions "$29" and "subscription"

    • The "pricing" document mentions "$29/month" and "subscription"

    • Both are very close in embedding space (they cluster together)

  2. Why it happens:

    • The embedding model treats both as semantically similar

    • When a user queries "cancel subscription", both documents are near the query

    • The pricing doc might score slightly higher due to more context

  3. The fix:

    • Add more distinctive context to cancellation doc: "Cancellation process: Step-by-step guide"

    • Or try a different embedding model that better separates billing operations from pricing

    • Re-upload to VectorBoard → verify cancellation docs now cluster separately ✓

Result: 10 minutes of visual debugging vs. hours of blind troubleshooting


How to Use

Prerequisites

  • VectorBoard running:

    • Local: docker compose up -d

    • NodeOps: Deploy from NodeOps marketplace (you'll get a deployment URL)

  • Python 3.10+ with httpx and sentence-transformers installed

  • Your RAG documents (or a representative sample)

Step-by-Step Instructions

Step 1: Start VectorBoard
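For a local setup, the Compose command from the prerequisites brings everything up:

```shell
docker compose up -d
```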

This starts VectorBoard with all services accessible through port 8501:

  • Dashboard UI: http://localhost:8501

  • API endpoints: http://localhost:8501/api/*

Note: If deploying via NodeOps, replace localhost:8501 with your NodeOps deployment URL (e.g., https://your-deployment.nodeops.network).

Verify it's running:
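A quick check with curl — if the dashboard is up, this should print 200:

```shell
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8501
```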

Step 2: Install Dependencies

Important: Use the same embedding model you use in production for accurate visualization.
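The prerequisites call for httpx and sentence-transformers; a standard install:

```shell
pip install httpx sentence-transformers
```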

Step 3: Prepare Your Documents

Create a Python script debug_rag.py:
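A sketch of what debug_rag.py can look like. The upload route (`/api/collections/my_rag_docs`) and the request body shape are assumptions, not a documented schema — check the VectorBoard API reference for the exact route; the sample documents and the all-MiniLM-L6-v2 model are placeholders:

```python
"""Embed support docs and push them to VectorBoard for inspection."""
import sys

# Representative sample of your RAG corpus, with metadata for color-coding
DOCS = [
    {"text": "Our plans start at $29/month per subscription seat.",
     "metadata": {"category": "pricing"}},
    {"text": "Cancellation process: step-by-step guide to ending a subscription.",
     "metadata": {"category": "cancellation"}},
]

def build_items(texts, vectors, metadatas):
    """Pair each chunk with its embedding and metadata for upload."""
    return [
        {"text": t, "embedding": v, "metadata": m}
        for t, v, m in zip(texts, vectors, metadatas)
    ]

def main(base_url="http://localhost:8501"):
    import httpx
    from sentence_transformers import SentenceTransformer

    # Use the same model as your production pipeline (see the note above);
    # all-MiniLM-L6-v2 is only a placeholder.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    texts = [d["text"] for d in DOCS]
    vectors = model.encode(texts).tolist()
    items = build_items(texts, vectors, [d["metadata"] for d in DOCS])

    # Hypothetical endpoint — replace with the real upload route from the API docs
    resp = httpx.post(f"{base_url}/api/collections/my_rag_docs",
                      json={"items": items}, timeout=60)
    resp.raise_for_status()
    print(f"Pushed {len(items)} chunks to {base_url}")

if __name__ == "__main__":
    try:
        main(sys.argv[1] if len(sys.argv) > 1 else "http://localhost:8501")
    except Exception as exc:  # server down or model unavailable — report, don't crash
        print(f"Upload failed: {exc}", file=sys.stderr)
```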

Step 4: Run the Script
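Run it against your VectorBoard instance (local or NodeOps):

```shell
python debug_rag.py
```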

Step 5: Visualize in VectorBoard

  1. Open the dashboard:

    • Local: http://localhost:8501

    • NodeOps: https://your-deployment.nodeops.network

  2. Select your collection:

    • Use the dropdown to select my_rag_docs (or your collection name)

  3. Choose visualization method:

    • UMAP (recommended) - Best for preserving local structure

    • t-SNE - Good for seeing clusters, slower on large datasets

    • PCA - Fast, linear projection

  4. Color-code by category:

    • Select category from the metadata color dropdown

    • You'll immediately see if categories overlap or separate

  5. Test problematic queries:

    • Use the Query Playground section

    • Enter your problematic query (e.g., "cancel subscription")

    • See which documents are closest

    • Identify why wrong docs are retrieved

  6. Analyze the results:

    • Tight, separate clusters → Good separation, model works well

    • Overlapping clusters → Documents too similar, need better context or different model

    • Isolated points → Documents that might never get retrieved

Step 6: Fix and Verify

After making changes (adding context, trying a different model):

  1. Re-run your script with updated documents

  2. Use a different collection name: my_rag_docs_v2

  3. Compare both collections in VectorBoard

  4. Verify the fix worked

Alternative: Load from Files

If your documents are in files:
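A small loader you can drop into the script above in place of a hard-coded document list — it walks a folder and reads every `.txt`/`.md` file into `{id, text}` records (the record shape matches the sketch earlier; adjust as needed):

```python
from pathlib import Path

def load_documents(folder, extensions=(".txt", ".md")):
    """Read every matching file under `folder` into {id, text} records."""
    return [
        {"id": path.name, "text": path.read_text(encoding="utf-8")}
        for path in sorted(Path(folder).rglob("*"))
        if path.suffix in extensions
    ]
```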

Alternative: Export from Existing Vector DB

If you already have embeddings in Pinecone/Qdrant/Weaviate:
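A Qdrant example, assuming the extra `qdrant-client` dependency (not in the prerequisites); Pinecone and Weaviate clients have analogous fetch/scroll APIs. The `{id, embedding, metadata}` item shape is again an assumption about VectorBoard's upload format:

```python
def points_to_items(points):
    """Normalize exported points into {id, embedding, metadata} dicts."""
    return [
        {"id": str(p["id"]),
         "embedding": list(p["vector"]),
         "metadata": dict(p.get("payload") or {})}
        for p in points
    ]

def export_from_qdrant(collection, url="http://localhost:6333", batch=1000):
    """Page through a Qdrant collection, pulling vectors and payloads."""
    from qdrant_client import QdrantClient
    client = QdrantClient(url=url)
    items, offset = [], None
    while True:
        points, offset = client.scroll(
            collection_name=collection, limit=batch,
            offset=offset, with_payload=True, with_vectors=True,
        )
        items += points_to_items(
            {"id": p.id, "vector": p.vector, "payload": p.payload} for p in points
        )
        if offset is None:
            break
    return items
```

The resulting items can then be pushed to VectorBoard the same way as freshly embedded documents.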
