# Use Case 1: Debug RAG Retrieval Failures

## What It Is

RAG (Retrieval-Augmented Generation) systems rely on semantic search to find relevant document chunks. When a user asks a question, the system:

1. Embeds the user's query
2. Searches for similar document embeddings
3. Retrieves the top-k most similar chunks
4. Passes them to the LLM for answer generation
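Steps 1–3 above boil down to cosine similarity over embedding vectors. A minimal sketch of the retrieval step, using toy 3-d vectors as stand-ins for real model output (a real system would embed with your production model and use a vector index):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve_top_k(query_vec, doc_vecs, k=2):
    """Indices of the k document vectors most similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)[:k]

# Toy 3-d vectors standing in for real embedding-model output
docs = [
    np.array([1.0, 0.1, 0.0]),   # "pricing" chunk
    np.array([0.9, 0.2, 0.1]),   # "cancellation" chunk
    np.array([0.0, 0.0, 1.0]),   # unrelated chunk
]
query = np.array([1.0, 0.15, 0.05])
print(retrieve_top_k(query, docs, k=2))  # → [0, 1]
```

Note that the "pricing" and "cancellation" vectors here are deliberately close together: this is exactly the failure mode the rest of this page debugs.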

**The Problem:** When retrieval fails—returning irrelevant or incorrect documents—you have no visibility into *why*. You can't see which chunks are being retrieved, how close they are in embedding space, or why an irrelevant chunk scored higher than the correct one.

***

## How It Solves the Problem

VectorBoard visualizes your entire document embedding space, making retrieval behavior visible and debuggable:

1. **Visualize Semantic Overlap**
   * See where your document chunks actually cluster in the embedding space
   * Identify if similar topics (e.g., "pricing" and "cancellation") are too close together
   * Spot isolated chunks that never get retrieved
2. **Test Queries Visually**
   * Enter a problematic query → see which chunks are actually retrieved
   * Compare the query embedding position vs. document embeddings
   * Identify why irrelevant chunks score highly
3. **Compare Embedding Models**
   * Test different models side-by-side on your documents
   * See which model clusters your categories best
   * Pick the one that separates ambiguous docs properly
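"Which model clusters your categories best" can also be put on a number. One illustrative metric (an assumption for this sketch, not a built-in VectorBoard feature) is mean within-category similarity minus mean cross-category similarity, computed per model; the toy 2-d vectors below stand in for two models' outputs on the same four documents:

```python
from itertools import combinations

import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def separation_score(embeddings, labels):
    """Mean within-category similarity minus mean cross-category similarity.
    Higher is better: categories are tight internally and far apart."""
    within, cross = [], []
    for (i, a), (j, b) in combinations(enumerate(embeddings), 2):
        (within if labels[i] == labels[j] else cross).append(cosine(a, b))
    return float(np.mean(within) - np.mean(cross))

labels = ["billing", "billing", "pricing", "pricing"]
# Model A: the two categories overlap in embedding space
model_a = [np.array([1.0, 0.0]), np.array([0.9, 0.1]),
           np.array([0.8, 0.3]), np.array([0.7, 0.4])]
# Model B: the two categories separate cleanly
model_b = [np.array([1.0, 0.0]), np.array([0.9, 0.1]),
           np.array([0.1, 0.9]), np.array([0.0, 1.0])]

print(separation_score(model_a, labels) < separation_score(model_b, labels))  # → True
```

Running this per candidate model on your real documents gives a quick numeric tiebreaker to pair with the visual comparison.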

**Time Saved:** Hours of blind debugging → 10 minutes of visual analysis

***

## Example Scenario

### The Situation

You've built a customer support RAG system. Users are complaining:

```
User asks: "How do I cancel my subscription?"
RAG returns: "Our premium plan costs $29/month..." ❌

You know the cancellation doc exists:
"To cancel: Go to Account Settings > Billing > Cancel Subscription"
```

But you can't debug why the cancellation doc wasn't retrieved. You spend hours:

* Adding logging to print similarity scores
* Manually checking each document's embedding
* Still confused about why pricing beats cancellation
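That manual logging pass typically looks like the sketch below. The `embed` stub here is a deterministic toy stand-in for your real model (e.g. `SentenceTransformer(...).encode`), so the snippet runs without a model download; swap it out in practice:

```python
import hashlib

import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in for your real embedding model.
    Deterministic toy vectors so this snippet runs standalone."""
    seed = int(hashlib.md5(text.encode()).hexdigest(), 16) % 2**32
    v = np.random.default_rng(seed).standard_normal(8)
    return v / np.linalg.norm(v)

query = "How do I cancel my subscription?"
docs = {
    "cancellation": "To cancel: Go to Account Settings > Billing > Cancel Subscription",
    "pricing": "Our premium plan costs $29/month...",
}

q = embed(query)
for name, text in docs.items():
    score = float(np.dot(q, embed(text)))  # unit vectors, so dot == cosine
    print(f"{name:12s} cosine = {score:+.3f}")
```

You get two numbers, but no picture of *where* the documents sit relative to each other and the query, which is the visibility gap VectorBoard fills.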

### The Root Cause (Revealed by VectorBoard)

After pushing your documents to VectorBoard and visualizing:

1. **You see the problem immediately:**
   * The "cancellation" document mentions "$29" and "subscription"
   * The "pricing" document mentions "$29/month" and "subscription"
   * Both are very close in embedding space (they cluster together)
2. **Why it happens:**
   * The embedding model treats both as semantically similar
   * When a user queries "cancel subscription", both documents are near the query
   * The pricing doc can edge ahead because both docs mention "subscription", and small phrasing differences are enough to flip the ranking
3. **The fix:**
   * Add more distinctive context to the cancellation doc: "Cancellation process: Step-by-step guide"
   * Or try a different embedding model that better separates billing operations from pricing
   * Re-upload to VectorBoard → verify cancellation docs now cluster separately ✓

**Result:** 10 minutes of visual debugging vs. hours of blind troubleshooting

***

## How to Use

### Prerequisites

* VectorBoard running:
  * **Local:** `docker compose up -d`
  * **NodeOps:** Deploy from NodeOps marketplace (you'll get a deployment URL)
* Python 3.10+ with `httpx` and `sentence-transformers` installed
* Your RAG documents (or a representative sample)

### Step-by-Step Instructions

#### Step 1: Start VectorBoard

```bash
# From the project root
docker compose up -d
```

This starts VectorBoard with all services accessible through port 8501:

* Dashboard UI: `http://localhost:8501`
* API endpoints: `http://localhost:8501/api/*`

**Note:** If deploying via NodeOps, replace `localhost:8501` with your NodeOps deployment URL (e.g., `https://your-deployment.nodeops.network`).

Verify it's running:

```bash
curl http://localhost:8501/health
# Or for NodeOps: curl https://your-deployment.nodeops.network/health
```

#### Step 2: Install Dependencies

```bash
pip install httpx sentence-transformers
```

**Important:** Use the **same embedding model** you use in production for accurate visualization.

#### Step 3: Prepare Your Documents

Create a Python script `debug_rag.py`:

```python
import httpx
from sentence_transformers import SentenceTransformer

# 1. Initialize your embedding model (SAME as your production RAG)
model = SentenceTransformer('all-MiniLM-L6-v2')  # Replace with your model

# 2. Connect to VectorBoard
# For local deployment:
VECTORBOARD_URL = "http://localhost:8501"
# For NodeOps deployment, use your deployment URL:
# VECTORBOARD_URL = "https://your-deployment.nodeops.network"
client = httpx.Client(timeout=30.0)

# 3. Load your documents
documents = [
    {
        "text": "To cancel your subscription, go to Account Settings > Billing > Cancel Subscription. You will receive a confirmation email.",
        "category": "billing"
    },
    {
        "text": "Our Premium plan costs $29/month and includes unlimited projects, priority support, and advanced analytics.",
        "category": "pricing"
    },
    {
        "text": "Refund policy: Full refund within 30 days of purchase, no questions asked. Contact support@example.com",
        "category": "billing"
    },
    # ... add all your RAG documents here
]

# 4. Push to VectorBoard
print(f"Pushing {len(documents)} documents...")
for i, doc in enumerate(documents):
    # Generate embedding
    embedding = model.encode(doc["text"]).tolist()
    
    # Push to VectorBoard (API accessible through /api path)
    response = client.post(
        f"{VECTORBOARD_URL}/api/embeddings",
        json={
            "id": f"doc_{i}",
            "vector": embedding,
            "text": doc["text"],
            "metadata": {"category": doc.get("category", "unknown")},
            "collection": "my_rag_docs"  # Your collection name
        }
    )
    
    if response.status_code == 200:
        print(f"✅ {i+1}/{len(documents)}: {doc.get('category', 'unknown')}")
    else:
        print(f"❌ Failed {i+1}: {response.status_code}")

print(f"\n🎉 Done! View at {VECTORBOARD_URL}")
```

#### Step 4: Run the Script

```bash
python debug_rag.py
```

#### Step 5: Visualize in VectorBoard

1. **Open the dashboard:**
   * Local: <http://localhost:8501>
   * NodeOps: <https://your-deployment.nodeops.network>
2. **Select your collection:**
   * Use the dropdown to select `my_rag_docs` (or your collection name)
3. **Choose visualization method:**
   * **UMAP** (recommended) - Best for preserving local structure
   * **t-SNE** - Good for seeing clusters, slower on large datasets
   * **PCA** - Fast, linear projection
4. **Color-code by category:**
   * Select `category` from the metadata color dropdown
   * You'll immediately see if categories overlap or separate
5. **Test problematic queries:**
   * Use the Query Playground section
   * Enter your problematic query (e.g., "cancel subscription")
   * See which documents are closest
   * Identify why wrong docs are retrieved
6. **Analyze the results:**
   * **Tight, separate clusters** → Good separation, model works well
   * **Overlapping clusters** → Documents too similar, need better context or different model
   * **Isolated points** → Documents that might never get retrieved

#### Step 6: Fix and Verify

After making changes (adding context, trying a different model):

1. Re-run your script with updated documents
2. Use a different collection name: `my_rag_docs_v2`
3. Compare both collections in VectorBoard
4. Verify the fix worked
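The re-upload in steps 1–2 is just the Step 3 script pointed at a new collection. A sketch, where the rewritten doc texts and the `build_payload` helper are illustrative (the payload shape matches the `/api/embeddings` call from Step 3):

```python
# Hypothetical rewrites: each doc now leads with distinctive, category-specific context
updated_documents = [
    {"text": "Cancellation process: step-by-step guide. To cancel your subscription, "
             "go to Account Settings > Billing > Cancel Subscription.",
     "category": "billing"},
    {"text": "Pricing overview: Our Premium plan costs $29/month and includes "
             "unlimited projects, priority support, and advanced analytics.",
     "category": "pricing"},
]

def build_payload(i, doc, vector, collection="my_rag_docs_v2"):
    """Same payload shape as the /api/embeddings call in Step 3, new collection."""
    return {
        "id": f"doc_{i}",
        "vector": vector,
        "text": doc["text"],
        "metadata": {"category": doc["category"]},
        "collection": collection,
    }

# Then, exactly as in Step 3 (embedding with the SAME production model):
#   client = httpx.Client(timeout=30.0)
#   for i, doc in enumerate(updated_documents):
#       client.post(f"{VECTORBOARD_URL}/api/embeddings",
#                   json=build_payload(i, doc, model.encode(doc["text"]).tolist()))
print(build_payload(0, updated_documents[0], [0.0])["collection"])  # → my_rag_docs_v2
```

Keeping both collections side by side lets you confirm visually that the billing and pricing clusters now separate before you ship the change.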

### Alternative: Load from Files

If your documents are in files:

```python
from pathlib import Path

# Load from text files
documents = []
for file in Path("./my_docs/").glob("*.txt"):
    with open(file) as f:
        documents.append({
            "text": f.read(),
            "category": file.stem,  # Use filename as category
            "source": str(file)
        })
```

### Alternative: Export from Existing Vector DB

If you already have embeddings in Pinecone/Qdrant/Weaviate:

```python
# Example: Export from your existing vector DB
# This is pseudocode - adapt to your vector DB's API

existing_db = YourVectorDB()  # Your existing connection
all_vectors = existing_db.get_all_vectors()

# Use your VectorBoard URL (local or NodeOps)
VECTORBOARD_URL = "http://localhost:8501"  # Or your NodeOps URL
client = httpx.Client(timeout=30.0)
for vector_data in all_vectors:
    client.post(
        f"{VECTORBOARD_URL}/api/embeddings",
        json={
            "id": vector_data['id'],
            "vector": vector_data['embedding'],
            "text": vector_data['metadata'].get('text', ''),
            "metadata": vector_data['metadata'],
            "collection": "exported_rag_docs"
        }
    )
```
