# Use Case 3: Choose the Right Embedding Model

## What It Is

There are dozens of embedding models available:

* **OpenAI** (`text-embedding-3-small` at $0.02 per 1M tokens, `text-embedding-3-large` at $0.13 per 1M tokens) - Paid API
* **Sentence-transformers** (`all-MiniLM-L6-v2`, `BGE-small-en-v1.5`) - Free, local
* **Cohere** (`embed-english-v3.0`) - Paid API
* **And many more...**

**The Problem:** General benchmarks don't tell you which model works best for *your* specific data. You might:

* Pick the most expensive model hoping it's best
* Discover it performs poorly for your domain in production
* Waste weeks A/B testing different models with real users

***

## How It Solves the Problem

VectorBoard lets you visually compare multiple embedding models side-by-side:

1. **A/B Test Embedding Models**
   * Generate embeddings for the same documents using different models
   * Upload each to separate collections
   * Compare clustering quality visually
2. **Visual Quality Assessment**
   * **Good embeddings:** Clear, tight clusters by category; smooth transitions; few outliers
   * **Poor embeddings:** Random scatter; categories mixed together; many isolated points
3. **Cost vs. Quality Trade-off**
   * See if free models match paid model quality
   * Make data-driven decisions before production deployment

**Outcome:** Identify the right model for your data in about 30 minutes instead of weeks of A/B testing
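Visual inspection can be backed by a quick quantitative check. The sketch below uses scikit-learn's `silhouette_score` (an assumed extra dependency, not required by VectorBoard) to score how well a model's embeddings cluster by category; values closer to 1.0 mean tighter, better-separated clusters:

```python
import numpy as np
from sklearn.metrics import silhouette_score

# Toy embeddings: two categories that separate cleanly in 2-D.
# In practice, use the vectors produced by each embedding model
# and the category label attached to each document.
embeddings = np.array([
    [0.9, 0.1], [0.8, 0.2], [0.95, 0.05],   # category "a"
    [0.1, 0.9], [0.2, 0.8], [0.05, 0.95],   # category "b"
])
labels = ["a", "a", "a", "b", "b", "b"]

score = silhouette_score(embeddings, labels, metric="cosine")
print(f"silhouette score: {score:.2f}")  # closer to 1.0 = cleaner clusters
```

Running this per model, on the same documents and labels, gives a number to put next to each screenshot.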

***

## Example Scenario

### The Situation

You're building an e-commerce product recommendation system:

```
10,000 products across 50 categories:
- Running shoes
- Dress shoes
- Athletic wear
- Formal wear
- Electronics
- etc.
```

**You pick OpenAI** (`text-embedding-3-small`) based on general benchmarks:

* Cost: $1,300/month
* Benchmarks: "Best on general tasks"

**In production:**

* "Running shoes" recommendations include "dress shoes" ❌
* Users complain about irrelevant recommendations
* You realize the model doesn't understand your domain well

### The Solution (With VectorBoard)

1. **Test 3 models on 100 representative products:**
   * OpenAI `text-embedding-3-small` (paid)
   * BGE-small-en-v1.5 (free)
   * BGE-base-en-v1.5 (free)
2. **Visualize in VectorBoard:**
   * Upload each to separate collections
   * Color-code by product category
3. **Compare results:**
   * **OpenAI:** Categories mixed together (running shoes with dress shoes)
   * **BGE-small:** Better separation, but some overlap
   * **BGE-base:** Perfect clustering - athletic wear clearly separated from formal wear ✓
4. **Decision:**
   * Choose BGE-base → FREE and better quality than OpenAI for your use case
   * Savings: $1,300/month → $0/month
   * Better recommendations → happier users

***

## How to Use

### Prerequisites

* VectorBoard running:
  * **Local:** `docker compose up -d`
  * **NodeOps:** Deploy from NodeOps marketplace (you'll get a deployment URL)
* 50-100 representative test documents from your domain
* Python 3.10+ with `httpx` and `sentence-transformers`
* Optional: OpenAI API key if testing OpenAI models

### Step-by-Step Instructions

#### Step 1: Prepare Test Documents

Select 50-100 representative documents from your actual data:

```python
# test_documents.py
test_docs = [
    {
        "text": "Machine learning algorithms learn patterns from data",
        "category": "ai"
    },
    {
        "text": "How to bake chocolate chip cookies with vanilla extract",
        "category": "cooking"
    },
    {
        "text": "Python programming tutorial for beginners",
        "category": "programming"
    },
    # ... add 50-100 more documents covering your categories
]
```

**Important:** Use documents that represent your actual use case. Include:

* Different categories you need to separate
* Edge cases that are confusing
* Ambiguous documents that should be distinct
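One way to build such a set is to sample evenly across categories from your existing data, so every category you need to separate is represented. A minimal sketch (the inline `records` list stands in for your real data, which you would load from your own export):

```python
import random
from collections import defaultdict

# In practice, load your real records (e.g. from a JSON export);
# this inline list just stands in for them.
records = [
    {"text": "Trail running shoes with grippy soles", "category": "running-shoes"},
    {"text": "Lightweight marathon racing shoes", "category": "running-shoes"},
    {"text": "Leather oxford dress shoes", "category": "dress-shoes"},
    {"text": "Patent leather formal shoes", "category": "dress-shoes"},
    {"text": "Moisture-wicking gym t-shirt", "category": "athletic-wear"},
]

# Group by category, then sample a few from each group.
by_category = defaultdict(list)
for rec in records:
    by_category[rec["category"]].append(rec)

random.seed(42)  # reproducible sample
test_docs = []
for recs in by_category.values():
    test_docs.extend(random.sample(recs, min(2, len(recs))))

print(f"{len(test_docs)} docs across {len(by_category)} categories")
```

Raise the per-category sample size until you land in the 50-100 document range.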

#### Step 2: Create Comparison Script

Create `compare_models.py`:

```python
import httpx
from sentence_transformers import SentenceTransformer

# Your test documents (50-100 representative samples)
test_docs = [
    {"text": "Your document 1", "category": "category1"},
    {"text": "Your document 2", "category": "category2"},
    # ... add more
]

# Models to test
models = {
    "minilm": SentenceTransformer('all-MiniLM-L6-v2'),
    "bge-small": SentenceTransformer('BAAI/bge-small-en-v1.5'),
    "bge-base": SentenceTransformer('BAAI/bge-base-en-v1.5'),
    # Add more models as needed
}

# Optional: OpenAI model
# from openai import OpenAI
# openai_client = OpenAI(api_key="your-key-here")

# For local deployment:
VECTORBOARD_URL = "http://localhost:8501"
# For NodeOps deployment, use your deployment URL:
# VECTORBOARD_URL = "https://your-deployment.nodeops.network"
client = httpx.Client(timeout=30.0)

# Test each model
for model_name, model in models.items():
    print(f"Testing {model_name}...")
    
    for i, doc in enumerate(test_docs):
        # Generate embedding
        embedding = model.encode(doc["text"]).tolist()
        
        # Push to separate collection per model
        response = client.post(
            f"{VECTORBOARD_URL}/api/embeddings",
            json={
                "id": f"{model_name}_doc_{i}",
                "vector": embedding,
                "text": doc["text"],
                "metadata": {
                    "category": doc["category"],
                    "model": model_name
                },
                "collection": f"model_{model_name}"  # Separate collection per model
            }
        )
        
        if response.status_code != 200:
            print(f"  ❌ Failed doc {i}: {response.status_code}")
    
    print(f"  ✅ {model_name} complete - Collection: model_{model_name}")

# Test OpenAI (if you have API key)
# if openai_client:
#     print("Testing OpenAI...")
#     for i, doc in enumerate(test_docs):
#         response = openai_client.embeddings.create(
#             model="text-embedding-3-small",
#             input=doc["text"]
#         )
#         embedding = response.data[0].embedding
#         
#         client.post(
#             f"{VECTORBOARD_URL}/api/embeddings",
#             json={
#                 "id": f"openai_doc_{i}",
#                 "vector": embedding,
#                 "text": doc["text"],
#                 "metadata": {"category": doc["category"], "model": "openai"},
#                 "collection": "model_openai"
#             }
#         )

print("\n🎉 All models uploaded!")
print(f"Compare in VectorBoard: {VECTORBOARD_URL}")
```

#### Step 3: Run the Comparison

```bash
python compare_models.py
```

**Note:** This may take several minutes depending on:

* Number of documents
* Number of models
* Model download time (first run)

#### Step 4: Compare in VectorBoard

1. **Open dashboard:**
   * Local: <http://localhost:8501>
   * NodeOps: <https://your-deployment.nodeops.network>
2. **Compare collections:**
   * Switch between collections: `model_minilm`, `model_bge-small`, `model_bge-base`
   * For each collection:
     * Color-code by `category`
     * Note clustering quality
3. **Quality indicators:**

   **✅ Good Model:**

   * Clear, tight clusters by category
   * Smooth transitions between related topics
   * Few outliers
   * Categories visually separated

   **❌ Poor Model:**

   * Random scatter, no clear clusters
   * Categories mixed together
   * Many isolated points
   * Can't distinguish between categories
4. **Make decision:**
   * **Quality:** Which clusters your categories best?
   * **Cost:** Free local models vs. paid APIs (e.g. OpenAI `text-embedding-3-small`: $0.02/1M tokens)
   * **Speed:** Local models avoid per-request network latency; API models add network calls and rate limits
5. **Document findings:**
   * Screenshot visualizations
   * Note which model works best for your domain
   * Update production code
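The cost side of the decision is simple arithmetic: tokens processed per month times the per-token rate. A quick sketch (the monthly token volume is an illustrative assumption; verify rates against current provider pricing pages):

```python
# Rough monthly embedding cost: tokens processed x price per 1M tokens.
def monthly_cost(tokens_per_month: int, price_per_1m_tokens: float) -> float:
    return tokens_per_month / 1_000_000 * price_per_1m_tokens

tokens = 10_000_000_000  # e.g. 10B tokens/month (assumed volume)
print(f"text-embedding-3-small: ${monthly_cost(tokens, 0.02):,.2f}/month")
print(f"text-embedding-3-large: ${monthly_cost(tokens, 0.13):,.2f}/month")
print(f"Local BGE:              ${monthly_cost(tokens, 0.00):,.2f}/month (compute aside)")
```

A free local model that clusters as well as (or better than) the paid one makes this line item disappear.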

#### Step 5: Production Deployment

After choosing your model:

1. Update your production code to use the winning model
2. Re-embed all documents with the chosen model
3. Deploy with confidence
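Re-embedding a large corpus goes much faster in batches than one document at a time. A sketch of the pattern (the `encode_fn` parameter is whatever your chosen model exposes, e.g. `SentenceTransformer.encode`, which accepts a list of texts; a stub encoder stands in here so the sketch runs without downloading a model):

```python
def reembed(texts, encode_fn, batch_size=64):
    """Encode texts in batches; yields (index, vector) pairs."""
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        vectors = encode_fn(batch)  # e.g. model.encode(batch).tolist()
        for offset, vec in enumerate(vectors):
            yield start + offset, vec

# Stub encoder so this runs standalone; swap in your real model, e.g.:
#   model = SentenceTransformer("BAAI/bge-base-en-v1.5")
#   reembed(texts, lambda batch: model.encode(batch).tolist())
def fake_encode(batch):
    return [[float(len(text))] for text in batch]

texts = [f"product description {i}" for i in range(10)]
results = list(reembed(texts, fake_encode, batch_size=4))
print(len(results))  # one vector per document
```

Pair each yielded `(index, vector)` with the original document's ID and metadata when writing back to your vector store.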

### Example: Comparing Free vs. Paid

Want to see if free models match OpenAI quality?

```python
# Quick comparison script
import httpx
from sentence_transformers import SentenceTransformer
from openai import OpenAI

test_docs = [
    {"text": "Doc 1", "category": "tech"},
    # ... add 50-100 representative documents
]

# Test MiniLM (free)
model = SentenceTransformer('all-MiniLM-L6-v2')
client = httpx.Client()

# Use your VectorBoard URL (local or NodeOps)
VECTORBOARD_URL = "http://localhost:8501"  # Or your NodeOps URL
for i, doc in enumerate(test_docs):
    embedding = model.encode(doc["text"]).tolist()
    client.post(
        f"{VECTORBOARD_URL}/api/embeddings",
        json={
            "id": f"minilm_{i}",
            "vector": embedding,
            "text": doc["text"],
            "metadata": {"category": doc["category"]},
            "collection": "comparison_minilm"
        }
    )

# Test OpenAI
openai_client = OpenAI(api_key="your-key")
for i, doc in enumerate(test_docs):
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",
        input=doc["text"]
    )
    embedding = response.data[0].embedding
    client.post(
        f"{VECTORBOARD_URL}/api/embeddings",
        json={
            "id": f"openai_{i}",
            "vector": embedding,
            "text": doc["text"],
            "metadata": {"category": doc["category"]},
            "collection": "comparison_openai"
        }
    )
```
