diff --git a/website/docs/use-cases/index.md b/website/docs/use-cases/index.md
new file mode 100644
index 000000000..250e11b74
--- /dev/null
+++ b/website/docs/use-cases/index.md
@@ -0,0 +1,59 @@
+---
+sidebar_position: 1
+---
+
+# Use Cases
+
+This section covers proposed use cases and scenarios for the vLLM Semantic Router. These features are currently **Work in Progress** and not yet implemented in the current version.
+
+:::warning Work in Progress
+The use cases described in this section are **proposed features** that are not yet implemented in the current version of the semantic router. They represent future capabilities that would enhance the router's functionality.
+:::
+
+## Proposed Enterprise Use Cases
+
+### [Multi-Provider Routing](multi-provider-routing.md) 🚧
+
+**Status: Work in Progress**
+
+Route queries to the best AI provider for each specific task - coding questions to Claude, business writing to GPT-4, research to Gemini, and sensitive data to internal models.
+
+**Key Scenarios:**
+- **Tech Startups**: Different teams get specialized AI assistance (engineering → Claude, marketing → GPT-4, legal → internal)
+- **Consulting Firms**: Route strategic projects to premium models, routine tasks to cost-effective ones
+- **Financial Services**: Balance external AI capabilities with regulatory compliance requirements
+
+**Proposed Benefits:**
+- Get the best AI for each specific task instead of one-size-fits-all
+- Cost optimization by using appropriate models for each query type
+- Automatic compliance routing for sensitive data
+- Leverage each provider's unique strengths
+
+### [Keyword-Based Routing](keyword-based-routing.md) 🚧
+
+**Status: Work in Progress**
+
+Implement transparent, deterministic routing rules for enterprise security and compliance. Route queries based on keywords and business policies with complete visibility into routing decisions.
+
+**Key Scenarios:**
+- **Financial Services**: Ensure confidential trading strategies never reach external AI providers
+- **Healthcare**: Route patient data to HIPAA-compliant models while using external AI for general medical knowledge
+- **Enterprise Search**: Route queries requiring web search to providers with search capabilities
+
+**Proposed Benefits:**
+- Complete data sovereignty and regulatory compliance
+- Transparent, auditable routing decisions
+- Sub-millisecond routing for deterministic cases
+- Enterprise policy enforcement with full visibility
+
+## Documentation Structure
+
+Each proposed use case includes:
+- **User Stories**: Real-world scenarios and challenges these features would solve
+- **Business Value**: Clear benefits for users and organizations
+- **Use Case Examples**: Specific scenarios with business needs and solutions
+- **Next Steps**: Guidance for planning and implementation
+
+## Contributing to Development
+
+These use cases represent proposed features that could be implemented in future versions. We welcome contributions to help implement these capabilities! Please see our [Contributing Guide](https://github.com/vllm-project/semantic-router/blob/main/CONTRIBUTING.md) for details on how to contribute to the development of these features.
diff --git a/website/docs/use-cases/keyword-based-routing.md b/website/docs/use-cases/keyword-based-routing.md
new file mode 100644
index 000000000..5efb76f4b
--- /dev/null
+++ b/website/docs/use-cases/keyword-based-routing.md
@@ -0,0 +1,218 @@
+---
+sidebar_position: 2
+---
+
+# Keyword-Based Routing
+
+:::warning Work in Progress
+This use case describes a **proposed feature** that is not yet implemented in the current version of the semantic router. This represents a future capability that would enhance the router's functionality.
+:::
+
+## The Enterprise Challenge
+
+Imagine you're the CTO of a financial services company.
Your team has built an AI-powered customer service system that routes queries to different AI models based on their content. However, you're facing critical challenges:
+
+- **Security Risk**: Customer queries about "internal account procedures" or "confidential trading strategies" are being sent to external AI providers, potentially exposing sensitive information
+- **Compliance Violations**: Regulatory requirements mandate that certain types of financial data must never leave your on-premise infrastructure
+- **Lack of Control**: You have no visibility into why certain routing decisions are made, making it impossible to audit or explain routing logic to regulators
+
+## The Solution: Keyword-Based Routing
+
+Keyword-based routing provides deterministic, transparent routing rules that complement AI-powered semantic classification. It allows enterprises to implement business policies with complete visibility and control.
+
+## Real-World Use Cases
+
+### Use Case 1: Financial Services Data Sovereignty
+
+**Scenario**: A bank needs to ensure that any query containing internal procedures, account details, or trading strategies stays within their secure, on-premise infrastructure.
+
+**Business Rule**: "Any query containing keywords like 'internal', 'confidential', 'account', 'trading', or 'procedures' must be routed to our internal AI model, never to external providers."
+
+**Implementation**:
+```yaml
+routing_rules:
+  - name: "financial-confidential"
+    description: "Route confidential financial queries to internal infrastructure"
+    conditions:
+      - type: "keyword_match"
+        keywords: ["internal", "confidential", "account", "trading", "procedures", "strategy"]
+    action:
+      type: "route"
+      endpoint: "internal-financial-ai"
+      reasoning: "Contains confidential financial terminology - must stay on-premise"
+```
+
+**Business Value**:
+- ✅ Compliance with financial regulations
+- ✅ Complete data sovereignty
+- ✅ Audit trail for regulatory reporting
+- ✅ Zero risk of data leakage to external providers
+
+### Use Case 2: Healthcare Information Protection
+
+**Scenario**: A healthcare provider needs to route patient-related queries to HIPAA-compliant models while allowing general medical information queries to go to external providers with broader medical knowledge.
+
+**Business Rule**: "Queries containing patient identifiers, medical records, or specific patient information go to HIPAA-compliant internal models. General medical questions can use external medical AI."
+
+**Implementation**:
+```yaml
+routing_rules:
+  - name: "patient-data-protection"
+    description: "Route patient-specific queries to HIPAA-compliant infrastructure"
+    conditions:
+      - type: "keyword_match"
+        keywords: ["patient", "medical record", "diagnosis", "treatment plan", "prescription"]
+      - type: "pii_detection"
+        threshold: 0.7
+    action:
+      type: "route"
+      endpoint: "hipaa-compliant-ai"
+      reasoning: "Contains patient information - HIPAA compliance required"
+
+  - name: "general-medical"
+    description: "Route general medical questions to external medical AI"
+    conditions:
+      - type: "keyword_match"
+        keywords: ["symptoms", "treatment", "medication", "disease", "condition"]
+    action:
+      type: "route"
+      endpoint: "external-medical-ai"
+      reasoning: "General medical query - can use external medical knowledge"
+```
+
+### Use Case 3: Search Capability Routing
+
+**Scenario**: An enterprise wants to route queries that require web search to providers with search capabilities, while keeping other queries on their preferred models.
+
+**Business Rule**: "Queries asking for 'search', 'find', 'look up', or 'current information' should go to providers with web search capabilities."
+
+**Implementation**:
+```yaml
+routing_rules:
+  - name: "search-queries"
+    description: "Route search requests to providers with web search capabilities"
+    conditions:
+      - type: "keyword_match"
+        keywords: ["search", "find", "look up", "current", "latest", "recent", "news"]
+    action:
+      type: "route"
+      endpoint: "search-enabled-provider"
+      reasoning: "Query requires web search capabilities"
+```
+
+## How It Works
+
+Keyword-based routing provides **transparent, interpretable routing decisions** by evaluating clear business rules before falling back to semantic classification:
+
+1. **Rule Evaluation**: Check each rule's conditions against the incoming query using simple keyword matching
+2. 
**Match Found**: Route to the specified endpoint with **complete transparency** - you know exactly why the decision was made
+3. **No Match**: Fall back to semantic classification for intelligent routing
+4. **Audit Trail**: Log every decision with clear reasoning for compliance and debugging
+
+The key benefit is **interpretability**: stakeholders can easily understand and validate routing decisions because they're based on explicit, human-readable rules rather than opaque ML models.
+
+## Business Benefits
+
+### Immediate Value
+- **Security**: Ensure sensitive data never leaves your infrastructure with transparent rules
+- **Compliance**: Meet regulatory requirements with auditable, interpretable routing decisions
+- **Capability Routing**: Route queries to providers with specific capabilities (search, coding, etc.)
+- **Performance**: Get sub-millisecond routing for deterministic cases with clear reasoning
+
+### Enterprise Governance
+- **Transparency**: See exactly why each query was routed where - no black box decisions
+- **Interpretability**: Human-readable rules that business stakeholders can understand and validate
+- **Control**: Implement business policies with confidence and clear justification
+- **Auditability**: Complete logs for compliance and regulatory reporting
+- **Flexibility**: Easy to modify rules as business needs change with full visibility
+
+## Getting Started
+
+### Step 1: Identify Your Business Rules
+Start by identifying the key business policies that need deterministic routing:
+
+- **Security Rules**: What data must never leave your infrastructure?
+- **Compliance Rules**: What regulatory requirements do you need to meet?
+- **Capability Rules**: Which queries need specific capabilities (search, coding, etc.)?
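
Before turning business rules into keyword lists, it may help to see the evaluation flow from "How It Works" written out as code. The following is a minimal Python sketch under stated assumptions, not the router's implementation (the feature is still a proposal). `Rule`, `Decision`, `route`, and the case-insensitive substring matching are all hypothetical choices made for illustration; the endpoint names and reasoning strings are borrowed from the YAML examples on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    """Hypothetical in-memory form of one entry under `routing_rules`."""
    name: str
    keywords: List[str]
    endpoint: str
    reasoning: str


@dataclass
class Decision:
    """What gets routed and logged: the endpoint, the rule that fired, and why."""
    endpoint: str
    rule: Optional[str]
    reasoning: str


def route(query: str, rules: List[Rule]) -> Decision:
    """Check rules in order; the first rule with a keyword hit wins."""
    text = query.lower()
    for rule in rules:
        # Assumption: plain case-insensitive substring matching.
        matched = [kw for kw in rule.keywords if kw.lower() in text]
        if matched:
            why = f"{rule.reasoning} (matched: {', '.join(matched)})"
            return Decision(rule.endpoint, rule.name, why)
    # No rule matched: hand the query to semantic classification.
    return Decision("semantic_classification", None,
                    "No keyword matches - using semantic routing")


RULES = [
    Rule(
        name="confidential-financial",
        keywords=["internal", "confidential", "account", "trading"],
        endpoint="internal-financial-ai",
        reasoning="Contains confidential financial terminology",
    ),
]

if __name__ == "__main__":
    for q in ("How do I access internal account procedures?",
              "What's the weather like today?"):
        print(q, "->", route(q, RULES))
```

Returning a `Decision` record rather than a bare endpoint keeps the reasoning attached to every routing outcome, which is what an audit log would need. Note that substring matching is deliberately naive: a keyword such as "account" would also match "accountability", so a real rule engine would likely want word-boundary or phrase matching.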
+
+### Step 2: Define Your Keywords
+Create keyword lists for each business rule:
+
+```yaml
+# Example: Financial services keywords
+financial_keywords:
+  confidential: ["internal", "confidential", "proprietary", "restricted"]
+  account_data: ["account", "balance", "transaction", "statement"]
+  trading: ["trading", "strategy", "portfolio", "investment"]
+  procedures: ["procedure", "process", "workflow", "policy"]
+```
+
+### Step 3: Configure Your Rules
+Define your routing rules with clear business justification:
+
+```yaml
+routing_rules:
+  - name: "confidential-financial"
+    description: "Route confidential financial queries to internal infrastructure"
+    business_justification: "Compliance with financial data sovereignty regulations"
+    owner: "compliance-team@company.com"
+    conditions:
+      - type: "keyword_match"
+        keywords: ["internal", "confidential", "account", "trading"]
+    action:
+      type: "route"
+      endpoint: "internal-financial-ai"
+      reasoning: "Contains confidential financial terminology"
+```
+
+### Step 4: Test and Validate
+Test your rules with real queries to ensure they work as expected:
+
+```yaml
+test_cases:
+  - input: "How do I access internal account procedures?"
+    expected_endpoint: "internal-financial-ai"
+    expected_reasoning: "Contains 'internal' and 'account' keywords"
+
+  - input: "What's the weather like today?"
+    expected_endpoint: "semantic_classification"
+    expected_reasoning: "No keyword matches - using semantic routing"
+```
+
+## Success Stories
+
+### Financial Services Company
+**Challenge**: Needed to ensure confidential trading strategies never reached external AI providers.
+
+**Solution**: Implemented keyword-based routing for terms like "trading", "strategy", "confidential".
+
+**Result**:
+- ✅ 100% compliance with data sovereignty requirements
+- ✅ Complete audit trail for regulatory reporting
+- ✅ Zero incidents of data leakage
+- ✅ 50% reduction in compliance review time
+
+### Healthcare Provider
+**Challenge**: Required HIPAA-compliant routing for patient data while using external AI for general medical knowledge.
+
+**Solution**: Created rules to route patient-specific queries internally and general medical questions externally.
+
+**Result**:
+- ✅ Full HIPAA compliance
+- ✅ Improved patient care with broader medical knowledge
+- ✅ Reduced costs by using appropriate models for each query type
+- ✅ Streamlined compliance audits
+
+## Next Steps
+
+1. **Evaluate Your Needs**: Identify which business rules require deterministic routing
+2. **Start Simple**: Begin with high-priority security and compliance rules
+3. **Iterate**: Add more rules as you identify additional business requirements
+4. **Monitor**: Use audit trails to validate rule effectiveness and compliance
+5. **Optimize**: Refine rules based on real-world usage patterns
+
+## Related Documentation
+
+- [Multi-Provider Routing](multi-provider-routing.md) - External provider integration
+- [Configuration Guide](../installation/configuration.md) - General configuration options
+- [Architecture Overview](../overview/architecture/system-architecture.md) - System design details
\ No newline at end of file
diff --git a/website/docs/use-cases/multi-provider-routing.md b/website/docs/use-cases/multi-provider-routing.md
new file mode 100644
index 000000000..6feeae747
--- /dev/null
+++ b/website/docs/use-cases/multi-provider-routing.md
@@ -0,0 +1,187 @@
+---
+sidebar_position: 1
+---
+
+# Multi-Provider Routing
+
+:::warning Work in Progress
+This use case describes a **proposed feature** that is not yet implemented in the current version of the semantic router. This represents a future capability that would enhance the router's functionality.
+:::
+
+## The Challenge: One Size Doesn't Fit All
+
+Imagine you're building an AI-powered assistant for your company. Your users ask all kinds of questions:
+
+- **Sarah** needs help writing a professional email to a client
+- **Mike** is stuck on a complex calculus problem for his engineering project
+- **Lisa** wants to understand the legal implications of a new contract
+- **David** needs to debug a Python script that's throwing errors
+
+Currently, you'd route all these queries to the same AI provider. But what if different providers excel at different tasks? What if you could automatically route each query to the AI that's best suited for that specific type of work?
+
+## The Solution: Smart Provider Selection
+
+This use case demonstrates how to implement **semantic-aware routing** that automatically selects the best AI provider based on what the user is actually asking for. Instead of treating all queries the same, the router would understand the intent and route accordingly:
+
+- **Coding & Math Questions** → Anthropic Claude (excellent at reasoning and code)
+- **Document Drafting & Creative Writing** → OpenAI GPT-4 (strong at language generation)
+- **Legal & Business Analysis** → GPT-4 (comprehensive knowledge base)
+- **Advanced Math & Research** → GPT-5 (when available, for cutting-edge capabilities)
+
+## Real-World Scenarios
+
+### Scenario 1: The Development Team
+
+**Story**: A software development team needs AI assistance throughout their workflow.
+
+- **Code Reviews**: Route to Anthropic Claude for detailed technical analysis
+- **Documentation Writing**: Route to OpenAI GPT-4 for clear, professional documentation
+- **Algorithm Design**: Route to Gemini for complex mathematical reasoning
+- **Bug Fixing**: Route to Claude for systematic debugging approaches
+
+### Scenario 2: The Legal Department
+
+**Story**: A law firm wants AI assistance for different types of legal work.
+
+- **Contract Analysis**: Route to GPT-4 for comprehensive legal knowledge
+- **Research Tasks**: Route to Gemini for thorough analysis and reasoning
+- **Client Communications**: Route to GPT-4 for professional, diplomatic language
+- **Case Strategy**: Route to GPT-5 for advanced strategic thinking
+
+### Scenario 3: The Business Team
+
+**Story**: A consulting firm needs AI support for various business activities.
+
+- **Market Analysis**: Route to GPT-4 for broad business knowledge
+- **Financial Modeling**: Route to Claude for precise calculations and logic
+- **Presentation Writing**: Route to Gemini for compelling business language
+- **Strategic Planning**: Route to GPT-5 for innovative thinking
+
+## Current Limitations
+
+The current semantic router only supports vLLM endpoints with IP addresses, making it impossible to route to external AI providers like:
+
+- OpenAI API (`api.openai.com`)
+- Anthropic Claude API (`api.anthropic.com`)
+- Google Gemini API (`generativelanguage.googleapis.com`)
+
+## How It Would Work: Smart Provider Selection
+
+The router would automatically route queries to the most appropriate provider based on the type of task and content. Here are some examples of how this could work:
+
+### Example Routing Decisions
+
+**User Query**: "Help me debug this Python function that's returning None instead of the expected list"
+
+**Routing Decision**: → Anthropic Claude (excellent at code analysis and debugging)
+
+---
+
+**User Query**: "Write a professional email to decline a client's request for a discount"
+
+**Routing Decision**: → OpenAI GPT-4 (strong at professional writing and tone)
+
+---
+
+**User Query**: "What are the legal implications of using open source software in our commercial product?"
+
+**Routing Decision**: → GPT-4 (comprehensive knowledge base) or GPT-5 (if available for advanced analysis)
+
+## How It Would Work
+
+Multi-provider routing would automatically select the best AI provider based on the type of task and content. The router would understand what each provider excels at and route accordingly:
+
+- **Coding & Debugging** → Anthropic Claude (excellent at systematic analysis)
+- **Business Writing & Communication** → OpenAI GPT-4 (strong at professional language)
+- **Research & Presentations** → Google Gemini (great at information synthesis)
+- **Mathematical Reasoning** → Gemini or Claude (depending on complexity)
+- **Sensitive Data** → Internal models (compliance and security)
+
+## Real-World Use Cases
+
+### Use Case 1: The Tech Startup
+
+**Scenario**: A growing tech startup needs AI assistance for different team functions.
+
+**Business Need**: Different teams have different AI requirements - engineers need code help, marketers need writing assistance, and legal needs secure contract analysis.
+
+**Solution**: Route queries based on team needs and data sensitivity:
+- **Engineering queries** → Claude (excellent at code analysis and debugging)
+- **Marketing queries** → GPT-4 (strong at creative and business writing)
+- **Legal queries** → Internal models (sensitive contract analysis stays secure)
+
+**Business Value**:
+- ✅ Each team gets the best AI for their specific needs
+- ✅ Sensitive legal data stays internal
+- ✅ Improved productivity across all teams
+- ✅ Cost optimization by using appropriate models
+
+### Use Case 2: The Consulting Firm
+
+**Scenario**: A management consulting firm needs different AI capabilities for different project types.
+
+**Business Need**: Strategic projects need advanced reasoning, operational projects need systematic analysis, and quick research needs cost-effective solutions.
+
+**Solution**: Route based on project complexity and domain expertise:
+- **Strategic projects** → GPT-4 (complex business analysis and strategy)
+- **Operational projects** → Claude (process analysis and systematic thinking)
+- **Quick research** → Claude Haiku (cost-effective for initial research)
+
+**Business Value**:
+- ✅ Premium capabilities for high-stakes strategic work
+- ✅ Cost-effective solutions for routine tasks
+- ✅ Better project outcomes with appropriate AI assistance
+- ✅ Optimized costs across different project types
+
+### Use Case 3: The Financial Services Company
+
+**Scenario**: A financial services company needs to balance AI capabilities with regulatory compliance.
+
+**Business Need**: Public information can use external AI, but client data and sensitive financial information must stay internal for compliance.
+
+**Solution**: Route based on data sensitivity and regulatory requirements:
+- **Public financial analysis** → GPT-4 (broad market knowledge)
+- **Client data analysis** → Internal models (compliance and security)
+- **Mathematical modeling** → Claude (precise calculations and risk analysis)
+
+**Business Value**:
+- ✅ Full regulatory compliance with data sovereignty
+- ✅ Access to broad external knowledge for public information
+- ✅ Secure handling of sensitive client data
+- ✅ Best-in-class mathematical modeling capabilities
+
+## Why This Matters: The Benefits
+
+### For Users
+
+- **Better Results**: Get the best AI for each specific task instead of one-size-fits-all
+- **Faster Responses**: Simple questions get quick answers from efficient models
+- **Higher Quality**: Complex problems get the attention they deserve from premium models
+
+### For Organizations
+
+- **Cost Optimization**: Pay premium prices only when you need premium capabilities
+- **Compliance**: Automatically route sensitive data to secure, on-premise models
+- **Reliability**: If one provider is down, automatically fail over to another
+- 
**Specialization**: Leverage each provider's unique strengths
+
+### Real-World Impact
+
+- **Development Teams**: Get better code reviews and debugging assistance
+- **Legal Departments**: Ensure sensitive contracts stay internal while getting expert analysis
+- **Business Teams**: Access the best writing and analysis capabilities for each task type
+- **Research Teams**: Route complex problems to the most capable models available
+
+## Next Steps
+
+1. **Evaluate Your Needs**: Identify which types of queries would benefit from different AI providers
+2. **Start Simple**: Begin with high-priority use cases like coding vs. writing tasks
+3. **Plan Your Strategy**: Map your team's needs to different provider strengths
+4. **Consider Compliance**: Identify which data must stay internal vs. can use external providers
+5. **Monitor and Optimize**: Track which queries benefit from different routing decisions
+
+## Related Documentation
+
+- [Keyword-Based Routing](keyword-based-routing.md) - Deterministic routing rules
+- [Configuration Guide](../installation/configuration.md) - General configuration options
+- [Architecture Overview](../overview/architecture/system-architecture.md) - System design details
diff --git a/website/sidebars.ts b/website/sidebars.ts
index 35312198e..e485cb601 100644
--- a/website/sidebars.ts
+++ b/website/sidebars.ts
@@ -95,6 +95,15 @@ const sidebars: SidebarsConfig = {
         },
       ],
     },
+    {
+      type: 'category',
+      label: 'Use Cases',
+      items: [
+        'use-cases/index',
+        'use-cases/multi-provider-routing',
+        'use-cases/keyword-based-routing',
+      ],
+    },
     {
       type: 'category',
       label: 'Proposals',