
Conversation

@aeroxy aeroxy commented Nov 15, 2025

This PR adds comprehensive support for Claude's web search functionality (web_search_20250305 tool) in the proxy server, including proper handling of both streaming and non-streaming modes.

Key Features Added:

  1. Web Search Request Detection & Translation

    • Detect Claude requests containing the web_search_20250305 tool type (see the sketch after this list)
    • Convert these requests to /chat/retrieve endpoint format:
      {
        "phase": "UNIFY",
        "query": "...",
        "enableIntention": false,
        "appCode": "COMPLEX_CHATBOT",
        "enableQueryRewrite": false
      }
  2. Endpoint Routing

    • Route web search requests to /chat/retrieve instead of standard /chat/completions
    • Standard Claude requests continue to use /chat/completions
  3. Response Translation

    • Convert /chat/retrieve responses back to proper Claude message format
    • Handle both streaming and non-streaming response modes
    • For streaming: Generate complete SSE event sequence for Claude clients:
      • event: message_start
      • event: content_block_start
      • Multiple event: content_block_delta events
      • event: content_block_stop
      • event: message_delta
      • event: message_stop
  4. Streaming Mode Compatibility

    • When upstream /messages request is streaming, convert complete /chat/retrieve JSON response to streaming SSE events
    • When upstream /messages request is non-streaming, return complete Claude message
    • Preserve original streaming behavior for standard OpenAI responses
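
For illustration, a minimal sketch of what the detection and /chat/retrieve payload construction could look like (function and type names here are hypothetical, not the PR's actual code, which lives in openai_claude_request.go):

package claude

import (
	"encoding/json"

	"github.com/tidwall/gjson"
)

// hasWebSearchTool reports whether any tool in the Claude request body
// has type web_search_20250305.
func hasWebSearchTool(claudeReq []byte) bool {
	found := false
	gjson.GetBytes(claudeReq, "tools").ForEach(func(_, tool gjson.Result) bool {
		if tool.Get("type").String() == "web_search_20250305" {
			found = true
			return false // stop iterating
		}
		return true
	})
	return found
}

// retrieveRequest mirrors the /chat/retrieve payload shown in item 1.
type retrieveRequest struct {
	Phase              string `json:"phase"`
	Query              string `json:"query"`
	EnableIntention    bool   `json:"enableIntention"`
	AppCode            string `json:"appCode"`
	EnableQueryRewrite bool   `json:"enableQueryRewrite"`
}

// buildRetrievePayload serializes the extracted query into the /chat/retrieve
// request format; the boolean flags default to false.
func buildRetrievePayload(query string) ([]byte, error) {
	return json.Marshal(retrieveRequest{
		Phase:   "UNIFY",
		Query:   query,
		AppCode: "COMPLEX_CHATBOT",
	})
}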

Technical Implementation:

  • Request Processing: Enhanced ConvertClaudeRequestToOpenAI to detect and translate web search requests
  • Response Processing: Updated streaming translator to handle web search response format
  • Executor Logic: Modified OpenAICompatExecutor to route web search requests appropriately
  • State Management: Proper parameter state handling for streaming SSE event sequences
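
As a rough sketch of the streaming path described above, replaying a complete /chat/retrieve answer as the Claude SSE event sequence could look like this (event names follow Anthropic's Messages streaming format; helper names and payload shapes are illustrative, not the PR's exact code):

package claude

import (
	"encoding/json"
	"fmt"
	"io"
)

// writeSSE emits one Claude-style SSE event: an "event:" line, a "data:" line
// with the JSON payload, then a blank line.
func writeSSE(w io.Writer, event string, payload any) error {
	data, err := json.Marshal(payload)
	if err != nil {
		return err
	}
	_, err = fmt.Fprintf(w, "event: %s\ndata: %s\n\n", event, data)
	return err
}

// emitWebSearchSSE replays a complete text answer as the sequence listed in
// item 3: message_start, content_block_start, repeated content_block_delta,
// content_block_stop, message_delta, message_stop.
// (Error handling on intermediate writes is elided for brevity.)
func emitWebSearchSSE(w io.Writer, msgID, model, text string) error {
	writeSSE(w, "message_start", map[string]any{
		"type": "message_start",
		"message": map[string]any{
			"id": msgID, "type": "message", "role": "assistant",
			"model": model, "content": []any{},
		},
	})
	writeSSE(w, "content_block_start", map[string]any{
		"type": "content_block_start", "index": 0,
		"content_block": map[string]any{"type": "text", "text": ""},
	})
	const charsPerChunk = 100 // chunk size used to simulate streaming
	for i := 0; i < len(text); i += charsPerChunk {
		end := i + charsPerChunk
		if end > len(text) {
			end = len(text)
		}
		// Note: a production version would split on rune boundaries.
		writeSSE(w, "content_block_delta", map[string]any{
			"type": "content_block_delta", "index": 0,
			"delta": map[string]any{"type": "text_delta", "text": text[i:end]},
		})
	}
	writeSSE(w, "content_block_stop", map[string]any{"type": "content_block_stop", "index": 0})
	writeSSE(w, "message_delta", map[string]any{
		"type":  "message_delta",
		"delta": map[string]any{"stop_reason": "end_turn"},
	})
	return writeSSE(w, "message_stop", map[string]any{"type": "message_stop"})
}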

Files Changed:

  • internal/translator/openai/claude/openai_claude_request.go - Web search request detection and translation
  • internal/translator/openai/claude/openai_claude_response.go - Web search response conversion with streaming support
  • internal/runtime/executor/openai_compat_executor.go - Web search request routing and response handling

Compatibility:

  • ✅ Maintains backward compatibility for existing Claude and OpenAI requests
  • ✅ Proper streaming/non-streaming mode detection based on original request
  • ✅ Correct SSE event sequence generation for Claude clients
  • ✅ No breaking changes to existing functionality

Fixes web search handling so that complete JSON responses from /chat/retrieve are properly converted to streaming SSE events when the original upstream request was in streaming mode.

@gemini-code-assist

Summary of Changes

Hello @aeroxy, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the proxy server's capabilities by introducing robust support for Claude's web search functionality. It establishes a seamless translation and routing mechanism for web search requests, converting them to an internal /chat/retrieve endpoint and then meticulously translating the responses back into Claude's native message format, ensuring full compatibility with both streaming and non-streaming interactions.

Highlights

  • Claude Web Search Integration: Adds comprehensive support for Claude's web_search_20250305 tool, enabling the proxy server to detect, translate, and process web search requests.
  • Request Translation and Routing: Claude web search requests are now identified, converted to the internal /chat/retrieve endpoint format, and routed appropriately, while standard requests continue to use /chat/completions.
  • Comprehensive Response Handling: Implements robust translation of /chat/retrieve responses back into the proper Claude message format, supporting both streaming (generating full SSE event sequences) and non-streaming modes.
  • Streaming Compatibility: Ensures correct streaming behavior for web search responses by converting complete JSON responses from /chat/retrieve into appropriate SSE events when the original upstream request was in streaming mode.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces comprehensive support for Claude's web search functionality by detecting, routing, and translating requests and responses appropriately for both streaming and non-streaming modes. The implementation is thorough and handles many edge cases, especially around generating the correct SSE event sequence for streaming web search responses. My review focuses on improving robustness, maintainability, and addressing a couple of potential issues. I've provided suggestions to refactor duplicated code, use safer JSON manipulation methods, remove redundant logic, and replace magic numbers with constants for better clarity.

func isWebSearchRequest(translated []byte) bool {
	// Check if the translated request has the web search marker
	// This looks for the "_web_search_request":true field we add
	return bytes.Contains(translated, []byte("\"_web_search_request\":true"))

high

Using bytes.Contains to check for the _web_search_request marker is fragile. It relies on the exact string representation, which could break if the JSON serialization changes (e.g., adds whitespace). Since gjson is already used in the project, it would be more robust to use it for this check.

Suggested change
return bytes.Contains(translated, []byte("\"_web_search_request\":true"))
return gjson.GetBytes(translated, "_web_search_request").Bool()

Comment on lines +30 to +34
result := make([]byte, len(webSearchRequest)+30)
copy(result, webSearchRequest[:len(webSearchRequest)-1]) // Copy everything except the closing brace
metadata := `,"_web_search_request":true}`
copy(result[len(webSearchRequest)-1:], metadata)
return result

high

The current implementation uses string manipulation and byte copying to add the _web_search_request field. This approach is brittle and can lead to invalid JSON if the input format changes. Since the rest of the file already uses the sjson library, it would be safer and more consistent to use it here as well. This also removes the need for the arbitrary buffer allocation of +30 bytes.

Suggested change
result := make([]byte, len(webSearchRequest)+30)
copy(result, webSearchRequest[:len(webSearchRequest)-1]) // Copy everything except the closing brace
metadata := `,"_web_search_request":true}`
copy(result[len(webSearchRequest)-1:], metadata)
return result
webSearchRequest, _ = sjson.SetBytes(webSearchRequest, "_web_search_request", true)
return webSearchRequest

Comment on lines 174 to 192
// Check if this is a web search request (has special marker we added in translator)
isWebSearch := isWebSearchRequest(translated)

var url string
if isWebSearch {
	url = strings.TrimSuffix(baseURL, "/") + "/chat/retrieve"
} else {
	url = strings.TrimSuffix(baseURL, "/") + "/chat/completions"
}

medium

The logic for determining if a request is for a web search and constructing the appropriate URL is duplicated in both Execute and ExecuteStream methods. To improve maintainability and avoid redundancy, consider extracting this logic into a private helper function.

For example, you could create a helper like this:

func (e *OpenAICompatExecutor) getRequestDetails(baseURL string, translated []byte) (string, bool) {
	isWebSearch := isWebSearchRequest(translated)
	path := "/chat/completions"
	if isWebSearch {
		path = "/chat/retrieve"
	}
	url := strings.TrimSuffix(baseURL, "/") + path
	return url, isWebSearch
}

Then, you can call it in both Execute and ExecuteStream:
url, isWebSearch := e.getRequestDetails(baseURL, translated)

Comment on lines +321 to +409
if query == "" {
// Try to find text after common search phrases
searchPhrases := []string{
"perform a web search for the query:",
"perform a web search for:",
"web search for the query:",
"web search for:",
"search for the query:",
"search for:",
"query:",
"search query:",
}
for _, phrase := range searchPhrases {
phraseLower := strings.ToLower(phrase)
if idx := strings.Index(strings.ToLower(text), phraseLower); idx >= 0 {
query = strings.TrimSpace(text[idx+len(phrase):])
// Remove any trailing punctuation that might be part of the instruction
query = strings.TrimRight(query, ".!?")
if query != "" {
return false // stop iteration
}
}
}

// If still no query found, check if the entire text is a search-like query
if query == "" && (strings.Contains(strings.ToLower(text), "search") ||
strings.Contains(strings.ToLower(text), "find") ||
strings.Contains(strings.ToLower(text), "what") ||
strings.Contains(strings.ToLower(text), "how") ||
strings.Contains(strings.ToLower(text), "why") ||
strings.Contains(strings.ToLower(text), "when") ||
strings.Contains(strings.ToLower(text), "where")) {
trimmed := strings.TrimSpace(text)
if len(trimmed) > 5 { // Basic sanity check
query = trimmed
return false // stop iteration
}
}
}

medium

This block for alternative query extraction is quite complex with multiple nested if statements and a hardcoded list of phrases. This can be simplified for better readability and maintainability.

  1. The searchPhrases slice can be defined as a package-level constant.
  2. The logic for checking keywords like search, find, what, etc., can be extracted into a helper function.
  3. The nested if query == "" checks can be streamlined.

Here's a suggested refactoring:

// At package level
var webSearchQueryPhrases = []string{
	"perform a web search for the query:",
	"perform a web search for:",
	"web search for the query:",
	"web search for:",
	"search for the query:",
	"search for:",
	"query:",
	"search query:",
}

func isSearchLikeQuery(text string) bool {
	lowerText := strings.ToLower(text)
	keywords := []string{"search", "find", "what", "how", "why", "when", "where"}
	for _, kw := range keywords {
		if strings.Contains(lowerText, kw) {
			return true
		}
	}
	return false
}

// Inside extractWebSearchQuery, replacing the block:
if query == "" {
    // Try to find text after common search phrases
    for _, phrase := range webSearchQueryPhrases {
        phraseLower := strings.ToLower(phrase)
        if idx := strings.Index(strings.ToLower(text), phraseLower); idx >= 0 {
            query = strings.TrimSpace(text[idx+len(phrase):])
            query = strings.TrimRight(query, ".!?")
            if query != "" {
                return false // stop iteration
            }
        }
    }

    // If still no query found, check if the entire text is a search-like query
    if query == "" && isSearchLikeQuery(text) {
        trimmed := strings.TrimSpace(text)
        if len(trimmed) > 5 { // Basic sanity check
            query = trimmed
            return false // stop iteration
        }
    }
}

// Send content_block_delta with the content in chunks (to simulate streaming)
if len(contentText) > 0 {
	// Break content into reasonable-sized chunks to simulate streaming (avoiding tiny chunks)
	charsPerChunk := 100

medium

There are a few magic numbers used for truncating search abstracts and chunking SSE data. It's a good practice to define these as named constants to improve readability and make them easier to modify in the future.

For example:

const (
	webSearchAbstractMaxLength = 500
	webSearchSSEChunkSize      = 100
)

This applies to:

  • The abstract length limit of 500 on lines 792 and 899.
  • The SSE chunk size of 100 on line 925.
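
For example, the chunking loop could then use the constant like this (contentText and emitTextDelta are stand-ins for the PR's actual variable and helper, shown only to illustrate the named constant):

for start := 0; start < len(contentText); start += webSearchSSEChunkSize {
	end := start + webSearchSSEChunkSize
	if end > len(contentText) {
		end = len(contentText)
	}
	// One content_block_delta event per chunk.
	emitTextDelta(contentText[start:end])
}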

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines +28 to +34
webSearchRequest := createWebSearchRequestJSON(root)
// Add a metadata field to indicate this is a special web search request
result := make([]byte, len(webSearchRequest)+30)
copy(result, webSearchRequest[:len(webSearchRequest)-1]) // Copy everything except the closing brace
metadata := `,"_web_search_request":true}`
copy(result[len(webSearchRequest)-1:], metadata)
return result

P1: Trim added metadata when emitting web search payload

In ConvertClaudeRequestToOpenAI the web-search branch allocates result := make([]byte, len(webSearchRequest)+30) and copies the base JSON plus ,"_web_search_request":true} (lines 30‑33). Because the slice is never resliced to the actual number of bytes written, the returned payload contains trailing \x00 bytes, which makes the HTTP body invalid JSON (Go’s encoding/json errors out with Extra data). Consequently every translated web-search request sent to /chat/retrieve will be rejected before it reaches the provider. Build the payload via append or slice result to the written length before returning so the JSON is valid.
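
For illustration, an append-based version of that fix might look like this (a sketch of the suggestion, not the PR's code; the sjson.SetBytes approach suggested above is equally valid):

// Drop the closing brace, append the marker, and return a slice whose length
// matches the bytes actually written (no trailing zero bytes).
result := append(webSearchRequest[:len(webSearchRequest)-1], `,"_web_search_request":true}`...)
return result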

@luispater
Collaborator

I'm not quite clear on what the purpose of this PR is, where it's used, and which model provider offers the /chat/retrieve endpoint?


aeroxy commented Nov 17, 2025

I'm not quite clear on what the purpose of this PR is, where it's used, and which model provider offers the /chat/retrieve endpoint?

Hey, thanks for the question. It's the OpenAI-compatible standard for web search. I saw that a similar feature has already been implemented for the Codex provider, but not for the older OpenAI-compatible provider.

- Added web search detection in request handling
- Routes web search requests to /chat/retrieve endpoint
- Implemented streaming and non-streaming response conversion
- Added SSE event simulation for Claude Code compatibility
- Updated dependencies and gitignore

Files changed:
- internal/runtime/executor/openai_compat_executor.go (+217 lines)
- internal/translator/openai/claude/openai_claude_request.go (+167 lines)
- internal/translator/openai/claude/openai_claude_response.go (+228 lines)
- go.mod (dependency updates)
- .gitignore (added /refs/* and .DS_Store)

@luispater
Collaborator

Can you tell me which providers and clients support this feature?

How do I test the feature?


aeroxy commented Nov 17, 2025

Can you tell me which providers and clients support this feature?

How do I test the feature?

You may try it with Minimax / Kimi / iFlow / other OpenAI-compatible providers.
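
For testing, a Claude-format request along these lines should exercise the new path once the proxy is pointed at such a provider (the model name and field values here are an assumed example; web_search_20250305 is Anthropic's documented web search tool type):

{
  "model": "claude-sonnet-4",
  "max_tokens": 1024,
  "stream": true,
  "messages": [
    {"role": "user", "content": "Search the web for the latest Go release notes"}
  ],
  "tools": [
    {"type": "web_search_20250305", "name": "web_search", "max_uses": 5}
  ]
}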

@luispater
Collaborator

Can you tell me which providers and clients support this feature?
How do I test the feature?

You may try it with Minimax / Kimi / iFlow / Other Open AI Compatible providers

Which client?


aeroxy commented Nov 18, 2025 via email
