@TonyGeez
Contributor

Description

API requests fail with 400 "invalid base64 data" errors when images are processed through the imageAgent (see #958 and #953).

The error occurs when corrupted or incomplete base64 strings are sent to the API.

Fix

  • Validate base64 data before storing it in the cache, so corrupted data is never cached
  • Re-validate on retrieval and automatically remove corrupted entries
  • Add safety checks when extracting images from messages and tool_results

Changes

  • ImageCache.storeImage() now validates base64 before caching
  • ImageCache.getImage() now validates on retrieval and cleans up bad entries
  • Add null/existence checks in reqHandler() when processing images
  • Add error logging to help debug corrupted-image issues
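
As a rough illustration of the validate-on-store and validate-on-retrieval pattern listed above, here is a minimal sketch. The names `isValidBase64` and `ValidatingImageCache` are hypothetical and are not taken from this PR, and a plain `Map` stands in for the LRU cache. Because Node's `Buffer.from(data, "base64")` does not throw on malformed strings, the sketch checks the string shape explicitly instead of relying on a try/catch around decoding.

```typescript
// Hypothetical shape of an image source; only the fields used here are modelled.
interface ImageSource {
  type: string;
  media_type?: string;
  data?: string;
}

// Strict shape check (assumption: standard base64 alphabet, no whitespace):
// non-empty, length a multiple of 4, and at most two "=" padding characters at the end.
function isValidBase64(data: unknown): data is string {
  return (
    typeof data === "string" &&
    data.length > 0 &&
    data.length % 4 === 0 &&
    /^[A-Za-z0-9+/]+={0,2}$/.test(data)
  );
}

// Hypothetical cache wrapper; a Map is used here in place of an LRU cache.
class ValidatingImageCache {
  private entries = new Map<string, ImageSource>();

  // Returns false (and caches nothing) when a base64 source fails validation.
  store(id: string, source: ImageSource): boolean {
    if (source.type === "base64" && !isValidBase64(source.data)) {
      return false;
    }
    this.entries.set(id, source);
    return true;
  }

  // Re-validates on retrieval and evicts entries that no longer pass.
  get(id: string): ImageSource | null {
    const source = this.entries.get(id);
    if (!source) return null;
    if (source.type === "base64" && !isValidBase64(source.data)) {
      this.entries.delete(id);
      return null;
    }
    return source;
  }

  has(id: string): boolean {
    return this.entries.has(id);
  }
}
```

With this sketch, `store()` reports whether the entry was accepted, so a caller could decide to keep the original image block in the message rather than replace it with a placeholder that points at nothing.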

@rennzhang

@TonyGeez Did your problem get fixed? After trying to repackage and run it in the way you suggested, there are still errors.

Enhanced logging for image caching and retrieval, including error details and type information.
@TonyGeez
Contributor Author

@TonyGeez Did your problem get fixed? After trying to repackage and run it in the way you suggested, there are still errors.

Yes.

By the way, I just updated the TonyGeez:fix/base64-image branch with a small two-line change.

How do you send your image to the AI?

@rennzhang

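@TonyGeez Did your problem get fixed? After trying to repackage and run it in the way you suggested, there are still errors.

Yes.

By the way, I just updated the TonyGeez:fix/base64-image branch with a small two-line change.

How do you send your image to the AI?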

I take a screenshot and paste it into the command line.

@TonyGeez
Contributor Author

TonyGeez commented Oct 29, 2025

@rennzhang

Were you able to make it work?

@rennzhang

It still doesn't work. Do I need to uninstall the ccr that I installed through npm?

  1. I already have the latest ccr installed through the npx command.
  2. Download this repository
  3. npm i
  4. Copy your modifications to image.agent
  5. npm run build
  6. node dist/cli.js stop
  7. node dist/cli.js code
  8. Drag the image to the CLI

Then I get the following error message:

[screenshot of the error]

The complete content of the modified image.agent file is:

import { IAgent, ITool } from "./type";
import { createHash } from "crypto";
import * as LRU from "lru-cache";

interface ImageCacheEntry {
  source: any;
  timestamp: number;
}

class ImageCache {
  private cache: any;

  constructor(maxSize = 100) {
    const CacheClass: any = (LRU as any).LRUCache || (LRU as any);
    this.cache = new CacheClass({
      max: maxSize,
      ttl: 5 * 60 * 1000, // 5 minutes
    });
  }

  storeImage(id: string, source: any): void {
    if (this.hasImage(id)) {
      console.log(`Image ${id} already cached, skipping`);
      return;
    }

    // Validate base64 data before storing
    if (source && source.type === "base64" && source.data) {
      try {
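        // Attempt to decode. Note that Buffer.from(..., 'base64') decodes leniently
        // and does not throw on malformed strings, so this mainly guards against
        // non-string data rather than corrupted base64.
        Buffer.from(source.data, 'base64');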
        this.cache.set(id, {
          source,
          timestamp: Date.now(),
        });
        console.log(`Successfully stored base64 image ${id}`);
      } catch (e) {
        console.error(`Invalid base64 data for image ${id}, skipping cache:`, e);
        return;
      }
    } else {
      this.cache.set(id, {
        source,
        timestamp: Date.now(),
      });
      console.log(`Successfully stored image ${id} with type: ${source?.type || 'unknown'}`);
    }
  }

  getImage(id: string): any {
    const entry = this.cache.get(id);
    if (!entry) {
      console.log(`Image ${id} not found in cache`);
      return null;
    }

    // Validate on retrieval as well
    if (entry.source && entry.source.type === "base64" && entry.source.data) {
      try {
        Buffer.from(entry.source.data, 'base64');
        console.log(`Successfully retrieved base64 image ${id}`);
        return entry.source;
      } catch (e) {
        console.error(`Cached image ${id} has corrupted base64, removing:`, e);
        this.cache.delete(id);
        return null;
      }
    }

    console.log(`Successfully retrieved image ${id} with type: ${entry.source?.type || 'unknown'}`);
    return entry.source;
  }

  hasImage(hash: string): boolean {
    return this.cache.has(hash);
  }

  clear(): void {
    this.cache.clear();
  }

  size(): number {
    return this.cache.size;
  }
}

const imageCache = new ImageCache();

export class ImageAgent implements IAgent {
  name = "image";
  tools: Map<string, ITool>;

  constructor() {
    this.tools = new Map<string, ITool>();
    this.appendTools();
  }

  shouldHandle(req: any, config: any): boolean {
    if (!config.Router.image || req.body.model === config.Router.image)
      return false;

    const lastMessage = req.body.messages[req.body.messages.length - 1];

    // Check for image placeholders in text content
    const hasImagePlaceholder = lastMessage?.role === "user" &&
      Array.isArray(lastMessage.content) &&
      lastMessage.content.some((item: any) =>
        item.type === "text" &&
        item.text &&
        item.text.includes("[Image #")
      );

    if (
      !config.forceUseImageAgent &&
      lastMessage?.role === "user" &&
      Array.isArray(lastMessage.content) &&
      lastMessage.content.find(
        (item: any) =>
          item.type === "image" ||
          (Array.isArray(item?.content) &&
            item.content.some((sub: any) => sub.type === "image"))
      )
    ) {
      req.body.model = config.Router.image;
      const images: any[] = [];
      lastMessage.content
        .filter((item: any) => item.type === "tool_result")
        .forEach((item: any) => {
          if (Array.isArray(item.content)) {
            item.content.forEach((element: any) => {
              if (element.type === "image") {
                images.push(element);
              }
            });
            item.content = "read image successfully";
          }
        });
      lastMessage.content.push(...images);
      return false;
    }

    // Enhanced detection for images and image placeholders
    return req.body.messages.some(
      (msg: any) =>
        msg.role === "user" &&
        Array.isArray(msg.content) &&
        msg.content.some(
          (item: any) =>
            item.type === "image" ||
            (Array.isArray(item?.content) &&
              item.content.some((sub: any) => sub.type === "image")) ||
            (item.type === "text" &&
              item.text &&
              item.text.includes("[Image #"))
        )
    ) || hasImagePlaceholder;
  }

  appendTools() {
    this.tools.set("analyzeImage", {
      name: "analyzeImage",
      description:
        "Analyse image or images by ID and extract information such as OCR text, objects, layout, colors, or safety signals.",
      input_schema: {
        type: "object",
        properties: {
          imageId: {
            type: "array",
            description: "an array of IDs to analyse",
            items: {
              type: "string",
            },
          },
          task: {
            type: "string",
            description:
              "Details of task to perform on the image. The more detailed, the better",
          },
          regions: {
            type: "array",
            description: "Optional regions of interest within the image",
            items: {
              type: "object",
              properties: {
                name: {
                  type: "string",
                  description: "Optional label for the region",
                },
                x: { type: "number", description: "X coordinate" },
                y: { type: "number", description: "Y coordinate" },
                w: { type: "number", description: "Width of the region" },
                h: { type: "number", description: "Height of the region" },
                units: {
                  type: "string",
                  enum: ["px", "pct"],
                  description: "Units for coordinates and size",
                },
              },
              required: ["x", "y", "w", "h", "units"],
            },
          },
        },
        required: ["imageId", "task"],
      },
      handler: async (args, context) => {
        const imageMessages = [];
        let imageId;

        // Create image messages from cached images
        if (args.imageId) {
          if (Array.isArray(args.imageId)) {
            args.imageId.forEach((imgId: string) => {
              // Try both with and without prefix for compatibility
              const image = imageCache.getImage(
                `${context.req.id}_Image#${imgId}`
              ) || imageCache.getImage(`Image#${imgId}`);
              if (image) {
                imageMessages.push({
                  type: "image",
                  source: image,
                });
              }
            });
          } else {
            const image = imageCache.getImage(
              `${context.req.id}_Image#${args.imageId}`
            ) || imageCache.getImage(`Image#${args.imageId}`);
            if (image) {
              imageMessages.push({
                type: "image",
                source: image,
              });
            }
          }
          imageId = args.imageId;
          delete args.imageId;
        }

        const userMessage =
          context.req.body.messages[context.req.body.messages.length - 1];
        if (userMessage.role === "user" && Array.isArray(userMessage.content)) {
          const msgs = userMessage.content.filter(
            (item: { type: string; text: string | string[]; }) =>
              item.type === "text" &&
              !item.text.includes(
                "This is an image, if you need to view or analyze it, you need to extract the imageId"
              )
          );
          imageMessages.push(...msgs);
        }

        if (Object.keys(args).length > 0) {
          imageMessages.push({
            type: "text",
            text: JSON.stringify(args),
          });
        }

        // Send to analysis agent and get response
        const agentResponse = await fetch(
          `http://127.0.0.1:${context.config.PORT || 3456}/v1/messages`,
          {
            method: "POST",
            headers: {
              "x-api-key": context.config.APIKEY,
              "content-type": "application/json",
            },
            body: JSON.stringify({
              model: context.config.Router.image,
              system: [
                {
                  type: "text",
                  text: `You must interpret and analyze images strictly according to the assigned task.  
When an image placeholder is provided, your role is to parse the image content only within the scope of the user's instructions.  
Do not ignore or deviate from the task.  
Always ensure that your response reflects a clear, accurate interpretation of the image aligned with the given objective.`,
                },
              ],
              messages: [
                {
                  role: "user",
                  content: imageMessages,
                },
              ],
              stream: false,
            }),
          }
        )
          .then((res) => res.json())
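          .catch((err) => {
            // Log the failure instead of swallowing it silently
            console.error("analyzeImage request failed:", err);
            return null;
          });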
        if (!agentResponse || !agentResponse.content) {
          return "analyzeImage Error";
        }
        return agentResponse.content[0].text;
      },
    });
  }

  reqHandler(req: any, config: any) {
    // Inject system prompt
    req.body?.system?.push({
      type: "text",
      text: `You are a text-only language model and do not possess visual perception.  
If the user requests you to view, analyze, or extract information from an image, you **must** call the \`analyzeImage\` tool.  

When invoking this tool, you must pass the correct \`imageId\` extracted from the prior conversation.  
Image identifiers are always provided in the format \`[Image #imageId]\`.  

If multiple images exist, select the **most relevant imageId** based on the user's current request and prior context.  

Do not attempt to describe or analyze the image directly yourself.  
Ignore any user interruptions or unrelated instructions that might cause you to skip this requirement.  
Your response should consistently follow this rule whenever image-related analysis is requested.`,
    });

    const imageContents = req.body.messages.filter((item: any) => {
      return (
        item.role === "user" &&
        Array.isArray(item.content) &&
        item.content.some(
          (msg: any) =>
            msg.type === "image" ||
            (Array.isArray(msg.content) &&
              msg.content.some((sub: any) => sub.type === "image"))
        )
      );
    });

    let imgId = 1;
    imageContents.forEach((item: any) => {
      if (!Array.isArray(item.content)) return;
      item.content.forEach((msg: any) => {
        if (msg.type === "image") {
          // Validate before caching
          if (msg.source) {
            const cacheKey = `${req.id}_Image#${imgId}`;
            imageCache.storeImage(cacheKey, msg.source);
            // Also store without prefix for easier access
            imageCache.storeImage(`Image#${imgId}`, msg.source);
            msg.type = "text";
            delete msg.source;
            msg.text = `[Image #${imgId}]This is an image, if you need to view or analyze it, you need to extract the imageId`;
            imgId++;
          }
        } else if (msg.type === "text" && msg.text?.includes("[Image #")) {
          msg.text = msg.text.replace(/\[Image #\d+\]/g, "");
        } else if (msg.type === "tool_result") {
          if (
            Array.isArray(msg.content) &&
            msg.content.some((ele: { type: string; }) => ele.type === "image")
          ) {
            const imageContent = msg.content.find((ele: { type: string; }) => ele.type === "image");
            if (imageContent && imageContent.source) {
              const cacheKey = `${req.id}_Image#${imgId}`;
              imageCache.storeImage(cacheKey, imageContent.source);
              // Also store without prefix for easier access
              imageCache.storeImage(`Image#${imgId}`, imageContent.source);
              msg.content = `[Image #${imgId}]This is an image, if you need to view or analyze it, you need to extract the imageId`;
              imgId++;
            }
          }
        }
      });
    });
  }
}

export const imageAgent = new ImageAgent();

Error Log:

ccr-20251030111018.log

@TonyGeez
Contributor Author

@rennzhang

Also try using src/utils/codeCommand.ts from PR #952, then rebuild.

@rennzhang

This error seems very stubborn. Maybe you could release the fix in the latest version first, and I'll try updating directly?

Additionally, if I use OpenRouter + Gemini Flash, no error is reported, but OpenRouter + Claude reports errors for almost all models.

@TonyGeez
Contributor Author

TonyGeez commented Nov 1, 2025

@rennzhang

Can you show your config.json?

@rennzhang

@TonyGeez

{
  "LOG": true,
  "LOG_LEVEL": "debug",
  "CLAUDE_PATH": "",
  "HOST": "127.0.0.1",
  "PORT": 3456,
  "APIKEY": "",
  "API_TIMEOUT_MS": "600000",
  "PROXY_URL": "http://127.0.0.1:7897",
  "transformers": [],
  "Providers": [
    {
      "name": "openrouter",
      "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
      "api_key": "sk-or-v1-",
      "models": [
        "anthropic/claude-3.7-sonnet:thinking",
        "anthropic/claude-haiku-4.5",
        "anthropic/claude-sonnet-4.5",
        "openai/gpt-5",
        "google/gemini-2.5-pro",
        "anthropic/claude-opus-4.1",
        "anthropic/claude-sonnet-4",
        "anthropic/claude-3.7-sonnet",
        "google/gemini-2.5-flash"
      ],
      "transformer": {
        "use": ["openrouter"]
      }
    }
  ],
  "StatusLine": {
    "enabled": true,
    "currentStyle": "default",
    "default": {
      "modules": []
    },
    "powerline": {
      "modules": []
    }
  },
  "Router": {
    "default": "openrouter,anthropic/claude-sonnet-4.5",
    "background": "openrouter,anthropic/claude-sonnet-4.5",
    "think": "openrouter,anthropic/claude-opus-4.1",
    "longContext": "openrouter,anthropic/claude-sonnet-4.5",
    "longContextThreshold": 600000,
    "webSearch": "openrouter,anthropic/claude-sonnet-4.5",
    "image": "openrouter,anthropic/claude-sonnet-4.5"
  },
  "CUSTOM_ROUTER_PATH": ""
}

@TonyGeez
Contributor Author

TonyGeez commented Nov 5, 2025

UPDATE

I finally pinpointed the issue. The error comes from how request transformers work, and it makes sense.

When Anthropic models (such as Claude) are configured under a provider other than the Anthropic API endpoint, e.g. using the OpenRouter transformer, requests are automatically converted to the OpenAI-style format.

However, when these OpenAI-formatted requests reach Anthropic's endpoints, they are rejected because Anthropic expects its own native request format.

So I don't think this is technically a bug; it is the expected behavior when you think about it.

If we specify OpenRouter as the transformer, it formats requests for OpenAI compatibility.

Anthropic's API, being a separate service with its own specification, cannot process these OpenAI-formatted requests.
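
To make the format difference concrete, here is a small sketch of how the same base64 image is shaped in the two request styles. This is not the router's transformer code; `toOpenAIImagePart` is a hypothetical helper written only to show that an Anthropic-style `image` block and an OpenAI-style `image_url` part are structurally different, which is why a request built for one format would not be accepted by an endpoint expecting the other.

```typescript
// Image content block as used in Anthropic-style message content.
interface AnthropicImageBlock {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
}

// Image content part as used in OpenAI-style chat completion messages.
interface OpenAIImagePart {
  type: "image_url";
  image_url: { url: string };
}

// Hypothetical converter, for illustration only: the base64 payload moves
// from a dedicated `data` field into a data URL.
function toOpenAIImagePart(block: AnthropicImageBlock): OpenAIImagePart {
  const { media_type, data } = block.source;
  return {
    type: "image_url",
    image_url: { url: `data:${media_type};base64,${data}` },
  };
}

const anthropicStyle: AnthropicImageBlock = {
  type: "image",
  source: { type: "base64", media_type: "image/png", data: "aGVsbG8=" },
};

console.log(JSON.stringify(toOpenAIImagePart(anthropicStyle)));
```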

SOLUTION

Create a separate provider entry specifically for the Anthropic models served through OpenRouter and set anthropic as its transformer. This ensures requests go through Anthropic's native transformer and keep the proper format:

{
  "Providers": [
    {
      "name": "openrouter",
      "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
      "api_key": "sk-or-v1-",
      "models": [
        "openai/gpt-5",
        "google/gemini-2.5-pro",
        "google/gemini-2.5-flash"
      ],
      "transformer": {
        "use": ["openrouter"]
      }
    },
    {
      "name": "openrouter_claude",
      "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
      "api_key": "sk-or-v1-",
      "models": [
        "anthropic/claude-sonnet-4",
        "anthropic/claude-3.7-sonnet"
      ],
      "transformer": {
        "use": ["anthropic"]
      }
    }
  ]
}
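
If you want to catch this situation before running into the 400 error, a small check over the Providers array can flag it. The sketch below is an assumption built on the conclusion of this thread, not part of ccr: `findAnthropicModelsUnderOtherTransformers` is a hypothetical helper, and it assumes that Anthropic models are identified by the `anthropic/` prefix and that `transformer.use` lists transformer names as strings.

```typescript
// Only the config fields this check needs; other fields are ignored.
interface ProviderConfig {
  name: string;
  models: string[];
  transformer?: { use?: unknown[] };
}

interface Mismatch {
  provider: string;
  model: string;
}

// Hypothetical lint: report anthropic/* models listed under a provider
// whose transformer list does not include "anthropic".
function findAnthropicModelsUnderOtherTransformers(
  providers: ProviderConfig[]
): Mismatch[] {
  const mismatches: Mismatch[] = [];
  for (const provider of providers) {
    const transformers = provider.transformer?.use ?? [];
    if (transformers.includes("anthropic")) continue;
    for (const model of provider.models) {
      if (model.startsWith("anthropic/")) {
        mismatches.push({ provider: provider.name, model });
      }
    }
  }
  return mismatches;
}

// Example: the single-provider layout from earlier in this thread.
const singleProvider: ProviderConfig[] = [
  {
    name: "openrouter",
    models: ["anthropic/claude-sonnet-4.5", "google/gemini-2.5-flash"],
    transformer: { use: ["openrouter"] },
  },
];

for (const { provider, model } of findAnthropicModelsUnderOtherTransformers(singleProvider)) {
  console.warn(`Model ${model} is listed under provider "${provider}" without the anthropic transformer`);
}
```

Router entries use the "provider,model" form, so after splitting the providers they would presumably also need to reference the new provider name (for example "openrouter_claude,anthropic/claude-sonnet-4"); the config above does not show that part.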

@rennzhang

Thank you very much, this indeed solved the problem.

Would it be possible to mention this situation in the configuration documentation, or did I miss something when reading it?
