Fix base64 validation in the image process pipeline #960

TonyGeez · 2025-10-29T00:32:03Z

Description

API requests were failing with 400 "invalid base64 data" errors when images were processed through the imageAgent (see #958 and #953).

The error occurred when corrupted or incomplete base64 strings were sent to the API.

Fix

Validate base64 data before storing it in the cache, so corrupted data is never cached
Re-validate on retrieval and automatically remove corrupted entries
Add safety checks when extracting images from messages and tool_results

Changes

ImageCache.storeImage() now validates base64 before caching
ImageCache.getImage() now validates on retrieval and cleans up bad entries
Added null/existence checks in reqHandler() when processing images
Added error logging to help debug corrupted-image issues
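
For readers following along, here is a minimal sketch of the validate-on-store / validate-on-read idea described above. It is not the code from this PR: ValidatedImageCache and isStrictBase64 are hypothetical names, and the check assumes the standard base64 alphabet with padding, no whitespace, and no data: URL prefix. One caveat worth knowing: as far as I know, Node's Buffer.from(data, 'base64') does not throw on malformed input, so a try/catch around it may not reject anything. That is why this sketch uses a structural check instead.

```typescript
// Hypothetical helper (not part of the PR): strict structural check for
// standard, padded base64 with no whitespace and no "data:" prefix.
const BASE64_RE =
  /^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$/;

function isStrictBase64(data: unknown): data is string {
  return typeof data === "string" && data.length > 0 && BASE64_RE.test(data);
}

interface ImageSource {
  type: string;
  media_type?: string;
  data?: string;
}

// Hypothetical cache illustrating the two validation points.
class ValidatedImageCache {
  private entries = new Map<string, ImageSource>();

  // Returns false (and stores nothing) when a base64 source is malformed.
  store(id: string, source: ImageSource): boolean {
    if (source.type === "base64" && !isStrictBase64(source.data)) {
      return false;
    }
    this.entries.set(id, source);
    return true;
  }

  // Re-checks on read and evicts an entry that no longer validates.
  get(id: string): ImageSource | null {
    const source = this.entries.get(id);
    if (!source) return null;
    if (source.type === "base64" && !isStrictBase64(source.data)) {
      this.entries.delete(id);
      return null;
    }
    return source;
  }
}
```

Non-base64 sources (for example URL sources) pass through unchecked in this sketch, which mirrors the behaviour described in the list above.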

rennzhang · 2025-10-29T09:09:20Z

@TonyGeez Did your problem get fixed? After trying to repackage and run it in the way you suggested, there are still errors.


TonyGeez · 2025-10-29T11:39:44Z

@TonyGeez Did your problem get fixed? After trying to repackage and run it in the way you suggested, there are still errors.

Yes.

By the way, I just pushed a small two-line update to the TonyGeez:fix/base64-image branch.

How do you send your image to the AI?

rennzhang · 2025-10-29T11:44:06Z

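@TonyGeez Did your problem get fixed? After trying to repackage and run it in the way you suggested, there are still errors.

Yes.

By the way, I just pushed a small two-line update to the TonyGeez:fix/base64-image branch.

How do you send your image to the AI?

I take a screenshot and paste it into the command line.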

TonyGeez · 2025-10-29T17:59:21Z

@rennzhang

Were you able to make it work?

rennzhang · 2025-10-30T03:24:17Z

It still doesn't work. Do I need to uninstall the ccr that was installed through npm?

Currently I have the latest ccr installed through the npx command. These are the steps I followed:
Download this repository
npm i
Copy your modifications to image.agent
npm run build
node dist/cli.js stop
node dist/cli.js code
Drag the image to the CLI

Then there is the following error message:

The complete content of the modified image.agent file is:

import { IAgent, ITool } from "./type";
import { createHash } from "crypto";
import * as LRU from "lru-cache";

interface ImageCacheEntry {
  source: any;
  timestamp: number;
}

class ImageCache {
  private cache: any;

  constructor(maxSize = 100) {
    const CacheClass: any = (LRU as any).LRUCache || (LRU as any);
    this.cache = new CacheClass({
      max: maxSize,
      ttl: 5 * 60 * 1000, // 5 minutes
    });
  }

  storeImage(id: string, source: any): void {
    if (this.hasImage(id)) {
      console.log(`Image ${id} already cached, skipping`);
      return;
    }

    // Validate base64 data before storing
    if (source && source.type === "base64" && source.data) {
      try {
        // Test if base64 is valid
        Buffer.from(source.data, 'base64');
        this.cache.set(id, {
          source,
          timestamp: Date.now(),
        });
        console.log(`Successfully stored base64 image ${id}`);
      } catch (e) {
        console.error(`Invalid base64 data for image ${id}, skipping cache:`, e);
        return;
      }
    } else {
      this.cache.set(id, {
        source,
        timestamp: Date.now(),
      });
      console.log(`Successfully stored image ${id} with type: ${source?.type || 'unknown'}`);
    }
  }

  getImage(id: string): any {
    const entry = this.cache.get(id);
    if (!entry) {
      console.log(`Image ${id} not found in cache`);
      return null;
    }

    // Validate on retrieval as well
    if (entry.source && entry.source.type === "base64" && entry.source.data) {
      try {
        Buffer.from(entry.source.data, 'base64');
        console.log(`Successfully retrieved base64 image ${id}`);
        return entry.source;
      } catch (e) {
        console.error(`Cached image ${id} has corrupted base64, removing:`, e);
        this.cache.delete(id);
        return null;
      }
    }

    console.log(`Successfully retrieved image ${id} with type: ${entry.source?.type || 'unknown'}`);
    return entry.source;
  }

  hasImage(hash: string): boolean {
    return this.cache.has(hash);
  }

  clear(): void {
    this.cache.clear();
  }

  size(): number {
    return this.cache.size;
  }
}

const imageCache = new ImageCache();

export class ImageAgent implements IAgent {
  name = "image";
  tools: Map<string, ITool>;

  constructor() {
    this.tools = new Map<string, ITool>();
    this.appendTools();
  }

  shouldHandle(req: any, config: any): boolean {
    if (!config.Router.image || req.body.model === config.Router.image)
      return false;

    const lastMessage = req.body.messages[req.body.messages.length - 1];

    // Check for image placeholders in text content
    const hasImagePlaceholder = lastMessage?.role === "user" &&
      Array.isArray(lastMessage.content) &&
      lastMessage.content.some((item: any) =>
        item.type === "text" &&
        item.text &&
        item.text.includes("[Image #")
      );

    if (
      !config.forceUseImageAgent &&
      lastMessage.role === "user" &&
      Array.isArray(lastMessage.content) &&
      lastMessage.content.find(
        (item: any) =>
          item.type === "image" ||
          (Array.isArray(item?.content) &&
            item.content.some((sub: any) => sub.type === "image"))
      )
    ) {
      req.body.model = config.Router.image;
      const images: any[] = [];
      lastMessage.content
        .filter((item: any) => item.type === "tool_result")
        .forEach((item: any) => {
          if (Array.isArray(item.content)) {
            item.content.forEach((element: any) => {
              if (element.type === "image") {
                images.push(element);
              }
            });
            item.content = "read image successfully";
          }
        });
      lastMessage.content.push(...images);
      return false;
    }

    // Enhanced detection for images and image placeholders
    return req.body.messages.some(
      (msg: any) =>
        msg.role === "user" &&
        Array.isArray(msg.content) &&
        msg.content.some(
          (item: any) =>
            item.type === "image" ||
            (Array.isArray(item?.content) &&
              item.content.some((sub: any) => sub.type === "image")) ||
            (item.type === "text" &&
              item.text &&
              item.text.includes("[Image #"))
        )
    ) || hasImagePlaceholder;
  }

  appendTools() {
    this.tools.set("analyzeImage", {
      name: "analyzeImage",
      description:
        "Analyse image or images by ID and extract information such as OCR text, objects, layout, colors, or safety signals.",
      input_schema: {
        type: "object",
        properties: {
          imageId: {
            type: "array",
            description: "an array of IDs to analyse",
            items: {
              type: "string",
            },
          },
          task: {
            type: "string",
            description:
              "Details of task to perform on the image.The more detailed, the better",
          },
          regions: {
            type: "array",
            description: "Optional regions of interest within the image",
            items: {
              type: "object",
              properties: {
                name: {
                  type: "string",
                  description: "Optional label for the region",
                },
                x: { type: "number", description: "X coordinate" },
                y: { type: "number", description: "Y coordinate" },
                w: { type: "number", description: "Width of the region" },
                h: { type: "number", description: "Height of the region" },
                units: {
                  type: "string",
                  enum: ["px", "pct"],
                  description: "Units for coordinates and size",
                },
              },
              required: ["x", "y", "w", "h", "units"],
            },
          },
        },
        required: ["imageId", "task"],
      },
      handler: async (args, context) => {
        const imageMessages = [];
        let imageId;

        // Create image messages from cached images
        if (args.imageId) {
          if (Array.isArray(args.imageId)) {
            args.imageId.forEach((imgId: string) => {
              // Try both with and without prefix for compatibility
              const image = imageCache.getImage(
                `${context.req.id}_Image#${imgId}`
              ) || imageCache.getImage(`Image#${imgId}`);
              if (image) {
                imageMessages.push({
                  type: "image",
                  source: image,
                });
              }
            });
          } else {
            const image = imageCache.getImage(
              `${context.req.id}_Image#${args.imageId}`
            ) || imageCache.getImage(`Image#${args.imageId}`);
            if (image) {
              imageMessages.push({
                type: "image",
                source: image,
              });
            }
          }
          imageId = args.imageId;
          delete args.imageId;
        }

        const userMessage =
          context.req.body.messages[context.req.body.messages.length - 1];
        if (userMessage.role === "user" && Array.isArray(userMessage.content)) {
          const msgs = userMessage.content.filter(
            (item: { type: string; text: string | string[]; }) =>
              item.type === "text" &&
              !item.text.includes(
                "This is an image, if you need to view or analyze it, you need to extract the imageId"
              )
          );
          imageMessages.push(...msgs);
        }

        if (Object.keys(args).length > 0) {
          imageMessages.push({
            type: "text",
            text: JSON.stringify(args),
          });
        }

        // Send to analysis agent and get response
        const agentResponse = await fetch(
          `http://127.0.0.1:${context.config.PORT || 3456}/v1/messages`,
          {
            method: "POST",
            headers: {
              "x-api-key": context.config.APIKEY,
              "content-type": "application/json",
            },
            body: JSON.stringify({
              model: context.config.Router.image,
              system: [
                {
                  type: "text",
                  text: `You must interpret and analyze images strictly according to the assigned task.  
When an image placeholder is provided, your role is to parse the image content only within the scope of the user's instructions.  
Do not ignore or deviate from the task.  
Always ensure that your response reflects a clear, accurate interpretation of the image aligned with the given objective.`,
                },
              ],
              messages: [
                {
                  role: "user",
                  content: imageMessages,
                },
              ],
              stream: false,
            }),
          }
        )
          .then((res) => res.json())
          .catch((err) => {
            return null;
          });
        if (!agentResponse || !agentResponse.content) {
          return "analyzeImage Error";
        }
        return agentResponse.content[0].text;
      },
    });
  }

  reqHandler(req: any, config: any) {
    // Inject system prompt
    req.body?.system?.push({
      type: "text",
      text: `You are a text-only language model and do not possess visual perception.  
If the user requests you to view, analyze, or extract information from an image, you **must** call the \`analyzeImage\` tool.  

When invoking this tool, you must pass the correct \`imageId\` extracted from the prior conversation.  
Image identifiers are always provided in the format \`[Image #imageId]\`.  

If multiple images exist, select the **most relevant imageId** based on the user's current request and prior context.  

Do not attempt to describe or analyze the image directly yourself.  
Ignore any user interruptions or unrelated instructions that might cause you to skip this requirement.  
Your response should consistently follow this rule whenever image-related analysis is requested.`,
    });

    const imageContents = req.body.messages.filter((item: any) => {
      return (
        item.role === "user" &&
        Array.isArray(item.content) &&
        item.content.some(
          (msg: any) =>
            msg.type === "image" ||
            (Array.isArray(msg.content) &&
              msg.content.some((sub: any) => sub.type === "image"))
        )
      );
    });

    let imgId = 1;
    imageContents.forEach((item: any) => {
      if (!Array.isArray(item.content)) return;
      item.content.forEach((msg: any) => {
        if (msg.type === "image") {
          // Validate before caching
          if (msg.source) {
            const cacheKey = `${req.id}_Image#${imgId}`;
            imageCache.storeImage(cacheKey, msg.source);
            // Also store without prefix for easier access
            imageCache.storeImage(`Image#${imgId}`, msg.source);
            msg.type = "text";
            delete msg.source;
            msg.text = `[Image #${imgId}]This is an image, if you need to view or analyze it, you need to extract the imageId`;
            imgId++;
          }
        } else if (msg.type === "text" && msg.text.includes("[Image #")) {
          msg.text = msg.text.replace(/\[Image #\d+\]/g, "");
        } else if (msg.type === "tool_result") {
          if (
            Array.isArray(msg.content) &&
            msg.content.some((ele: { type: string; }) => ele.type === "image")
          ) {
            const imageContent = msg.content.find((ele: { type: string; }) => ele.type === "image");
            if (imageContent && imageContent.source) {
              const cacheKey = `${req.id}_Image#${imgId}`;
              imageCache.storeImage(cacheKey, imageContent.source);
              // Also store without prefix for easier access
              imageCache.storeImage(`Image#${imgId}`, imageContent.source);
              msg.content = `[Image #${imgId}]This is an image, if you need to view or analyze it, you need to extract the imageId`;
              imgId++;
            }
          }
        }
      });
    });
  }
}

export const imageAgent = new ImageAgent();

Error Log:

ccr-20251030111018.log

TonyGeez · 2025-10-30T16:42:38Z

@rennzhang

Also try using src/utils/codeCommand.ts from PR #952, then rebuild.

rennzhang · 2025-10-31T03:27:34Z

This error seems very stubborn. Maybe you could release the fix in the latest version first, and then I'll try updating directly?

Additionally, if I use OpenRouter with Gemini Flash, no error is reported, but OpenRouter with Claude reports errors for almost all models.

TonyGeez · 2025-11-01T12:41:25Z

@rennzhang

Can you show your config.json?

rennzhang · 2025-11-03T06:51:31Z

@TonyGeez

{
  "LOG": true,
  "LOG_LEVEL": "debug",
  "CLAUDE_PATH": "",
  "HOST": "127.0.0.1",
  "PORT": 3456,
  "APIKEY": "",
  "API_TIMEOUT_MS": "600000",
  "PROXY_URL": "http://127.0.0.1:7897",
  "transformers": [],
  "Providers": [
    {
      "name": "openrouter",
      "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
      "api_key": "sk-or-v1-",
      "models": [
        "anthropic/claude-3.7-sonnet:thinking",
        "anthropic/claude-haiku-4.5",
        "anthropic/claude-sonnet-4.5",
        "openai/gpt-5",
        "google/gemini-2.5-pro",
        "anthropic/claude-opus-4.1",
        "anthropic/claude-sonnet-4",
        "anthropic/claude-3.7-sonnet",
        "google/gemini-2.5-flash"
      ],
      "transformer": {
        "use": ["openrouter"]
      }
    }
  ],
  "StatusLine": {
    "enabled": true,
    "currentStyle": "default",
    "default": {
      "modules": []
    },
    "powerline": {
      "modules": []
    }
  },
  "Router": {
    "default": "openrouter,anthropic/claude-sonnet-4.5",
    "background": "openrouter,anthropic/claude-sonnet-4.5",
    "think": "openrouter,anthropic/claude-opus-4.1",
    "longContext": "openrouter,anthropic/claude-sonnet-4.5",
    "longContextThreshold": 600000,
    "webSearch": "openrouter,anthropic/claude-sonnet-4.5",
    "image": "openrouter,anthropic/claude-sonnet-4.5"
  },
  "CUSTOM_ROUTER_PATH": ""
}

TonyGeez · 2025-11-05T01:13:07Z

UPDATE

I finally pinpointed the issue. The error comes from how request transformers work, and it makes sense.

When Anthropic models (like Claude) are configured under a provider other than the Anthropic API endpoint, e.g. one using the OpenRouter transformer, requests are automatically converted to OpenAI-style format.

However, when these OpenAI-formatted requests reach Anthropic's endpoints, they are rejected because Anthropic expects its own native request format.

So I think this isn't technically a bug. It's the expected behavior when you think about it.

If we specify OpenRouter as the transformer, it formats requests for OpenAI compatibility.

Anthropic's API, being a separate service with its own specification, cannot (and will not) process these OpenAI-formatted requests.
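
To make the format difference concrete, here is a small sketch of how the same image is represented in the two request styles, as I understand them: an Anthropic-style image content block versus an OpenAI-style image_url part carrying a data URL. The converter toOpenAIImagePart is a hypothetical name for illustration only and is not the router's actual transformer code.

```typescript
// Anthropic Messages-style image block with an inline base64 source.
interface AnthropicImageBlock {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
}

// OpenAI chat-completions-style image part, carrying a data URL.
interface OpenAIImagePart {
  type: "image_url";
  image_url: { url: string };
}

// Hypothetical converter (illustration only): wraps the raw base64
// payload in a data URL, which is what an OpenAI-style endpoint expects.
function toOpenAIImagePart(block: AnthropicImageBlock): OpenAIImagePart {
  const { media_type, data } = block.source;
  return {
    type: "image_url",
    image_url: { url: `data:${media_type};base64,${data}` },
  };
}

const example = toOpenAIImagePart({
  type: "image",
  source: { type: "base64", media_type: "image/png", data: "aGVsbG8=" },
});
console.log(example.image_url.url);
```

An endpoint that expects one shape and receives the other has no valid base64 field where it looks for one, which would be consistent with the errors reported in this thread.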

SOLUTION

Create a separate provider entry specifically for the Anthropic models served through OpenRouter, and set anthropic as its transformer. This ensures those requests go through the Anthropic transformer and keep a compatible format:

{
  "Providers": [
    {
      "name": "openrouter",
      "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
      "api_key": "sk-or-v1-",
      "models": [
        "openai/gpt-5",
        "google/gemini-2.5-pro",
        "google/gemini-2.5-flash"
      ],
      "transformer": {
        "use": ["openrouter"]
      }
    },
    {
      "name": "openrouter_claude",
      "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
      "api_key": "sk-or-v1-",
      "models": [
        "anthropic/claude-sonnet-4",
        "anthropic/claude-3.7-sonnet"
      ],
      "transformer": {
        "use": ["anthropic"]
      }
    }
  ]
}
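
One follow-up to keep in mind: the Router section of the config posted earlier refers to models as "provider,model" strings, so after splitting the providers, any route that still names "openrouter,anthropic/..." would presumably need to point at the new provider instead. Below is a rough sketch of a check for that. findBrokenRoutes is a hypothetical helper, not something ccr provides, and it assumes Router values keep the "provider,model" form shown above.

```typescript
interface ProviderEntry {
  name: string;
  models: string[];
}

interface RouterConfigFile {
  Providers: ProviderEntry[];
  Router: Record<string, unknown>;
}

// Hypothetical helper: returns the Router keys whose "provider,model"
// value does not match any model declared under Providers.
function findBrokenRoutes(config: RouterConfigFile): string[] {
  const known = new Set<string>();
  for (const provider of config.Providers) {
    for (const model of provider.models) {
      known.add(`${provider.name},${model}`);
    }
  }
  const broken: string[] = [];
  for (const key of Object.keys(config.Router)) {
    const value = config.Router[key];
    // Skip non-route settings such as the numeric longContextThreshold.
    if (typeof value !== "string" || value.indexOf(",") === -1) continue;
    if (!known.has(value)) broken.push(key);
  }
  return broken;
}

const sampleConfig: RouterConfigFile = {
  Providers: [
    { name: "openrouter", models: ["google/gemini-2.5-flash"] },
    { name: "openrouter_claude", models: ["anthropic/claude-sonnet-4"] },
  ],
  Router: {
    default: "openrouter,anthropic/claude-sonnet-4",
    image: "openrouter_claude,anthropic/claude-sonnet-4",
    longContextThreshold: 600000,
  },
};
console.log(findBrokenRoutes(sampleConfig));
```

In this sample, the default route still names the old provider for a Claude model, so it is the one reported.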

rennzhang · 2025-11-05T06:02:26Z

Thank you very much, this indeed solved the problem.

I wonder if this situation could be mentioned in the configuration documentation, or is there something I missed while reading it?

Fixed base64 validation in the image processing pipeline

9f36333

Improve logging in image caching methods

8ac0dcb

Enhanced logging for image caching and retrieval, including error details and type information.

Merge branch 'musistudio:main' into fix/base64-image

d3f0049
