@@ -96,6 +96,21 @@ response = generator.run(messages)
print(response)
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

generator = AmazonBedrockChatGenerator(model="anthropic.claude-3-5-sonnet-20240620-v1:0")

image = ImageContent.from_file_path("apple.jpg")
message = ChatMessage.from_user(content_parts=["Describe the image using 10 words at most.", image])

response = generator.run(messages=[message])
print(response)
```

### In a pipeline

In a RAG pipeline:
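
A minimal sketch of what such a pipeline can look like, using Haystack's `InMemoryDocumentStore`, `InMemoryBM25Retriever`, and `ChatPromptBuilder`; the document, prompt template, and question are illustrative, and AWS credentials are assumed to be configured in the environment:

```python
from haystack import Document, Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.dataclasses import ChatMessage
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.generators.amazon_bedrock import AmazonBedrockChatGenerator

# Illustrative document store with a single document
document_store = InMemoryDocumentStore()
document_store.write_documents(
    [Document(content="Haystack is an open source framework for building LLM applications.")]
)

# Prompt template that injects the retrieved documents and the user question
template = [
    ChatMessage.from_user(
        "Answer the question based on the context.\n"
        "Context:\n{% for doc in documents %}{{ doc.content }}\n{% endfor %}\n"
        "Question: {{ question }}"
    )
]

pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipeline.add_component("prompt_builder", ChatPromptBuilder(template=template))
pipeline.add_component("llm", AmazonBedrockChatGenerator(model="anthropic.claude-3-5-sonnet-20240620-v1:0"))
pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "llm.messages")

question = "What is Haystack?"
result = pipeline.run({"retriever": {"query": question}, "prompt_builder": {"question": question}})
print(result["llm"]["replies"][0].text)
```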
@@ -148,6 +148,20 @@ message = ChatMessage.from_user("What's Natural Language Processing? Be brief.")
print(generator.run([message]))
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator

image = ImageContent.from_file_path("path/to/image.jpg")
messages = [ChatMessage.from_user(content_parts=["What's in this image?", image])]

generator = AnthropicChatGenerator()
result = generator.run(messages)
print(result)
```

### In a pipeline

You can also use `AnthropicChatGenerator` with Anthropic chat models in your pipeline.
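
A minimal sketch, assuming `ANTHROPIC_API_KEY` is set in the environment and using Haystack's `ChatPromptBuilder` with an illustrative template:

```python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator

# Illustrative template with a single variable
template = [ChatMessage.from_user("Explain {{ topic }} in one short paragraph.")]

pipeline = Pipeline()
pipeline.add_component("prompt_builder", ChatPromptBuilder(template=template))
pipeline.add_component("llm", AnthropicChatGenerator())
pipeline.connect("prompt_builder.prompt", "llm.messages")

result = pipeline.run({"prompt_builder": {"topic": "Natural Language Processing"}})
print(result["llm"]["replies"][0].text)
```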
@@ -149,6 +149,29 @@ response = client.run(
print(response)
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack.components.generators.chat import AzureOpenAIChatGenerator

llm = AzureOpenAIChatGenerator(
azure_endpoint="<Your Azure endpoint>",
azure_deployment="gpt-4o-mini",
)

image = ImageContent.from_file_path("apple.jpg", detail="low")
user_message = ChatMessage.from_user(content_parts=[
"What does the image show? Max 5 words.",
image
])

response = llm.run([user_message])["replies"][0].text
print(response)

# Fresh red apple on straw.
```

### In a pipeline

@@ -87,6 +87,24 @@ message = ChatMessage.from_user("What's Natural Language Processing? Be brief.")
print(generator.run([message]))
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.cohere import CohereChatGenerator

# Create an image from file path or base64
image = ImageContent.from_file_path("path/to/your/image.jpg")

# Create a multimodal message with both text and image
messages = [ChatMessage.from_user(content_parts=["What's in this image?", image])]

# Use a multimodal model like Command A Vision
generator = CohereChatGenerator(model="command-a-vision-07-2025")
response = generator.run(messages)
print(response)
```

#### In a Pipeline

You can also use `CohereChatGenerator` with Cohere chat models in your pipeline.
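
A minimal sketch, assuming a Cohere API key is available in the environment and using Haystack's `ChatPromptBuilder` with an illustrative summarization template:

```python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.cohere import CohereChatGenerator

# Illustrative summarization template
template = [ChatMessage.from_user("Summarize the following text in one sentence:\n{{ text }}")]

pipeline = Pipeline()
pipeline.add_component("prompt_builder", ChatPromptBuilder(template=template))
pipeline.add_component("llm", CohereChatGenerator())
pipeline.connect("prompt_builder.prompt", "llm.messages")

result = pipeline.run(
    {"prompt_builder": {"text": "Cohere provides chat models that can be used with Haystack components."}}
)
print(result["llm"]["replies"][0].text)
```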
@@ -128,6 +128,21 @@ response = chat_generator.run(messages=messages)
print(response["replies"][0].text)
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator

chat_generator = GoogleGenAIChatGenerator()

image = ImageContent.from_file_path("apple.jpg")
messages = [ChatMessage.from_user(content_parts=["What does the image show? Max 5 words.", image])]

response = chat_generator.run(messages=messages)
print(response["replies"][0].text)
```

You can also easily use function calling. First, define the function locally and convert it into a [Tool](https://www.notion.so/docs/tool):
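
A minimal sketch of that flow, assuming a hypothetical `get_weather` function and the `Tool` dataclass from `haystack.tools`; the JSON schema and question are illustrative:

```python
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator

# Hypothetical function used only for illustration
def get_weather(city: str) -> str:
    """Return a canned weather report for the given city."""
    return f"The weather in {city} is sunny."

weather_tool = Tool(
    name="get_weather",
    description="Get the current weather for a city.",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string", "description": "Name of the city"}},
        "required": ["city"],
    },
    function=get_weather,
)

chat_generator = GoogleGenAIChatGenerator(tools=[weather_tool])
response = chat_generator.run(messages=[ChatMessage.from_user("What is the weather in Berlin?")])

# The reply should contain a tool call requesting get_weather
print(response["replies"][0].tool_calls)
```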

@@ -155,6 +155,31 @@ messages = [ChatMessage.from_user("Who is the best American actor?")]
result = generator.run(messages)
```

### With multimodal (image + text) inputs

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.llama_cpp import LlamaCppChatGenerator

# Create an image from file path or base64
image = ImageContent.from_file_path("path/to/your/image.jpg")

# Create a multimodal message with both text and image
messages = [ChatMessage.from_user(content_parts=["What's in this image?", image])]

# Initialize with multimodal support
generator = LlamaCppChatGenerator(
model="llava-v1.5-7b-q4_0.gguf",
chat_handler_name="Llava15ChatHandler", # Use llava-1-5 handler
model_clip_path="mmproj-model-f16.gguf", # CLIP model
n_ctx=4096 # Larger context for image processing
)
generator.warm_up()

result = generator.run(messages)
print(result)
```

The `generation_kwargs` can also be passed to the `run` method of the generator directly:
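
A minimal sketch; the model path and sampling parameters are placeholders:

```python
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.llama_cpp import LlamaCppChatGenerator

generator = LlamaCppChatGenerator(model="path/to/model.gguf", n_ctx=2048)
generator.warm_up()

messages = [ChatMessage.from_user("Who is the best American actor?")]

# Sampling parameters for this specific call
result = generator.run(messages, generation_kwargs={"max_tokens": 128, "temperature": 0.7})
print(result["replies"][0].text)
```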

@@ -117,6 +117,21 @@ response = llm.run(
print("\n\n Model used: ", response["replies"][0].meta["model"])
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.meta_llama import MetaLlamaChatGenerator

llm = MetaLlamaChatGenerator(model="Llama-4-Scout-17B-16E-Instruct-FP8")

image = ImageContent.from_file_path("apple.jpg")
messages = [ChatMessage.from_user(content_parts=["What does the image show? Max 5 words.", image])]

response = llm.run(messages)
print(response["replies"][0].text)
```

### In a pipeline

@@ -97,6 +97,21 @@ message = ChatMessage.from_user("What's Natural Language Processing? Be brief.")
print(generator.run([message]))
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.mistral import MistralChatGenerator

generator = MistralChatGenerator(model="pixtral-12b-2409")

image = ImageContent.from_file_path("apple.jpg")
messages = [ChatMessage.from_user(content_parts=["What does the image show? Max 5 words.", image])]

response = generator.run(messages)
print(response)
```

#### In a Pipeline

Below is an example RAG pipeline where we answer questions based on the contents of a URL. We add the URL contents to our `messages` in the `ChatPromptBuilder` and generate an answer with the `MistralChatGenerator`.
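
A sketch of such a pipeline, using Haystack's `LinkContentFetcher`, `HTMLToDocument`, and `ChatPromptBuilder`; the URL, template, and question are illustrative, and `MISTRAL_API_KEY` is assumed to be set:

```python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.converters import HTMLToDocument
from haystack.components.fetchers import LinkContentFetcher
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.mistral import MistralChatGenerator

# Prompt template that injects the fetched page contents and the user question
template = [
    ChatMessage.from_user(
        "Answer the question based on the page content.\n"
        "Content: {% for doc in documents %}{{ doc.content }}{% endfor %}\n"
        "Question: {{ question }}"
    )
]

pipeline = Pipeline()
pipeline.add_component("fetcher", LinkContentFetcher())
pipeline.add_component("converter", HTMLToDocument())
pipeline.add_component("prompt_builder", ChatPromptBuilder(template=template))
pipeline.add_component("llm", MistralChatGenerator())
pipeline.connect("fetcher.streams", "converter.sources")
pipeline.connect("converter.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "llm.messages")

result = pipeline.run(
    {
        "fetcher": {"urls": ["https://docs.haystack.deepset.ai/docs/intro"]},
        "prompt_builder": {"question": "What is Haystack?"},
    }
)
print(result["llm"]["replies"][0].text)
```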
@@ -92,6 +92,21 @@ print(result["replies"])
print(result["meta"])
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.nvidia import NvidiaChatGenerator

generator = NvidiaChatGenerator(model="meta/llama-3.2-11b-vision-instruct")

image = ImageContent.from_file_path("apple.jpg")
messages = [ChatMessage.from_user(content_parts=["What does the image show? Max 5 words.", image])]

result = generator.run(messages)
print(result["replies"])
```

### In a Pipeline

@@ -167,6 +167,21 @@ print(generator.run(messages=messages))
}
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.ollama import OllamaChatGenerator

generator = OllamaChatGenerator(model="llava", url="http://localhost:11434")

image = ImageContent.from_file_path("apple.jpg")
messages = [ChatMessage.from_user(content_parts=["What does the image show? Max 5 words.", image])]

response = generator.run(messages=messages)
print(response)
```

### In a Pipeline

@@ -108,6 +108,21 @@ response = client.run(
print("\n\n Model used: ", response["replies"][0].meta["model"])
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.openrouter import OpenRouterChatGenerator

client = OpenRouterChatGenerator(model="anthropic/claude-3-5-sonnet")

image = ImageContent.from_file_path("apple.jpg")
messages = [ChatMessage.from_user(content_parts=["What does the image show? Max 5 words.", image])]

response = client.run(messages)
print(response["replies"][0].text)
```

### In a pipeline

@@ -63,6 +63,21 @@ result = generator.run([ChatMessage.from_user("Tell me a joke.")])
print(result)
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.stackit import STACKITChatGenerator

generator = STACKITChatGenerator(model="meta-llama/Llama-3.2-11B-Vision-Instruct")

image = ImageContent.from_file_path("apple.jpg")
messages = [ChatMessage.from_user(content_parts=["What does the image show? Max 5 words.", image])]

result = generator.run(messages)
print(result)
```

### In a pipeline

You can also use `STACKITChatGenerator` in your pipeline.
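
A minimal sketch, reusing the model from the example above and assuming the STACKIT API key is configured in the environment; the translation prompt is illustrative:

```python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.stackit import STACKITChatGenerator

# Illustrative translation template
template = [ChatMessage.from_user("Translate the following sentence to German: {{ sentence }}")]

pipeline = Pipeline()
pipeline.add_component("prompt_builder", ChatPromptBuilder(template=template))
pipeline.add_component("llm", STACKITChatGenerator(model="meta-llama/Llama-3.2-11B-Vision-Instruct"))
pipeline.connect("prompt_builder.prompt", "llm.messages")

result = pipeline.run({"prompt_builder": {"sentence": "Hello, how are you?"}})
print(result["llm"]["replies"][0].text)
```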
@@ -64,6 +64,24 @@ message = ChatMessage.from_user("What's Natural Language Processing? Be brief.")
print(generator.run([message]))
```

With multimodal inputs:

```python
from haystack.dataclasses import ChatMessage, ImageContent
from haystack_integrations.components.generators.watsonx.chat.chat_generator import WatsonxChatGenerator

# Create an image from file path or base64
image = ImageContent.from_file_path("path/to/your/image.jpg")

# Create a multimodal message with both text and image
messages = [ChatMessage.from_user(content_parts=["What's in this image?", image])]

# Use a multimodal model
generator = WatsonxChatGenerator(model="meta-llama/llama-3-2-11b-vision-instruct")
response = generator.run(messages)
print(response)
```

#### In a Pipeline

You can also use `WatsonxChatGenerator` with IBM watsonx.ai chat models in your pipeline.
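
A minimal sketch, reusing the model from the example above and assuming watsonx.ai credentials (API key and project ID) are configured in the environment; the document and question are illustrative:

```python
from haystack import Document, Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.watsonx.chat.chat_generator import WatsonxChatGenerator

# Prompt template that injects documents passed at run time
template = [
    ChatMessage.from_user(
        "Using the context below, answer the question.\n"
        "Context: {% for doc in documents %}{{ doc.content }} {% endfor %}\n"
        "Question: {{ question }}"
    )
]

pipeline = Pipeline()
pipeline.add_component("prompt_builder", ChatPromptBuilder(template=template))
pipeline.add_component("llm", WatsonxChatGenerator(model="meta-llama/llama-3-2-11b-vision-instruct"))
pipeline.connect("prompt_builder.prompt", "llm.messages")

result = pipeline.run(
    {
        "prompt_builder": {
            "documents": [Document(content="watsonx.ai is IBM's studio for working with foundation models.")],
            "question": "What is watsonx.ai?",
        }
    }
)
print(result["llm"]["replies"][0].text)
```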