
Conversation

benironside (Contributor) commented Nov 7, 2025

Resolves #3474 by creating a tutorial that explains how to connect a custom LLM running in vLLM to Elastic.

Technical reviewers, I left a few questions for you in comments. Also:

  • Has this been tested with the Obs/Search Assistant, or are these instructions Security-only?
  • Is this supported in v9.0+?
  • @dhru42 I could use some insight into how the use case for this guide differs from the existing self-managed LLM guide.

## Requirements

* Docker or Podman.
benironside (Contributor, Author) commented on the snippet above:

There were a few places throughout the guide that referred to a Docker container. I changed those to just refer to a container since it seems Podman is an acceptable alternative too.
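
Either runtime works for the steps that follow. As a minimal sketch (not part of the guide), you could check which container runtime is available on the host before running the vLLM image:

```python
import shutil

# Prefer Docker if both runtimes are installed; fall back to Podman.
# Purely illustrative -- the guide assumes you already know which runtime you use.
runtime = next((tool for tool in ("docker", "podman") if shutil.which(tool)), None)

if runtime is None:
    raise SystemExit("Neither Docker nor Podman was found on this host.")

print(f"Using container runtime: {runtime}")
```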


1. Configure your host server with the necessary GPU resources.
2. Run the desired model in a vLLM container.
3. Use a reverse proxy like Nginx to securely expose the endpoint to {{ecloud}}.
benironside (Contributor, Author) commented on the snippet above:

Is it just Elastic Cloud that this works with? Not other deployment types?
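
For context on step 3: vLLM serves an OpenAI-compatible API, so once the reverse proxy is in place you can verify the endpoint from outside the host. A minimal sketch, assuming a hypothetical proxy URL, API key, and model name (none of these values come from the guide):

```python
from openai import OpenAI  # pip install openai

# Placeholder values -- substitute your own proxy URL, API key,
# and the model you launched vLLM with.
client = OpenAI(
    base_url="https://llm.example.com/v1",  # Nginx reverse proxy in front of vLLM
    api_key="YOUR_PROXY_API_KEY",
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example model name
    messages=[{"role": "user", "content": "Reply with the single word: ok"}],
)
print(response.choices[0].message.content)
```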

1. When you want to invoke a tool, never describe the call in text.
2. Always return the invocation in the `tool_calls` field.
3. The `content` field must remain empty for any assistant message that performs a tool call.
4. Only use tool calls defined in the "tools" parameter.
benironside (Contributor, Author) commented on the snippet above:

Note to self: following https://github.com/elastic/sdh-security-team/issues/1417 to confirm whether this system prompt fix works.
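
To illustrate what the quoted prompt rules ask of the model: in an OpenAI-compatible chat completion, a correct tool invocation arrives in the `tool_calls` field of the assistant message, with `content` left empty. A minimal sketch against a vLLM endpoint; the URL, model name, and tool definition are invented for illustration:

```python
from openai import OpenAI

client = OpenAI(base_url="https://llm.example.com/v1", api_key="YOUR_PROXY_API_KEY")

# A made-up tool definition, just to exercise the tool-calling path.
tools = [{
    "type": "function",
    "function": {
        "name": "get_alert_count",
        "description": "Return the number of open security alerts.",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
}]

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example model name
    messages=[{"role": "user", "content": "How many open alerts are there?"}],
    tools=tools,
)

message = response.choices[0].message
# Per the rules above, the invocation should appear in tool_calls
# and content should be empty for this assistant message.
print("tool_calls:", message.tool_calls)
print("content:", message.content)
```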

dhru42 (Contributor) commented Nov 10, 2025

> @dhru42 I could use some insight into how the use case for this guide differs from the existing self-managed LLM guide.

Can we make the existing page generic, then link to the two methods:

  1. Connect to your own local LLM with LM Studio (exists already)
  2. Connect to your own local LLM with vLLM (the Google Doc I shared)
