[Core] Cache vllm_is_batch_invariant
#28304
Conversation
`vllm_is_batch_invariant` is called many times during inference, which is noticeable in a profile. We should cache it so that we don't need to call `os.getenv()` repeatedly. Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
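A minimal sketch of the kind of change described here, assuming the helper simply reads an environment variable (the variable name below is illustrative, not necessarily the one vLLM uses):

```python
import functools
import os


@functools.cache  # computed once per process; later calls skip os.getenv()
def vllm_is_batch_invariant() -> bool:
    # Illustrative env var name; the real helper may read a different flag.
    return os.getenv("VLLM_BATCH_INVARIANT", "0") == "1"
```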
Code Review
This pull request introduces a performance optimization by caching the result of vllm_is_batch_invariant to avoid repeated os.getenv() calls. The approach is sound, but the implementation uses functools.cache, which is only available in Python 3.9+ and would break compatibility with Python 3.8. I've suggested using functools.lru_cache(maxsize=None) instead, which is equivalent and compatible. I've also pointed out that caching introduces global state that can affect test reliability and suggested adding a pytest fixture to clear the cache between tests to ensure a robust test suite.
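A sketch of the two suggestions above, using `functools.lru_cache(maxsize=None)` as a Python 3.8-compatible equivalent of `functools.cache`, plus an autouse pytest fixture that clears the cache between tests. The fixture name and the env var are hypothetical:

```python
import functools
import os

import pytest


@functools.lru_cache(maxsize=None)  # behaves like functools.cache, but works on Python 3.8
def vllm_is_batch_invariant() -> bool:
    return os.getenv("VLLM_BATCH_INVARIANT", "0") == "1"


@pytest.fixture(autouse=True)
def _reset_batch_invariant_cache():
    # Clear the memoized value so each test sees the current environment.
    vllm_is_batch_invariant.cache_clear()
    yield
    vllm_is_batch_invariant.cache_clear()
```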
Can you see if the batch invariant tests pass? We might need a more elegant solution; I removed the caching logic in #27856. It is hard to override `vllm_is_batch_invariant` in the same environment once it caches an env-var read, for example when you want to test one LLM instance with batch invariance and another without.
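A small self-contained sketch of that concern, reusing the illustrative cached helper from above: once the value is memoized, flipping the environment variable in the same process has no effect until the cache is cleared explicitly.

```python
import functools
import os


@functools.lru_cache(maxsize=None)
def vllm_is_batch_invariant() -> bool:
    # Illustrative flag name, as in the sketches above.
    return os.getenv("VLLM_BATCH_INVARIANT", "0") == "1"


os.environ["VLLM_BATCH_INVARIANT"] = "1"
assert vllm_is_batch_invariant() is True   # first call populates the cache

os.environ["VLLM_BATCH_INVARIANT"] = "0"
assert vllm_is_batch_invariant() is True   # stale: the memoized value wins

vllm_is_batch_invariant.cache_clear()      # needed before the new setting takes effect
assert vllm_is_batch_invariant() is False
```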
heheda12345 left a comment
LGTM!
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com> Signed-off-by: George D. Torres <gdavtor@gmail.com>
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com> Signed-off-by: Bram Wasti <bwasti@meta.com>