[BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 #29036

LucasWilkinson · 2025-11-19T21:30:23Z

Fixes a false assertion when running vllm serve deepseek-ai/DeepSeek-R1 --speculative-config '{"method": "mtp", "num_speculative_tokens": 2}' -tp 8. The assertion should fire only when sequence parallelism is enabled. (Introduced in #28315)

Main

vllm serve meta-llama/Meta-Llama-3-8B-Instruct -tp 4 --port 3333 --speculative-config '{"method": "eagle", "model": "yuhuili/EAGLE-LLaMA3-Instruct-8B", "num_speculative_tokens": 2}'
...
RuntimeError: Worker failed with error 'Can't determine cudagraph shapes that are both a multiple of 3 (num_speculative_tokens + 1) required by spec-decode and 4 (tensor_parallel_size) required by sequence parallelism please adjust num_speculative_tokens or disable sequence parallelism'

PR

Boots fine

gemini-code-assist

Code Review

This pull request addresses a bug where an assertion related to sequence parallelism was incorrectly triggered even when sequence parallelism was disabled. The change correctly adds a check for self.pass_config.enable_sequence_parallelism before applying constraints on cudagraph capture sizes related to tensor_parallel_size. This ensures that the logic is only applied when sequence parallelism is active, resolving the false assertion. The fix is correct and well-targeted.
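The gating described above can be sketched as follows. This is a minimal illustration, not the vLLM source: the function name `required_capture_multiple` is hypothetical, and the compatibility rule (the larger requirement must be a multiple of the smaller one) is an assumption inferred from the error message shown under Main, where 3 and 4 are rejected.

```python
def required_capture_multiple(
    num_speculative_tokens: int,
    tensor_parallel_size: int,
    enable_sequence_parallelism: bool,
) -> int:
    """Return the multiple that cudagraph capture sizes must satisfy.

    Hypothetical helper for illustration only; it does not exist in vLLM.
    """
    # Spec-decode requires sizes that are a multiple of (num_speculative_tokens + 1).
    spec_multiple = num_speculative_tokens + 1

    # The tensor-parallel constraint only applies under sequence parallelism.
    # Skipping it otherwise is the behavior this PR restores.
    if not enable_sequence_parallelism or tensor_parallel_size <= 1:
        return spec_multiple

    # ASSUMPTION: the two requirements are treated as compatible only when
    # the larger one is a multiple of the smaller one.
    larger = max(spec_multiple, tensor_parallel_size)
    smaller = min(spec_multiple, tensor_parallel_size)
    if larger % smaller != 0:
        raise ValueError(
            f"Can't determine cudagraph shapes that are both a multiple of "
            f"{spec_multiple} (num_speculative_tokens + 1) and "
            f"{tensor_parallel_size} (tensor_parallel_size)"
        )
    return larger


# Sequence parallelism disabled: TP=4 with 2 speculative tokens is accepted.
print(required_capture_multiple(2, 4, enable_sequence_parallelism=False))
```

Under this sketch, the same arguments with enable_sequence_parallelism=True raise, which corresponds to the one case where the error is legitimate.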

LucasWilkinson requested review from ProExpertProg, WoosukKwon, hmellor, houseroad, mgoin, robertgshaw2-redhat, tlrmchlsmth, yewentao256 and youkaichao as code owners November 19, 2025 21:30

LucasWilkinson added the ready label (ONLY add when PR is ready to merge/full CI is needed) Nov 19, 2025

gemini-code-assist bot reviewed Nov 19, 2025

LucasWilkinson changed the title ~~[BugFix] Fix false assertion with MTP=2 and TP=8~~ [BugFix] Fix false assertion with spec-decode and TP>2 Nov 19, 2025

fix mtp

bf20165

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>

LucasWilkinson force-pushed the lwilkinson/mtp-fix branch from 100b3fd to bf20165 November 19, 2025 21:35

LucasWilkinson changed the title ~~[BugFix] Fix false assertion with spec-decode and TP>2~~ [BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 Nov 19, 2025

mgoin approved these changes Nov 19, 2025

vllm-bot merged commit 8f4f77a into vllm-project:main Nov 19, 2025
10 of 17 checks passed

MatthewBonanni deleted the lwilkinson/mtp-fix branch November 19, 2025 21:44

khluu pushed a commit that referenced this pull request Nov 19, 2025

[BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 (#29036)

275de34

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> (cherry picked from commit 8f4f77a)

Victor49152 pushed a commit to Victor49152/vllm that referenced this pull request Nov 20, 2025

[BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 (vllm…

3c5220c

…-project#29036) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>

LuminolT pushed a commit to LuminolT/vllm that referenced this pull request Nov 21, 2025

[BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 (vllm…

c74a896

…-project#29036) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by: LuminolT <lumischen01@gmail.com>

bigPYJ1151 pushed a commit that referenced this pull request Nov 25, 2025

[BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 (#29036)

45990d1

Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com> Signed-off-by: jiang1.li <jiang1.li@intel.com>

bringlein pushed a commit to bringlein/vllm that referenced this pull request Nov 26, 2025

[BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 (vllm…

5519044

…-project#29036) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>

devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025

[BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 (vllm…

5768b19

…-project#29036) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
