Commit 8f4f77a

[BugFix] Fix false assertion with spec-decode=[2,4,..] and TP>2 (#29036)
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
1 parent 22e44ad commit 8f4f77a

1 file changed (+1, -1 lines)

vllm/config/compilation.py

Lines changed: 1 addition & 1 deletion
@@ -921,7 +921,7 @@ def adjust_cudagraph_sizes_for_spec_decode(
         self, uniform_decode_query_len: int, tensor_parallel_size: int
     ):
         multiple_of = uniform_decode_query_len
-        if tensor_parallel_size > 1:
+        if tensor_parallel_size > 1 and self.pass_config.enable_sequence_parallelism:
             multiple_of = max(uniform_decode_query_len, tensor_parallel_size)
         if (
             multiple_of % uniform_decode_query_len != 0
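
The effect of the one-line change can be sketched in isolation. The snippet below is a minimal standalone illustration, not the vLLM code: the helper names (`multiple_of_before`, `multiple_of_after`, `first_clause_trips`) are hypothetical, the `enable_sequence_parallelism` argument stands in for `self.pass_config.enable_sequence_parallelism`, and only the first clause of the check visible in the hunk is modeled. The values 3 and 4 are assumed purely for illustration.

```python
def multiple_of_before(uniform_decode_query_len: int, tensor_parallel_size: int) -> int:
    """Pre-change logic: any TP > 1 can raise multiple_of to the TP size."""
    multiple_of = uniform_decode_query_len
    if tensor_parallel_size > 1:
        multiple_of = max(uniform_decode_query_len, tensor_parallel_size)
    return multiple_of


def multiple_of_after(
    uniform_decode_query_len: int,
    tensor_parallel_size: int,
    enable_sequence_parallelism: bool,  # stand-in for self.pass_config.enable_sequence_parallelism
) -> int:
    """Post-change logic: TP size only matters when sequence parallelism is on."""
    multiple_of = uniform_decode_query_len
    if tensor_parallel_size > 1 and enable_sequence_parallelism:
        multiple_of = max(uniform_decode_query_len, tensor_parallel_size)
    return multiple_of


def first_clause_trips(multiple_of: int, uniform_decode_query_len: int) -> bool:
    """Models only the first clause of the check shown in the hunk."""
    return multiple_of % uniform_decode_query_len != 0


# Assumed example: decode query length 3, tensor parallel size 4, no sequence parallelism.
before = multiple_of_before(3, 4)
after = multiple_of_after(3, 4, enable_sequence_parallelism=False)
print(before, first_clause_trips(before, 3))
print(after, first_clause_trips(after, 3))
```

Under these assumed values the old logic picks a `multiple_of` that is not divisible by the query length, while the new logic leaves `multiple_of` equal to the query length when sequence parallelism is disabled. With a tensor parallel size of 2 and the same query length, both versions agree, which is consistent with the `TP>2` wording in the commit title.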
