40 changes: 17 additions & 23 deletions tensorrt_llm/sampling_params.py
@@ -368,14 +368,6 @@ def _setup(
if self.end_id is None:
self.end_id = tokenizer.eos_token_id
self.pad_id = tokenizer.pad_token_id
-        # kimi_k2 model uses the eos_token_id in generation config
-        if (
-            hf_model_config is not None
-            and hf_model_config.model_type == "kimi_k2"
-            and generation_config is not None
-            and isinstance(generation_config.eos_token_id, int)
-        ):
-            self.end_id = generation_config.eos_token_id

if self.pad_id is None:
self.pad_id = self.end_id
@@ -395,24 +387,26 @@ def _encode(tokenizer, text, add_special_tokens):
strs = [self.stop] if isinstance(self.stop, str) else self.stop
self._stop_word_ids = [_encode(tokenizer, s, add_special_tokens) for s in strs]

-        # add generation_config to stop word list, only in qwen3-next now
-        if (
-            hf_model_config is not None
-            and hf_model_config.model_type == "qwen3_next"
-            and generation_config is not None
-            and isinstance(generation_config.eos_token_id, List)
-            and all(isinstance(i, int) for i in generation_config.eos_token_id)
-        ):
-            if self._stop_word_ids:
+        # Add eos_token_id in generation_config to _stop_word_ids
+        # Refer to https://huggingface.co/docs/hub/en/transformers#transformers-repository-files and
+        # https://github.com/huggingface/transformers/blob/1ae4d917ed3badbdb1ffc167e0529f5a6d3c080d/src/transformers/generation/stopping_criteria.py#L451C1-L451C42
+        # The eos_token_id in generation_config are really mean to stop the text generation.
Comment on lines +390 to +393
Contributor
⚠️ Potential issue | 🟡 Minor

Fix grammatical error in comment.

Line 393 contains a grammatical error: "are really mean to stop" should be "really means to stop" or "are really meant to stop".

Apply this diff:

-        # The eos_token_id in generation_config are really mean to stop the text generation.
+        # The eos_token_id in generation_config are really meant to stop the text generation.
🤖 Prompt for AI Agents
In tensorrt_llm/sampling_params.py around lines 390 to 393, the comment has a
grammatical error: change the phrase "are really mean to stop the text
generation." to a correct form such as "are really meant to stop the text
generation." Update the comment line to use "meant" (or alternatively "mean") so
the sentence reads grammatically correct while preserving the original meaning.

+        if generation_config is not None and generation_config.eos_token_id is not None:
+            if isinstance(generation_config.eos_token_id, int):
+                generation_eos_token_ids = [generation_config.eos_token_id]
+            else:  # always List[int]
+                generation_eos_token_ids = generation_config.eos_token_id
+
+            if self._stop_word_ids is None:
+                self._stop_word_ids = [generation_eos_token_ids]
+            else:
+                all_stop_tokens_id = set(i for sublist in self._stop_word_ids for i in sublist)
Contributor
⚠️ Potential issue | 🟡 Minor

Use plural variable name for consistency.

The variable all_stop_tokens_id contains multiple token IDs (it's a set of IDs), so it should be named all_stop_token_ids for consistency with Python naming conventions.

Apply this diff:

-                all_stop_tokens_id = set(i for sublist in self._stop_word_ids for i in sublist)
+                all_stop_token_ids = set(i for sublist in self._stop_word_ids for i in sublist)
                 from_generation_stop_token_ids = [
-                    i for i in generation_eos_token_ids if i not in all_stop_tokens_id
+                    i for i in generation_eos_token_ids if i not in all_stop_token_ids
                 ]
🤖 Prompt for AI Agents
In tensorrt_llm/sampling_params.py around line 403, rename the variable
all_stop_tokens_id to the plural form all_stop_token_ids to reflect that it
holds multiple token IDs; update its declaration and every subsequent reference
in the file to use all_stop_token_ids to maintain consistency with naming
conventions and avoid NameError.

-                from_generation_stop_tokens = [
-                    i for i in generation_config.eos_token_id if i not in all_stop_tokens_id
+                from_generation_stop_token_ids = [
+                    i for i in generation_eos_token_ids if i not in all_stop_tokens_id
                 ]

-                if from_generation_stop_tokens:
-                    self._stop_word_ids.append(from_generation_stop_tokens)
-                else:
-                    self._stop_word_ids = [generation_config.eos_token_id]
+                if from_generation_stop_token_ids:
+                    self._stop_word_ids.append(from_generation_stop_token_ids)

return self
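The merge logic added in this hunk can be sketched as a self-contained function. The name `merge_generation_eos_into_stop_words` and the flat arguments are hypothetical, but the int-or-list normalization of `generation_config.eos_token_id` and the set-based deduplication against existing stop sequences mirror the diff:

```python
from typing import List, Optional, Union


def merge_generation_eos_into_stop_words(
    stop_word_ids: Optional[List[List[int]]],
    eos_token_id: Union[int, List[int], None],
) -> Optional[List[List[int]]]:
    """Hypothetical sketch of folding generation_config eos ids into stop words."""
    if eos_token_id is None:
        return stop_word_ids

    # generation_config.eos_token_id may be a single int or a list of ints.
    generation_eos_token_ids = (
        [eos_token_id] if isinstance(eos_token_id, int) else list(eos_token_id)
    )

    if stop_word_ids is None:
        return [generation_eos_token_ids]

    # Skip eos ids already covered by any existing stop sequence.
    all_stop_token_ids = {i for sublist in stop_word_ids for i in sublist}
    new_ids = [i for i in generation_eos_token_ids if i not in all_stop_token_ids]
    if new_ids:
        stop_word_ids.append(new_ids)
    return stop_word_ids
```

This sketch already uses the plural `all_stop_token_ids` spelling the reviewer asks for.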
