[Bugfix] use module-level import for patched function in Qwen3Next #4354

zjchenn · 2025-11-22T02:21:30Z

What this PR does / why we need it?

Problem: The Qwen3Next model implementation currently imports `chunk_gated_delta_rule` directly using `from ... import ...`.

In frameworks like verl, the model file is often imported before vllm-ascend initializes and applies its patches. This causes the model to permanently hold a reference to the original (unpatched) vLLM kernel, resulting in execution errors on Ascend devices even if the patch runs later.

Solution: Changed the import style to `from vllm...ops import chunk` and now call `chunk.chunk_gated_delta_rule()`.

This ensures that the function is looked up on the module at call time rather than bound once at import time, so the model picks up the patched function regardless of import order.

Does this PR introduce any user-facing change?

No. This is an internal bug fix to resolve import reference issues.

How was this patch tested?

vLLM version: v0.11.0
vLLM main: vllm-project/vllm@2918c1b

github-actions · 2025-11-22T02:21:39Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

gemini-code-assist

Code Review

This pull request correctly addresses a bug related to function patching in specific execution environments. By changing the import of chunk_gated_delta_rule from a direct function import to a module-level import, the code ensures that the function call is resolved at runtime. This allows for monkey-patching to work as intended, preventing the model from holding a reference to an old, unpatched function. The change is minimal, targeted, and effectively solves the described problem. The implementation is sound and follows Python best practices for creating patchable code. The changes look good and I have no further comments.

zjchenn · 2025-11-22T02:29:58Z

@wangxiyuan Hi, could you please review this PR?

zjchenn · 2025-11-24T07:29:16Z

/cc @MengqingCao @yiz-liu @weijinqian0

Signed-off-by: zjchenn <zjchenn@gmail.com>

MengqingCao

LGTM, thx for this fix!

gemini-code-assist bot reviewed Nov 22, 2025

zjchenn force-pushed the fix/module-level-import-for-qwen3next branch from 0db2bd3 to caeb8cf Compare November 22, 2025 02:27

zjchenn changed the title ~~[bugfix] use module-level import for 'chunk_gated_delta_rule' in Qwen3Next~~ [Bugfix] use module-level import for 'chunk_gated_delta_rule' in Qwen3Next Nov 22, 2025

zjchenn force-pushed the fix/module-level-import-for-qwen3next branch 2 times, most recently from 58e92af to d0f9b3b Compare November 22, 2025 07:15

zjchenn changed the title ~~[Bugfix] use module-level import for 'chunk_gated_delta_rule' in Qwen3Next~~ [Bugfix] use module-level import for patched function in Qwen3Next Nov 24, 2025

weijinqian0 added the ready and ready-for-test labels Nov 25, 2025

fix: use module-level import for patched function in Qwen3Next

112061f

Signed-off-by: zjchenn <zjchenn@gmail.com>

zjchenn force-pushed the fix/module-level-import-for-qwen3next branch from 456e709 to 112061f Compare November 25, 2025 07:56

yiz-liu approved these changes Nov 25, 2025

MengqingCao approved these changes Nov 25, 2025

MengqingCao merged commit 463910e into vllm-project:main Nov 25, 2025
22 checks passed
