add Qwen2.5-VL tutorials #4364
base: main
Conversation
Signed-off-by: shifan <609471158@qq.com>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request adds a new README file with instructions for running the Qwen2.5-VL-32B model. The documentation is comprehensive, but I've found a couple of critical issues in the example commands that would prevent users from running them successfully. Specifically, there's a port mismatch between the server and client commands, and an inconsistent command-line argument. My review includes suggestions to fix these issues.
Once your server is started, you can query the model with input prompts:

```shell
curl http://localhost:8000/v1/chat/completions \
```
The vllm serve command is configured to listen on port 8888, but this curl command is attempting to connect to port 8000. This will cause a connection failure. The port in this command should be updated to match the server's port.
Suggested change:
- curl http://localhost:8000/v1/chat/completions \
+ curl http://localhost:8888/v1/chat/completions \
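For reference, a complete request against the corrected port might look like the sketch below. The model name and prompt are illustrative assumptions rather than values taken from the PR; the server is assumed to expose the standard OpenAI-compatible chat completions API on port 8888.

```shell
# Minimal sketch of a chat completion request against the server on port 8888.
# The model name and prompt below are assumptions for illustration only.
curl http://localhost:8888/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "Qwen/Qwen2.5-VL-32B-Instruct",
        "messages": [{"role": "user", "content": "Describe a photo of a city skyline."}],
        "max_tokens": 64
    }'
```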
    --quantization ascend \
    --async-scheduling \
    --tensor-parallel-size 2 \
    --max_model_len 15000 \
The command-line argument --max_model_len uses an underscore. For consistency with other arguments (e.g., --tensor-parallel-size) and standard vLLM CLI usage, it should be --max-model-len with a hyphen. This will prevent potential errors and confusion for users.
Suggested change:
- --max_model_len 15000 \
+ --max-model-len 15000 \
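Taken together with the context above, the corrected serve command might look roughly like the following sketch. The model path and the --port 8888 value are assumptions inferred from the review comments; the remaining flags are copied from the PR's snippet.

```shell
# Sketch of the vllm serve invocation with the hyphenated flag.
# The model path and port are assumptions; the other flags come from the PR snippet above.
vllm serve Qwen/Qwen2.5-VL-32B-Instruct \
    --port 8888 \
    --quantization ascend \
    --async-scheduling \
    --tensor-parallel-size 2 \
    --max-model-len 15000
```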
docs/source/tutorials/index.md (Outdated)
multi_npu_moge
multi_npu_qwen3_moe
multi_npu_quantization
multi_npu_qwen2.5_vl
Please rename this to "Qwen2.5-VL". Deployment scenarios such as "single node" and "multi nodes" should also be added to this README.
Modified
@@ -0,0 +1,165 @@
# Multi-NPU (Qwen2.5-VL-32B-Instruct-W8A8)
rename the title to "Qwen2.5-VL"
Modified
What this PR does / why we need it?
Does this PR introduce any user-facing change?
How was this patch tested?