Merged
22 commits
a6fa3b1
adding nf4 to export models for NPU, NPU usage for windows in demo
przepeck Oct 30, 2025
96a3d5f
adding section about deploying docker container on NPU
przepeck Oct 31, 2025
ed7f573
Adding labels to accuracy metrics
przepeck Oct 31, 2025
e64022e
corrected model name
przepeck Nov 5, 2025
a5229ec
adding missing step for benchmarking
przepeck Nov 7, 2025
b231435
Test accuracy labels corrected
przepeck Nov 12, 2025
0bd2d9f
fixed console description
przepeck Nov 13, 2025
950cbcf
fixed indent in accuracy descriptions
przepeck Nov 13, 2025
72cea71
deleted redundant path
przepeck Nov 14, 2025
bce000a
Merge branch 'main' into przepeck/agentic_ai_demo_npu
przepeck Nov 14, 2025
f480de6
remove max_num_batched_tokens 99999 as it was workaround for fixed issue
przepeck Nov 14, 2025
fab4abc
removing cache size param and adding cache dir to the windows commands
przepeck Nov 18, 2025
47b5e80
changing images to weekly
przepeck Nov 18, 2025
95784a0
corrected windows package note
przepeck Nov 18, 2025
0c6efa7
adjusting add_to_config to config changes
przepeck Nov 18, 2025
d317588
changing parameters for phi4 to be usable on long context
przepeck Nov 19, 2025
b3453dc
deleting Phi4 for NPU usecase
przepeck Nov 20, 2025
d5b2203
review changes
przepeck Nov 24, 2025
a1c5a72
adding qwen3 coder to agentic demo
przepeck Nov 27, 2025
c12ac0c
Merge branch 'main' into przepeck/agentic_ai_demo_npu
przepeck Nov 27, 2025
4194362
Update demos/common/export_models/export_model.py
przepeck Nov 27, 2025
f3f907c
Update demos/continuous_batching/agentic_ai/README.md
przepeck Nov 27, 2025
4 changes: 2 additions & 2 deletions demos/common/export_models/export_model.py
@@ -398,8 +398,8 @@ def export_text_generation_model(model_repository_path, source_model, model_name
     print("Exporting LLM model to ", llm_model_path)
     if not os.path.isdir(llm_model_path) or args['overwrite_models']:
         if task_parameters['target_device'] == 'NPU':
-            if precision != 'int4':
-                print("NPU target device requires int4 precision. Changing to int4")
+            if precision != 'int4' and precision != 'nf4':
+                print("NPU target device requires int4 or nf4 precision. Changing to int4")
                 precision = 'int4'
             if task_parameters['extra_quantization_params'] == "":
                 print("Using default quantization parameters for NPU: --sym --ratio 1.0 --group-size -1")
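For reference, the precision check after this change can be sketched as a standalone helper. This is an illustration only: `resolve_npu_precision` and `NPU_PRECISIONS` are hypothetical names that do not exist in `export_model.py`, and the sketch assumes the fallback stays `int4`, as in the diff.

```python
# Hypothetical helper for illustration; not part of export_model.py.
NPU_PRECISIONS = ("int4", "nf4")  # precisions the diff accepts for NPU


def resolve_npu_precision(target_device, precision):
    """Return the precision to export with, falling back to int4 on NPU."""
    if target_device == "NPU" and precision not in NPU_PRECISIONS:
        # Same message the diff prints before overriding the precision.
        print("NPU target device requires int4 or nf4 precision. Changing to int4")
        return "int4"
    # Other devices, and NPU with int4/nf4, keep the requested precision.
    return precision
```

A membership test against a tuple keeps the condition and the list of accepted precisions in one place if further NPU precisions are added later.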