[GENAI]Introduce add_extension to genai. #2952
base: master
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -259,6 +259,7 @@ ov::genai::LLMPipeline::LLMPipeline(` |
| | | `bool is_npu_requested = ov::genai::utils::is_npu_requested(device, user_properties);` |
| | | `auto [properties, attention_backend] = utils::extract_attention_backend(user_properties, is_npu_requested);` |
| | | `+ utils::add_extensions_to_core(properties);` |
| | | `if (is_npu_requested) {` |
| | | `m_pimpl = StatefulPipeline::create(` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | `@@ -78,7 +78,9 @@ StatefulLLMPipeline::StatefulLLMPipeline(` |
| | | `m_max_prompt_len = kv_desc.max_prompt_len;` |
| | | `m_max_kv_cache_size = kv_desc.max_prompt_len + kv_desc.min_response_len;` |
| | | `} else {` |
| | | `- compiled_model = utils::singleton_core().compile_model(model, device, *filtered_properties);` |
| | | `+ auto properties_without_extensions = *filtered_properties;` |
| | | `+ utils::add_extensions_to_core(properties_without_extensions);` |
| | | `+ compiled_model = utils::singleton_core().compile_model(model, device, properties_without_extensions);` |
| | | `}` |
| | | `m_model_runner = compiled_model.create_infer_request();` |
| | | `ov::genai::utils::print_compiled_model_properties(compiled_model, "Stateful LLM model");` |

Comment on lines +81 to +82
Inconsistent use of `properties` vs `properties_without_draft_model`. At line 49, `add_extensions_to_core` is called on `properties_without_draft_model`, but at line 50, `utils::read_model` is called with the original `properties` instead of `properties_without_draft_model`. While this may work because `read_model` only extracts GGUF properties and ignores EXTENSIONS, it's inconsistent with the pattern used in other constructors (see lines 93 and 138, where the same variable is used for both). Consider changing line 50 to use `properties_without_draft_model` for consistency.
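To illustrate why reusing one property map matters, here is a minimal standalone sketch. It assumes (this PR's diff does not show the implementation) that `add_extensions_to_core` reads an `EXTENSIONS` entry, registers each path on the core, and removes the key from the map it was given. `PropertyMap`, `FakeCore`, and `add_extensions_to_core_sketch` are hypothetical stand-ins, not OpenVINO GenAI types or functions.

```cpp
#include <any>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for ov::AnyMap; the real OpenVINO type is not used here.
using PropertyMap = std::map<std::string, std::any>;

// Hypothetical stand-in for the shared core; it only records the paths it is given.
struct FakeCore {
    std::vector<std::string> loaded_extensions;
    void add_extension(const std::string& path) { loaded_extensions.push_back(path); }
};

// Assumed behavior: take the "EXTENSIONS" entry out of `properties`, register each
// path on `core`, and erase the key so later calls do not receive it.
inline void add_extensions_to_core_sketch(FakeCore& core, PropertyMap& properties) {
    const auto it = properties.find("EXTENSIONS");
    if (it == properties.end()) {
        return;  // no extensions requested
    }
    // Throws std::bad_any_cast if the entry is not a vector of strings.
    const auto& paths = std::any_cast<const std::vector<std::string>&>(it->second);
    for (const auto& path : paths) {
        core.add_extension(path);
    }
    properties.erase(it);  // `paths` is not used after this point
    assert(properties.count("EXTENSIONS") == 0);
}
```

Under that assumption, only the map passed to the helper is stripped of the key. A second map copied earlier (such as the original `properties` in the comment above) would still carry `EXTENSIONS`, which is why passing the same variable to both calls is the safer pattern.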