llama: introduce support for model-embedded sampling parameters #17120
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
So, you're suggesting that parameters should be added manually before conversion? How likely is that to happen? AFAIK most models come with recommended (though some are likely to just be copy-pasted from somewhere) settings in Edit: or is that automatically added to metadata?
You're right, I didn't spot that. Well I guess I have to rework the code such that it pulls
I think
@Green-Sky Will include @CISC RE
Does
Doesn't look like it. Followed some of Ollama's supported parameters: https://ollama.readthedocs.io/en/modelfile/#parameter
ref: #17088
This PR allows sampler parameters to be set from GGUF KV metadata, so model creators can embed recommended sampler settings in the model file. The embedded settings are applied unless they are explicitly overridden with CLI flags.
Handy for users who do not want to tinker with the settings but want the recommended settings applied.
Priority of Sampler Parameters
1. CLI flags (e.g. --temp 0.6)
2. Model-embedded GGUF metadata (e.g. general.sampler.temp = 0.6)
3. Defaults from common_params_sampling

Introduced Metadata
- general.sampler.sequence
- general.sampler.top_k
- general.sampler.top_p
- general.sampler.min_p
- general.sampler.xtc_probability
- general.sampler.xtc_threshold
- general.sampler.temp
- general.sampler.penalty_last_n
- general.sampler.penalty_repeat
- general.sampler.mirostat
- general.sampler.mirostat_tau
- general.sampler.mirostat_eta

Please let me know if we should introduce more sampling parameters.
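
For reviewers who want the priority order in executable form, here is a minimal Python sketch. It is not the PR's code, which lives in llama.cpp's C++ sources. The function name `resolve_sampler_params`, the use of `None` to mean "flag not passed", and the default values shown are all assumptions made for illustration.

```python
# Sketch of the priority order: CLI flag > GGUF metadata > built-in default.
# All names here are hypothetical; only the "general.sampler." key prefix
# comes from the PR description.

PREFIX = "general.sampler."


def resolve_sampler_params(cli_args, gguf_metadata, defaults):
    """Return the effective sampler parameters as a dict.

    cli_args:      parameter name -> value, or None if the flag was not passed
    gguf_metadata: full GGUF KV metadata (keys such as "general.sampler.temp")
    defaults:      parameter name -> built-in default value
    """
    # 3. Start from the built-in defaults (lowest priority).
    resolved = dict(defaults)

    # 2. Overlay any sampler keys embedded in the model's metadata.
    for key, value in gguf_metadata.items():
        if key.startswith(PREFIX):
            resolved[key[len(PREFIX):]] = value

    # 1. Overlay explicit CLI flags (highest priority).
    for name, value in cli_args.items():
        if value is not None:
            resolved[name] = value

    return resolved


if __name__ == "__main__":
    defaults = {"temp": 0.8, "top_k": 40}  # placeholder values
    metadata = {"general.sampler.temp": 0.6, "general.name": "demo"}
    cli = {"temp": None, "top_k": 20}      # only --top-k was passed
    print(resolve_sampler_params(cli, metadata, defaults))
```

In this example `temp` comes from the model metadata because no CLI value was given, while `top_k` comes from the CLI even though a default exists.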
Embedding From Safetensors into GGUF
By default, the conversion will attempt to find generation_config.json within the model directory and automatically add recommended sampler parameters into the GGUF metadata. If a sampling parameter is not available within the file, users can also specify --metadata metadata.json.

Note that --metadata metadata.json takes precedence over generation_config.json and will overwrite metadata if duplicate keys are found.
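
The merge behaviour described above can be sketched in Python as follows. This is an illustration under assumptions, not the conversion script's actual code: the function name `build_sampler_metadata` is hypothetical, the mapping from generation_config.json field names to general.sampler.* keys is a guess that covers only a few fields, and metadata.json is assumed to be a flat object keyed by full GGUF key names.

```python
# Sketch: combine generation_config.json with a --metadata override file.
# The field mapping below is an assumption for illustration only.

ASSUMED_FIELD_MAP = {
    "temperature": "general.sampler.temp",
    "top_k": "general.sampler.top_k",
    "top_p": "general.sampler.top_p",
    "min_p": "general.sampler.min_p",
    "repetition_penalty": "general.sampler.penalty_repeat",
}

PREFIX = "general.sampler."


def build_sampler_metadata(generation_config, metadata_override):
    """Return the general.sampler.* keys to write into the GGUF file."""
    result = {}

    # Pick up whatever recommended settings generation_config.json provides.
    for field, gguf_key in ASSUMED_FIELD_MAP.items():
        if generation_config.get(field) is not None:
            result[gguf_key] = generation_config[field]

    # Keys from metadata.json win on duplicates and fill in missing ones.
    for key, value in metadata_override.items():
        if key.startswith(PREFIX):
            result[key] = value

    return result


if __name__ == "__main__":
    gen_cfg = {"temperature": 0.6, "top_p": 0.95, "do_sample": True}
    override = {"general.sampler.temp": 0.7, "general.sampler.min_p": 0.01}
    print(build_sampler_metadata(gen_cfg, override))
```

Here general.sampler.temp ends up with the value from metadata.json because it is a duplicate key, general.sampler.top_p is taken from generation_config.json, and general.sampler.min_p is added from metadata.json because the generation config does not provide it.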