Commit d3005d2
[Bugfix] Parse gpt-oss refusals w/ newer openai-harmony
The output generated by gpt-oss models does not always strictly follow
its expected harmony chat template format. This commonly - but not
exclusively - happens when gpt-oss-120b generates refusals for content
that violates its built-in safety guidelines.
To fix this, a non-strict mode was added to the openai-harmony library
to allow attempted recovery of malformed message headers in the model
output, such as a missing `<|message|>` special token before the
assistant text.
This will resolve some cases where the error
`openai_harmony.HarmonyError: unexpected tokens remaining in message
header` was previously thrown. It will not resolve all of them, as not
every malformed message can be recovered. Other ongoing work on using
structured output for the Harmony format can help prevent this kind of
malformed output in the first place, once that work lands and in the
cases where the user and/or server decide to enable it.
I believe it should be safe to enable this non-strict mode by default in
vLLM, as the code paths it enables in the openai-harmony library are
only triggered once malformed output has already been detected. So,
there shouldn't be any performance penalty in the common case. And, in
the event that the malformed content cannot be properly recovered, the
openai-harmony library will still end up throwing an error.
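To illustrate the behavior described above, here is a minimal toy sketch
of strict vs. non-strict header parsing. It does not use or mirror the
openai-harmony API: `parse_toy_message`, `ToyHarmonyError`, and the
simplified `<|channel|>NAME<|message|>TEXT` layout are all hypothetical
names and assumptions made for this example. It only shows the three
properties claimed: well-formed output takes the same path in both
modes, recovery runs only after malformed output is detected, and
unrecoverable output still raises.

```python
CHANNEL = "<|channel|>"
MESSAGE = "<|message|>"
# Assumed set of channel names for this toy example only.
KNOWN_CHANNELS = ("analysis", "commentary", "final")


class ToyHarmonyError(Exception):
    """Hypothetical stand-in for a parser error; not the library's class."""


def parse_toy_message(raw: str, strict: bool = True) -> tuple[str, str]:
    """Parse '<|channel|>NAME<|message|>TEXT' into (channel, text)."""
    if not raw.startswith(CHANNEL):
        raise ToyHarmonyError("missing channel marker")
    rest = raw[len(CHANNEL):]

    # Common case: the header is well formed. Both modes return here,
    # so the non-strict flag adds no extra work.
    if MESSAGE in rest:
        header, text = rest.split(MESSAGE, 1)
        return header.strip(), text

    # Malformed output: the <|message|> token is missing.
    if strict:
        raise ToyHarmonyError("unexpected tokens remaining in message header")

    # Non-strict mode: attempt recovery by treating everything after a
    # known channel name as the message text.
    for channel in KNOWN_CHANNELS:
        if rest.startswith(channel):
            return channel, rest[len(channel):].lstrip()

    # Recovery failed, so an error is still raised.
    raise ToyHarmonyError("could not recover malformed message header")
```

With this sketch, `parse_toy_message("<|channel|>final I can't help with
that.", strict=False)` recovers the refusal text, while the same input
with `strict=True` raises.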
This is related to #23567 as well as openai/harmony#80.
1 parent 0f872b7 commit d3005d2
File tree
4 files changed: +28 -3 lines changed
- requirements
- tests/entrypoints
- vllm/entrypoints