Expose raw logits in trainer output

I was wondering if you’d be open to adding an option in both the trainer and sampler modules to return the raw logits alongside the logprobs.
This would be really useful for experiments that need to directly manipulate or analyze logits, for example applying controlled perturbations, studying logit sharpness, or implementing custom regularization and reward-shaping methods.
If the internal pipeline already keeps logits before computing logprobs, it might be straightforward to expose them via a flag such as return_logits=True in both components.
This feature would make Tinker more flexible for advanced RLHF and off-policy research setups, where having access to logits is often essential for fine-grained diagnostics and exploration.
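To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is plain Python and does not use Tinker's actual API: `score_tokens`, its arguments, and the output keys are hypothetical names chosen for illustration. I'm assuming the pipeline holds one row of logits per position before reducing it to a logprob for the chosen token.

```python
import math
from typing import Dict, List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized row."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def score_tokens(
    logits_per_position: List[List[float]],
    token_ids: List[int],
    return_logits: bool = False,
) -> Dict[str, list]:
    """Hypothetical stand-in for a trainer/sampler output step.

    Always returns the logprob of each chosen token; when return_logits
    is True, also returns the unnormalized logits they were derived from.
    """
    logprobs = [
        log_softmax(row)[tok]
        for row, tok in zip(logits_per_position, token_ids)
    ]
    out: Dict[str, list] = {"logprobs": logprobs}
    if return_logits:
        # Copy the rows so callers can perturb them without side effects.
        out["logits"] = [list(row) for row in logits_per_position]
    return out


# Example: two positions, vocabulary of size three.
example = score_tokens(
    [[2.0, 0.5, -1.0], [0.1, 0.1, 3.0]],
    token_ids=[0, 2],
    return_logits=True,
)
```

The per-token logprobs alone are not enough to reconstruct this: they cover only the chosen token, and even a full row of logprobs determines the logits only up to an additive constant, since log-softmax is invariant to shifting every logit by the same amount.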
Thanks for considering!

Expose raw logits in trainer output #77
