
Expose raw logits in trainer output #77

@thunderous77

I was wondering if you’d be open to adding an option in both the trainer and sampler modules to return the raw logits alongside the logprobs.
This would be really useful for experiments that need to directly manipulate or analyze logits — for example, applying controlled perturbations, studying logit sharpness, or implementing custom regularization and reward shaping methods.
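As a rough illustration of the kind of analysis meant here, the sketch below computes two simple sharpness measures (entropy and top-2 margin) and applies a temperature perturbation. It assumes the full-vocabulary logits for one position are available as a plain list of floats; the values and helper names are made up for the example and are not part of Tinker.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, with optional temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; lower means a sharper distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def top2_margin(logits):
    """Gap between the largest and second-largest logit."""
    first, second = sorted(logits, reverse=True)[:2]
    return first - second

# Hypothetical logits for one position over a 3-token vocabulary.
logits = [2.0, 1.0, 0.1]

base_entropy = entropy(softmax(logits))
# Controlled perturbation: a temperature below 1 sharpens the distribution.
sharp_entropy = entropy(softmax(logits, temperature=0.5))
margin = top2_margin(logits)
```

None of these quantities can be computed from the logprob of the sampled token alone, which is why some per-position view of the logits is needed.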
If the internal pipeline already keeps logits before computing logprobs, it might be straightforward to expose them via a flag such as return_logits=True in both components.
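A minimal sketch of what such a flag could look like, assuming logits are available at the point where logprobs are computed. `TokenScores` and `score_position` are hypothetical names used only for illustration; they do not describe Tinker's actual internals or API.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TokenScores:
    """Hypothetical result container: logprobs always, logits only on request."""
    logprobs: List[float]
    logits: Optional[List[float]] = None

def log_softmax(logits):
    """Numerically stable log-softmax: logits minus their log-sum-exp."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def score_position(logits, return_logits=False):
    """Hypothetical helper: compute logprobs and optionally pass the raw logits through."""
    return TokenScores(
        logprobs=log_softmax(logits),
        logits=list(logits) if return_logits else None,
    )

# Made-up logits for one position.
default = score_position([2.0, 1.0, 0.1])
with_logits = score_position([2.0, 1.0, 0.1], return_logits=True)
```

Keeping the flag off by default would leave existing callers unaffected. Note that logprobs are unchanged when a constant is added to every logit, so the absolute scale of the logits cannot be recovered from logprobs afterwards.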
This feature would make Tinker more flexible for advanced RLHF and off-policy research setups, where having access to logits is often essential for fine-grained diagnostics and exploration.
Thanks for considering!
