Description
I was wondering if you’d be open to adding an option in both the trainer and sampler modules to return the raw logits alongside the logprobs.
This would be really useful for experiments that need to directly manipulate or analyze logits — for example, applying controlled perturbations, studying logit sharpness, or implementing custom regularization and reward shaping methods.
If the internal pipeline already keeps logits before computing logprobs, it might be straightforward to expose them via a flag such as return_logits=True in both components.
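To make the request concrete, here is a minimal sketch of what I have in mind. It is not Tinker's actual API: `token_logprobs` and its `return_logits` argument are hypothetical names, and plain Python lists stand in for tensors. It only illustrates that the logprobs are a log-softmax of the logits, so the logits could be returned from the same step at little extra cost if they are still in memory.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vocabulary-sized row."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def token_logprobs(
    logits_per_position: Sequence[Sequence[float]],
    token_ids: Sequence[int],
    return_logits: bool = False,  # hypothetical flag, as proposed in this issue
):
    """Return the logprob of each chosen token, optionally with the raw logits."""
    logprobs = [
        log_softmax(row)[tok]
        for row, tok in zip(logits_per_position, token_ids)
    ]
    if return_logits:
        # Copy the rows so callers can perturb them without side effects.
        return logprobs, [list(row) for row in logits_per_position]
    return logprobs


# Toy input: 2 positions, vocabulary of 3 tokens.
logits = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
tokens = [0, 2]
logprobs, raw = token_logprobs(logits, tokens, return_logits=True)
```

With the flag left at its default, the function returns only the logprobs, so existing callers would be unaffected. Note that the per-token logprobs alone cannot be inverted back to the full logit rows, which is why a separate output is needed.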
This feature would make Tinker more flexible for advanced RLHF and off-policy research setups, where having access to logits is often essential for fine-grained diagnostics and exploration.
Thanks for considering!