Skip to content

Conversation

@bantonsson
Copy link
Contributor

@bantonsson bantonsson commented Sep 30, 2025

Motivation

Since OtelData is no longer public, there needs to be a way to access the OpenTelemetry context from other Layers and ensure that the context is created.

Solution

Introduce OpenTelemetryContext that given a SpanRef and the Dispatch for the layers will get or create the OpenTelemetry context.

Note

Both the test and the example has a workaround for tokio-rs/tracing#3379 which makes the code unnecessarily complicated.

@mladedav
Copy link
Collaborator

I'll have a closer look later, but on first look, I have some reservations. But maybe they'll be solvable in this design.

The first thing I noticed when skimming this is that SpanRefExt::context should definitely take &mut self because it internally takes an exclusive lock on its extensions which would definitely lead to people deadlocking themselves through that.

I don't think this makes sense to rush this now so I'll leave it out of today's release and go with the simpler version that does not start the context automatically.

@bantonsson
Copy link
Contributor Author

Thanks for noticing @mladedav. That is definitely a problem. No need to rush it.

I don't think that a &mut self will protect against the locking issue. I think that the ExtensionMut needs to be passed into the function to make it explicit.

@mladedav
Copy link
Collaborator

You're right, for some reason I though when you have ExtensionMut it mutably borrows the SpanRef but it's just normal borrow (and on top of that you can just get another SpanRef by callign ctx.span(&span.id)).

Passing the extensions explicitly is also probably better since the user will most likely already have the lock or will want to get it later.

@bantonsson bantonsson force-pushed the ban/spanrefext-for-otel-context branch from ee2e9ea to 5194d96 Compare October 7, 2025 16:20
@bantonsson
Copy link
Contributor Author

@mladedav I've reworked this a bit so the ExtensionsMut is now passed in explicitly.

@bantonsson bantonsson force-pushed the ban/spanrefext-for-otel-context branch 3 times, most recently from 0fb4415 to 6d0d359 Compare November 5, 2025 09:33
@bantonsson
Copy link
Contributor Author

@mladedav I think this is ready for review whenever you have the time.

@bantonsson bantonsson force-pushed the ban/spanrefext-for-otel-context branch from 6d0d359 to f62c2bf Compare November 25, 2025 14:14
@bantonsson
Copy link
Contributor Author

There seems to be a release coming soon for tracing-subscriber with the fix for on_register_dispatch that would clean up the tests and examples.

/// - When implementing advanced tracing middleware that manages multiple dispatches
pub fn context(
extensions: &mut ExtensionsMut<'_>,
dispatch: &Option<Dispatch>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just require &Dispatch? It seems that if it's None, we return None so I don't think there is a reason to take an Option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just convenience. The WeakDispatch::upgrade() method returns an &Option<Dispatch> since the underlying Dispatch could have been dropped. That shouldn't happen if this is used from layers that are stacked together, so requiring the end user to check it seems unnecessary.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns an owned Option<Dispatch>. Taking at least Option<&Dispatch> with dispatch.upgarde().as_ref() instead of &dispatch.upgarde() would be slightly better (clippy lint with rationale), but I still think that this should not take an Option if we don't intend to do anything with that.

I don't think it's unreasonable to require a value that we really need.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I'll change the API to not take an Option.

Comment on lines +987 to +991
fn get_activated_context_extensions(
dispatch: &tracing::Dispatch,
extensions: &mut ExtensionsMut<'_>,
f: &mut dyn FnMut(&mut OtelData),
) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can f directly take opentelemetry::Context?

Right now, we give a wrapper, that contains an end time (if someone wants to change that, they can directly get the otel data, no need to do anything around context activation) and a state which we pretty much guarantee to contain a case with opentelemetry::Context inside.

This then leads to stuff like

                    |data: &mut OtelData| {
                        if let OtelDataState::Context { current_cx } = &data.state {
                            cx = Some(current_cx.clone());
                        }
                    },

which is the OpenTelemetryContext::context implementation and is a bit weird considering we just made sure the context was started.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to do that since we'll probably want to hide OtelData again in the next breaking release. We made it public again just so people got access to trace/span IDs inside tracing layers and with this PR getting just the Context is sufficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these things are pub(crate) so OtelData doesn't leak, but I can change the signature.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do, it still doesn't make sense that the provided callback needs to match over data.state when the whole point is guaranteeing that there is a context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s already changed. That method is now only used internally.

@bantonsson bantonsson force-pushed the ban/spanrefext-for-otel-context branch from bda9f4d to 6673afc Compare December 1, 2025 10:35
Cargo.toml Outdated
tracing = { version = "0.1.35", default-features = false, features = ["std"] }
tracing-core = "0.1.28"
tracing-subscriber = { version = "0.3.0", default-features = false, features = [
# The 0.3.22 version contains the fix for on_register_dispatch in Layer
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't include the comment, it may be useful for reviewing, but I don't see a reason to keep this kind of information in Cargo.toml.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll remove the comment.

Comment on lines +987 to +991
fn get_activated_context_extensions(
dispatch: &tracing::Dispatch,
extensions: &mut ExtensionsMut<'_>,
f: &mut dyn FnMut(&mut OtelData),
) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do, it still doesn't make sense that the provided callback needs to match over data.state when the whole point is guaranteeing that there is a context.

/// - When implementing advanced tracing middleware that manages multiple dispatches
pub fn context(
extensions: &mut ExtensionsMut<'_>,
dispatch: &Option<Dispatch>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns an owned Option<Dispatch>. Taking at least Option<&Dispatch> with dispatch.upgarde().as_ref() instead of &dispatch.upgarde() would be slightly better (clippy lint with rationale), but I still think that this should not take an Option if we don't intend to do anything with that.

I don't think it's unreasonable to require a value that we really need.

@bantonsson bantonsson force-pushed the ban/spanrefext-for-otel-context branch from 6673afc to 5b7476e Compare December 2, 2025 09:59
@bantonsson bantonsson force-pushed the ban/spanrefext-for-otel-context branch from 5b7476e to c18a70e Compare December 2, 2025 10:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants