Skip to content

Conversation

@drogus
Copy link
Collaborator

@drogus drogus commented Nov 14, 2025

Description of Changes

We have some metrics measuring various lower level things like index scans, but at the moment we have no easy way to figure out which columns might need an index. This commit introduces three new metrics that provide that information by labeling count, latency, and number of rows canned along with the scan type (index scan, table scan, mixed scan) and info about unindexed columns.

API and ABI breaking changes

None

Expected complexity level and risk

2

I'm honestly not sure. I don't think it's overly complex, but it adds some overhead in the subscriptions initial query path.

Testing

  • I've tested the changes locally

@drogus drogus force-pushed the drogus/subscription-query-metrics branch 2 times, most recently from dd12d06 to 7b59fe4 Compare November 14, 2025 16:35
We have some metrics measuring various lower level things like index
scans, but at the moment we have no easy way to figure out which columns
might need an index. This commit introduces three new metrics that
provide that information by labeling count, latency, and number of rows
canned along with the scan type (index scan, table scan, mixed scan) and
info about unindexed columns.
@drogus drogus force-pushed the drogus/subscription-query-metrics branch from 7b59fe4 to 9cf4564 Compare November 14, 2025 16:37
Copy link
Collaborator

@joshua-spacetime joshua-spacetime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the goal is to be able to identify columns that should be indexed, the only operation you care about is full table scan -> filter. It's unclear to me what the purpose of all of the different scan strategies is, and how they're actionable.

@drogus
Copy link
Collaborator Author

drogus commented Nov 15, 2025

@joshua-spacetime I guess I wanted to add scan type in case we want to use these metrics for something else, but couldn't a partial index scan also benefit from adding a compound index? Like, you have x, y columns, but you have an index only on x, but for each x value you have 10k rows with different y values. So then SELECT * FROM points WHERE x = 10 AND y = 10 would use an index on the x column, but then it would have to load 10k rows and filter them. Am I missing something?

@bfops bfops added the release-any To be landed in any release window label Nov 17, 2025
@drogus drogus force-pushed the drogus/subscription-query-metrics branch from 4eb3416 to 9cf4564 Compare November 20, 2025 01:25
Copy link
Collaborator

@joshua-spacetime joshua-spacetime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with a small comment.

Also, right now these metrics are not recorded for (add|remove)_single_subscription. I don't believe these methods are invoked from the sdk at all, so it's probably fine. I just wanted to make sure this was what you intended.

As far as performance goes, I don't see any problem adding these metrics. It's just a small amount of overhead added to an initial subscribe. And in terms of prometheus, we already have table-level metrics, so it won't explode cardinality.

@drogus drogus force-pushed the drogus/subscription-query-metrics branch from e3477bb to f18cb14 Compare November 24, 2025 12:14
@cloutiertyler cloutiertyler added this pull request to the merge queue Nov 24, 2025
Merged via the queue into master with commit 827bb9b Nov 24, 2025
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-any To be landed in any release window

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants