Skip to content

Conversation

@Pijukatel
Copy link
Collaborator

@Pijukatel Pijukatel commented Nov 7, 2025

Description

  • Fix BasicCrawler.statistics.state.crawler_runtime to be properly updated on each BasicCrawler.statistics.calculate() call when the statistics are still active.
  • Do not update BasicCrawler.statistics.state.crawler_runtime on BasicCrawler.statistics.calculate() when the statistics are already deactivated - to avoid confusing differences between logged and persisted state.

Issues

Closes: #1541

Testing

  • Added test.

Checklist

  • CI passed

@github-actions github-actions bot added this to the 127th sprint - Tooling team milestone Nov 7, 2025
@github-actions github-actions bot added t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics. labels Nov 7, 2025
@Pijukatel Pijukatel added adhoc Ad-hoc unplanned task added during the sprint. bug Something isn't working. labels Nov 7, 2025
@Pijukatel Pijukatel removed the adhoc Ad-hoc unplanned task added during the sprint. label Nov 7, 2025
async def __aenter__(self) -> Self:
self._active = True
await self._state.initialize()
self._after_initialize()
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what this was for these dummy statistics. Could you please double-check @janbuchar ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was probably so that record_* methods wouldn't randomly fail during context pipeline execution because of incorrectly initialized state. I assume it's not necessary anymore?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't seen any failure in tests, and those methods execute. But the same can be seen on master when deleting this line, so I guess it was made redundant by some other change?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably 🤞

@Pijukatel Pijukatel requested review from janbuchar and vdusek and removed request for vdusek November 7, 2025 13:37
@Pijukatel Pijukatel marked this pull request as ready for review November 10, 2025 09:18
Copy link
Collaborator

@janbuchar janbuchar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems legit, just one comment readability issue

async def __aenter__(self) -> Self:
self._active = True
await self._state.initialize()
self._after_initialize()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was probably so that record_* methods wouldn't randomly fail during context pipeline execution because of incorrectly initialized state. I assume it's not necessary anymore?

# Flag to indicate the context state.
self._active = False

# Pre-existing runtime offset when importing existing statistics.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does importing existing statistics mean here? Like restoring serialized state from KVS?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, updated comment.

Copy link
Collaborator

@vdusek vdusek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Pijukatel Pijukatel merged commit 0d6c3f6 into master Nov 11, 2025
19 checks passed
@Pijukatel Pijukatel deleted the fix-missing-runtime branch November 11, 2025 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working. t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Crawler runtime stats updated only in the end

4 participants