fix: Fix `crawler_runtime` not being updated during run and only in the end #1540

Pijukatel · 2025-11-07T10:19:51Z

Description

Fix `BasicCrawler.statistics.state.crawler_runtime` so that it is updated on each `BasicCrawler.statistics.calculate()` call while the statistics are still active.
Do not update `BasicCrawler.statistics.state.crawler_runtime` on `BasicCrawler.statistics.calculate()` once the statistics are deactivated, to avoid confusing differences between the logged and persisted state.
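
For illustration, the intended behavior can be sketched roughly as follows. This is not the actual Crawlee implementation: `RuntimeTracker`, its `start()`/`stop()` methods, and the injectable `clock` are hypothetical names used only to show the "refresh while active, freeze once deactivated" rule.

```python
from datetime import datetime, timedelta, timezone


class RuntimeTracker:
    """Hypothetical stand-in for a statistics object (not Crawlee's class)."""

    def __init__(self, clock=lambda: datetime.now(timezone.utc)) -> None:
        self._clock = clock  # injectable so the behavior is easy to test
        self._active = False
        self._started_at = None
        self.crawler_runtime = timedelta()

    def start(self) -> None:
        self._active = True
        self._started_at = self._clock()

    def stop(self) -> None:
        # Take one last measurement, then freeze the value.
        self._refresh()
        self._active = False

    def _refresh(self) -> None:
        # Only an active tracker is allowed to move the runtime forward.
        if self._active and self._started_at is not None:
            self.crawler_runtime = self._clock() - self._started_at

    def calculate(self) -> timedelta:
        # Active: runtime is refreshed on every call.
        # Inactive: the frozen value is returned unchanged, so a value
        # logged after shutdown matches the one that was persisted.
        self._refresh()
        return self.crawler_runtime
```

With this shape, calling `calculate()` mid-run reports the elapsed time so far, while calling it after `stop()` keeps returning the value captured at shutdown.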

Issues

Closes: #1541

Testing

Added test.

Checklist

CI passed

Pijukatel · 2025-11-07T13:35:03Z

src/crawlee/crawlers/_adaptive_playwright/_adaptive_playwright_crawler.py

    async def __aenter__(self) -> Self:
        self._active = True
        await self._state.initialize()
-        self._after_initialize()


I am not sure what this call was for in these dummy statistics. Could you please double-check, @janbuchar?

It was probably so that record_* methods wouldn't randomly fail during context pipeline execution because of incorrectly initialized state. I assume it's not necessary anymore?

I haven't seen any failures in the tests, and those methods do execute. The same holds on master when this line is deleted, so I guess it was made redundant by some other change?

Probably 🤞

janbuchar

Seems legit, just one issue with the readability of a comment.

janbuchar · 2025-11-10T20:59:49Z

src/crawlee/crawlers/_adaptive_playwright/_adaptive_playwright_crawler.py

    async def __aenter__(self) -> Self:
        self._active = True
        await self._state.initialize()
-        self._after_initialize()


It was probably so that record_* methods wouldn't randomly fail during context pipeline execution because of incorrectly initialized state. I assume it's not necessary anymore?

janbuchar · 2025-11-10T21:03:34Z

src/crawlee/statistics/_statistics.py

        # Flag to indicate the context state.
        self._active = False

+        # Pre-existing runtime offset when importing existing statistics.


What does importing existing statistics mean here? Like restoring serialized state from KVS?

Yes, updated comment.
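
As a rough illustration of what such an offset is for: when a run is restored from persisted state, the runtime accumulated earlier has to be carried over and combined with the time elapsed in the current session. The helper below is a hypothetical sketch under that assumption (`resumed_runtime` and its parameters are not names from the PR), not the code that was merged.

```python
from datetime import timedelta


def resumed_runtime(persisted_runtime: timedelta, session_elapsed: timedelta) -> timedelta:
    """Runtime of a resumed run: earlier (persisted) runtime plus the current session.

    Assumption for illustration only: the offset is simply added to the
    time elapsed since the statistics were re-activated.
    """
    return persisted_runtime + session_elapsed


# A run that was persisted after 90 s and has now been active again for 30 s.
offset = timedelta(seconds=90)
print(resumed_runtime(offset, timedelta(seconds=30)))  # 0:02:00
```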

vdusek

LGTM

Fix runtime not being updated during run and only in the end

c0bbdb4

github-actions bot assigned Pijukatel Nov 7, 2025

github-actions bot added this to the 127th sprint - Tooling team milestone Nov 7, 2025

github-actions bot added t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programatically for some analytics. labels Nov 7, 2025

Pijukatel added adhoc Ad-hoc unplanned task added during the sprint. bug Something isn't working. labels Nov 7, 2025

Ensure logging consistency with exported state

ee62ae6

Pijukatel removed the adhoc Ad-hoc unplanned task added during the sprint. label Nov 7, 2025

Revert testing change

7c180bb

Pijukatel commented Nov 7, 2025

Pijukatel requested review from janbuchar and vdusek and removed request for vdusek November 7, 2025 13:37

Pijukatel marked this pull request as ready for review November 10, 2025 09:18

janbuchar approved these changes Nov 10, 2025

vdusek approved these changes Nov 11, 2025

Update comment

c65fb3b

Pijukatel merged commit 0d6c3f6 into master Nov 11, 2025
19 checks passed

Pijukatel deleted the fix-missing-runtime branch November 11, 2025 10:35
