@ikreymer commented Dec 5, 2025

  • Ensure /sitemap.xml is parsed even if robots.txt exists but lists no sitemaps.
  • Resolve relative URLs listed in robots.txt, e.g. 'Sitemap: /my-sitemap.xml'.
  • Simplify sitemap detection logic: check robots.txt first, then sitemap.xml or an alternate URL if provided via --useSitemap.
  • Use two main methods, parseSitemap() and parseSitemapFromRobots(), to handle the parsing.
  • Follow-up to the sitemapper refactor to fix concurrency: #930
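
To illustrate the detection order described above, here is a minimal sketch rather than the code in this PR. The helper names `sitemapsFromRobots` and `detectSitemapUrls` are hypothetical, and the fallback behavior (use the robots.txt entries if any exist, otherwise the custom URL or /sitemap.xml) is one reading of the description. Relative `Sitemap:` entries are resolved against the robots.txt URL using the standard `URL` constructor.

```typescript
// Hypothetical helper (not from this PR): collect "Sitemap:" entries from
// robots.txt text and resolve each one against the robots.txt URL, so that
// a relative entry such as "/my-sitemap.xml" becomes an absolute URL.
function sitemapsFromRobots(robotsText: string, robotsUrl: string): string[] {
  const urls: string[] = [];
  for (const line of robotsText.split(/\r?\n/)) {
    // The directive name is matched case-insensitively.
    const match = /^\s*sitemap\s*:\s*(\S+)/i.exec(line);
    if (!match) continue;
    try {
      urls.push(new URL(match[1], robotsUrl).href);
    } catch {
      // Skip entries that cannot be parsed as a URL.
    }
  }
  return urls;
}

// Hypothetical detection order (an assumption about the intended logic):
// 1. use sitemaps listed in robots.txt, if any;
// 2. otherwise fall back to a custom URL, if one was supplied;
// 3. otherwise fall back to /sitemap.xml on the seed's origin.
function detectSitemapUrls(
  seedUrl: string,
  robotsText: string | null,
  customUrl?: string,
): string[] {
  const robotsUrl = new URL("/robots.txt", seedUrl).href;
  const fromRobots = robotsText
    ? sitemapsFromRobots(robotsText, robotsUrl)
    : [];
  if (fromRobots.length > 0) {
    return fromRobots;
  }
  return [customUrl ?? new URL("/sitemap.xml", seedUrl).href];
}
```

With this sketch, a robots.txt that exists but contains no `Sitemap:` lines still yields the site's /sitemap.xml, which is the first case the description calls out.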

- if robots.txt and sitemap.xml both exist but robots.txt lists no sitemap, still parse sitemap.xml
- simplify detection logic to be able to check both robots.txt and sitemap.xml, or queue a custom URL
@ikreymer requested a review from tw4l December 5, 2025 01:57