@xl0 xl0 commented Nov 7, 2025

I've had a look at #1883 and fixed a bunch of formatting issues in the generated llms.txt. The current post-processing relies on this set of regexes:

const REGEX_PATTERNS = {
	multipleNewlines: /\n{3,}/g,
	bulletSpacing: /- \n\s+/g,
	multiLineBullets: /(- [^\n]*)(?:\n\s+([^\n-][^\n]*))/g,
	startLineSpaces: /(\n|^)[ \t]+\n/g,
	endLineSpaces: /\n[ \t]+($|\n)/g,
	inlineCodeBefore: /(\S+)\s*\n\s*(`[^`]+?`)/g,
	inlineCodeAfter: /(`[^`]+?`)\s*\n\s*(\S+)/g,
	parenCodeStart: /\(\s*\n\s*(`[^`]+?`)/g,
	parenCodeEnd: /(`[^`]+?`)\s*\n\s*\)/g,
	escapedBackticks: /\\`([^`]+?)\\`/g,
	codeBlockIndent: /```([a-z]*)\n\t/g,
	htmlComments: /<!--.*?-->/gs,
} as const;

As expected, the overlapping regexes lead to multiple issues:

  • code blocks collide with other elements:
- You can add scoped styles, transitions, actions, etc. directly to the element ## How It Works
1. 
   An **outer wrapper element** with `{...wrapperProps}` 2. 
   An **inner content element** with `{...props}` ```svelte
- `Combobox.Content` - `DatePicker.Content` - `DateRangePicker.Content` - `DropdownMenu.Content` - `LinkPreview.Content` - `Menubar.Content` - `Popover.Content` - `Select.Content` - `Tooltip.Content` ## Examples
  • Numbered lists get an extra newline:
1. 
   The component passes all internal props and your custom props passed to the component via the `props` snippet parameter
2. 
   You decide which element receives these props
  • Tables get those humongous ----- separator rows:
| Property                                                                                    | Type                                                                                                                                                                                                                                                              | Description                                                                                                                                                                                                                                     | Details                                                                                                                                                                                        |
| ------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `type` required | `enum`- 'single' \| 'multiple'                                                          | The type of accordion. If set to `'multiple'`, the accordion will allow multiple items to be open at the same time. If set to `single`, the accordion will only allow a single item to be open.`Default:  undefined`           |                                      |

This would be more reasonable:

| Property          | Type                                                                  | Description                                                                                                                                                                                                                       | Details |
| ----------------- | --------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
| `type` required   | `enum` - 'single' \| 'multiple'                                       | The type of accordion. If set to `'multiple'`, the accordion will allow multiple items to be open at the same time. If set to `single`, the accordion will only allow a single item to be open.`Default:  —— undefined`           |         |
  • htmlComments discards comments inside the code examples too, not just HTML comments in the surrounding markup.
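
The first and last bullets can be reproduced in isolation. This is a minimal sketch using two of the patterns listed above; the replacement strings (`"$1 $2"` and `""`) are assumptions, since only the patterns themselves are quoted here.

```typescript
// Two patterns copied from REGEX_PATTERNS above. The replacement strings
// used below are assumptions; only the patterns are quoted in this PR.
const inlineCodeAfter = /(`[^`]+?`)\s*\n\s*(\S+)/g;
// Equivalent to /<!--.*?-->/gs, built via the constructor so it also
// compiles when targeting older ECMAScript versions.
const htmlComments = new RegExp("<!--.*?-->", "gs");

// 1. A bullet ending in inline code, followed by a blank line and a heading.
const listThenHeading = "- `Tooltip.Content`\n\n## Examples";
const collided = listThenHeading.replace(inlineCodeAfter, "$1 $2");
// The blank line is swallowed and the heading is glued onto the bullet.

// 2. An HTML comment inside a fenced example, next to one outside it.
const fence = "`".repeat(3);
const example = [
	"<!-- docs-only note -->",
	fence + "svelte",
	"<!-- keep me: part of the example -->",
	"<div />",
	fence,
].join("\n");
const stripped = example.replace(htmlComments, "");
// Both comments are gone: the regex cannot tell a code fence from prose.

console.log(JSON.stringify(collided));
console.log(stripped.includes("keep me"));
```

The pattern has no notion of block boundaries, so any element that happens to follow inline code is a candidate for being merged into it.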

In addition, we are not handling the double HTML encoding in prop tables, which is an unrelated issue:

| `child`                                                   | `Snippet`- type SnippetProps = \&#123; props: Record\&lt;string, unknown\&gt;; \&#125;; | Use render delegation to render your own element. See [Child Snippet](/docs/child-snippet) docs for more information.`Default:  undefined`                                                                                     |  | | Data Attribute                           | 

This should be

| `child`           | `Snippet` - type SnippetProps = { props: Record\<string, unknown>; }; | Use render delegation to render your own element. See [Child Snippet](/docs/child-snippet) docs for more information.`Default:  —— undefined`                                                                                     |         |

(\< is added by remark to avoid opening a tag)
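
For illustration, undoing the entity encoding on a single cell could look roughly like this. `decodeEscapedEntities` and `NAMED_ENTITIES` are hypothetical names, not taken from this PR, and the sketch only handles numeric entities plus a handful of named ones.

```typescript
// Hypothetical helper, not taken from the PR: undo one level of HTML entity
// encoding in a table cell, including entities preceded by a markdown "\".
const NAMED_ENTITIES = new Map<string, string>([
	["lt", "<"],
	["gt", ">"],
	["amp", "&"],
	["quot", '"'],
]);

function decodeEscapedEntities(cell: string): string {
	return cell.replace(
		/\\?&(?:#(\d+)|#x([0-9a-fA-F]+)|([a-z]+));/g,
		(match: string, dec?: string, hex?: string, name?: string): string => {
			if (dec !== undefined) return String.fromCodePoint(parseInt(dec, 10));
			if (hex !== undefined) return String.fromCodePoint(parseInt(hex, 16));
			if (name !== undefined) return NAMED_ENTITIES.get(name) ?? match;
			return match; // not reachable with this pattern, kept for safety
		},
	);
}

const raw =
	"type SnippetProps = \\&#123; props: Record\\&lt;string, unknown\\&gt;; \\&#125;;";
console.log(decodeEscapedEntities(raw));
```

The output contains a bare `<`; as noted above, remark adds the `\<` escape itself when the text is serialized back to markdown.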

Unrelated issue - the `code` was not rendered properly from the frontmatter description:
image

Fixed, which also fixes it for llms.txt:
image

This PR fixes all of these issues. I had a good look at the generated llms.txt, and among a large number of improvements I don't see any new issues.

Here's a commit I made in a separate branch to compare the before and after: xl0@2f17d25

Instead of relying on regex, I worked on the remark AST for the few instances that actually required work.
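
To illustrate why the AST helps, here is a minimal sketch of comment stripping done on nodes rather than on text. It is not the code from this PR: `MdNode` is a simplified stand-in for the mdast node shape, and `dropHtmlComments` is a hypothetical name. It relies on markdown-level HTML being parsed as `html` nodes while fenced examples become `code` nodes.

```typescript
// Simplified stand-in for the mdast node shape (assumption: a real pipeline
// would use the node types from the remark ecosystem instead).
interface MdNode {
	type: string;
	value?: string;
	children?: MdNode[];
}

// True only for `html` nodes whose whole value is a single comment.
const isHtmlComment = (node: MdNode): boolean =>
	node.type === "html" &&
	/^<!--(?:(?!-->)[\s\S])*-->$/.test((node.value ?? "").trim());

// Returns a copy of the tree without comment-only `html` nodes.
// `code` nodes are never inspected, so comments inside examples survive.
function dropHtmlComments(node: MdNode): MdNode {
	if (!node.children) return node;
	return {
		...node,
		children: node.children
			.filter((child) => !isHtmlComment(child))
			.map(dropHtmlComments),
	};
}

const tree: MdNode = {
	type: "root",
	children: [
		{ type: "html", value: "<!-- docs-only note -->" },
		{ type: "code", value: "<!-- part of the example -->\n<div />" },
	],
};

const cleaned = dropHtmlComments(tree);
console.log(cleaned.children?.map((child) => child.type));
```

Because the decision is made per node type, there is nothing for a second transform to collide with, which is the failure mode the regex pipeline had.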

I also took the liberty of getting rid of the Unicode replacement stuff. LLMs have no problem with Unicode tokenization, and we were missing a bunch of Unicode characters, mainly in examples:
image

changeset-bot bot commented Nov 7, 2025

⚠️ No Changeset found

Latest commit: 4b6e2b1

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

github-actions bot commented Nov 7, 2025

built with Refined Cloudflare Pages Action

⚡ Cloudflare Pages Deployment

| Name    | Status              | Preview       | Last Commit |
| ------- | ------------------- | ------------- | ----------- |
| bits-ui | ✅ Ready (View Log) | Visit Preview | 4b6e2b1     |
