Commit 21b05d2
refactor: custom lexer (#437)
- adds a new `tokenizer` crate that turns a string into simple tokens
- adds a new `lexer` + `lexer_codegen` that uses the tokenizer to lex into a new `SyntaxKind` enum
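To illustrate the two-stage layering described above, here is a minimal sketch. It is not the code from this commit: `TokenKind`, `Token`, the toy `SyntaxKind` variants, and the `tokenize`/`lex` functions are hypothetical stand-ins. It assumes the tokenizer only emits coarse, keyword-agnostic tokens with a length, and that the lexer maps them onto a richer enum by looking at the source slice.

```rust
/// Stage 1: coarse tokens that know nothing about SQL keywords (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TokenKind {
    Ident,
    Whitespace,
    Semicolon,
    Other,
}

/// A token stores only its kind and byte length; no per-token string is allocated.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Token {
    kind: TokenKind,
    len: usize,
}

/// Stage 2: toy stand-in for the generated `SyntaxKind` enum (assumed variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyntaxKind {
    SelectKw,
    FromKw,
    Ident,
    Whitespace,
    Semicolon,
    Error,
}

/// Splits the input into coarse tokens that cover every byte of the input.
fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = input.char_indices().peekable();
    while let Some((start, c)) = chars.next() {
        let kind = if c.is_alphabetic() || c == '_' {
            while chars.next_if(|&(_, n)| n.is_alphanumeric() || n == '_').is_some() {}
            TokenKind::Ident
        } else if c.is_whitespace() {
            while chars.next_if(|&(_, n)| n.is_whitespace()).is_some() {}
            TokenKind::Whitespace
        } else if c == ';' {
            TokenKind::Semicolon
        } else {
            TokenKind::Other
        };
        // The token ends where the next unread character starts.
        let end = chars.peek().map_or(input.len(), |&(i, _)| i);
        tokens.push(Token { kind, len: end - start });
    }
    tokens
}

/// Maps coarse tokens onto `SyntaxKind`, borrowing each token's text from `input`.
fn lex(input: &str) -> Vec<(SyntaxKind, &str)> {
    let mut offset = 0;
    tokenize(input)
        .into_iter()
        .map(|t| {
            let text = &input[offset..offset + t.len];
            offset += t.len;
            let kind = match t.kind {
                TokenKind::Ident if text.eq_ignore_ascii_case("select") => SyntaxKind::SelectKw,
                TokenKind::Ident if text.eq_ignore_ascii_case("from") => SyntaxKind::FromKw,
                TokenKind::Ident => SyntaxKind::Ident,
                TokenKind::Whitespace => SyntaxKind::Whitespace,
                TokenKind::Semicolon => SyntaxKind::Semicolon,
                TokenKind::Other => SyntaxKind::Error,
            };
            (kind, text)
        })
        .collect()
}
```

In this sketch, calling `lex("select id from t;")` yields borrowed slices of the input, so concatenating the token texts reproduces the original string.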
The new implementation is:
- much more performant (no extra string allocations, no call to a C library)
- works with broken strings (!!!!)
- custom-made for our use case (e.g. the `LineEnding` variant comes with a count)
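As a rough illustration of the last two points, the sketch below shows one way a tokenizer could attach a count to a line-ending token and keep going when a string literal is never closed. It is an assumption-laden toy, not the implementation in this commit: the `Tok` enum, its `terminated` flag, and `tokenize` are hypothetical names, and quote escaping (`''`) is deliberately ignored.

```rust
/// Hypothetical token kinds for this sketch only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tok {
    /// A run of consecutive line endings; `count` is how many there were.
    LineEnding { count: usize },
    /// A single-quoted string; `terminated` is false if the closing quote is missing.
    Str { terminated: bool },
    /// Anything else, up to the next quote or line ending.
    Other,
}

/// Tokenizes `input` without allocating strings: each token borrows its text.
/// Never fails: an unterminated string simply runs to the end of the input.
fn tokenize(input: &str) -> Vec<(Tok, &str)> {
    let bytes = input.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        let start = i;
        let kind = match bytes[i] {
            b'\n' | b'\r' => {
                let mut count = 0;
                while i < bytes.len() && (bytes[i] == b'\n' || bytes[i] == b'\r') {
                    // Treat "\r\n" as a single line ending.
                    if bytes[i] == b'\r' && bytes.get(i + 1) == Some(&b'\n') {
                        i += 1;
                    }
                    i += 1;
                    count += 1;
                }
                Tok::LineEnding { count }
            }
            b'\'' => {
                i += 1; // skip the opening quote
                let mut terminated = false;
                while i < bytes.len() {
                    let is_quote = bytes[i] == b'\'';
                    i += 1;
                    if is_quote {
                        terminated = true;
                        break;
                    }
                }
                Tok::Str { terminated }
            }
            _ => {
                while i < bytes.len() && !matches!(bytes[i], b'\n' | b'\r' | b'\'') {
                    i += 1;
                }
                Tok::Other
            }
        };
        // Token boundaries always fall on ASCII bytes or the end of the input,
        // so slicing here cannot split a multi-byte character.
        out.push((kind, &input[start..i]));
    }
    out
}
```

Carrying the count on the token means a consumer such as a statement splitter could, for example, detect blank lines between statements without re-scanning the text.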
In a follow-up, we will be able to:
- parse custom parameters that popular tools use
- pre-process the input to remove unsupported syntax
- parse non-SQL content (e.g. commands) via a simple custom parser
Todos:
- [x] use new lexer in splitter
- [ ] make sure we support all the different parameter formats popular tools use -> will do it in a follow-up
- [x] tests

1 parent adb7a9e commit 21b05d2
File tree
70 files changed (+3462, -2854 lines changed)
- .claude
- crates
- pgt_diagnostics/src/display
- pgt_lexer_codegen
- postgres/17-6.1.0
- src
- pgt_lexer
- src
- codegen
- pgt_query_ext_codegen/src
- pgt_query_ext
- src
- pgt_statement_splitter
- src
- parser
- splitter
- tests
- pgt_tokenizer
- src
- snapshots
- pgt_workspace/src/workspace/server
- docs/codegen/src
- xtask/rules_check/src