Pedro Batista 4f6b727d9e pegs: accept UTF-8 bytes in bare identifier terminals (#25829)
## Summary
- Fixes `std/pegs` lexing for bare UTF-8 terminals such as `\i café`.
- The lexer previously stopped at the first non-ASCII byte, so
`pkTerminalIgnoreCase` never received the full terminal, even though its
matching is rune-aware (`fastRuneAt`/`toLower`).
- The lexer now keeps non-ASCII bytes in identifier-style terminals, while
ASCII non-identifier characters still terminate the symbol.
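
The scanning rule described above can be sketched in isolation. This is a minimal illustration, not the actual `std/pegs` lexer: `scanSymbol` and its `keepNonAscii` flag are hypothetical names introduced here, and only `IdentChars` from `std/strutils` is a real identifier.

```nim
import std/strutils

# Hypothetical helper for illustration only; not the real lexer code in std/pegs.
proc scanSymbol(s: string; start = 0; keepNonAscii = true): string =
  ## Collects an identifier-style symbol starting at `start`.
  ## With `keepNonAscii = false` the scan stops at the first byte >= 0x80
  ## (the old behavior described above); with `true` those bytes are kept.
  var pos = start
  while pos < s.len:
    let c = s[pos]
    if c in IdentChars or (keepNonAscii and c >= '\x80'):
      result.add c
      inc pos
    else:
      break  # an ASCII non-identifier character ends the symbol

doAssert scanSymbol("café", keepNonAscii = false) == "caf"
doAssert scanSymbol("café") == "café"
doAssert scanSymbol("café+x") == "café"
```

Treating every byte at or above `0x80` as part of the symbol keeps multi-byte UTF-8 sequences intact without decoding them in the lexer; rune-level handling is left to the matcher.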

## Behavior
- Before: `match("CAFÉ", peg"\i café")` failed because the terminal was
lexed as `caf`.
- After: `match("CAFÉ", peg"\i café")`, `match("Café", peg"\i café")`, and
`findAll` over mixed-case occurrences pass.

`std/pegs` documents `useUnicode = true` as proper UTF-8 support, and
quoted terminals already preserved the same bytes; this makes bare
terminals consistent with that path.
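
To compare the two paths locally, a small script along these lines may help. It is a sketch under assumptions: the result for the bare terminal depends on whether the installed Nim includes this change, and the `try`/`except` is there only because older versions might reject the pattern instead of returning `false`.

```nim
import std/pegs

# Quoted terminal: the string literal keeps every byte of "café".
let quoted = peg"\i 'café'"
echo "quoted: ", match("CAFÉ", quoted)

# Bare terminal: the outcome depends on the Nim version in use.
try:
  let bare = peg"\i café"
  echo "bare: ", match("CAFÉ", bare)
  echo "findAll: ", findAll("Café, CAFÉ, café", bare)
except CatchableError as e:
  # Assumption: a version without the fix may raise while parsing the PEG.
  echo "bare terminal rejected: ", e.msg
```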

I did not find an existing relevant issue or PR in searches for
pegs/unicode/utf8/getSymbol/pkTerminalIgnoreCase.
2026-05-19 23:27:48 +02:00