Nim/lib/pure
Pedro Batista d77a1abbf0 pegs: accept UTF-8 bytes in bare identifier terminals (#25829)
## Summary
- Fixes `std/pegs` lexing for bare UTF-8 terminals such as `\i café`.
- The lexer previously stopped at the first non-ASCII byte, so
`pkTerminalIgnoreCase` never saw the full term despite its rune-aware
`fastRuneAt`/`toLower` matching.
- The lexer now keeps non-ASCII bytes in identifier-style terminals, while
ASCII non-identifier characters still terminate the symbol.

## Behavior
Before: `match("CAFÉ", peg"\i café")` failed because the terminal was
lexed as `caf`.
After: `match("CAFÉ", peg"\i café")`, `match("Café", peg"\i café")`, and
`findAll` over mixed-case occurrences pass.

`std/pegs` documents `useUnicode = true` as proper UTF-8 support, and
quoted terminals already preserved the same bytes; this makes bare
terminals consistent with that path.
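
A minimal sketch for checking this locally, assuming a Nim build that includes this change and a source file saved as UTF-8. The inputs and patterns are the ones quoted above; the quoted-terminal pattern `\i 'café'` and the sample sentence passed to `findAll` are added here only for comparison. No output is listed: on a build without the fix, the bare-terminal results are expected to differ, since the terminal is then lexed as `caf`.

```nim
import std/pegs

# Bare identifier terminal under the \i (ignore-case) modifier.
let bare = peg"\i café"
# The same terminal written as a quoted string literal, for comparison.
let quoted = peg"\i 'café'"

# Compare both spellings on the inputs from the description.
for input in ["CAFÉ", "Café", "café"]:
  echo input, ": bare=", match(input, bare), " quoted=", match(input, quoted)

# Mixed-case occurrences in one string (illustrative input, not from the PR).
echo findAll("Café, CAFÉ and café", bare)
```

If the bare and quoted columns agree on every input, the two lexing paths are consistent in the way described above.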

I did not find an existing relevant issue or PR in searches for
pegs/unicode/utf8/getSymbol/pkTerminalIgnoreCase.

(cherry picked from commit 4f6b727d9e)
2026-05-22 08:57:41 +02:00