Nim/tests/stdlib
Pedro Batista 4f6b727d9e pegs: accept UTF-8 bytes in bare identifier terminals (#25829)
## Summary
- Fixes `std/pegs` lexing for bare UTF-8 terminals such as `\i café`.
- The lexer previously stopped at the first non-ASCII byte, so
`pkTerminalIgnoreCase` never saw the full term despite its rune-aware
`fastRuneAt`/`toLower` matching.
- The lexer now keeps non-ASCII bytes in identifier-style terminals, while
ASCII non-identifier characters still terminate the symbol.
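
The scanning rule described above can be sketched in a few lines. This is only an illustration, not the actual `std/pegs` lexer: `scanAsciiOnly`, `scanBareTerminal`, and the local `IdentChars` set are hypothetical names, and the identifier set is assumed to be ASCII letters, digits, and `_`.

```nim
# Hypothetical sketch of the lexing rule; not the real std/pegs code.
# Assumption: identifier characters are ASCII letters, digits and '_'.
const IdentChars = {'a'..'z', 'A'..'Z', '0'..'9', '_'}

proc scanAsciiOnly(s: string; start = 0): string =
  ## Old behavior: stop at the first byte that is not an ASCII ident char,
  ## which includes the lead byte of any multi-byte UTF-8 sequence.
  var i = start
  while i < s.len and s[i] in IdentChars:
    result.add s[i]
    inc i

proc scanBareTerminal(s: string; start = 0): string =
  ## New behavior: also keep every byte >= 0x80 (UTF-8 lead and
  ## continuation bytes); ASCII non-ident characters still end the symbol.
  var i = start
  while i < s.len and (s[i] in IdentChars or ord(s[i]) >= 0x80):
    result.add s[i]
    inc i

doAssert scanAsciiOnly("café rest") == "caf"      # truncated at 'é'
doAssert scanBareTerminal("café rest") == "café"  # space ends the symbol
doAssert scanBareTerminal("café)") == "café"      # ')' ends the symbol
```

Working at the byte level is enough here because every byte of a multi-byte UTF-8 sequence has its high bit set, so no decoding is needed to tell such bytes apart from ASCII delimiters.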

## Behavior
Before: `match("CAFÉ", peg"\i café")` failed because the terminal was
lexed as `caf`.
After: `match("CAFÉ", peg"\i café")`, `match("Café", peg"\i café")`, and
`findAll` over mixed-case occurrences pass.

`std/pegs` documents `useUnicode = true` as proper UTF-8 support, and
quoted terminals already preserved the same bytes; this makes bare
terminals consistent with that path.

I did not find an existing relevant issue or PR in searches for
pegs/unicode/utf8/getSymbol/pkTerminalIgnoreCase.
2026-05-19 23:27:48 +02:00