2 Commits

Author SHA1 Message Date
Alexander Kernozhitsky
b172b34a24 Treat CJK Ideographs as letters in isAlpha() (#23651)
Because of the bug in `tools/parse_unicodedata.nim`, CJK Ideographs were
not considered letters in `isAlpha()`, even though they have category
Lo. This is because they are specified as range in `UnicodeData.txt`,
not as separate characters:

```
4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
9FEF;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;
```

The parser was not prepared to parse such ranges and thus omitted almost
all CJK Ideographs from consideration.

To fix this, we need to consider ranges from `UnicodeData.txt` in
`tools/parse_unicodedata.nim`.
2024-05-29 06:42:07 +02:00
Miran
aeb30a72c0 update unicode.nim (#10921)
* update unicode.nim

* create a script to create the needed unicode data
* make unicode.nim compatible with Unicode v12.0.0
* slightly improve unicode.nim documentation (fixes #4795)

* more documentation
2019-03-31 08:36:04 +02:00