Because of the bug in `tools/parse_unicodedata.nim`, CJK Ideographs were
not considered letters in `isAlpha()`, even though they have category
Lo. This is because they are specified as range in `UnicodeData.txt`,
not as separate characters:
```
4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
9FEF;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;
```
The parser was not prepared to parse such ranges and thus omitted almost
all CJK Ideographs from consideration.
To fix this, we need to consider ranges from `UnicodeData.txt` in
`tools/parse_unicodedata.nim`.