Araq
3caf108425
system/unicode: check for buffer overflows; refs #5284
2017-02-08 15:22:36 +01:00
Andreas Rumpf
54cc702351
update stdlib to not use deprecated symbols
2016-08-25 17:21:48 +02:00
Hans Raaf
3cea6e8a96
Added iterator for utf8 strings
2016-07-13 00:25:31 +02:00
Joey Payne
e0203a4463
Add useful unicode procs for string manipulation
...
Added: isUpper, isLower, isAlpha, isWhiteSpace, toUpper,
toLower, and capitalize
Renamed strutils procs that are similar to avoid conflicts
2016-07-01 07:37:35 -06:00
Joey Payne
f6e30981a8
Add new procs for string manipulation
...
Add center, isTitle, title, partition, rpartition, rsplit, swapCase,
translate, and expandTabs
2016-06-13 20:54:23 -06:00
Hans Raaf
2791915d7f
Optimized end offsets and added tests.
...
I hope this also shows that there are use cases. I still think users
should be warned about the performance issues with those procs, so I
added a warning to the doc comments.
2016-06-02 17:47:33 +02:00
Hans Raaf
ac6de565ec
More work on optimization and naming; added substr().
...
This is a work in progress. I added a Unicode substring. Tried to handle
edge cases more consistently too.
2016-06-02 17:43:10 +02:00
Hans Raaf
1138cf5234
Some procs to deal with Rune-position-based indexing.
...
It can't be perfect but at least one can index on rune position
efficiently.
2016-06-02 17:43:10 +02:00
Parashurama
a98705dddc
change 'Rune' type in unicode module to 'int32'
2016-06-02 00:02:27 +02:00
theduke
25b605a3a2
validateUtf8: catch overlong ascii
...
Make unicode.validateUtf8() check for overlong ASCII representations, which are 2 bytes long and start with 0xC0 or 0xC1.
2015-11-26 16:05:24 +01:00
Araq
ab6f8f6e5b
fixes unicode.lastRune
2015-09-29 19:30:44 +02:00
Araq
73279aba39
added unicode.lastRune, unicode.graphemeLen
2015-09-21 15:49:46 +02:00
Adam Strzelecki
43bddf62dd
lib: Trim .nim files trailing whitespace
...
via OSX: find . -name '*.nim' -exec sed -i '' -E 's/[[:space:]]+$//' {} +
2015-09-04 23:03:56 +02:00
apense
48b0de8ab4
Corrected proc name in assertion
2015-07-09 13:49:47 -04:00
apense
5fd7b7850a
Corrected documentation
2015-07-09 13:45:20 -04:00
apense
c334e89ee7
Renamed to toRunes
2015-07-04 15:07:29 -04:00
apense
64b3395ade
Added new proc
...
In reference to #2353
2015-07-03 21:33:12 -04:00
apense
0ee1672d69
Updated whitespace ranges
...
Ranges sourced from <http://www.unicode.org/Public/7.0.0/ucd/PropList.txt>. Wikipedia also uses these ranges on its information page <http://en.wikipedia.org/wiki/Whitespace_character#Unicode>. 0xfeff isn't included in the list, but it is a no-break space, so I guess it makes sense. 0x200b is actually a format character, but it is a zero-width space. To fit Unicode, both 0x200b and 0xfeff would be removed.
2015-06-08 19:48:57 -04:00
Araq
d3fc6e1f28
marshalling can be done at compile-time
2015-04-25 23:17:00 +02:00
def
22b4e4c2f2
Use more Natural and Positive numbers in proc parameters
...
- Didn't go through all modules, only the main ones I thought of
- Building the compiler and tests still work
2015-04-06 02:24:17 +02:00
def
bacb91002a
make toUTF8 support up to 6 bytes
2015-03-03 21:25:28 +01:00
def
512db9aea6
Fix documentation a bit in unicode
2015-02-14 19:57:32 +01:00
def
ae7ca46a09
Optimize unicode.reversed
...
Runs about 18 times faster:
- combining characters with boolean logic instead of binary search
- No more temporary sequence
- Optimize for ASCII characters
2015-01-15 23:11:02 +01:00
def
0a82b6eb62
Add reversed proc to unicode module
2015-01-02 23:52:46 +01:00
Araq
11b6958755
big rename
2014-08-27 23:42:51 +02:00
Araq
36afdca87f
resolved conflicts with master
2014-01-18 01:16:45 +01:00
Araq
438703f59e
case consistency: next steps
2013-12-29 01:13:51 +01:00
Araq
92b8fac94a
case consistency part 4
2013-12-27 23:10:36 +01:00
Araq
2df9b442c6
case consistency part 1
2013-12-27 15:48:53 +01:00
Satish BD
40bd63f83b
Define $ operator for TRune
2013-12-26 00:55:17 +02:00
Satish BD
69b816f07c
Define $ operator for TRune
2013-12-26 00:41:43 +02:00
Araq
98cf1c412a
garbage-in-garbage-out principle for unicode errors; fixes #674
2013-11-19 14:39:27 +01:00
Grzegorz Adam Hankiewicz
75be9c8d55
Implements $ proc for a sequence of TRunes.
2013-03-11 23:49:03 +01:00
Zahary Karadjov
b11fe5d0b4
more uint related fixes
2012-06-14 17:33:00 +03:00
Araq
3628731064
unicode: invalid utf-8 bytes are preserved
2012-04-13 18:52:54 +02:00
Araq
4f1b89c30c
year 2012 for most copyright headers
2012-01-02 23:07:35 +01:00
Araq
c8dda8cc6f
attempt to fix tunidecode test; GC cares for seq->openArray conversions
2011-11-21 01:33:18 +01:00
Araq
e424e13bd9
various bugfixes for generics; added generic sort proc
2011-03-03 02:01:22 +01:00
Andreas Rumpf
8098e2a421
inlining of the write barrier for dlls
2010-08-08 22:45:21 +02:00
Andreas Rumpf
cb21b0e7a7
unicode.nim compiles again
2010-05-29 00:48:51 +02:00
Andreas Rumpf
6c20509121
explicit types for generic routines
2010-05-28 23:32:46 +02:00
Andreas Rumpf
40ea1d0330
fixed pango/pangoutils new wrappers
2010-02-26 01:26:16 +01:00
rumpf_a@web.de
6bc16904ed
bugfixes for unicode; xmlparser; htmlparser; scanner
2010-02-20 19:21:38 +01:00
rumpf_a@web.de
40a5d6c3b9
continued work on html/xmlparser
2010-02-14 00:29:35 +01:00
Andreas Rumpf
eca05d2a33
cleanup of library docs
2010-02-04 00:47:59 +01:00
Andreas Rumpf
ac421c37ba
bind table
2009-11-12 19:34:21 +01:00
Andreas Rumpf
3f3dda5a77
implemented multi methods
2009-09-23 23:38:00 +02:00
Andreas Rumpf
66a7e3d37c
added tools and web dirs
2009-09-15 23:22:22 +02:00
Andreas Rumpf
4d4b3b1c04
version 0.7.10
2009-06-08 08:06:25 +02:00