Commit Graph

49 Commits

Author SHA1 Message Date
Araq
3caf108425 system/unicode: check for buffer overflows; refs #5284 2017-02-08 15:22:36 +01:00
Andreas Rumpf
54cc702351 update stdlib to not use deprecated symbols 2016-08-25 17:21:48 +02:00
Hans Raaf
3cea6e8a96 Added iterator for utf8 strings 2016-07-13 00:25:31 +02:00
Joey Payne
e0203a4463 Add useful unicode procs for string manipulation
Added: isUpper, isLower, isAlpha, isWhiteSpace, toUpper,
toLower, and capitalize

Renamed strutils procs that are similar to avoid conflicts
2016-07-01 07:37:35 -06:00
Joey Payne
f6e30981a8 Add new procs for string manipulation
Add center, isTitle, title, partition, rpartition, rsplit, swapCase,
translate, and expandTabs
2016-06-13 20:54:23 -06:00
Hans Raaf
2791915d7f Optimized end offsets and added tests.
I hope this also shows that there are use cases. I still think the user
should get warned about performance issues with those procs, which I
added to the doc comments.
2016-06-02 17:47:33 +02:00
Hans Raaf
ac6de565ec More work in optimizing, names and added substr().
This is work in progress. I added a unicode substring. Tried to handle
edge cases more consistently too.
2016-06-02 17:43:10 +02:00
Hans Raaf
1138cf5234 Some procs to deal with Rune position base indexing.
It can't be perfect but at least one can index on rune position
efficiently.
2016-06-02 17:43:10 +02:00
Parashurama
a98705dddc change 'Rune' type in unicode module to 'int32' 2016-06-02 00:02:27 +02:00
theduke
25b605a3a2 validateUtf8: catch overlong ascii
Make unicode.validateUtf8() check for overlong ascii representations, which are 2 bytes long and start with c0 or c1.
2015-11-26 16:05:24 +01:00
Araq
ab6f8f6e5b fixes unicode.lastRune 2015-09-29 19:30:44 +02:00
Araq
73279aba39 added unicode.lastRune, unicode.graphemeLen 2015-09-21 15:49:46 +02:00
Adam Strzelecki
43bddf62dd lib: Trim .nim files trailing whitespace
via OSX: find . -name '*.nim' -exec sed -i '' -E 's/[[:space:]]+$//' {} +
2015-09-04 23:03:56 +02:00
apense
48b0de8ab4 Corrected proc name in assertion 2015-07-09 13:49:47 -04:00
apense
5fd7b7850a Corrected documentation 2015-07-09 13:45:20 -04:00
apense
c334e89ee7 Renamed to toRunes 2015-07-04 15:07:29 -04:00
apense
64b3395ade Added new proc
In reference to #2353
2015-07-03 21:33:12 -04:00
apense
0ee1672d69 Updated whitespace ranges
Ranges sourced from http://www.unicode.org/Public/7.0.0/ucd/PropList.txt. Wikipedia also uses these ranges on its information page, http://en.wikipedia.org/wiki/Whitespace_character#Unicode. 0xfeff isn't included in the list, but it is a no-break space, so I guess it makes sense. 0x200b is actually a format character, but it is a zero-width space. To fit Unicode, both 0x200b and 0xfeff would be removed.
2015-06-08 19:48:57 -04:00
Araq
d3fc6e1f28 marshalling can be done at compile-time 2015-04-25 23:17:00 +02:00
def
22b4e4c2f2 Use more Natural and Positive numbers in proc parameters
- Didn't go through all modules, only the main ones I thought of
- Building the compiler and tests still work
2015-04-06 02:24:17 +02:00
def
bacb91002a make toUTF8 support up to 6 bytes 2015-03-03 21:25:28 +01:00
def
512db9aea6 Fix documentation a bit in unicode 2015-02-14 19:57:32 +01:00
def
ae7ca46a09 Optimize unicode.reversed
Runs about 18 times faster:
- combining characters with boolean logic instead of binary search
- No more temporary sequence
- Optimize for ASCII characters
2015-01-15 23:11:02 +01:00
def
0a82b6eb62 Add reversed proc to unicode module 2015-01-02 23:52:46 +01:00
Araq
11b6958755 big rename 2014-08-27 23:42:51 +02:00
Araq
36afdca87f resolved conflicts with master 2014-01-18 01:16:45 +01:00
Araq
438703f59e case consistency: next steps 2013-12-29 01:13:51 +01:00
Araq
92b8fac94a case consistency part 4 2013-12-27 23:10:36 +01:00
Araq
2df9b442c6 case consistency part 1 2013-12-27 15:48:53 +01:00
Satish BD
40bd63f83b Define $ operator for TRune 2013-12-26 00:55:17 +02:00
Satish BD
69b816f07c Define $ operator for TRune 2013-12-26 00:41:43 +02:00
Araq
98cf1c412a garbage-in-garbage-out principle for unicode errors; fixes #674 2013-11-19 14:39:27 +01:00
Grzegorz Adam Hankiewicz
75be9c8d55 Implements $ proc for a sequence of TRunes. 2013-03-11 23:49:03 +01:00
Zahary Karadjov
b11fe5d0b4 more uint related fixes 2012-06-14 17:33:00 +03:00
Araq
3628731064 unicode: invalid utf-8 bytes are preserved 2012-04-13 18:52:54 +02:00
Araq
4f1b89c30c year 2012 for most copyright headers 2012-01-02 23:07:35 +01:00
Araq
c8dda8cc6f attempt to fix tunidecode test; GC cares for seq->openArray conversions 2011-11-21 01:33:18 +01:00
Araq
e424e13bd9 various bugfixes for generics; added generic sort proc 2011-03-03 02:01:22 +01:00
Andreas Rumpf
8098e2a421 inlining of the write barrier for dlls 2010-08-08 22:45:21 +02:00
Andreas Rumpf
cb21b0e7a7 unicode.nim compiles again 2010-05-29 00:48:51 +02:00
Andreas Rumpf
6c20509121 explicit types for generic routines 2010-05-28 23:32:46 +02:00
Andreas Rumpf
40ea1d0330 fixed pango/pangoutils new wrappers 2010-02-26 01:26:16 +01:00
rumpf_a@web.de
6bc16904ed bugfixes for unicode; xmlparser; htmlparser; scanner 2010-02-20 19:21:38 +01:00
rumpf_a@web.de
40a5d6c3b9 continued work on html/xmlparser 2010-02-14 00:29:35 +01:00
Andreas Rumpf
eca05d2a33 cleanup of library docs 2010-02-04 00:47:59 +01:00
Andreas Rumpf
ac421c37ba bind table 2009-11-12 19:34:21 +01:00
Andreas Rumpf
3f3dda5a77 implemented multi methods 2009-09-23 23:38:00 +02:00
Andreas Rumpf
66a7e3d37c added tools and web dirs 2009-09-15 23:22:22 +02:00
Andreas Rumpf
4d4b3b1c04 version 0.7.10 2009-06-08 08:06:25 +02:00