Problem: Characters below 256 that are not one byte are not always
recognized as word characters.
Solution: Make vim_iswordc() and vim_iswordp() work the same way. Add a test
for this. (Ozaki Kiichi)
4019cf90b8
Store text in ScreenLines as UTF-8, so it can be sent as-is to the UI
layer. `utfc_char2bytes(off,buf)` is removed, as `ScreenLines[off]` now
already contains this representation.
To recover the codepoints that the screen arrays previously contained, use
utfc_ptr2char (or utf_ptr2char to ignore composing chars).
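As an illustration of recovering a codepoint from a cell that holds UTF-8 text, here is a minimal standalone sketch. It is not nvim's utf_ptr2char; `cell_t`, `CELL_BYTES` and `first_codepoint` are hypothetical names, and the cell size is an assumption. Like utf_ptr2char, it decodes only the first codepoint and so ignores any composing characters that follow.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CELL_BYTES 8  /* hypothetical capacity, not nvim's actual cell size */

typedef struct {
  char bytes[CELL_BYTES];  /* NUL-terminated UTF-8 text of one screen cell */
} cell_t;

/* Decode the first codepoint of a NUL-terminated UTF-8 string.
 * Falls back to the value of the first byte if the sequence is
 * malformed or cut short by the terminating NUL. */
static uint32_t first_codepoint(const char *s)
{
  assert(s != NULL);
  const unsigned char *p = (const unsigned char *)s;
  uint32_t c = p[0];
  int extra;

  if (c < 0x80) {
    return c;                                  /* ASCII, one byte */
  } else if ((c & 0xE0) == 0xC0) {
    c &= 0x1F; extra = 1;                      /* two-byte sequence */
  } else if ((c & 0xF0) == 0xE0) {
    c &= 0x0F; extra = 2;                      /* three-byte sequence */
  } else if ((c & 0xF8) == 0xF0) {
    c &= 0x07; extra = 3;                      /* four-byte sequence */
  } else {
    return p[0];                               /* invalid lead byte */
  }
  for (int i = 1; i <= extra; i++) {
    if ((p[i] & 0xC0) != 0x80) {
      return p[0];                             /* not a continuation byte (or NUL) */
    }
    c = (c << 6) | (p[i] & 0x3F);
  }
  return c;
}
```

A cell holding "e" followed by U+0301 (combining acute) would yield 0x65 here, since the composing character is skipped.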
NB: This commit does NOT change how screen.c processes incoming UTF-8 data
from buffers, cmdline, messages etc. Any algorithm that operates on UCS-4
(like arabic shaping, treatment of non-printable chars)
is left unchanged for now.
Patch-by: oni-link <knil.ino@gmail.com>
Closes #6203
https://s3.amazonaws.com/archive.travis-ci.org/jobs/206794197/log.txt
References #3161
[ RUN ] ...d/neovim/neovim/test/functional/terminal/buffer_spec.lua @ 199: terminal buffer term_close() use-after-free #4393
./test/functional/helpers.lua:187: attempt to perform arithmetic on local 'written' (a nil value)
stack traceback:
./test/functional/helpers.lua:187: in function 'nvim_feed'
./test/functional/helpers.lua:329: in function 'execute'
...d/neovim/neovim/test/functional/terminal/buffer_spec.lua:206: in function <...d/neovim/neovim/test/functional/terminal/buffer_spec.lua:199>
[ ERROR ] ...d/neovim/neovim/test/functional/terminal/buffer_spec.lua @ 199: terminal buffer term_close() use-after-free #4393 (199.47 ms)
==================== File /home/travis/build/neovim/neovim/build/log/ubsan.15466 ====================
= =================================================================
= ==15466==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x621000029101 at pc 0x000000ea7ba0 bp 0x7ffd5bb628c0 sp 0x7ffd5bb628b8
= READ of size 1 at 0x621000029101 thread T0
= #0 0xea7b9f in utf_head_off /home/travis/build/neovim/neovim/src/nvim/mbyte.c:1637:7
= #1 0xeaaf53 in mb_adjustpos /home/travis/build/neovim/neovim/src/nvim/mbyte.c:1840:16
= #2 0xeaab48 in mb_adjust_cursor /home/travis/build/neovim/neovim/src/nvim/mbyte.c:1825:3
= #3 0x11000d0 in normal_finish_command /home/travis/build/neovim/neovim/src/nvim/normal.c:928:5
= #4 0x1077df1 in normal_execute /home/travis/build/neovim/neovim/src/nvim/normal.c:1147:3
= #5 0x16ff943 in state_enter /home/travis/build/neovim/neovim/src/nvim/state.c:58:26
= #6 0x102d8db in normal_enter /home/travis/build/neovim/neovim/src/nvim/normal.c:463:3
= #7 0xdf3398 in main /home/travis/build/neovim/neovim/src/nvim/main.c:540:3
= #8 0x2b973e8b4f44 in __libc_start_main /build/eglibc-oGUzwX/eglibc-2.19/csu/libc-start.c:287
= #9 0x447445 in _start (/home/travis/build/neovim/neovim/build/bin/nvim+0x447445)
=
= 0x621000029101 is located 1 bytes to the right of 4096-byte region [0x621000028100,0x621000029100)
= allocated by thread T0 here:
= #0 0x4f17b8 in malloc (/home/travis/build/neovim/neovim/build/bin/nvim+0x4f17b8)
= #1 0xf1f374 in try_malloc /home/travis/build/neovim/neovim/src/nvim/memory.c:84:15
= #2 0xf1f534 in xmalloc /home/travis/build/neovim/neovim/src/nvim/memory.c:118:15
= #3 0xebe6a8 in mf_alloc_bhdr /home/travis/build/neovim/neovim/src/nvim/memfile.c:646:17
= #4 0xebc394 in mf_new /home/travis/build/neovim/neovim/src/nvim/memfile.c:297:12
= #5 0xed1368 in ml_new_data /home/travis/build/neovim/neovim/src/nvim/memline.c:2704:16
= #6 0xece6ab in ml_open /home/travis/build/neovim/neovim/src/nvim/memline.c:349:8
= #7 0x6438ad in open_buffer /home/travis/build/neovim/neovim/src/nvim/buffer.c:109:7
= #8 0xa6ec8d in do_ecmd /home/travis/build/neovim/neovim/src/nvim/ex_cmds.c:2489:24
= #9 0xb5a0f9 in do_exedit /home/travis/build/neovim/neovim/src/nvim/ex_docmd.c:6723:9
= #10 0xb791f8 in ex_edit /home/travis/build/neovim/neovim/src/nvim/ex_docmd.c:6651:3
= #11 0xb28b43 in do_one_cmd /home/travis/build/neovim/neovim/src/nvim/ex_docmd.c:2198:5
= #12 0xb077a7 in do_cmdline /home/travis/build/neovim/neovim/src/nvim/ex_docmd.c:601:20
= #13 0x10905db in nv_colon /home/travis/build/neovim/neovim/src/nvim/normal.c:4495:18
= #14 0x1077de8 in normal_execute /home/travis/build/neovim/neovim/src/nvim/normal.c:1144:3
= #15 0x16ff943 in state_enter /home/travis/build/neovim/neovim/src/nvim/state.c:58:26
= #16 0x102d8db in normal_enter /home/travis/build/neovim/neovim/src/nvim/normal.c:463:3
= #17 0xdf3398 in main /home/travis/build/neovim/neovim/src/nvim/main.c:540:3
= #18 0x2b973e8b4f44 in __libc_start_main /build/eglibc-oGUzwX/eglibc-2.19/csu/libc-start.c:287
=
= SUMMARY: AddressSanitizer: heap-buffer-overflow /home/travis/build/neovim/neovim/src/nvim/mbyte.c:1637:7 in utf_head_off
stack traceback:
./test/helpers.lua:80: in function 'check_logs'
./test/functional/helpers.lua:639: in function <./test/functional/helpers.lua:638>
[----------] 9 tests from /home/travis/build/neovim/neovim/test/functional/terminal/buffer_spec.lua (2263.12 ms total)
The existing code would cause utf8len_tab to be defined (non-extern) both
when main.c included globals.h and in mbyte.c. This causes the
following linker warning:
Linking C executable ../../bin/nvim
/usr/bin/ld: Warning: size of symbol `utf8len_tab' changed from 256 in CMakeFiles/nvim.dir/main.c.o to 320 in CMakeFiles/nvim.dir/mbyte.c.o
Moving the definition to globals.h and using INIT() ensures the array is
defined only in main.c, while every other file that includes globals.h
sees an extern declaration.
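The EXTERN/INIT() pattern described above can be sketched in a single file as follows. This is an illustration under assumptions, not a copy of globals.h: `demo_len_tab` and `demo_len` are hypothetical, and the file plays the role of the one translation unit that defines EXTERN before pulling in the header.

```c
#include <assert.h>

/* The one .c file that owns the definitions defines EXTERN as empty
 * before including the header. Every other file leaves it undefined. */
#define EXTERN

/* ---- start of header sketch ---- */
#ifndef EXTERN
# define EXTERN extern        /* other files: declaration only */
# define INIT(...)            /* ...and the initializer is dropped */
#else
# define INIT(...) __VA_ARGS__  /* owning file: keep the initializer */
#endif

/* Hypothetical 4-entry stand-in for a real lookup table. */
EXTERN const unsigned char demo_len_tab[4] INIT(= { 1, 2, 3, 4 });
/* ---- end of header sketch ---- */

/* Bounds-checked accessor, only here to exercise the table. */
static unsigned char demo_len(unsigned idx)
{
  assert(idx < sizeof(demo_len_tab));
  return demo_len_tab[idx];
}
```

Because the array size is spelled out in the header, the single definition and all extern declarations agree on it, which is what avoids a "size of symbol changed" warning at link time.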
Eliminate mb_init():
Set "enc_utf" and "has_mbyte" early. Eliminate "enc_unicode" and "enc_latin1like".
init_chartab() and screenalloc() are already invoked elsewhere
in the initialization process.
The EncodingChanged autocmd cannot be triggered.
At initialization, there are no spellfiles to reload.
`utf_ambiguous_width` expects the Unicode character, but in 9e1c6596 I
just passed the first UTF-8 byte to the function. This led to various
display problems because now many multi-cell characters weren't falling
into that part of the branch.
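To illustrate why the first UTF-8 byte is the wrong argument for a width lookup, consider this small sketch. The table is a two-entry excerpt chosen for illustration (U+00A1 and the arrows U+2190..U+2199, which are East Asian Ambiguous); `demo_ambiguous` and `demo_is_ambiguous` are hypothetical names, not the real table or function in mbyte.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct interval {
  uint32_t first;
  uint32_t last;
};

/* Illustrative excerpt, sorted by codepoint. Not the full table. */
static const struct interval demo_ambiguous[] = {
  { 0x00A1, 0x00A1 },
  { 0x2190, 0x2199 },
};

/* Binary search for codepoint c in a sorted table of n intervals. */
static bool in_table(const struct interval *t, size_t n, uint32_t c)
{
  assert(t != NULL);
  size_t lo = 0, hi = n;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (c < t[mid].first) {
      hi = mid;
    } else if (c > t[mid].last) {
      lo = mid + 1;
    } else {
      return true;
    }
  }
  return false;
}

static bool demo_is_ambiguous(uint32_t c)
{
  return in_table(demo_ambiguous,
                  sizeof(demo_ambiguous) / sizeof(demo_ambiguous[0]), c);
}
```

The arrow U+2192 is encoded as the bytes E2 86 92. Looking up the decoded codepoint 0x2192 hits the table, while looking up the lead byte 0xE2 does not, so the ambiguous-width branch is skipped.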
Also, to better align with the existing Vim code, remove the forced
cursor update. Setting the flag will cause it to happen in the next
UI_CALL.
Thanks to qvacua for all the help investigating the issue!
Closes #5448
Problem: Display problems when the 'ambiwidth' and 'emoji' options are not
set properly or the terminal doesn't behave as expected.
Solution: After drawing an ambiguous width character always position the
cursor.
cb0700844c
Problem: Handling emoji characters as full width has problems with
backwards compatibility.
Solution: Remove ambiguous and double width characters from the emoji table.
Use a separate table for the character class.
(partly by Yasuhiro Matsumoto)
b86f10ee10
Problem: Emoji characters are not considered as a kind of word character.
Solution: Give emoji characters a word class number. (Yasuhiro Matsumoto)
4077b33a83
Problem: Although emoji characters are ambiguous width, best is to treat
them as full width.
Solution: Update the Unicode character tables. Add the 'emoji' option.
(Yasuhiro Matsumoto)
3848e00e01
Move `call_shell` to misc1.c
Move some fns to state.c
Move some fns to option.c
Move some fns to memline.c
Move `vim_chdir*` fns to file_search.c
Move some fns to new module, bytes.c
Move some fns to fileio.c
To get a UTF-8 character, utf_ptr2char() is used.
But this function can read more than maxlen bytes if an incomplete
byte sequence is given (the first byte specifies a length > maxlen).
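One way to guard against such an over-read is to compare the length announced by the lead byte with the bytes actually available before decoding. The sketch below shows that check in isolation; `lead_len` and `bounded_len` are hypothetical helpers, not the functions used in the actual fix.

```c
#include <assert.h>
#include <stddef.h>

/* Sequence length announced by the lead byte alone
 * (1 for ASCII and for invalid lead bytes). */
static size_t lead_len(unsigned char b)
{
  if (b < 0x80) return 1;
  if ((b & 0xE0) == 0xC0) return 2;
  if ((b & 0xF0) == 0xE0) return 3;
  if ((b & 0xF8) == 0xF0) return 4;
  return 1;
}

/* Number of bytes that may be decoded as one character at p when only
 * maxlen bytes are readable. Returns 0 for an empty buffer, the full
 * sequence length if it fits and is well formed, and 1 otherwise so the
 * caller can treat the lead byte as a single illegal byte. */
static size_t bounded_len(const unsigned char *p, size_t maxlen)
{
  assert(p != NULL);
  if (maxlen == 0) {
    return 0;
  }
  size_t len = lead_len(p[0]);
  if (len > maxlen) {
    return 1;  /* incomplete sequence: never look past maxlen */
  }
  for (size_t i = 1; i < len; i++) {
    if ((p[i] & 0xC0) != 0x80) {
      return 1;  /* missing continuation byte */
    }
  }
  return len;
}
```

A caller would decode the full character only when the returned length equals the length announced by the lead byte, and otherwise consume a single byte.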