Mirror of https://github.com/neovim/neovim.git (synced 2025-10-09 11:26:37 +00:00)
feat(mbyte): support extended grapheme clusters including more emoji
Use the grapheme break algorithm from utf8proc to support grapheme clusters from recent Unicode versions. Handle the variation selector VS16 (U+FE0F), which turns some codepoints into double-width emoji. Because the width of such a character depends on the codepoint that follows it, we need to use ptr2cells rather than char2cells when possible.
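The reason a string-based width lookup is needed can be sketched as follows. This is a minimal illustration in Python, not Nvim's C implementation: `char_cells`, `cluster_cells`, and the `TEXT_EMOJI` sample set are hypothetical names, and the width rules are heavily simplified assumptions. It only shows that a per-codepoint lookup cannot see a trailing VS16, while a lookup that can peek at the next codepoint can.

```python
VS16 = 0xFE0F  # variation selector-16: requests emoji presentation
ZWJ = 0x200D   # zero width joiner

# Hypothetical, tiny sample of "text emoji": codepoints that are drawn
# single width unless followed by VS16. Real tables are much larger.
TEXT_EMOJI = {0x2600, 0x263A, 0x2764}


def char_cells(cp: int) -> int:
    """Width decided from one codepoint alone (simplified assumption)."""
    if cp in (VS16, ZWJ):
        return 0
    # Crude stand-in for "is a full-width emoji"; not an exact range.
    if 0x1F300 <= cp <= 0x1FAFF:
        return 2
    return 1


def cluster_cells(s: str, i: int = 0) -> int:
    """Width of the character starting at s[i], peeking at what follows."""
    cp = ord(s[i])
    followed_by_vs16 = i + 1 < len(s) and ord(s[i + 1]) == VS16
    if cp in TEXT_EMOJI and followed_by_vs16:
        return 2  # text emoji promoted to full-width emoji by VS16
    return char_cells(cp)
```

With these definitions, `char_cells(0x2764)` reports one cell for U+2764 in every context, whereas `cluster_cells("\u2764\ufe0f")` reports two because it can inspect the selector that follows.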
@@ -646,6 +646,12 @@ widespread as file format.
 A composing or combining character is used to change the meaning of the
 character before it. The combining characters are drawn on top of the
 preceding character.
+
+Nvim largely follows the definition of extended grapheme clusters in UAX#29
+in the Unicode standard, with some modifications: An ascii char will always
+start a new cluster. In addition 'arabicshape' enables the combining of some
+arabic letters, when they are shaped to be displayed together in a single cell.
+
 Too big combined characters cannot be displayed, but they can still be
 inspected using the |g8| and |ga| commands described below.
 When editing text a composing character is mostly considered part of the
@@ -200,6 +200,12 @@ These existing features changed their behavior.
 top lines are calculated using screen line numbers which take virtual lines
 into account.

+• The implementation of grapheme clusters (or combining chars |mbyte-combining|)
+was upgraded to closely follow extended grapheme clusters as defined by UAX#29
+in the unicode standard. Noteworthily, this enables proper display of many
+more emoji characters than before, including those encoded with multiple
+emoji codepoints combined with ZWJ (zero width joiner) codepoints.
+
 ==============================================================================
 REMOVED FEATURES *news-removed*

@@ -2217,9 +2217,12 @@ A jump table for the options with a short description can be found at |Q_op|.
 global
 When on all Unicode emoji characters are considered to be full width.
 This excludes "text emoji" characters, which are normally displayed as
-single width. Unfortunately there is no good specification for this
-and it has been determined on trial-and-error basis. Use the
-|setcellwidths()| function to change the behavior.
+single width. However, such "text emoji" are treated as full-width
+emoji if they are followed by the U+FE0F variant selector.
+
+Unfortunately there is no good specification for this and it has been
+determined on trial-and-error basis. Use the |setcellwidths()|
+function to change the behavior.

 *'encoding'* *'enc'*
 'encoding' 'enc' string (default "utf-8")
runtime/lua/vim/_meta/options.lua (generated, 9 changes)
@@ -1829,9 +1829,12 @@ vim.go.ead = vim.go.eadirection

 --- When on all Unicode emoji characters are considered to be full width.
 --- This excludes "text emoji" characters, which are normally displayed as
---- single width. Unfortunately there is no good specification for this
---- and it has been determined on trial-and-error basis. Use the
---- `setcellwidths()` function to change the behavior.
+--- single width. However, such "text emoji" are treated as full-width
+--- emoji if they are followed by the U+FE0F variant selector.
+---
+--- Unfortunately there is no good specification for this and it has been
+--- determined on trial-and-error basis. Use the `setcellwidths()`
+--- function to change the behavior.
 ---
 --- @type boolean
 vim.o.emoji = true