refactor(grid): change schar_T representation to be more compact

Previously, a screen cell would occupy 28+4=32 bytes per cell, as we always made space for up to MAX_MCO+1 codepoints in a cell. As an example, even a pretty modest 50*80 screen would consume 50*80*2*32 = 256000 bytes, i.e. a quarter megabyte, with the factor of two due to the TUI side buffer, and even more when using msg_grid and/or ext_multigrid.

This instead stores a 4-byte union of either:
- a valid UTF-8 sequence up to 4 bytes
- an escape char which is invalid UTF-8 (0xFF) plus a 24-bit index to a glyph cache

This avoids allocating space for huge composed glyphs _upfront_, while still keeping rendering such glyphs reasonably fast (1 hash table lookup + one plain index lookup). If the same large glyphs are used repeatedly on the screen, this is still a net reduction of memory/cache consumption. The only case which really gets worse is if you blast the screen full with crazy emojis and zalgo text, and even this case only leads to 4 extra bytes per char.

When only <= 4-byte glyphs are used, a cell takes 4 bytes for the glyph plus the 4-byte attribute code, i.e. 8 bytes in total instead of 32, which is a factor of four reduction of memory use. This is memory which will be quite hot in cache, as the screen buffer is scanned over in win_line() buffer text drawing.

A slight complication is that the representation depends on host byte order. I've tested this manually by compiling and running this in qemu-s390x and it works fine. We might add a qemu based solution to CI at some point.
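For illustration, the 4-byte cell described above could be sketched roughly as follows. This is not the code from this commit: the `demo_*` names, the fixed-size table standing in for the glyph cache (with a linear scan where the real code uses a hash table), and the byte layout of the 24-bit index are all assumptions made for the example. Only the type name `schar_T` and the 0xFF escape byte come from the message.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t schar_T;  // 4 bytes per cell; name taken from the commit

// Hypothetical stand-in for the glyph cache: a small table of strings.
// A linear scan replaces the hash table lookup mentioned in the message.
#define DEMO_CACHE_MAX 16
static const char *demo_cache[DEMO_CACHE_MAX];
static uint32_t demo_cache_len = 0;

// Pack a NUL-terminated UTF-8 glyph into a cell.
// Glyphs of up to 4 bytes are stored inline, longer ones go through the cache.
// The cache keeps the pointer `s`, so the caller must keep the string alive.
static schar_T demo_pack(const char *s)
{
  size_t len = strlen(s);
  unsigned char b[4] = { 0, 0, 0, 0 };
  if (len <= 4) {
    memcpy(b, s, len);  // inline: the UTF-8 bytes themselves, zero padded
  } else {
    uint32_t idx = 0;
    while (idx < demo_cache_len && strcmp(demo_cache[idx], s) != 0) {
      idx++;  // reuse an existing entry if the same glyph was seen before
    }
    if (idx == demo_cache_len) {
      if (demo_cache_len == DEMO_CACHE_MAX) {
        return demo_pack("\xEF\xBF\xBD");  // cache full: fall back to U+FFFD
      }
      demo_cache[demo_cache_len++] = s;
    }
    b[0] = 0xFF;  // escape byte, never valid as UTF-8
    b[1] = (unsigned char)(idx & 0xFF);  // assumed layout of the 24-bit index
    b[2] = (unsigned char)((idx >> 8) & 0xFF);
    b[3] = (unsigned char)((idx >> 16) & 0xFF);
  }
  schar_T sc;
  memcpy(&sc, b, sizeof sc);  // numeric value of sc depends on host byte order
  return sc;
}

// True if the cell refers to the cache rather than holding the bytes inline.
static int demo_is_cached(schar_T sc)
{
  unsigned char b[4];
  memcpy(b, &sc, sizeof b);
  return b[0] == 0xFF;
}

// Write the glyph of a cell into `out` (NUL-terminated), return its length.
static size_t demo_unpack(schar_T sc, char *out, size_t outlen)
{
  unsigned char b[4];
  memcpy(b, &sc, sizeof b);
  if (b[0] == 0xFF) {
    uint32_t idx = (uint32_t)b[1] | ((uint32_t)b[2] << 8) | ((uint32_t)b[3] << 16);
    snprintf(out, outlen, "%s", demo_cache[idx]);  // one plain index lookup
    return strlen(out);
  }
  size_t len = 0;
  while (len < 4 && b[len] != 0 && len + 1 < outlen) {
    out[len] = (char)b[len];
    len++;
  }
  out[len] = 0;
  return len;
}
```

In this sketch, `demo_pack("a")` stays inline, while a base character followed by several combining marks takes the cached path, and packing the same long glyph twice yields the same cell value. Copying through a byte array keeps the in-memory byte sequence identical on every host; only the integer value of the cell differs between little- and big-endian machines.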
2025-09-29 14:38:32 +00:00 · 2023-09-13 13:39:18 +02:00
parent 46402c16c0
commit 8da986ea87
25 changed files with 439 additions and 171 deletions
--- a/src/nvim/grid.h
+++ b/src/nvim/grid.h
@@ -33,30 +33,25 @@ EXTERN colnr_T *linebuf_vcol INIT(= NULL);
 // screen grid.

 /// Put a ASCII character in a screen cell.
-static inline void schar_from_ascii(char *p, const char c)
-{
-  p[0] = c;
-  p[1] = 0;
-}
+///
+/// If `x` is a compile time constant, schar_from_ascii(x) will also be.
+/// But the specific value varies per plattform.
+#ifdef ORDER_BIG_ENDIAN
+# define schar_from_ascii(x) ((schar_T)((x) << 24))
+#else
+# define schar_from_ascii(x) ((schar_T)(x))
+#endif

 /// Put a unicode character in a screen cell.
-static inline int schar_from_char(char *p, int c)
+static inline schar_T schar_from_char(int c)
 {
-  int len = utf_char2bytes(c, p);
-  p[len] = NUL;
-  return len;
-}
-
-/// compare the contents of two screen cells.
-static inline int schar_cmp(char *sc1, char *sc2)
-{
-  return strncmp(sc1, sc2, sizeof(schar_T));
-}
-
-/// copy the contents of screen cell `sc2` into cell `sc1`
-static inline void schar_copy(char *sc1, char *sc2)
-{
-  xstrlcpy(sc1, sc2, sizeof(schar_T));
+  schar_T sc = 0;
+  if (c >= 0x200000) {
+    // TODO(bfredl): this must NEVER happen, even if the file contained overlong sequences
+    c = 0xFFFD;
+  }
+  utf_char2bytes(c, (char *)&sc);
+  return sc;
 }

 #ifdef INCLUDE_GENERATED_DECLARATIONS