Problem: Some edge cases in the old (pre-#26406) and current "b_signcols"
structure result in an incorrectly sized "auto" 'signcolumn'.
Solution: * Implement a simpler 'signcolumn' validation strategy by immediately
counting the number of signs in a range upon sign insertion and
deletion. This costs some performance, but there is a clear path
to winning it back: moving signs to a dedicated marktree, or adding
metadata to the existing marktree so it can be queried more
efficiently.
* Also replace "max_count" and keep track of the number of lines with
a certain number of signs. This makes it so that it is no longer
necessary to scan the entire buffer when the maximum number of signs
decreases. This likely makes the commit a net increase in performance.
* To ensure correctness, we also have to re-initialize the count for an
edited region that spans multiple lines, since such an edit may move
the signs within it. Thus we count and decrement before splicing the
marktree, and count and increment after.
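
A minimal sketch of the bucket-counting idea, with invented names (the
real fields live in "b_signcols"):

    #include <stddef.h>

    #define SIGN_SHOW_MAX 9

    typedef struct {
      size_t count[SIGN_SHOW_MAX + 1];  // count[n]: lines with exactly n signs
      int max;                          // widest line, i.e. 'signcolumn' width
    } SignCols;

    // Move one line from bucket "old_n" to bucket "new_n" when signs are
    // inserted or deleted. The maximum only needs a downward walk over
    // the buckets when its own bucket becomes empty -- never a buffer scan.
    static void signcols_update(SignCols *sc, int old_n, int new_n)
    {
      if (old_n > 0) {
        sc->count[old_n]--;
      }
      if (new_n > 0) {
        sc->count[new_n]++;
      }
      if (new_n > sc->max) {
        sc->max = new_n;
      } else {
        while (sc->max > 0 && sc->count[sc->max] == 0) {
          sc->max--;
        }
      }
    }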
Problem: The entire marktree needs to be traversed each time a sign is
removed from the sentinel line.
Solution: Remove sentinel line and instead keep track of the number of
lines that hold up the 'signcolumn' in "max_count". Adjust this
number for added/removed signs, and set it to 0 when the
maximum number of signs on a line changes. Only when
"max_count" is decremented to 0 due to sign removal do we need
to check the entire buffer.
Also replace "invalid_top" and "invalid_bot" with a map of
invalid ranges, further reducing the number of lines to be
checked.
Also improve tree traversal when counting the number of signs.
Instead of looping over the range to be checked and counting
the overlap for each row separately, keep track of the overlap
in an array and add it to the count (see the sketch below).
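
In essence this is a difference array: one pass over the traversed
marks, then a prefix sum, instead of one overlap query per row. A
sketch with invented names:

    #include <stddef.h>

    typedef struct { int start_row; int end_row; } SignRange;

    // counts[] must hold bot - top entries; signs[] stands in for the
    // result of a single marktree traversal of the range [top, bot).
    static void count_signs(const SignRange *signs, size_t nsigns,
                            int top, int bot, int *counts)
    {
      int len = bot - top;
      for (int i = 0; i < len; i++) {
        counts[i] = 0;
      }
      for (size_t s = 0; s < nsigns; s++) {
        // clamp the sign's extent to the checked range
        int from = signs[s].start_row > top ? signs[s].start_row - top : 0;
        int to = signs[s].end_row >= bot ? len : signs[s].end_row - top + 1;
        counts[from]++;    // overlap starts here...
        if (to < len) {
          counts[to]--;    // ...and ends before this row
        }
      }
      for (int i = 1; i < len; i++) {
        counts[i] += counts[i - 1];  // prefix sum: signs per row
      }
    }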
We already have an extensive suite of static analysis tools, which
causes a fair bit of redundancy as we get duplicate warnings. PVS is
also prone to false positives, which creates a lot of work to identify
and disable them.
Previously, a screen cell would occupy 28+4=32 bytes per cell,
as we always made space for up to MAX_MCO+1 codepoints in a cell.
As an example, even a pretty modest 50*80 screen would consume
50*80*2*32 = 256000 bytes, i.e. a quarter megabyte, with the factor
of two due to the TUI side buffer, and even more when using msg_grid
and/or ext_multigrid.
This instead stores a 4-byte union of either:
- a valid UTF-8 sequence up to 4 bytes
- an escape char which is invalid UTF-8 (0xFF) plus a 24-bit index to a
glyph cache
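
Roughly, and with illustrative names (the word view of the union is
also why host byte order matters, see below):

    #include <stdbool.h>
    #include <stdint.h>

    typedef union {
      char data[4];    // inline UTF-8 sequence, NUL-padded when shorter
      uint32_t value;  // same 4 bytes as one word, for cheap compare/copy
    } SChar;

    static inline bool schar_in_cache(SChar sc)
    {
      // 0xFF can never start a valid UTF-8 sequence, so it is free to
      // serve as the escape marker for the glyph-cache case.
      return (uint8_t)sc.data[0] == 0xFF;
    }

    static inline uint32_t schar_cache_index(SChar sc)
    {
      // the remaining three bytes hold a 24-bit index into the cache
      return ((uint32_t)(uint8_t)sc.data[1] << 16)
             | ((uint32_t)(uint8_t)sc.data[2] << 8)
             | (uint8_t)sc.data[3];
    }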
This avoids allocating space for huge composed glyphs _upfront_, while
still keeping rendering of such glyphs reasonably fast (1 hash table
lookup + one plain index lookup). If the same large glyphs are used
repeatedly on the screen, this is still a net reduction of memory/cache
consumption. The only case which really gets worse is if you blast
the screen full with crazy emojis and zalgo text, and even this case
only leads to 4 extra bytes per char.
When only <= 4-byte glyphs are used, plus the 4-byte attribute code,
i.e. 8 bytes in total, there is a factor-of-four reduction in memory
use. This memory will be quite hot in cache, as the screen buffer is
scanned over in win_line() buffer text drawing.
A slight complication is that the representation depends on host byte
order. I've tested this manually by compiling and running this
in qemu-s390x and it works fine. We might add a qemu-based solution
to CI at some point.
Memfile used a private implementation of an open hash table with
intrusive collision chains, but there is no reason to assume the
standard khash_t-based Map won't work just fine.
Yes, we are taking full ownership and maintenance over memline and memfile.
No one is going to maintain it for us.
Trust the plan.
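
Schematically, with hypothetical helper names (the generic map API
lives in map.h; exact macro names have shifted between versions):

    // blocks keyed by block number, as before, but in a standard
    // pointer map instead of intrusive collision chains
    static PMap(int64_t) mf_hash = MAP_INIT;

    static void mf_hash_add(int64_t bnum, void *block)
    {
      pmap_put(int64_t)(&mf_hash, bnum, block);
    }

    static void *mf_hash_find(int64_t bnum)
    {
      return pmap_get(int64_t)(&mf_hash, bnum);
    }

    static void mf_hash_rem(int64_t bnum)
    {
      pmap_del(int64_t)(&mf_hash, bnum);
    }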
This involves two redesigns of the map.c implementations:
1. Change of macro style and code organization
The old khash.h and map.c implementation used huge #define blocks with a
lot of backslash line continuations.
This instead uses the "implementation file" .c.h pattern. Such a file is
meant to be included multiple times, with different macros set prior to
inclusion as parameters. We already use this pattern, e.g. for
eval/typval_encode.c.h, to implement different typval encoders reusing a
similar structure.
We can structure this code into two parts: one that only depends on the
key type and is enough to implement sets, and one which depends on both
key and value to implement maps (as a wrapper around sets, with an added
value[] array). A toy version of the pattern follows below.
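
File and macro names here are invented for illustration:

    // set_impl.c.h: include once per key type, e.g.
    //
    //   #define KEY_TYPE const char *
    //   #define KEY_NAME cstr
    //   #include "set_impl.c.h"   // emits Set_cstr + set_cstr_has()

    #include <stdbool.h>

    #define GLUE_(a, b) a##b
    #define GLUE(a, b) GLUE_(a, b)
    #define SET_TYPE GLUE(Set_, KEY_NAME)

    typedef struct {
      KEY_TYPE *keys;   // dense key array
      // hash buckets elided for brevity
    } SET_TYPE;

    bool GLUE(GLUE(set_, KEY_NAME), _has)(SET_TYPE *set, KEY_TYPE key);

    #undef SET_TYPE
    #undef GLUE
    #undef GLUE_
    #undef KEY_TYPE
    #undef KEY_NAME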
2. Separate the main hash buckets from the key / value arrays
Change the hash buckets to only contain an index into separate key /
value arrays.
This is a common pattern in modern, state-of-the-art hashmap
implementations. Even though this leads to one more allocated array,
it is often a net reduction of memory consumption. Consider key+value
consuming at least 12 bytes per pair, while on average we will have
twice as many buckets as items.
Thus, old implementation:
    2*12 = 24 bytes per item
New implementation:
    1*12 + 2*4 = 20 bytes per item
And the difference gets bigger with larger items.
One might think we have pulled a fast one here: wouldn't the average
size of the new key/value arrays be 1.5 slots per item due to amortized
growth? But remember, these arrays are fully dense, and thus the
accessed memory, measured in _cache lines_ (the unit which actually
matters), will be the fully used memory just rounded up to the nearest
cache-line boundary; the spare capacity is never touched.
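
Concretely, the new layout looks something like this (names invented):

    #include <stdint.h>

    typedef struct {
      uint32_t *buckets;   // hash part: each slot is an index, or EMPTY
      uint32_t n_buckets;  // ~2x n_items on average
      uint32_t n_items;
    } HashTab;

    typedef struct {
      HashTab h;
      const char **keys;   // dense, insertion-ordered: n_items entries used
      int *values;         // parallel to keys[]
    } Map_cstr_int;

    // per item: 12 bytes of key+value stored once, plus on average two
    // 4-byte bucket slots = 20 bytes, vs 24 when buckets held the pairs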
This has some other interesting properties, such as an insert-only
set/map being fully ordered by insertion. Preserving this ordering
in the face of deletions is more tricky, though. As we currently don't
use ordered maps, the "delete" operation maintains compactness of the
item arrays in the simplest way, by breaking the ordering. It would be
possible to implement an order-preserving delete, although at some cost,
like allowing the items array to become non-dense until the next rehash.
Finally, in the face of these two major changes, all code used in
khash.h has been integrated into map.c and friends. Given the heavy
edits it makes no sense to "layer" the code into a vendored and a
wrapper part.
Rather, the layered cake follows the specialization depth: code shared
for all maps, code specialized to a key type (and its equivalence
relation), and finally code specialized to value+key type.
This reduces the total number of khash_t instantiations from 22 to 8.
Make the khash internal functions take the size of values as a runtime
parameter. This is abstracted with typesafe Map containers which are
still specialized for both key and value type.
Introduce `Set(key)` type for when there is no value.
Refactor shada.c to use Map/Set instead of khash directly.
This requires the `map_ref` operation to be more flexible:
return pointers to both key and value, plus an indicator for new_item
(see the sketch below). As a bonus, `map_key` is now redundant.
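
Schematically (the exact signature is an approximation), combined with
the pointer-map change noted just below:

    FileMarks *get_filemarks(PMap(cstr_t) *map, const char *filename)
    {
      bool new_item = false;
      const char **key;  // pointer to the stored key slot
      void **val = pmap_ref(cstr_t)(map, filename, &key, &new_item);
      if (new_item) {
        *key = xstrdup(filename);              // map owns a copy of the key
        *val = xcalloc(1, sizeof(FileMarks));  // pointer map: heap struct
      }
      return *val;
    }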
Instead of Map(cstr_t, FileMarks), use a pointer map, as the FileMarks
struct is humongous.
Make `event_strings` actually work like an intern pool instead of wtf it
was doing before.
Allow Include What You Use to remove unnecessary includes and only
include what is necessary. This helps with reducing compilation times
and makes it easier to visualise which dependencies are actually
required.
Work on https://github.com/neovim/neovim/issues/549, but doesn't close
it since this only works fully for .c files and not headers.
* feat(api): `group` can be either string or int
This affects the following API functions:
- `vim.api.nvim_create_autocmd`
- `vim.api.nvim_get_autocmds`
- `vim.api.nvim_do_autocmd`
closes #17552
* refactor: add two maps for fast lookups
* fix: delete augroup info from id->name map
When in "stupid_legacy_mode", the value in name->id map would be updated
to `AUGROUP_DELETED`, but the entry would still remain in id->name. This
would create a problem in `augroup_name` function which would return the
name of the augroup instead of `--DELETED--`.
The id->name map is only used for fast lookup in the `augroup_name`
function, so there's no point in keeping the entry of a deleted augroup
in it. A sketch of the fix follows below.
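
Roughly, with illustrative map names:

    void augroup_del(char *name)
    {
      int id = map_get(String, int)(&augroup_map, cstr_as_string(name));
      // legacy mode keeps the name->id entry, just tombstoned:
      map_put(String, int)(&augroup_map, cstr_as_string(name),
                           AUGROUP_DELETED);
      // previously missing: drop the stale id->name entry, which exists
      // only as a fast-lookup cache for augroup_name()
      map_del(int, String)(&id_to_name, id);
    }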
Co-authored-by: TJ DeVries <devries.timothyj@gmail.com>
marktree.c was originally constructed as a "generic" datatype,
to make the prototyping of its internal logic as simple as possible,
and also because the usecases for various kinds of extmarks/decorations
were not yet decided. As a consequence of this, various extra
indirections and allocations were needed to use marktree to implement
extmarks (ns/id pairs) and decorations of different kinds (some of
which are just a single highlight id, others an allocated list of
virtual text/lines).
This change removes a lot of indirection by making Marktree specialized
for the usecase. In particular, the namespace id and mark id are stored
directly, instead of the 64-bit global id particular to the Marktree
struct. This removes the two maps needed to convert between global and
per-ns ids.
Also, "small" decorations are stored inline, i.e. those who
doesn't refer to external heap memory anyway. That is highlights (with
priority+flags) are stored inline, while virtual text, which anyway
occurs a lot of heap allocations, do not. (previously a hack was used
to elide heap allocations for highlights with standard prio+flags)
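
Roughly, a mark now looks like this (field names invented):

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
      uint32_t ns;            // namespace id, stored directly
      uint32_t id;            // mark id, unique within the namespace
      bool decor_is_ptr;      // discriminates the union below
      union {
        struct {              // inline: a plain highlight
          int hl_id;
          uint16_t priority;
          uint16_t flags;
        } hl;
        struct VirtText *vt;  // external: virtual text/lines
      } decor;
    } MTKey;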
TODO(bfredl): the functionaltest-lua CI version of gcc is having
severe issues with uint16_t bitfields, so splitting up compound
assignments and adding redundant casts is needed. Clean this up once
we switch to a working compiler version.
* refactor: format with uncrustify
* fixup(dundar): fix function comments
* fixup(dundar): remove space between variable and ++/--
* fixup(dundar): better workaround for macro attributes
This is done to be able to better use uncrustify rules for macros
* fixup(justin): make preprocessors follow neovim style guide
Note: the reason for removing them is not that there is no use for
them after this refactor, but rather that having them available is
an anti-pattern: they manage an _extra_ heap allocation which has
nothing to do with the functionality of the map itself (khash
manages the real buffers internally). In case there happens to
be a reason to allocate the map structure itself later, this
should be made explicit using xcalloc/xfree calls, as sketched below.
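
That is, embed the map by value in the normal case, and spell out the
allocation when a heap-allocated map is really wanted (illustrative
calls):

    // normal case: the Map struct is just a small header, khash owns
    // the real buffers, so embed it by value -- no extra allocation
    Map(String, int) map = MAP_INIT;

    // if a heap-allocated Map is genuinely needed, make it explicit:
    Map(String, int) *pm = xcalloc(1, sizeof(*pm));
    // ... use *pm ...
    map_destroy(String, int)(pm);  // frees khash's internal buffers
    xfree(pm);                     // frees the header we allocated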
syn_name2id and syn_check_group go brr.
Note: this has impact mostly when using multiple filetypes,
as the old syn_name2id was optimized to quickly return recently
added groups (which will be those of the latest filetype).
Add new "splice" interface for tracking buffer changes at the byte
level. This will later be reused for byte-resolution buffer updates.
(Implementation has been started, but it uses the undocumented
"_on_bytes" option for now, as the interface hasn't been finalized.)
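
The approximate shape of the interface (parameter names illustrative;
the real prototype lives in the extmark code):

    // Report one atomic buffer change: at (start_row, start_col), a
    // region of old_row/old_col/old_byte extent was replaced by a
    // region of new_row/new_col/new_byte extent. The byte counts are
    // what make byte-resolution buffer updates possible later.
    void extmark_splice(buf_T *buf,
                        int start_row, colnr_T start_col,
                        int old_row, colnr_T old_col, bcount_t old_byte,
                        int new_row, colnr_T new_col, bcount_t new_byte);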
Use this interface to improve many edge cases of extmark adjustment.
Changed tests indicate previously incorrect behavior. Adding tests for
more edge cases will be follow-up work (overlaps with the on_bytes tests).
Don't consider creation/deletion of marks an undoable event by itself.
This behavior was never documented, and imposes complexity for little gain.
Add nvim__buf_add_decoration temporary API for direct access to the new
implementation. This should be refactored into a proper API for
decorations, probably involving a huge dict.
fixes #11598
- Minimum required libuv is now v1.12
- Because `uv_os_getenv` requires allocating, we must manage a map
(`envmap` in `env.c`) to maintain the old behavior of `os_getenv`.
- free() map-items after removal. khash.h does not make copies of
anything, so even its keys must be memory-managed by the caller
(see the sketch below).
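
The pattern, roughly, with invented helper names:

    #include <uv.h>

    static PMap(cstr_t) envmap = MAP_INIT;

    const char *os_getenv_cached(const char *name)
    {
      char *e = pmap_get(cstr_t)(&envmap, name);
      if (e != NULL) {
        return e;
      }
      char buf[8192];
      size_t size = sizeof(buf);
      if (uv_os_getenv(name, buf, &size) != 0) {
        return NULL;
      }
      e = xstrdup(buf);
      // khash stores these pointers verbatim: both key and value must
      // be duplicated going in, and freed by us on entry removal
      pmap_put(cstr_t)(&envmap, xstrdup(name), e);
      return e;
    }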
closes #8398 closes #9267
Namespaces are a lightweight concept that should be used to group
objects for purposes of bulk operations and introspection. They are
initially used for highlights and virtual text in buffers, and are
planned to also be used for extended marks. There is no plan to use
them for privileges or isolation, nor to introduce namespace-level
options.
Change implementation to compare sequences of bytes instead of C
strings.
The former implementation would return "false positives" in these cases:
"\0a" and "\0b": C strings are NUL-terminated, so it would compare two
empty strings.
"a" and "ab": If both strings aren't the same length, it would
compare only up to the length of the shorter one.
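
The fixed comparison amounts to this sketch:

    #include <stdbool.h>
    #include <string.h>

    static bool bytes_equal(const char *a, size_t a_len,
                            const char *b, size_t b_len)
    {
      // lengths are explicit, so "a" vs "ab" differ by length, and
      // "\0a" vs "\0b" are compared past the NUL byte
      return a_len == b_len && memcmp(a, b, a_len) == 0;
    }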
Fixes #5046.
API functions exposed via msgpack-rpc now fall into two categories:
- async functions, which are executed as soon as the request is parsed
- sync functions, which are invoked in the nvim main loop when processing
the `K_EVENT` special key
Only a few functions which can be safely executed in any context are marked as
async.