Problem: fuzzy.c can be further improved
Solution: Fix memory leak and refactor it (glepnir).
Optimize performance and memory allocation:
- Fix memory leak in fuzzy_match_in_list.
- using single memory allocation in match_positions
- Improve has_match performance and add null pointer checks
closes: vim/vim#1801259799f3afa
Co-authored-by: glepnir <glephunter@gmail.com>
Problem: on_detach may be called after buf_freeall and other important things,
plus its textlock restrictions are insufficient. This can cause issues such as
leaks, internal errors and crashes.
Solution: disable buffer updates in buf_freeall, before autocommands (like the
order after #35355 and when do_ecmd reloads a buffer). Don't do so in
free_buffer_stuff; it's not safe to run user code there, and buf_freeall already
runs before then; just free them to avoid leaks if buf_freeall autocommands
registered more for some reason.
Fixes#28084Fixes#33967Fixes#35116
Problem:
Buffer-updates on_detach callback is invoked before buf_freeall(), which
deletes autocmds of the buffer (via apply_autocmds(EVENT_BUFWIPEOUT,
...)). Due to this, buffer-local autocmds executed in on_detach (e.g.,
LspDetach) are not actually invoked.
Solution:
Call buf_updates_unload() before buf_freeall().
Problem: Unicode has deprecated some code-points
Solution: Update the digraph tables to align with the Unicode v16
release (David Friant)
This commit updates the digraphs Left-Pointing Angle Bracket '</'
and Right-Pointing Angle Bracket '/>' to account for the fact that
the old Unicode codepoints for them (2329 and 232A, respectively)
have been deprecated. As per the Miscellaneous Technical code chart
(https://www.unicode.org/charts/PDF/U2300.pdf), the old digraphs
have been reassigned to the CJK Left Angle Bracket and Right Angle
Bracket (3008 and 3009) with their declaration moved to the
appropriate block.
This commit also introduces the new digraphs '<[' and ']>' to
represent the Mathematical Left Angle Bracket and Mathematical
Right Angle Bracket (27E8 and 27E9) to replace the deprecated code
points in the Technical block.
Tests have been added and, I believe, the documentation has been
updated accordingly.
closes: vim/vim#17990c08b94b072
Co-authored-by: David Friant <friant@HPEnvyx360.friant.dev>
Problem:
Detection of the pynvim module is currently done by finding the first
Python interpreter in the `PATH` and checking if it can import pynvim.
This has several problems:
- Activation of an unrelated Python virtual environment will break
automatic detection, unless pynvim is also installed in that
environment.
- Installing pynvim to the expected location is difficult. User
installation into the system-wide or user-wide Python site area is now
deprecated. On Ubuntu 24.04 with Python 3.12, for example, the
command `pip install --user pynvim` now fails with the error message
`error: externally-managed-environment`.
- Users may create a dedicated virtual environment in which to install
pynvim, but Nvim won't detect it; instead, they must either activate
it before launching Nvim (which interferes with the user of other
virtual environments) or else hard-code the variable
`g:python3_host_prog` in their `init.vim` to the path of the correct
Python interpreter. Neither option is desirable.
Solution:
Expose pynvim's Python interpreter on the `PATH` under the
name `pynvim-python`. Typical user-flow:
1. User installs either uv or pipx.
2. User installs pynvim via:
```
uv tool install --upgrade pynvim
# Or:
pipx install --upgrade pynvim
```
With corresponding changes in pynvim https://github.com/neovim/pynvim/issues/593
the above user-flow is all that's needed for Nvim to detect the
installed location of pynvim, even if an unrelated Python virtual
environments is activated. It uses standard Python tooling to automate
the necessary creation of a Python virtual environment for pyenv and the
publication of `pynvim-python` to a directory on `PATH`.
Problem:
Additional include directories in DEPS_INCLUDE_FLAGS variable are not
quoted. Paths with spaces break the resulting compile command.
Solution:
Enclose values in double quotes.
Note: normally we should avoid manual quoting, but in this case we can't
because of how `DEPS_INCLUDE_FLAGS` is used in `BuildLuv.cmake`
and `BuildLpeg.cmake`.
Problem: Buffer menu does not handle unicode names correctly
(after v9.1.1622)
Solution: Fix the BMHash() function (Yee Cheng Chin)
The Buffers menu uses a BMHash() function to generate a sortable number
to be used for the menu index. It used a naive (and incorrect) way of
encoding multiple ASCII values into a single integer, but assumes each
character to be only in the ASCII 32-96 range. This means if we use
non-ASCII file names (e.g. Unicode values like CJK or emojis) we get
integer underflow and overflow, causing the menu index to wrap around.
Vim's GUI implementations internally use a signed 32-bit integer for the
`gui_mch_add_menu_item()` function and so we need to make sure the menu
index is in the (0, 2^31-1) range.
To do this, if the file name starts with a non-ASCII value, we just use
the first character's value and set the high bit so it sorts after the
other ASCII ones. Otherwise, we just take the first 5 characters, and
use 5 bit for each character to encode a 30-bit number that can be
sorted.
This means Unicode file names won't be sorted beyond the first
character. This is likely going to be fine as there are lots of ways to
query buffers.
related: vim/vim#17403closes: vim/vim#179288f9de4991e
Co-authored-by: Yee Cheng Chin <ychin.git@gmail.com>
Problem: tests: fuzzy buffer name completion test doesn't match
successfully (after 9.1.1627).
Solution: Update pattern to account for the change in case sensitivity.
Also mark Test_search_stat_option() as flaky as it can still
sometimes fail (zeertzjq).
closes: vim/vim#17992891353671a
Some terminals support the CSI 8 t sequence to resize their own window
area. Neovim will emit this sequence when the TUI is resized
programatically, but it should not be emitted when the TUI is resized
because the host terminal was resized, as that results in an infinite
resize loop.
Don't print ABI version of duplicated parsers that are later in the
runtime path (see [#35326]).
Change the sorting from `name > path` to `name > rtpath_index`, this
ensures the first (loaded) parser is first in the list and any
subsequent parsers can be considered "not loaded".
This is fuzzy at best since `vim.treesitter.language.add` can take a
path to a parser and change the load order.
The correct solution is for `vim.treesitter.language.inspect` to return
the parser path so we can compare against it and/or for it to also be
able to take a path to a parser so we can inspect it without loading it
first.
These are not needed after #35129 but making uncrustify still play nice
with them was a bit tricky.
Unfortunately `uncrustify --update-config-with-doc` breaks strings
with backslashes. This issue has been reported upstream,
and in the meanwhile auto-update on every single run has been disabled.
These are special names by convention, and giving them distinct
highlighting is a nice visual clue (using Identifier by default).
This group is named "pythonClassVar" to match the name used by
python-syntax. Some third-party color schemes are aware of this
name and customized their colors accordingly.
closes: vim/vim#179681ee1d9b43d
Co-authored-by: Jon Parise <jon@indelible.org>
Problem: fuzzy.c has a few issues
Solution: Use Vims memory management, update style
(glepnir)
Problem:
- Missing cleanup of lmatchpos lists causing memory leaks
- Missing error handling for list operations
- Use of malloc() instead of Vim's alloc() functions
- Inconsistent C-style comments
- Missing null pointer checks for memory allocation
- Incorrect use of vim_free() for list objects
Solution:
- Add proper cleanup of lmatchpos in done section using list_free()
- Set lmatchpos to NULL after successful transfer to avoid confusion
- Add error handling for list_append_tv() failures
- Replace malloc() with alloc() and add null pointer checks
- Convert C-style comments to C++ style for consistency
- Fix vim_free() calls to use list_free() for list objects
closes: vim/vim#1798417a6d696bd
Co-authored-by: glepnir <glephunter@gmail.com>
Problem: fuzzy-matching can be improved
Solution: Implement a better fuzzy matching algorithm
(Girish Palya)
Replace fuzzy matching algorithm with improved fzy-based implementation
The
[current](https://www.forrestthewoods.com/blog/reverse_engineering_sublime_texts_fuzzy_match/)
fuzzy matching algorithm has several accuracy issues:
* It struggles with CamelCase
* It fails to prioritize matches at the beginning of strings, often
ranking middle matches higher.
After evaluating alternatives (see my comments
[here](https://github.com/vim/vim/issues/17531#issuecomment-3112046897)
and
[here](https://github.com/vim/vim/issues/17531#issuecomment-3121593900)),
I chose to adopt the [fzy](https://github.com/jhawthorn/fzy) algorithm,
which:
* Resolves the aforementioned issues.
* Performs better.
Implementation details
This version is based on the original fzy
[algorithm](https://github.com/jhawthorn/fzy/blob/master/src/match.c),
with one key enhancement: **multibyte character support**.
* The original implementation supports only ASCII.
* This patch replaces ascii lookup tables with function calls, making it
compatible with multibyte character sets.
* Core logic (`match_row()` and `match_positions()`) remains faithful to
the original, but now operates on codepoints rather than single-byte
characters.
Performance
Tested against a dataset of **90,000 Linux kernel filenames**. Results
(in milliseconds) show a **\~2x performance improvement** over the
current fuzzy matching algorithm.
```
Search String Current Algo FZY Algo
-------------------------------------------------
init 131.759 66.916
main 83.688 40.861
sig 98.348 39.699
index 109.222 30.738
ab 72.222 44.357
cd 83.036 54.739
a 58.94 62.242
b 43.612 43.442
c 64.39 67.442
k 40.585 36.371
z 34.708 22.781
w 38.033 30.109
cpa 82.596 38.116
arz 84.251 23.964
zzzz 35.823 22.75
dimag 110.686 29.646
xa 43.188 29.199
nha 73.953 31.001
nedax 94.775 29.568
dbue 79.846 25.902
fp 46.826 31.641
tr 90.951 55.883
kw 38.875 23.194
rp 101.575 55.775
kkkkkkkkkkkkkkkkkkkkkkkkkkkkk 48.519 30.921
```
```vim
vim9script
var haystack = readfile('/Users/gp/linux.files')
var needles = ['init', 'main', 'sig', 'index', 'ab', 'cd', 'a', 'b',
'c', 'k',
'z', 'w', 'cpa', 'arz', 'zzzz', 'dimag', 'xa', 'nha', 'nedax',
'dbue',
'fp', 'tr', 'kw', 'rp', 'kkkkkkkkkkkkkkkkkkkkkkkkkkkkk']
for needle in needles
var start = reltime()
var tmp = matchfuzzy(haystack, needle)
echom $'{needle}' (start->reltime()->reltimefloat() * 1000)
endfor
```
Additional changes
* Removed the "camelcase" option from both matchfuzzy() and
matchfuzzypos(), as it's now obsolete with the improved algorithm.
related: neovim/neovim#34101fixesvim/vim#17531closes: vim/vim#179007e0df5eee9
Co-authored-by: Girish Palya <girishji@gmail.com>