Problem:
parser_gc() calls ts_parser_delete() but leaves the userdata pointer
pointing to freed memory. If the GC finalizer runs at an unexpected time
(e.g. inside nvim_buf_get_lines #39411), a stale pointer could cause a crash.
Solution:
- NULL out `*ud` after ts_parser_delete() in parser_gc()
- Update parser_check() to handle NULL with a clear error message,
guarding all parser methods against UAF
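
The clear-then-check pattern can be sketched in a language-neutral way. This is only a toy Python model, not the C code in this change: `ParserHandle`, `_free_parser` and the error text are hypothetical names chosen for illustration.

```python
class ParserHandle:
    """Toy stand-in for the Lua userdata that wraps a parser pointer."""

    def __init__(self):
        self.ptr = object()  # hypothetical: stands in for the TSParser*

    def gc(self):
        # Model of the finalizer: release the parser, then clear the slot
        # so later accesses see "no parser" instead of a stale pointer.
        if self.ptr is not None:
            _free_parser(self.ptr)
            self.ptr = None


def _free_parser(ptr):
    """Hypothetical stand-in for ts_parser_delete(); does nothing here."""


def parser_check(handle):
    # Model of the check every parser method goes through: fail loudly
    # on a finalized handle rather than dereferencing freed memory.
    if handle.ptr is None:
        raise RuntimeError("parser has been freed")  # assumed wording
    return handle.ptr
```

Because every method funnels through the check, clearing the slot in one place is enough to cover all of them.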
Co-authored-by: Lewis Russell <lewis6991@gmail.com>
Signed-off-by: Szymon Wilczek <swilczek.lx@gmail.com>
Problem:
`get_node_text()` returned inconsistent results between buffer and
string sources when a node's range ends at `end_col == 0` (i.e. the node
ends with a newline). The buffer path dropped the trailing newline; the
string path included it correctly.
Solution:
Append `'\n'` in `buf_range_get_text()` when `end_col == 0` and
`start_row ~= end_row`. The `start_row ~= end_row` guard excludes
zero-width nodes at column 0, which should return `""`.
Remove the workaround in the `#trim!` directive that manually
compensated for the missing newline.
Strip whitespace in `resolve_lang()` so injection language nodes ending
at `end_col == 0` (e.g. `">lua\n"`) still resolve correctly.
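
The newline rule can be illustrated with a small Python model. It is a sketch under stated assumptions, not the Lua implementation: the function names are reused for readability only, lines are stored without their newline, columns are end-exclusive, and text is ASCII so bytes and characters coincide.

```python
def buf_range_get_text(lines, start_row, start_col, end_row, end_col):
    """Model of reading a (row, col) range from a list of buffer lines."""
    if start_row == end_row:
        # Single-line range; a zero-width range at column 0 yields "".
        return lines[start_row][start_col:end_col]
    if end_col == 0:
        # The range stops at column 0 of end_row, so it covers whole lines
        # up to end_row - 1 plus the newline that terminates the last one.
        chunk = lines[start_row:end_row]
        chunk[0] = chunk[0][start_col:]
        return "\n".join(chunk) + "\n"
    chunk = lines[start_row:end_row + 1]
    chunk[0] = chunk[0][start_col:]
    chunk[-1] = chunk[-1][:end_col]
    return "\n".join(chunk)


def str_range_get_text(source, start_byte, end_byte):
    """Model of the string path: a plain byte slice."""
    return source[start_byte:end_byte]
```

With `lines = ["lua", "print(1)"]` and the equivalent string `"lua\nprint(1)"`, the range `(0, 0, 1, 0)` and the byte range `0..4` both give `"lua\n"` in this model.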
After an edit, LanguageTree:_edit() updates the current trees. When the
LanguageTree manages explicit regions, _edit() also refreshes _regions
from tree:included_ranges(true), so those regions have the edited byte
offsets.
A later injection pass may call set_included_regions() with a different
number of child regions. That path discards the old trees and emits
changedtree callbacks for them. invalidate(true) does the same when a
buffer is reloaded. Before this change, both discard paths called
tree:included_ranges(true) for every old tree, even if _edit() had just
collected those exact ranges.
That duplicate range extraction is expensive with many injection trees.
Realistic shapes include generated C files with many macro bodies parsed
by the C preproc_arg injection, Markdown documents with many fenced blocks
of the same language, and template files with many embedded-language
islands. The stock highlighter registers recursive changedtree callbacks,
so this is on the normal highlighting edit path.
Track whether _regions currently came from tree:included_ranges(true)
with _regions_from_tree_ranges. _do_changedtree_callbacks() reuses
_regions only in that state; otherwise it falls back to calling
tree:included_ranges(true). Clear the marker when regions are replaced by
injection ranges, when a tree is reparsed, or when trees are discarded.
This avoids keeping a second copy of the ranges while preserving callback
precision: changedtree still receives tree:included_ranges(true) for the
old tree, not the broader managed region.
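
The marker logic can be modeled roughly as follows. This Python sketch is an illustration only: `Tree`, `LanguageTreeModel` and `discard_trees` are hypothetical names, and the call counter exists purely to make the saved work visible.

```python
class Tree:
    """Toy tree whose range extraction is counted to expose duplicate work."""

    def __init__(self, ranges):
        self._ranges = list(ranges)
        self.calls = 0

    def included_ranges(self):
        self.calls += 1
        return list(self._ranges)


class LanguageTreeModel:
    def __init__(self, trees):
        self.trees = list(trees)
        self.regions = []
        self.regions_from_tree_ranges = False

    def edit(self):
        # After an edit, regions are refreshed from the trees themselves.
        self.regions = [t.included_ranges() for t in self.trees]
        self.regions_from_tree_ranges = True

    def discard_trees(self):
        # Ranges reported to callbacks for the old trees: reuse the cached
        # regions only if they are known to come from the trees.
        if self.regions_from_tree_ranges:
            old = self.regions
        else:
            old = [t.included_ranges() for t in self.trees]
        self.trees = []
        self.regions_from_tree_ranges = False
        return old

    def set_included_regions(self, regions):
        old = self.discard_trees()
        self.regions = list(regions)  # injection ranges: marker stays off
        return old
```

In this model, an edit followed by a region replacement extracts each tree's ranges once rather than twice.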
Benchmark on 100k C macro injections, one-line edit, recursive
changedtree callback:
- HEAD median: edited parse 84.2 ms, child region replacement 58.7 ms
- This change: edited parse 34.6 ms, child region replacement 8.5 ms
That is about 2.4x faster for the edit parse and 6.9x faster for child
region replacement in this workload.
Add a regression test that replacing injection regions still fires
changedtree and still reports the old tree's exact included ranges.
AI-assisted: Codex
Problem:
Cannot remove a `@conceal` highlight when defined in highlights.scm.
Solution:
Support a `@noconceal` highlight that works similarly to `@nospell`: it
overrides the conceal set on the range and removes it. Additionally, the
conceal metadata field can be set to false for the same behavior.
Problem:
Can't expand treesitter-incremental-selection to the next and previous
sibling nodes.
Solution:
Pressing `]N` in visual mode will expand the selection to the next
sibling node, and `[N` will do the same with the previous node.
Problem:
`test/functional/plugin/lsp_spec.lua` had grown into a large catch-all file that mixed core LSP client lifecycle coverage, `vim.lsp.buf.*` behavior, and `vim.lsp.util.*` behavior in one place.
Solution:
Split the large tests into more focused test files without changing test coverage or intended behavior.
After this change, `lsp_spec.lua` is more focused on core LSP client/config/dynamic-registration behavior.
Problem:
`TSNode:id()` returns the underlying C pointer as a string, which may include
NUL bytes. In PUC Lua, `('%s'):format('\0a\0')` returns `''` and not `'\0a\0'`,
i.e. it treats the string as a C string, which terminates at the first NUL byte.
This resulted in two different nodes being able to have the same id.
Solution:
Use concatenation `..` instead of `string.format()`.
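
The collision can be reproduced with a toy model of C-string truncation. This Python sketch does not call Lua; `c_string_view` is a hypothetical helper that mimics what a NUL-terminated reader sees, and the id bytes are made up for illustration.

```python
def c_string_view(raw: bytes) -> bytes:
    """Return the prefix a NUL-terminated C string reader would see."""
    return raw.split(b"\0", 1)[0]


# Two made-up pointer-like ids that differ only after an embedded NUL byte.
id_a = b"\x10\x00\xaa\xbb"
id_b = b"\x10\x00\xcc\xdd"

# Formatting through a C-string view collapses both ids to the same key.
collide = c_string_view(id_a) == c_string_view(id_b)

# Plain concatenation keeps every byte, so the keys stay distinct.
distinct = (b"node:" + id_a) != (b"node:" + id_b)
```

Here `collide` and `distinct` are both true, which mirrors why concatenation avoids the duplicate ids.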
Problem: Treesitter highlighting regressed on 32-bit builds because ranges that should cover the whole buffer were corrupted when passed into Lua.
Solution: Round-trip those range values through Lua and validate them so treesitter sees the same ranges on 32 and 64-bit builds.
Hyphenated language names are silently dropped when used as injections
(see #38132).
This combines the normalization of language aliases into `resolve_lang`,
and also adds the normalization of hyphens to underscores, which allows
for handling of injected language tags with hyphens in their names.
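
A rough model of the combined normalization, written in Python for illustration. The lookup order, the `parsers` set and the `aliases` table are assumptions of this sketch, not the actual `resolve_lang` code.

```python
def resolve_lang(tag, parsers, aliases):
    """Resolve an injection language tag to an available parser name.

    `parsers` is a set of parser names and `aliases` maps alias -> name;
    both are hypothetical inputs for this sketch.
    """
    name = tag.strip()  # tolerate surrounding whitespace and newlines
    candidates = (
        name,                    # exact parser name
        aliases.get(name),       # registered alias, if any
        name.replace("-", "_"),  # hyphenated tag, e.g. "c-sharp"
    )
    for candidate in candidates:
        if candidate in parsers:
            return candidate
    return None  # unknown language: the injection is skipped
```

For example, with `parsers = {"lua", "c_sharp", "javascript"}` and `aliases = {"js": "javascript"}`, the tags `"lua\n"`, `"js"` and `"c-sharp"` resolve to `"lua"`, `"javascript"` and `"c_sharp"` in this model.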
Fixes #38132.
Problem:
:InspectTree doesn't show luadoc injections in a Lua file, because luadoc
shares the same "root" with comment in their common primary (lua) tree.
The current logic simply shows the largest injection (comment) and
ignores all smaller ones (luadoc).
Solution:
Handle injections of different languages separately, then sort them by
byte_length to keep the drawn tree consistent.
This is a better way to prevent parallel tests from interfering with
each other, as there are many ways files can be created and deleted in
tests, so enforcing different file names is hard.
Using $TMPDIR can also work in most cases, but 'backupskip' etc. have
special defaults for $TMPDIR.
Symlink runtime/, src/, test/ and README.md to Xtest_xdg dir to make
tests more convenient (and symlinking test/ is required for busted).
Also, use README.md instead of test/README.md in the Ex mode inccommand
test, as test/README.md no longer contains 'N' char.
Problem: Spell navigation skips words on the first line because
_on_spell_nav passes an empty range (0,0) to the highlighter.
Solution: Use math.max(erow, srow + 1) to ensure a valid search window.
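
The clamp can be shown in isolation. This Python sketch assumes end-exclusive row ranges; `spell_nav_window` is a hypothetical name used only for this example.

```python
def spell_nav_window(srow, erow):
    """Return a (start, end) row window that always spans at least one row."""
    # With end-exclusive rows, (0, 0) is empty; widening the end to
    # srow + 1 guarantees the start row itself is searched.
    return srow, max(erow, srow + 1)
```

An empty window such as `(0, 0)` becomes `(0, 1)`, while an already valid window like `(3, 10)` is left unchanged.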
Signed-off-by: ashab-k <ashabkhan2000@gmail.com>
This commit changes `languagetree.lua` so that it creates a scratch
buffer under the hood when dealing with string parsers. This will make
it much easier to just use extmarks whenever we need to track injection
trees in `languagetree.lua`. This also allows us to remove the
`treesitter.c` code for parsing a string directly.
Note that the string parser's scratch buffer has `set noeol nofixeol` so
that the parsed source exactly matches the passed in string.
**Problem(?):** Buffers that (for whatever reason) aren't meant to have
a final newline are still parsed with a final newline in `treesitter.c`.
**Solution:** Don't add the newline to the last buffer line if it
shouldn't be there. (This more closely matches the approach of
`read_buffer_into()`.)
This allows us to, say, use a scratch buffer with `noeol` and `nofixeol`
behind the scenes in `get_string_parser()`.
...which would allow us to track injection trees with extmarks in that
case.
...which would allow us to not drop previous trees after reparsing a
different range with `get_parser():parse()`.
...which would prevent flickering when editing a buffer that has 2+
windows to it in view at a time.
...which would allow us to keep our sanity!!!
(one step at a time...)
Problem:
With `foldmethod=expr foldexpr=v:lua.vim.treesitter.foldexpr()
foldminlines=0`, deleting lines at the end of the buffer always
reports an "invalid top" error, because the top value (i.e. the
start line number of the deletion) is always 1 greater than
the number of lines in the modified buffer.
Solution:
Remove the ml_line_count validation.
Problem: many FileType autocommands assume curbuf is the same as the target
buffer; this can cause &syntax to be restored for the wrong buffer in some cases
when TSHighlighter:destroy is called.
Solution: run nvim_exec_autocmds in the context of the target buffer via
nvim_buf_call.
This commit changes the `offset!` directive so that instead of setting a
`metadata.range` value for the entire pattern, it will set a
`metadata.offset` value. This offset will be applied to the range only
in `vim.treesitter.get_range()`, rather than at directive application
time. This allows the offset to be applied to any and all nodes captured
by the given pattern, and removes the requirement that `#offset!` be
applied to only a single node.
The downside of this change is that plugins which read from
`metadata.range` may be thrown off course, but such plugins should
prefer `vim.treesitter.get_range()` when retrieving ranges anyway.
Note that `#trim!` still sets `metadata.range`, and
`vim.treesitter.get_range()` still reads from `metadata.range`, if it
exists.
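
The difference between a per-pattern range and a per-pattern offset can be sketched as below. This is a Python model under assumptions: ranges are `(start_row, start_col, end_row, end_col)` tuples, `metadata` is a plain dict, and `metadata["range"]` is assumed to take precedence over `metadata["offset"]`.

```python
def get_range(node_range, metadata=None):
    """Model of resolving a node's effective range from pattern metadata."""
    metadata = metadata or {}
    if "range" in metadata:
        # An explicit range (e.g. as set by a trim-style directive) wins.
        return tuple(metadata["range"])
    # Otherwise apply the offset lazily, per node, at lookup time.
    offset = metadata.get("offset", (0, 0, 0, 0))
    return tuple(r + d for r, d in zip(node_range, offset))
```

Because the offset is applied at lookup time, the same metadata can serve several captured nodes: with `{"offset": (0, 1, 0, -1)}`, the node ranges `(2, 0, 2, 10)` and `(5, 4, 5, 8)` become `(2, 1, 2, 9)` and `(5, 5, 5, 7)`.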
FAILED test/functional/treesitter/fold_spec.lua @ 720: treesitter foldexpr doesn't open folds that are not touched
test/functional/treesitter/fold_spec.lua:767: Row 1 did not match.
Expected:
|*{1:-}^t1 |
|*{1:-}# h2 |
|*{1:│}t2 |
|{3:~ }|
|{3:~ }|
|{3:~ }|
|{3:~ }|
|1 line less; before #2 {MATCH:.*}|
Actual:
|*{1: }^t1 |
|*{1:+}{2:+-- 2 lines: # h2·····················}|
|*{3:~ }|
|{3:~ }|
|{3:~ }|
|{3:~ }|
|{3:~ }|
|1 line less; before #2 0 seconds ago |
To print the expect() call that would assert the current screen state, use
screen:snapshot_util(). In case of non-deterministic failures, use
screen:redraw_debug() to show all intermediate screen states.
Snapshot:
screen:expect([[
{1: }^t1 |
{1:+}{2:+-- 2 lines: # h2·····················}|
{3:~ }|*5
1 line less; before #2 0 seconds ago |
]])
Problem: Last diff folds not merged (after v8.1.1922)
Solution: loop over all windows in the current tabpage and update all
folds (Gary Johnson)
This commit fixes a bug where the last two folds of a diff are not
merged when the last difference between the two diff'd buffers is
resolved.
Normally, when two buffers are diff'd, folding is used to show only the
text that differs and to hide the text that is the same between the two
buffers. When a difference is resolved by making a block of text the
same in both buffers, the folds are updated to merge that block with the
folds above and below it into one closed fold.
That updating of the folds did not occur when the block of text was the
last diff block in the buffers.
The bug was introduced by this patch on August 24, 2019:
patch 8.1.1922: in diff mode global operations can be very slow
Problem: In diff mode global operations can be very slow.
Solution: Do not call diff_redraw() many times, call it once when
redrawing. And also don't update folds multiple times.
Unfortunately, folds were then not updated often enough.
The problem was fixed by adding a short loop to the ex_diffgetput()
function in diff.c to update all the folds in the current tab when the
last difference is removed.
A test for this was added to test_diffmode.vim. Two of the reference
screen dumps for another test in that file,
Test_diffget_diffput_linematch(), had to be changed to have all the
folds closed rather than to have the last diff block remain open.
closes: vim/vim#17457
vim/vim@3fa0d3514b
Co-authored-by: Gary Johnson <garyjohn@spocom.com>
Problem: conceal_lines cache is invalidated in `on_buf`
which is too late for code calculating text height after a
buffer change but before a redraw (like `lsp/util.lua`).
Solution: Replace `on_buf` with `on_bytes` handler that invalidates
the cache and clears the marks.
Problem: 'showmode' ext_messages state is not cleared after insert_expand mode.
Solution: Replace empty message with more idiomatic way to clear the showmode.
Problem:
treesitter injected language ranges sometimes cross over the capture
boundaries when `@combined`.
Solution:
Clip child regions to not spill out of parent regions within
languagetree.lua, and only apply highlights within those regions in
highlighter.lua.
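
Clipping child regions to their parents amounts to interval intersection. The Python sketch below is a simplified model: it uses only `(start_byte, end_byte)` pairs with exclusive ends, and `clip_regions` is a hypothetical name.

```python
def clip_regions(child, parents):
    """Intersect each child range with each parent range.

    Both arguments are lists of (start_byte, end_byte) pairs, end exclusive.
    Parts of a child range that fall outside every parent are dropped.
    """
    clipped = []
    for child_start, child_end in child:
        for parent_start, parent_end in parents:
            start = max(child_start, parent_start)
            end = min(child_end, parent_end)
            if start < end:  # keep only non-empty overlaps
                clipped.append((start, end))
    return sorted(clipped)
```

A combined child range `(5, 25)` under parents `(0, 10)` and `(20, 30)` is split into `(5, 10)` and `(20, 25)`, so nothing spills into the gap between the parents.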
Co-authored-by: Cormac Relf <web@cormacrelf.net>
Simplify the logic for retrieving the injection ranges for the language
tree. The trees are now also sorted by starting position, regardless of
whether they are part of a combined injection or not. This would be
helpful if ranges are ever to be stored in an interval tree or other
kind of sorted tree structure.
Problem: Cannot disable individual captures and patterns in treesitter queries.
Solution:
* Expose the corresponding tree-sitter API functions for the `TSQuery` object.
* Add documentation for `TSQuery`.
* Return the pattern ID from `get_captures_at_pos()` (and hence `:Inspect!`).
Problem: No way to check the version of a treesitter parser.
Solution: Add version metadata (ABI 15 parsers only) as well as parser state count and supertype information (ABI 15) in `vim.treesitter.language.inspect()`. Also graduate the `abi_version` field, as this is now the official upstream name.
---------
Co-authored-by: Christian Clason <c.clason@uni-graz.at>
Problem:
Indenting text is a common task in plugins/scripts for
presentation/formatting, yet vim has no way of doing it (especially
"dedent", and especially non-buffer text).
Solution:
Introduce `vim.text.indent()`. It sets the *exact* indentation because
that's a more difficult (and thus more useful) task than merely
"increasing the current indent" (which is somewhat easy with a `gsub()`
one-liner).
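
The "exact indentation" idea can be modeled as dedent-then-indent. This Python sketch is not the Lua implementation: it handles spaces only, and the `(size, text)` argument order is an assumption of the example.

```python
import textwrap


def indent(size, text):
    """Set the indentation of `text` so its least-indented line has `size` spaces."""
    # Remove the common leading whitespace first, so the result does not
    # depend on how deeply the input happened to be indented.
    dedented = textwrap.dedent(text)
    # Then add exactly `size` spaces; blank lines are left untouched.
    return textwrap.indent(dedented, " " * size)
```

With `size = 0` this acts as a plain dedent, and relative indentation between lines is preserved in both directions.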