Commit Graph

111 Commits

Author SHA1 Message Date
Riley Bruins
8543aa406c feat(treesitter): allow LanguageTree:is_valid() to accept a range
When given, only that range will be checked for validity rather than the
entire tree. This is used in the highlighter to save CPU cycles since we
only need to parse a certain region at a time anyway.
2025-02-02 12:13:25 -08:00
Riley Bruins
9508d6a814 refactor(treesitter): use coroutines for resuming _parse() logic
This means that all work previously done by a `_parse()` iteration will
be kept in future iterations. This prevents it from running indefinitely
in some cases where the file is very large and there are 2+ injections.
2025-02-02 11:51:24 -08:00
Riley Bruins
77be44563a refactor(treesitter): always return valid range from parse() #32273
Problem:
When running an initial parse, parse() returns an empty table rather
than an actual range. In `languagetree.lua`, we manually check if
a parse was incremental to determine the changed parse region.

Solution:
- Always return a range (in the C side) from parse().
- Simplify the language tree code a bit.
- Logger no longer shows empty ranges on the initial parse.
2025-02-02 03:46:26 -08:00
Riley Bruins
02ea0e77a1 refactor(treesitter): drop LanguageTree._has_regions #32274
This simplifies some logic in `languagetree.lua`, removing the need for
`_has_regions`, and removing side effects in `:included_regions()`.

Before:
- Edit is made which sets `_regions = nil`
- Upon the next call to `included_regions()` (usually right after we
  marked `_regions` as `nil` due to an `_iter_regions()` call), if
  `_regions` is nil, we repopulate the table (as long as the tree
  actually has regions)

After:
- Edit is made which resets `_regions` if it exists
- `included_regions()` no longer needs to perform this logic itself, and
  also no longer needs to read a `_has_regions` variable
2025-02-02 03:42:47 -08:00
Riley Bruins
096ae3bfd7 fix(treesitter): nil access when running string parser async 2025-02-01 17:02:52 +01:00
dundargoc
e71d2c817d docs: misc
Co-authored-by: Dustin S. <dstackmasta27@gmail.com>
Co-authored-by: Ferenc Fejes <fejes@inf.elte.hu>
Co-authored-by: Maria José Solano <majosolano99@gmail.com>
Co-authored-by: Yochem van Rosmalen <git@yochem.nl>
Co-authored-by: brianhuster <phambinhanctb2004@gmail.com>
Co-authored-by: zeertzjq <zeertzjq@outlook.com>
2025-01-30 13:46:06 +01:00
notomo
e7ebc5c13d fix(treesitter): stop async parsing if buffer is invalid
Problem: Error occurs if delete buffer in the middle of parsing.
Solution: Check if buffer is valid in parsing.
2025-01-29 09:15:13 +01:00
Riley Bruins
b88874d33c fix(treesitter): empty queries can disable injections (#31748)
**Problem:** Currently, if users want to efficiently disable injections,
they have to delete the injection query files at their runtime path.
This is because we only check for existence of the files before running
the query over the entire buffer.

**Solution:** Check for existence of query files, *and* that those files
actually have captures. This will allow users to just comment out
existing queries (or better yet, just add their own injection query to
`~/.config/nvim` which contains only comments) to disable running the
query over the entire buffer (a potentially slow operation)
2025-01-28 18:59:04 +01:00
Lewis Russell
6aa42e8f92 fix: resolve all remaining LuaLS diagnostics 2025-01-27 16:37:50 +00:00
Jaehwang Jung
27c8806953 docs(treesitter): expose LanguageTree:parent() #32108
Plugins may want to climb up the LanguageTree.

Also add missing type annotations for other methods.
2025-01-20 04:17:36 -08:00
Jaehwang Jung
6696ea7f10 fix(treesitter): clean up parsing queue 2025-01-19 15:53:17 +01:00
Riley Bruins
45e606b1fd feat(treesitter): async parsing
**Problem:** Parsing can be slow for large files, and it is a blocking
operation which can be disruptive and annoying.

**Solution:** Provide a function for asynchronous parsing, which accepts
a callback to be run after parsing completes.

Co-authored-by: Lewis Russell <lewis6991@gmail.com>
Co-authored-by: Luuk van Baal <luukvbaal@gmail.com>
Co-authored-by: VanaIgr <vanaigranov@gmail.com>
2025-01-12 08:10:47 -08:00
Riley Bruins
30de00687b refactor(treesitter): simplify condition #31889 2025-01-06 16:56:53 -08:00
luukvbaal
cdc9baeaf8 fix(treesitter): remove redundant on_bytes callback #31041
Problem:  Treesitter highlighter implements an on_bytes callback that
          just re-marks a buffer range for redraw. The edit that
          prompted the callback will already have done that.
Solution: Remove redundant on_bytes callback from the treesitter
          highlighter module.
2024-11-16 16:25:10 -08:00
Christian Clason
99e0facf3a feat(treesitter)!: use return values in language.add()
Problem: No clear way to check whether parsers are available for a given
language.

Solution: Make `language.add()` return `true` if a parser was
successfully added and `nil` otherwise. Use explicit `assert` instead of
relying on thrown errors.
2024-09-29 15:27:16 +02:00
Gregory Anders
6913c5e1d9 feat(treesitter)!: default to correct behavior for quantified captures (#30193)
For context, see https://github.com/neovim/neovim/pull/24738. Before
that PR, Nvim did not correctly handle captures with quantifiers. That
PR made the correct behavior opt-in to minimize breaking changes, with
the intention that the correct behavior would eventually become the
default. Users can still opt-in to the old (incorrect) behavior for now,
but this option will eventually be removed completely.

BREAKING CHANGE: Any plugin which uses `Query:iter_matches()` must
update their call sites to expect an array of nodes in the `match`
table, rather than a single node.
2024-09-01 18:01:53 +00:00
Yi Ming
0a1212ef94 docs(treesitter): generate inline docs for Ranges
docs(treesitter): in-place parameter description

docs(treesitter): remove internal type names

docs(treesitter): add missing private annotation
2024-08-06 18:18:34 +02:00
Riley Bruins
bd3b6ec836 feat(treesitter): add node_for_range function
This is identical to `named_node_for_range` except that it includes
anonymous nodes. This maintains consistency in the API because we
already have `descendant_for_range` and `named_descendant_for_range`.
2024-07-29 17:15:46 +02:00
Lewis Russell
5e49ef0af3 refactor(lua): improve type annotations 2024-06-11 12:45:43 +01:00
dundargoc
a664246171 feat: remove deprecated features
Remove following functions:
- vim.lsp.util.extract_completion_items
- vim.lsp.util.get_progress_messages
- vim.lsp.util.parse_snippet()
- vim.lsp.util.text_document_completion_list_to_complete_items
- LanguageTree:for_each_child
- health#report_error
- health#report_info
- health#report_ok
- health#report_start
- health#report_warn
- vim.health.report_error
- vim.health.report_info
- vim.health.report_ok
- vim.health.report_start
- vim.health.report_warn
2024-05-16 18:30:59 +02:00
Justin M. Keyes
01b6bff7e9 docs: news
Set dev_xx.txt help files to use "flow" layout.
2024-05-15 23:19:26 +02:00
Christian Clason
26b5405d18 fix(treesitter): enforce lowercase language names (#28546)
* fix(treesitter): enforce lowercase language names

Problem: On case-insensitive file systems (e.g., macOS), `has_parser`
will return `true` for uppercase aliases, which will then try to inject
the uppercase language unsuccessfully.

Solution: Enforce and assume parser names to be lowercase when
resolving language names.
2024-04-28 16:27:47 +02:00
altermo
00e6651880 fix(treesitter): use tree range instead of tree root node range 2024-04-10 15:54:52 +01:00
Christian Clason
6cfca21bac feat(treesitter): add @injection.filename
Problem: Injecting languages for file redirects (e.g., in bash) is not
possible.

Solution: Add `@injection.filename` capture that is piped through
`vim.filetype.match({ filename = node_text })`; the resulting filetype
(if not `nil`) is then resolved as a language (either directly or
through the list maintained via `vim.treesitter.language.register()`).

Note: `@injection.filename` is a non-standard capture introduced by
Helix; having two editors implement it makes it likely to be upstreamed.
2024-04-02 11:13:16 +02:00
Lewis Russell
14e4b6bbd8 refactor(lua): type annotations 2024-03-16 19:26:10 +00:00
Lewis Russell
12faaf40f4 fix(treesitter): highlight injections properly
`on_line_impl` doesn't highlight single lines, so using pattern indexes
to offset priority doesn't work.
2024-03-14 06:55:19 +00:00
Lewis Russell
ade1b12f49 docs: support inline markdown
- Tags are now created with `[tag]()`
- References are now created with `[tag]`
- Code spans are no longer wrapped
2024-03-09 11:21:55 +00:00
Lewis Russell
85b13751a5 refactor(types): more fixes (2) 2024-03-06 16:03:33 +00:00
Lewis Russell
a5fe8f59d9 docs: improve/add documentation of Lua types
- Added `@inlinedoc` so single use Lua types can be inlined into the
  functions docs. E.g.

  ```lua
  --- @class myopts
  --- @inlinedoc
  ---
  --- Documentation for some field
  --- @field somefield integer

  --- @param opts myOpts
  function foo(opts)
  end
  ```

  Will be rendered as

  ```
  foo(opts)

    Parameters:
      - {opts} (table) Object with the fields:
               - somefield (integer) Documentation
                 for some field
  ```

- Marked many classes with with `@nodoc` or `(private)`.
  We can eventually introduce these when we want to.
2024-03-01 23:02:18 +00:00
Lewis Russell
9beb40a4db feat(docs): replace lua2dox.lua
Problem:

The documentation flow (`gen_vimdoc.py`) has several issues:
- it's not very versatile
- depends on doxygen
- doesn't work well with Lua code as it requires an awkward filter script to convert it into pseudo-C.
- The intermediate XML files and filters makes it too much like a rube goldberg machine.

Solution:

Re-implement the flow using Lua, LPEG and treesitter.

- `gen_vimdoc.py` is now replaced with `gen_vimdoc.lua` and replicates a portion of the logic.
- `lua2dox.lua` is gone!
- No more XML files.
- Doxygen is now longer used and instead we now use:
  - LPEG for comment parsing (see `scripts/luacats_grammar.lua` and `scripts/cdoc_grammar.lua`).
  - LPEG for C parsing (see `scripts/cdoc_parser.lua`)
  - Lua patterns for Lua parsing (see `scripts/luacats_parser.lua`).
  - Treesitter for Markdown parsing (see `scripts/text_utils.lua`).
- The generated `runtime/doc/*.mpack` files have been removed.
   - `scripts/gen_eval_files.lua` now instead uses `scripts/cdoc_parser.lua` directly.
- Text wrapping is implemented in `scripts/text_utils.lua` and appears to produce more consistent results (the main contributer to the diff of this change).
2024-02-27 14:41:17 +00:00
Thomas Vigouroux
bd5008de07 fix(treesitter): correctly handle query quantifiers (#24738)
Query patterns can contain quantifiers (e.g. (foo)+ @bar), so a single
capture can map to multiple nodes. The iter_matches API can not handle
this situation because the match table incorrectly maps capture indices
to a single node instead of to an array of nodes.

The match table should be updated to map capture indices to an array of
nodes. However, this is a massively breaking change, so must be done
with a proper deprecation period.

`iter_matches`, `add_predicate` and `add_directive` must opt-in to the
correct behavior for backward compatibility. This is done with a new
"all" option. This option will become the default and removed after the
0.10 release.

Co-authored-by: Christian Clason <c.clason@uni-graz.at>
Co-authored-by: MDeiml <matthias@deiml.net>
Co-authored-by: Gregory Anders <greg@gpanders.com>
2024-02-16 11:54:47 -06:00
Christian Clason
674f2513d4 fix(treesitter): validate language alias for injections
Problem: Parsed language annotations can be random garbage so
`nvim_get_runtime_file` throws an error.

Solution: Validate that `alias` is a valid language name before trying
to find a parser for it.
2024-01-18 15:46:08 +01:00
Jaehwang Jung
7fa292c52d fix(treesitter): outdated highlight due to tree with outdated region
Problem:
A region managed by an injected parser may shrink after re-running the
injection query. If the updated region goes out of the range to be
parsed, then the corresponding tree will remain outdated, possibly
retaining the nodes that shouldn't exist anymore. This results in
outdated highlights.

Solution:
Re-parse an invalid tree if its region intersects the range to be
parsed.
2023-12-24 09:47:59 +01:00
Dmytro Soltys
72ed99319d fix(treesitter): don't invalidate parser when discovering injections
When parsing with a range, languagetree looks up injections and adds
them if needed. This explicitly invalidates parser, making `is_valid`
report `false` both when including and excluding children.

This is an attempt to describe desired behaviour of `is_valid` in tests,
with what ended up being a single line change to satisfy them.
2023-11-27 15:53:26 +01:00
L Lllvvuu
e353c869ce fix(languagetree): don't treat unparsed nodes as occupying full range
This is incorrect in the following scenario:
1. The language tree is Lua > Vim > Lua.
2. An edit simultaneously wipes out the `_regions` of all nodes, while
   taking the Vim injection off-screen.
3. The Vim injection is not re-parsed, so the child Lua `_regions` is
   still `nil`.
4. The child Lua is assumed, incorrectly, to occupy the whole document.
5. This causes the injections to be parsed again, resulting in Lua > Vim
   > Lua > Vim.
6. Now, by the same process, Vim ends up with its range assumed over the
   whole document. Now the parse is broken and results in broken
   highlighting and poor performance.

It should be fine to instead treat an unparsed node as occupying
nothing (i.e. effectively non-existent). Since, either:
- The parent was just parsed, hence defining `_regions`
- The parent was not just parsed, in which case this node doesn't need
  to be parsed either.

Also, the name `has_regions` is confusing; it seems to simply
mean the opposite of "root" or "full_document". However, this PR does
not touch it.
2023-09-22 12:51:51 +01:00
Lewis Russell
877d04d0fb feat(lua): add vim.func._memoize
Memoizes a function, using a custom function to hash the arguments.

Private for now until:

- There are other places in the codebase that could benefit from this
  (e.g. LSP), but might require other changes to accommodate.
- Invalidation of the cache needs to be controllable. Using weak tables
  is an acceptable invalidation policy, but it shouldn't be the only
  one.
- I don't think the story around `hash_fn` is completely thought out. We
  may be able to have a good default hash_fn by hashing each argument,
  so basically a better 'concat'.
2023-09-20 13:42:41 +01:00
Jaehwang Jung
71d9b7d15c fix(treesitter): _trees may not be list-like
Problem:
With incremental injection parsing, injected languages' parsers parse
only the relevant regions and stores the result in _trees with the index
of the corresponding region. Therefore, there can be holes in _trees.

Solution:
* Use generic table functions where appropriate.
* Fix type annotations and docs.
2023-09-17 19:52:35 +01:00
Jaehwang Jung
7e5ce42977 fix(treesitter): properly combine injection.combined regions
Problem:
It doesn't make much sense to flatten each region (= list of ranges).
This coincidentally worked for region with a single range.

Solution:
Custom function for combining regions.
2023-09-16 17:02:26 +01:00
L Lllvvuu
908843df61 fix(languagetree): apply resolve_lang to metadata['injection.language']
`resolve_lang` is applied to `@injection.language` when it's supplied as a
capture:

f5953edbac/runtime/lua/vim/treesitter/languagetree.lua (L766-L768)

If we want to support `metadata['injection.language']` (as per #22518 and
[tree-sitter upstream](https://tree-sitter.github.io/tree-sitter/syntax-highlighting#language-injection))
then the behavior should be consistent.

Fixes: nvim-treesitter/nvim-treesitter#4918
2023-09-16 11:12:06 +01:00
Gregory Anders
2e92065686 docs: replace <pre> with ``` (#25136) 2023-09-14 08:23:01 -05:00
LW
9fc321c976 refactor(treesitter): deprecate for_each_child #25118
The name for_each_child is misleading and caused bugs.
After #25111, #25115, there are no more usages of `for_each_child` in Nvim.

In the future if we want to restore this functionality we can consider a
generalized vim.traverse(node, key, visitor) function.
2023-09-14 03:36:16 -07:00
Lewis Russell
7a76fb8547 fix(treesitter): remove more double recursion
Do not call `for_each_child` in functions that are already recursive.
2023-09-12 12:21:42 +01:00
L Lllvvuu
6b5f44817e fix(languagetree): remove double recursion in LanguageTree:parse
`LanguageTree:parse` is recursive, and calls
`LanguageTree:for_each_child`, which is also recursive.

That means that, starting from the third level (child of child of root),
nodes will be parsed twice.

Which then means that if the tree is N layers deep, there will be ~2^N
parses even if the branching factor is 1.

Now, why was the tree deepening with each character inserted? And why
did this only regress in #24647? These are mysteries for another time.

Fixes: #25104
2023-09-12 09:12:53 +02:00
Amaan Qureshi
c6ec7fa8d7 feat(treesitter): add 'injection.self' and 'injection.parent'
Co-authored-by: ObserverOfTime <chronobserver@disroot.org>
2023-08-24 09:05:44 +09:00
Christian Clason
fc0ee871de fix(treesitter)!: remove deprecated legacy injection format 2023-08-14 00:14:35 +02:00
Lewis Russell
8179d68dc1 fix(treesitter): logger memory leak 2023-08-13 11:23:17 +01:00
Lewis Russell
2ca076e45f feat(treesitter)!: incremental injection parsing
Problem:

Treesitter highlighting is slow for large files with lots of injections.

Solution:

Only parse injections we are going to render during a redraw cycle.

---

- `LanguageTree:parse()` will no longer parse injections by default and
  now requires an explicit range argument to be passed.

- `TSHighlighter` now parses injections incrementally during on_win
  callbacks for the line range being rendered.

- Plugins which require certain injections to be parsed must run
  `parser:parse({ start_row, end_row })` before using the tree.
2023-08-12 16:11:36 +01:00
Christian Clason
31c4ed26bc feat(treesitter): add injection language fallback (#24659)
* feat(treesitter): add injection language fallback

Problem: injection languages are often specified via aliases (e.g.,
filetype or in upper case), requiring custom directives.

Solution: include lookup logic (try as parser name, then filetype, then
lowercase) in LanguageTree itself and remove `#inject-language`
directive.

Co-authored-by: Lewis Russell <me@lewisr.dev>
2023-08-11 17:05:17 +02:00
Lewis Russell
0211f889b9 fix(treesitter): make sure injections don't return empty ranges (#24595)
When an injection has not set include children, make sure not to add
the injection if no ranges are determined.

This could happen when there is an injection with a child that has the
same range as itself. e.g. consider this Makefile snippet

```make
foo:
  $(VAR)
```

Line 2 has an injection for bash and a make variable reference. If
include-children isn't set (default), then there is no range on line 2
to inject since the variable reference needs to be excluded.

This caused the language tree to return an empty range, which the parser
now interprets to mean the full buffer. This caused makefiles to have
completely broken highlighting.
2023-08-07 18:22:36 +01:00
Lewis Russell
be74807eef docs(lua): more improvements (#24387)
* docs(lua): teach lua2dox how to table

* docs(lua): teach gen_vimdoc.py about local functions

No more need to mark local functions with @private

* docs(lua): mention @nodoc and @meta in dev-lua-doc

* fixup!

Co-authored-by: Justin M. Keyes <justinkz@gmail.com>

---------

Co-authored-by: Justin M. Keyes <justinkz@gmail.com>
2023-07-18 15:42:30 +01:00