`LanguageTree:parse` is recursive, and calls
`LanguageTree:for_each_child`, which is also recursive.
That means that, starting from the third level (child of child of root),
nodes will be parsed twice.
Which then means that if the tree is N layers deep, there will be ~2^N
parses even if the branching factor is 1.
Now, why was the tree deepening with each character inserted? And why
did this only regress in #24647? These are mysteries for another time.
Fixes: #25104
Problem:
Treesitter highlighting is slow for large files with lots of injections.
Solution:
Only parse injections we are going to render during a redraw cycle.
---
- `LanguageTree:parse()` will no longer parse injections by default and
now requires an explicit range argument to be passed.
- `TSHighlighter` now parses injections incrementally during on_win
callbacks for the line range being rendered.
- Plugins which require certain injections to be parsed must run
`parser:parse({ start_row, end_row })` before using the tree.
* feat(treesitter): add injection language fallback
Problem: injection languages are often specified via aliases (e.g.,
filetype or in upper case), requiring custom directives.
Solution: include lookup logic (try as parser name, then filetype, then
lowercase) in LanguageTree itself and remove `#inject-language`
directive.
Co-authored-by: Lewis Russell <me@lewisr.dev>
When an injection has not set include children, make sure not to add
the injection if no ranges are determined.
This could happen when there is an injection with a child that has the
same range as itself. e.g. consider this Makefile snippet
```make
foo:
$(VAR)
```
Line 2 has an injection for bash and a make variable reference. If
include-children isn't set (default), then there is no range on line 2
to inject since the variable reference needs to be excluded.
This caused the language tree to return an empty range, which the parser
now interprets to mean the full buffer. This caused makefiles to have
completely broken highlighting.
* docs(lua): teach lua2dox how to table
* docs(lua): teach gen_vimdoc.py about local functions
No more need to mark local functions with @private
* docs(lua): mention @nodoc and @meta in dev-lua-doc
* fixup!
Co-authored-by: Justin M. Keyes <justinkz@gmail.com>
---------
Co-authored-by: Justin M. Keyes <justinkz@gmail.com>
- Add bindings to Treesitter ts_parser_set_logger and ts_parser_logger
- Add logfile with path STDPATH('log')/treesitter.c
- Rework existing LanguageTree loggin to use logfile
- Begin implementing log levels for vim.g.__ts_debug
When injections are added or removed make sure to:
- invoke 'changedtree' callbacks for when new trees are added.
- invoke 'changedtree' callbacks for when trees are invalidated
- redraw regions when languagetree children are removed
Problem:
Codebase inconsistently binds vim.api onto a or api.
Solution:
Use api everywhere. a as an identifier is too short to have at the
module level.
Never return the changes an only notify them using the `on_changedtree`
callback.
It is not guaranteed for a plugin that it'll be the first one to call
`tree:parse()` and thus get the changes.
Closes#19915
Problem:
gen_vimdoc.py / lua2dox.lua does not support @defgroup or \defgroup
except for "api-foo" modules.
Solution:
Modify `gen_vimdoc.py` to look for section names based on `helptag_fmt`.
TODO:
- Support @module ?
https://github.com/LuaLS/lua-language-server/wiki/Annotations#module
Problem:
Help tags like vim.treesitter.language.add() are confusing because
`vim.treesitter.language` is (thankfully) not a user-facing module.
Solution:
Ignore the "fstem" when generating "treesitter" tags.
Problem:
Treesitter injections are slow because all injected trees are invalidated on every change.
Solution:
Implement smarter invalidation to avoid reparsing injected regions.
- In on_bytes, try and update self._regions as best we can. This PR just offsets any regions after the change.
- Add valid flags for each region in self._regions.
- Call on_bytes recursively for all children.
- We still need to run the query every time for the top level tree. I don't know how to avoid this. However, if the new injection ranges don't change, then we re-use the old trees and avoid reparsing children.
This should result in roughly a 2-3x reduction in tree parsing when the comment injections are enabled.
Problem:
vim.treesitter does not know how to map a specific filetype to a parser.
This creates problems since in a few places (including in vim.treesitter itself), the filetype is incorrectly used in place of lang.
Solution:
Add an API to enable this:
- Add vim.treesitter.language.add() as a replacement for vim.treesitter.language.require_language().
- Optional arguments are now passed via an opts table.
- Also takes a filetype (or list of filetypes) so we can keep track of what filetypes are associated with which langs.
- Deprecated vim.treesitter.language.require_language().
- Add vim.treesitter.language.get_lang() which returns the associated lang for a given filetype.
- Add vim.treesitter.language.register() to associate filetypes to a lang without loading the parser.
Add a "show_tree" function to view a textual representation of the
nodes in a language tree in a window. Moving the cursor in the
window highlights the corresponding text in the source buffer, and
moving the cursor in the source buffer highlights the corresponding
nodes in the window.
The private 'get_node_range' function from the languagetree module has
been renamed and remains private as it serve a purpose that is only
relevant inside the languagetree module.
The 'get_node_range' upstreamed from nvim-treesitter in the treesitter
module has been made public as it is in itself a utlity function.
Previously the `offset!` directive populated the metadata in such a way
that the new range could be attributed to a specific capture. #14046
made it so the directive simply stored just the new range in the
metadata and information about what capture the range is based from is
lost.
This change reverts that whilst also correcting the docs.