* feat(treesitter): add injection language fallback
Problem: injection languages are often specified via aliases (e.g.,
filetype or in upper case), requiring custom directives.
Solution: include lookup logic (try as parser name, then filetype, then
lowercase) in LanguageTree itself and remove `#inject-language`
directive.
Co-authored-by: Lewis Russell <me@lewisr.dev>
Problem: luals returns stricter diagnostics with bundled luarc.json
Solution: Improve some function and type annotations:
* use recognized uv.* types
* disable diagnostic for global `vim` in shared.lua
* docs: don't start comment lines with taglink (otherwise LuaLS will interpret it as a type)
* add type alias for lpeg pattern
* fix return annotation for `vim.secure.trust`
* rename local Range object in vim.version (shadows `Range` in vim.treesitter)
* fix some "missing fields" warnings
* add missing required fields for test functions in eval.lua
* rename lsp meta files for consistency
* docs(lua): teach lua2dox how to table
* docs(lua): teach gen_vimdoc.py about local functions
No more need to mark local functions with @private
* docs(lua): mention @nodoc and @meta in dev-lua-doc
* fixup!
Co-authored-by: Justin M. Keyes <justinkz@gmail.com>
---------
Co-authored-by: Justin M. Keyes <justinkz@gmail.com>
perf(treesitter): cache vim.treesitter.query.get
Problem:
vim.treesitter.query.get searches and reads query files every time it's
called, if user hasn't overridden the query. So this can incur slowdown
when called frequently.
This can happen when using treesitter foldexpr. For example, when using
`:h :range!` in markdown file to format fenced codeblock, on_changedtree
in _fold.lua is triggered many times despite that the tree doesn't have
syntactic changes (might be a bug in LanguageTree). (Incidentally, the
resulting fold is incorrect due to a bug in `:h range!`.) on_changedtree
calls vim.treesitter.query.get for each tree changes. In addition, it
may request folds queries for injected languages without fold queries,
such as markdown_inline.
Solution:
* Cache the result of vim.treesitter.query.get.
* If query file was not found, fail quickly at later calls.
Problem:
Codebase inconsistently binds vim.api onto a or api.
Solution:
Use api everywhere. a as an identifier is too short to have at the
module level.
Problem:
Help tags like vim.treesitter.language.add() are confusing because
`vim.treesitter.language` is (thankfully) not a user-facing module.
Solution:
Ignore the "fstem" when generating "treesitter" tags.
Problem:
Treesitter injections are slow because all injected trees are invalidated on every change.
Solution:
Implement smarter invalidation to avoid reparsing injected regions.
- In on_bytes, try and update self._regions as best we can. This PR just offsets any regions after the change.
- Add valid flags for each region in self._regions.
- Call on_bytes recursively for all children.
- We still need to run the query every time for the top level tree. I don't know how to avoid this. However, if the new injection ranges don't change, then we re-use the old trees and avoid reparsing children.
This should result in roughly a 2-3x reduction in tree parsing when the comment injections are enabled.
Problem:
vim.treesitter does not know how to map a specific filetype to a parser.
This creates problems since in a few places (including in vim.treesitter itself), the filetype is incorrectly used in place of lang.
Solution:
Add an API to enable this:
- Add vim.treesitter.language.add() as a replacement for vim.treesitter.language.require_language().
- Optional arguments are now passed via an opts table.
- Also takes a filetype (or list of filetypes) so we can keep track of what filetypes are associated with which langs.
- Deprecated vim.treesitter.language.require_language().
- Add vim.treesitter.language.get_lang() which returns the associated lang for a given filetype.
- Add vim.treesitter.language.register() to associate filetypes to a lang without loading the parser.
Use the first, not last, query for a language on runtimepath. Typically,
this implies that a user query will override a site plugin query, which
will override a bundled runtime query.
Problem: Treesitter queries for a given language in runtime were merged together,
leading to errors if they targeted different parser versions (e.g., bundled viml queries
and those shipped by nvim-treesitter).
Solution: Runtime queries now work as follows:
* The last query in the rtp without `; extends` in the header will be used as the base query
* All queries (without a specific order) with `; extends` are concatenated with the base query
BREAKING CHANGE: queries need to be updated if they are meant to extend other queries
As part of the upstream of utility functions from nvim-treesitter, this
option when set to false allows to return a table (downstream behavior).
Effectively making the switch from the downstream to the upstream
function much easier.
Previously the `offset!` directive populated the metadata in such a way
that the new range could be attributed to a specific capture. #14046
made it so the directive simply stored just the new range in the
metadata and information about what capture the range is based from is
lost.
This change reverts that whilst also correcting the docs.
Based on https://github.com/neovim/neovim/pull/14445
This extends `vim.treesitter.query.get_node_text` to return the text
that spans a node's range even if start_row ~= end_row.
The official developer documentation in in :h dev-lua-doc specifies to
use "--@" for special/magic tokens. However, this format is not
consistent with EmmyLua notation (used by some Lua language servers) nor
with the C version of the magic docstring tokens which use three comment
characters.
Further, the code base is currently split between usage of "--@",
"---@", and "--- @". In an effort to remain consistent, change all Lua
magic tokens to use "---@" and update the developer documentation
accordingly.