mirror of
https://github.com/neovim/neovim.git
synced 2026-04-25 08:44:06 +00:00
feat(stdlib): overload vim.str_byteindex, vim.str_utfindex #30735
PROBLEM:
There are several limitations to vim.str_byteindex, vim.str_utfindex:

1. They throw when given out-of-range indexes. An invalid (often
   user/lsp-provided) index doesn't feel exceptional and should be handled by
   the caller. `:help dev-error-patterns` suggests that `retval, errmsg` is the
   preferred way to handle this kind of failure.
2. They cannot accept an encoding. So LSP needs wrapper functions. #25272
3. The current signatures are not extensible.
    * Calling: The function currently uses a fairly opaque boolean value to
      identify the encoding.
    * Returns: The fact it can throw requires wrapping in pcall.
4. The current name doesn't follow suggestions in `:h dev-naming` and I think
   `get` would be suitable.

SOLUTION:
- Because these are performance-sensitive, don't introduce `opts`.
- Introduce an "overload" that accepts `encoding:string` and
  `strict_indexing:bool` params.

```lua
local col = vim.str_utfindex(line, encoding, [index, [no_out_of_range]])
```

Support the old versions by dispatching on the type of argument 2, and
deprecate that form.

```lua
vim.str_utfindex(line)                             -- (utf-32 length, utf-16 length), deprecated
vim.str_utfindex(line, index)                      -- (utf-32 index, utf-16 index), deprecated
vim.str_utfindex(line, 'utf-16')                   -- utf-16 length
vim.str_utfindex(line, 'utf-16', index)            -- utf-16 index
vim.str_utfindex(line, 'utf-16', math.huge)        -- error: index out of range
vim.str_utfindex(line, 'utf-16', math.huge, false) -- utf-16 length
```
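The dispatch on the type of argument 2 can be sketched in plain Lua. This is only an illustration of the idea, not the code this commit adds: `count_units` and `str_utfindex_sketch` are hypothetical names, the input is assumed to be valid UTF-8 without embedded NUL bytes, and only `'utf-16'` and `'utf-32'` are handled.

```lua
-- Hypothetical helper (an assumption, not Neovim's implementation): count the
-- UTF-32 and UTF-16 units covered by the first `byte_index` bytes of `s`.
-- An index inside a multi-byte sequence is rounded up to the end of it.
local function count_units(s, byte_index)
  local utf32, utf16 = 0, 0
  local i = 1 -- 1-based position of the next lead byte
  while i <= byte_index do
    local b = s:byte(i)
    local len = (b >= 0xF0 and 4) or (b >= 0xE0 and 3) or (b >= 0xC0 and 2) or 1
    utf32 = utf32 + 1
    -- Code points encoded in 4 UTF-8 bytes need a surrogate pair in UTF-16.
    utf16 = utf16 + (len == 4 and 2 or 1)
    i = i + len
  end
  return utf32, utf16
end

-- Hypothetical stand-in for the overloaded function described above.
local function str_utfindex_sketch(s, encoding, index, strict_indexing)
  -- Legacy form: argument 2 is absent or is a number (the byte index).
  if encoding == nil or type(encoding) == 'number' then
    local old_index = encoding or #s
    if old_index < 0 or old_index > #s then
      error('index out of range')
    end
    return count_units(s, old_index) -- (utf-32 index, utf-16 index)
  end

  -- New form: argument 2 is the encoding name.
  if strict_indexing == nil then
    strict_indexing = true
  end
  index = index or #s
  if index < 0 or index > #s then
    if strict_indexing then
      error('index out of range')
    end
    index = math.max(0, math.min(index, #s)) -- clamp instead of throwing
  end

  local utf32, utf16 = count_units(s, index)
  if encoding == 'utf-16' then
    return utf16
  elseif encoding == 'utf-32' then
    return utf32
  end
  error('unsupported encoding in this sketch: ' .. tostring(encoding))
end

-- "a", U+1F600 (4 bytes in UTF-8, 2 units in UTF-16), "b": 6 bytes in total.
local line = 'a\240\159\152\128b'
local len32, len16 = str_utfindex_sketch(line)                  -- 3, 4
local idx16 = str_utfindex_sketch(line, 'utf-16', 5)            -- 3
local clamped = str_utfindex_sketch(line, 'utf-16', math.huge, false) -- 4
```

Checking `type(encoding)` keeps the legacy two-return form working while the string form returns a single value, which is why the old call shapes can be deprecated without breaking existing callers.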
@@ -112,18 +112,6 @@ function vim.rpcrequest(channel, method, ...) end
 --- equal, {a} is greater than {b} or {a} is lesser than {b}, respectively.
 function vim.stricmp(a, b) end
 
---- Convert UTF-32 or UTF-16 {index} to byte index. If {use_utf16} is not
---- supplied, it defaults to false (use UTF-32). Returns the byte index.
----
---- Invalid UTF-8 and NUL is treated like in |vim.str_utfindex()|.
---- An {index} in the middle of a UTF-16 sequence is rounded upwards to
---- the end of that sequence.
---- @param str string
---- @param index integer
---- @param use_utf16? boolean
---- @return integer
-function vim.str_byteindex(str, index, use_utf16) end
-
 --- Gets a list of the starting byte positions of each UTF-8 codepoint in the given string.
 ---
 --- Embedded NUL bytes are treated as terminating the string.
@@ -173,19 +161,6 @@ function vim.str_utf_start(str, index) end
 --- @return integer
 function vim.str_utf_end(str, index) end
 
---- Convert byte index to UTF-32 and UTF-16 indices. If {index} is not
---- supplied, the length of the string is used. All indices are zero-based.
----
---- Embedded NUL bytes are treated as terminating the string. Invalid UTF-8
---- bytes, and embedded surrogates are counted as one code point each. An
---- {index} in the middle of a UTF-8 sequence is rounded upwards to the end of
---- that sequence.
---- @param str string
---- @param index? integer
---- @return integer # UTF-32 index
---- @return integer # UTF-16 index
-function vim.str_utfindex(str, index) end
-
 --- The result is a String, which is the text {str} converted from
 --- encoding {from} to encoding {to}. When the conversion fails `nil` is
 --- returned. When some characters could not be converted they
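The rounding rule in the removed `vim.str_byteindex` annotation (an index in the middle of a UTF-16 sequence is rounded up to the end of that sequence) may be easier to see in a small plain-Lua sketch. `byteindex_sketch` is a hypothetical name, the input is assumed to be valid UTF-8 without embedded NUL bytes, and the code only mirrors the documented legacy behaviour, including throwing on an out-of-range index; it is not Neovim's implementation.

```lua
-- Hypothetical sketch (an assumption, not Neovim's code): convert a zero-based
-- UTF-32 or UTF-16 index into a zero-based byte index.
local function byteindex_sketch(s, index, use_utf16)
  local units = 0 -- code units consumed so far in the chosen encoding
  local i = 1     -- 1-based position of the next lead byte
  while i <= #s and units < index do
    local b = s:byte(i)
    local len = (b >= 0xF0 and 4) or (b >= 0xE0 and 3) or (b >= 0xC0 and 2) or 1
    -- A 4-byte UTF-8 sequence is a surrogate pair (2 units) in UTF-16. It is
    -- consumed whole, so an index pointing inside the pair is rounded up.
    units = units + ((use_utf16 and len == 4) and 2 or 1)
    i = i + len
  end
  if units < index then
    error('index out of range') -- the legacy functions throw here
  end
  return math.min(i - 1, #s)
end

-- "a", U+1F600 (4 bytes in UTF-8, 2 units in UTF-16), "b": 6 bytes in total.
local s = 'a\240\159\152\128b'
local mid_pair = byteindex_sketch(s, 2, true)   -- 5: index 2 falls inside the pair
local after_pair = byteindex_sketch(s, 3, true) -- 5: first index after the pair
local utf32_two = byteindex_sketch(s, 2)        -- 5: two code points cover 5 bytes
```

Wrapping such a call in `pcall` is what callers had to do to survive a bad index, which is the friction the PROBLEM list in the commit message describes.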