These were actually hurting performance lol, except in the places where
I added the `.always_inline` calls- for some reason if these functions
aren't inlined there it really messes up the top region scrolling
benchmark in vtebench and I'm not entirely certain why...
This improves the `clearCells` function since it only has to update once
after clearing all of the individual cells, or not at all if the whole
row was cleared since then it knows for sure that it cleared them all.
This also makes it so that the row style flag is properly tracked when
cells are cleared but not the whole row.
This could cause a 0-length hyperlink to be present in the screen,
which, in ReleaseFast, causes a lockup as the string alloc tries to
iterate `1..0` to allocate 0 chunks.
I encountered these bugs while trying to benchmark Ghostty for
performance work.
- Tmux control mode parsing would start accessing deallocated memory
after entering the "broken" state, and worse yet, would cause a
double-free once it was supposed to be deinited.
- Despite our best efforts, CoreText can still produce non-monotonic
(non-ltr) runs. Our renderer code relies on monotonic ltr ordering so in
the rare case where this happens we just sort the buffer before
returning it.
- C1 (8-bit) controls can be executed in certain parser states, so we
need to handle them in the stream's `execute` function. Luckily this was
pretty straightforward since all C1 controls are equivalent to `ESC`
followed by `C1 - 0x40`.
- `Terminal.Screen`'s `cursorScrollDown` function could cause memory
corruption because of `eraseRow` moving the cursor's tracked pin to a
different page. In fixing this, I actually reduced the complexity of
that codepath.
- **Bonus!** Added a nice helper function to `Offset.Slice` so that you
can just do `offset_slice.slice()` instead of
`offset_slice.offset.ptr(base)[0..offset_slice.len]`. Much more
readable.
### `vtebench` before/after
<img width="984" height="691" alt="image"
src="https://github.com/user-attachments/assets/ef20dcc5-d611-4763-9107-355d715a6c0b"
/>
Doesn't seem like any of these changes caused a performance regression.
It was previously possible for `eraseRow` to move the cursor pin to a
different page, and then the call to `cursorChangePin` would try to free
the cursor style from that page even though that's not the page it
belongs to, which creates memory corruption in release modes and
integrity violations or assertions in debug mode.
As a bonus, this should actually be faster this way than the old code,
since it avoids needless work that `cursorChangePin` otherwise does.
These can be unambiguously invoked in certain parser states, and as such
we need to handle them. In real world use they are extremely rare, hence
the branch hint. Without this, we get illegal behavior by trying to cast
the value to the 7-bit C0 enum.
Progress towards #189
## Search Thread
This completes an initial implementation of the search thread. This is a
separate thread that manages a terminal search operation, triggering
event callbacks for various scenarios. The performance goal of the
search thread is to minimize time spent within the critical area of the
terminal lock while making forward progress on search outside of that.
The search thread sends two messages back to the caller right now:
- Total matches: sent continuously as new matches are found, just a
number
- Viewport matches: a list of `Selection` structures spanning the
current viewport of matches within the visible viewport. Sent whenever
the viewport changes (location or content).
I think that's enough to build a rudimentary search UI.
For this initial implementation, the search also relies on a "refresh
timer" which trigger every 24ms (40 FPS) to grab the lock and look for
any reconciliation that needs to happen: viewport moved, active screen
changed, active area changed, etc. This is a total guess and is
arbitrary currently. The value should be tuned to balance responsiveness
and IO throughput (lock-holding). I actually suspect this may be too
frequent right now.
A TODO is noted for the future to pause the refresh timer when the
terminal being search isn't focused. We'll have to do that before
shipping search because the way its built right now we will definitely
consume unnecessary CPU while unfocused. But only while search is
active.
## `ViewportSearch`
I also determined the need for what is called the `ViewportSearch`
layer. This is similar to `ActiveSearch` in that it throws away and
re-searches an area, but is tuned towards efficiently detecting viewport
changes. I found its more efficient to continuously research the visible
viewport than to hunt for those matches within the cached ScreenSearch
results, which can be very large.
## Future
Next up we need to hook up the search thread to some keybindings to
start and stop the search so we can then trigger those from our apprts
(GUIs).
I highly suspect this is going to expose some major performance issues
(overly active message sending being the likely culprit) that we can fix
up in the search thread thereafter. Up until this point we can only run
this stuff in isolation in unit tests which is good for testing
correctness but difficult for testing resource usage.
There are some written TODOs in the Thread.zig file in this PR. I may
address some of them before merging, since I think a couple are pretty
obvious performance gotchas.
This PR changes our primary/alt screen tracking from being by-value
fields that are copied during switch to heap-allocated pointers that are
stable. This is motivated now by #189 since our search thread needs a
stable screen pointer, but this will also help us in the future with our
future N-screens proposal I have.
Also, as a nice bonus, alt screen memory is now initialized on demand
when you first enter alt-screen, so this saves a few MB of memory for a
new terminal if you never use alt screen!
This is something I've wanted to do for a veryyyyyy long time, but the
annoyance of the task really held me back. I finally pushed through and
did this with the help of some AI agents for the rote tasks (renaming
stuff).