This fixes the source of a deadlock that some tip users have hit. If our
surface mailbox is full and there is a dirty scrollbar state, then
drawFrame would block forever trying to queue to the surface mailbox.
We now fail instantly if the queue is full and keep the scrollbar state
dirty. We can try again on the next frame, it's not a critical thing to
get updated.
Fixes#2731 (again)
This regressed in 1.2 due to the renderer rework missing porting this. I
believe this issue is still valid even with the rework since the font
grid changes the atlas and if there are still cached cells that
reference the old atlas coordinates it will produce garbage.
Related to #111
This adds the necessary logic and data for the `PageList` data structure
to keep track of **total length** of the screen, **offset** into the
viewport, and **length** of the viewport. These three values are
necessary to _render_ a scrollbar. This PR updates the renderer to grab
this information but stops short of actually drawing a scrollbar (which
we'll do with native UI), in the interest of having a PR that doesn't
contain too many changes.
**This doesn't yet draw a scrollbar, these are just the internal changes
necessary to support it.**
## Background
The `PageList` structure is very core to how we represent terminal
state. It maintains a doubly linked list of "pages" (not literally
virtual memory pages, but close). Each page stores cell information,
styles, hyperlinks, etc fully self-contained in a contiguous sets of VM
pages using offset addresses rather than full pointers. **Pages are not
guaranteed to be equal sizes.** (This is where scrollbars get difficult)
Because it is a linked list structure of non-equal sized nodes, it isn't
amenable to typical scrollbar behavior. A scrollbar needs to know: full
size, offset, and length in order to draw the scrollbar properly.
Getting these values naively is `O(N)` within the data structure that is
on the hottest IO performance path in all of Ghostty.
## Implementation
### PageList
We now maintain two cached values for **total length** and **viewport
offset**.
The total length is relatively straightforward, we just have to be
careful to update it in every operation that could add or remove rows.
I've done this and ensured that every place we update it is covered with
unit test coverage.
The viewport offset is nasty, but I came up with what I believe is a
good solution. The viewport when arbitrarily scrolled is defined as a
direct pointer to the linked list node plus a row offset into that node.
The only way to calculate offset from the top is `O(N)`.
But we have a couple shortcuts:
1. If the viewport is at the bottom (most common) or top, calculating
the offset is `O(1)`: bottom is `total_rows - active_rows`, both readily
available. And top is `0` by definition.
2. Operations on the PageList typically add or remove rows. We don't do
arbitrary linked list surgery. If we instrument those areas with delta
updates to our cache, we can avoid the `O(N)` cost for most operations,
including scrolling a scrollbar. The only expensive operation is a full,
arbitrary jump (new node pointer).
Point 1 was quick to implement, so I focused all the complexity on point
2. Whenever we have an operation that adds or removes rows (for example
pruning the scroll back, adding more, erase rows within the active area,
etc.) then I do the math to calculate the delta change required for the
offset if we've already calculated it, and apply that directly.
### Renderer
The other issue was how to notify the apprts of scrollbar state. Sending
messages on any terminal change within the IO thread is a non-option
because (1) sending messages is slow (2) the terminal changes a lot and
(3) any slowness in the IO thread slows down overall terminal
throughput.
The solution was to **trigger scrollbar notifications with the renderer
vsync**. We read the scrollbar information when we render a frame,
compare it to renderer previous state, and if the scrollbar changed,
send a message to the apprt _after the frame is GPU-renderer_.
The renderer spends _most_ of its time sleeping compared to the IO
thread, and has more opportunities for optimizing its awake time.
Additionally, there's no reason to update the scrollbar information if
the renderer hasn't rendered the new frames because the user can't even
see the stuff the scrollbar wants to scroll to. We're talking about
millisecond scale stuff here at worst but it adds up.
## Performance
No noticeable performance impact for the additional metrics:
<img width="1012" height="738" alt="image"
src="https://github.com/user-attachments/assets/4ed0a3e8-6d76-40c1-b249-e34041c2f6fd"
/>
## AI Usage
I used Amp to help audit the codebase and write tests. I wrote all the
main implementation code manually. I came up with the main design
myself. Relevant threads:
-
https://ampcode.com/threads/T-95fff686-75bb-4553-a2fb-e41fe4cd4b77#message-0-block-0
-
https://ampcode.com/threads/T-48e9a288-b280-4eec-83b7-ca73d029b4ef#message-91-block-0
## Future
This is just the internal changes necessary to _draw_ a scrollbar. There
will be other changes we'll need to add to handle grabbing and actually
jumping the scrollbar. I have a good idea of how to implement those
performantly as well.
Changes it so that the renderer retains its own MemoryPool for PageList
pages so that new pages rarely need to be allocated when cloning the
screen. Also switches to using an arena allocator in `updateFrame` to
avoid having to deinit the cloned screen since instead we can just throw
out the memory.
Powerline glyphs were treated as whitespace, giving the preceding cell a
constraint width of 2 and cutting off icons in people's prompts and
statuslines. It is however correct to not treat Powerline glyphs like
other Nerd Font symbols; they should simply be treated as normal
characters, just like their relatives in the block elements unicode
block.
This resolves
https://discord.com/channels/1005603569187160125/1417236683266592798
(never promoted to an issue, but real and easy to reproduce).
**Tip**
<img width="215" height="63" alt="Screenshot 2025-09-21 at 16 57 58"
src="https://github.com/user-attachments/assets/81e770c5-d688-4d8e-839c-1f4288703c06"
/>
**This PR**
<img width="215" height="63" alt="Screenshot 2025-09-21 at 16 58 42"
src="https://github.com/user-attachments/assets/5d2dd770-0314-46f6-99b5-237a0933998e"
/>
The constraint width logic was untested but contains some quite subtle
interactions, so I wrote a suite of tests covering the cases I'm aware
of.
While working on this code I also resolved a TODO comment to add all the
box drawing/block element type characters to the set of codepoints
excluded from the minimum contrast settings.
e0d6 and e0d7 were left out.
Also collapsed everything to a single range; unlikely that the unused
gaps (e0c9, e0cb, e0d3, e0d5) would be used for something else in any
font that ships Powerline glyphs.
...not whitespace. Powerline glyphs can be considered an extension of
the Block Elements unicode block, which is neither whitespace nor
symbols (icons).
This ensures that characters immediately followed by a powerline glyph
are constrained to a single cell (unlike the current behavior where a PL
glyph is considered whitespace), while symbols (icons) immediately
preceded by a powerline glyph are not (unlike if a PL glyph were
considered a symbol). This resolves
https://discord.com/channels/1005603569187160125/1417236683266592798
The GLSL to MSL conversion process uses a passed-in sampler state for
the `iChannel0` parameter and we weren't providing it. This magically
worked on Apple Silicon for unknown reasons but failed on Intel GPUs.
In normal, hand-written MSL, we'd explicitly create the sampler state as
a normal variable (we do this in `shaders.metal` already!), but the
Shadertoy conversion stuff doesn't do this, probably because the exact
sampler parameters can't be safely known.
This fixes a Metal validation error when using custom shaders:
```
-[MTLDebugRenderCommandEncoder validateCommonDrawErrors:]:5970: failed
assertion `Draw Errors Validation Fragment Function(main0): missing Sampler
binding at index 0 for iChannel0Smplr[0].
```
This fixes a Metal validation error in Xcode when using custom shaders.
I suspect this is one part of custom shaders not working properly on
Intel macs (probably anything with a discrete GPU).
This happens to work on Apple Silicon but this is undefined behavior and
we're just getting lucky.
There is one more issue I'm chasing down that I think is also still
blocking custom shaders working on Intel macs.
When processing kitty images in a loop in a few places we were returning
under certain conditions where we should instead have just continued the
loop. This caused serious problems for kitty images, especially for apps
that used multiple images on screen at once.
... I have no clue how I originally wrote this code and didn't see such
a trivial mistake, I think I was sleep deprived or something.
This pull request adds the `--faint-opacity` option, as discussed in
#7637.
The default value of the option is also changed from `0.68` to `0.5` for
greater consistency with other popular terminal emulators.
This math was incorrect from the start, the previous fix helped OpenGL
but broke positioning under Metal; this commit fixes the math to be
correct under both backends and adds comments explaining exactly what's
going on.
Fix for discussion #8113
The cursor Y position value exposed to the shader uniforms was
incorrectly calculated. As per the doc in cell_text.v.glsl: In order to
get the top left of the glyph, we compute an offset based on the
bearings. The Y bearing is the distance from the bottom of the cell to
the top of the glyph, so we subtract it from the cell height to get the
y offset.
This calculation was mistakenly left out of the original code.
This will ensure that the custom shaders using
iCurrentCursor/iPreviousCursor get the correct Y coordinate representing
the top-left corner of the cursor rectangle, matching the documented
uniform behavior
Fixes#8243
This adds a check for a zero-sized grid in cursor-related functions.
As an alternate approach, I did look into simply skipping a bunch of
work on zero-sized grids, but that looked like a scarier change to make
now. That may be the better long-term solution but this was an easily
unit testable, focused fix on the crash to start.
This was a memory leak under Metal, leaked 1 swapchain worth of targets
every time a surface was closed.
Under OpenGL I think it was all cleaned up when the GL context was
destroyed.