Fixes#8991
Uses OSC 133 esc sequences to keep track of how long commands take to
execute. If the user chooses, commands that take longer than a user
specified limit will trigger a notification. The user can choose between
a bell notification or a desktop notification.
Signal handlers are connected to surface objects in two spots - when a
tab is added to a page and when the split tree changes. This resulted in
duplicate signal handlers being added for each surface. This was most
noticeable when copying the selection to the clipboard - you would see
two "Copied to clipboard" toasts. Ensure that there is only one signal
handler by removing any old ones before adding the new ones.
You can pretty simply reproduce a crash on `main` in `Debug` mode by
running `printf "مرحبًا \n"` with your primary font set to one that
supports Arabic such as Cascadia Code/Mono or Kawkab Mono, which will
cause CoreText to output the shaped glyphs non-monotonically which hits
the assert we have in the renderer.
In `ReleaseFast` this assert is skipped and because we already moved
ahead to the space glyph (which belongs at the end but is emitted first)
all of the glyphs up to that point are lost. I believe this is probably
the cause of #8280, I tested and this change seems to fix it at least.
Included in this PR is a little optimization: we were allocating buffers
to copy glyphs etc. from runs to every time, even though CoreText
provides `CTRunGet*Ptr` functions which get *pointers* to the internal
storage of these values- these aren't guaranteed to return a usable
pointer but in that case we can always fall back to allocating again.
Also avoided allocation while processing glyphs by ensuring capacity
beforehand immediately after creating the `CTLine`.
The performance impact of this PR is negligible on my machine and
actually seems to be positive, probably due to avoiding allocations if I
had to guess.
The solution we had before worked in most cases but there were some
which caused problems still. This is what HarfBuzz's CoreText shaper
backend does, it uses a CTTypesetter with the forced embedding level
attribute. This fixes the failure case I found that was causing non-
monotonic outputs which can have all sorts of unexpected results, and
causes a crash in Debug modes because we assert the monotonicity while
rendering.
Reduce potential allocation while processing glyphs by ensuring capacity
in the buffer ahead of time and also using CTRunGet*Ptr functions first
and only allocating for those if that didn't work (it should almost
always work in practice.)
We never used it because our minidump files on Linux didn't contain
meaningful information. With Zig's Writergate, let's drop this and
rewrite it later, we can always resurrect it from the git history.
Rejoice @pluiedev
We never used it because our minidump files on Linux didn't contain
meaningful information. With Zig's Writergate, let's drop this and
rewrite it later, we can always resurrect it from the git history.
Fixes#8991
Uses OSC 133 esc sequences to keep track of how long commands take to
execute. If the user chooses, commands that take longer than a user
specified limit will trigger a notification. The user can choose between
a bell notification or a desktop notification.
#8829 fixed the interaction of Powerline glyphs with other symbols, but
regressed in the handling of the Powerline glyphs themselves by letting
them get caught in an early exit that imposes a constraint width of 1.
This PR fixes the regression and adds corresponding tests. Tried to be
somewhat principled about why the special treatment is warranted, hence
the new helper function `isGraphicsElement`.
**Before**
<img width="270" height="44" alt="Screenshot 2025-10-02 at 00 16 54"
src="https://github.com/user-attachments/assets/9e975434-114c-44d5-a4ed-ac6a954b9d00"
/>
**After**
<img width="270" height="44" alt="Screenshot 2025-10-02 at 00 16 11"
src="https://github.com/user-attachments/assets/20545e74-c9f9-4a6b-9bf0-a7cf1d38c3a0"
/>
Zig 0.15 removed the ability to compress from the stdlib, which makes
porting our framegen tool to Zig 0.15+ more work than it's worth. We
already depend on and have the ability to build zlib, and Zig is a full
blown C compiler, so let's just use C.
The framegen C program doesn't free any memory, because it is meant to
exit quickly. It otherwise behaves pretty much the same as the old Zig
codebase.
The build scripts were modified to build the C program and run it, but
also to include the framedata in the generated source tarball so that
downstream packagers don't have to do this (although they'll have all
the deps anyways).
**AI disclosure:** The Zig to C conversion was done by Amp. I know C and
reviewed the code. The build system improvements are partially done by
Amp as well but I finished it all out. I've reviewed everything. Full
thread:
https://ampcode.com/threads/T-4e0a80af-3d2a-48d6-bf59-3e2f05eb7673
Zig 0.15 removed the ability to compress from the stdlib, which makes
porting our framegen tool to Zig 0.15+ more work than it's worth. We
already depend on and have the ability to build zlib, and Zig is a full
blown C compiler, so let's just use C.
The framegen C program doesn't free any memory, because it is meant to
exit quickly. It otherwise behaves pretty much the same as the old Zig
codebase.
The build scripts were modified to build the C program and run it, but
also to include the framedata in the generated source tarball so that
downstream packagers don't have to do this (although they'll have all
the deps anyways).
This fixes the lazyImport importing outside of the build root. The build
root for the build binary is always the root `build.zig` (which we
want), but our `src/build_config.zig` transitively imported SharedDeps
which led to issues with Zig 0.15.
This back ports to main early (outside our Zig 0.15 PR) because its
compatible and will simplify that one.
This fixes the lazyImport importing outside of the build root. The build
root for the build binary is always the root `build.zig` (which we
want), but our `src/build_config.zig` transitively imported SharedDeps
which led to issues.
Since we now use uucode, we don't need ziglyph anymore. Ziglyph was kept
around as a test-only dep so we can verify matching but this is
complicating our Zig 0.15 upgrade because ziglyph doesn't support Zig
0.15. Let's just drop it.
cc @jacobsandlund
Since we now use uucode, we don't need ziglyph anymore. Ziglyph was kept
around as a test-only dep so we can verify matching but this is
complicating our Zig 0.15 upgrade because ziglyph doesn't support Zig
0.15. Let's just drop it.
I used the new CPU counter mode in Instruments.app to track down
functions that had instruction delivery bottlenecks (indicating i-cache
misses) and picked a bunch of trivial functions to mark as inline (plus
a couple that are only used once or twice and which benefit from
inlining).
The size of `macos-arm64/libghostty-fat.a` built with `zig build
-Doptimize=ReleaseFast -Dxcframework-target=native` goes from
`145,538,856` bytes on `main` to `145,595,952` on this branch, a
negligible increase.
These changes resulted in some pretty sizable improvements in vtebench
results on my machine (Apple M3 Max):
<img width="983" height="696" alt="image"
src="https://github.com/user-attachments/assets/cac595ca-7616-48ed-983c-208c2ca2023f"
/>
With this, the only vtebench test we're slower than Alacritty in (on my
machine, at 130x51 window size) is `dense_cells` (which, IMO, is so
artificial that optimizing for it might actually negatively impact real
world performance).
I also did a pretty simple improvement to how we copy the screen in the
renderer, gave it its own page pool for less memory churn. Further
optimization in that area should be explored since in some scenarios it
seems like as much as 35% of the time on the `io-reader` thread is spent
waiting for the lock.
> [!NOTE]
> Before this is merged, someone really ought to test this on an x86
processor to see how the performance compares there, since this *is*
tuning for my processor specifically, and I know that M chips have
pretty big i-cache compared to some x86 processors which could impact
the performance characteristics of these changes.
A whole bunch of inline annotations, some of these were tracked down
with Instruments.app, others are guesses / just seemed right because
they were trivial wrapper functions.
Regardless, these changes are ultimately supported by improved vtebench
results on my machine (Apple M3 Max).
Changes it so that the renderer retains its own MemoryPool for PageList
pages so that new pages rarely need to be allocated when cloning the
screen. Also switches to using an arena allocator in `updateFrame` to
avoid having to deinit the cloned screen since instead we can just throw
out the memory.