Commit Graph

773 Commits

Author SHA1 Message Date
Mitchell Hashimoto
6ec2bfe288 renderer: kitty graphics prep can't fail (skip failed conversions) 2026-01-20 12:29:10 -08:00
Mitchell Hashimoto
d235c490e9 renderer: handle rebuildCells failures gracefully 2026-01-20 12:13:30 -08:00
Mitchell Hashimoto
db092ac3ce renderer: extract rebuildRow to its own function 2026-01-20 12:07:42 -08:00
Mitchell Hashimoto
e3c39ed502 renderer: cell.Contents.resize errdefer handling is now safe 2026-01-20 11:46:27 -08:00
Mitchell Hashimoto
7204f7ef9e renderer: remove palette_dirty, rely on terminal state 2026-01-20 11:34:53 -08:00
Mitchell Hashimoto
8786745969 renderer: don't access shared state for custom shader color palettes 2026-01-20 11:25:27 -08:00
ClearAspect
8d2eb280db custom shaders: add colorscheme information to shader uniforms
Adds palette and color scheme uniforms to custom shaders, allowing
custom shaders to access terminal color information:

  - iPalette[256]: Full 256-color terminal palette (RGB)
  - iBackgroundColor, iForegroundColor: Terminal colors (RGB)
  - iCursorColor, iCursorText: Cursor colors (RGB)
  - iSelectionBackgroundColor, iSelectionForegroundColor: Selection
colors (RGB)

Colors are normalized to [0.0, 1.0] range and update when the palette
changes via OSC sequences or configuration changes. The palette_dirty
flag tracks when colors need to be refreshed, initialized to true to
ensure correct colors on new surfaces.
2026-01-20 11:15:09 -08:00
shivaduke28
b0c868811d use underline instead of inverting colors for preedit text 2026-01-18 16:22:11 +09:00
Martin Emde
ec2612f9ce Add iTimeFocus shader uniform to track time since focus 2026-01-01 13:11:54 -08:00
Zongyuan Li
88e471e015 fix(iOS): fix iOS app startup failure
Fixes #7643

This commit address the issue with 3 minor fixes:
1. Initialize ghostty lib before app start, or global allocator will
   be null.
2. `addSublayer` should be called on CALayer object, which is the
   property 'layer' of UIView
3. According to apple's [document](https://developer.apple.com/documentation/metal/mtlstoragemode/managed?language=objc),
   managed storage mode is not supported by iOS. So always use shared
   mode.

FYI, another [fix](https://github.com/mitchellh/libxev/pull/204) in libxev
is also required to make iOS app work.
2025-12-26 18:53:45 +08:00
rezky_nightky
bf73f75304 chore: fixed some typo
Author: rezky_nightky <with dot rezky at gmail dot com>
Repository: ghostty
Branch: main
Signing: GPG (4B65AAC2)
HashAlgo: BLAKE3

[ Block Metadata ]
BlockHash: c37f4ee817412728a8058ba6087f5ca6aaff5a845560447d595d8055972d0eac
PrevHash: 3510917a780936278debe21786b7bae3a2162cb3857957314c3b8702e921b3d4
PatchHash: 5e5bb4ab35df304ea13c3d297c6d9a965156052c82bccf852b1f00b7bcaa7dd4

FilesChanged: 18
Lines: +92 / -92

Timestamp: 2025-12-25T17:27:08Z
Signature1: c1970dbb94600d1e24dfe8efcc00f001664db7b777902df9632a689b1d9d1498
Signature2: 30babb1e3ca07264931e067bfe36c676fb7988c2e06f8c54e0c9538fe7c7fc9a
2025-12-26 00:27:08 +07:00
Mitchell Hashimoto
594195963d Update src/renderer/Metal.zig
Co-authored-by: Jon Parise <jon@indelible.org>
2025-12-19 09:31:10 -08:00
Mitchell Hashimoto
07b47b87fa renderer/metal: clamp texture sizes to the maximum allowed by the device
This prevents a crash in our renderer when it is larger.

I will pair this with apprt changes so that our mac app won't ever allow
a default window larger than the screen but we should be resilient at
the renderer level as well.
2025-12-19 09:24:49 -08:00
Dominique Martinet
f37acdf6a0 gtk/opengl: print an error when OpenGL version is too old
#1123 added a warning when the OpenGL version is too old, but
it is never used because GTK enforces the version set with
gl_area.setRequiredVersion() before prepareContext() is called:
we end up with a generic "failed to make GL context" error:
warning(gtk_ghostty_surface): failed to make GL context current: Unable to create a GL context
warning(gtk_ghostty_surface): this error is almost always due to a library, driver, or GTK issue
warning(gtk_ghostty_surface): this is a common cause of this issue: https://ghostty.org/docs/help/gtk-opengl-context

This patch removes the requirement at the GTK level and lets the ghostty
renderer check, now failing as follow:
info(opengl): loaded OpenGL 4.2
error(opengl): OpenGL version is too old. Ghostty requires OpenGL 4.3
warning(gtk_ghostty_surface): failed to initialize surface err=error.OpenGLOutdated
warning(gtk_ghostty_surface): surface failed to initialize err=error.SurfaceError

(Note that this does not render a ghostty window, unlike the previous
error which rendered the "Unable to acquire an OpenGL context for
rendering." view, so while the error itself is easier to understand it
might be harder to view)
2025-12-16 13:30:37 -08:00
Mitchell Hashimoto
051e6543ff Decouple balanced top and left window paddings to avoid diagonal resize jitter (#9518)
I had a bit of the same annoyance as #9064. I agree that with balanced
padding, you should expect horizontal jitter when resizing horizontally,
and vertical jitter when resizing vertically. Diagonal jitter, however,
happened because the top padding was upper bounded by the left padding,
coupling the vertical and horizontal wobbling. This looks kind of janky,
and it's surprising that the balance between top and bottom padding
changes as you vary only the width of the window.

With this PR, the upper bound is instead equal to the maximum left
padding. Since this only depends on the config, not the current window
width, the diagonal wobbling is avoided. Both the top and left padding
still have the same range of motion as before.

An alternative could be to bound by the minimum or median left padding
instead. Open to suggestions.

**Before**


https://github.com/user-attachments/assets/d12c5870-f05d-450f-89fc-c59eab90e199

**After**


https://github.com/user-attachments/assets/35b80bb0-9ea2-41c1-8502-3a8eec51dbc6
2025-12-15 12:11:04 -08:00
Mitchell Hashimoto
a6ddf03a2e remove the macos-background-style config 2025-12-15 10:54:35 -08:00
Mitchell Hashimoto
bb23071166 config: change macos-background-style to be enums on background-blur 2025-12-15 10:42:21 -08:00
Justy Null
45aceace72 fix: disable renderer background when macOS effects are enabled 2025-12-15 10:12:11 -08:00
Mitchell Hashimoto
dbfc3eb679 Remove unused imports 2025-11-27 13:37:53 -08:00
Mitchell Hashimoto
9206b3dc9b renderer: manual selection should take priority over search matches
Previously it was impossible to select a search match. Well, it was
selecting but it wasn't showing that it was selected.
2025-11-26 10:28:28 -08:00
Mitchell Hashimoto
15f00a9cd1 renderer: setup proper dirty state on search selection changing 2025-11-26 08:50:04 -08:00
Mitchell Hashimoto
880db9fdd0 renderer: hook up search selection match highlighting 2025-11-25 11:05:38 -08:00
Daniel Wennberg
d31be89b16 fix(renderer): load linearized fg color for cursor cell 2025-11-24 21:04:54 -08:00
Mitchell Hashimoto
de16e4a92b config: add selection-foreground/background 2025-11-24 20:16:01 -08:00
Mitchell Hashimoto
a4e40c7567 set proper dirty state to redo viewport search 2025-11-24 19:55:27 -08:00
Mitchell Hashimoto
06981175af renderer: reset search dirty state after processing 2025-11-24 19:55:27 -08:00
Mitchell Hashimoto
dd9ed531ad render viewport matches 2025-11-24 19:55:27 -08:00
Mitchell Hashimoto
6c8ffb5fc1 renderer: receive message with viewport match selections
Doesn't draw yet
2025-11-24 19:55:27 -08:00
Mitchell Hashimoto
878ccd3f34 renderer: use proper cell style for cursor-color/text
Regression from render state work.
2025-11-24 19:52:21 -08:00
Mitchell Hashimoto
df466f3c73 renderer: make cursorStyle depend on RenderState
This makes `cursorStyle` utilize `RenderState` to determine the
appropriate cursor style. This moves the cursor style logic outside the
critical area, although it was cheap to begin with.

This always removes `viewport_is_bottom` which had no practical use.
2025-11-22 14:36:53 -08:00
Mitchell Hashimoto
82f5c1a13c renderer: clear renderstate memory periodically 2025-11-21 09:03:03 -08:00
Mitchell Hashimoto
3d56a3a02b font/shaper: remove old pre-renderstate logic 2025-11-20 22:00:44 -08:00
Mitchell Hashimoto
7728620ea8 terminal: render state dirty state 2025-11-20 22:00:44 -08:00
Mitchell Hashimoto
cd00a8a2ab renderer: handle normal non-osc8 links with new render state 2025-11-20 22:00:43 -08:00
Mitchell Hashimoto
fa26e9a384 terminal: OSC8 hyperlinks in render state 2025-11-20 22:00:43 -08:00
Mitchell Hashimoto
81142265aa terminal: renderstate stores pins 2025-11-20 22:00:43 -08:00
Mitchell Hashimoto
cc268694ed renderer: convert bg extend to new render state 2025-11-20 22:00:43 -08:00
Mitchell Hashimoto
d1e87c73fb terminal: renderstate now has terminal colors 2025-11-20 22:00:43 -08:00
Mitchell Hashimoto
ebc8bff8f1 renderer: switch to using render state 2025-11-20 22:00:43 -08:00
Mitchell Hashimoto
040d7794af renderer: build up render state, rebuild cells with it 2025-11-20 22:00:42 -08:00
Mitchell Hashimoto
410d79b151 Assorted Performance Enhancements (#9645)
A whole bunch of optimizations in hot paths in the IO processing areas
of our code (well, one of them covers everything). I validated that each
commit either improved one or more of our vtebench results, or improved
the time it takes to process 2 years worth (2.4GB) of data from
asciinema.

## vtebench
<img width="1278" height="903" alt="image"
src="https://github.com/user-attachments/assets/bad46777-4606-4870-b7d7-8df0c4bb3b39"
/>

(I decided to patch vtebench to report in nanoseconds instead of
milliseconds since clearly it was not designed for a machine as fast as
mine. Nanoseconds gives much more useful results when the numbers are
this low.)

Do note the *slight* regression in the "unicode" test, this is probably
because I added a branch hint in `Terminal.print` in order to optimize
for printing narrow characters, since they make up the vast majority of
characters typically printed in the terminal, but the vtebench "unicode"
test is pretty much all wide characters.

This shouldn't have a negative effect on users of CJK languages since
it's a *very* slight reduction in speed and they will still be printing
many narrow characters, especially in TUIs; spaces, box drawing
characters, symbols, punctuation, etc.

## asciinema processing
I wrote a program that uses libghostty to push 2 years worth (2.4GB) of
data from publicly uploaded asciinema recordings in to the terminal as
fast as possible- since it's just libghostty, there's no renderer
overhead happening, it's just the core terminal emulation, effectively
everything that io-reader thread does if it didn't have wait for the
renderer ever.

On main, this took roughly 26.1–26.7 seconds to process, on this branch
it takes just 18.4–18.6 seconds, that's a ~30% improvement in raw IO
processing speed when processing real world data!

## Summary of changes
In order of commits:
- Fixed a bug that I hit when trying to have Ghostty process all that
asciinema data, in certain bad cases it was possible to accidentally
insert the `0` hyperlink ID in to a page, which would then cause a
lockup in ReleaseFast mode when trying to clone that page since the
string alloc would try to iterate `1..0` to allocate 0 chunks.
- I noticed in profiling Ghostty that `std.debug.assert` was showing up
in the profile, which it should not have been since its doc comment
promises that it will be optimized out in ReleaseFast- but evidently
something is wrong with Zig, or that comment's promise is based on an
expectation from LLVM that it fails to meet - but either way, by
replacing all uses of `assert` with a version that is explicitly marked
`inline`, that function call overhead in tight loops and hotpaths is
avoided. This change alone accounts for like a third of the IO
processing time improvement, though it had minimal impact on vtebench
scores.
- I optimized the SGR parser somewhat by adding branch hints and
removing the `.reset_underline` action, replacing it with `.{ .underline
= .none }`.
- Gated a somewhat expensive assert in RefCountedSet behind a runtime
safety check.
- Improved the performance of `Style.eql` and `Style.hash` since these
are hot functions, called extremely frequently since adding styles to
the style set is a very common operation. Achieved this by making `eql`
less generic - explicitly comparing each part of the style rather than
looping over fields - and ordering checks from most likely to differ to
least likely to differ so that differences can be found as soon as
possible; and changed the hash from xxhash to simply folding the packed
struct down to 64 bits and then using `std.hash.int`. Also manually
inlined the code from `std.meta.activeTag` in `Packed.fromStyle`, since
profiling showed it in the callstack and it's a single cast so it really
should not have the function call overhead.
- Explicitly marked some trivial functions as inline, the optimizer
would already have been doing this (probably) but doing it explicitly
gives the optimizer more time to spend on other things. Added cold
branch hints to "should be impossible" and error-returning paths that
should be very rare, and unlikely branch hints to a lot of "invalid"
paths- to optimize for receiving valid data.
- Removed a branch in the parser csi param action, just unconditionally
multiply by 10 before adding digit value, even if it's the first digit.
This codepath is rarely hit since we have a fast path for this in the
stream code, but the stream code already has this optimization so I just
copied it over.
- `CharsetState.charsets` used to be an `EnumArray`, but the
layout/access logic for that was less-than-ideal, and the access
functions were not inlining-- and these are very hot since we access
this for every single print, so I wrote a bespoke struct to hold that
info instead, gained a couple percent of IO perf with that.
- Added branch hints based on the data I derived from the asciinema
dump, which gave big boost to vtebench results, especially for the
cursor movement and dense cells tests (which makes sense, since cursor
movement and setting attributes both got `likely` hints :p) -- data at
https://github.com/qwerasd205/asciinema-stats
- This is probably the most invasive change in this PR: I removed the
dirty bitset from `Page` and replaced it with a dirty flag on each row,
for the majority of operations this is faster to write, since the row
being dirtied is probably already loaded and probably will be written to
for other changes as well. This gave a couple percent IO processing
improvement. The only exception is scrolling-type operations, which are
extremely efficient by just moving rows around with a single memmov, so
looping through the rows to mark each dirty slows them down, and indeed
after this change the scrolling benchmarks in vtebench regressed,
*however*...
- Added a "full page dirty" flag on `Page`, which is set when an
operation is performed that dirties most or all the rows in the page,
which is used for scrolling-type operations. This *does* make the dirty
tracking slightly less precise for these operations, but with the
caching and stuff we do in the renderer, I don't think `rebuildCells` is
a bottleneck, so rebuilding a few extra rows shouldn't hurt. After this
change, all the scrolling benchmarks in vtebench improved drastically.
- Tiny micro-improvements to RefCountedSet; streamlined the control flow
in `lookup`, added an unlikely branch hint in `insert` for the branch
that resurrects dead items since dead items aren't that common.
- Improve SGR parser performance again by using `@call(.always_inline`
to explicitly inline calls to `StaticBitSet.isSet` (for the separator
list), since I noticed they weren't being inlined, causing function call
overhead in a hotpath.
- I noticed that `clearGrapheme` and `clearHyperlink` would check every
cell in the row after they were done in order to update the
`grapheme`/`hyperlink` flag on the row if there were none left, which
isn't great since `clearCells` called these functions for multiple cells
in the same row back-to-back, which leads to a ton of excess work. I
separated the flag updating parts of these functions out and called them
only if necessary (if the cells being cleared were the full row then the
flag could unconditionally be set to false) and only after all the cells
were cleared. This gave a nice improvement to IO processing since
clearCells is evidently a very hot function.
- Removed inline annotations on `Page.clearGrapheme` and
`Page.clearHyperlink` in favor of inlining directly at the one callsite
that benefited from inlining, this improved IO processing speed.
- Inlined trivial function `Charset.table`.
- Inlined `size.getOffset` and `size.intFromBase` as they are both
trivial pointer math that often benefits from surrounding context.

---

If you'd like me to separate out the trivial improvements (branch hints,
inline annotations, 1-line changes) from the functionality-changing ones
(pretty much just the changes to dirty tracking), just let me know!
2025-11-19 12:53:48 -10:00
Mitchell Hashimoto
8d8798bc79 renderer: minor log update, all commented 2025-11-19 06:21:53 -10:00
Qwerasd
81eda848cb perf: add full-page dirty flag
Avoids overhead of marking many rows dirty in functions that manipulate
row positions which dirties all or most of the page.
2025-11-18 20:46:00 -07:00
Qwerasd
30472c0077 perf: replace dirty bitset with a flag on each row
This is much faster for most operations since the row is often already
loaded when we have to mark it as dirty.
2025-11-18 20:46:00 -07:00
Qwerasd
6d5b4a3426 perf: replace std.debug.assert with inlined version
See doc comment in `quirks.zig` for reasoning
2025-11-17 12:13:56 -07:00
Qwerasd
00c2216fe1 style: add Offset.Slice.slice helper fn
Makes code that interacts with these so much cleaner
2025-11-16 12:39:10 -07:00
Mitchell Hashimoto
580f9f057b convert t.screen to t.screens.active 2025-11-14 15:40:31 -08:00
Mitchell Hashimoto
3aff5f0aff ScreenSet 2025-11-14 15:08:10 -08:00
Mitchell Hashimoto
368f4f565a terminal: Screen opts is a structure 2025-11-14 14:10:37 -08:00
Daniel Wennberg
9339ccf769 Decouple balanced top and left window paddings 2025-11-08 12:21:40 -08:00