Fixes#10357
This updates `manualStyleUpdate` in `Screen` to perform a page split if
we get an `OutOfSpace` error. A page split will choose the half that has
less used capacity and split to that side, hoping we'll now have space
for adding our style. Previously, we'd just raise this error up and not
do any splitting.
Callers of `manualStyleUpdate` include `setAttribute` amongst many
others. So a significant part of libghostty is now resilient to
OutOfSpace.
This also updates `restoreCursor` to no longer ever fail. If style
updates fail for restore cursor, we revert back to a default (unstyled)
cursor. This is important because terminal programs that send a restore
cursor sequence have no way to understand the restoration failed, and
its more important that we restored as much as possible (position and so
on) then restore everything in it.
## Internals
This adds two new PageList operations:
* `split` - Splits a page at a given row, moving rows at and after the
split point to a new page. Used when a page runs out of capacity and we
need to split it to continue operations.
* `compact` - Compacts a page to use the minimum required memory.
Computes exact capacity needed for page contents and creates a smaller
page if meaningful savings are possible. Clones the data into it,
updating all metadata like tracked pins. **This isn't used yet! But it
is well tested.**
And a supporting Page operation:
* `Page.exactRowCapacity` - Computes the exact capacity required to
store a range of rows from a page by counting unique styles, hyperlinks,
grapheme bytes, and string bytes. This takes into account load factors
and so on.
## Weaknesses
* `manualStyleUpdate` only splits once. For maximum robustness we should
probably recursively split to try to make space until we're down to a 1
row page. I didn't want to do this without compaction implemented though
cause pathological cases could cause crazy memory blow-up.
* `split` duplicates the capacity, effectively doubling memory when it
happens (wasted capacity). I think `compact` calling should be done
somewhere, which is why I implemented it, but didn't want to integrate
too much in one PR so we can get some real world testing...
* `compact` can't go smaller than `std_size` due to the way PageList
works. We may want to consider making a MUCH smaller `std_size` and
leaning in to more non standard pages.
## AI Disclosure
I used Amp a lot to help in every aspect of this work. I reviewed every
line written including tests and did significant manual modification
too.
We need to have sane behavior in error handling because the running
program that sends the restore cursor command has no way to realize it
failed. So if our style fails to add (our only fail case) then we revert
to no style.
https://ampcode.com/threads/T-019bd7dc-cf0b-7439-ad2f-218b3406277a
Fixes#10369
When `resizeWithoutReflowGrowCols` copies rows to a previous page with
spare capacity, tracked pins pointing to those rows were not being
remapped. This left pins pointing to the original page which was
subsequently destroyed.
The fix adds pin remapping for rows copied to the previous page,
matching the existing remapping logic for rows copied to new pages.
I also added new integrity checks to verify that our tracked pins are
always valid at points where internal operations complete.
Thanks to @grishy for finding this!
**AI disclosure:** Amp used for verifying and fixing this bug. I
reviewed the results and just did minor manual tweaks.
https://ampcode.com/threads/T-019bd6d7-0645-73dd-8fd7-659f019fa83d and
https://ampcode.com/threads/T-019bd6c9-cc2e-73bc-bbaa-f8766e11c234
Fixes#10369
When `resizeWithoutReflowGrowCols` copies rows to a previous page with
spare capacity, tracked pins pointing to those rows were not being remapped.
This left pins pointing to the original page which was subsequently destroyed.
The fix adds pin remapping for rows copied to the previous page,
matching the existing remapping logic for rows copied to new pages.
I also added new integrity checks to verify that our tracked pins are
always valid at points where internal operations complete.
Thanks to @grishy for finding this!
This never caused any known issues, but it's a bug! `increaseCapacity`
should produce a node with identical contents, just more capacity. We
were forgetting to copy over the dirty flag.
I looked back at `adjustCapacity` and it also didn't preserve the dirty
flag so presumably downstream consumers have been handling this case
manually. But, I think semantically it makes sense for
`increaseCapacity` to preserve the dirty flag.
This bug was found by AI (while I was doing another task). I fixed it
and wrote the test by hand though.
We previously wrote our new cache file into a temporary directory and
the (atomically) renamed it to the canonical cache file path. This
rename operation unfortunately only works when both files are on the
same file system, and that's not always the case (e.g. when $TMPDIR is
on its own file system).
Instead, we can use Zig's AtomicFile to safely perform this operation
inside of the cache directory.
There's a new risk of a crash leaving the temporary file around in this
directory (and not getting cleaned up like $TMPDIR-based files), but the
probability is low and those files will only be readable by the creating
user (mode 0o600).
There's a new test cash that verifies the expected AtomicFile clean up
behavior. I also switched the file-oriented tests to use testing.tmpDir
rather than using our application-level TempDir type.
Fixes#10364
---
**AI Disclosure:** I asked Claude to write the initial test case to
verify the AtomicFile cleanup behavior.
This never caused any known issues, but it's a bug! `increaseCapacity`
should produce a node with identical contents, just more capacity. We
were forgetting to copy over the dirty flag.
I looked back at `adjustCapacity` and it also didn't preserve the dirty
flag so presumably downstream consumers have been handling this case
manually. But, I think semantically it makes sense for
`increaseCapacity` to preserve the dirty flag.
This bug was found by AI (while I was doing another task). I fixed it
and wrote the test by hand though.
We previously wrote our new cache file into a temporary directory and
the (atomically) renamed it to the canonical cache file path. This
rename operation unfortunately only works when both files are on the
same file system, and that's not always the case (e.g. when $TMPDIR is
on its own file system).
Instead, we can use Zig's AtomicFile to safely perform this operation
inside of the cache directory.
There's a new risk of a crash leaving the temporary file around in this
directory (and not getting cleaned up like $TMPDIR-based files), but the
probability is low and those files will only be readable by the creating
user (mode 0o600).
There's a new test cash that verifies the expected AtomicFile clean up
behavior. I also switched the file-oriented tests to use testing.tmpDir
rather than using our application-level TempDir type.
Fixes#10352
The bug was that non-standard pages would mix the old
`growRequiredForActive` check and make our active area insufficient in
the PageList.
But, since scrollbars now require we have a cached `total_rows` that our
safety checks always verify, we can remove the old linked list traversal
and switch to some simple math in general across all page sizes.
Fixes#10352
The bug was that non-standard pages would mix the old
`growRequiredForActive` check and make our active area insufficient in
the PageList.
But, since scrollbars now require we have a cached `total_rows` that our
safety checks always verify, we can remove the old linked list traversal
and switch to some simple math in general across all page sizes.
Fixes#10258
Replaces #10284
1. `Page.Capacity` now uses smaller bit-width integers that represent a
true maximum capacity for various fields.
2. On 64-bit systems, a maxed out `Page.Capacity` (every field `maxInt`)
can be represented in an addressable allocation (total required memory
less than 64 bits). This means `Page.layout` can't overflow.
3. All `adjustCapacity` functions replaced with `increaseCapacity` which
doesn't allow specifying the resulting value, which makes it so overflow
is only possible in significantly fewer places, making it easier to
handle in general.
4. `increaseCapacity` can return a new error `OutOfSpace` which happens
when overflow is detected. This means that no valid page can accommodate
the desired capacity increase because we're already at the max. The
caller is expected to handle this.
5. Updated our resize so that the only possible error is system OOM, we
handle the new `OutOfSpace` by copying the recent reflowed row into a
new page and continuing.
A very, very high-level overview is below. The "overflow" here papers
over a bunch of details where the prior usize capacities flowed through
to Page.layout and ultimately RefCountedSet and other managed types
which then caused incorrect calculations on total memory size required.
```mermaid
flowchart TB
subgraph Before["Before: adjustCapacity"]
A1[capacity: usize] --> A2["capacity *= 2"]
A2 --> A3{Overflow?}
A3 -->|"Not detected"| A4["Massive allocation or crash"]
end
subgraph After["After: increaseCapacity"]
B1["capacity: bounded int<br/>(u16/u32)"] --> B2["capacity *= 2"]
B2 --> B3{Overflow?}
B3 -->|"OutOfSpace error"| B4["Graceful handling:<br/>move row to new page"]
B3 -->|"Success"| B5["Normal allocation"]
end
Before --> After
classDef beforeStyle fill:#3d1a1a,stroke:#ff6b6b,color:#ff6b6b
classDef afterStyle fill:#1a3d3a,stroke:#4ecdc4,color:#4ecdc4
class A1,A2,A3,A4 beforeStyle
class B1,B2,B3,B4,B5 afterStyle
```