fixes #25800
closes https://github.com/nim-lang/Nim/pull/25807
ref https://github.com/nim-lang/Nim/issues/25800
This pull request improves the handling of move semantics and the
`=wasMoved` hook in the Nim compiler, especially for C++ code generation
and user-defined types. It refactors the move operation logic to better
support custom hooks, adds new tests for edge cases, and ensures that
the `move` operation is safer and more predictable.
**Move semantics and `=wasMoved` handling:**
* Refactored the move operation in `compiler/ccgexprs.nim` by
introducing helper procs (`canGenMoveCall`, `genMoveCall`,
`genWasMovedCall`, `genMoveWithWasMoved`) to better handle cases with
user-defined `=wasMoved` hooks, especially for generics and C++ interop.
The logic now distinguishes between simple assignments and when to call
custom hooks, improving correctness and maintainability.
[[1]](diffhunk://#diff-4509107d295d7d32b1887c8993cd0f56113ae60f36113e7d8778646dabd92ebcL2818-R2851)
[[2]](diffhunk://#diff-4509107d295d7d32b1887c8993cd0f56113ae60f36113e7d8778646dabd92ebcL2841-R2882)
* Updated the `move` proc in `lib/system.nim` to include the `nodestroy`
pragma, preventing double destruction and making move semantics safer.
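
A minimal sketch of what a user-defined `=wasMoved` hook does when `move` is called, assuming Nim 2.x with ARC/ORC. The `Token` type and `demo` proc are hypothetical names for illustration and are not taken from the PR's test files.

```nim
# Hypothetical type for illustration only.
type
  Token = object
    id: int

# User-defined hook: mark a moved-from value with a sentinel id.
proc `=wasMoved`(t: var Token) =
  t.id = -1

proc demo(): tuple[src, dst: int] =
  var a = Token(id: 999)
  var b = move(a)   # `a` is reset through the custom hook above
  result = (src: a.id, dst: b.id)
```

After the move, the source is expected to carry the sentinel written by the hook while the destination keeps the original value.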
**Testing and validation:**
* Added a new test (`tests/ccgbugs2/t25800.nim`) to ensure that
user-defined `=wasMoved` hooks with `{.importcpp.}` are correctly
generated and invoked in C++ code, addressing a specific bug with
invalid preprocessor directives.
* Expanded `tests/destructor/twasmoved.nim` with additional test cases
for objects with and without custom `=wasMoved` hooks, including
multithreaded scenarios using `threadpool`, to verify correct behavior
in a variety of contexts.
**Minor cleanup:**
* Added a blank line for code style consistency in
`compiler/semmagic.nim`.
Fixes #18583.
## Problem
Several stdlib collection types compute the separator for `$` using
`result.len > 1`, where `result` starts as the opening bracket (`"["` or
`"{"`). This breaks when a collection element type has a `$` that
returns an empty string: `result.len` stays at 1 after the first item
contributes nothing, so the separator is never inserted for subsequent
items.
```nim
import std/deques
type Test = object
proc `$`(x: Test): string = ""
echo [Test(), Test()].toDeque # prints [] — expected [, ]
```
## Fix
Replace the length check with an explicit `first` flag in all affected
modules: `deques`, `heapqueue`, `lists`, `critbits`, and `strtabs`.
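
The `first` flag approach can be sketched as follows. This is an illustrative stand-in rather than the stdlib code: `render` and `Empty` are hypothetical names, and the real `$` procs use their own item formatting.

```nim
type Empty = object
proc `$`(x: Empty): string = ""

proc render[T](items: openArray[T]): string =
  ## Joins items with ", " using an explicit flag instead of
  ## inspecting the length of the partially built result.
  result = "["
  var first = true
  for x in items:
    if first:
      first = false
    else:
      result.add(", ")
    result.add($x)
  result.add("]")
```

Because the separator decision no longer depends on `result.len`, elements that stringify to `""` still get their separators.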
## Tests
Regression tests added to `tdeques`, `theapqueue`, and `tlists` using a
local type whose `$` returns `""`. All three test files pass with `nim c
-r`.
## Notes
I work with Claude as a co-processor. I'm 56, came to programming late,
and this is genuinely how I learn and contribute. I understand what I'm
submitting, but I didn't write it alone. If your project prefers
human-only contributions, just say so and I'll close without friction.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
## Summary
- Fixes `std/pegs` lexing for bare UTF-8 terminals such as `\i café`.
- The lexer previously stopped at the first non-ASCII byte, so
`pkTerminalIgnoreCase` never saw the full term despite its rune-aware
`fastRuneAt`/`toLower` matching.
- This now keeps non-ASCII bytes in identifier-style terminals while
ASCII non-ident characters still terminate the symbol.
## Behavior
Before: `match("CAFÉ", peg"\i café")` failed because the terminal was
lexed as `caf`.
After: `match("CAFÉ", peg"\i café")`, `match("Café", peg"\i café")`, and
`findAll` over mixed-case occurrences pass.
`std/pegs` documents `useUnicode = true` as proper UTF-8 support, and
quoted terminals already preserved the same bytes; this makes bare
terminals consistent with that path.
I did not find an existing relevant issue or PR in searches for
pegs/unicode/utf8/getSymbol/pkTerminalIgnoreCase.
Two one-line typo fixes for duplicated "to" in `lib/system/alloc.nim`:
- "# set 'used' to to true:" → "# set 'used' to true:" (occurs twice,
lines ~694 and ~711)
No code/behavior change.
Co-authored-by: Aiden Park <275402320+vip892766gma@users.noreply.github.com>
Two one-line typo fixes for duplicated words:
- `doc/manual.md` — "if the the type was marked as `bycopy`" → "if the
type was marked as `bycopy`"
- `lib/system/gc_common.nim` — "## thread stack is is returned." → "##
thread stack is returned."
No code/behavior change.
Co-authored-by: Mira Sato <275437409+oab24413gmai@users.noreply.github.com>
`setLenUninit(string)` was broken on the legacy refc backend when
growing within existing spare capacity.
`setLengthStrUninit` in `lib/system/sysstr.nim` only updated len when it
had to reallocate or when shrinking.
If `oldLen < newLen <= capacity`, it returned early without updating the length:
```nim
var s = newStringOfCap(10)
s.add("abc")
s.setLenUninit(6)
doAssert s.len == 6 # used to fail, len stayed 3
```
This escaped `tests/stdlib/tstring.nim` because the testing routine
`checkSetLenUninit` mostly resizes strings created at **exact**
length/capacity, so growth usually took the reallocating branch.
The new regression test covers the missing edge case.
So sorry for catching this only on the day of the stable release! In my
defense, the original PR hung in limbo for quite a while and it didn't
spend enough time in devel after the merge.
This makes it easier to run Nix-built containers for Nim programs. By
default Nim doesn't search environment variables for SSL certs, so it's
a little annoying having to move files around.
-
10e7ad5bbc/pkgs/by-name/ca/cacert/package.nix (L85)
This pull request fixes a typo in the `getContentType` function in
`lib/pure/cgi.nim`, ensuring it retrieves the correct `CONTENT_TYPE`
environment variable.
> Exact spelling matters: It is CONTENT_TYPE, not CONTENT_Type or
Content-Type. Environment variables in CGI are case-sensitive.
This PR makes `containsOrIncl` faster when the number of elements is less than 34.
I used the following code to compare the speed of the `containsOrIncl` proc.
It calls `isRecursiveStructuralType` proc defined in compiler/types.nim
that calls `containsOrIncl` with `IntSet`(= `PackedSet[int]`).
```nim
import std/[tables, monotimes, times, strformat]
import "$nim"/compiler/[astdef, ast, idents, types]
var idgen = IdGenerator(module: 0, symId: 0, typeId: 0, disambTable: initCountTable[PIdent]())
proc newType(kind: TTypeKind; son: sink PType = nil): PType =
result = newType(kind, idgen, nil, son)
proc genNoRecursPType(len: int): PType =
assert len > 1
let intTyp = newType(tyInt)
result = newType(tyRef, intTyp)
for i in 0..<(len - 2):
result = newType(tyRef, result)
proc test =
var noRecursPType = genNoRecursPType(4)
assert not isRecursiveStructuralType(noRecursPType)
test()
template measure(label: string; body: untyped): untyped =
let
loop = 2000
sampling = 200
block:
var r {.inject.} = false
var minT = initDuration(hours = 1)
for i in 0 ..< sampling:
let start = getMonoTime()
for j in 0 ..< loop:
body
let finish = getMonoTime()
minT = min(finish - start, minT)
echo ($r)[0], ' ', label, minT div loop
proc benchNoRecurs(len: int) =
echo fmt"No recursive: length: {len}"
var noRecursPType = genNoRecursPType(len)
measure("IntSet: "):
r = isRecursiveStructuralType(noRecursPType)
proc bench =
benchNoRecurs(30)
bench()
```
Output before changing code:
```
f IntSet: 1 microsecond and 262 nanoseconds
```
Output after change:
```
f IntSet: 833 nanoseconds
```
Why this PR makes it faster (the first block is the old code, the second is the new code):
```nim
proc containsOrIncl*[A](s: var PackedSet[A], key: A): bool =
...
if s.elems <= s.a.len:
for i in 0..<s.elems:
if s.a[i] == ord(key):
return true
# `incl` scans `s.a` again
incl(s, key)
result = false
```
```nim
proc containsOrIncl*[A](s: var PackedSet[A], key: A): bool =
...
if s.elems <= s.a.len:
for i in 0..<s.elems:
if s.a[i] == ord(key):
return true
if s.elems < s.a.len:
# put `key` in `s.a` instead of calling `incl(s, key)`
s.a[s.elems] = ord(key)
inc(s.elems)
else:
incl(s, key)
result = false
```
fixes #25718
This pull request optimizes sequence allocation in the Nim standard
library by introducing a way to create uninitialized sequence payloads
for element types that don't require zero-initialization. The changes
allow for more efficient memory allocation when initializing sequences
with types that have no references, avoiding unnecessary zeroing of
memory.
Sequence allocation and initialization improvements:
* Added the `newSeqUninitRaw` procedure to create sequence payloads with
a specified length without forcing zero-initialization for element types
marked as `ntfNoRefs`. (`lib/system/sysstr.nim`,
[lib/system/sysstr.nimR277-R292](diffhunk://#diff-bcaa1967f436ad03877f353823c08a8b4a719fe387629d33aab4bddf16534b5eR277-R292))
* Modified the `extendCapacityRaw` procedure and the `setLengthSeqImpl`
template to use `newSeqUninitRaw` when zero-initialization is not
required, controlled by the `doInit` static parameter.
(`lib/system/sysstr.nim`,
[[1]](diffhunk://#diff-bcaa1967f436ad03877f353823c08a8b4a719fe387629d33aab4bddf16534b5eR277-R292)
[[2]](diffhunk://#diff-bcaa1967f436ad03877f353823c08a8b4a719fe387629d33aab4bddf16534b5eL316-R335)
Adds `system.setLenUninit` for the `string` type. Allows setting length
without initializing new memory on growth.
- Required for a follow-up to #15951
- Accompanies #22767 (ref #19727) but for strings
- Expands `stdlib/tstring` with tests for `setLen` and `setLenUninit`
---------
Co-authored-by: Andreas Rumpf <araq4k@proton.me>
ref https://github.com/nim-lang/Nim/issues/25695
ref https://github.com/nim-lang/Nim/pull/25715
This pull request introduces a minor but important change to the
`setLen` procedure in `lib/system/seqs_v2.nim`. The main update is the
temporary disabling of overflow checks during the initialization loop
when extending the sequence length, which can improve performance and
avoid unnecessary checks during this operation.
Memory and performance improvement:
* Disabled overflow checks for the loop that initializes new elements to
their default value when increasing the length of a sequence in
`setLen`, by wrapping the loop with `{.push overflowChecks: off.}` and
`{.pop.}`.
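
For readers unfamiliar with the pragma pair, here is a small sketch of the `push`/`pop` pattern. It wraps a whole hypothetical helper proc (`fillDefault`) at module level, which is a simplification: the PR itself is described as wrapping only the loop inside `setLen`.

```nim
# Overflow checks are disabled for everything between push and pop.
{.push overflowChecks: off.}
proc fillDefault[T](data: var seq[T]; start: int) =
  ## Resets data[start ..< data.len] to default(T).
  for i in start ..< data.len:
    data[i] = default(T)
{.pop.}
```

The observable behaviour of the loop is unchanged; only the generated checks differ.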
fixes #25687
This pull request introduces an optimization for sequence (`seq`)
assignments and copies in the Nim compiler, enabling bulk memory copying
for sequences whose element types are trivially copyable (i.e., no GC
references or destructors). This can significantly improve performance
for such types by avoiding per-element loops.
Key changes:
### Compiler code generation improvements
* Added the `elemSupportsCopyMem` function in
`compiler/liftdestructors.nim` to detect if a sequence's element type is
trivially copyable (no GC refs, no destructors).
* Updated the `fillSeqOp` procedure to use a new `genBulkCopySeq` code
path for eligible element types, generating a call to
`nimCopySeqPayload` for efficient bulk copying. Fallback to the
element-wise loop remains for non-trivial types.
[[1]](diffhunk://#diff-456118dde9a4e21f1b351fd72504d62fc16e9c30354dbb9a3efcb95a29067863R665-R670)
[[2]](diffhunk://#diff-456118dde9a4e21f1b351fd72504d62fc16e9c30354dbb9a3efcb95a29067863R623-R655)
### Runtime support
* Introduced the `nimCopySeqPayload` procedure in
`lib/system/seqs_v2.nim`, which performs the actual bulk memory copy of
sequence data using `copyMem`. This is only used for types that are safe
for such an operation.
These changes collectively improve the efficiency of sequence operations
for simple types, while maintaining correctness for complex types.
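
The idea of choosing between a bulk copy and an element-wise loop can be illustrated in user code. This is a sketch under assumptions: `copySeq` is a hypothetical helper, and it uses `supportsCopyMem` from `std/typetraits` as a stand-in for the compiler-internal `elemSupportsCopyMem` check.

```nim
import std/typetraits

proc copySeq[T](src: seq[T]): seq[T] =
  when supportsCopyMem(T):
    # Trivially copyable elements: one bulk memory copy.
    result = newSeq[T](src.len)
    if src.len > 0:
      copyMem(addr result[0], unsafeAddr src[0], src.len * sizeof(T))
  else:
    # Elements with GC refs or hooks: copy one by one.
    result = newSeqOfCap[T](src.len)
    for x in src:
      result.add x
```

With `int` elements the `copyMem` branch is taken; with `string` elements the loop is used.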
### Benchmarked the original micro-benchmark:
refc: 3.52s user 0.02s system 99% cpu 3.538 total
orc (after change): 3.46s user 0.01s system 99% cpu 3.476 total
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
`time_t` should be a 64-bit type on all relevant Windows CRT versions,
including mingw-w64. MSDN recommends against using the 32-bit version,
which is only selected when `_USE_32BIT_TIME_T` is explicitly defined.
Instead of guessing (and guessing wrong, as happens with recent mingw
versions), we can simply always use the 64-bit version.
Fixes bug #25674.
`replace` read `s[i+1]` for a CRLF pair without ensuring `i+1 <
s.len()`, so a value ending in a lone `\\c` (quoted in `writeConfig`)
raised `IndexDefect`.
- Fix: only treat `\\c\\l` as a pair when the following character exists.
- Test: `tests/stdlib/tparsecfg.nim` block bug #25674 — fails before
fix, passes after.
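
The bounds check can be sketched like this. `escapeLineBreaks` is a hypothetical helper loosely modeled on the description above, not the actual `parsecfg` code, and the escape strings it emits are assumptions.

```nim
proc escapeLineBreaks(s: string): string =
  var i = 0
  while i < s.len:
    # Only look at s[i + 1] after confirming it exists.
    if s[i] == '\c' and i + 1 < s.len and s[i + 1] == '\l':
      result.add "\\c\\l"   # CRLF pair: consume both characters
      inc i
    elif s[i] in {'\c', '\l'}:
      result.add "\\n"      # lone CR or LF
    else:
      result.add s[i]
    inc i
```

A value that ends in a lone carriage return now falls through to the second branch instead of indexing past the end.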
fixes #25626
This pull request introduces a small change to the `mapIt` template in
`sequtils.nim`. The update adds the `used` pragma to the injected `it`
variable, which can help suppress unused variable warnings in certain
cases.
- Added the `used` pragma to the injected `it` variable in the `mapIt`
template to prevent unused variable warnings.
or it should give a better warning or something if `it` is not used
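
As an illustration of the pragma combination, here is a small template that injects `it` and marks it `used`. `eachIt` is a hypothetical template, not the `mapIt` implementation.

```nim
template eachIt(s, body: untyped) =
  for x in s:
    # `inject` exposes `it` to the body; `used` keeps the compiler
    # quiet when the body never mentions it.
    let it {.inject, used.} = x
    body
```

The body may then either ignore `it` or refer to it.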
This fixes highlighter's tokenization of char literals inside
parentheses and brackets.
The Nim syntax highlighter in `docutils/highlite.nim` incorrectly
tokenizes character literals that appear after punctuation characters,
such as all kinds of brackets.
For `echo('v', "hello")`, the tokenizer treated the first `'` as
punctuation because the preceding token was punctuation `(`. As a
result, the second `'` (after `v`) was interpreted as the start of a
character literal and the literal incorrectly extended to the end of the
line.
See other examples in the screenshot:
<img width="508" height="266" alt="Screenshot 2026-03-04 at 16-09-06
_y_test"
src="https://github.com/user-attachments/assets/94d991ae-79d2-4208-a046-6ed4ddcb5c34"
/>
This regression originates from a condition added in PR #23015 that
prevented opening a `gtCharLit` token when the previous token kind was
punctuation. Nim syntax, of course, allows character literals after
punctuation such as `(`, `[`, `{`, `:`, `;`, or `,`. The only case
explicitly mentioned in the manual that actually requires special
handling is a stropped proc declaration for literals (see the [last paragraph
here](https://nim-lang.github.io/Nim/manual.html#lexical-analysis-character-literals)):
```nim
proc `'customLiteral`(s: string)
```
This PR narrows the condition: the highlighter now refuses to open a
char literal only directly after a backtick.
`withTimeout` currently leaves the “losing” callback installed:
- when `fut` finishes first, the timeout callback remains until the timer fires,
- when the timeout fires first, the `fut` callback remains on the wrapped future.
Under high-throughput use with large future payloads, this retains
closures/future references longer than needed and causes large transient
RSS growth.
This patch clears the opposite callback immediately once the outcome is
decided, reducing retention without changing API behavior.
The rest of the body must be indented in order to fall under the warning
admonition. Right now, only the first part of the warning is inside the
admonition, see [std/streams](https://nim-lang.org/docs/streams.html).
1. A trailing `$` at the end of a replacement string could read out of
bounds via `how[i + 1]`; this now raises `ValueError` instead.
2. Numeric capture parsing used `id += (id * 10) + digit` instead of `id
= (id * 10) + digit`, so multi-digit refs were parsed incorrectly (e.g.
`$12` resolved as capture 13 instead of 12).
3. Unterminated named replacement syntax (e.g. `${foo`) is now rejected
with `ValueError` instead of being accepted and parsed inconsistently.
Found and fixed by GPT 5.3 Codex.
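
The digit accumulation from the second item can be sketched as below. `parseCaptureId` is a hypothetical helper for illustration, not the proc that was patched.

```nim
proc parseCaptureId(s: string; start: int): tuple[id, next: int] =
  ## Reads decimal digits beginning at `start` and returns the parsed
  ## number together with the index of the first non-digit.
  var i = start
  var id = 0
  while i < s.len and s[i] in {'0'..'9'}:
    # Plain assignment; `id += id * 10 + digit` would turn "12" into 13.
    id = id * 10 + (ord(s[i]) - ord('0'))
    inc i
  result = (id: id, next: i)
```

The `i < s.len` guard in the loop also covers the case of a `$` at the very end of the input, where no digit follows.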
The `enforcenoraises` pragma prevents generation of exception checking
code for atomic... functions when compiling with Microsoft Visual C++ as
backend.
Fixes #25445
Without this change, the following test program:
```nim
import std/sysatomics
var x: ptr uint64 = cast[ptr uint64](uint64(0))
var y: ptr uint64 = cast[ptr uint64](uint64(42))
let z = atomicExchangeN(addr x, y, ATOMIC_ACQ_REL)
let a = atomicCompareExchangeN(addr x, addr y, y, true, ATOMIC_ACQ_REL, ATOMIC_ACQ_REL)
var v = 42
atomicStoreN(addr v, 43, ATOMIC_ACQ_REL)
let w = atomicLoadN(addr v, ATOMIC_ACQ_REL)
```
... generates this C code when compiling with `--cc:vcc`:
```c
N_LIB_PRIVATE N_NIMCALL(void, NimMainModule)(void) {
{
NU64* T1_;
NIM_BOOL T2_;
NI T3_;
NIM_BOOL* nimErr_;
nimfr_("testexcept", "/tmp/testexcept.nim");
nimErr_ = nimErrorFlag();
nimlf_(7, "/tmp/testexcept.nim");T1_ = ((NU64*) 0);
T1_ = atomicExchangeN__testexcept_u4((&x__testexcept_u2), y__testexcept_u3, ((int) 4));
if (NIM_UNLIKELY((*nimErr_))) {
goto BeforeRet_;
}
z__testexcept_u32 = T1_;
nimln_(9);T2_ = ((NIM_BOOL) 0);
T2_ = atomicCompareExchangeN__testexcept_u33((&x__testexcept_u2), (&y__testexcept_u3), y__testexcept_u3, NIM_TRUE, ((int) 4), ((int) 4));
if (NIM_UNLIKELY((*nimErr_))) {
goto BeforeRet_;
}
a__testexcept_u45 = T2_;
nimln_(12);atomicStoreN__testexcept_u47(((&v__testexcept_u46)), ((NI) 43));
if (NIM_UNLIKELY((*nimErr_))) {
goto BeforeRet_;
}
nimln_(13);T3_ = ((NI) 0);
T3_ = atomicLoadN__testexcept_u53(((&v__testexcept_u46)));
if (NIM_UNLIKELY((*nimErr_))) {
goto BeforeRet_;
}
w__testexcept_u59 = T3_;
BeforeRet_: ;
nimTestErrorFlag();
popFrame();
}
}
```
Note the repeated checks for `*nimErr_`.
With this PR applied, the checks vanish:
```c
N_LIB_PRIVATE N_NIMCALL(void, NimMainModule)(void) {
{
nimfr_("testexcept", "/tmp/testexcept.nim");
nimlf_(7, "/tmp/testexcept.nim");z__testexcept_u32 = atomicExchangeN__testexcept_u4((&x__testexcept_u2), y__testexcept_u3, ((int) 4));
nimln_(9);a__testexcept_u45 = atomicCompareExchangeN__testexcept_u33((&x__testexcept_u2), (&y__testexcept_u3), y__testexcept_u3, NIM_TRUE, ((int) 4), ((int) 4));
nimln_(12);atomicStoreN__testexcept_u47(((&v__testexcept_u46)), ((NI) 43));
nimln_(13);w__testexcept_u59 = atomicLoadN__testexcept_u53(((&v__testexcept_u46)));
nimTestErrorFlag();
popFrame();
}
}
```
For reference, with gcc as backend the generated code looks as follows:
```c
N_LIB_PRIVATE N_NIMCALL(void, NimMainModule)(void) {
{
nimfr_("testexcept", "/tmp/testexcept.nim");
nimlf_(7, "/tmp/testexcept.nim");z__testexcept_u9 = __atomic_exchange_n((&x__testexcept_u2), y__testexcept_u3, __ATOMIC_ACQ_REL);
nimln_(9);a__testexcept_u18 = __atomic_compare_exchange_n((&x__testexcept_u2), (&y__testexcept_u3), y__testexcept_u3, NIM_TRUE, __ATOMIC_ACQ_REL, __ATOMIC_ACQ_REL);
nimln_(12);__atomic_store_n(((&v__testexcept_u19)), ((NI) 43), __ATOMIC_ACQ_REL);
nimln_(13);w__testexcept_u29 = __atomic_load_n(((&v__testexcept_u19)), __ATOMIC_ACQ_REL);
nimTestErrorFlag();
popFrame();
}
}
```
With this PR the program from #25445 yields the correct output `Error:
unhandled exception: index 4 not in 0 .. 3 [IndexDefect]` instead of
crashing with a SIGSEGV.
PS: Unfortunately, I did not find out how to run the tests with MSVC.
`./koch tests --cc:vcc` doesn't use MSVC.
Follow-up to #25506.
As I mentioned there, I was in the middle of an edit, so here it is.
Splitting to a separate doc skipped.
A couple of minor mistakes fixed, some things made a bit more concise
and short.
Adds configurable parser modes to std/parseopt module. **Take two.**
Initially solved the issue of not being able to pass arguments to short
options as you do with most everyday CLI programs, but reading the tests
made me add more features so that some of the behaviour could be changed
and here we are.
**`std/parseopt` now supports three parser modes** via an optional
`mode` parameter in `initOptParser` and `getopt`.
Three modes are provided:
- `NimMode` (default, fully backward compatible),
- `LaxMode` (POSIX-inspired with relaxed short option handling),
- `GnuMode` (stricter GNU-style conventions).
The new modes are marked as experimental in the documentation.
The parser behaviour is controlled by a new `ParserRules` enum, which
provides granular feature flags that modes are built from. This makes it
possible for users with specific requirements to define custom rule sets
by importing private symbols; this is mentioned but clearly marked as
unsupported.
**Backward compatibility:**
The default mode preserves existing behaviour completely, with a single
exception: `allowWhitespaceAfterColon` is deprecated.
Now, `allowWhitespaceAfterColon` doesn't make much sense as a single
tuning knob. The `ParserRule.prSepAllowDelimAfter` controls this now.
As `allowWhitespaceAfterColon` had a default, most calls never mention
it so they will silently migrate to the new `initOptParser` overload. To
cover cases when the proc param was used at call-site, I added an
overload, which modifies the default parser mode to reflect the required
`allowWhitespaceAfterColon` value. Should be all smooth for most users,
except the deprecation warning.
The only thing I think can be classified as a breaking change is a
surprising **bug** of the old parser:
```nim
let p = initOptParser("-n 10 -m20 -k= 30 -40", shortNoVal = {'v'})
#                                     ^-disappears
```
This is with the aforementioned `allowWhitespaceAfterColon` being true
by default, of course. In this case the `30` token is skipped
completely. I don't think that's right, so it's fixed.
Things I still don't like about how the old parser and the new default
mode behave:
1. **Parser behaviour is controlled by the emptiness of two containers**.
This is an interesting approach. It's made more interesting by the fact
that `shortNoVal`/`longNoVal` control both their namesakes *and also
how their opposites (value-taking opts) work*.
---
**Edit:**
2. `shortNoVal` is not mandatory:
```nim
let p = initOptParser(@["-a=foo"], shortNoVal = {'a'})
# Nim, Lax parses as: (cmdShortOption, "a", "foo")
# GnuMode parses as: (cmdShortOption, "a", "=foo")
```
In this case, even though the user declared `a` as a no-val option, the
parser ignores that, relying only on the syntax to decide the kind of the
argument. This is especially problematic with modes that don't use
the rule `prShortAllowSep` (GnuMode): there the provided input is
twice invalid, regardless of `shortNoVal`.
With the current parser architecture, parsing it this way **is
inevitable**, though. We don't have any way to signal the error state
detected with the input, so the user is expected to validate the input
for mistakes.
Bundling positional arguments is nonsensical and a short option can't use
the separator character, so `[cmd "a", arg "=foo"]` and `[cmd "a", cmd
"=", cmd "f"...]` are both out of the question **and** would complicate
validation, requiring keeping track of the previous argument. Hope I'm
clear enough on the issue.
**Future work:**
1. Looks like the new modes are already usable, but from the discussions
elsewhere it looks like we might want to support special-casing
multi-digit short options (`-XX..`) to allow numerical options greater
than 9. This complicates bundling, though, so requires a bit of thinking
through.
2. Signaling error state?
---------
Co-authored-by: Andreas Rumpf <araq4k@proton.me>
fixes https://github.com/nim-lang/Nim/issues/25457
Small chunks allocate memory in fixed-size cells. Each cell is
positioned at exact multiples of the cell size from the chunk's data
start, which makes it much harder to support alignment:
```nim
sysAssert c.size == size, "rawAlloc 6"
if c.freeList == nil:
sysAssert(c.acc.int + smallChunkOverhead() + size <= SmallChunkSize,
"rawAlloc 7")
result = cast[pointer](cast[int](addr(c.data)) +% c.acc.int)
inc(c.acc, size)
```
See also https://github.com/nim-lang/Nim/pull/12926
When using big chunks, each allocation gets its own chunk.
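
The alignment problem with fixed-size cells comes down to simple arithmetic, sketched here. The helper names are hypothetical, and the sketch assumes the chunk's data start is itself suitably aligned, so only the offsets matter.

```nim
proc cellOffset(index, cellSize: int): int =
  ## Offset of the cell from the chunk's data start.
  index * cellSize

proc firstMisaligned(cellSize, alignment, count: int): int =
  ## Index of the first cell whose offset is not a multiple of
  ## `alignment`, or -1 if all `count` cells are aligned.
  result = -1
  for i in 0 ..< count:
    if cellOffset(i, cellSize) mod alignment != 0:
      return i
```

For example, with 48-byte cells and a requested alignment of 32, the second cell sits at offset 48 and is already misaligned, while 64-byte cells stay aligned throughout.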
First performance numbers:
time tests/arc/torcbench -- YRC
true peak memory: true
real 0m0,163s
user 0m0,161s
sys 0m0,002s
time tests/arc/torcbench -- ORC
true peak memory: true
real 0m0,107s
user 0m0,104s
sys 0m0,003s
So it's 1.6x slower. But it's threadsafe and provably correct. (Lean and
model checking via TLA+ used.)
Of course there is always the chance that the implementation is wrong
and doesn't match the model.