deps.nim's generateBackendBuildFile now emits the per-module pipeline instead
of one whole-program nifc rule: per-module cg -> merge -> per-module emit ->
link. The single nim_nifc command template carries the per-rule stage/module
switches in nifmake's (args) slot; backendCFile reconstructs each module's .c
path exactly as cgen.getCFile does (mangleModuleName of the source path for
main, the NIF suffix for deps) so the rules can declare outputs without loading
any backend module. The main module's cg depends on every other .c.nif (it
reads their init metas for NimMain), so it runs last; merge depends on all
.c.nif; each emit on merge; link on all .c.
Supporting changes:
- new `link` stage (nifbackend.generateLinkStage): registers every emitted .c
and runs extccomp.callCCompiler once (parallel cc + link). Skips modules with
no .c (extra members of system's closure whose code was emit-everywhere'd).
- loadBackendModules also loads system's transitive closure so every module in
the dep graph is a resolvable cg/emit target (was: project closure + system
only, leaving system-closure modules unfindable).
- cg always writes a .c.nif (even for code-less leaf modules) so every cg rule
has its declared nifmake output.
- export getSomeNameForModule; deps.nim imports modulepaths/extccomp/cnif.
`nim ic` now builds via the per-module backend end to end. Validated: the int
diamond and a generic+exception program build and run byte-correct vs the
whole-program backend; koch ic thallo/tconverter/timp/tmiscs/tparseutils all
green (and a thallo binary's output matches the whole-program build).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
genTypeInfoV2 has its own owner-routing: when the type's owner module is open
for codegen it pushes the RTTI definition into that module and emits only an
extern here. Under per-module cg every module except the target is loaded but
unwritten, so the definition landed in a discarded backend module and the
demanding module kept just an extern -> the RTTI symbol (e.g. an exception
type's NTIv2) was defined nowhere and failed to link.
Gate the owner-routing (and the reuse-cache shortcut) off when
icBackendStage == "cg" so RTTI is emit-everywhere like procs and consts: every
demanding module emits the 'd' definition and the merge stage dedups it to one
owner.
Validated end-to-end on a generic+exception program (shared box[int] instance
across two sibling modules, a raised/caught custom exception, seq+string
hooks): the per-module pipeline now links with no undefined/duplicate symbols
and prints the same output as the whole-program build. Whole-program path
unchanged (gate is per-module-cg only); koch ic thallo/tmiscs green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
In the per-module model each module's init/datInit proc is generated by its
own cg process, but NimMain (emitted with the main module) must call them all.
Previously the main module's cg only registered its own init, so a/b's globals
stayed at their defaults -- the linked exe ran but printed wrong values.
When the cg target is the main module, register every other loaded module's
init/datInit into NimMain via registerReusedModuleToMain (exported), reading
each module's initRequired/datInitRequired from its .c.nif meta head. This is
the same no-codegen registration the whole-program backend uses for reused
TUs; it requires the main module's cg to run last, after every other .c.nif
exists (the per-module nifmake graph will order it so). Modules without init
code have no .c.nif and register nothing.
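A rough Python model of that registration step, under stated assumptions: the meta head is represented here as a plain dict keyed by `initRequired`/`datInitRequired`, `nim_main_registrations` is a hypothetical name, and iteration order is simply sorted for determinism (the compiler's actual ordering is not modelled).

```python
def nim_main_registrations(main: str, metas: dict[str, dict[str, bool]]):
    """Return (modules needing datInit, modules needing init) for NimMain.

    `metas` holds one entry per module that has a .c.nif; modules without
    a .c.nif are simply absent and therefore register nothing.
    """
    dat_inits, inits = [], []
    for module in sorted(metas):
        if module == main:
            continue  # the main module registers its own init itself
        meta = metas[module]
        if meta.get("datInitRequired", False):
            dat_inits.append(module)
        if meta.get("initRequired", False):
            inits.append(module)
    return dat_inits, inits
```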
Validated: the 3-module diamond now builds end-to-end through the per-module
pipeline (cg all with main last, merge, emit all, cc, link) and the resulting
executable prints the correct "16 23" -- matching the whole-program build.
koch ic thallo green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Consts and RTTI are demand-emitted, so under emit-everywhere `cg` the same
external-linkage data definition lands in several modules' .c.nif as `cdata`
(raw text, not droppable) -> multiple definition at link. Procs already
deduped via the `'u'` cdef flag; data now gets the same droppable+owner
treatment, with one difference: data is never DCE'd (RTTI needs pointer
identity for `of`/exception checks; static-per-TU would break that), so it is
always a liveness root and kept by its single owner regardless of liveness.
- New `'d'` cdef flag = a data definition: the merge stage assigns it one
owner (smallest claimant, like `'u'`) and roots it (so its body keeps the
procs it references live); emit keeps the body only in the owner, every
other module keeps just an always-emitted `extern` declaration (the data
analogue of a proc prototype).
- genConstDefinition (ccgexprs) and genTypeInfoV2Impl (ccgtypes) now, under
cmdNifC, emit an extern declaration + wrap the definition in a `'d'` cdef
directive. The RTTI forward decl becomes a real `extern` (was a tentative
definition that would collide across TUs).
- cnif: computeLiveFromCArtifacts, computeMergeDecision and
renderCFromArtifact all handle `'d'`.
icFormatVersion 4 -> 5 (old .c.nif lack the data wrappers).
Validated on the 3-module diamond: the full per-module pipeline (cg all,
merge, emit all, cc, link) now LINKS with no duplicate symbols -- RTTI
(NTIv2) and const tables each land in exactly one object. Whole-program IC
path unchanged (koch ic thallo/tconverter/tmiscs green). Remaining: NimMain
init orchestration (a/b module inits not yet called from main's cg -> the
linked exe runs but prints defaults), the next unit.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
--icBackendStage:emit --icBackendModule:<suffix> renders one module's final
.c from its .c.nif and ic.backend.merge.nif. cnif.renderCFromArtifact walks
the artifact token stream: string literals verbatim, symbols by name, and a
(cdef ...) body is dropped when the name is dead OR it is a 'u' unique
definition this module does not own. The prototype lives in the surrounding
raw text (cgen emits a forward declaration for every used proc regardless of
where the body lands), so a dropped body keeps a valid declaration -- no
synthesis needed.
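The drop rule reduces to a small predicate. The sketch below is a Python illustration, not cnif.renderCFromArtifact itself; `keeps_body` and its parameters are hypothetical names, with `live` and `owners` standing in for the contents of ic.backend.merge.nif.

```python
def keeps_body(name: str, flags: str, module: str,
               live: set[str], owners: dict[str, str]) -> bool:
    """Decide whether `module` keeps the (cdef ...) body of `name`."""
    if name not in live:
        return False  # dead: body dropped in every module
    if "u" in flags and owners.get(name) != module:
        return False  # unique definition owned by another module
    return True       # live, and either unflagged or owned here
```

Unflagged definitions (inline procs, dispatchers) are kept wherever they are live, which matches the rule that they are never deduplicated.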
emit loads the module graph the same way cg does (factored into
loadBackendModules/findTargetModule) so getCFile yields the identical path cg
wrote to -- in particular the main module's source-vs-suffix aliasing.
Validated end-to-end on a 3-module diamond (lib.shared demanded by siblings
a and b at top level): cg all modules, merge, emit all, cc, link. The proc
shared lands in exactly one object (its assigned owner a) and is referenced
(U) from the other -- proc dedup + ownership works at the object level. The
only remaining link failures are DATA (RTTI NTIv2, const tables): those are
emit-everywhere'd as cdata, which is not yet wrapped in a droppable directive
nor given a guaranteed extern in non-owners -- the next unit (data ownership).
koch ic thallo green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Reorder mangleProcNameExt and makeUnique so the module suffix comes LAST:
name_u<disamb>__<suffix> (was name__<suffix>_u<disamb>). The suffix is now a
strippable trailing token, so content-addressed cross-module merging (the
per-module backend's instance/hook dedup) can recover a mint-site-independent
name by chopping everything from the final "__" -- no reference rewriting.
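A minimal Python sketch of the new scheme and the stripping step. `mangle` and `strip_module_suffix` are hypothetical helpers, and the sketch assumes the suffix itself never contains "__".

```python
def mangle(name: str, disamb: int, suffix: str) -> str:
    """Build name_u<disamb>__<suffix>, with the module suffix last."""
    return f"{name}_u{disamb}__{suffix}"


def strip_module_suffix(mangled: str) -> str:
    """Chop everything from the final "__" onward; leave other names alone."""
    head, sep, _suffix = mangled.rpartition("__")
    return head if sep else mangled
```

Two processes that mint the same instance in different modules then agree on the stripped name, which is what lets the merge compare them without rewriting references.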
Also drops the main-module special case in mangleProcNameExt: it omitted the
suffix because the main module's symbols key on its NIF-suffix file index. But
the backend already aliases that suffix to the main's source index
(nifbackend.loadModuleDependencies), so graph.ifaces[s.itemId.module] is
populated for the main module too -- the guard was redundant. Main-module
procs now mangle uniformly (e.g. mainProc_u0__<mainname>).
icFormatVersion 3 -> 4: cached .c.nif artifacts hold the old name scheme and
must be wiped.
Validated: koch boot (non-IC self-host) reaches fixed point; koch ic thallo
tconverter timp tmiscs tparseutils all green; a 3-module diamond IC build
runs correctly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
--icBackendStage:merge reads every module's .c.nif (the cg stages'
emit-everywhere output), computes the global live set and, for each
externally-linked definition that several cg processes emitted, the single
artifact allowed to embed its body; it writes ic.backend.merge.nif for the
emit stage to consume. This is the cross-process replacement for the
whole-program backend's in-process icSharedDefOwner/DCE coordination.
Mechanism:
- cgen marks every unique program-wide definition (callConv != ccInline and
not a dispatcher) with a new 'u' cdef flag. Its complement -- inline procs
(static per-TU) and method dispatchers (main-only) -- is emitted into every
using TU and must never be deduplicated, so it carries no flag. The flag is
inert for the whole-program path (renderMarkedC/computeLiveFromCArtifacts
ignore it).
- cnif.computeMergeDecision does one mark&sweep pass over all artifacts
(same liveness as computeLiveFromCArtifacts) plus owner assignment: the
owner of a 'u' definition is the lexicographically smallest artifact that
emits it -- a pure function of the claimant set, stable across rebuilds.
writeMergeDecision/readMergeDecision serialize the result as
(merge (live ...) (owners (own Symbol StrLit)*)).
- generateMergeStage is a pure artifact operation (no module graph loaded):
glob the nimcache's .c.nif, compute, write the decision.
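The two halves of the decision (liveness plus owner assignment) can be modelled in a few lines of Python. This is a sketch under assumptions, not cnif.computeMergeDecision: artifacts are represented as plain dicts, the root set is passed in explicitly, and all names are hypothetical.

```python
def compute_merge_decision(artifacts, roots):
    """artifacts: artifact name -> {symbol: (flags, referenced symbols)}.

    Returns (live symbols, owners) where owners maps each 'u' symbol to the
    lexicographically smallest artifact that emits it.
    """
    # Union the references of every emitted copy of a definition.
    refs_of = {}
    for defs in artifacts.values():
        for sym, (_flags, refs) in defs.items():
            refs_of.setdefault(sym, set()).update(refs)
    # Mark phase: everything reachable from the roots is live.
    live, work = set(), list(roots)
    while work:
        sym = work.pop()
        if sym in live:
            continue
        live.add(sym)
        work.extend(refs_of.get(sym, ()))
    # Owner assignment: visiting artifacts in sorted order means the first
    # claimant seen is the smallest one.
    owners = {}
    for art in sorted(artifacts):
        for sym, (flags, _refs) in artifacts[art].items():
            if "u" in flags:
                owners.setdefault(sym, art)
    return live, owners
```

Because the owner depends only on the set of claimants, rerunning the merge over the same artifacts yields the same map.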
Validated on a diamond (lib.shared called from sibling modules a and b, both
with top-level demands): cg emits shared into a, b and lib; merge assigns
owner = lib (smallest claimant) so a/b will prototype it, while the inline
nimFrame stays out of the owners map (kept everywhere). Whole-program backend
path unchanged (dispatch guarded on icBackendStage); koch ic thallo green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
--icBackendStage:cg --icBackendModule:<suffix> generates C for a single module
and writes only its .c.nif (no merge, no .c render, no cc/link -- separate
stages). The whole program is still loaded so types resolve, but only the
target module is code-generated; findPendingModule routes every demand into it
(emit-everywhere into the current module), so a definition gets its canonical
owner-suffixed C name regardless of which module's process emits it -- cross-
process duplicates then collide by exact name, ready for the merge stage to keep
one and prototype the rest.
Validated: cg of the main module of a 2-module project recreates its .c.nif with
the demanded closure (greet/add named by their owner suffix); a leaf module whose
procs only callers use yields a (correct) empty .c.nif. Whole-program backend

path unchanged (dispatch guarded on icBackendStage), koch ic thallo green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Config plumbing for the per-module backend: icBackendStage (cg|merge|emit, empty
= today's whole-program backend) and icBackendModule (the NIF suffix the cg/emit
stage operates on). No behavior yet -- the nifbackend stages and deps.nim rules
that consume these land next. Whole-program backend, koch boot, and a 2-module
IC build are unchanged.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
generateBuildFile becomes three procs: computeForwardedArgs (config/define
forwarding + writeIcConfig, depends only on config so computed once),
generateFrontendBuildFile (nifler + nim m rules), and generateBackendBuildFile
(today's single whole-program nim nifc rule; semmed NIFs enter as leaf inputs
with no producing rule, like nifler's .nim source inputs).
commandIc now runs two nifmake passes: phase 1 drives the frontend to the
.s.deps discovery fixpoint, phase 2 runs the backend once over the now-final
graph. Backend rebuilds are then a pure nifmake mtime decision, independent of
frontend discovery -- and the backend file is the slot the per-module codegen
+ DCE + link rules drop into next.
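The two-phase control flow can be sketched as below. This Python model is an assumption-laden illustration, not commandIc: `run_frontend_pass` and `run_backend_pass` are hypothetical callbacks standing in for the two nifmake invocations, and the set of discovered modules stands in for the .s.deps state.

```python
def run_ic_build(run_frontend_pass, run_backend_pass, max_rounds: int = 100):
    """Iterate the frontend until discovery stops changing, then run the backend once."""
    known = frozenset()
    for _ in range(max_rounds):
        discovered = frozenset(run_frontend_pass(known))
        if discovered == known:
            break  # fixpoint: no new modules were discovered
        known = discovered
    else:
        raise RuntimeError("frontend discovery did not converge")
    run_backend_pass(known)  # the graph is final, so one backend pass suffices
    return known
```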
Observably inert: koch boot and koch bootic both reach their byte-identical
fixed points (clean ric_ cache), 2-module cold/warm/body-edit correct,
koch ic thallo green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The content key is hashed into <disamb> (setInstanceDisamb), not a separate
.key. token; document that the cross-TU merge and DCE already key on the
module-suffix-stripped name.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tests the fix in 7148ae347: the RHS of a dot expression wrapped in
nkOpenSym by the generic prepass must use the captured symbol when nothing
is injected, while an injected symbol still overrides it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Cold builds were serial (one nim m at a time), leaving the machine idle.
nifmake fans out commands at each DAG depth via execProcesses, so pass
--parallel by default; this roughly halved cold compiler self-build wall
time (81s -> 53s on a 32-core box). Opt out with -d:icNoParallel for
readable, non-interleaved child output when debugging a build.
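To illustrate why fanning out per DAG depth helps, the following Python sketch groups targets into levels whose members have no dependencies on each other. It is not nifmake's scheduler; `depth_levels` and the example graph are hypothetical, and the graph is assumed to be acyclic.

```python
def depth_levels(deps: dict[str, list[str]]) -> list[list[str]]:
    """Group nodes by depth: level 0 has no prerequisites, level n needs level n-1."""
    depth: dict[str, int] = {}

    def depth_of(node: str) -> int:
        if node not in depth:
            # A leaf gets depth 0; otherwise one more than its deepest prerequisite.
            depth[node] = 1 + max((depth_of(d) for d in deps.get(node, ())),
                                  default=-1)
        return depth[node]

    nodes = set(deps) | {d for ds in deps.values() for d in ds}
    levels: dict[int, list[str]] = {}
    for node in nodes:
        levels.setdefault(depth_of(node), []).append(node)
    return [sorted(levels[d]) for d in sorted(levels)]
```

Every command within one level can run concurrently; a serial build instead pays the sum of all command times.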
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
An UncheckedArray has no known length, so it cannot be copied, moved or
destroyed as a value; it only ever lives behind a pointer. The pointer-like
group emitted x = y for it, an assignment of an unsized array the C backend
cannot lower (genAssignment: tyUncheckedArray) -- which surfaced under
nifc's hook-driven refc codegen (e.g. ref UncheckedArray in widestrs).
Give it its own discard branch so all value hooks no-op; seq/string element
ops still go through the seq/string hooks, which know the length.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The earlier IC fix made the else branch (concrete generic args) copy the
invocation type unconditionally before propagateToOwner. Besides
avoiding the in-place mutation, the copy flips `header != t` for
all-concrete invocations, which activates the searchInstTypes/sameFlags
cached-instance return path that devel skipped - a cached, meta-flagged
instance could be returned where a fresh one was expected.
Arraymancer's build then failed with "cannot cast to a non concrete
type: 'ptr NimSeqV2[Node[Tensor[float32]]]'" in seqs_v2.setLen.
Copy only when the invocation type is actually immutable
(IC-loaded/Sealed); non-IC behavior is devel's again, the IC assert
stays fixed.
Verified: arraymancer tests_cpu.nim builds and links (its test-suite
SIGSEGV in io_npy is pre-existing - a devel-built compiler produces the
identical 226-tests-then-crash). Macro sweep 93/95, tests/ic 5/5,
koch boot -d:release and clean koch bootic reach bit-identical fixed
points.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
When a generic's body is computed by a macro (Bar[T, U] = makeBar(T, U)),
`newbody` after replaceTypeVarsT can be a type loaded from a dep module
- even a builtin like `int` - which is Sealed under IC:
- the in-place flag accumulation `newbody.flags = newbody.flags + ...`
asserted (and under non-IC silently pollutes the shared type's flags,
e.g. the global `int`); compute the flags into a local, skip the
in-place store for Sealed types and feed `result.flags` from the
local - value-identical for the instance.
- `newbody.typeInst = result` likewise; a loaded body keeps whatever
its defining module serialized (the field was first-wins anyway).
Both changes are no-ops for non-IC (types are never Sealed there).
Fixes tmacrogenerics. Macro sweep 93/95 - the two remaining fails are
tmacro7 (disabled test, fails identically under non-IC) and
tmacrogetimpl (needs a design decision on getImplTransformed sym
sharing). tests/ic 5/5, koch boot -d:release and clean koch bootic
both reach bit-identical fixed points.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Under `nim nifc` the main module is loaded by its source FileIndex, but
its serialized symbols carry the module's NIF suffix, so
registerNifSuffix allocated a SECOND FileIndex for the same module:
top-level globals were emitted into the source-index BModule while
procs went into the suffix-index BModule, and a N_LIB_PRIVATE global
declared in one TU was undeclared in the other (tincremental,
tmacros_various).
Pre-aliasing the suffix to the source index in loadModuleDependencies
unifies the TUs. This was tried before and reverted: the split was
masking a hook C-name disamb collision between sem-lifted (loaded) and
codegen-lifted hooks in the same module. That collision class is gone
since backend-minted symbols mangle as _c<item> (BackendIdOffset), so
the unification is safe now.
Macro sweep 92/95 (fixes tincremental + tmacros_various; remaining:
tmacro7 which fails identically under non-IC and is disabled,
tmacrogenerics, tmacrogetimpl), tests/ic 5/5, clean koch bootic
reaches the bit-identical fixed point.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The `pkg marker appended to skPackage NIF names leaked into the
user-visible name of Partial stubs: vmgen's toKey built callback keys
like 'getCurrentException.system.stdlib`pkg', so the VM compiled the
real body of getCurrentException instead of dispatching to the vmops
callback and failed with 'cannot evaluate at compile time:
currException' (tparsefile, ttryparseexpr). The marked name was also a
latent hash-divergence source: sighashes' hashNonProc/hashOwner hash
package names straight off possibly-Partial stubs.
Stubs are now created with the clean name; the marker doubles as a kind
signal, so the stub starts as skPackage and globalName re-appends the
marker when rebuilding the NIF index key. NIF file content is
unchanged.
Macro sweep 90/95 (up from 88, restores the baseline; the 5 remaining
fails are the known deep ones), tests/ic 5/5.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- ast2nif.canonicalRoutine: collapse a forward-decl's discarded impl sym
onto the surviving proto so it is not serialized twice (was an ambiguous
overload in importers; fixes tnewlit).
- deps.nim: handle `import m except syms` (importexcept) in the dependency
scanner so the build-order edge is not dropped (fixes strformat->strutils
ordering).
- ast2nif.writeSymDef: object fields (skField) are no longer marked
bare-importable (x) in the NIF index; an exported field name leaked into
importer scope and shadowed a template's open symbol (type mismatch 'T').
Together these fix tmacro8.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
fixes #22936
This pull request improves the compiler's handling of generic type
constraints, specifically for subtypes of generics, and adds a test to
cover this behavior. The main changes are an enhancement to the type
relationship logic in the compiler and a new test case for generic
subtyping with `Future`.
### Compiler improvements for generic subtyping
* Updated `typeRel` in `compiler/sigmatch.nim` to allow generic
constraints (like `F: Future`) to accept not just direct instantiations
but also descendants of the generic family, ensuring more flexible and
correct overload resolution. Inheritance depth is now considered for
overload ranking, making deeper descendants slightly less preferred,
consistent with other inheritance-based matches.
### New test coverage
* Added a test in `tests/typerel/t8905.nim` to verify that generic
constraints correctly accept subtypes of `Future`, including a custom
`B[T, E] = ref object of Future[T]` type, and that overloads like
`take`, `takeMany`, and the macro `checkFutures` work as expected with
these types.
fixes #20811
This pull request addresses issues with parameter capture in nested
generic procedures and templates, ensuring that outer parameters are
correctly visible and accessible within nested scopes. The main changes
include a fix in the semantic analysis logic and the addition of
targeted regression tests.
### Semantic analysis improvements:
* Updated `semGenericStmtSymbol` in `compiler/semgnrc.nim` to ensure
that parameters from outer scopes are preserved and accessible in nested
generic procedures, fixing visibility issues with captured parameters.
### Added regression tests:
* Added `tests/generics/t20811.nim` to verify that both generic and
plain inner procedures can access parameters from their enclosing
procedure.
* Extended `tests/template/topensym.nim` with a new block for issue
#20811 to test that template-injected parameters are correctly captured
and visible in nested generic procedures.
fixes #18238
This pull request makes a targeted change to the object construction
logic in the `genObjConstr` procedure. The main update refines the
conditions under which memory zeroing is required during object
construction, making the behavior more accurate for different garbage
collection and destructor options.
Key logic update:
- Improved the `needsZeroMem` condition in `genObjConstr` to check for
the presence of garbage-collected references and the `optSeqDestructors`
option, instead of relying solely on the selected garbage collector and
field flags. This ensures memory is zeroed only when necessary,
potentially improving performance and correctness.
```c
T1_ = NIM_NIL;
T1_ = ((tyObject_E__uEKympBdEK4SY9anUbpNaLQ*) newObj((&NTIrefe__bJ9cSuxv8xHYxmdolQqFkUw_), sizeof(tyObject_E__uEKympBdEK4SY9anUbpNaLQ)));
nimZeroMem(((void*) ((&(*T1_).z.z.z.z))), sizeof(tyObject_A__G2lWlL9cFqoiWWwZmWqfJ9bA));
(*T1_).z.z.z.z.y = ((NI) 5);
asgnRef(((void**) ((&z1__test8_u12))), T1_);
asgnRef(((void**) ((&z2__test8_u55))), new__test8_u13());
(*z2__test8_u55).z.z.z.z.y = ((NI) 5);
T2_ = NIM_NIL;
T2_ = ((tyObject_E__uEKympBdEK4SY9anUbpNaLQ*) newObj((&NTIrefe__bJ9cSuxv8xHYxmdolQqFkUw_), sizeof(tyObject_E__uEKympBdEK4SY9anUbpNaLQ)));
asgnRef(((void**) ((&z3__test8_u56))), T2_);
(*z3__test8_u56).z.z.z.z.y = ((NI) 5);
```
The original test case was already fixed for `ORC`; this change extends
the fix to `refc`: if a constructor fully initializes the object, it
does not need a zero-fill step.
fixes #25725
This pull request makes significant improvements to symbol handling
during transformation passes in the compiler, particularly for routines
(procedures, iterators) and their parameters. The changes ensure that
when routines are copied (for inlining, closure generation, etc.), all
relevant symbols and type headers are also freshly copied and correctly
owned, preventing subtle bugs from symbol reuse. Additionally, new
regression tests are added to cover previously problematic iterator
cases.
**Improvements to symbol copying and ownership:**
* Introduced `freshOwnedSym` to create a fresh copy of a symbol with a
specified owner, ensuring that transformed routines and their parameters
do not share symbols with the originals, which prevents accidental
aliasing and ownership issues.
* Refactored `freshVar` to use `freshOwnedSym`, centralizing fresh
symbol creation logic.
* Added `introduceNewRoutineHeaderSyms` and `copyRoutineTypeHeader` to
ensure that when routines are copied, all parameter/result symbols and
their types are also freshly copied and mapped, avoiding shared state
between original and transformed routines.
* Updated `introduceNewLocalVars` to use `freshOwnedSym` for routine
symbols and to invoke the new header/type copying procedures, ensuring
correctness in routine transformation.
**Testing and regression coverage:**
* Added new blocks to `tests/iter/titer_issues.nim` to test iterator
transformation edge cases, including scenarios that previously led to
symbol reuse bugs (e.g., bugs #25724 and #25725).