cleaned up the internal documentation (#17524)

2026-04-20 06:20:38 +00:00 · 2021-03-26 16:27:55 +01:00
parent add771b051
commit 3e03f67335
3 changed files with 220 additions and 467 deletions
--- a/doc/intern.rst
+++ b/doc/intern.rst
@@ -32,28 +32,29 @@ Path           Purpose
 `doc`          the documentation; it is a bunch of
               reStructuredText files
 `lib`          the Nim library
-`web`          website of Nim; generated by `nimweb`
-               from the `*.txt` and `*.nimf` files
 ============   ===================================================


 Bootstrapping the compiler
 ==========================

+**Note**: Add ``.`` to your PATH so that `koch` can be used without the `./`.
+
 Compiling the compiler is a simple matter of running::

  nim c koch.nim
-  ./koch boot
+  koch boot -d:release

-For a release version use::
+For a debug version use::

  nim c koch.nim
-  ./koch boot -d:release
+  koch boot
+

 And for a debug version compatible with GDB::

  nim c koch.nim
-  ./koch boot --debuginfo --linedir:on
+  koch boot --debuginfo --linedir:on

 The `koch` program is Nim's maintenance script. It is a replacement for
 make and shell scripting with the advantage that it is much more portable.
@@ -61,12 +62,105 @@ More information about its options can be found in the `koch <koch.html>`_
 documentation.


+Developing the compiler
+=======================
+
+To create a new compiler for each run, use `koch temp`::
+
+  koch temp c test.nim
+
+`koch temp` creates a debug build of the compiler, which is useful
+to create stacktraces for compiler debugging.
+
+You can of course use GDB or Visual Studio to debug the
+compiler (via `--debuginfo --lineDir:on`). However, there
+are also lots of procs that aid in debugging:
+
+
+.. code-block:: nim
+
+  # dealing with PNode:
+  echo renderTree(someNode)
+  debug(someNode) # some JSON representation
+
+  # dealing with PType:
+  echo typeToString(someType)
+  debug(someType)
+
+  # dealing with PSym:
+  echo symbol.name.s
+  debug(symbol)
+
+  # pretty prints the Nim ast, but annotates symbol IDs:
+  echo renderTree(someNode, {renderIds})
+  if `??`(conf, n.info, "temp.nim"):
+    # only output when it comes from "temp.nim"
+    echo renderTree(n)
+  if `??`(conf, n.info, "temp.nim"):
+    # why does it process temp.nim here?
+    writeStackTrace()
+
+These procs may not already be imported by the module you're editing.
+You can import them directly for debugging:
+
+.. code-block:: nim
+  from astalgo import debug
+  from types import typeToString
+  from renderer import renderTree
+  from msgs import `??`
+
+
+The compiler's architecture
+===========================
+
+Nim uses the classic compiler architecture: A lexer/scanner feds tokens to a
+parser. The parser builds a syntax tree that is used by the code generators.
+This syntax tree is the interface between the parser and the code generator.
+It is essential to understand most of the compiler's code.
+
+Semantic analysis is separated from parsing.
+
+.. include:: filelist.txt
+
+
+The syntax tree
+---------------
+The syntax tree consists of nodes which may have an arbitrary number of
+children. Types and symbols are represented by other nodes, because they
+may contain cycles. The AST changes its shape after semantic checking. This
+is needed to make life easier for the code generators. See the "ast" module
+for the type definitions. The `macros <macros.html>`_ module contains many
+examples how the AST represents each syntactic structure.
+
+
+Bisecting for regressions
+=========================
+
+`koch temp` returns 125 as the exit code in case the compiler
+compilation fails. This exit code tells `git bisect` to skip the
+current commit.::
+
+  git bisect start bad-commit good-commit
+  git bisect run ./koch temp -r c test-source.nim
+
+You can also bisect using custom options to build the compiler, for example if
+you don't need a debug version of the compiler (which runs slower), you can replace
+`./koch temp` by explicit compilation command, see `Rebuilding the compiler`_.
+
+
+Runtimes
+========
+
+Nim has two different runtimes, the "old runtime" and the "new runtime". The old
+runtime supports the old GCs (markAndSweep, refc, Boehm), the new runtime supports
+ARC/ORC. The new runtime is active `when defined(nimV2)`.
+
+
 Coding Guidelines
 =================

-* Use CamelCase, not underscored_identifiers.
-* Indent with two spaces.
-* Max line length is 80 characters.
+* We follow Nim's official style guide, see `<nep1.html>`_.
+* Max line length is 100 characters.
 * Provide spaces around binary operators if that enhances readability.
 * Use a space after a colon, but not before it.
 * [deprecated] Start types with a capital `T`, unless they are
@@ -86,9 +180,9 @@ POSIX-compliant systems on conventional hardware are usually pretty easy to
 port: Add the platform to `platform` (if it is not already listed there),
 check that the OS, System modules work and recompile Nim.

-The only case where things aren't as easy is when the garbage
-collector needs some assembler tweaking to work. The standard
-version of the GC uses C's `setjmp` function to store all registers
+The only case where things aren't as easy is when old runtime's garbage
+collectors need some assembler tweaking to work. The default
+implementation uses C's `setjmp` function to store all registers
 on the hardware stack. It may be necessary that the new platform needs to
 replace this generic code by some assembler code.

@@ -96,136 +190,33 @@ replace this generic code by some assembler code.
 Runtime type information
 ========================

+**Note**: This section describes the "old runtime".
+
 *Runtime type information* (RTTI) is needed for several aspects of the Nim
 programming language:

 Garbage collection
-  The most important reason for RTTI. Generating
-  traversal procedures produces bigger code and is likely to be slower on
-  modern hardware as dynamic procedure binding is hard to predict.
+  The old GCs use the RTTI for traversing abitrary Nim types, but usually
+  only the `marker` field which contains a proc that does the traversal.

 Complex assignments
  Sequences and strings are implemented as
  pointers to resizeable buffers, but Nim requires copying for
-  assignments. Apart from RTTI the compiler could generate copy procedures
-  for any type that needs one. However, this would make the code bigger and
-  the RTTI is likely already there for the GC.
+  assignments. Apart from RTTI the compiler also generates copy procedures
+  as a specialization.

 We already know the type information as a graph in the compiler.
 Thus we need to serialize this graph as RTTI for C code generation.
 Look at the file `lib/system/hti.nim` for more information.

-Rebuilding the compiler
+
+Magics and compilerProcs
 ========================

-After an initial build via `sh build_all.sh` on posix or `build_all.bat` on windows,
-you can rebuild the compiler as follows:
-
-* `nim c koch` if you need to rebuild koch
-* `./koch boot -d:release` this ensures the compiler can rebuild itself
-  (use `koch` instead of `./koch` on windows), which builds the compiler 3 times.
-
-A faster approach if you don't need to run the full bootstrapping implied by `koch boot`,
-is the following:
-
-* `pathto/nim c --lib:lib -d:release -o:bin/nim_temp compiler/nim.nim`
-
-Where `pathto/nim` is any nim binary sufficiently recent (e.g. `bin/nim_cources`
-built during bootstrap or `$HOME/.nimble/bin/nim` installed by `choosenim 1.2.0`)
-
-You can pass any additional options such as `-d:leanCompiler` if you don't need
-certain features or `-d:debug --stacktrace:on --excessiveStackTrace --stackTraceMsgs`
-for debugging the compiler. See also `Debugging the compiler`_.
-
-Debugging the compiler
-======================
-
-You can of course use GDB or Visual Studio to debug the
-compiler (via `--debuginfo --lineDir:on`). However, there
-are also lots of procs that aid in debugging:
-
-
-.. code-block:: nim
-  # pretty prints the Nim AST
-  echo renderTree(someNode)
-  # outputs some JSON representation
-  debug(someNode)
-  # pretty prints some type
-  echo typeToString(someType)
-  debug(someType)
-  echo symbol.name.s
-  debug(symbol)
-  # pretty prints the Nim ast, but annotates symbol IDs:
-  echo renderTree(someNode, {renderIds})
-  if `??`(conf, n.info, "temp.nim"):
-    # only output when it comes from "temp.nim"
-    echo renderTree(n)
-  if `??`(conf, n.info, "temp.nim"):
-    # why does it process temp.nim here?
-    writeStackTrace()
-
-These procs may not be imported by a module. You can import them directly for debugging:
-
-.. code-block:: nim
-  from astalgo import debug
-  from types import typeToString
-  from renderer import renderTree
-  from msgs import `??`
-
-To create a new compiler for each run, use `koch temp`::
-
-  ./koch temp c /tmp/test.nim
-
-`koch temp` creates a debug build of the compiler, which is useful
-to create stacktraces for compiler debugging. See also
-`Rebuilding the compiler`_ if you need more control.
-
-Bisecting for regressions
-=========================
-
-`koch temp` returns 125 as the exit code in case the compiler
-compilation fails. This exit code tells `git bisect` to skip the
-current commit.::
-
-  git bisect start bad-commit good-commit
-  git bisect run ./koch temp -r c test-source.nim
-
-You can also bisect using custom options to build the compiler, for example if
-you don't need a debug version of the compiler (which runs slower), you can replace
-`./koch temp` by explicit compilation command, see `Rebuilding the compiler`_.
-
-The compiler's architecture
-===========================
-
-Nim uses the classic compiler architecture: A lexer/scanner feds tokens to a
-parser. The parser builds a syntax tree that is used by the code generator.
-This syntax tree is the interface between the parser and the code generator.
-It is essential to understand most of the compiler's code.
-
-In order to compile Nim correctly, type-checking has to be separated from
-parsing. Otherwise generics cannot work.
-
-.. include:: filelist.txt
-
-
-The syntax tree
---------------
-The syntax tree consists of nodes which may have an arbitrary number of
-children. Types and symbols are represented by other nodes, because they
-may contain cycles. The AST changes its shape after semantic checking. This
-is needed to make life easier for the code generators. See the "ast" module
-for the type definitions. The `macros <macros.html>`_ module contains many
-examples how the AST represents each syntactic structure.
-
-
-How the RTL is compiled
-=======================
-
 The `system` module contains the part of the RTL which needs support by
-compiler magic (and the stuff that needs to be in it because the spec
-says so). The C code generator generates the C code for it, just like any other
+compiler magic. The C code generator generates the C code for it, just like any other
 module. However, calls to some procedures like `addInt` are inserted by
-the CCG. Therefore the module `magicsys` contains a table (`compilerprocs`)
+the generator. Therefore there is a table (`compilerprocs`)
 with all symbols that are marked as `compilerproc`. `compilerprocs` are
 needed by the code generator. A `magic` proc is not the same as a
 `compilerproc`: A `magic` is a proc that needs compiler magic for its
@@ -233,325 +224,6 @@ semantic checking, a `compilerproc` is a proc that is used by the code
 generator.


-Compilation cache
-=================
-
-The implementation of the compilation cache is tricky: There are lots
-of issues to be solved for the front- and backend.
-
-
-General approach: AST replay
----------------------------
-
-We store a module's AST of a successful semantic check in a SQLite
-database. There are plenty of features that require a sub sequence
-to be re-applied, for example:
-
-.. code-block:: nim
-  {.compile: "foo.c".} # even if the module is loaded from the DB,
-                       # "foo.c" needs to be compiled/linked.
-
-The solution is to **re-play** the module's top level statements.
-This solves the problem without having to special case the logic
-that fills the internal seqs which are affected by the pragmas.
-
-In fact, this describes how the AST should be stored in the database,
-as a "shallow" tree. Let's assume we compile module `m` with the
-following contents:
-
-.. code-block:: nim
-  import std/strutils
-
-  var x*: int = 90
-  {.compile: "foo.c".}
-  proc p = echo "p"
-  proc q = echo "q"
-  static:
-    echo "static"
-
-Conceptually this is the AST we store for the module:
-
-.. code-block:: nim
-  import std/strutils
-
-  var x*
-  {.compile: "foo.c".}
-  proc p
-  proc q
-  static:
-    echo "static"
-
-The symbol's `ast` field is loaded lazily, on demand. This is where most
-savings come from, only the shallow outer AST is reconstructed immediately.
-
-It is also important that the replay involves the `import` statement so
-that dependencies are resolved properly.
-
-
-Shared global compiletime state
-------------------------------
-
-Nim allows `.global, compiletime` variables that can be filled by macro
-invocations across different modules. This feature breaks modularity in a
-severe way. Plenty of different solutions have been proposed:
-
- Restrict the types of global compiletime variables to `Set[T]` or
-  similar unordered, only-growable collections so that we can track
-  the module's write effects to these variables and reapply the changes
-  in a different order.
- In every module compilation, reset the variable to its default value.
- Provide a restrictive API that can load/save the compiletime state to
-  a file.
-
-(These solutions are not mutually exclusive.)
-
-Since we adopt the "replay the top level statements" idea, the natural
-solution to this problem is to emit pseudo top level statements that
-reflect the mutations done to the global variable. However, this is
-MUCH harder than it sounds, for example `squeaknim` uses this
-snippet:
-
-.. code-block:: nim
-  apicall.add(") module: '" & dllName & "'>\C" &
-              "\t^self externalCallFailed\C!\C\C")
-  stCode.add(st & "\C\t\"Generated by NimSqueak\"\C\t" & apicall)
-
-We can "replay" `stCode.add` only if the values of `st`
-and `apicall` are known. And even then a hash table's `add` with its
-hashing mechanism is too hard to replay.
-
-In practice, things are worse still, consider `someGlobal[i][j].add arg`.
-We only know the root is `someGlobal` but the concrete path to the data
-is unknown as is the value that is added. We could compute a "diff" between
-the global states and use that to compute a symbol patchset, but this is
-quite some work, expensive to do at runtime (it would need to run after
-every module has been compiled) and would also break for hash tables.
-
-We need an API that hides the complex aliasing problems by not relying
-on Nim's global variables. The obvious solution is to use string keys
-instead of global variables:
-
-.. code-block:: nim
-
-  proc cachePut*(key: string; value: string)
-  proc cacheGet*(key: string): string
-
-However, the values being strings/json is quite problematic: Many
-lookup tables that are built at compiletime embed *proc vars* and
-types which have no obvious string representation... Seems like
-AST diffing is still the best idea as it will not require to use
-an alien API and works with some existing Nimble packages, at least.
-
-On the other hand, in Nim's future I would like to replace the VM
-by native code. A diff algorithm wouldn't work for that.
-Instead the native code would work with an API like `put`, `get`:
-
-.. code-block:: nim
-
-  proc cachePut*(key: string; value: NimNode)
-  proc cacheGet*(key: string): NimNode
-
-The API should embrace the AST diffing notion: See the
-module `macrocache` for the final details.
-
-
-
-Methods and type converters
---------------------------
-
-In the following
-sections *global* means *shared between modules* or *property of the whole
-program*.
-
-Nim contains language features that are *global*. The best example for that
-are multi methods: Introducing a new method with the same name and some
-compatible object parameter means that the method's dispatcher needs to take
-the new method into account. So the dispatching logic is only completely known
-after the whole program has been translated!
-
-Other features that are *implicitly* triggered cause problems for modularity
-too. Type converters fall into this category:
-
-.. code-block:: nim
-  # module A
-  converter toBool(x: int): bool =
-    result = x != 0
-
-.. code-block:: nim
-  # module B
-  import A
-
-  if 1:
-    echo "ugly, but should work"
-
-If in the above example module `B` is re-compiled, but `A` is not then
-`B` needs to be aware of `toBool` even though  `toBool` is not referenced
-in `B` *explicitly*.
-
-Both the multi method and the type converter problems are solved by the
-AST replay implementation.
-
-
-Generics
-~~~~~~~~
-
-We cache generic instantiations and need to ensure this caching works
-well with the incremental compilation feature. Since the cache is
-attached to the `PSym` datastructure, it should work without any
-special logic.
-
-
-Backend issues
--------------
-
- Init procs must not be "forgotten" to be called.
- Files must not be "forgotten" to be linked.
- Method dispatchers are global.
- DLL loading via `dlsym` is global.
- Emulated thread vars are global.
-
-However the biggest problem is that dead code elimination breaks modularity!
-To see why, consider this scenario: The module `G` (for example the huge
-Gtk2 module...) is compiled with dead code elimination turned on. So none
-of `G`'s procs is generated at all.
-
-Then module `B` is compiled that requires `G.P1`. Ok, no problem,
-`G.P1` is loaded from the symbol file and `G.c` now contains `G.P1`.
-
-Then module `A` (that depends on `B` and `G`) is compiled and `B`
-and `G` are left unchanged. `A` requires `G.P2`.
-
-So now `G.c` MUST contain both `P1` and `P2`, but we haven't even
-loaded `P1` from the symbol file, nor do we want to because we then quickly
-would restore large parts of the whole program.
-
-
-Solution
-~~~~~~~~
-
-The backend must have some logic so that if the currently processed module
-is from the compilation cache, the `ast` field is not accessed. Instead
-the generated C(++) for the symbol's body needs to be cached too and
-inserted back into the produced C file. This approach seems to deal with
-all the outlined problems above.
-
-
-Debugging Nim's memory management
-=================================
-
-The following paragraphs are mostly a reminder for myself. Things to keep
-in mind:
-
-* If an assertion in Nim's memory manager or GC fails, the stack trace
-  keeps allocating memory! Thus a stack overflow may happen, hiding the
-  real issue.
-* What seem to be C code generation problems is often a bug resulting from
-  not producing prototypes, so that some types default to `cint`. Testing
-  without the `-w` option helps!
-
-
-The Garbage Collector
-=====================
-
-Introduction
------------
-
-I use the term *cell* here to refer to everything that is traced
-(sequences, refs, strings).
-This section describes how the GC works.
-
-The basic algorithm is *Deferrent Reference Counting* with cycle detection.
-References on the stack are not counted for better performance and easier C
-code generation.
-
-Each cell has a header consisting of a RC and a pointer to its type
-descriptor. However the program does not know about these, so they are placed at
-negative offsets. In the GC code the type `PCell` denotes a pointer
-decremented by the right offset, so that the header can be accessed easily. It
-is extremely important that `pointer` is not confused with a `PCell`
-as this would lead to a memory corruption.
-
-
-The CellSet data structure
--------------------------
-
-The GC depends on an extremely efficient datastructure for storing a
-set of pointers - this is called a `TCellSet` in the source code.
-Inserting, deleting and searching are done in constant time. However,
-modifying a `TCellSet` during traversal leads to undefined behaviour.
-
-.. code-block:: Nim
-  type
-    TCellSet # hidden
-
-  proc cellSetInit(s: var TCellSet) # initialize a new set
-  proc cellSetDeinit(s: var TCellSet) # empty the set and free its memory
-  proc incl(s: var TCellSet, elem: PCell) # include an element
-  proc excl(s: var TCellSet, elem: PCell) # exclude an element
-
-  proc `in`(elem: PCell, s: TCellSet): bool # tests membership
-
-  iterator elements(s: TCellSet): (elem: PCell)
-
-
-All the operations have to perform efficiently. Because a Cellset can
-become huge a hash table alone is not suitable for this.
-
-We use a mixture of bitset and hash table for this. The hash table maps *pages*
-to a page descriptor. The page descriptor contains a bit for any possible cell
-address within this page. So including a cell is done as follows:
-
- Find the page descriptor for the page the cell belongs to.
- Set the appropriate bit in the page descriptor indicating that the
-  cell points to the start of a memory block.
-
-Removing a cell is analogous - the bit has to be set to zero.
-Single page descriptors are never deleted from the hash table. This is not
-needed as the data structures needs to be rebuilt periodically anyway.
-
-Complete traversal is done in this way::
-
-  for each page descriptor d:
-    for each bit in d:
-      if bit == 1:
-        traverse the pointer belonging to this bit
-
-
-Further complications
---------------------
-
-In Nim the compiler cannot always know if a reference
-is stored on the stack or not. This is caused by var parameters.
-Consider this example:
-
-.. code-block:: Nim
-  proc setRef(r: var ref TNode) =
-    new(r)
-
-  proc usage =
-    var
-      r: ref TNode
-    setRef(r) # here we should not update the reference counts, because
-              # r is on the stack
-    setRef(r.left) # here we should update the refcounts!
-
-We have to decide at runtime whether the reference is on the stack or not.
-The generated code looks roughly like this:
-
-.. code-block:: C
-  void setref(TNode** ref) {
-    unsureAsgnRef(ref, newObj(TNode_TI, sizeof(TNode)))
-  }
-  void usage(void) {
-    setRef(&r)
-    setRef(&r->left)
-  }
-
-Note that for systems with a continuous stack (which most systems have)
-the check whether the ref is on the stack is very cheap (only two
-comparisons).
-
-
 Code generation for closures
 ============================

@@ -598,14 +270,14 @@ This should produce roughly this code:

 .. code-block:: nim
  type
-    PEnv = ref object
+    Env = ref object
      x: int # data

-  proc anon(y: int, c: PEnv): int =
+  proc anon(y: int, c: Env): int =
    return y + c.x

  proc add(x: int): tuple[prc, data] =
-    var env: PEnv
+    var env: Env
    new env
    env.x = x
    result = (anon, env)
@@ -630,25 +302,25 @@ This should produce roughly this code:

 .. code-block:: nim
  type
-    PEnvX = ref object
+    EnvX = ref object
      x: int # data

-    PEnvY = ref object
+    EnvY = ref object
      y: int
-      ex: PEnvX
+      ex: EnvX

-  proc lambdaZ(z: int, ey: PEnvY): int =
+  proc lambdaZ(z: int, ey: EnvY): int =
    return ey.ex.x + ey.y + z

-  proc lambdaY(y: int, ex: PEnvX): tuple[prc, data: PEnvY] =
-    var ey: PEnvY
+  proc lambdaY(y: int, ex: EnvX): tuple[prc, data: EnvY] =
+    var ey: EnvY
    new ey
    ey.y = y
    ey.ex = ex
    result = (lambdaZ, ey)

-  proc add(x: int): tuple[prc, data: PEnvX] =
-    var ex: PEnvX
+  proc add(x: int): tuple[prc, data: EnvX] =
+    var ex: EnvX
    ex.x = x
    result = (labmdaY, ex)

@@ -663,17 +335,11 @@ More useful is escape analysis and stack allocation of the environment,
 however.


-Alternative
-----------
-
-Process the closure of all inner procs in one pass and accumulate the
-environments. This is however not always possible.
-
-
 Accumulator
 -----------

 .. code-block:: nim
+
  proc getAccumulator(start: int): proc (): int {.closure} =
    var i = start
    return lambda: int =
@@ -708,20 +374,26 @@ backend somehow. We deal with this by modifying `s.ast[paramPos]` to contain
 the formal hidden parameter, but not `s.typ`!


-Integer literals:
-----------------
+Notes on type and AST representation
+====================================
+
+To be expanded.
+
+
+Integer literals
+----------------

 In Nim, there is a redundant way to specify the type of an
 integer literal. First of all, it should be unsurprising that every
 node has a node kind. The node of an integer literal can be any of the
-following values:
+following values::

    nkIntLit, nkInt8Lit, nkInt16Lit, nkInt32Lit, nkInt64Lit,
    nkUIntLit, nkUInt8Lit, nkUInt16Lit, nkUInt32Lit, nkUInt64Lit

 On top of that, there is also the `typ` field for the type. It the
 kind of the `typ` field can be one of the following ones, and it
-should be matching the literal kind:
+should be matching the literal kind::

    tyInt, tyInt8, tyInt16, tyInt32, tyInt64, tyUInt, tyUInt8,
    tyUInt16, tyUInt32, tyUInt64
@@ -777,6 +449,7 @@ pointing back to the integer literal node in the ast containing the
 integer value. These are the properties that hold true for integer
 literal types.

+::
    n.kind == nkIntLit
    n.typ.kind == tyInt
    n.typ.n == n
--- a/lib/system/cellsets.nim
+++ b/lib/system/cellsets.nim
@@ -7,7 +7,40 @@
 #    distribution, for details about the copyright.
 #

-# Efficient set of pointers for the GC (and repr)
+
+#[
+
+Efficient set of pointers for the GC (and repr)
+-----------------------------------------------
+
+The GC depends on an extremely efficient datastructure for storing a
+set of pointers - this is called a `CellSet` in the source code.
+Inserting, deleting and searching are done in constant time. However,
+modifying a `CellSet` during traversal leads to undefined behaviour.
+
+All operations on a CellSet have to perform efficiently. Because a Cellset can
+become huge a hash table alone is not suitable for this.
+
+We use a mixture of bitset and hash table for this. The hash table maps *pages*
+to a page descriptor. The page descriptor contains a bit for any possible cell
+address within this page. So including a cell is done as follows:
+
+- Find the page descriptor for the page the cell belongs to.
+- Set the appropriate bit in the page descriptor indicating that the
+  cell points to the start of a memory block.
+
+Removing a cell is analogous - the bit has to be set to zero.
+Single page descriptors are never deleted from the hash table. This is not
+needed as the data structures needs to be rebuilt periodically anyway.
+
+Complete traversal is done in this way::
+
+  for each page descriptor d:
+    for each bit in d:
+      if bit == 1:
+        traverse the pointer belonging to this bit
+
+]#

 when defined(gcOrc) or defined(gcArc):
  type
--- a/lib/system/gc.nim
+++ b/lib/system/gc.nim
@@ -12,6 +12,53 @@
 # Refcounting + Mark&Sweep. Complex algorithms avoided.
 # Been there, done that, didn't work.

+#[
+
+A *cell* is anything that is traced by the GC
+(sequences, refs, strings, closures).
+
+The basic algorithm is *Deferrent Reference Counting* with cycle detection.
+References on the stack are not counted for better performance and easier C
+code generation.
+
+Each cell has a header consisting of a RC and a pointer to its type
+descriptor. However the program does not know about these, so they are placed at
+negative offsets. In the GC code the type `PCell` denotes a pointer
+decremented by the right offset, so that the header can be accessed easily. It
+is extremely important that `pointer` is not confused with a `PCell`.
+
+In Nim the compiler cannot always know if a reference
+is stored on the stack or not. This is caused by var parameters.
+Consider this example:
+
+.. code-block:: Nim
+  proc setRef(r: var ref TNode) =
+    new(r)
+
+  proc usage =
+    var
+      r: ref TNode
+    setRef(r) # here we should not update the reference counts, because
+              # r is on the stack
+    setRef(r.left) # here we should update the refcounts!
+
+We have to decide at runtime whether the reference is on the stack or not.
+The generated code looks roughly like this:
+
+.. code-block:: C
+  void setref(TNode** ref) {
+    unsureAsgnRef(ref, newObj(TNode_TI, sizeof(TNode)))
+  }
+  void usage(void) {
+    setRef(&r)
+    setRef(&r->left)
+  }
+
+Note that for systems with a continuous stack (which most systems have)
+the check whether the ref is on the stack is very cheap (only two
+comparisons).
+]#
+
 {.push profiler:off.}

 const