version 0.7.4

Andreas Rumpf
2009-01-07 17:03:25 +01:00
parent 1c8ddca7e0
commit 439aa2d04d
114 changed files with 13664 additions and 10110 deletions


@@ -3,25 +3,28 @@
The documentation consists of several documents:
- | `First steps after installation <steps.html>`_
| Read this after installation for a quick introduction.
- | `Nimrod tutorial (part I) <tut1.html>`_
| The Nimrod tutorial part one deals with the basics.
- | `Nimrod tutorial (part II) <tut2.html>`_
| The Nimrod tutorial part two deals with the advanced language constructs.
- | `Nimrod manual <manual.html>`_
| Read this to get to know the Nimrod programming system.
| The Nimrod manual is a draft that will evolve into a proper specification.
- | `User guide for the Nimrod Compiler <nimrodc.html>`_
| The user guide lists command line arguments, Nimrodc's special features, etc.
| The user guide lists command line arguments, special features of the
compiler, etc.
- | `User guide for the Embedded Nimrod Debugger <endb.html>`_
| This document describes how to use the Embedded debugger. The embedded
debugger currently has no GUI. Please help!
| This document describes how to use the Embedded Debugger.
- | `Nimrod library documentation <lib.html>`_
| This document describes Nimrod's standard library.
- | `Nimrod internal documentation <intern.html>`_
| The internal documentation describes how the compiler is implemented. Read
this if you want to hack the compiler or develop advanced macros.
this if you want to hack the compiler.
- | `Index <theindex.html>`_
| The generated index. Often the quickest way to find the piece of


@@ -1,41 +1,54 @@
Short description of Nimrod's modules
-------------------------------------
==============  ==========================================================
Module          Description
==============  ==========================================================
lexbase         buffer handling of the lexical analyser
scanner         lexical analyser
ast             type definitions of the abstract syntax tree (AST) and
                node constructors
astalgo         algorithms for containers of AST nodes; converting the
                AST to YAML; the symbol table
passes          implements the pass manager for passes over the AST
trees           a few algorithms for nodes; this module is less important
types           module for traversing type graphs; also contains several
                helpers for dealing with types
sigmatch        contains the matching algorithm that is used for proc
                calls
semexprs        contains the semantic checking phase for expressions
semstmts        contains the semantic checking phase for statements
semtypes        contains the semantic checking phase for types
idents          implements a general mapping from identifiers to an internal
                representation (``PIdent``) that is used, so that a simple
                id-comparison suffices to say whether two Nimrod identifiers
                are equivalent
ropes           implements long strings represented as trees for
                lazy evaluation; used mainly by the code generators
ccgobj          contains type definitions needed for C code generation
                and some helpers
ccgutils        contains helpers for the C code generator
ccgtypes        the generator for C types
ccgstmts        the generator for statements
ccgexprs        the generator for expressions
extccomp        this module calls the C compiler and linker; interesting
                if you want to add support for a new C compiler
==============  ==========================================================
Short description of Nimrod's modules
-------------------------------------
==============  ==========================================================
Module          Description
==============  ==========================================================
nimrod          main module: parses the command line and calls
                ``main.MainCommand``
main            implements the top-level command dispatching
lexbase         buffer handling of the lexical analyser
scanner         lexical analyser
pnimsyn         Nimrod's parser
rnimsyn         Nimrod code renderer (AST back to its textual form)
paslex          lexer for Pascal
pasparse        parser for Pascal; Pascal's advanced OO features are not
                supported
options         contains global and local compiler options
ast             type definitions of the abstract syntax tree (AST) and
                node constructors
astalgo         algorithms for containers of AST nodes; converting the
                AST to YAML; the symbol table
passes          implements the pass manager for passes over the AST
trees           a few algorithms for nodes; this module is less important
types           module for traversing type graphs; also contains several
                helpers for dealing with types
sigmatch        contains the matching algorithm that is used for proc
                calls
semexprs        contains the semantic checking phase for expressions
semstmts        contains the semantic checking phase for statements
semtypes        contains the semantic checking phase for types
semfold         contains code to deal with constant folding
evals           contains an AST interpreter for compile time evaluation
pragmas         semantic checking of pragmas
idents          implements a general mapping from identifiers to an internal
                representation (``PIdent``) that is used, so that a simple
                id-comparison suffices to say whether two Nimrod identifiers
                are equivalent
ropes           implements long strings represented as trees for
                lazy evaluation; used mainly by the code generators
transf          transformations on the AST that need to be done before
                code generation
cgen            main file of the C code generator
ccgutils        contains helpers for the C code generator
ccgtypes        the generator for C types
ccgstmts        the generator for statements
ccgexprs        the generator for expressions
extccomp        this module calls the C compiler and linker; interesting
                if you want to add support for a new C compiler
==============  ==========================================================
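
The ``idents`` entry above describes identifier interning: each identifier is mapped once to an internal representation, and afterwards equivalence is a plain id comparison. A minimal sketch in C follows. It is an illustration only, not the compiler's code: ``IdentTable`` and ``getIdent`` are hypothetical names, the linear search stands in for whatever lookup structure the real module uses, and the normalization rule (ignore case and underscores) is an assumption about what "equivalent" means.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { MAX_IDENTS = 256, MAX_LEN = 64 };

/* Hypothetical intern table: slot index == identifier id. */
typedef struct IdentTable {
  char names[MAX_IDENTS][MAX_LEN];  /* normalized spellings */
  int count;
} IdentTable;

/* ASSUMPTION: two identifiers are equivalent if they match after
   dropping underscores and lowercasing. Over-long names are truncated. */
static void normalize(const char *s, char *out, size_t n) {
  size_t j = 0;
  assert(n > 0);
  for (; *s != '\0' && j + 1 < n; s++)
    if (*s != '_') out[j++] = (char)tolower((unsigned char)*s);
  out[j] = '\0';
}

/* Returns the id of s, interning it on first sight; -1 if the table is full. */
static int getIdent(IdentTable *t, const char *s) {
  char key[MAX_LEN];
  int i;
  normalize(s, key, sizeof key);
  for (i = 0; i < t->count; i++)
    if (strcmp(t->names[i], key) == 0) return i;  /* already interned */
  if (t->count == MAX_IDENTS) return -1;
  strcpy(t->names[t->count], key);                /* key fits: both are MAX_LEN */
  return t->count++;
}
```

With a zero-initialized table ``t``, ``getIdent(&t, "fooBar")`` and ``getIdent(&t, "foo_bar")`` return the same id under the assumed rule, so later phases never need to compare strings.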


@@ -1,45 +1,43 @@
module ::= ([COMMENT] [SAD] stmt)*
comma ::= ',' [COMMENT] [IND]
operator ::= OP0 | OR | XOR | AND | OP3 | OP4 | OP5 | IS | ISNOT | IN | NOTIN
| OP6 | DIV | MOD | SHL | SHR | OP7 | NOT
operator ::= OP0 | OR | XOR | AND | OP3 | OP4 | OP5 | OP6 | OP7
| 'is' | 'isnot' | 'in' | 'notin'
| 'div' | 'mod' | 'shl' | 'shr' | 'not'
prefixOperator ::= OP0 | OP3 | OP4 | OP5 | OP6 | OP7 | NOT
prefixOperator ::= OP0 | OP3 | OP4 | OP5 | OP6 | OP7 | 'not'
optInd ::= [COMMENT] [IND]
lowestExpr ::= orExpr ( OP0 optInd orExpr )*
orExpr ::= andExpr ( OR | XOR optInd andExpr )*
andExpr ::= cmpExpr ( AND optInd cmpExpr )*
cmpExpr ::= ampExpr ( OP3 | IS | ISNOT | IN | NOTIN optInd ampExpr )*
ampExpr ::= plusExpr ( OP4 optInd plusExpr )*
plusExpr ::= mulExpr ( OP5 optInd mulExpr )*
mulExpr ::= dollarExpr ( OP6 | DIV | MOD | SHL | SHR optInd dollarExpr )*
dollarExpr ::= primary ( OP7 optInd primary )*
lowestExpr ::= orExpr (OP0 optInd orExpr)*
orExpr ::= andExpr (OR | 'xor' optInd andExpr)*
andExpr ::= cmpExpr ('and' optInd cmpExpr)*
cmpExpr ::= ampExpr (OP3 | 'is' | 'isnot' | 'in' | 'notin' optInd ampExpr)*
ampExpr ::= plusExpr (OP4 optInd plusExpr)*
plusExpr ::= mulExpr (OP5 optInd mulExpr)*
mulExpr ::= dollarExpr (OP6 | 'div' | 'mod' | 'shl' | 'shr' optInd dollarExpr)*
dollarExpr ::= primary (OP7 optInd primary)*
namedTypeOrExpr ::=
DOTDOT [expr]
| expr [EQUALS (expr [DOTDOT expr] | typeDescK | DOTDOT [expr] )
| DOTDOT [expr]]
'..' [expr]
| expr ['=' (expr ['..' expr] | typeDescK | '..' [expr]) | '..' [expr]]
| typeDescK
castExpr ::= CAST BRACKET_LE optInd typeDesc BRACKET_RI
PAR_LE optInd expr PAR_RI
addrExpr ::= ADDR PAR_LE optInd expr PAR_RI
symbol ::= ACC (KEYWORD | IDENT | operator | PAR_LE PAR_RI
| BRACKET_LE BRACKET_RI | EQUALS | literal )+ ACC
castExpr ::= 'cast' '[' optInd typeDesc [SAD] ']' '(' optInd expr [SAD] ')'
addrExpr ::= 'addr' '(' optInd expr ')'
symbol ::= '`' (KEYWORD | IDENT | operator | '(' ')'
| '[' ']' | '=' | literal)+ '`'
| IDENT
primary ::= ( prefixOperator optInd )* ( symbol | constructor |
| castExpr | addrExpr ) (
DOT optInd symbol
#| CURLY_LE namedTypeDescList CURLY_RI
| PAR_LE optInd namedExprList PAR_RI
| BRACKET_LE optInd
[ namedTypeOrExpr (comma namedTypeOrExpr)* [comma] ]
BRACKET_RI
| CIRCUM
| pragma )*
primary ::= (prefixOperator optInd)* (symbol | constructor |
| castExpr | addrExpr) (
'.' optInd symbol
| '(' optInd namedExprList [SAD] ')'
| '[' optInd
[namedTypeOrExpr (comma namedTypeOrExpr)* [comma]]
[SAD] ']'
| '^'
| pragma)*
literal ::= INT_LIT | INT8_LIT | INT16_LIT | INT32_LIT | INT64_LIT
| FLOAT_LIT | FLOAT32_LIT | FLOAT64_LIT
@@ -48,48 +46,42 @@ literal ::= INT_LIT | INT8_LIT | INT16_LIT | INT32_LIT | INT64_LIT
| NIL
constructor ::= literal
| BRACKET_LE optInd colonExprList BRACKET_RI # []-Constructor
| CURLY_LE optInd sliceExprList CURLY_RI # {}-Constructor
| PAR_LE optInd colonExprList PAR_RI # ()-Constructor
| '[' optInd colonExprList [SAD] ']'
| '{' optInd sliceExprList [SAD] '}'
| '(' optInd colonExprList [SAD] ')'
exprList ::= [ expr (comma expr)* [comma] ]
colonExpr ::= expr [':' expr]
colonExprList ::= [colonExpr (comma colonExpr)* [comma]]
colonExpr ::= expr [COLON expr]
colonExprList ::= [ colonExpr (comma colonExpr)* [comma] ]
namedExpr ::= expr ['=' expr]
namedExprList ::= [namedExpr (comma namedExpr)* [comma]]
namedExpr ::= expr [EQUALS expr] # actually this is symbol EQUALS expr|expr
namedExprList ::= [ namedExpr (comma namedExpr)* [comma] ]
sliceExpr ::= expr ['..' expr]
sliceExprList ::= [sliceExpr (comma sliceExpr)* [comma]]
sliceExpr ::= expr [ DOTDOT expr ]
sliceExprList ::= [ sliceExpr (comma sliceExpr)* [comma] ]
anonymousProc ::= LAMBDA paramList [pragma] EQUALS stmt
anonymousProc ::= 'lambda' paramList [pragma] '=' stmt
expr ::= lowestExpr
| anonymousProc
| IF expr COLON expr
(ELIF expr COLON expr)*
ELSE COLON expr
| 'if' expr ':' expr ('elif' expr ':' expr)* 'else' ':' expr
namedTypeDesc ::= typeDescK | expr [EQUALS (typeDescK | expr)]
namedTypeDescList ::= [ namedTypeDesc (comma namedTypeDesc)* [comma] ]
namedTypeDesc ::= typeDescK | expr ['=' (typeDescK | expr)]
namedTypeDescList ::= [namedTypeDesc (comma namedTypeDesc)* [comma]]
qualifiedIdent ::= symbol [ DOT symbol ]
qualifiedIdent ::= symbol ['.' symbol]
typeDescK ::= VAR typeDesc
| REF typeDesc
| PTR typeDesc
| TYPE expr
| TUPLE tupleDesc
| PROC paramList [pragma]
typeDescK ::= 'var' typeDesc
| 'ref' typeDesc
| 'ptr' typeDesc
| 'type' expr
| 'tuple' tupleDesc
| 'proc' paramList [pragma]
typeDesc ::= typeDescK | primary
optSemicolon ::= [SEMICOLON]
macroStmt ::= COLON [stmt] (OF [sliceExprList] COLON stmt
| ELIF expr COLON stmt
| EXCEPT exceptList COLON stmt )*
[ELSE COLON stmt]
macroStmt ::= ':' [stmt] ('of' [sliceExprList] ':' stmt
|'elif' expr ':' stmt
|'except' exceptList ':' stmt )*
['else' ':' stmt]
simpleStmt ::= returnStmt
| yieldStmt
@@ -107,88 +99,91 @@ complexStmt ::= ifStmt | whileStmt | caseStmt | tryStmt | forStmt
| procDecl | iteratorDecl | macroDecl | templateDecl
| constSection | typeSection | whenStmt | varSection
indPush ::= IND # push
indPush ::= IND # and push indentation onto the stack
indPop ::= # pop indentation from the stack
stmt ::= simpleStmt [SAD]
| indPush (complexStmt | simpleStmt)
([SAD] (complexStmt | simpleStmt) )*
DED
([SAD] (complexStmt | simpleStmt))*
DED indPop
exprStmt ::= lowestExpr [EQUALS expr | [expr (comma expr)* [comma]] [macroStmt]]
returnStmt ::= RETURN [expr]
yieldStmt ::= YIELD expr
discardStmt ::= DISCARD expr
raiseStmt ::= RAISE [expr]
breakStmt ::= BREAK [symbol]
continueStmt ::= CONTINUE
ifStmt ::= IF expr COLON stmt (ELIF expr COLON stmt)* [ELSE COLON stmt]
whenStmt ::= WHEN expr COLON stmt (ELIF expr COLON stmt)* [ELSE COLON stmt]
caseStmt ::= CASE expr (OF sliceExprList COLON stmt)*
(ELIF expr COLON stmt)*
[ELSE COLON stmt]
whileStmt ::= WHILE expr COLON stmt
forStmt ::= FOR symbol (comma symbol)* [comma] IN expr [DOTDOT expr] COLON stmt
exceptList ::= [qualifiedIdent (comma qualifiedIdent)* [comma]]
exprStmt ::= lowestExpr ['=' expr | [expr (comma expr)*] [macroStmt]]
returnStmt ::= 'return' [expr]
yieldStmt ::= 'yield' expr
discardStmt ::= 'discard' expr
raiseStmt ::= 'raise' [expr]
breakStmt ::= 'break' [symbol]
continueStmt ::= 'continue'
ifStmt ::= 'if' expr ':' stmt ('elif' expr ':' stmt)* ['else' ':' stmt]
whenStmt ::= 'when' expr ':' stmt ('elif' expr ':' stmt)* ['else' ':' stmt]
caseStmt ::= 'case' expr [':'] ('of' sliceExprList ':' stmt)*
('elif' expr ':' stmt)*
['else' ':' stmt]
whileStmt ::= 'while' expr ':' stmt
forStmt ::= 'for' symbol (comma symbol)* 'in' expr ['..' expr] ':' stmt
exceptList ::= [qualifiedIdent (comma qualifiedIdent)*]
tryStmt ::= TRY COLON stmt
(EXCEPT exceptList COLON stmt)*
[FINALLY COLON stmt]
asmStmt ::= ASM [pragma] (STR_LIT | RSTR_LIT | TRIPLESTR_LIT)
blockStmt ::= BLOCK [symbol] COLON stmt
tryStmt ::= 'try' ':' stmt
('except' exceptList ':' stmt)*
['finally' ':' stmt]
asmStmt ::= 'asm' [pragma] (STR_LIT | RSTR_LIT | TRIPLESTR_LIT)
blockStmt ::= 'block' [symbol] ':' stmt
filename ::= symbol | STR_LIT | RSTR_LIT | TRIPLESTR_LIT
importStmt ::= IMPORT filename (comma filename)* [comma]
includeStmt ::= INCLUDE filename (comma filename)* [comma]
fromStmt ::= FROM filename IMPORT symbol (comma symbol)* [comma]
importStmt ::= 'import' filename (comma filename)*
includeStmt ::= 'include' filename (comma filename)*
fromStmt ::= 'from' filename 'import' symbol (comma symbol)*
pragma ::= CURLYDOT_LE colonExprList (CURLYDOT_RI | CURLY_RI)
pragma ::= '{.' optInd (colonExpr [comma])* [SAD] ('.}' | '}')
param ::= symbol (comma symbol)* [comma] COLON typeDesc
paramList ::= [PAR_LE [param (comma param)* [comma]] PAR_RI] [COLON typeDesc]
param ::= symbol (comma symbol)* ':' typeDesc
paramList ::= ['(' [param (comma param)*] [SAD] ')'] [':' typeDesc]
genericParams ::= BRACKET_LE (symbol [EQUALS typeDesc] )* BRACKET_RI
genericParam ::= symbol [':' typeDesc]
genericParams ::= '[' genericParam (comma genericParam)* [SAD] ']'
procDecl ::= PROC symbol ["*"] [genericParams]
paramList [pragma]
[EQUALS stmt]
macroDecl ::= MACRO symbol ["*"] [genericParams] paramList [pragma]
[EQUALS stmt]
iteratorDecl ::= ITERATOR symbol ["*"] [genericParams] paramList [pragma]
[EQUALS stmt]
templateDecl ::= TEMPLATE symbol ["*"] [genericParams] paramList [pragma]
[EQUALS stmt]
procDecl ::= 'proc' symbol ['*'] [genericParams] paramList [pragma]
['=' stmt]
macroDecl ::= 'macro' symbol ['*'] [genericParams] paramList [pragma]
['=' stmt]
iteratorDecl ::= 'iterator' symbol ['*'] [genericParams] paramList [pragma]
['=' stmt]
templateDecl ::= 'template' symbol ['*'] [genericParams] paramList [pragma]
['=' stmt]
colonAndEquals ::= [COLON typeDesc] EQUALS expr
colonAndEquals ::= [':' typeDesc] '=' expr
constDecl ::= symbol ["*"] [pragma] colonAndEquals [COMMENT | IND COMMENT]
constDecl ::= symbol ['*'] [pragma] colonAndEquals [COMMENT | IND COMMENT]
| COMMENT
constSection ::= CONST indPush constDecl (SAD constDecl)* DED
constSection ::= 'const' indPush constDecl (SAD constDecl)* DED indPop
typeDef ::= typeDesc | objectDef | enumDef
objectField ::= symbol ["*"] [pragma]
objectField ::= symbol ['*'] [pragma]
objectIdentPart ::=
objectField (comma objectField)* [comma] COLON typeDesc [COMMENT|IND COMMENT]
objectField (comma objectField)* ':' typeDesc [COMMENT|IND COMMENT]
objectWhen ::= WHEN expr COLON [COMMENT] objectPart
(ELIF expr COLON [COMMENT] objectPart)*
[ELSE COLON [COMMENT] objectPart]
objectCase ::= CASE expr COLON typeDesc [COMMENT]
(OF sliceExprList COLON [COMMENT] objectPart)*
[ELSE COLON [COMMENT] objectPart]
objectWhen ::= 'when' expr ':' [COMMENT] objectPart
('elif' expr ':' [COMMENT] objectPart)*
['else' ':' [COMMENT] objectPart]
objectCase ::= 'case' expr ':' typeDesc [COMMENT]
('of' sliceExprList ':' [COMMENT] objectPart)*
['else' ':' [COMMENT] objectPart]
objectPart ::= objectWhen | objectCase | objectIdentPart | NIL
| indPush objectPart (SAD objectPart)* DED
tupleDesc ::= BRACKET_LE optInd [param (comma param)* [comma]] BRACKET_RI
objectPart ::= objectWhen | objectCase | objectIdentPart | 'nil'
| indPush objectPart (SAD objectPart)* DED indPop
tupleDesc ::= '[' optInd [param (comma param)*] [SAD] ']'
objectDef ::= OBJECT [pragma] [OF typeDesc] objectPart
enumField ::= symbol [EQUALS expr]
enumDef ::= ENUM [OF typeDesc] (enumField [comma | COMMENT | IND COMMENT])+
objectDef ::= 'object' [pragma] ['of' typeDesc] objectPart
enumField ::= symbol ['=' expr]
enumDef ::= 'enum' ['of' typeDesc] (enumField [comma] [COMMENT | IND COMMENT])+
typeDecl ::= COMMENT
| symbol ["*"] [genericParams] [EQUALS typeDef] [COMMENT | IND COMMENT]
| symbol ['*'] [genericParams] ['=' typeDef] [COMMENT | IND COMMENT]
typeSection ::= TYPE indPush typeDecl (SAD typeDecl)* DED
typeSection ::= 'type' indPush typeDecl (SAD typeDecl)* DED indPop
colonOrEquals ::= COLON typeDesc [EQUALS expr] | EQUALS expr
varField ::= symbol ["*"] [pragma]
varPart ::= symbol (comma symbol)* [comma] colonOrEquals [COMMENT | IND COMMENT]
varSection ::= VAR (varPart
| indPush (COMMENT|varPart) (SAD (COMMENT|varPart))* DED)
colonOrEquals ::= ':' typeDesc ['=' expr] | '=' expr
varField ::= symbol ['*'] [pragma]
varPart ::= symbol (comma symbol)* colonOrEquals [COMMENT | IND COMMENT]
varSection ::= 'var' (varPart
| indPush (COMMENT|varPart)
(SAD (COMMENT|varPart))* DED indPop)


@@ -34,7 +34,6 @@ Path Purpose
on it!
``web`` website of Nimrod; generated by ``koch.py``
from the ``*.txt`` and ``*.tmpl`` files
``koch`` the Koch Build System (written for Nimrod)
``obj`` generated ``*.obj`` files go into here
============ ==============================================
@@ -45,44 +44,76 @@ Bootstrapping the compiler
The compiler is written in a subset of Pascal with special annotations so
that it can be translated to Nimrod code automatically. This conversion is
done by Nimrod itself via the undocumented ``boot`` command. Thus both Nimrod
and Free Pascal can compile the Nimrod compiler.
and Free Pascal can compile the Nimrod compiler. However, the Pascal version
has no garbage collector and leaks memory like crazy! So the Pascal version
should only be used for bootstrapping.
Requirements for bootstrapping:
- Free Pascal (I used version 2.2) [optional]
- Python (should work with version 1.5 or higher)
- Python (should work with version 1.5 or higher) (optional)
- supported C compiler
- C compiler -- one of:
Compiling the compiler is a simple matter of running::
* win32-lcc (currently broken)
* Borland C++ (tested with 5.5; currently broken)
* Microsoft C++
* Digital Mars C++
* Watcom C++ (currently broken)
* GCC
* Intel C++
* Pelles C (currently broken)
* llvm-gcc
koch.py boot
| Compiling the compiler is a simple matter of running:
| ``koch.py boot``
| Or you can compile by hand; this is not difficult.
For a release version use::
If you want to debug the compiler, use the command::
koch.py boot -d:release
koch.py boot --debugger:on
The ``koch.py`` script is Nimrod's maintenance script. It is a replacement for
make and shell scripting with the advantage that it is much more portable.
The ``koch.py`` script is Nimrod's maintenance script: Everything that has
been automated is accessible with it. It is a replacement for make and shell
scripting with the advantage that it is more portable.
If you don't have Python, there is a ``boot`` Nimrod program which does roughly
the same::
nimrod cc boot.nim
./boot [-d:release]
Coding standards
================
Pascal annotations
==================
There are some annotations that the Pascal sources use so that they can
be converted to Nimrod automatically:
The compiler is written in a subset of Pascal with special annotations so
that it can be translated to Nimrod code automatically. As a general rule,
Pascal code that does not translate to Nimrod automatically is forbidden.
``{@discard} <expr>``
Tells the compiler that a ``discard`` statement is needed for Nimrod
here.
``{@cast}typ(expr)``
Tells the compiler that the Pascal conversion is a ``cast`` in Nimrod.
``{@emit <code>}``
Emits ``<code>``. The code fragment needs to be in Pascal syntax.
``{@ignore} <codeA> {@emit <codeB>}``
Ignores ``<codeA>`` and instead emits ``<codeB>`` which needs to be in
Pascal syntax. An empty ``{@emit}`` is possible too (it then only closes
the ``<codeA>`` part).
``record {@tuple}``
Is used to tell the compiler that the record type should be transformed
to a Nimrod tuple type.
``^ {@ptr}``
Is used to tell the compiler that the pointer type should be transformed
to a Nimrod ``ptr`` type. The default is a ``ref`` type.
``'a' + ''``
The idiom ``+''`` is used to tell the compiler that it is a string
literal and not a character literal. (Pascal does not distinguish between
character literals and string literals of length 1.)
``+{&}``
This tells the compiler that Pascal's ``+`` here is a string concatenation
and thus should be converted to ``&``. Note that this is not needed if
any of the operands is a string literal because the compiler then can
figure this out by itself.
``{@set}['a', 'b', 'c']``
Tells the compiler that Pascal's ``[]`` constructor is a set and not an
array. This is only needed if the compiler cannot figure this out for
itself.
Porting to new platforms
@@ -99,7 +130,7 @@ check that the OS, System modules work and recompile Nimrod.
The only case where things aren't as easy is when the garbage
collector needs some assembler tweaking to work. The standard
version of the GC uses C's ``setjmp`` function to store all registers
on the hardware stack. It may be that the new platform needs to
on the hardware stack. It may be necessary for the new platform to
replace this generic code by some assembler code.
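
The ``setjmp`` trick described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the GC's actual code: ``scanRegisters``, ``WordVisitor`` and ``countWord`` are hypothetical names, and the sketch only walks the ``jmp_buf`` itself rather than the whole stack. Whether ``setjmp`` captures every register that may hold a pointer, and in unmodified form, depends on the platform and the C library, which is why a port may need assembler instead.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback: receives one word-sized root candidate at a time. */
typedef void (*WordVisitor)(uintptr_t word, void *arg);

/* Spills the registers into a stack-allocated jmp_buf and hands every
   word of that buffer to visit. Returns the number of words visited. */
static size_t scanRegisters(WordVisitor visit, void *arg) {
  jmp_buf regs;                              /* lives in this stack frame */
  size_t n = sizeof regs / sizeof(uintptr_t);
  size_t i;
  assert(visit != NULL);
  memset(regs, 0, sizeof regs);              /* define any padding bytes */
  (void)setjmp(regs);                        /* store register contents */
  for (i = 0; i < n; i++) {
    uintptr_t w;
    memcpy(&w, (const unsigned char *)regs + i * sizeof w, sizeof w);
    visit(w, arg);                           /* a GC would test: is w a cell? */
  }
  return n;
}

/* Example visitor that only counts the candidates it is shown. */
static void countWord(uintptr_t word, void *arg) {
  (void)word;
  ++*(size_t *)arg;
}
```

A conservative collector would replace ``countWord`` with a check of whether the word looks like the address of a traced cell.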
@@ -132,11 +163,11 @@ The Garbage Collector
Introduction
------------
We use the term *cell* here to refer to everything that is traced
I use the term *cell* here to refer to everything that is traced
(sequences, refs, strings).
This section describes how the new GC works.
The basic algorithm is *Deferred reference counting* with cycle detection.
The basic algorithm is *Deferred Reference Counting* with cycle detection.
References in the stack are not counted for better performance and easier C
code generation.
@@ -170,7 +201,7 @@ modifying a ``TCellSet`` during traversation leads to undefined behaviour.
iterator elements(s: TCellSet): (elem: PCell)
All the operations have to be performed efficiently. Because a Cellset can
All the operations have to perform efficiently. Because a Cellset can
become huge, a hash table alone is not suitable for this.
We use a mixture of bitset and hash table for this. The hash table maps *pages*
@@ -246,16 +277,10 @@ This syntax tree is the interface between the parser and the code generator.
It is essential to understand most of the compiler's code.
In order to compile Nimrod correctly, type-checking has to be separated from
parsing. Otherwise generics would not work. Code generation is done for a
whole module only after it has been checked for semantics.
parsing. Otherwise generics would not work.
.. include:: filelist.txt
The first command line argument selects the backend. Thus the backend is
responsible for calling the parser and semantic checker. However, when
compiling ``import`` or ``include`` statements, the semantic checker needs to
call the backend; this is done by embedding a PBackend into a TContext.
The syntax tree
---------------
@@ -265,7 +290,7 @@ may contain cycles. The AST changes its shape after semantic checking. This
is needed to make life easier for the code generators. See the "ast" module
for the type definitions.
We use the notation ``nodeKind(fields, [sons])`` for describing
I use the notation ``nodeKind(fields, [sons])`` for describing
nodes. ``nodeKind[sons]`` is a short-cut for ``nodeKind([sons])``.
XXX: Description of the language's syntax and the corresponding trees.
@@ -273,12 +298,16 @@ XXX: Description of the language's syntax and the corresponding trees.
How the RTL is compiled
=======================
The system module contains the part of the RTL which needs support by
The ``system`` module contains the part of the RTL which needs support by
compiler magic (and the stuff that needs to be in it because the spec
says so). The C code generator generates the C code for it just like any other
module. However, calls to some procedures like ``addInt`` are inserted by
the CCG. Therefore the module ``magicsys`` contains a table
(``compilerprocs``) with all symbols that are marked as ``compilerproc``.
the CCG. Therefore the module ``magicsys`` contains a table (``compilerprocs``)
with all symbols that are marked as ``compilerproc``. ``compilerprocs`` are
needed by the code generator. A ``magic`` proc is not the same as a
``compilerproc``: A ``magic`` is a proc that needs compiler magic for its
semantic checking, a ``compilerproc`` is a proc that is used by the code
generator.
@@ -290,77 +319,3 @@ underlying C compiler already does all the hard work for us. The problem is the
common runtime library, especially the memory manager. Note that Borland's
Delphi had exactly the same problem. The workaround is to not link the GC with
the Dll and provide an extra runtime dll that needs to be initialized.
How to implement closures
=========================
A closure is a record of a proc pointer and a context ref. The context ref
points to a garbage collected record that contains the needed variables.
An example:
.. code-block:: Nimrod
  type
    TListRec = record
      data: string
      next: ref TListRec
  proc forEach(head: ref TListRec, visitor: proc (s: string) {.closure.}) =
    var it = head
    while it != nil:
      visitor(it.data)
      it = it.next
  proc sayHello() =
    var L = new List(["hallo", "Andreas"])
    var temp = "jup\xff"
    forEach(L, lambda(s: string) =
      io.write(temp)
      io.write(s)
    )
This should become the following in C:
.. code-block:: C
  typedef struct ... /* List type */
  typedef struct closure {
    void (*prc)(string, void*);  /* the proc part */
    void* cl_data;               /* the closure (context) part */
  } closure;
  typedef struct Tcl_data {
    string temp; // all accessed variables are put in here!
  } Tcl_data;
  void forEach(TListRec* head, const closure visitor) {
    TListRec* it = head;
    while (it != NIM_NULL) {
      visitor.prc(it->data, visitor.cl_data);
      it = it->next;
    }
  }
  void printStr(string s, void* cl_data) {
    Tcl_data* x = (Tcl_data*) cl_data;
    io_write(x->temp);
    io_write(s);
  }
  void sayhello() {
    Tcl_data* data = new(...);
    asgnRef(&data->temp, "jup\xff");
    ...
    closure cl;
    cl.prc = printStr;
    cl.cl_data = data;
    forEach(L, cl);
  }
What about nested closures? - There's not much difference: Just put all used
variables in the data record.
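
The scheme described in this section (a proc pointer paired with a context pointer) can be tried out with the following self-contained C sketch. It is a simplified illustration, not what the code generator emits: all names (``Closure``, ``SayHelloEnv``, ``appendStr``) are hypothetical, the context lives on the stack instead of in a garbage collected record, and the output is collected in a buffer instead of being written with ``io.write`` so that the result can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct ListRec {
  const char *data;
  struct ListRec *next;
} ListRec;

/* A closure is a pair: a function pointer plus an untyped context pointer. */
typedef struct Closure {
  void (*prc)(const char *s, void *env);
  void *env;
} Closure;

/* Context record: holds every outer variable the closure body accesses. */
typedef struct SayHelloEnv {
  const char *temp;
  char out[64];  /* collects the output so it can be inspected */
} SayHelloEnv;

/* Calls the closure for every list element, passing the context along. */
static void forEach(const ListRec *head, Closure visitor) {
  const ListRec *it;
  for (it = head; it != NULL; it = it->next)
    visitor.prc(it->data, visitor.env);
}

/* Body of the lambda, lifted to a top-level function. */
static void appendStr(const char *s, void *env) {
  SayHelloEnv *e = (SayHelloEnv *)env;
  size_t used = strlen(e->out);
  snprintf(e->out + used, sizeof e->out - used, "%s%s", e->temp, s);
}

/* Builds the list, the context and the closure, then runs the traversal. */
static void sayHello(char *result, size_t n) {
  ListRec second = {"Andreas", NULL};
  ListRec first = {"hallo", &second};
  SayHelloEnv env = {"> ", ""};
  Closure cl = {appendStr, &env};
  assert(n > 0);
  forEach(&first, cl);
  snprintf(result, n, "%s", env.out);
}
```

Calling ``sayHello(buf, sizeof buf)`` leaves the prefix followed by each list element in ``buf``. A nested closure would simply add its captured variables to the same context record.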

File diff suppressed because it is too large


@@ -12,7 +12,7 @@ Introduction
This document describes the usage of the *Nimrod compiler*
on the different supported platforms. It is not a definition of the Nimrod
programming system (for that, see the Nimrod manual).
programming language (for that, see the manual).
Nimrod is free software; it is licensed under the
`GNU General Public License <gpl.html>`_.
@@ -41,7 +41,7 @@ looks for it in the following directories (in this order):
2. ``$nimrod/config/nimrod.cfg`` (UNIX, Windows)
3. ``/etc/nimrod.cfg`` (UNIX)
The search stops as soon as a configuration file has been found. The reading
The search stops as soon as a configuration file has been found. The reading
of ``nimrod.cfg`` can be suppressed by the ``--skip_cfg`` command line option.
Configuration settings can be overwritten in a project specific
configuration file that is read automatically. This specific file has to
@@ -54,21 +54,13 @@ Command line settings have priority over configuration file settings.
Nimrod's directory structure
----------------------------
The generated files that Nimrod produces all go into a subdirectory called
``nimcache`` in your project directory. This makes it easy to delete all
``nimcache`` in your project directory. This makes it easy to delete all
generated files.
However, the generated C code is not platform independent. C code generated for
Linux does not compile on Windows, for instance. The comment on top of the
C file lists the OS, CPU and CC the file has been compiled for.
The library lies in ``lib``. Directly in the library directory are essential
Nimrod modules like the ``system`` and ``os`` modules. Under ``lib/base``
are additional specialized libraries or interfaces to foreign libraries which
are included in the standard distribution. The ``lib/extra`` directory is
initially empty. Third party libraries should go there. In the default
configuration the compiler always searches for libraries in ``lib``,
``lib/base`` and ``lib/extra``.
Additional Features
===================
@@ -86,8 +78,8 @@ available.
Importc Pragma
~~~~~~~~~~~~~~
The `importc`:idx: pragma provides a means to import a type, a variable, or a
procedure from C. The optional argument is a string containing the C
identifier. If the argument is missing, the C name is the Nimrod
procedure from C. The optional argument is a string containing the C
identifier. If the argument is missing, the C name is the Nimrod
identifier *exactly as spelled*:
.. code-block::
@@ -97,8 +89,8 @@ identifier *exactly as spelled*:
Exportc Pragma
~~~~~~~~~~~~~~
The `exportc`:idx: pragma provides a means to export a type, a variable, or a
procedure to C. The optional argument is a string containing the C
identifier. If the argument is missing, the C name is the Nimrod
procedure to C. The optional argument is a string containing the C
identifier. If the argument is missing, the C name is the Nimrod
identifier *exactly as spelled*:
.. code-block:: Nimrod
@@ -109,7 +101,7 @@ Dynlib Pragma
~~~~~~~~~~~~~
With the `dynlib`:idx: pragma a procedure or a variable can be imported from
a dynamic library (``.dll`` files for Windows, ``lib*.so`` files for UNIX). The
non-optional argument has to be the name of the dynamic library:
non-optional argument has to be the name of the dynamic library:
.. code-block:: Nimrod
proc gtk_image_new(): PGtkWidget {.cdecl, dynlib: "libgtk-x11-2.0.so", importc.}
@@ -128,8 +120,8 @@ the C code. Thus it makes the following possible, for example:
.. code-block:: Nimrod
  var
    EOF {.importc: "EOF", no_decl.}: cint # pretend EOF was a variable, as
                                          # Nimrod does not know its value
    EACCES {.importc, no_decl.}: cint # pretend EACCES was a variable, as
                                      # Nimrod does not know its value
However, the ``header`` pragma is often the better alternative.
@@ -164,14 +156,6 @@ strings automatically:
printf("hallo %s", "world") # "world" will be passed as C string
No_static Pragma
~~~~~~~~~~~~~~~~
The `no_static`:idx: pragma can be applied to almost any symbol and specifies
that it shall not be declared ``static`` in the generated C code. Note that
symbols in the interface part of a module never get declared ``static``, so
only in very special cases this pragma is necessary.
Line_dir Option
~~~~~~~~~~~~~~~
The `line_dir`:idx: option can be turned on or off. If on the generated C code
@@ -216,14 +200,14 @@ The `register`:idx: pragma is for variables only. It declares the variable as
in a hardware register for faster access. C compilers usually ignore this
though and for good reason: Often they do a better job without it anyway.
In highly specific cases (a dispatch loop of a bytecode interpreter for
In highly specific cases (a dispatch loop of a bytecode interpreter for
example) it may provide benefits, though.
Acyclic Pragma
~~~~~~~~~~~~~~
The `acyclic`:idx: pragma can be used for object types to mark them as acyclic
even though they seem to be cyclic. This is an **optimization** for the garbage
even though they seem to be cyclic. This is an **optimization** for the garbage
collector to not consider objects of this type as part of a cycle::
  type
@@ -231,15 +215,31 @@ collector to not consider objects of this type as part of a cycle::
    TNode {.acyclic, final.} = object
      left, right: PNode
      data: string
In the example a tree structure is declared with the ``TNode`` type. Note that
the type definition is recursive, thus the GC has to assume that objects of
this type may form a cyclic graph. The ``acyclic`` pragma passes the
the type definition is recursive, thus the GC has to assume that objects of
this type may form a cyclic graph. The ``acyclic`` pragma passes the
information that this cannot happen to the GC. If the programmer uses the
``acyclic`` pragma for data types that are in reality cyclic, the GC may leak
memory, but nothing worse happens.
Dead_code_elim Pragma
~~~~~~~~~~~~~~~~~~~~~
The `dead_code_elim`:idx: pragma only applies to whole modules: It tells the
compiler to activate (or deactivate) dead code elimination for the module the
pragma appears in.
The ``--dead_code_elim:on`` command line switch has the same effect as marking
any module with ``{.dead_code_elim:on}``. However, for some modules such as
the GTK wrapper it makes sense to *always* turn on dead code elimination -
no matter if it is globally active or not.
Example:

.. code-block:: nimrod

  {.dead_code_elim: on.}

Disabling certain messages
--------------------------
@@ -280,8 +280,8 @@ However, sometimes one has to optimize. Do it in the following order:
This section can only help you with the last item. Note that rewriting parts
of your program in C is *never* necessary to speed up your program, because
everything that can be done in C can be done in Nimrod. Rewriting parts in
assembler *might*.
everything that can be done in C can be done in Nimrod.
Optimizing string handling
--------------------------


@@ -18,8 +18,7 @@ compatible to the original implementation as one would like.
Even though Nimrod's |rst| parser does not parse all constructs, it is pretty
usable. The missing features can easily be circumvented. An indication of this
fact is that Nimrod's
*whole* documentation itself (including this document) is
fact is that Nimrod's *whole* documentation itself (including this document) is
processed by Nimrod's |rst| parser. (Which is an order of magnitude faster than
Docutils' parser.)

File diff suppressed because it is too large Load Diff

1382
doc/tut1.txt Normal file

File diff suppressed because it is too large Load Diff

718
doc/tut2.txt Normal file

@@ -0,0 +1,718 @@
=============================
The Nimrod Tutorial (Part II)
=============================
:Author: Andreas Rumpf
:Version: |nimrodversion|
.. contents::
Introduction
============
"With great power comes great responsibility." -- Spider-man
This document is a tutorial for the advanced constructs of the *Nimrod*
programming language.
Pragmas
=======
Pragmas are Nimrod's method to give the compiler additional information/
commands without introducing a massive number of new keywords. Pragmas are
processed during semantic checking. Pragmas are enclosed in the
special ``{.`` and ``.}`` curly dot brackets. This tutorial does not cover
pragmas. See the `manual <manual.html>`_ or `user guide <nimrodc.html>`_ for
a description of the available pragmas.
Object Oriented Programming
===========================
While Nimrod's support for object oriented programming (OOP) is minimalistic,
powerful OOP techniques can be used. OOP is seen as *one* way to design a
program, not *the only* way. Often a procedural approach leads to simpler
and more efficient code.
Objects
-------
Like tuples, objects are a means to pack different values together in a
structured way. However, objects provide many features that tuples do not:
They provide inheritance and information hiding. Because objects encapsulate
data, the ``()`` tuple constructor cannot be used to construct objects. So
the order of the object's fields is not as important as it is for tuples. The
programmer should provide a proc to initialize the object (this is called
a *constructor*).
Objects have access to their type at runtime. There is an
``is`` operator that can be used to check the object's type:

.. code-block:: nimrod

  type
    TPerson = object of TObject
      name*: string  # the * means that `name` is accessible from other modules
      age: int       # no * means that the field is hidden from other modules

    TStudent = object of TPerson # TStudent inherits from TPerson
      id: int                    # with an id field

  var
    student: TStudent
    person: TPerson
  assert(student is TStudent) # is true

Object fields that should be visible from outside the defining module have to
be marked by ``*``. In contrast to tuples, different object types are
never *equivalent*. New object types can only be defined within a type
section.
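
The constructor convention mentioned above could look like the following
minimal sketch. The proc name ``initPerson`` and the sample values are made up
for illustration; they are not part of the standard library:

.. code-block:: nimrod

  # Sketch only: ``initPerson`` is a hypothetical name.
  type
    TPerson = object of TObject
      name*: string
      age: int

  proc initPerson(p: var TPerson, name: string, age: int) =
    # set every field so that callers never see a half-initialized object
    p.name = name
    p.age = age

  var
    person: TPerson
  initPerson(person, "Peter", 30)
  assert(person.name == "Peter")

Because ``age`` is not exported, code in other modules can only set it through
such a proc.
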
Inheritance is done with the ``object of`` syntax. Multiple inheritance is
currently not supported. If an object type has no suitable ancestor, ``TObject``
should be used as its ancestor, but this is only a convention.
Note that aggregation (*has-a* relation) is often preferable to inheritance
(*is-a* relation) for simple code reuse. Since objects are value types in
Nimrod, aggregation is as efficient as inheritance.
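
A small sketch of aggregation, using the hypothetical types ``TEngine`` and
``TCar`` (neither appears elsewhere in this tutorial):

.. code-block:: nimrod

  # Sketch only: ``TEngine`` and ``TCar`` are made-up example types.
  type
    TEngine = object
      power: int

    TCar = object        # a car *has an* engine
      engine: TEngine    # stored inline as a value, not behind a pointer
      doors: int

  var
    car: TCar
  car.engine.power = 90
  car.doors = 4
  assert(car.engine.power == 90)
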
Mutually recursive types
------------------------
Objects, tuples and references can model quite complex data structures which
depend on each other. This is called *mutually recursive types*. In Nimrod
these types need to be declared within a single type section. Anything else
would require arbitrary symbol lookahead which slows down compilation.
Example:

.. code-block:: nimrod

  type
    PNode = ref TNode # a traced reference to a TNode
    TNode = object
      le, ri: PNode   # left and right subtrees
      sym: ref TSym   # leaves contain a reference to a TSym

    TSym = object     # a symbol
      name: string    # the symbol's name
      line: int       # the line the symbol was declared in
      code: PNode     # the symbol's abstract syntax tree

Type conversions
----------------
Nimrod distinguishes between `type casts`:idx: and `type conversions`:idx:.
Casts are done with the ``cast`` operator and force the compiler to
interpret a bit pattern to be of another type.
Type conversions are a much more polite way to convert a type into another:
They preserve the abstract *value*, not necessarily the *bit-pattern*. If a
type conversion is not possible, the compiler complains or an exception is
raised.
The syntax for type conversions is ``destination_type(expression_to_convert)``
(like an ordinary call):

.. code-block:: nimrod

  proc getID(x: TPerson): int =
    return TStudent(x).id

The ``EInvalidObjectConversion`` exception is raised if ``x`` is not a
``TStudent``.
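
The difference between a conversion and a cast can be sketched as follows.
This is an illustration under the assumption that numeric conversions such as
``float(i)`` are accepted and that a pointer has the same size as an ``int``;
the variable names are made up:

.. code-block:: nimrod

  # Sketch only: assumes ``int`` and ``pointer`` have the same size.
  var
    i = 3
    f = float(i)          # type conversion: the value 3 is kept, the bits change
    p: pointer = addr(i)
    a = cast[int](p)      # type cast: the pointer's bits are read as an integer

  assert(f == 3.0)

The value of ``a`` is whatever address ``i`` happens to have, which is why
casts should be reserved for low level code.
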
Object variants
---------------
Often an object hierarchy is overkill in certain situations where simple
`variant`:idx: types are needed.
An example:

.. code-block:: nimrod

  # This is an example of how an abstract syntax tree could be modelled in Nimrod
  type
    TNodeKind = enum  # the different node types
      nkInt,          # a leaf with an integer value
      nkFloat,        # a leaf with a float value
      nkString,       # a leaf with a string value
      nkAdd,          # an addition
      nkSub,          # a subtraction
      nkIf            # an if statement
    PNode = ref TNode
    TNode = object
      case kind: TNodeKind  # the ``kind`` field is the discriminator
      of nkInt: intVal: int
      of nkFloat: floatVal: float
      of nkString: strVal: string
      of nkAdd, nkSub:
        leftOp, rightOp: PNode
      of nkIf:
        condition, thenPart, elsePart: PNode

  var
    n: PNode
  new(n)  # creates a new node
  n.kind = nkFloat
  n.floatVal = 0.0 # valid, because ``n.kind==nkFloat``

  # the following statement raises an `EInvalidField` exception, because
  # n.kind's value does not fit:
  n.strVal = ""

As can be seen from the example, an advantage over an object hierarchy is that
no conversion between different object types is needed. Yet, access to invalid
object fields raises an exception.
Methods
-------
In ordinary object oriented languages, procedures (also called *methods*) are
bound to a class. This has disadvantages:
* Adding a method to a class the programmer has no control over is
impossible or needs ugly workarounds.
* Often it is unclear where the procedure should belong: Is
``join`` a string method or an array method? Should the complex
``vertexCover`` algorithm really be a method of the ``graph`` class?
Nimrod avoids these problems by not distinguishing between methods and
procedures. Methods are just ordinary procedures. However, there is a special
syntactic sugar for calling procedures: The syntax ``obj.method(args)`` can be
used instead of ``method(obj, args)``. If there are no remaining arguments, the
parentheses can be omitted: ``obj.len`` (instead of ``len(obj)``).
This `method call syntax`:idx: is not restricted to objects, it can be used
for any type:

.. code-block:: nimrod

  echo("abc".len) # is the same as echo(len("abc"))
  echo("abc".toUpper())
  echo({'a', 'b', 'c'}.card)
  stdout.writeln("Hallo") # the same as writeln(stdout, "Hallo")

If it gives you warm fuzzy feelings, you can even write ``1.`+`(2)`` instead of
``1 + 2`` and claim that Nimrod is a pure object oriented language. (That
would not even be lying: *pure OO* has no meaning anyway. :-)
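
The same syntax works for procedures you write yourself. A minimal sketch,
with ``twice`` being a made-up proc:

.. code-block:: nimrod

  # Sketch only: ``twice`` is a hypothetical example proc.
  proc twice(x: int): int =
    return x * 2

  var
    n = 21
  assert(twice(n) == 42)   # ordinary call syntax
  assert(n.twice() == 42)  # method call syntax
  assert(n.twice == 42)    # no remaining arguments: parentheses omitted
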
Properties
----------
As the above example shows, Nimrod has no need for *get-properties*:
Ordinary get-procedures that are called with the *method call syntax* achieve
the same. But setting a value is different; for this a special setter syntax
is needed:

.. code-block:: nimrod

  type
    TSocket* = object of TObject
      FHost: int # cannot be accessed from the outside of the module
                 # the `F` prefix is a convention to avoid clashes since
                 # the accessors are named `host`

  proc `host=`*(s: var TSocket, value: int) {.inline.} =
    ## setter of hostAddr
    s.FHost = value

  proc host*(s: TSocket): int {.inline.} =
    ## getter of hostAddr
    return s.FHost

  var
    s: TSocket
  s.host = 34  # same as `host=`(s, 34)

(The example also shows ``inline`` procedures.)
The ``[]`` array access operator can be overloaded to provide
`array properties`:idx:\ :

.. code-block:: nimrod

  type
    TVector* = object
      x, y, z: float

  proc `[]=`* (v: var TVector, i: int, value: float) =
    # setter
    case i
    of 0: v.x = value
    of 1: v.y = value
    of 2: v.z = value
    else: assert(false)

  proc `[]`* (v: TVector, i: int): float =
    # getter
    case i
    of 0: result = v.x
    of 1: result = v.y
    of 2: result = v.z
    else: assert(false)

The example is silly, since a vector is better modelled by a tuple which
already provides ``v[]`` access.
Dynamic binding
---------------
In Nimrod procedural types are used to implement dynamic binding. The following
example also shows some more conventions: The ``self`` or ``this`` object
is named ``my`` (because it is shorter than the alternatives), each class
provides a constructor, etc.

.. code-block:: nimrod

  type
    TFigure = object of TObject    # abstract base class:
      draw: proc (my: var TFigure) # concrete classes implement this proc

  proc init(f: var TFigure) =
    f.draw = nil

  type
    TCircle = object of TFigure
      radius: int

  proc drawCircle(my: var TCircle) = echo("o " & $my.radius)

  proc init(my: var TCircle) =
    init(TFigure(my)) # call base constructor
    my.radius = 5
    my.draw = drawCircle

  type
    TRectangle = object of TFigure
      width, height: int

  proc drawRectangle(my: var TRectangle) = echo("[]")

  proc init(my: var TRectangle) =
    init(TFigure(my)) # call base constructor
    my.width = 5
    my.height = 10
    my.draw = drawRectangle

  # now use these classes:
  var
    r: TRectangle
    c: TCircle
  init(r)
  init(c)
  r.draw(r)
  c.draw(c)

The last line shows the syntactical difference between static and dynamic
binding: The ``r.draw(r)`` dynamic call refers to ``r`` twice. This difference
is not necessarily bad. But if you want to eliminate the somewhat redundant
``r``, it can be done by using *closures*:

.. code-block:: nimrod

  type
    TFigure = object of TObject    # abstract base class:
      draw: proc () {.closure.}    # concrete classes implement this proc

  proc init(f: var TFigure) =
    f.draw = nil

  type
    TCircle = object of TFigure
      radius: int

  proc init(me: var TCircle) =
    init(TFigure(me)) # call base constructor
    me.radius = 5
    me.draw = lambda () =
                echo("o " & $me.radius)

  type
    TRectangle = object of TFigure
      width, height: int

  proc init(me: var TRectangle) =
    init(TFigure(me)) # call base constructor
    me.width = 5
    me.height = 10
    me.draw = lambda () =
                echo("[]")

  # now use these classes:
  var
    r: TRectangle
    c: TCircle
  init(r)
  init(c)
  r.draw()
  c.draw()

The example also introduces `lambda`:idx: expressions: A ``lambda`` expression
defines a new proc with the ``closure`` calling convention on the fly.
`Version 0.7.4: Closures and lambda expressions are not implemented.`:red:
Exceptions
==========
In Nimrod `exceptions`:idx: are objects. By convention, exception types are
prefixed with an 'E', not 'T'. The ``system`` module defines an exception
hierarchy that you should stick to. Reusing an existing exception type is
often better than defining a new exception type: It avoids a proliferation of
types.
Exceptions should be allocated on the heap because their lifetime is unknown.
A convention is that exceptions should be raised in *exceptional* cases:
For example, if a file cannot be opened, this should not raise an exception
since this is quite common (the file may have been deleted).
Raise statement
---------------
Raising an exception is done with the ``raise`` statement:

.. code-block:: nimrod

  var
    e: ref EOS
  new(e)
  e.msg = "the request to the OS failed"
  raise e

If the ``raise`` keyword is not followed by an expression, the last exception
is *re-raised*.
Try statement
-------------
The `try`:idx: statement handles exceptions:

.. code-block:: nimrod

  # read the first two lines of a text file that should contain numbers
  # and tries to add them
  var
    f: TFile
  if openFile(f, "numbers.txt"):
    try:
      var a = readLine(f)
      var b = readLine(f)
      echo("sum: " & $(parseInt(a) + parseInt(b)))
    except EOverflow:
      echo("overflow!")
    except EInvalidValue:
      echo("could not convert string to integer")
    except EIO:
      echo("IO error!")
    except:
      echo("Unknown exception!")
      # reraise the unknown exception:
      raise
    finally:
      closeFile(f)

The statements after the ``try`` are executed in sequential order unless an
exception is raised. Then the appropriate ``except`` part is executed.
The empty ``except`` part is executed if there is an exception that is
not explicitly listed. It is similar to an ``else`` part in ``if``
statements.
If there is a ``finally`` part, it is always executed after the
exception handlers.
The exception is *consumed* in an ``except`` part. If an exception is not
handled, it is propagated through the call stack. This means that often
the rest of the procedure - that is not within a ``finally`` clause -
is not executed (if an exception occurs).
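
The order in which the parts run can be sketched with a small made-up proc.
``demo`` is a hypothetical name; ``EInvalidValue`` is reused from the example
above:

.. code-block:: nimrod

  # Sketch only: ``demo`` is a hypothetical proc.
  proc demo() =
    try:
      var e: ref EInvalidValue
      new(e)
      e.msg = "bad input"
      raise e
    except EInvalidValue:
      echo("handler runs first")
    finally:
      echo("finally runs afterwards, in every case")
    # reached, because the exception was consumed by the handler:
    echo("execution continues after the try statement")

  demo()

Without the ``except EInvalidValue`` part, the ``finally`` part would still
run, but the last ``echo`` would be skipped because the exception propagates
to the caller.
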
Generics
========
`Version 0.7.4: Complex generic types like in the example do not work.`:red:
`Generics`:idx: are Nimrod's means to parametrize procs, iterators or types
with `type parameters`:idx:. They are most useful for efficient type safe
containers:

.. code-block:: nimrod

  type
    TBinaryTree[T] = object      # TBinaryTree is a generic type
                                 # with generic param ``T``
      le, ri: ref TBinaryTree[T] # left and right subtrees; may be nil
      data: T                    # the data stored in a node
    PBinaryTree*[T] = ref TBinaryTree[T] # type that is exported

  proc newNode*[T](data: T): PBinaryTree[T] =
    # constructor for a node
    new(result)
    result.data = data

  proc add*[T](root: var PBinaryTree[T], n: PBinaryTree[T]) =
    # insert a node into the tree
    if root == nil:
      root = n
    else:
      var it = root
      while it != nil:
        # compare the data items; uses the generic ``cmp`` proc that works for
        # any type that has a ``==`` and ``<`` operator
        var c = cmp(it.data, n.data)
        if c < 0:
          if it.le == nil:
            it.le = n
            return
          it = it.le
        else:
          if it.ri == nil:
            it.ri = n
            return
          it = it.ri

  proc add*[T](root: var PBinaryTree[T], data: T) =
    # convenience proc:
    add(root, newNode(data))

  iterator preorder*[T](root: PBinaryTree[T]): T =
    # Preorder traversal of a binary tree.
    # Since recursive iterators are not yet implemented,
    # this uses an explicit stack (which is more efficient anyway):
    var stack: seq[PBinaryTree[T]] = @[root]
    while stack.len > 0:
      var n = stack[stack.len-1]
      setLen(stack, stack.len-1) # pop `n` off the stack
      while n != nil:
        yield n.data             # the iterator yields the stored ``T`` value
        add(stack, n.ri)         # push right subtree onto the stack
        n = n.le                 # and follow the left pointer

  var
    root: PBinaryTree[string] # instantiate a PBinaryTree with ``string``
  add(root, newNode("hallo")) # instantiates generic procs ``newNode`` and ``add``
  add(root, "world")          # instantiates the second ``add`` proc
  for str in preorder(root):
    stdout.writeln(str)

The example shows a generic binary tree. Depending on context, the brackets are
used either to introduce type parameters or to instantiate a generic proc,
iterator or type. As the example shows, generics work with overloading: The
best match of ``add`` is used. The built-in ``add`` procedure for sequences
is not hidden and used in the ``preorder`` iterator.
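
For comparison, here is a much smaller sketch of a generic proc. ``maximum``
is a made-up name, and the sketch assumes that simple generic procs like this
one are handled by the compiler even where complex generic types are not:

.. code-block:: nimrod

  # Sketch only: ``maximum`` is a hypothetical example proc.
  proc maximum[T](a, b: T): T =
    # works for any type ``T`` that provides a ``<`` operator
    if a < b: result = b
    else: result = a

  assert(maximum(3, 7) == 7)               # instantiated with T = int
  assert(maximum("abc", "abd") == "abd")   # instantiated with T = string
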
Templates
=========
Templates are a simple substitution mechanism that operates on Nimrod's
abstract syntax trees. Templates are processed in the semantic pass of the
compiler. They integrate well with the rest of the language and share none
of the flaws of C's preprocessor macros. However, they may lead to code that is
harder to understand and maintain. So one should use them sparingly.
To *invoke* a template, call it like a procedure.
Example:

.. code-block:: nimrod

  template `!=` (a, b: expr): expr =
    # this definition exists in the System module
    not (a == b)

  assert(5 != 6) # the compiler rewrites that to: assert(not (5 == 6))

The ``!=``, ``>``, ``>=``, ``in``, ``notin``, ``isnot`` operators are in fact
templates: This has the benefit that if you overload the ``==`` operator,
the ``!=`` operator is available automatically and does the right thing.
``a > b`` is transformed into ``b < a``.
``a in b`` is transformed into ``contains(b, a)``.
``notin`` and ``isnot`` have the obvious meanings.
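
The effect of these operator templates on a user-defined type can be sketched
as follows. ``TVersion`` is a made-up type; only ``==`` and ``<`` are written
by hand, while ``!=`` and ``>`` come from the templates described above:

.. code-block:: nimrod

  # Sketch only: ``TVersion`` is a hypothetical example type.
  type
    TVersion = object
      major, minor: int

  proc `==`(a, b: TVersion): bool =
    return a.major == b.major and a.minor == b.minor

  proc `<`(a, b: TVersion): bool =
    return a.major < b.major or (a.major == b.major and a.minor < b.minor)

  var
    old: TVersion
    cur: TVersion
  old.major = 0
  old.minor = 6
  cur.major = 0
  cur.minor = 7

  assert(old != cur)  # rewritten to: not (old == cur)
  assert(cur > old)   # rewritten to: old < cur
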
Templates are especially useful for lazy evaluation purposes. Consider a
simple proc for logging:

.. code-block:: nimrod

  const
    debug = True

  proc log(msg: string) {.inline.} =
    if debug:
      stdout.writeln(msg)

  var
    x = 4
  log("x has the value: " & $x)

This code has a shortcoming: If ``debug`` is set to false someday, the quite
expensive ``$`` and ``&`` operations are still performed! (The argument
evaluation for procedures is said to be *eager*).
Turning the ``log`` proc into a template solves this problem in an elegant way:

.. code-block:: nimrod

  const
    debug = True

  template log(msg: expr): stmt =
    if debug:
      stdout.writeln(msg)

  var
    x = 4
  log("x has the value: " & $x)

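
The lazy evaluation can be made visible with an argument that has a side
effect. In this sketch ``expensive`` is a made-up proc and ``debug`` is set to
false on purpose:

.. code-block:: nimrod

  # Sketch only: ``expensive`` is a hypothetical example proc.
  const
    debug = False

  proc expensive(): string =
    echo("building the message")  # visible side effect
    return "details"

  template log(msg: expr): stmt =
    if debug:
      stdout.writeln(msg)

  log(expensive())  # expands to: if debug: stdout.writeln(expensive())

Since the call to ``expensive`` ends up inside the ``if debug`` branch, it is
not executed while ``debug`` is false. With ``log`` declared as a proc, the
argument would be evaluated before the call in any case.
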
The "types" of templates can be the symbols ``expr`` (stands for *expression*),
``stmt`` (stands for *statement*) or ``typedesc`` (stands for *type
description*). These are not real types; they just help the compiler with parsing.
The template body does not open a new scope. To open a new scope
use a ``block`` statement:

.. code-block:: nimrod

  template declareInScope(x: expr, t: typeDesc): stmt =
    var x: t

  template declareInNewScope(x: expr, t: typeDesc): stmt =
    # open a new scope:
    block:
      var x: t

  declareInScope(a, int)
  a = 42  # works, `a` is known here

  declareInNewScope(b, int)
  b = 42  # does not work, `b` is unknown

Macros
======
If the template mechanism scares you, you will be pleased to hear that
templates are not really necessary: Macros can do anything that templates can
do and much more. Macros are harder to write than templates and even harder
to get right :-). Now that you have been warned, let's see what a macro *is*.
Macros enable advanced compile-time code transformations, but they
cannot change Nimrod's syntax. However, this is no real restriction because
Nimrod's syntax is flexible enough anyway.
`Macros`:idx: can be used to implement `domain specific languages`:idx:.
To write macros, one needs to know how the Nimrod concrete syntax is converted
to an abstract syntax tree (AST). (Unfortunately the AST is not documented yet.)
There are two ways to invoke a macro:
(1) invoking a macro like a procedure call (`expression macros`:idx:)
(2) invoking a macro with the special ``macrostmt`` syntax (`statement macros`:idx:)
Expression Macros
-----------------
The following example implements a powerful ``debug`` command that accepts a
variable number of arguments (this cannot be done with templates):

.. code-block:: nimrod

  # to work with Nimrod syntax trees, we need an API that is defined in the
  # ``macros`` module:
  import macros

  macro debug(n: expr): stmt =
    # `n` is a Nimrod AST that contains the whole macro expression
    # this macro returns a list of statements:
    result = newNimNode(nnkStmtList, n)
    # iterate over any argument that is passed to this macro:
    for i in 1..n.len-1:
      # add a call to the statement list that writes the expression;
      # `toStrLit` converts an AST to its string representation:
      result.add(newCall("write", newIdentNode("stdout"), toStrLit(n[i])))
      # add a call to the statement list that writes ": "
      result.add(newCall("write", newIdentNode("stdout"), newStrLitNode(": ")))
      # add a call to the statement list that writes the expression's value:
      result.add(newCall("writeln", newIdentNode("stdout"), n[i]))

  var
    a: array[0..10, int]
    x = "some string"
  a[0] = 42
  a[1] = 45

  debug(a[0], a[1], x)

The macro call expands to:

.. code-block:: nimrod

  write(stdout, "a[0]")
  write(stdout, ": ")
  writeln(stdout, a[0])

  write(stdout, "a[1]")
  write(stdout, ": ")
  writeln(stdout, a[1])

  write(stdout, "x")
  write(stdout, ": ")
  writeln(stdout, x)

Let's return to the dynamic binding ``r.draw(r)`` notational "problem". Apart
from closures, there is another "solution": Define an infix ``!`` macro
operator which hides it:

.. code-block:: nimrod

  macro `!` (n: expr): expr =
    result = newNimNode(nnkCall, n)
    var dot = newNimNode(nnkDotExpr, n)
    dot.add(n[1])    # obj
    if n[2].kind == nnkCall:
      # transforms ``obj!method(arg1, arg2, ...)`` to
      # ``(obj.method)(obj, arg1, arg2, ...)``
      dot.add(n[2][0]) # method
      result.add(dot)
      result.add(n[1]) # obj
      for i in 1..n[2].len-1:
        result.add(n[2][i])
    else:
      # transforms ``obj!method`` to
      # ``(obj.method)(obj)``
      dot.add(n[2]) # method
      result.add(dot)
      result.add(n[1]) # obj

  r!draw(a, b, c) # will be transformed into ``r.draw(r, a, b, c)``

Great! 20 lines of complex code to save a few keystrokes! Obviously, this is
exactly what you should not do! (But it makes a cool example.)
Statement Macros
----------------
Statement macros are defined just as expression macros. However, they are
invoked by an expression followed by a colon.
The following example outlines a macro that generates a lexical analyser from
regular expressions:

.. code-block:: nimrod

  macro case_token(n: stmt): stmt =
    # creates a lexical analyser from regular expressions
    # ... (implementation is an exercise for the reader :-)
    nil

  case_token: # this colon tells the parser it is a macro statement
  of r"[A-Za-z_]+[A-Za-z_0-9]*":
    return tkIdentifier
  of r"[0-9]+":
    return tkInteger
  of r"[\+\-\*\?]+":
    return tkOperator
  else:
    return tkUnknown



@@ -1,215 +0,0 @@
===========================================
Tutorial of the Nimrod Programming Language
===========================================
:Author: Andreas Rumpf
Motivation
==========
Why yet another programming language?
Look at the trends behind all the new programming languages:
* They try to be dynamic: Dynamic typing, dynamic method binding, etc.
In my opinion most things the dynamic features buy could be achieved
with static means in a more efficient and *understandable* way.
* They depend on big runtime environments which you need to
ship with your program as each new version of these may break compatibility
in subtle ways or you use recently added features - thus forcing your
users to update their runtime environment. Compiled programs where the
executable contains all needed code are simply the better solution.
* They are unsuitable for systems programming: Do you really want to
write an operating system, a device driver or an interpreter in a language
that is just-in-time compiled (or interpreted)?
So what is lacking are *good* systems programming languages. Nimrod is such a
language. It offers the following features:
* It is readable: It reads from left to right (unlike the C-syntax
languages).
* It is strongly and statically typed: This enables the compiler to find
more errors. Static typing also makes programs more *readable*.
* It is compiled. (Currently this is done via compilation to C.)
* It is garbage collected. Big systems need garbage collection. Manual
memory management is also supported through *untraced pointers*.
* It scales because high level features are also available: It has built-in
bit sets, strings, enumerations, objects, arrays and dynamically resizeable
arrays (called *sequences*).
* It has high performance: The current implementation compiles to C
and uses a Deutsch-Bobrow garbage collector together with Christopher's
partial mark-sweep garbage collector leading to excellent execution
speed and a small memory footprint.
* It has real modules with proper interfaces and supports separate
compilation.
* It is portable: It compiles to C and platform specific features have
been separated and documented. So even if your platform is not supported
porting should be easy.
* It is flexible: Although primarily a procedural language, generic,
functional and object-oriented programming is also supported.
* It is easy to learn, easy to use and leads to elegant programs.
* You can link an embedded debugger to your program (ENDB). ENDB is
very easy to use - there is no need to clutter your code with
``echo`` statements for proper debugging.
Introduction
============
This document is a tutorial for the programming language *Nimrod*. It should
be a readable quick tour through the language instead of a dry specification
(which can be found `here <manual.html>`_). This tutorial assumes that
the reader already knows some other programming language such as Pascal. Thus
it is detailed in cases where Nimrod differs from other programming languages
and kept short where Nimrod is more or less the same.
A quick tour through the language
=================================
The first program
-----------------
We start the tour with a modified "hallo world" program:
.. code-block:: Nimrod
# This is a comment
# Standard IO-routines are always accessible
write(stdout, "What's your name? ")
var name: string = readLine(stdin)
write(stdout, "Hi, " & name & "!\n")
Save this code to the file "greeting.nim". Now compile and run it::
nimrod compile --run greeting.nim
As you see, with the ``--run`` switch Nimrod executes the file automatically
after compilation. You can even give your program command line arguments by
appending them after the filename that is to be compiled and run::
nimrod compile --run greeting.nim arg1 arg2
Though it should be pretty obvious what the program does, I will explain the
syntax: Statements which are not indented are executed when the program
starts. Indentation is Nimrod's way of grouping statements. String literals
are enclosed in double quotes. The ``var`` statement declares a new variable
named ``name`` of type ``string`` with the value that is returned by the
``readline`` procedure. Since the compiler knows that ``readline`` returns
a string, you can leave out the type in the declaration. So this will work too:
.. code-block:: Nimrod
var name = readline(stdin)
Note that this is the only form of type inference that exists in Nimrod:
This is because it yields a good compromise between brevity and readability.
The ``&`` operator concatenates strings. ``\n`` stands for the
new line character(s). On several operating systems ``\n`` is represented by
*two* characters: Linefeed and Carriage Return. That is why
*character literals* cannot contain ``\n``. But since Nimrod handles strings
so well, this is a nonissue.
The "hallo world" program contains several identifiers that are already
known to the compiler: ``write``, ``stdout``, ``readLine``, etc. These
built-in items are declared in the system_ module which is implicitly
imported by any other module.
Lexical elements
----------------
Let us look into Nimrod's lexical elements in more detail: Like other
programming languages Nimrod consists of identifiers, keywords, comments,
operators, and other punctuation marks. Case is *insignificant* in Nimrod and
even underscores are ignored: ``This_is_an_identifier`` is the same identifier
as ``ThisIsAnIdentifier``. This feature enables one to use other
people's code without bothering about a naming convention that one does not
like.
String literals are enclosed in double quotes, character literals in single
quotes. There exist also *raw* string and character literals:
.. code-block:: Nimrod
r"C:\program files\nim"
In raw literals the backslash is not an escape character, so they fit
the principle *what you see is what you get*. *Long string literals*
are also available (``""" ... """``); they can span over multiple lines
and the ``\`` is not an escape character either. They are very useful
for embedding SQL code templates for example.
Comments start with ``#`` and run till the end of the line. (Well this is not
quite true, but you should read the manual for a proper explanation.)
... XXX number literals
The usual statements - if, while, for, case
-------------------------------------------
In Nimrod indentation is used to group statements.
An example showing the most common statement types:

.. code-block:: Nimrod

  var name = readLine(stdin)

  if name == "Andreas":
    echo("What a nice name!")
  elif name == "":
    echo("Don't you have a name?")
  else:
    echo("Boring name...")

  for i in 0..length(name)-1:
    if name[i] == 'm':
      echo("hey, there is an *m* in your name!")

  echo("Please give your password: \n")
  var pw = readLine(stdin)

  while pw != "12345":
    echo("Wrong password! Next try: \n")
    pw = readLine(stdin)

  echo("""Login complete!
  What do you want to do?
  delete-everything
  restart-computer
  go-for-a-walk
  """)

  case readline(stdin)
  of "delete-everything", "restart-computer":
    echo("permission denied")
  of "go-for-a-walk":     echo("please yourself")
  else:                   echo("unknown command")

..
Types
-----
Nimrod has a rich type system. This tutorial only gives a few examples. Read
the `manual <manual.html>`_ for further information:
.. code-block:: Nimrod
type
TMyRecord = object
x, y: int
Procedures
----------
Procedures are subroutines. They are declared in this way:
.. code-block:: Nimrod
proc findSubStr(sub: string,
.. _strutils: strutils.html
.. _system: system.html