Files
Nim/doc/manual/trmacros.txt
2016-07-30 16:34:42 +02:00

368 lines
11 KiB
Plaintext

Term rewriting macros
=====================
Term rewriting macros are macros or templates that have not only
a *name* but also a *pattern* that is searched for after the semantic checking
phase of the compiler: This means they provide an easy way to enhance the
compilation pipeline with user defined optimizations:
.. code-block:: nim
template optMul{`*`(a, 2)}(a: int): int = a+a
let x = 3
echo x * 2
The compiler now rewrites ``x * 2`` as ``x + x``. The code inside the
curlies is the pattern to match against. The operators ``*``, ``**``,
``|``, ``~`` have a special meaning in patterns if they are written in infix
notation, so to match verbatim against ``*`` the ordinary function call syntax
needs to be used.
Unfortunately optimizations are hard to get right and even the tiny example
is **wrong**:
.. code-block:: nim
template optMul{`*`(a, 2)}(a: int): int = a+a
proc f(): int =
echo "side effect!"
result = 55
echo f() * 2
We cannot duplicate 'a' if it denotes an expression that has a side effect!
Fortunately Nim supports side effect analysis:
.. code-block:: nim
template optMul{`*`(a, 2)}(a: int{noSideEffect}): int = a+a
proc f(): int =
echo "side effect!"
result = 55
echo f() * 2 # not optimized ;-)
You can make one overload matching with a constraint and one without, and the
one with a constraint will have precedence, and so you can handle both cases
differently.
So what about ``2 * a``? We should tell the compiler ``*`` is commutative. We
cannot really do that however as the following code only swaps arguments
blindly:
.. code-block:: nim
template mulIsCommutative{`*`(a, b)}(a, b: int): int = b*a
What optimizers really need to do is a *canonicalization*:
.. code-block:: nim
template canonMul{`*`(a, b)}(a: int{lit}, b: int): int = b*a
The ``int{lit}`` parameter pattern matches against an expression of
type ``int``, but only if it's a literal.
Parameter constraints
---------------------
The `parameter constraint`:idx: expression can use the operators ``|`` (or),
``&`` (and) and ``~`` (not) and the following predicates:
=================== =====================================================
Predicate Meaning
=================== =====================================================
``atom`` The matching node has no children.
``lit`` The matching node is a literal like "abc", 12.
``sym`` The matching node must be a symbol (a bound
identifier).
``ident`` The matching node must be an identifier (an unbound
identifier).
``call`` The matching AST must be a call/apply expression.
``lvalue`` The matching AST must be an lvalue.
``sideeffect`` The matching AST must have a side effect.
``nosideeffect`` The matching AST must have no side effect.
``param`` A symbol which is a parameter.
``genericparam`` A symbol which is a generic parameter.
``module`` A symbol which is a module.
``type`` A symbol which is a type.
``var`` A symbol which is a variable.
``let`` A symbol which is a ``let`` variable.
``const`` A symbol which is a constant.
``result`` The special ``result`` variable.
``proc`` A symbol which is a proc.
``method`` A symbol which is a method.
``iterator`` A symbol which is an iterator.
``converter`` A symbol which is a converter.
``macro`` A symbol which is a macro.
``template`` A symbol which is a template.
``field`` A symbol which is a field in a tuple or an object.
``enumfield`` A symbol which is a field in an enumeration.
``forvar`` A for loop variable.
``label`` A label (used in ``block`` statements).
``nk*`` The matching AST must have the specified kind.
(Example: ``nkIfStmt`` denotes an ``if`` statement.)
``alias`` States that the marked parameter needs to alias
with *some* other parameter.
``noalias`` States that *every* other parameter must not alias
with the marked parameter.
=================== =====================================================
Predicates that share their name with a keyword have to be escaped with
backticks: `` `const` ``.
The ``alias`` and ``noalias`` predicates refer not only to the matching AST,
but also to every other bound parameter; syntactically they need to occur after
the ordinary AST predicates:
.. code-block:: nim
template ex{a = b + c}(a: int{noalias}, b, c: int) =
# this transformation is only valid if 'b' and 'c' do not alias 'a':
a = b
inc a, c
Pattern operators
-----------------
The operators ``*``, ``**``, ``|``, ``~`` have a special meaning in patterns
if they are written in infix notation.
The ``|`` operator
~~~~~~~~~~~~~~~~~~
The ``|`` operator if used as infix operator creates an ordered choice:
.. code-block:: nim
template t{0|1}(): untyped = 3
let a = 1
# outputs 3:
echo a
The matching is performed after the compiler performed some optimizations like
constant folding, so the following does not work:
.. code-block:: nim
template t{0|1}(): untyped = 3
# outputs 1:
echo 1
The reason is that the compiler already transformed the 1 into "1" for
the ``echo`` statement. However, a term rewriting macro should not change the
semantics anyway. In fact they can be deactivated with the ``--patterns:off``
command line option or temporarily with the ``patterns`` pragma.
The ``{}`` operator
~~~~~~~~~~~~~~~~~~~
A pattern expression can be bound to a pattern parameter via the ``expr{param}``
notation:
.. code-block:: nim
template t{(0|1|2){x}}(x: untyped): untyped = x+1
let a = 1
# outputs 2:
echo a
The ``~`` operator
~~~~~~~~~~~~~~~~~~
The ``~`` operator is the **not** operator in patterns:
.. code-block:: nim
template t{x = (~x){y} and (~x){z}}(x, y, z: bool) =
x = y
if x: x = z
var
a = false
b = true
c = false
a = b and c
echo a
The ``*`` operator
~~~~~~~~~~~~~~~~~~
The ``*`` operator can *flatten* a nested binary expression like ``a & b & c``
to ``&(a, b, c)``:
.. code-block:: nim
var
calls = 0
proc `&&`(s: varargs[string]): string =
result = s[0]
for i in 1..len(s)-1: result.add s[i]
inc calls
template optConc{ `&&` * a }(a: string): untyped = &&a
let space = " "
echo "my" && (space & "awe" && "some " ) && "concat"
# check that it's been optimized properly:
doAssert calls == 1
The second operator of `*` must be a parameter; it is used to gather all the
arguments. The expression ``"my" && (space & "awe" && "some " ) && "concat"``
is passed to ``optConc`` in ``a`` as a special list (of kind ``nkArgList``)
which is flattened into a call expression; thus the invocation of ``optConc``
produces:
.. code-block:: nim
`&&`("my", space & "awe", "some ", "concat")
The ``**`` operator
~~~~~~~~~~~~~~~~~~~
The ``**`` is much like the ``*`` operator, except that it gathers not only
all the arguments, but also the matched operators in reverse polish notation:
.. code-block:: nim
import macros
type
Matrix = object
dummy: int
proc `*`(a, b: Matrix): Matrix = discard
proc `+`(a, b: Matrix): Matrix = discard
proc `-`(a, b: Matrix): Matrix = discard
proc `$`(a: Matrix): string = result = $a.dummy
proc mat21(): Matrix =
result.dummy = 21
macro optM{ (`+`|`-`|`*`) ** a }(a: Matrix): untyped =
echo treeRepr(a)
result = newCall(bindSym"mat21")
var x, y, z: Matrix
echo x + y * z - x
This passes the expression ``x + y * z - x`` to the ``optM`` macro as
an ``nnkArgList`` node containing::
Arglist
Sym "x"
Sym "y"
Sym "z"
Sym "*"
Sym "+"
Sym "x"
Sym "-"
(Which is the reverse polish notation of ``x + y * z - x``.)
Parameters
----------
Parameters in a pattern are type checked in the matching process. If a
parameter is of the type ``varargs`` it is treated specially and it can match
0 or more arguments in the AST to be matched against:
.. code-block:: nim
template optWrite{
write(f, x)
((write|writeLine){w})(f, y)
}(x, y: varargs[untyped], f: File, w: untyped) =
w(f, x, y)
Example: Partial evaluation
---------------------------
The following example shows how some simple partial evaluation can be
implemented with term rewriting:
.. code-block:: nim
proc p(x, y: int; cond: bool): int =
result = if cond: x + y else: x - y
template optP1{p(x, y, true)}(x, y: untyped): untyped = x + y
template optP2{p(x, y, false)}(x, y: untyped): untyped = x - y
Example: Hoisting
-----------------
The following example shows how some form of hoisting can be implemented:
.. code-block:: nim
import pegs
template optPeg{peg(pattern)}(pattern: string{lit}): Peg =
var gl {.global, gensym.} = peg(pattern)
gl
for i in 0 .. 3:
echo match("(a b c)", peg"'(' @ ')'")
echo match("W_HI_Le", peg"\y 'while'")
The ``optPeg`` template optimizes the case of a peg constructor with a string
literal, so that the pattern will only be parsed once at program startup and
stored in a global ``gl`` which is then re-used. This optimization is called
hoisting because it is comparable to classical loop hoisting.
AST based overloading
=====================
Parameter constraints can also be used for ordinary routine parameters; these
constraints affect ordinary overloading resolution then:
.. code-block:: nim
proc optLit(a: string{lit|`const`}) =
echo "string literal"
proc optLit(a: string) =
echo "no string literal"
const
constant = "abc"
var
variable = "xyz"
optLit("literal")
optLit(constant)
optLit(variable)
However, the constraints ``alias`` and ``noalias`` are not available in
ordinary routines.
Move optimization
-----------------
The ``call`` constraint is particularly useful to implement a move
optimization for types that have copying semantics:
.. code-block:: nim
proc `[]=`*(t: var Table, key: string, val: string) =
## puts a (key, value)-pair into `t`. The semantics of string require
## a copy here:
let idx = findInsertionPosition(key)
t[idx].key = key
t[idx].val = val
proc `[]=`*(t: var Table, key: string{call}, val: string{call}) =
## puts a (key, value)-pair into `t`. Optimized version that knows that
## the strings are unique and thus don't need to be copied:
let idx = findInsertionPosition(key)
shallowCopy t[idx].key, key
shallowCopy t[idx].val, val
var t: Table
# overloading resolution ensures that the optimized []= is called here:
t[f()] = g()