mirror of
https://github.com/nim-lang/Nim.git
synced 2025-12-29 01:14:41 +00:00
276 lines
8.2 KiB
Plaintext
276 lines
8.2 KiB
Plaintext
=========================================================
|
|
Term rewriting macros for Nimrod
|
|
=========================================================
|
|
|
|
:Author: Andreas Rumpf
|
|
|
|
Term rewriting macros are macros or templates that have not only a *name* but
|
|
also a *pattern* that is searched for after the semantic checking phase of
|
|
the compiler: This means they provide an easy way to enhance the compilation
|
|
pipeline with user defined optimizations:
|
|
|
|
.. code-block:: nimrod
|
|
template optMul{`*`(a, 2)}(a: int): int = a+a
|
|
|
|
let x = 3
|
|
echo x * 2
|
|
|
|
The compiler now rewrites ``x * 2`` as ``x + x``. The code inside the
|
|
curlies is the pattern to match against. The operators ``*``, ``**``,
|
|
``|``, ``~`` have a special meaning in patterns if they are written in infix
|
|
notation, so to match verbatim against ``*`` the ordinary function call syntax
|
|
needs to be used.
|
|
|
|
|
|
Unfortunately optimizations are hard to get right and even the tiny example
|
|
is **wrong**:
|
|
|
|
.. code-block:: nimrod
|
|
template optMul{`*`(a, 2)}(a: int): int = a+a
|
|
|
|
proc f(): int =
|
|
echo "side effect!"
|
|
result = 55
|
|
|
|
echo f() * 2
|
|
|
|
We cannot duplicate 'a' if it denotes an expression that has a side effect!
|
|
Fortunately Nimrod supports side effect analysis:
|
|
|
|
.. code-block:: nimrod
|
|
template optMul{`*`(a, 2)}(a: int{noSideEffect}): int = a+a
|
|
|
|
proc f(): int =
|
|
echo "side effect!"
|
|
result = 55
|
|
|
|
echo f() * 2 # not optimized ;-)
|
|
|
|
So what about ``2 * a``? We should tell the compiler ``*`` is commutative. We
|
|
cannot really do that however as the following code only swaps arguments
|
|
blindly:
|
|
|
|
.. code-block:: nimrod
|
|
template mulIsCommutative{`*`(a, b)}(a, b: int): int = b*a
|
|
|
|
What optimizers really need to do is a *canonicalization*:
|
|
|
|
.. code-block:: nimrod
|
|
template canonMul{`*`(a, b)}(a: int{lit}, b: int): int = b*a
|
|
|
|
The ``int{lit}`` parameter pattern matches against an expression of
|
|
type ``int``, but only if it's a literal.
|
|
|
|
|
|
|
|
Parameter constraints
|
|
=====================
|
|
|
|
The parameter constraint expression can use the operators ``|`` (or),
|
|
``&`` (and) and ``~`` (not) and the following predicates:
|
|
|
|
=================== =====================================================
|
|
Predicate Meaning
|
|
=================== =====================================================
|
|
``atom`` The matching node has no children.
|
|
``lit`` The matching node is a literal like "abc", 12.
|
|
``sym`` The matching node must be a symbol (a bound
|
|
identifier).
|
|
``ident`` The matching node must be an identifier (an unbound
|
|
identifier).
|
|
``call`` The matching AST must be a call/apply expression.
|
|
``lvalue`` The matching AST must be an lvalue.
|
|
``sideeffect`` The matching AST must have a side effect.
|
|
``nosideeffect`` The matching AST must have no side effect.
|
|
``param`` A symbol which is a parameter.
|
|
``genericparam`` A symbol which is a generic parameter.
|
|
``module`` A symbol which is a module.
|
|
``type`` A symbol which is a type.
|
|
``var`` A symbol which is a variable.
|
|
``let`` A symbol which is a ``let`` variable.
|
|
``const`` A symbol which is a constant.
|
|
``result`` The special ``result`` variable.
|
|
``proc`` A symbol which is a proc.
|
|
``method`` A symbol which is a method.
|
|
``iterator`` A symbol which is an iterator.
|
|
``converter`` A symbol which is a converter.
|
|
``macro`` A symbol which is a macro.
|
|
``template`` A symbol which is a template.
|
|
``field`` A symbol which is a field in a tuple or an object.
|
|
``enumfield`` A symbol which is a field in an enumeration.
|
|
``forvar`` A for loop variable.
|
|
``label`` A label (used in ``block`` statements).
|
|
``nk*`` The matching AST must have the specified kind.
|
|
(Example: ``nkIfStmt`` denotes an ``if`` statement.)
|
|
``alias`` States that the marked parameter needs to alias
|
|
with *some* other parameter.
|
|
``noalias`` States that *every* other parameter must not alias
|
|
with the marked parameter.
|
|
=================== =====================================================
|
|
|
|
The ``alias`` and ``noalias`` predicates refer not only to the matching AST,
|
|
but also to every other bound parameter; syntactially they need to occur after
|
|
the ordinary AST predicates:
|
|
|
|
.. code-block:: nimrod
|
|
template ex{a = b + c}(a: int{noalias}, b, c: int) =
|
|
# this transformation is only valid if 'b' and 'c' do not alias 'a':
|
|
a = b
|
|
inc a, b
|
|
|
|
|
|
Pattern operators
|
|
=================
|
|
|
|
The operators ``*``, ``**``, ``|``, ``~`` have a special meaning in patterns
|
|
if they are written in infix notation.
|
|
|
|
|
|
The ``|`` operator
|
|
------------------
|
|
|
|
The ``|`` operator if used as infix operator creates an ordered choice:
|
|
|
|
.. code-block:: nimrod
|
|
template t{0|1}(): expr = 3
|
|
let a = 1
|
|
# outputs 3:
|
|
echo a
|
|
|
|
The matching is performed after the compiler performed some optimizations like
|
|
constant folding, so the following does not work:
|
|
|
|
.. code-block:: nimrod
|
|
template t{0|1}(): expr = 3
|
|
# outputs 1:
|
|
echo 1
|
|
|
|
The reason is that the compiler already transformed the 1 into "1" for
|
|
the ``echo`` statement. However, a term rewriting macro should not change the
|
|
semantics anyway. In fact they can be deactived with the ``--patterns:off``
|
|
command line option or temporarily with the ``patterns`` pragma.
|
|
|
|
|
|
The ``{}`` operator
|
|
-------------------
|
|
|
|
A pattern expression can be bound to a pattern parameter via the ``expr{param}``
|
|
notation:
|
|
|
|
.. code-block:: nimrod
|
|
template t{(0|1|2){x}}(x: expr): expr = x+1
|
|
let a = 1
|
|
# outputs 2:
|
|
echo a
|
|
|
|
|
|
The ``~`` operator
|
|
------------------
|
|
|
|
The ``~`` operator is the **not** operator in patterns:
|
|
|
|
.. code-block:: nimrod
|
|
template t{x = (~x){y} and (~x){z}}(x, y, z: bool): stmt =
|
|
x = y
|
|
if x: x = z
|
|
|
|
var
|
|
a = false
|
|
b = true
|
|
c = false
|
|
a = b and c
|
|
echo a
|
|
|
|
|
|
The ``*`` operator
|
|
------------------
|
|
|
|
The ``*`` operator can *flatten* a nested binary expression like ``a & b & c``
|
|
to ``&(a, b, c)``:
|
|
|
|
.. code-block:: nimrod
|
|
var
|
|
calls = 0
|
|
|
|
proc `&&`(s: varargs[string]): string =
|
|
result = s[0]
|
|
for i in 1..len(s)-1: result.add s[i]
|
|
inc calls
|
|
|
|
template optConc{ `&&` * a }(a: string): expr = &&a
|
|
|
|
let space = " "
|
|
echo "my" && (space & "awe" && "some " ) && "concat"
|
|
|
|
# check that it's been optimized properly:
|
|
doAssert calls == 1
|
|
|
|
|
|
The second operator of `*` must be a parameter; it is used to gather all the
|
|
arguments. The expression ``"my" && (space & "awe" && "some " ) && "concat"``
|
|
is passed to ``optConc`` in ``a`` as a special list (of kind ``nkArgList``)
|
|
which is flattened into a call expression; thus the invocation of ``optConc``
|
|
produces:
|
|
|
|
.. code-block:: nimrod
|
|
`&&`("my", space & "awe", "some ", "concat")
|
|
|
|
|
|
The ``**`` operator
|
|
-------------------
|
|
|
|
The ``**`` is much like the ``*`` operator, except that it gathers not only
|
|
all the arguments, but also the matched operators in reverse polish notation:
|
|
|
|
.. code-block:: nimrod
|
|
import macros
|
|
|
|
type
|
|
TMatrix = object
|
|
dummy: int
|
|
|
|
proc `*`(a, b: TMatrix): TMatrix = nil
|
|
proc `+`(a, b: TMatrix): TMatrix = nil
|
|
proc `-`(a, b: TMatrix): TMatrix = nil
|
|
proc `$`(a: TMatrix): string = result = $a.dummy
|
|
proc mat21(): TMatrix =
|
|
result.dummy = 21
|
|
|
|
macro optM{ (`+`|`-`|`*`) ** a }(a: TMatrix): expr =
|
|
echo treeRepr(a)
|
|
result = newCall(bindSym"mat21")
|
|
|
|
var x, y, z: TMatrix
|
|
|
|
echo x + y * z - x
|
|
|
|
This passes the expression ``x + y * z - x`` to the ``optM`` macro as
|
|
an ``nnkArgList`` node containing::
|
|
|
|
Arglist
|
|
Sym "x"
|
|
Sym "y"
|
|
Sym "z"
|
|
Sym "*"
|
|
Sym "+"
|
|
Sym "x"
|
|
Sym "-"
|
|
|
|
(Which is the reverse polish notation of ``x + y * z - x``.)
|
|
|
|
|
|
Parameters
|
|
==========
|
|
|
|
Parameters in a pattern are type checked in the matching process. If a
|
|
parameter is of the type ``varargs`` it is treated specially and it can match
|
|
0 or more arguments in the AST to be matched against:
|
|
|
|
.. code-block:: nimrod
|
|
template optWrite{
|
|
write(f, x)
|
|
((write|writeln){w})(f, y)
|
|
}(x, y: varargs[expr], f: TFile, w: expr) =
|
|
w(f, x, y)
|
|
|