split from #24425
Matching `tyGenericBody` performs a match on the last child of the
generic body, in this case the uninstantied `tyObject` type. If the
object contains no generic fields, this ends up being the same type as
all instantiated ones, but if it does, a new type is created. This fails
the `sameObjectTypes` check that subtype matching for object types uses.
To fix this, also consider that the pattern type could be the generic
uninstantiated object type of the matched type in subtype matching.
split from #24425
The added test did not work previously. The result of `getTypeImpl` is
the uninstantiated AST of the original type symbol, and the macro
attempts to use this type for the result. To fix the issue, the provided
`typedesc` argument is used instead.
fixes#17958
In `transf`, conversions in subscript expressions are skipped (with
`skipConv`'s rules). This is because array indexing can produce
conversions to the range type that is the array's index type, which
causes a `RangeDefect` rather than an `IndexDefect` (and also
`--rangeChecks` and `--indexChecks` are both considered). However this
causes problems when explicit conversions are used, between types of
different bitsizes, because those also get skipped.
To fix this, we only skip the conversion if:
* it's a hidden (implicit) conversion
* it's a range check conversion (produces `nkChckRange`)
* the subscript is on an array type and the result type of the
conversion has the same bounds as the array index type
And `skipConv` rules also still apply (int/float classification).
Another idea would be to prevent the implicit conversion to the array
index type from being generated. But there is no good way to do this:
matching to the base type instead prevents types like `uint32` from
implicitly converting (i.e. it can convert to `range[0..3]` but not
`int`), and analyzing whether this is an array bound check is easier in
`transf`, since `sigmatch` just produces a type conversion.
The rules for skipping the conversion could also receive some other
tweaks: We could add a rule that changing bitsizes also doesn't skip the
conversion, but this breaks the `uint32` case. We could simplify it to
only removing implicit skips to specifically fix#17958, but this is
wrong in general.
We could also add something like `nkChckIndex` that generates index
errors instead but this is weird when it doesn't have access to the
collection type and it might be overkill.
The lower half of cgen contains the main proc and HCR init code which
cause a large diff, so they are excluded from this PR. In general things
like generated defines, line directives, the stacktrace macros (`nimfr_`
etc) are also not done, since there are not exact equivalents for these
in NIFC (NIFC does generate line directives but based on the NIF line
info mechanism).
fixes#14067, fixes#15004, fixes#19019
Anonymous procs are [added to
scope](8091d76306/compiler/semstmts.nim (L2466))
with the name `:anonymous`. This means that if they have the same
signature in a scope, they can consider each other as redefinitions. To
prevent this, mark their symbols as `sfGenSym` so they do not get added
to scope or cause any name conflicts. The commented out `and not isAnon`
check wouldn't work because `isAnon` would not be true if the proc is
being resemmed, in which case the name field in the proc AST would have
the symbol of the anonymous proc rather than being empty.
There is a separate problem of default values in generic/normal procs
not opening new scopes which is partially responsible for #19019.
This adapts most of ccgstmts to cbuilder, missing C++ exceptions and
`genCaseGeneric`/`genIfForCaseUntil`. Rebased version of #24399 which
was split into some other PRs and since has had conflicts with #24416.
fixes#24415
Since #23978 (which is in 2.0), all generic types that alias to another
type now insert a `tyAlias` layer in their value. However the
`skipGenericAlias` etc code which `sigmatch` uses is not updated for
this, so `tyAlias` is now skipped in these.
The relevant code in sigmatch is:
67ad1ae159/compiler/sigmatch.nim (L1668-L1673)
This behavior is also suspicious IMO, not skipping a structural
`tyGenericInst` alias can be useful for code like #10220, but this is
currently arbitrarily decided based on "depth" and whether the alias is
to another `tyGenericInst` type or not. Maybe in the future we could
enforce the use of a nominal type.
fixes issue described in https://forum.nim-lang.org/t/12579
In #24065 explicit generic parameter matching was made to fail matches
on arguments with unresolved types in generic contexts (the sigmatch
diff, following #24010), similar to what is done for regular calls since
#22029. However unlike regular calls, a failed match in a generic
context for a standalone explicit generic instantiation did not convert
the expression into one with `tyFromExpr` type, which means it would
error immediately given any unresolved parameter. This is now done to
fix the issue.
For explicit generic instantiations on single non-overloaded symbols, a
successful match is still instantiated. For multiple overloads (i.e.
symchoice), if any of the overloads fail the match, the entire
expression is considered untyped and any instantiations are not used, so
as to not void overloads that would match later. This means even
symchoices without unresolved arguments aren't instantiated, which may
be too restrictive, but it could also be too lenient and we might need
to make symchoice instantiations always untyped. The behavior for
symchoice is not sound anyway given it causes #9997 so this is something
to consider for a redesign.
Diff follows #24276.
Not all uses of switch statements are implemented, `genCaseGeneric` and
`genIfForCaseUntil` still exist. The main purpose is to finish
`ccgreset` and `ccgtrav`.
Finishes `genEnumInfo` as followup to #24351. As #24381 mentions this
covers every use of `for` loops in the codegen.
---------
Co-authored-by: Andreas Rumpf <rumpf_a@web.de>
Doing this early is useful so we can move the indentation logic into
`Builder` itself rather than mix it with the block logic in `ccgstmts`
(the `if` statements in #24381 have not been indented properly either).
However it also means `Builder` is now used for code that still
generates raw C code, so the diff won't be as clean when these get
updated.
This commit fixes/adds tests for and fixes several issues with `JOIN`
operator parsing:
- For OUTER joins, LEFT | RIGHT | FULL specifier is not optional
```nim
doAssertRaises(SqlParseError): discard parseSql("""
SELECT id FROM a
OUTER JOIN b
ON a.id = b.id
""")
```
- For NATURAL JOIN and CROSS JOIN, ON and USING clauses are forbidden
```nim
doAssertRaises(SqlParseError): discard parseSql("""
SELECT id FROM a
CROSS JOIN b
ON a.id = b.id
""")
```
- JOIN should parse as part of FROM, not after WHERE
```nim
doAssertRaises(SqlParseError): discard parseSql("""
SELECT id FROM a
WHERE a.id IS NOT NULL
INNER JOIN b
ON a.id = b.id
""")
```
- LEFT JOIN should parse
```nim
doAssert $parseSql("""
SELECT id FROM a
LEFT JOIN b
ON a.id = b.id
""") == "select id from a left join b on a.id = b.id;"
```
- NATURAL JOIN should parse
```nim
doAssert $parseSql("""
SELECT id FROM a
NATURAL JOIN b
""") == "select id from a natural join b;"
```
- USING should parse
```nim
doAssert $parseSql("""
SELECT id FROM a
JOIN b
USING (id)
""") == "select id from a join b using (id );"
```
- Multiple JOINs should parse
```nim
doAssert $parseSql("""
SELECT id FROM a
JOIN b
ON a.id = b.id
LEFT JOIN c
USING (id)
""") == "select id from a join b on a.id = b.id left join c using (id );"
```
follows up #24381
This is about another 30% of ccgexprs (remaining is up to around line
3000), stopped right before the point where #24391 is required.
The `Genode::log` and `Genode::Cstring` calls in `genEcho` are left as
is but could be mangled in the future.
C++ member procs are not implemented, and `codegenDecl` in general is
also not adapted, although it had to be included in the generation of
proc params for simplicity. Guessing `codegenDecl` and C++ stuff are
supposed to map to pragmas on NIFC, or are just not supported.
Most of what ccgexprs uses is now ported to cbuilder, so this PR makes
around ~25% of ccgexprs use it, along with adding `if` stmts (no
`while`/`switch` and `for` which is only used as `for (tmp = a; tmp < b;
tmp++)`). The `if` builder does not add indents for blocks since we
can't make `Builder` an object yet rather than an alias to `string`,
this will likely be one of the last refactors.
Somewhat unrelated but `ccgtypes` is not ready yet because proc
signatures are not implemented.
fixes#24378
```nim
type Win = typeof(`body`)
doAssert not supportsCopyMem((int, Win))
```
`semAfterMacroCall` doesn't skip the children aliases types in the tuple
typedesc construction while the normal program seem to skip the aliases
types somewhere
`(int, Win)` is kept as `(int, alias string)` instead of expected `(int,
string)`
The types of the typed operations are not necessarily correct, the mixed
types of the operands in some cases are left the same to keep the C
integer promotion working as expected. To fix any wrong types the
integer promotion can be done explicitly instead, but this is a behavior
change to codegen.
Different from `intLiteral`, we add procs that just try to generate
integer values, assuming they're not an edge case like `<= low(int32)`
or `> high(int32)`.
refs #23625, refs #24289
Encountered in #24360 but could not reproduce minimally: overloading on
static parameters can work with the normal compile commands but crash
`nim check`. Static overloading relies on `tryConstExpr` which recovers
from things like `globalError` and fails softly, in this case this can
happen when a variable etc. is not available to evaluate in the VM. But
with `nim check`, the compiler does not throw an exception in this case,
and instead tries to keep generating the entire expression in the VM,
which can cause crashes.
To fix this, when the compiler has no error outputs even on `nim check`,
we raise a global error so that the VM code generation stops early. This
fixes both `tryConstExpr` and speeds up `nim check`, because no error
outputs means we don't need cascading errors.
Links for succ/pred/inc/dec were incorrect since they link to a symbol
with `int` as their second type.
Uses local referencing instead so that it links to the correct symbol.
Doesn't change the other links since they worked and doing the same
local referencing for `high`/`low` would've make them link to the group
of procs instead of the specific ones for ordinals
pre-existing issues:
```nim
block:
type
FooObj = object
data: int
Foo = ref ref FooObj
proc delete(self: Foo) =
echo self.data
var s: Foo
new(s, delete)
```
it crashed with arc/orc in 1.6.x and 2.x.x
```nim
block:
type
Foo = ref int
proc delete(self: Foo) =
echo self[]
var s: Foo
new(s, delete)
```
The simple fix is to add a type restriction for the type `T` for arc/orc
versions
```nim
proc new*[T: object](a: var ref T, finalizer: proc (x: T) {.nimcall.})
```
fixes#13417, fixes#19703
When passing an expression to an `openarray` iterator parameter: If the
expression is a statement list (considered "complex"), it's assigned in
a non-deep-copying way to a temporary variable first, then this variable
is used as a parameter. If it's not a statement list, i.e. a call or a
symbol, the parameter is substituted directly with the given expression.
In the case of calls, this results in the call potentially being
executed more than once, or can cause redefined variables in the
codegen.
To fix this, calls are also considered as "complex" assignments to
openarrays, as long as the return type of the call is not `openarray` as
the generated assignment in that case has issues/is unimplemented
(caused a segfault [here in
datamancer](47ba4d81bf/src/datamancer/dataframe.nim (L1580))).
As for why creating a temporary isn't the default only with exceptions
for things like `nkSym`, the "non-deep-copying" way of assignment
apparently still causes arrays to be copied according to a comment in
the code. I'm not sure to what extent this is true: if it still happens
on ARC/ORC, if it happens for every array length, or if we can fix it by
passing arrays by reference. Otherwise, a more general way to assign to
openarrays might be needed, but I'm not sure if the compiler can easily
do this.
This PR is somewhat large, worst case it can be split into one with just
assignments and one with just fields/derefs etc.
Assignments with calls as values have not been touched so they can be
done when calls are implemented. Similarly codegen with complex logic
i.e. `genEnumInfo`, `genTypeInfoV2`, `unaryExpr` is not completely
ported yet so they can be done in standalone PRs.
closes https://github.com/nim-lang/RFCs/issues/380, fixes#4773, fixes
#14729, fixes#16755, fixes#18150, fixes#22984, refs #11167 (only some
comments fixed), refs #12620 (needs tiny workaround)
The compiler gains a concept of root "nominal" types (i.e. objects,
enums, distincts, direct `Foo = ref object`s, generic versions of all of
these). Exported top-level routines in the same module as the nominal
types that their parameter types derive from (i.e. with
`var`/`sink`/`typedesc`/generic constraints) are considered attached to
the respective type, as the RFC states. This happens for every argument
regardless of placement.
When a call is overloaded and overload matching starts, for all
arguments in the call that already have a type, we add any operation
with the same name in the scope of the root nominal type of each
argument (if it exists) to the overload match. This also happens as
arguments gradually get typed after every overload match. This restricts
the considered overloads to ones attached to the given arguments, as
well as preventing `untyped` arguments from being forcefully typed due
to unrelated overloads. There are some caveats:
* If no overloads with a name are in scope, type bound ops are not
triggered, i.e. if `foo` is not declared, `foo(x)` will not consider a
type bound op for `x`.
* If overloads in scope do not have enough parameters up to the argument
which needs its type bound op considered, then type bound ops are also
not added. For example, if only `foo()` is in scope, `foo(x)` will not
consider a type bound op for `x`.
In the cases of "generic interfaces" like `hash`, `$`, `items` etc. this
is not really a problem since any code using it will have at least one
typed overload imported. For arbitrary versions of these though, as in
the test case for #12620, a workaround is to declare a temporary
"template" overload that never matches:
```nim
# neither have to be exported, just needed for any use of `foo`:
type Placeholder = object
proc foo(_: Placeholder) = discard
```
I don't know what a "proper" version of this could be, maybe something
to do with the new concepts.
Possible directions:
A limitation with the proposal is that parameters like `a: ref Foo` are
not attached to any type, even if `Foo` is nominal. Fixing this for just
`ptr`/`ref` would be a special case, parameters like `seq[Foo]` would
still not be attached to `Foo`. We could also skip any *structural* type
but this could produce more than one nominal type, i.e. `(Foo, Bar)`
(not that this is hard to implement, it just might be unexpected).
Converters do not use type bound ops, they still need to be in scope to
implicitly convert. But maybe they could also participate in the nominal
type consideration: if `Generic[T] = distinct T` has a converter to `T`,
both `Generic` and `T` can be considered as nominal roots.
The other restriction in the proposal, being in the same scope as the
nominal type, could maybe be worked around by explicitly attaching to
the type, i.e.: `proc foo(x: T) {.attach: T.}`, similar to class
extensions in newer OOP languages. The given type `T` needs to be
obtainable from the type of the given argument `x` however, i.e.
something like `proc foo(x: ref T) {.attach: T.}` doesn't work to fix
the `ref` issue since the compiler never obtains `T` from a given `ref
T` argument. Edit: Since the module is queried now, this is likely not
possible.
---------
Co-authored-by: Andreas Rumpf <rumpf_a@web.de>
fixes#24236
Locally tested to generate a 100 KB file for TCC. Empty flexible array
size is standard in C99 but maybe some compilers still don't support it.
At the very least an array size of 1000000 should be rare.