When parsing `a = 1` with `langPython`, Eof is reported unexpectedly.
Fix: allow other languages to fallback to "Identifier" when it is not a
keyword.
This patch is useful as this is a highlighter. `Eof` as annoying.
fixes#24449
The standalone `seq` type is a `tyBuiltInTypeClass` with a single child
of kind `tySequence`, which itself has no children. This is also the
case for most other `tyBuiltInTypeClass` kinds. However this can cause a
crash in sigmatch when calling `isEmptyContainer` on this child type,
which expects the sequence type to have children. This check was added
in #5557 to prevent empty collections like `@[]` from matching their
respective typeclass, but it's not useful when matching against another
typeclass (which is done here to resolve an ambiguity). So to avoid the
crash, this empty container check is disabled when matching against
another typeclass.
closes#22369, closes#23197, closes#24385, refs #21328
According to #21328 the standard library on Unix should be installed in
`/usr/lib/nim/lib`, however the installer was not updated for this
change, hence the problem as described in
https://github.com/nim-lang/Nim/issues/23197#issuecomment-2031386896.
Have not tested if this fixes the problem but the comment heavily
implies it does. The problem is also in 2.0 so it could be backported
but I can't say for sure that it works and doesn't break anything.
fixes#24450
The new concepts were previously not included in
[containsGenericType][1] which prevents them from being instantiated.
Here they are included by being added to `tyTypeClasses` though this
doesn't have to be done, they can also be added manually to
`containsGenericTypeIter`, but this might be too specific.
[1]:
a2031ec6cf/compiler/types.nim (L1507-L1517)
closes#24372, refs #20091
This was added in #20091 for some reason but doesn't actually work and
only makes error messages more obscure. So for now, it's disabled.
Can also be backported to 2.0 if necessary.
follows up #24425, fixes#18861, fixes#22445
Since #24425 generic object and distinct types now accurately link back
to their generic instantiations. To work around this issue previously,
type matching checked if generic objects/distinct types were
*structurally* equal, which caused false positives with types that
didn't use generic parameters in their structures. This structural check
is now removed, in cases where generic objects/distinct types require a
nontrivial equality check, the generic parameters of the `typeInst`
fields are checked for equality instead.
The check is copied from `tyGenericInst`, but the check in
`tyGenericInst` is not always sufficient as this type can be skipped or
unreachable in the case of `ref object`s.
fixes#22479, fixes#24374, depends on #24429 and #24430
When instantiating generic types which directly have nominal types
(object, distinct, ref/ptr object but not enums[^1]) as their values,
the nominal type is now copied (in the case of ref objects, its child as
well) so that it receives a fresh ID and `typeInst` field. Previously
this only happened if it contained any generic types in its structure,
as is the case for all other types.
This solves #22479 and #24374 by virtue of the IDs being unique, which
is what destructors check for. Technically types containing generic
param fields work for the same reason. There is also the benefit that
the `typeInst` field is correct. However issues like #22445 aren't
solved because the compiler still uses structural object equality checks
for inheritance etc. which could be removed in a later PR.
Also fixes a pre-existing issue where destructors bound to object types
with generic fields would not error when attempting to define a user
destructor after the fact, but the error message doesn't show where the
implicit destructor was created now since it was only created for
another instance. To do this, a type flag is used that marks the generic
type symbol when a generic instance has a destructor created. Reusing
`tfCheckedForDestructor` for this doesn't work.
Maybe there is a nicer design that isn't an overreliance on the ID
mechanism, but the shortcomings of `tyGenericInst` are too ingrained in
the compiler to use for this. I thought about maybe adding something
like `tyNominalGenericInst`, but it's really much easier if the nominal
type itself directly contains the information of its generic parameters,
or at least its "symbol", which the design is heading towards.
[^1]: See [this
test](21420d8b09/lib/std/enumutils.nim (L102))
in enumutils. The field symbols `b0`/`b1` always have the uninstantiated
type `B` because enum fields don't expect to be generic, so no generic
instance of `B` matches its own symbols. Wouldn't expect anyone to use
generic enums but maybe someone does.
fixes#17571
Objects in the VM are represented as object constructor nodes that
contain every single field, including ones in different case branches.
This is so that every field has a unique invariant index in the object
constructor that can be written to and read from. However when
converting this node back into semantic code, fields from inactive case
branches can remain in the constructor which causes bad codegen,
generating assignments to fields from other case branches.
To fix this, fields from inactive branches are now detected in
`semmacrosanity.annotateType` (called in `fixupTypeAfterEval`) and
marked to prevent the codegen of their assignments. In #24441 these
fields were excluded from the resulting node, but this causes issues
when the node is directly supposed to go back into the VM, for example
as `const` values. I don't know if this is the only case where this
happens, so I wasn't sure about how to keep that implementation working.
fixes#24434
In https://github.com/nim-lang/Nim/pull/24432
```nim
let ex = "NIM_EXTERNC N_NIMCALL(void, nimLoadProcs$1)(void) {$2}$N$N" %
[(i.ord - '0'.ord).rope, extract(el)]
```
```nim
procs.addDeclWithVisibility(ExternC):
procs.addProcHeader(ccNimCall, "nimLoadProcs" & $(i.ord - '0'.ord), "void", cProcParams())
```
extern "C" makes a function-name in C++ have C linkage; it should be
effaced with C compiler
Follows up #24423, needed more refactoring than I expected, sorry for
ugly diff.
With this pretty much all of the raw C code generating parts of the
codegen are abstracted into the cbuilder API (to my knowledge at least).
The current design of NIFC does not implement everything the codegen
generates, such things have mostly not been adapted, they are the
following along with how I'm guessing they could be implemented:
* C++ specific codegen: Maybe a dialect of NIFC for generating C++?
* `codegenDecl` pragma: Could be passed as a pragma to NIFC
* C macros, currently only used for line info IIRC i.e. `nimln_(123)`:
Just inline them when generating NIFC
* Other C defines & `#line`: Maybe as NIFC directives or line infos?
* There is also [this
`#ifndef`](21420d8b09/compiler/cgen.nim (L2249))
when generating headers but NIFC shouldn't need it
* `alignof`/`offsetof`: Is in `cbuilder` but not implemented in NIFC,
should be easy
For now we can disable C++ and the `codegenDecl` pragma when generating
NIFC but since cbuilder is mostly designed to generate NIFC as a flag
when booting the compiler, this hinders the ability to run the CI
against NIFC. Maybe we could also make cbuilder able to generate both C
and NIFC at runtime, this would be a large refactor but wouldn't be too
difficult.
Other missing abstractions before being able to generate NIFC are:
* Primitive types and symbols i.e. `int`, `void*`, `NI`, `NIM_NULL` are
currently still constant string literals, `NU8`, `NU16` etc are also
sometimes generated like `"NU" & $bits`.
* NIFC identifiers, i.e. adding `.c` to imported symbols and properly
mangling generated ones. Not sure how difficult this is going to be.
fixes#24402
```nim
iterator myPairsInline*[T](twoDarray: seq[seq[T]]): (int, seq[T]) {.inline.} =
for indexValuePair in twoDarray.pairs:
yield indexValuePair
proc innerTestTotalMem() =
var my2dArray: seq[seq[int32]] = @[]
# fill with some data...
for i in 0'i32..100:
var z = @[i, i+1]
my2dArray.add z
for oneDindex, innerArray in myPairsInline(my2dArray):
discard
innerTestTotalMem()
```
In `for oneDindex, innerArray in myPairsInline(my2dArray)`, `oneDindex`
and `innerArray` becomes `cursors` because they satisfy the criterion of
`isSimpleIteratorVar`. On the one hand, it is not correct to have them
point to the temporary generated by tuple unpacking, which left the
memory of the temporary uncleaned up. On the other hand, we don't need
to generate a temporary for a symbol node when unpacking the tuple.
split from #24425
Matching `tyGenericBody` performs a match on the last child of the
generic body, in this case the uninstantied `tyObject` type. If the
object contains no generic fields, this ends up being the same type as
all instantiated ones, but if it does, a new type is created. This fails
the `sameObjectTypes` check that subtype matching for object types uses.
To fix this, also consider that the pattern type could be the generic
uninstantiated object type of the matched type in subtype matching.
split from #24425
The added test did not work previously. The result of `getTypeImpl` is
the uninstantiated AST of the original type symbol, and the macro
attempts to use this type for the result. To fix the issue, the provided
`typedesc` argument is used instead.
fixes#17958
In `transf`, conversions in subscript expressions are skipped (with
`skipConv`'s rules). This is because array indexing can produce
conversions to the range type that is the array's index type, which
causes a `RangeDefect` rather than an `IndexDefect` (and also
`--rangeChecks` and `--indexChecks` are both considered). However this
causes problems when explicit conversions are used, between types of
different bitsizes, because those also get skipped.
To fix this, we only skip the conversion if:
* it's a hidden (implicit) conversion
* it's a range check conversion (produces `nkChckRange`)
* the subscript is on an array type and the result type of the
conversion has the same bounds as the array index type
And `skipConv` rules also still apply (int/float classification).
Another idea would be to prevent the implicit conversion to the array
index type from being generated. But there is no good way to do this:
matching to the base type instead prevents types like `uint32` from
implicitly converting (i.e. it can convert to `range[0..3]` but not
`int`), and analyzing whether this is an array bound check is easier in
`transf`, since `sigmatch` just produces a type conversion.
The rules for skipping the conversion could also receive some other
tweaks: We could add a rule that changing bitsizes also doesn't skip the
conversion, but this breaks the `uint32` case. We could simplify it to
only removing implicit skips to specifically fix#17958, but this is
wrong in general.
We could also add something like `nkChckIndex` that generates index
errors instead but this is weird when it doesn't have access to the
collection type and it might be overkill.
The lower half of cgen contains the main proc and HCR init code which
cause a large diff, so they are excluded from this PR. In general things
like generated defines, line directives, the stacktrace macros (`nimfr_`
etc) are also not done, since there are not exact equivalents for these
in NIFC (NIFC does generate line directives but based on the NIF line
info mechanism).
fixes#14067, fixes#15004, fixes#19019
Anonymous procs are [added to
scope](8091d76306/compiler/semstmts.nim (L2466))
with the name `:anonymous`. This means that if they have the same
signature in a scope, they can consider each other as redefinitions. To
prevent this, mark their symbols as `sfGenSym` so they do not get added
to scope or cause any name conflicts. The commented out `and not isAnon`
check wouldn't work because `isAnon` would not be true if the proc is
being resemmed, in which case the name field in the proc AST would have
the symbol of the anonymous proc rather than being empty.
There is a separate problem of default values in generic/normal procs
not opening new scopes which is partially responsible for #19019.
This adapts most of ccgstmts to cbuilder, missing C++ exceptions and
`genCaseGeneric`/`genIfForCaseUntil`. Rebased version of #24399 which
was split into some other PRs and since has had conflicts with #24416.
fixes#24415
Since #23978 (which is in 2.0), all generic types that alias to another
type now insert a `tyAlias` layer in their value. However the
`skipGenericAlias` etc code which `sigmatch` uses is not updated for
this, so `tyAlias` is now skipped in these.
The relevant code in sigmatch is:
67ad1ae159/compiler/sigmatch.nim (L1668-L1673)
This behavior is also suspicious IMO, not skipping a structural
`tyGenericInst` alias can be useful for code like #10220, but this is
currently arbitrarily decided based on "depth" and whether the alias is
to another `tyGenericInst` type or not. Maybe in the future we could
enforce the use of a nominal type.
fixes issue described in https://forum.nim-lang.org/t/12579
In #24065 explicit generic parameter matching was made to fail matches
on arguments with unresolved types in generic contexts (the sigmatch
diff, following #24010), similar to what is done for regular calls since
#22029. However unlike regular calls, a failed match in a generic
context for a standalone explicit generic instantiation did not convert
the expression into one with `tyFromExpr` type, which means it would
error immediately given any unresolved parameter. This is now done to
fix the issue.
For explicit generic instantiations on single non-overloaded symbols, a
successful match is still instantiated. For multiple overloads (i.e.
symchoice), if any of the overloads fail the match, the entire
expression is considered untyped and any instantiations are not used, so
as to not void overloads that would match later. This means even
symchoices without unresolved arguments aren't instantiated, which may
be too restrictive, but it could also be too lenient and we might need
to make symchoice instantiations always untyped. The behavior for
symchoice is not sound anyway given it causes #9997 so this is something
to consider for a redesign.
Diff follows #24276.
Not all uses of switch statements are implemented, `genCaseGeneric` and
`genIfForCaseUntil` still exist. The main purpose is to finish
`ccgreset` and `ccgtrav`.
Finishes `genEnumInfo` as followup to #24351. As #24381 mentions this
covers every use of `for` loops in the codegen.
---------
Co-authored-by: Andreas Rumpf <rumpf_a@web.de>
Doing this early is useful so we can move the indentation logic into
`Builder` itself rather than mix it with the block logic in `ccgstmts`
(the `if` statements in #24381 have not been indented properly either).
However it also means `Builder` is now used for code that still
generates raw C code, so the diff won't be as clean when these get
updated.
This commit fixes/adds tests for and fixes several issues with `JOIN`
operator parsing:
- For OUTER joins, LEFT | RIGHT | FULL specifier is not optional
```nim
doAssertRaises(SqlParseError): discard parseSql("""
SELECT id FROM a
OUTER JOIN b
ON a.id = b.id
""")
```
- For NATURAL JOIN and CROSS JOIN, ON and USING clauses are forbidden
```nim
doAssertRaises(SqlParseError): discard parseSql("""
SELECT id FROM a
CROSS JOIN b
ON a.id = b.id
""")
```
- JOIN should parse as part of FROM, not after WHERE
```nim
doAssertRaises(SqlParseError): discard parseSql("""
SELECT id FROM a
WHERE a.id IS NOT NULL
INNER JOIN b
ON a.id = b.id
""")
```
- LEFT JOIN should parse
```nim
doAssert $parseSql("""
SELECT id FROM a
LEFT JOIN b
ON a.id = b.id
""") == "select id from a left join b on a.id = b.id;"
```
- NATURAL JOIN should parse
```nim
doAssert $parseSql("""
SELECT id FROM a
NATURAL JOIN b
""") == "select id from a natural join b;"
```
- USING should parse
```nim
doAssert $parseSql("""
SELECT id FROM a
JOIN b
USING (id)
""") == "select id from a join b using (id );"
```
- Multiple JOINs should parse
```nim
doAssert $parseSql("""
SELECT id FROM a
JOIN b
ON a.id = b.id
LEFT JOIN c
USING (id)
""") == "select id from a join b on a.id = b.id left join c using (id );"
```