Fix iterable resolution, prefer iterator overloads (#25679)

This fixes type resolution for `iterable[T]`. I want to proceed with RFC [#562](https://github.com/nim-lang/RFCs/issues/562) and this is the main blocker for composability.

Fixes #22098 and, arguably, #19206

```nim
import std/strutils

template collect[T](it: iterable[T]): seq[T] =
  block:
    var res: seq[T] = @[]
    for x in it:
      res.add x
    res

const text = "a b c d"
let words = text.split.collect()
doAssert words == @[ "a", "b", "c", "d" ]
```

In cases like `strutils.split`, where both a proc and an iterator overload exist, the compiler resolves to the `func` overload, causing a type mismatch. The old mode resolved `text.split` to `seq[string]` before the surrounding `iterable[T]` requirement was applied, so the argument no longer matched this template.

It should be noted that, compared to older sequtils templates, composable chains based on `iterable[T]` require an iterator-producing expression, e.g. `"foo".items.iterableTmpl()` rather than just `"foo".iterableTmpl()`. This is actually desirable: it keeps the iteration boundary explicit and makes iterable-driven templates intentionally not directly interchangeable with older untyped/loosely-typed templates like those in `sequtils`, whose internal iterator setup we have zero control over (e.g. hard-coding adapters like `items`).

Also, I noticed in `semstmts` that anonymous iterators are always `closure`, which is not that surprising if you think about it, but I still added a paragraph to the manual.

Regarding implementation:

From what I gathered, the root cause is that `semOpAux` eagerly pre-types all arguments with plain flags before overload resolution begins, so by the time `prepareOperand` processes `split` against the `iterable[T]` formal, the wrong overload has already won. The fix touches a few places:

- `prepareOperand` in `sigmatch.nim`: When `formal.kind == tyIterable` and the argument was already typed as something else, it's re-semchecked with the `efPreferIteratorForIterable` flag. The recheck is limited to direct calls (`a[0].kind in {nkIdent, nkAccQuoted, nkSym, nkOpenSym}`) to avoid recursing through `semIndirectOp`/`semOpAux` again.
- `iteratorPreference` field in `TCandidate`, checked before `genericMatches` in `cmpCandidates`, gives the iterator overload a win without touching the existing iterator heuristic used by `for` loops.

**Limitations:**

The implementation is still flag-driven rather than purely formal-driven, so the behaviour is a bit too broad: `efWantIterable` can cause iterator results to be wrapped as `tyIterable` in iterable-admitting contexts, not only when an `iterable[T]` match is being processed.

`iterable[T]` still does not accept closure iterator values such as `iterator(): T {.closure.}`. It only matches the compiler's internal `tyIterable`, not arbitrary iterator-typed values.

The existing iterator-preference heuristic is still in place, because when I tried to remove it, some loosely related regressions happened. In particular, ordinary iterator-admitting contexts and iterator chains still rely on early iterator preference during semchecking, before the compiler has enough surrounding context to distinguish between value-producing and iterator-producing overloads. Full heuristic removal would require a broader refactor of dot-chain/intermediate-expression semchecking, which is just too much for me ATM. This PR narrows only the `tyIterable`-specific cases.

**Future work:**

Rework overload resolution to preserve additional information about matching iterator overloads for calls up to the point where the iterator-requiring context is established, to avoid the re-sem in `prepareOperand`. Currently there's no good channel to store that information: nodes can get rewritten, `TCandidate` doesn't live long enough, and storing it in the context or some side table raises the question of how to properly key that info.
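For reference, a minimal sketch of the explicit iteration boundary mentioned above. `toSeqOf` is a hypothetical template name used only for illustration (it mirrors `collect` from the example at the top), and the snippet assumes `iterable[T]` behaves as the manual describes: an explicit `items(...)` call is accepted, while a plain `seq` value is rejected.

```nim
# `toSeqOf` is a hypothetical name, not part of the standard library.
template toSeqOf[T](it: iterable[T]): seq[T] =
  block:
    var res: seq[T] = @[]
    for x in it:
      res.add x
    res

let data = @[1, 2, 3]

# An explicit iterator call satisfies the `iterable[T]` formal.
doAssert toSeqOf(items(data)) == @[1, 2, 3]

# A plain value does not; the iteration boundary has to be spelled out.
doAssert not compiles(toSeqOf(data))
```

This is the trade-off accepted here: callers write the adapter (`items`, `pairs`, `split`, ...) themselves, and the template never has to guess one.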
2026-04-30 03:03:57 +00:00 · 2026-04-01 23:01:55 +04:00
parent 9c07bb94c1
commit be29bcd402
8 changed files with 144 additions and 39 deletions
--- a/doc/manual.md
+++ b/doc/manual.md
@@ -2628,10 +2628,10 @@ Overload resolution
 In a call `p(args)` where `p` may refer to more than one
 candidate, it is said to be a symbol choice. Overload resolution will attempt to
 find the best candidate, thus transforming the symbol choice into a resolved symbol.
-The routine `p` that matches best is selected following a series of trials explained below. 
+The routine `p` that matches best is selected following a series of trials explained below.
 In order: Category matching, Hierarchical Order Comparison, and finally, Complexity Analysis.

-If multiple candidates match equally well after all trials have been tested, the ambiguity 
+If multiple candidates match equally well after all trials have been tested, the ambiguity
 is reported during semantic analysis.

 First Trial: Category matching
@@ -2664,7 +2664,7 @@ resolved symbol.
 For example, if a candidate with one exact match is compared to a candidate with multiple
 generic matches and zero exact matches, the candidate with an exact match will win.

-Below is a pseudocode interpretation of category matching, `count(p, m)` counts the number 
+Below is a pseudocode interpretation of category matching, `count(p, m)` counts the number
 of matches of the matching category `m` for the routine `p`.

 A routine `p` matches better than a routine `q` if the following
@@ -2692,11 +2692,11 @@ type A[T] = object
 ```

 Matching formals for this type include `T`, `object`, `A`, `A[...]` and `A[C]` where `C` is a concrete type, `A[...]`
-is a generic typeclass composition and `T` is an unconstrained generic type variable. This list is in order of 
+is a generic typeclass composition and `T` is an unconstrained generic type variable. This list is in order of
 specificity with respect to `A` as each subsequent category narrows the set of types that are members of their match set.

 In this trial, the formal parameters of candidates are compared in order (1st parameter, 2nd parameter, etc.) to search for
-a candidate that has an unrivaled specificity. If such a formal parameter is found, the candidate it belongs to is chosen 
+a candidate that has an unrivaled specificity. If such a formal parameter is found, the candidate it belongs to is chosen
 as the resolved symbol.

 Third Trial: Complexity Analysis
@@ -2951,13 +2951,13 @@ proc sort*[I: Index; T: Comparable](x: var Indexable[I, T])

 In the above example, `Comparable` and `Indexable` are types that will match any type that
 can can bind each definition declared in the concept body. The special `Self` type defined
-in the concept body refers to the type being matched, also called the "implementation" of 
-the concept. Implementations that match the concept are generic matches, and the concept 
+in the concept body refers to the type being matched, also called the "implementation" of
+the concept. Implementations that match the concept are generic matches, and the concept
 typeclasses themselves work in a similar way to generic type variables in that they are never
 concrete types themselves (even if they have concrete type parameters such as `Indexable[int, int]`)
-and expressions like `typeof(x)` in the body of `proc sort` from the above example will return the 
+and expressions like `typeof(x)` in the body of `proc sort` from the above example will return the
 type of the implementation, not the concept typeclass. Concepts are useful for providing information
-to the compiler in generic contexts, most notably for generic type checking, and as a tool for 
+to the compiler in generic contexts, most notably for generic type checking, and as a tool for
 [Overload resolution]. Generic type checking is forthcoming, so this will only explain overload
 resolution for now.

@@ -2984,7 +2984,7 @@ Concept overload resolution

 When an operand's type is being matched to a concept, the operand's type  is set as the "potential
 implementation". For each definition in the concept body, overload resolution is performed by substituting `Self`
-for the potential implementation to try and find a match for each definition. If this succeeds, the concept 
+for the potential implementation to try and find a match for each definition. If this succeeds, the concept
 matches. Implementations do not need to exactly match the definitions in the concept. For example:

 ```nim
@@ -3008,7 +3008,7 @@ This leads to confusing and impractical behavior in most situations, so the rule
 1. if a concept is being compared with `T` or any type that accepts all other types (`auto`) the concept
 is more specific
 2. if the concept is being compared with another concept the result is deferred to [Concept subset matching]
-3. in any other case the concept is less specific then it's competitor 
+3. in any other case the concept is less specific then it's competitor

 Currently, the concept evaluation mechanism evaluates to a successful match on the first acceptable candidate
 for each defined binding. This has a couple of notable effects:
@@ -4610,10 +4610,10 @@ for any type (with some exceptions) by defining a routine with the name `[]`.
  ```nim
  type Foo = object
    data: seq[int]
-  
+
  proc `[]`(foo: Foo, i: int): int =
    result = foo.data[i]
-  
+
  let foo = Foo(data: @[1, 2, 3])
  echo foo[1] # 2
  ```
@@ -4624,12 +4624,12 @@ which has precedence over assigning to the result of `[]`.
  ```nim
  type Foo = object
    data: seq[int]
-  
+
  proc `[]`(foo: Foo, i: int): int =
    result = foo.data[i]
  proc `[]=`(foo: var Foo, i: int, val: int) =
    foo.data[i] = val
-  
+
  var foo = Foo(data: @[1, 2, 3])
  echo foo[1] # 2
  foo[1] = 5
@@ -4861,7 +4861,14 @@ default to being inline, but this may change in future versions of the
 implementation.

 The `iterator` type is always of the calling convention `closure`
-implicitly; the following example shows how to use iterators to implement
+implicitly.
+
+Unlike named iterators, anonymous iterator expressions evaluate
+to the `iterator` type. In practice, this means a named iterator declaration
+without `{.closure.}` defaults to inline, but an expression like `let it =
+iterator(): int = yield 1` produces a callable closure iterator value.
+
+The following example shows how to use iterators to implement
 a `collaborative tasking`:idx: system:

  ```nim
@@ -6401,7 +6408,7 @@ The default for symbols of entity `type`, `var`, `let` and `const`
 is `gensym`. For `proc`, `iterator`, `converter`, `template`,
 `macro`, the default is `inject`, but if a `gensym` symbol with the same name
 is defined in the same syntax-level scope, it will be `gensym` by default.
-This can be overridden by marking the routine as `inject`. 
+This can be overridden by marking the routine as `inject`.

 If the name of the entity is passed as a template parameter, it is an `inject`'ed symbol:

@@ -7242,7 +7249,7 @@ identifier is considered ambiguous, which can be resolved in the following ways:

  write(stdout, x) # error: x is ambiguous
  write(stdout, A.x) # no error: qualifier used
-  
+
  proc bar(a: int): int = a + 1
  assert bar(x) == x + 1 # no error: only A.x of type int matches

@@ -9324,4 +9331,3 @@ It is not valid to pass an lvalue of a supertype to an `out T` parameter:

 However, in the future this could be allowed and provide a better way to write object
 constructors that take inheritance into account.
-