Structural sum type #527
Comments
So for defining a normal ADT, you first need to define a |
I already wrote all the points above but I guess more succinctly:
Also "defining a normal ADT" does not have to be any different than that. It would be really simple to build a macro that defines one of these types based on ADT syntax (I used the syntax
type List[T] {.adt.} = Nil | Node[T](value: T, next: List[T])
# becomes
type
  Nil = object # or Nil[T] = object
  Node[T] = object
    value: T
    next: List[T]
  List[T] = union(Nil, Node[T])
If anything "you have to define distinct/object types for each variant" is a good thing, because the information about each variant is available as an existing type. You also get to act on these as a "set of types", meaning you can break them down into their parts, or tack on new types easily. Imagine you have a type like this:
type
  FooKind = enum fooInt, fooFloat, foo32Array
  Foo = object
    case kind: FooKind
    of fooInt:
      i: int
    of fooFloat:
      f: float
    of foo32Array:
      arr: array[32, Foo]
Now imagine if you had a large seq of
type
  FooKind = enum fooInt, fooFloat, foo32Array
  Foo = object
    case kind: FooKind
    of fooInt:
      i: int
    of fooFloat:
      f: float
    of foo32Array:
      arr: array[32, Foo]
  FooIntFloat = object
    case kind: FooKind
    of fooInt:
      i: int
    of fooFloat:
      f: float
    else: discard
proc intFloatOnly(foo: Foo): FooIntFloat =
  case foo.kind
  of fooInt: FooIntFloat(kind: fooInt, i: foo.i)
  of fooFloat: FooIntFloat(kind: fooFloat, f: foo.f)
  else:
    raise newException(FieldDefect, "runtime error for non-int-float kind!")
proc backToFoo(fif: FooIntFloat): Foo =
  case fif.kind
  of fooInt: Foo(kind: fooInt, i: fif.i)
  of fooFloat: Foo(kind: fooFloat, f: fif.f)
  else: discard # unreachable
var s = newSeqOfCap[FooIntFloat](2000)
proc generateFoo(n: int): Foo =
  if n mod 2 == 1:
    Foo(kind: fooInt, i: n)
  else:
    Foo(kind: fooFloat, f: n.float)
proc consumeFoo(foo: Foo) =
  echo foo
for i in 1..2000:
  let foo = generateFoo(i)
  s.add(intFloatOnly(foo))
for x in s:
  let foo = backToFoo(x)
  consumeFoo(foo)
ADTs just make the syntax for this nicer, and actually make it worse because you cannot reuse That is, unless you introduce some "restricted ADT" type like Now with this proposal:
type
  Int = distinct int
  Float = distinct float
  Array32 = distinct array[32, Foo]
  Foo = union(Int, Float, Array32)
# or, assuming an `adt` macro exists
type Foo = adt Int(int) | Float(float) | Array32(array[32, Foo])
var s = newSeqOfCap[union(Int, Float)](2000)
proc generateFoo(n: int): Foo =
  if n mod 2 == 1:
    Int(n)
  else:
    Float(n.float)
proc consumeFoo(foo: Foo) =
  echo foo
for i in 1..2000:
  let foo = generateFoo(i)
  case foo
  of Int: s.add(Int(foo))
  of Float: s.add(Float(foo))
  # we get to deal with invalid cases at the callsite because it's much less cumbersome
  else: raise newException(FieldDefect, "shouldn't happen")
for x in s:
  let foo =
    case x
    of Int: Foo(Int(x))
    of Float: Foo(Float(x))
  consumeFoo(foo)
I have to be clear here though that I am not pushing the |
Something this could maybe do away with to make the implementation reasonable is order-invariance, i.e. A common use case I didn't mention above would be:
type Error = distinct string
proc foo(x: int): union(int, Error) =
  if x < 0:
    return Error("argument was negative")
  if x mod 2 == 0:
    return Error("argument was even")
  result = (x - 1) div 2
let res = foo(122)
case res
of Error: echo "got error: ", string(Error(res))
of int: echo "got successful result: ", int(res)
Stuff like this would not be affected by order relevance. Adding onto that example, this being language level means we could optimize things like |
order-invariance is a completely alien concept for Nim and I don't like it. |
There are two levels of order-invariance - I argue here that there are significant benefits to have it at the ABI level along with other freedoms such as moving fields around between objects, flattening them in other ways than current |
This design fundamentally merges a runtime value (often called "tag") with a typename. This seems to have unforeseeable interactions with Nim's meta-programming:
type
  U = union(Foo, Bar)
macro inspectType(t: typedesc)
inspectType Foo # valid, Foo is a type.
macro inspectValue[T](t: static T)
inspectValue Foo # also valid?
The RFC also offers no solution for pattern matching. But conversions like The fact that this sum type is structural is not a huge benefit as the 2 most important structural types are easily replicated via generics: |
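As a rough illustration of what "replicated via generics" can look like in current Nim: the sketch below assumes the two types meant are an option type and an either type, which is a guess based on the next comment. `Either`, `EitherKind`, `makeLeft` and `makeRight` are hypothetical names made up for this example; `Option` comes from `std/options`.

```nim
import std/options

type
  EitherKind = enum
    ekLeft, ekRight
  # A nominal, generic two-way sum type built on an object variant.
  Either[A, B] = object
    case kind: EitherKind
    of ekLeft: left: A
    of ekRight: right: B

# Hypothetical helper constructors; both generic parameters are given
# explicitly at the call site because only one can be inferred.
proc makeLeft[A, B](x: A): Either[A, B] =
  Either[A, B](kind: ekLeft, left: x)

proc makeRight[A, B](x: B): Either[A, B] =
  Either[A, B](kind: ekRight, right: x)

let e = makeLeft[int, string](5)
let r = makeRight[int, string]("oops")
let o = some(3)       # Option[int] holding a value
let n = none(int)     # Option[int] holding nothing
```

The cost compared to a structural union is that each such type has to be declared once under a fixed name, and its branches are not types of their own.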
Once again, I arrive at something like:
type
  Node = ref enum
    of BinaryOpr:
      x, y: Node
    of UnaryOpr:
      x: Node
    of Name(string)
  Option[T] = enum
    of None()
    of Some(T)
  Either[A, B] = enum
    of Le(A)
    of Ri(B)
|
This isn't a huge frontend issue, expressions can already have type
It wouldn't be different from the current situation with object variant branch access, which I realize now pattern matching is better for. If we don't reuse existing types, the symbols like |
Correct, but we have been making enum symbols smarter ("overloadable") already. |
Abstract
Add a structural, unordered sum type to the language, equivalent in the backend to a corresponding object variant type.
Motivation
From #525:
On top of these, I would add:
reset() on variant discriminators #56)
#525 and many other proposals propose some version of ADTs to deal with these problems. However this still has issues:
case syntax which is ambiguous with the existing case syntax which can include complex expressions in its discriminators that evaluate to values
object and tuple types
Description
In this proposal I will use the temporary placeholder syntax of union(A, B, ...) to represent these sum types. I like the syntax {A, B, ...} instead (at least as sugar) due to both 1. mirroring with (A, B, ...) tuple syntax and 2. similarities in behavior with sets, but this syntax might be hard to make out in this post.
Basically we add a new type kind to the language that has an indefinite number of children that are all nominally unique concrete types, i.e. A, B = distinct A and C = distinct A, D[T] = distinct A can form a type union(A, B, C, D[A], D[B]) but A, B = A, C = sink A, D[T] = A etc can only form union(A). In any case union(A, B) is also equal to union(B, A), meaning internally the children of the type are sorted under some scheme.
The type relation between A and union(A, ...) is that they are convertible. The subtype relation as in inheritance might also work but for now this seems the most simple.
In the backend, Foo = union(A, B, ...) (where the children are sorted) becomes equivalent to something like:
For the sake of safety in the case of uninitialized values or efficiency in the case of things like Option we can also introduce a none/nil kind that represents no type in the union. This would be unavailable on the frontend, values of these types must always be initialized to a value of some type in the union.
Construction
Construction of values of these types is as simple as a type conversion between the element type to the sum type. That is:
first transforms into the following at type-check time:
which the backend can then turn into:
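A minimal sketch of the lowering and construction described above, written with today's object variants. This is only an assumption about the shape of the result: `FooKind`, `fooA`, `fooB`, the field names and `toFoo` are hypothetical names chosen for this example, and the proposed `union(A, B)` syntax itself is not valid Nim.

```nim
type
  A = distinct int
  B = distinct string

  # Hypothetical backend equivalent of `Foo = union(A, B)`:
  # one enum value and one field per member type.
  FooKind = enum
    fooA, fooB
  Foo = object
    case kind: FooKind
    of fooA: a: A
    of fooB: b: B

# What a conversion `Foo(x)` might be rewritten to, one overload per
# member type. The overload is picked from the static type of `x`.
proc toFoo(x: A): Foo = Foo(kind: fooA, a: x)
proc toFoo(x: B): Foo = Foo(kind: fooB, b: x)

let x = toFoo(A(1))        # stands in for `Foo(A(1))`
let y = toFoo(B("hello"))  # stands in for `Foo(B("hello"))`
```

Since the member types are nominally distinct, the overload (and so the tag) is always known at compile time, which is why construction can stay a plain conversion on the frontend.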
Destructuring & inspection
We can very trivially reuse the existing case and of frontend mechanisms to check which type they are at runtime with (I believe) zero incompatibilities. And again destructuring just becomes a type conversion.
In the backend:
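As a sketch of what such a check could lower to, here is the same idea with a current object variant. The names `FooKind`, `fooA`, `fooB`, the field names and `describe` are hypothetical; the proposed frontend form appears only in a comment because it is not valid Nim today.

```nim
type
  A = distinct int
  B = distinct string
  FooKind = enum
    fooA, fooB
  Foo = object
    case kind: FooKind
    of fooA: a: A
    of fooB: b: B

proc describe(x: Foo): string =
  # Proposed frontend form (not valid today):
  #   case x
  #   of A: "A: " & $int(A(x))
  #   of B: "B: " & string(B(x))
  # Assumed lowering: branch on the tag, then read the matching field.
  case x.kind
  of fooA: "A: " & $(int(x.a))
  of fooB: "B: " & string(x.b)

let fromA = describe(Foo(kind: fooA, a: A(3)))
let fromB = describe(Foo(kind: fooB, b: B("hi")))
```

The case statement stays exhaustive over the tag, so adding a member type to the union would force every such match to be revisited.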
A limitation is that there is no good way to have the information of the exact type of the union as a value on the frontend (what would normally be x.kind), but we could maybe alleviate this by allowing to attach these to an enum type, i.e. union(fooA: A, fooB: B, ...). But then the question arises of whether these types are compatible with other union types of the same elements but with a different/no attached enum. In any case you can generate a case statement like case x of A: fooA of B: fooB ... but this would be less efficient than just using the kind in the backend.
Other points
A frequently mentioned use case in Proper sum types and structural pattern matching #525 was recursion with pointer indirection. In the current language this works in union typeclasses but not in tuple types: the manual mentions that "In order to simplify structural type checking, recursive tuples are not valid". Maybe recursive unions can just be nominal? Or the canonicalization scheme (which tuples don't have) can account for recursion.
This is not an alternative solution to pattern matching or object variants, it's just an alternate solution to ADTs for the current problems with object variants, and ADTs happen to require pattern matching while this doesn't (but this is still compatible with pattern matching). People tend to see object variants as black and white, worse or better than other representations of sum types but in practice they both have their uses especially in such a general purpose language.
type List[T] = adt Nil | Node[T](value: T, next: List[T])
This has partially been implemented in a library (https://github.com/alaviss/union/) but a library solution is not sufficient for proper use of this:
union(A, B, C) should probably not be equal to union(A, union(B, C)) since union(B, C) is still a concrete type; we can have an operation + that "merges" union types, so that A + union(B, C) + D or union(A, B) + union(C, D) all give a flattened union(A, B, C, D). This implies unions have set behavior, which might be the generics limitation mentioned above; it is nontrivial and in some cases impossible to infer generic parameters from these types.
Links
or or | in nim) which is not the goal here https://crystal-lang.org/reference/1.8/syntax_and_semantics/union_types.html
Code Examples
Backwards Compatibility
Should be completely compatible