Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: union types #466

Draft
wants to merge 19 commits into
base: staging
Choose a base branch
from
Draft

feat: union types #466

wants to merge 19 commits into from

Conversation

b1ek
Copy link
Member

@b1ek b1ek commented Sep 7, 2024

ref #442

it can match multiple types, like Text | Num | Null, and also failable ones like Text | Num | Num?.. the whole thing is covered by tests, you can see the code there

mks-h

This comment was marked as outdated.

@mks-h

This comment was marked as outdated.

@mks-h
Copy link
Member

mks-h commented Sep 8, 2024

The second issue persists — the function returns 42 where it shouldn't, due to type comparisons using partial equality check. Here's a new example:

fun test(a: Num | [Num] | Text): Num {
    if {
        a is Num: return 42
        a is [Num]: return 69
    }
}

echo test("hi, hello" as Num | Text)

I really don't think it is the right approach to implement manual PartialEq, as it makes it impossible to strictly compare two types. This lesson was already learned in #300. You should instead call the union subset checking logic at the right places, like function parameters and the return statement.

Moreover, the current implementation breaks the transitive quality implied by derived Eq, which is a logic error. See std::cmp::Eq documentation.

@mks-h
Copy link
Member

mks-h commented Sep 8, 2024

Not a fan of the hash idea for the Type enum, this is getting out of hand. Please take a different approach from manually implementing PartialEq.

@b1ek
Copy link
Member Author

b1ek commented Sep 8, 2024

i don't know what you are talking about. the program doesnt compile on latest commit. the hashes are non collidable now

@mks-h
Copy link
Member

mks-h commented Sep 8, 2024

I've just checked out a fresh instance of your branch, and the code from the last example compiles perfectly (as it should), but the function returns 42 — and it should not.

@mks-h
Copy link
Member

mks-h commented Sep 8, 2024

the hashes are non collidable now

The hashes are an over engineered solution that doesn't work. Even though you wrote documentation for it, I still cannot (and frankly don't want to) comprehend how they function, and the random magic numbers are genuinely scaring me. We aren't writing embedded code here, there should be no such complexity for simple type checking logic.

@mks-h
Copy link
Member

mks-h commented Sep 9, 2024

While implementing generic arrays in #472, I've created a method where you can put the logic for verifying if the type is a subset of another one. This method is hooked up to all the right places — function arguments, and return statement. So those will automatically allow a type that is a subset of another type. Unlike with manual PartialEq, that wont happen when comparing types in expressions.

You can already try to build your PR on top of mine, or on top of master after I merge mine. Of course, you're welcome to review it first.

Comment on lines 20 to 31
fn eq_union_normal(one: &[Type], other: &Type) -> bool {
one.iter().any(|x| (*x).to_string() == other.to_string())
}

fn eq_unions(one: &[Type], other: &[Type]) -> bool {
let bigger;
let smaller = if one.len() < other.len() { bigger = other; one } else { bigger = one; other };

smaller.iter().all(|x| {
Self::eq_union_normal(bigger, x)
})
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those types will not really be equal. Look at this example:

fun foo(): Num | [Num] {
  return [1]
}

fun bar(x: Num) {
  return x + 1
}

echo bar(foo())

This should raise an error. Just because value can be of some type, we cannot say that it is. We lose type safety here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo union types should not be automatically equal to normal type, but a normal type may be equal to a union type if the latter has the same type in its list.

like

Text | Num  !=  Text
Text        ==  Text | Num
Text        !=  Null | Num

and to convert Text | Num to Text, one would check it with an if x is Text which would convert x to Text inside that if block

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I don't know if you can check it this way. It's more of a one way casting ability

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typescript does it that way (mostly)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly is crucial here. We shouldn't be able to say that Text == Text | Num or Text | Num == Text

I know that thanks to this, you're able to do this:

fun foo(x: Text | Num) ...

foo(5)

But it creates an issue that I've shown before. We need to tackle this issue by checking if a value given to a var/param is the same as declared type or is in the Union type. Not by equating Text type with a union type that has Text in it.

You need to modify for example how run_function_with_args works

Copy link
Member

@Ph0enixKM Ph0enixKM Sep 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @KrosFire. @b1ek, please review this line of code. Instead of == (there actually is != but for simplicity I'll use equal sign) there should be a subset_compare function, that for union it checks if the value is part of the union, and for the regular values it applies just == operator as normal.

Explanation

When it comes to types - we are working on sets, not values. For primitive types it's easy as they are singletons:

 Text    Num    Bool
{Text}, {Num}, {Bool}

Here we can simply {Text} == {Text} because that indeed is the case.

When it comes to the more complex types like unions:

 Text | Num   Bool | [Text]
{Text, Num}, {Bool, [Text]} 

We need to compare if one set is a subset of another set and for that we cannot rely on Rust's PartialEq. This requires a comparison <=. So in the line that I've linked above - there should actually be <= because we want to check if this is the same value OR is it a subset of this value. This is a not an eq. It's an leq aka subset.

Overloading == with <= will cause bad behaviors later when we might want to >= instead.

Comment on lines 58 to 78
fn type_hash(&self) -> u32 {
match self {
Type::Null => 0x00000001,
Type::Text => 0x00000002,
Type::Bool => 0x00000003,
Type::Num => 0x00000004,
Type::Generic => 0x00000005,

Type::Array(t) => t.type_hash() + 0x00000100,

Type::Failable(t) => {
if let Type::Failable(_) = **t {
panic!("Failable types can't be nested!");
}
t.type_hash() + 0x00010000
},

Type::Union(_) => unreachable!("Type hash is not available for union types! Use the PartialEq trait instead"),
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs some proof, but it's late and I'm lazy XD

Try by doing some tests, or math. Or try with hexes to be sure

Suggested change
fn type_hash(&self) -> u32 {
match self {
Type::Null => 0x00000001,
Type::Text => 0x00000002,
Type::Bool => 0x00000003,
Type::Num => 0x00000004,
Type::Generic => 0x00000005,
Type::Array(t) => t.type_hash() + 0x00000100,
Type::Failable(t) => {
if let Type::Failable(_) = **t {
panic!("Failable types can't be nested!");
}
t.type_hash() + 0x00010000
},
Type::Union(_) => unreachable!("Type hash is not available for union types! Use the PartialEq trait instead"),
}
}
}
fn type_hash(&self) -> u32 {
match self {
Type::Null => 1 << 0,
Type::Text => 1 << 1,
Type::Bool => 1 << 2,
Type::Num => 1 << 3,
Type::Generic => 1 << 4,
Type::Array(t) => t.type_hash() + (1 << 5),
Type::Failable(t) => {
if let Type::Failable(_) = **t {
panic!("Failable types can't be nested!");
}
t.type_hash() +(1 << 6)
},
Type::Union(types) => sum(dedup(types)) + (1 << 7),
}
}
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i removed that function the same day i wrote it, you should review on the latest commit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm, i forgot to push it.. lmao

Comment on lines 20 to 31
fn eq_union_normal(one: &[Type], other: &Type) -> bool {
one.iter().any(|x| (*x).to_string() == other.to_string())
}

fn eq_unions(one: &[Type], other: &[Type]) -> bool {
let bigger;
let smaller = if one.len() < other.len() { bigger = other; one } else { bigger = one; other };

smaller.iter().all(|x| {
Self::eq_union_normal(bigger, x)
})
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly is crucial here. We shouldn't be able to say that Text == Text | Num or Text | Num == Text

I know that thanks to this, you're able to do this:

fun foo(x: Text | Num) ...

foo(5)

But it creates an issue that I've shown before. We need to tackle this issue by checking if a value given to a var/param is the same as declared type or is in the Union type. Not by equating Text type with a union type that has Text in it.

You need to modify for example how run_function_with_args works

Copy link
Member

@Ph0enixKM Ph0enixKM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@b1ek please also add tests (edge cases) mentioned by members in this conversation

src/modules/types.rs Outdated Show resolved Hide resolved
Comment on lines 46 to 48
(Type::Union(one), Type::Union(other)) => Type::eq_unions(one, other),
(Type::Union(one), other) => Type::eq_union_normal(one, other),
(other, Type::Union(one)) => Type::eq_union_normal(one, other),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we perhaps implement PartialEq for just the Union struct? I agree with @mks-h that we shouldn't make the same mistake with defining that custom PartialEq

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that thought crossed my mind. i will expirement around that today

Co-authored-by: Phoenix Himself <[email protected]>
@Ph0enixKM
Copy link
Member

Never mind I just realized I reviewed an old piece of code

@Ph0enixKM
Copy link
Member

Actually we are going to have a duplicate code when #472 gets merged. Can we wait for it and extend the is_subset_of function declared in this issue?

@Mte90 Mte90 removed their request for review September 20, 2024 08:28
@b1ek b1ek marked this pull request as draft September 23, 2024 08:27
Copy link
Member

@Ph0enixKM Ph0enixKM left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code simplification requested

@@ -39,6 +39,17 @@ fn run_function_with_args(meta: &mut ParserMetadata, mut fun: FunctionDecl, args
// Check if the function argument types match
if fun.is_args_typed {
for (index, (arg_name, arg_type, given_type)) in izip!(fun.arg_names.iter(), fun.arg_types.iter(), args.iter()).enumerate() {

// union type matching works differently in functions
if let Type::Union(union) = &arg_type {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't Union be also handled in Type::is_allowed_in?

@@ -54,7 +59,12 @@ impl Display for Type {
write!(f, "[{}]", t)
},
Type::Failable(t) => write!(f, "{}?", t),
Type::Generic => write!(f, "Generic")
Type::Generic => write!(f, "Generic"),

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function try_parse_type could be simply extended with a recursive approach. Here is a code block that illustrates how can it be achieved without introducing new functions:

    if token(meta, "|").is_ok() {
        // We're parsing this function recursively
        match (res, try_parse_type(meta)) {
            // And we flatten the result into a single union
            (Ok(lhs), Ok(rhs)) => return Ok(Type::Union([&[lhs], &rhs[..]].concat()))
            (Err(_), _) => error!(meta, tok, "Expected type before '|'.")
            (_, Err(_)) => error!(meta, tok, "Expected type after '|'.")
        }
    }

Here is the full function for a reference:

Full implementation of try_parse_type
// Tries to parse the type - if it fails, it fails quietly
pub fn try_parse_type(meta: &mut ParserMetadata) -> Result<Type, Failure> {
    let tok = meta.get_current_token();
    let res = match tok.clone() {
        Some(matched_token) => {
            match matched_token.word.as_ref() {
                "Text" => {
                    meta.increment_index();
                    Ok(Type::Text)
                },
                "Bool" => {
                    meta.increment_index();
                    Ok(Type::Bool)
                },
                "Num" => {
                    meta.increment_index();
                    Ok(Type::Num)
                },
                "Null" => {
                    meta.increment_index();
                    Ok(Type::Null)
                },
                "[" => {
                    let index = meta.get_index();
                    meta.increment_index();
                    if token(meta, "]").is_ok() {
                        Ok(Type::Array(Box::new(Type::Generic)))
                    } else {
                        match try_parse_type(meta) {
                            Ok(Type::Array(_)) => error!(meta, tok, "Arrays cannot be nested due to the Bash limitations"),
                            Ok(result_type) => {
                                token(meta, "]")?;
                                Ok(Type::Array(Box::new(result_type)))
                            },
                            Err(_) => {
                                meta.set_index(index);
                                Err(Failure::Quiet(PositionInfo::at_eof(meta)))
                            }
                        }
                    }
                },
                // Error messages to help users of other languages understand the syntax
                text @ ("String" | "Char") => {
                    error!(meta, tok, format!("'{text}' is not a valid data type. Did you mean 'Text'?"))
                },
                number @ ("Number" | "Int" | "Float" | "Double") => {
                    error!(meta, tok, format!("'{number}' is not a valid data type. Did you mean 'Num'?"))
                },
                "Boolean" => {
                    error!(meta, tok, "'Boolean' is not a valid data type. Did you mean 'Bool'?")
                },
                array @ ("List" | "Array") => {
                    error!(meta, tok => {
                        message: format!("'{array}'<T> is not a valid data type. Did you mean '[T]'?"),
                        comment: "Where 'T' is the type of the array elements"
                    })
                },
                // The quiet error
                _ => Err(Failure::Quiet(PositionInfo::at_eof(meta)))
            }
        },
        None => {
            Err(Failure::Quiet(PositionInfo::at_eof(meta)))
        }
    };

    if token(meta, "?").is_ok() {
        res = Ok(res.map(|t| Type::Failable(Box::new(t))));
    }
    
    if token(meta, "|").is_ok() {
        // We're parsing this function recursively
        match (res, try_parse_type(meta)) {
            // And we flatten the result into a single union
            (Ok(lhs), Ok(rhs)) => return Ok(Type::Union([&[lhs], &rhs[..]].concat()))
            (Err(_), _) => error!(meta, tok, "Expected type before '|'.")
            (_, Err(_)) => error!(meta, tok, "Expected type after '|'.")
        }
    }

    res
}

@Ph0enixKM Ph0enixKM changed the base branch from master to staging December 19, 2024 11:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants