Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the new MetadataValidator in C#, Java, JavaScript, Ruby, PHP, and MATLAB #3256

Conversation

InsertCreativityHere
Copy link
Member

@InsertCreativityHere InsertCreativityHere commented Dec 11, 2024

C++ and Parser Metadata

  • all that changed was some style for consistency and readability.

MATLAB, PHP, and Ruby

  • We don't support any language-specific metadata in these languages. But now we will explicitly reject metadata for these
    languages, whereas before we were just silently ignoring it (ie. if a user put php:type now we reject it).

JS

  • We had no validation of javascipt metadata before this PR. Now we do.
  • Previously, the code-gen allowed js:defined-in on any user-defined type, now we restrict it to only classes and interfaces.
    The docs say that this metadata is for forward-declarations, so we should only support it on forward-declarable types.

C#

  • We are now validating metadata on interfaces (we weren't before)
  • We are now validating inner metadata on sequences, dictionaries, and constants (we weren't before)
  • slice2cs was validating file metadata in #included files? (we no longer do this)
  • Removed validation for the 'cs:type' metadata which may or may not of existed?

Java

  • We are now validating metadata on interfaces (we weren't before)
  • We are now validating inner metadata on dictionaries, and constants (we weren't before)
  • slice2java was validating file metadata in #included files? (we no longer do this)
  • Removed 'validation' for these dead directives: ["java:tie", "java:optional"]
  • Restricted the 'java:serialVersionUID' metadata to only be allowed on 'class', 'exception' and 'struct' (the only things where we check for it).
  • Disallow using both 'java:type' and 'java:buffer' on the same sequence; both compete for the mapped type.

@@ -73,68 +73,62 @@ Slice::validateMetadata(const UnitPtr& p, string_view prefix, map<string, Metada
// So, we add all the language-agnostic metadata validation into the provided list.

// "amd"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All that I changed here was the style. Compared to brace initialization, I found that setting each field is:

  1. more readable, because you can see the field names to know what is being set.
  2. reduces clutter, by only settings the fields I care about, instead of having to repeat default values just so I can set a field which is further down the struct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to use structure brace initialization. You an specify the field names when initializing a struct.

@@ -350,7 +344,7 @@ MetadataVisitor::isMetadataValid(const MetadataPtr& metadata, const SyntaxTreeBa

// Fourth, we check what the metadata was applied to - does that Slice definition support this metadata?
SyntaxTreeBasePtr appliedTo;
if (info.acceptedContexts == MetadataApplicationContext::Definitions)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also changed the name acceptedContexts to be singular.

Making it plural gives the impressions this enum is a bit-flag, where you can 'set multiple contexts'.
That's not what it is; you can only set a single value for it.

@@ -105,7 +105,7 @@ Slice::Metadata::Metadata(string rawMetadata, string file, int line) : GrammarBa
{
// Check if the metadata starts with a language prefix.
// NOTE: It is important that this list is kept in alphabetical order!
constexpr string_view languages[] = {"cpp", "cs", "java", "js", "matlab", "php", "python", "ruby", "swift"};
constexpr string_view languages[] = {"cpp", "cs", "java", "js", "matlab", "php", "python", "rb", "swift"};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should match the names of their respective compilers.
(Except python, where we used an inconsistent name for it's metadata...)

@@ -859,33 +859,28 @@ Slice::Gen::validateMetadata(const UnitPtr& u)
map<string, MetadataInfo> knownMetadata;

// "cpp:array"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. all that changed was the style.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto for my earlier commit. Another option is to add a constructor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed (by using brace initialization).

@@ -230,7 +227,7 @@ namespace Slice
// metadata of the type's original definition, as well as any optional
// metadata that typically represents a data member or parameter.
//
static bool hasTypeMetadata(const TypePtr&, const MetadataList& = MetadataList());
static bool hasTypeMetadata(const SequencePtr&, const MetadataList& = MetadataList());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is only called on sequences and is all about checking for sequence metadata.

Comment on lines +1029 to +1031
// Check if either 1) No type metadata was applied to this sequence at all or 2) 'java:type' was applied to
// where the sequence is used (which overrides any metadata on the definition) or 3) 'java:type' was applied
// to the sequence definition, and there is no metadata overriding it where the sequence is used.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An unrelated bug I stumbled across. We weren't always checking for these metadata in the same order, which could cause us to generate uncompilable code(because the types wouldn't match). Pretty niche corner case, but still.

You'll see me messing around with java:type, java:buffer, and java:serializable a few times in this file.
All I'm doing is making sure we check them in a consistent (and correct) order.

Comment on lines -1722 to -1723
BuiltinPtr builtin = dynamic_pointer_cast<Builtin>(seq->type());
if (builtin && builtin->kind() == Builtin::KindByte)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validator already ensures that java:serializable only goes on byte sequences.
No need to repeat the validation here.

Comment on lines -1741 to -1743
if (builtin && (builtin->kind() == Builtin::KindByte || builtin->kind() == Builtin::KindShort ||
builtin->kind() == Builtin::KindInt || builtin->kind() == Builtin::KindLong ||
builtin->kind() == Builtin::KindFloat || builtin->kind() == Builtin::KindDouble))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same. No need to repeat the validation here.

@@ -2192,49 +1809,16 @@ Slice::JavaGenerator::getTypeMetadata(const MetadataList& metadata, string& inst
}

bool
Slice::JavaGenerator::hasTypeMetadata(const TypePtr& type, const MetadataList& localMetadata)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two simplifications:

  1. This is only called for sequences, so we can eliminate alot of casting/checking of types.
  2. This function included validation of the metadata, which was already being performed by the validator.


if ((builtin->kind() == Builtin::KindByte || builtin->kind() == Builtin::KindShort ||
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, no need to repeat validation again.

@InsertCreativityHere InsertCreativityHere marked this pull request as ready for review December 11, 2024 17:11
@externl externl requested a review from Copilot December 11, 2024 19:07

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 2 out of 17 changed files in this pull request and generated no suggestions.

Files not reviewed (15)
  • cpp/src/Slice/MetadataValidation.cpp: Language not supported
  • cpp/src/Slice/MetadataValidation.h: Language not supported
  • cpp/src/Slice/Parser.cpp: Language not supported
  • cpp/src/slice2cpp/Gen.cpp: Language not supported
  • cpp/src/slice2cs/CsUtil.cpp: Language not supported
  • cpp/src/slice2cs/CsUtil.h: Language not supported
  • cpp/src/slice2java/Gen.cpp: Language not supported
  • cpp/src/slice2java/JavaUtil.h: Language not supported
  • cpp/src/slice2js/Gen.cpp: Language not supported
  • cpp/src/slice2js/Gen.h: Language not supported
  • cpp/src/slice2matlab/Main.cpp: Language not supported
  • cpp/src/slice2php/Main.cpp: Language not supported
  • cpp/src/slice2php/PHPUtil.h: Language not supported
  • cpp/src/slice2rb/RubyUtil.cpp: Language not supported
  • cpp/src/slice2rb/RubyUtil.h: Language not supported
@@ -73,68 +73,62 @@ Slice::validateMetadata(const UnitPtr& p, string_view prefix, map<string, Metada
// So, we add all the language-agnostic metadata validation into the provided list.

// "amd"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to use structure brace initialization. You an specify the field names when initializing a struct.

@@ -859,33 +859,28 @@ Slice::Gen::validateMetadata(const UnitPtr& u)
map<string, MetadataInfo> knownMetadata;

// "cpp:array"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto for my earlier commit. Another option is to add a constructor.

Comment on lines 98 to 99
os << "failed to parse custom serialVersionUID value for " << p->kindOf() << " `" << p->scoped()
<< "'; generating default value";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to your PR, I think this should be an error, not a warning.

MetadataInfo serialVersionUIDInfo;
serialVersionUIDInfo.validOn = {typeid(ClassDecl), typeid(Slice::Exception), typeid(Struct)};
serialVersionUIDInfo.acceptedArgumentKind = MetadataArgumentKind::SingleArgument;
knownMetadata.emplace("java:serialVersionUID", std::move(serialVersionUIDInfo));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we do the validation serialVersionUIDInfo here in an extraValidation function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!
I moved the extra stoll check into the metadata validation, instead of doing it on the spot.

// "cs:attribute"
MetadataInfo attributeInfo;
attributeInfo.validOn = {
typeid(Unit),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Unit correspond with file scoped metadata?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is exactly correct.

typeid(Operation),
typeid(Slice::Exception),
typeid(Struct),
typeid(Enum),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we missing Enumerators here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there's an open issue for adding it (#3246), but I didn't want to change too much in this PR.
So my validator stays as close to the original validation, including any mistakes that were in it.

After I've updated the compilers to use my MetadataValidator, then I'll go back and fix all the bugs that it's uncovered.

if (auto mod = dynamic_pointer_cast<Module>(p))
{
// Top-level modules are contained by the 'Unit'. Non-top-level modules are contained in 'Module's.
if (!dynamic_pointer_cast<Unit>(mod->container()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would merge the two nested if clauses using && condition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed as suggested.

@@ -3703,17 +3703,13 @@ void
Slice::Gen::HelperVisitor::visitSequence(const SequencePtr& p)
{
BuiltinPtr builtin = dynamic_pointer_cast<Builtin>(p->type());
if (!hasTypeMetadata(p) && builtin && builtin->kind() <= Builtin::KindString)
bool mappedToCustomType = p->hasMetadata("java:type");
if (builtin && builtin->kind() < Builtin::KindObject && !mappedToCustomType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

builtin && builtin->kind() < Builtin::KindObject Don't we have a more idiomatic way to do this check? It is not clear what it means < Builtin::KindObject

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We did not, but I added a method named mapsToJavaBuiltinType that we call now instead.
It's the same logic, but hopefully the name makes it more digestable.

Comment on lines 3706 to 3712
if (!hasTypeMetadata(p) && builtin && builtin->kind() <= Builtin::KindString)
bool mappedToCustomType = p->hasMetadata("java:type");
if (builtin && builtin->kind() < Builtin::KindObject && !mappedToCustomType)
{
return; // No helpers for sequence of primitive types
}
else if (hasTypeMetadata(p) && !p->hasMetadata("java:type"))
{
return; // No helpers for custom metadata other than java:type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if I get this right:

The previous check for the second if applies to non built-in types before, With the new check there is no need for this because java:bufer and java:serializable only apply to sequences of built-in types?

os << "ignoring invalid serialVersionUID for " << p->kindOf() << " `" << p->scoped()
<< "'; generating default value";
ostringstream os;
os << "failed to parse custom serialVersionUID value for " << p->kindOf() << " `" << p->scoped()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
os << "failed to parse custom serialVersionUID value for " << p->kindOf() << " `" << p->scoped()
os << "failed to parse custom serialVersionUID value for " << p->kindOf() << " '" << p->scoped()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with @externl's suggestion and moved this check into the metadata validation section,
and I used single quotes (as I always do when writing new code).

builtin->kind() == Builtin::KindInt || builtin->kind() == Builtin::KindLong ||
builtin->kind() == Builtin::KindFloat || builtin->kind() == Builtin::KindDouble))
static const string bytebuffer = "java:buffer";
if ((seq->hasMetadata(bytebuffer) || hasMetadata(bytebuffer, metadata)) && !hasMetadata("java:type", metadata))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&& !hasMetadata("java:type", metadata) is this to ensure java:type metadata in the parameter overrides java:buffer in the definition?

Do we allow both java:buffer and java:type in the same definition? Or is just one in the definition and override it in the param with another?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this to ensure java:type metadata in the parameter overrides java:buffer in the definition?

Precisely. But more generally we want to make sure that parameter metadata always overrides definition metadata.

Do we allow both java:buffer and java:type in the same definition?

No, we explicitly check that you can only have one of java:type, java:buffer or java:serializable on your definition. The only way to have "two of them" is by putting one on the definition and one on the parameter.
And in this case, parameter metadata is always given priority.

So we know that if seq->hasMetadata(bytebuffer) is true, then the only place we need to check is the parameter's metadata (and for completeness, java:serializable can only go on the definition, so the only thing left to check for is whether java:type was put on the parameter).


// This check for 'hasMetadata("java:type")' looks redundant, but is necessary to ensure
// that 'java:type' was placed on the _definition_ and not on where the sequence was used.
if (seq->hasMetadata("java:type") && seq->hasMetadata("java:buffer"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the check seq->hasMetadata("java:buffer") isn't this validator only called for java:buffer? Or is the comment above supposed to refer to java:buffer instead of java:type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, the comment was supposed to refer to java:buffer, I got myself mixed up.
Fixed.

if (auto mod = dynamic_pointer_cast<Module>(p))
{
// Top-level modules are contained by the 'Unit'. Non-top-level modules are contained in 'Module's.
if (!dynamic_pointer_cast<Unit>(mod->container()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would merge the nested if conditions here too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also changed!

MetadataInfo attributeInfo;
attributeInfo.validOn = {
typeid(Unit),
typeid(Module),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.NET does not support attributes on C# namespaces:
https://learn.microsoft.com/en-us/dotnet/api/system.attributetargets?view=net-9.0

So this looks bogus.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this was the old behavior which I just tried to preserve.
I added this to the C# metadata issue I already opened: #3246 (comment)

continue;
}
return "'cs:generic:" + argument +
"' is not supported on sequences of objects; only 'List' is supported for object sequences";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not use 'objects', as it's confusing. sequence of classes? sequence of class instances?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with sequences of classes types, I can see how "objects" would be confusing. Now it reads:

 "'cs:generic:" + argument + "' is not supported on sequences of classes; only 'List' is supported for such sequences"


// "cs:property"
MetadataInfo propertyInfo;
propertyInfo.validOn = {typeid(ClassDecl), typeid(Slice::Exception), typeid(Struct)};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For validOn, we must used ClassDecl/InterfaceDecl and not ClassDef/InterfaceDef?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, after my PR for simplifying the Decl/Def situation, you must always pass in the Decls.
There's a comment saying so in MetadataValidator.h:

/// Note that `ClassDef` and `InterfaceDef` should never appear in this list, since class & interface metadata
/// is always stored on the corresponding `ClassDecl` and `InterfaceDecl` types, which should be used instead.
std::list<std::reference_wrapper<const std::type_info>> validOn;

getsetInfo.acceptedArgumentKind = MetadataArgumentKind::NoArguments;
knownMetadata.emplace("java:getset", std::move(getsetInfo));

// "java:implements"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work?

Without partial (like in C#), how is the user going to implement any of the interface methods?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the docs:

When used with structs, this metadata can only refer to interfaces without operations. With classes or interfaces, the generated Java interface will be marked as abstract. The code is responsible for registering a value factory if the Slice class is transferred over-the-wire and uses this metadata to implement native Java interfaces.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear: even though it's documented, it's not usable and should be removed.

@InsertCreativityHere InsertCreativityHere merged commit d9a2cd9 into zeroc-ice:main Dec 16, 2024
19 checks passed
InsertCreativityHere added a commit to InsertCreativityHere/compiler-comparison that referenced this pull request Jan 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants