-
Notifications
You must be signed in to change notification settings - Fork 322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Metadata
work with single table synthesizers
#2140
Make Metadata
work with single table synthesizers
#2140
Conversation
sdv/single_table/base.py
Outdated
self._validate_inputs(enforce_min_max_values, enforce_rounding) | ||
self.metadata = metadata | ||
self.metadata = single_metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this works but the only issue I see is that the get_metadata
method should return the new metadata class, and with this it will return the single table one.
@npatki I think the get_metadata method is the only public way to access the metadata. Is that right? The synthesizers have a .metadata
attribute, but that doesn't seem to be in the docs so Idk if that also should return the new metadata class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could update get_metadata()
to return a Metadata
object instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. Another question I have for @npatki is if that method should return a SingleTableMetadata in the case where that's what the user provided, or just always return the new object? I'm thinking of the case where the person saved the synthesizer on an older version, loads it up and does
synthesizer.get_metadata()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amontanez24 @lajohn4747 I imagined that get_metadata()
would always return the new, streamlined Metadata
object no matter when the synthesizer was first created.
From this point onwards, the streamlined Metadata object is the only external-facing metadata object in play. All references to SingleTableMetadata
and MultiTableMetadata
would either be deprecated or deleted in the docs.
Metadata
work with single table synthesizers
sdv/metadata/metadata.py
Outdated
@@ -50,6 +51,25 @@ def load_from_dict(cls, metadata_dict, single_table_name=None): | |||
instance._set_metadata_dict(metadata_dict, single_table_name) | |||
return instance | |||
|
|||
@classmethod | |||
def load_from_single_table_metadata(cls, single_metadata_table, table_name=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this method be private? (_
).
I will take a look into converting everything over now instead of trying to make Metadata produce a SingleTableMetadata. Doing it here: #2144 This current PR just pushes the eventual conversion down the line, but with the above PR, we could easily just delete SingleTableMetadata from synthesizers. |
sdv/single_table/base.py
Outdated
@@ -99,6 +107,14 @@ def __init__( | |||
): | |||
self._validate_inputs(enforce_min_max_values, enforce_rounding) | |||
self.metadata = metadata | |||
if isinstance(metadata, Metadata): | |||
self.metadata = metadata._convert_to_single_table() | |||
self.real_metadata = metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll remove this attribute and just convert it to the new metadata when calling get_metadata
@amontanez24 why not take over #2144 instead? |
I think rewriting the code to support the new Metadata class in internal classes is overkill. To maintain backwards compatibility, we have to make sure to convert the class over everywhere that uses it like you were doing in that PR. I think it ends up being harder to read and riskier when the change is really only required for user facing purposes. If we were to do another major release, then we could just drop these old classes and write everything with the new class which would be cleaner, but until then I don't think it's worth it. I'm also not sure we'll remove the concept of "SingleTableMetadata". The internal classes like constraints and the DataProcessor only work with one table at a time, so that construct still seems to make sense for them |
@@ -269,7 +267,7 @@ def get_parameters(self): | |||
|
|||
def get_metadata(self): | |||
"""Return the ``Metadata`` for this synthesizer.""" | |||
return self.real_metadata | |||
return Metadata.load_from_dict(self.metadata.to_dict()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this change, we always get a new instance of the metadata
(it's okay, I just want to confirm that this is the expected return).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah the docs say we return a copy so it should be fine
Co-authored-by: Andrew Montanez <[email protected]>
Co-authored-by: Andrew Montanez <[email protected]>
Co-authored-by: Andrew Montanez <[email protected]>
Co-authored-by: Andrew Montanez <[email protected]>
Co-authored-by: Andrew Montanez <[email protected]>
resolves #2128
CU-86b19amxa
Made metadata work with single table synthesizers by converting it to a
SingleTableMetadata
before inserting into the synthesizers.Added a variable called
.real_metadata
that is used to hold theMetadata
version. This will help with phasing outSIngleTableMetadata
in the future.get_metadata
is updated to always return thereal_metadata
which should be of type classMetadata
for interface consistency. Altered and fixed some tests so they ensureMetadata
is the type that is produced.This approach allows us to keep backward compatibility with
SingleTableMetadata
but still force the move over toMetadata
. We can converge methods later without the need of creating branch of metadata handling in all the our synthesizers.