Provide CleanCat schema with a context #39
Comments
What would the interface for the context be? I'm afraid we might have to introduce a breaking change in the library's API. The current interface for field validation passes the value as a parameter. If we add another parameter like And we can't put context into an attribute on the Field object, since it's instantiated once during schema definition and never changes. |
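To illustrate why storing the context on the field is a problem, here is a minimal sketch. The `Field` and `Schema` classes below are simplified stand-ins written for this example, not CleanCat's actual classes, and the `context` attribute is hypothetical.

```python
class Field:
    """Stand-in for a schema field; not CleanCat's actual class."""

    def __init__(self):
        self.context = None  # hypothetical attribute, for illustration only


class Schema:
    """Stand-in schema that (unsafely) stores per-call context on its fields."""

    def __init__(self, data, context=None):
        self.data = data
        for attr in vars(type(self)).values():
            if isinstance(attr, Field):
                attr.context = context  # mutates the shared class-level Field


class MySchema(Schema):
    user_id = Field()  # instantiated once, when the class body runs


first = MySchema({"user_id": 1}, context={"tenant": "a"})
second = MySchema({"user_id": 2}, context={"tenant": "b"})

# Both schema instances share the same Field object, so the second
# context silently overwrote the first one.
shared_field = MySchema.user_id
leaked_context = shared_field.context
```

Because the field lives on the class, any per-request state written to it leaks between schema instances, which is why the context would have to travel as an argument or live on the schema instance instead.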
Stepping back for a second, what is the intended use of CleanCat schemas?
Just want to make sure that we want to use CleanCat schemas for both (1) and (2)* before we discuss the exact interface.
Yep, this SGTM.
If we bump the package version up to Cc @thomasst @jkemp101 @mpessas. * Also, if we want to use CleanCat schemas for both application-level and domain-level validation, then should we always use a separate schema class for each? |
Passing a context so you can validate references and making it v1.0.0 SGTM. Would like to see some examples of how this will affect reference fields. Also, we should consider making it clear that the convention to override functions should include additional args like |
Can we have an example of the problem we are trying to solve here?
|
I think the general problem is that you don't want to query the database from within a cleancat schema, e.g. validating reference fields. |
I agree this is mixing up shape validation and content validation, but it's so tempting to do so. Imagine this "before" state:

class MyObjectSchema(Schema):
    user_id = Integer()
    another_user_id = Integer()
    my_field = String()

@view
def create_object():
    database = ...
    try:
        my_object = MyObjectSchema(request.json()).full_clean()
    except ValidationError as e:
        raise BadRequest(e.args[0])
    if not database.user_exists(my_object['user_id']):
        raise BadRequest({'field-errors': {'user_id': 'User does not exist'}})
    if not database.user_exists(my_object['another_user_id']):
        raise BadRequest({'field-errors': {'another_user_id': 'User does not exist'}})
    save(my_object)

CleanCat gives us very structured error reporting which API clients can correlate to original fields. I would love to (a) also have this structure generated for me and (b) have more declarative validation. For example, when I define a schema, I don't care whether a user ID is an int, a string, or a more complex object. I just want to say "user". This is how it could look (notice how we don't re-create the error structure by hand anymore):

class UserField(Integer):
    def validate(self, value, context):
        value = super().validate(value, context)
        if not context['database'].user_exists(value):
            raise ValidationError('User does not exist')
        # Return the value so subclasses and the schema receive it.
        return value

class MyObjectSchema(Schema):
    user_id = UserField()
    another_user_id = UserField()
    my_field = String()

@view
def create_object():
    database = ...
    try:
        my_object = MyObjectSchema(request.json(), context={'database': database}).full_clean()
    except ValidationError as e:
        raise BadRequest(e.args[0])
    save(my_object)
|
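For readers who want to try the idea outside of a web view, here is a self-contained sketch of a schema that threads a `context` dict through to its fields. Everything in it (`Field`, `Integer`, `Schema`, `FakeDatabase`, `clean_or_errors`) is a simplified stand-in written for this example; it is not CleanCat's implementation, and the `validate(value, context)` signature is only the one proposed in the comment above.

```python
class ValidationError(Exception):
    pass


class Field:
    def validate(self, value, context):
        return value


class Integer(Field):
    def validate(self, value, context):
        value = super().validate(value, context)
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValidationError("Value must be an integer.")
        return value


class UserField(Integer):
    def validate(self, value, context):
        value = super().validate(value, context)
        if not context["database"].user_exists(value):
            raise ValidationError("User does not exist")
        return value


class Schema:
    def __init__(self, raw_data, context=None):
        self.raw_data = raw_data
        self.context = context or {}

    def full_clean(self):
        cleaned, errors = {}, {}
        # Only fields declared directly on the subclass are considered here.
        for name, field in vars(type(self)).items():
            if not isinstance(field, Field):
                continue
            try:
                cleaned[name] = field.validate(self.raw_data.get(name), self.context)
            except ValidationError as e:
                errors[name] = str(e)
        if errors:
            raise ValidationError({"field-errors": errors})
        return cleaned


class FakeDatabase:
    """In-memory replacement for the real database, handy in tests."""

    def __init__(self, user_ids):
        self._user_ids = set(user_ids)

    def user_exists(self, user_id):
        return user_id in self._user_ids


class MyObjectSchema(Schema):
    user_id = UserField()
    another_user_id = UserField()


def clean_or_errors(payload, database):
    try:
        return MyObjectSchema(payload, context={"database": database}).full_clean()
    except ValidationError as e:
        return e.args[0]


result_ok = clean_or_errors({"user_id": 1, "another_user_id": 2}, FakeDatabase([1, 2]))
result_bad = clean_or_errors({"user_id": 1, "another_user_id": 3}, FakeDatabase([1, 2]))
```

Since the database arrives through the context rather than a global, the schema can be exercised in a test with a fake and no request or application setup.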
What cleancat is missing is support for This allows one to provide more complex field-level validation: the If cleancat did have support for this, one would write:

class MySchema(cleancat.Schema):
    def __init__(self, db):
        self._db = db

    def clean_user_id(self):
        # Use the database handle stored on the schema instance.
        if not self._db.user_exists(self.user_id):
            raise ValidationError('User does not exist')
But that approach already has a potential problem: it probably fetches the user twice, once in the field validation and once (presumably) in the rest of the code — this would be quite easy to miss. Moreover, (regarding your example) it does not allow for grouping database queries to the About the error structure, is there a reason we cannot raise a
Sometimes schemas are indeed useful to validate a dictionary of fields and values (no matter if they come from an HTTP request or not). But having support for So ideally I would want to see both: cleancat schemas supporting |
I actually didn't prioritize implementing this in cleancat because this isn't meant to do generic validation (e.g. something common like validate references -- Django doesn't do that). It's meant to handle specific validation cases that only apply to a given field. Maybe you could argue that in our cases every reference should have its explicit validation function, which makes the code less DRY, but I'd rather see if we can handle this through the field class. |
Well, you can always have The problem with the latter is that The "context" belongs to the form, not the field. |
The distinction @wojcikstefan made earlier I think is pretty important, and is why I'm against adding a context to cleancat schemas and validation. Consider these questions:
IMO cleancat is good at answering the first, but answering the second and third should be out of scope. I'd actually prefer the "before" state in @tsx's above example because there's a clear distinction between the static "is my data the right shape" and dynamic "is the user allowed to see this other user" validation paths. Providing an opaque context within the schema would serve to further conflate these concerns. |
It's fine if cleancat doesn't do that, but there should be some way to generate an error dict that contains all the validation errors for all of your 3 questions, and the nice thing about having it all in the schema is that you would just have the schema that takes care of all the validation. How do you suggest validation scenarios that require database queries and other information should be handled? |
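One possible way to keep the static and dynamic checks separate while still returning a single error dict is sketched below. The helper names (`shape_errors`, `reference_errors`, `validate`) and the `user_exists` callable are hypothetical and do not come from CleanCat; the `field-errors` key mirrors the structure used in the earlier example.

```python
def shape_errors(payload):
    """Static checks: is each field present and of the right type?"""
    errors = {}
    for name in ("user_id", "another_user_id"):
        if not isinstance(payload.get(name), int):
            errors[name] = "Value must be an integer."
    return errors


def reference_errors(payload, user_exists, skip=()):
    """Dynamic checks: do the referenced users exist?"""
    errors = {}
    for name in ("user_id", "another_user_id"):
        if name in skip:
            continue  # no point looking up a malformed ID
        if not user_exists(payload[name]):
            errors[name] = "User does not exist"
    return errors


def validate(payload, user_exists):
    """Run both phases and merge their results into one error dict."""
    errors = shape_errors(payload)
    errors.update(reference_errors(payload, user_exists, skip=errors))
    return {"field-errors": errors} if errors else None


known_users = {1, 2}
report = validate({"user_id": "x", "another_user_id": 3}, known_users.__contains__)
```

The two phases stay in separate functions, so the "right shape" question never touches the database, yet the client still gets one report keyed by field name.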
Ok so thinking about this a bit more and talking about it offline, here's what I've arrived at: Basically let's copy the So our normal The Here's an example with a user field:

def user_visibility_validator(value: str, context: CleanCatContext):
    repo = UserRepo(session=context.session)
    try:
        user = repo.fetch_by_id(value)
    except DoesNotExist:
        raise ValidationError
    if not user_visible_to_user(user, context.current_user):
        raise ValidationError

def user_converter(value: str, context: CleanCatContext) -> User:
    repo = UserRepo(session=context.session)
    return repo.fetch_by_id(value)

class UserField(cleancat.String):
    def clean(self, value):
        # Context-free shape check only; no database access here.
        value = super().clean(value)
        if len(value) != EXPECTED_USER_ID_LENGTH or not value.startswith('user_'):
            raise ValidationError
        return value

class MySchema(cleancat.Schema):
    user = UserField(
        validators=[user_visibility_validator], converters=[user_converter]
    )

# somewhere in a view
schema = MySchema({'user': ...})
schema.full_clean()
with atomic() as session:
    # calls all validators, then calls all converters
    schema.validate(
        context=CleanCatContext(current_user=current_user, session=session)
    )
|
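The two-phase flow proposed above (context-free `full_clean()`, then context-aware `validate()` that runs validators before converters) can be tried out with the stand-alone sketch below. All classes here are simplified stand-ins written for this example: `Context` takes the place of `CleanCatContext`, an in-memory `users` dict replaces the session and repository, and error aggregation is left out for brevity. None of it is CleanCat's actual API.

```python
from dataclasses import dataclass


class ValidationError(Exception):
    pass


@dataclass
class Context:
    """Stand-in for the CleanCatContext named in the comment above."""
    current_user: str
    users: dict  # hypothetical in-memory replacement for a DB session


class Field:
    def __init__(self, validators=(), converters=()):
        self.validators = list(validators)
        self.converters = list(converters)

    def clean(self, value):
        return value


class UserField(Field):
    def clean(self, value):
        value = super().clean(value)
        if not isinstance(value, str) or not value.startswith("user_"):
            raise ValidationError("Malformed user ID")
        return value


class Schema:
    def __init__(self, raw_data):
        self.raw_data = raw_data
        self.data = {}

    def _fields(self):
        return {n: f for n, f in vars(type(self)).items() if isinstance(f, Field)}

    def full_clean(self):
        # Phase 1: context-free shape validation.
        for name, field in self._fields().items():
            self.data[name] = field.clean(self.raw_data.get(name))
        return self.data

    def validate(self, context):
        # Phase 2: run every validator first, then every converter.
        for name, field in self._fields().items():
            for validator in field.validators:
                validator(self.data[name], context)
        for name, field in self._fields().items():
            for converter in field.converters:
                self.data[name] = converter(self.data[name], context)
        return self.data


def user_exists_validator(value, context):
    if value not in context.users:
        raise ValidationError("User does not exist")


def user_converter(value, context):
    return context.users[value]


class MySchema(Schema):
    user = UserField(validators=[user_exists_validator], converters=[user_converter])


context = Context(current_user="user_1", users={"user_2": {"name": "Ada"}})
schema = MySchema({"user": "user_2"})
schema.full_clean()
converted = schema.validate(context)
```

Running all validators before any converter means no value is replaced by a fetched object until every context-dependent check has passed.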
Looks good.
|
If you need to validate across fields it would probably make sense to just extend My thought with |
There should be a way to provide a CleanCat schema not only with the data we want to validate, but also the context in which we want to validate it.
Currently, you have to rely on global variables (like Flask's g and request) to determine the context you're working in, which is bad for readability, testing, separation of concerns, etc.