-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Until #59
Until #59
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for the PR, this looks like a great addition, and the careful unit tests for edge cases etc. are much appreciated.
I have some questions, see what you think.
Also, I've just recently added black to our set of tools, it would help if you could install pre-commit to apply all the linters/formatters - see https://parsy.readthedocs.io/en/latest/contributing.html Otherwise I can do it.
src/parsy/__init__.py
Outdated
# behaves as times if parser is not defined | ||
if not other: | ||
return self.times(min, max)(stream, index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this helpful behaviour? I would have thought that if you don't want other
, you should be using times
. Was there a particular use case that was driving this? I can't imagine doing p.until(None)
if I wanted p.times(min=1)
.
Personally I would just delete this branch, we don't do any similar None checks in other places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not having it checked is fine.
src/parsy/__init__.py
Outdated
@@ -181,6 +181,53 @@ def at_least(self, n): | |||
def optional(self): | |||
return self.times(0, 1).map(lambda v: v[0] if v else None) | |||
|
|||
def until(self, other, min=1, max=float('inf'), consume_other=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should discuss the default of min=1
. It's not consistent with many
, and I think surprising.
For example, this code:
>>> foo = regex('.').until(string('x'))
>>> foo.parse_partial('xxx')
I would naturally read this as "parse any character repeatedly until you reach an x
". So the result, which is (['x'], 'xx')
as it stands, is more surprising than ([], 'xxx')
IMO.
To back that up, consider this progression:
>>> foo.parse_partial("abx")
>>> foo.parse_partial("ax")
>>> foo.parse_partial("x")
I would expect the progression:
(['a', 'b'], 'x')
(['a'], 'x')
([], 'x')
But in fact the last raises a ParseError
.
I'm open to other counter examples though. It boils down to the difference between *
and +
in regexes, but I think the existing behaviour of many
is significant here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is exactly like * and + and I am fine with either. I will change it to min=0.
src/parsy/__init__.py
Outdated
f'at most {max} items') | ||
|
||
# failed, try parser | ||
result = self.__call__(stream, index).aggregate(result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could just be self(stream, index)
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, i was just being (too) verbose.
I will make the changes and push soon. |
Streamlined code
Oh and I would rather have you handle the pre-commit for now. |
Thanks so much, I merged a squashed version of this with black applied and a few small tweaks. |
This is an implementation of the manyTill method mentioned in Issue #14.
It adds functionality similar to the PR #31, which seems to have stalled, but this time implemented as a method.
It has options for min, max and consume_other, which is False by default.