SplitAt implementation #316
A test for an infinite sequence is missing. Something like:
|
@leandromoh That test can never work. Note the remarks comments:
/// <remarks>
/// This method uses deferred execution semantics but iterates the
/// <paramref name="source"/> entirely when either of the sequences
/// resulting from split are iterated.
/// </remarks>
Since the sequence will be iterated entirely, an infinite one will never end, or will stop only when memory has been exhausted. |
I replaced your definition with the following and all tests passed, including the one I suggested for an infinite sequence. Would the following version be preferred after some optimizations?
where
|
@leandromoh So you completely changed the implementation to make it work with infinite sources. I'd say it's a little detail you missed out earlier. 😆 I have a hard time reasoning about the code now due to all the side-effects ( Thanks for your suggestions & ideas. Keep 'em coming! |
The idea was to create the simplest draft that passes all the tests. Complex code becomes hard to reason about. As the C.A.R. Hoare quote goes: “There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.”
Since my draft passed all the written tests, I suppose it is a perfectly valid implementation.
I think all LINQ operators must work with infinite sequences, except aggregation functions (
As I said, it's only a draft without all possible optimizations. If we take it as the current implementation for this PR, we can add more optimizations to it and, in the worst case, we can apply the mini memoize to it too and make it efficient.
I think it should be avoided, and the sequence must be iterated as lazily as possible. Imagine that the source is long, for example 10,000 elements, and the user just splits it at 10 and consumes only its first 25 elements (first = 10 and second = 15). The current implementation iterates 9,975 elements for nothing. This doesn't make sense and the user of course doesn't expect it. It also doesn't perform well.
I think it is more important that methods work with infinite sequences than that they are optimized. Users are not too concerned about optimizations, but they expect a method to work with an infinite list if they are not iterating it entirely.
You are welcome, I am here to help and contribute 😆
To be honest, the fact that #211 was submitted 6 or 7 months ago and is still not acceptable for merging has left me a bit exhausted. I will be happy when it is released but, for now, I think it is better not to count on it or wait for it. |
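To make the laziness argument above concrete, here is a minimal sketch of a split that pulls from the source only on demand. It is not the implementation from this PR or from either draft: `LazySplitAt`, `Naturals` and the shared-buffer strategy are hypothetical names and choices made only for illustration, and the sketch is neither thread-safe nor memory-bounded (everything read so far stays in the buffer).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazySplitSketch
{
    // Hypothetical helper, not MoreLINQ API: both halves read through one
    // shared buffer that is filled from the source only when needed.
    public static (IEnumerable<T> First, IEnumerable<T> Second) LazySplitAt<T>(
        this IEnumerable<T> source, int index)
    {
        var buffer = new List<T>();
        var enumerator = new Lazy<IEnumerator<T>>(() => source.GetEnumerator());
        var exhausted = false;

        // Yields buffer[start..end), pulling more elements only when the
        // requested position has not been buffered yet.
        IEnumerable<T> Read(int start, int end)
        {
            for (var i = start; i < end; i++)
            {
                while (i >= buffer.Count && !exhausted)
                {
                    if (enumerator.Value.MoveNext())
                    {
                        buffer.Add(enumerator.Value.Current);
                    }
                    else
                    {
                        exhausted = true;
                        enumerator.Value.Dispose();
                    }
                }

                if (i >= buffer.Count)
                    yield break; // source ended before position i

                yield return buffer[i];
            }
        }

        return (Read(0, index), Read(index, int.MaxValue));
    }

    // Infinite source that reports every element it is asked to produce.
    public static IEnumerable<int> Naturals(Action onPull)
    {
        for (var i = 0; ; i++)
        {
            onPull();
            yield return i;
        }
    }

    public static void Main()
    {
        var pulled = 0;
        var (first, second) = Naturals(() => pulled++).LazySplitAt(10);

        Console.WriteLine(string.Join(",", first));            // 0,1,2,3,4,5,6,7,8,9
        Console.WriteLine(string.Join(",", second.Take(15)));  // 10 through 24
        Console.WriteLine(pulled);                             // 25
    }
}
```

In this sketch the infinite source is asked for exactly the 25 elements that were consumed. The price is the shared mutable state between the two halves, which is the kind of side effect that was called hard to reason about earlier in the thread.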
Love that quote by Hoare every time I read it.
True, but aren't you assuming that the tests are perfect? 😃 Look how in #319 I made a capital mistake because I was missing a test (not a problem of coverage). I agree with all your points about staying as lazy as possible. I realized the limitations and implementation choices, which is why I called them out in the remarks sections. I'll reconsider based on your feedback.
Sorry to hear that. If it's any consolation, it's been very consuming to review and fix.
That's a shame because a lot of work & thought went into it. I feel now that my initial inclination to just go with the version in System.Interactive may have been the right call. |
No problem. We all have a life outside GitHub, with duties and priorities.
I think the problem is not about the work we spent, but about defining an MVP. The Memoize PR's release has been stalled for 4 months only because "Dispose should really mean Reset,". |
@atifaziz Any news? |
@leandromoh Not much news, I'm afraid. 😞 I don't know if you followed #337, but I think the idea of using a reader monad to express partial consumption would work here with large and infinite sequences. For example:
var source = MoreEnumerable.Generate(0, x => x + 1);
var (xs, ys) = source.SplitAt(10, ys => ys.Take(25));
So |
Merge conflicts resolved: - MoreLinq/MoreLinq.csproj - MoreLinq.Test/TestingSequence.cs
# Conflicts: # MoreLinq.Test/MoreLinq.Test.csproj
2a094a8 updates the implementation based on my revised proposal for #315 where split parts are returned as sub-sequences instead of a pivoted couple (tuple of 2) of parts. 📝 Notes:
|
@atifaziz I like this new implementation a lot. How about accepting a predicate (Func<T, int, bool>) to indicate a split? This would allow the user to split by other criteria. You could write the index overload in terms of this new overload by wrapping the int[] offsets in the predicate, e.g. (e, index) => offsets.Contains(index). |
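The suggestion above could look roughly like the following sketch. The names `SplitWhen` and `SplitAtOffsets` are hypothetical and are not part of MoreLINQ or of this PR; the sketch also assumes that a split happens just before the element for which the predicate returns true, which the thread does not specify.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PredicateSplitSketch
{
    // Hypothetical overload: starts a new part just before every element
    // for which isSplitPoint(element, index) returns true.
    public static IEnumerable<IReadOnlyList<T>> SplitWhen<T>(
        this IEnumerable<T> source, Func<T, int, bool> isSplitPoint)
    {
        var part = new List<T>();
        var index = 0;

        foreach (var item in source)
        {
            if (isSplitPoint(item, index++))
            {
                yield return part;   // close the current part
                part = new List<T>();
            }

            part.Add(item);
        }

        yield return part;           // trailing part (may be empty)
    }

    // The offsets overload expressed through the predicate overload,
    // as proposed in the comment above.
    public static IEnumerable<IReadOnlyList<T>> SplitAtOffsets<T>(
        this IEnumerable<T> source, params int[] offsets)
        => source.SplitWhen((item, index) => offsets.Contains(index));

    public static void Main()
    {
        var parts = Enumerable.Range(0, 10).SplitAtOffsets(2, 5);

        // Prints: 0,1 | 2,3,4 | 5,6,7,8,9
        Console.WriteLine(string.Join(" | ", parts.Select(p => string.Join(",", p))));
    }
}
```

Note that `offsets.Contains(index)` is a linear scan per element, so a real implementation would probably walk sorted offsets instead. The wrapper is kept this simple only to mirror the one-line idea from the comment.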
@leandromoh Glad to hear.
It's an idea, but we have to be careful to avoid confusion with |
I did not know that Segment can receive a predicate. Why not write SplitAt in terms of Segment? |
It would almost work but it can't be used because |
|
This PR is an implementation of SplitAt as proposed in #315.