Issue #8099: Add binary:join/2 to stdlib #8100
CT Test Results: 2 files, 96 suites, 1h 7m 38s ⏱️
Results for commit 7a3bd5c. ♻️ This comment has been updated with latest results.
To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run tests locally.
// Erlang/OTP Github Action Bot
lib/stdlib/src/binary.erl
<<"a, b, c">>
```
""".
-doc(#{since => <<"OTP 27.0">>}).
Not sure if it'll make it into OTP 27 or not but I'm assuming that RC is still open 👍
lib/stdlib/src/binary.erl
-spec join([binary()], binary()) -> binary().
join([H], _Separator) -> H;
join([H | T], Separator) ->
    join(T, Separator, H).
as an option
-    join(T, Separator, H).
+    lists:foldl(fun(Element, Acc) -> <<Acc/binary, Separator/binary, Element/binary>> end, H, T).
Would that not lose some performance though? Performance being the main reason for this implementation.
There's very little change performance-wise with this suggestion, nor do I see a big reduction in complexity. Care to elaborate why you'd prefer this option? 😄
Just a wish to replace generic code patterns with standard library functions (to use higher-order functions instead of manual recursion).
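For readers comparing the two options, here is a minimal sketch of the `lists:foldl/3` variant suggested above, written out as a standalone module. The module name `join_foldl_sketch` is hypothetical and the code is only an illustration of the suggestion, not what the pull request contains.

```erlang
-module(join_foldl_sketch).
-export([join/2]).

%% Sketch of the suggested fold-based variant (hypothetical module name).
%% The first element seeds the accumulator; every following element is
%% appended after a separator.
-spec join([binary(), ...], binary()) -> binary().
join([H | T], Separator) ->
    lists:foldl(
      fun(Element, Acc) ->
              <<Acc/binary, Separator/binary, Element/binary>>
      end,
      H,
      T).
```

With a single-element list the fold has nothing to iterate over and simply returns that element, so no separate `[H]` clause is needed in this form.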
335d9fc to 26e06f7
4b2771c to e0df3f0
Squashed to get rid of some of the commit noise
e0df3f0 to 036b654
3fc013a to fa6d3b7
Fwiw, there was an attempt at this a few years back, that ended up not making it due to a core team decision. It's possible, though, that stuff's changed since then.
lib/stdlib/src/binary.erl
%% Starting with an empty binary convinces the compiler to use the new "private append" optimisation
Acc = <<>>,
join(T, Separator, <<Acc/binary, H/binary>>);
@bjorng @jhogberg @michalmuskala
I understand why starting with the empty string makes the compiler use the new "private append" optimisation. However I wonder if this could be generalized?
While H cannot be privately appended because it comes from an external function, the next iteration of join/3 can be privately appended, because no one uses the intermediate binary. Therefore, for those "closed loops", would it make sense to have a bit in the binary that tells when to private append or not? Generally speaking, everything after the first iteration would be privately appended. This would allow private append to happen in more situations, although I am not aware of the costs of reserving one extra bit for binaries.
PS: I may be completely off mark here.
> I understand why starting with the empty string makes the compiler use the new "private append" optimisation. However I wonder if this could be generalized?
> While H cannot be privately appended because it comes from an external function, the next iteration of join/3 can be privately appended, because no one uses the intermediate binary.
The private append operation is faster because it does fewer tests and less work than the general append operation. Therefore, the private append must only be used with a binary that has been specially prepared.
However, it should be possible for the compiler to generalize the optimization. If the call looks like:
join(T, Separator, H)
the compiler could rewrite it to:
Acc = <<>>,
join(T, Separator, <<Acc/binary, H/binary>>);
Not sure that this code pattern is common enough to make the optimization worthwhile to implement, though. It would not be needed if the clause is rewritten to:
join([_ | _]=List, Separator) when is_binary(Separator) ->
join(List, Separator, <<>>);
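To make the accumulator pattern under discussion concrete, here is a minimal, self-contained sketch that combines the reviewed clause (starting from an empty binary) with a `join/3` helper. The module name `join_sketch` and the exact shape of `join/3` are assumptions made for illustration; this is not necessarily the code that ended up in the pull request, and it leaves out error handling and the empty-list case.

```erlang
-module(join_sketch).
-export([join/2]).

%% Hypothetical module name; a sketch of the pattern discussed above.
-spec join([binary(), ...], binary()) -> binary().
join([H], _Separator) ->
    H;
join([H | T], Separator) ->
    %% Start from a binary created in this function, as in the snippet
    %% under review, so the appends in join/3 extend an accumulator
    %% that nothing else references.
    Acc = <<>>,
    join(T, Separator, <<Acc/binary, H/binary>>).

%% Assumed helper: appends a separator and the next element each step.
join([], _Separator, Acc) ->
    Acc;
join([H | T], Separator, Acc) ->
    join(T, Separator, <<Acc/binary, Separator/binary, H/binary>>).
```

For example, `join_sketch:join([<<"a">>, <<"b">>, <<"c">>], <<", ">>)` returns `<<"a, b, c">>`, matching the documentation example quoted earlier in the review.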
After the first release candidate, we generally focus on bug fixes and polishing of features already included or planned for the release. To ensure that Erlang/OTP 27 will be as good as it possibly can be, we need to minimize the time we spend on things not to be included in the release. Therefore, we will not review this pull request until after OTP 27 has been released. If we have not come back to it before September, feel free to remind us.
83b55e7 to 9aed9c9
Now we are back working on Erlang/OTP 28. OTP Technical Board will have to approve this addition to the Meanwhile, I've pushed a commit with suggested clean ups and proper error handling. Please review. If you approve, please rebase on the latest
9aed9c9 to 7a3bd5c
Thank you! Squashed and rebased now 👍
Thanks! Added to our daily builds.
Thanks for the pull request.
See linked issue for details.