-
Notifications
You must be signed in to change notification settings - Fork 6.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kernel: k_pipe: Rewrite k_pipe API #83283
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a pretty solid path to me. Some notes after a very quick read-through, the big ones being "fix the races" and "use ring_buf if possible". You're also going to (obviously) need test coverage for all the cases here, including ideally a stress that will bang pipes with variant read/write/buffer sizes from multiple threads on each end.
The process and architecture people will want to chime in as to migration strategy. As you'll discover I'm a lot (like... a lot) more tolerant of churn and cowboy evolution than most of the project. My guess is that people are going to demand a more conservative rollout. Maybe give the two APIs distinct names and let them live in the tree simultaneously for a version before flagging the old stuff deprecated. Put each under kconfigs such that they can be independently en/disabled, etc...
Also recognize that treewide deprecation means combing through all the in-tree code to make sure that nothing else is using the old layers, deleting or filtering test cases using the wrong stuff, etc... It's tedious and annoying and everyone will hate you when their precious out-of-tree code stops building; just be prepared. :)
kernel/pipe.c
Outdated
pipe->tail = 0U; | ||
pipe->count = 0U; | ||
|
||
// or -ECONNRESET ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh. You do seem to be trawling pretty deep into errno.h to try to find stuff I haven't seen before. FWIW, I think that's usually more harm than good unless you're extremely careful about documenting what each code means and how a user is expected to react. 99%+ of the time -EINVAL or -EAGAIN is all that's required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ve made sure to document the behavior carefully to address potential ambiguity.
In this case, the use of this specific error code is intended for a scenario where we want to signal to a consumer thread that the current work order or process is no longer valid, without needing to close the pipe or disrupt the threads that rely on it. I believe this is a valid use case that justifies going beyond the more common -EINVAL
or -EAGAIN
.
e46dad2
to
a7b74f2
Compare
Thanks for your initial review @andyross! I see an oportunity here I'd like to explore; you mention to use the |
That seems like an eminently reasonable choice too. Though FWIW if you have a pipe based on pre-existing primitives then its should probably live in lib/ somewhere and not kernel/. Whether the overhead is too high is sort of an app question. As it happens k_sem is really pretty lean and optimal, and I'd guess that kind of tool would benchmark faster than the pipe we have already. But surely you could do better at the bottom of the stack. I love optimization, but don't know that we should demand it. |
In addition to the points that Andy raised, there are a couple of implications that are worth pointing out related to the copying of data.
|
a7b74f2
to
17a7e7e
Compare
I’ve really enjoyed the learning experience of working with the scheduler, particularly with the
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments:
This was mentioned already, but you should consider using the ring_buffer
facility rather than open coding your own. This will save some binary
footprint not to have 2 different implementations linked in. Furthermore,
the most costly part in yours is by far the modulus operator on some targets
which the generic code dispense with.
You say:
I looked into the
ring_buffer.c
and it looks like it is two things in
one? Both an object queue and a ring_buffer.c for bytes.
Not sure what you mean here, however if it is the ring_buf_item_*
stuff
that looks confusing then you may safely ignore it. And better look in
ring_buffer.h
first instead.
The size of the ring_buffer struct alone would almost double the size of
the k_pipe object so I opted to not use it.
Given those fields you will no longer need, the k_pipe object would grow by
only 16 bytes. Unless you are using a gazillon of pipes, the tradeoff is in
favor of code reuse both in terms of binary size and maintainability.
(Since I almost had all the logic in place for the ring-buffer already).
Sorry but such arguments don't count. ;-)
Next, the pipe implementation itself is lacking. If you want to write
10 bytes but there is room for only 5 bytes then you should write those 5
bytes, flag any waiters, and wait up to the timeout to write the remaining 5
bytes. If suddenly you're woken up and there is only room for 3 bytes then
you should write those 3 bytes, flag potential waiters, readjust the timeout
for what remains of the original one, and wait again for the last 2 bytes.
Same logic goes for the reader. Currently you return early when there is not
enough room despite the timeout not being reached.
Timeout readjusting is tricky. You may look at sys_timepoint_timeout()
usage
in kheap.c for an example of how to do it simply.
17a7e7e
to
1357469
Compare
The distinction you’ve pointed out between this implementation and other kernel objects is intentional. By waking up as soon as there’s a chance to send or receive data, it aims to maximize concurrency, reduce blocking, and decrease complexity. I understand that this behavior deviates from existing patterns, which might cause confusion or misaligned expectations. However, this new behavior aligns closely with the concept of a byte-stream, which I believe makes it more intuitive for certain use cases.
The issue of indeterminate time elapsing between wakeup and execution is indeed a valid concern, especially when considering scenarios where the timeout might expire during this window. The current design assumes that the timeout applies to the operation's initiation, not its completion. Achieving precise timing for both initiation and completion would be challenging. A possible middle ground might be to document the timeout behavior explicitly in the API to set clear expectations for users. Regarding data transfers larger than the pipe buffer, this scenario is effectively handled by allowing operations to perform partial reads and writes. This mechanism ensures that even when the data size exceeds the buffer capacity, the transfer can proceed incrementally until completion. |
Ahaha Fair! I’ve taken another look at the implementation and made revisions to utilize the ring_buffer as suggested. While I initially hesitated due to the additional overhead, I agree that the tradeoff in terms of binary size and maintainability justifies the change. Reusing existing infrastructure is the better long-term choice. Regarding the
While I understand the reasoning behind partial writes and adjusting the timeout dynamically, I believe that returning early when there is insufficient space aligns better with the design goals of simplicity and efficiency. Continuously retrying with adjusted timeouts adds complexity (and possibly overhead), which might not be necessary for most use cases. Instead, handling partial writes and managing timeouts at a higher level can keep the implementation cleaner, more predictable and easier to test.
Thank you for the advice, I'll keep it in mind depending on the outcome of this discussion :) |
Something that struck me when I made the move to The
I recommend keeping
What are your thoughts on this? |
There's already a CONFIG_PIPES that gets the pipe code, just have that "select RING_BUFFER". It's true that our kconfigs are sometimes finer grained than they should be, but there's no shame in having the kernel depend on code in lib/ that's what it's there for. |
What are your thoughts on this?
Many things rely on ring_buffer.c already, so I'm rather surprized it is
not always compiled. I'd say it is light and quick to compile so no real
gain in leaving it out, given that bitarray.c is even bigger and always
built.
|
I propose removing the Including these functions in the interface suggests that their results can be relied upon for decision-making, which risks subtle bugs or misuse. If their intended purpose is to provide a "snapshot" for non-critical use cases, their utility seems limited. For a more robust and intuitive interface, I recommend removing these functions entirely, as doing so would simplify the API and reduce potential confusion for users. That said, these functions are currently used in |
aa754c0
to
814edfd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took another pass through. Looking cleaner now. The ring_buf usage looks correct and straightforward. To repeat though: I'm absolutely not above appreciating a rewritten data structure, but if you do it you need to show clear and obvious advantages. :)
As others are pointing out, the semantics of short counts are a little confusing. Zephyr IPC almost always takes timeout arguments, so ideally this code should sit in a loop trying to iteratively read/write until the timeout expires instead of returning on the first byte.
Code that actually does want to see the partial results (which is usually a rare edge case kind of thing, the whole idea of "file descriptor" style I/O is that it synthesizes a synchronous interface over async hardware behavior) can always try to read one byte as a sentinel and follow it with a call using K_NO_WAIT to grab the rest.
814edfd
to
c1f63fa
Compare
Twister 1 ( The following modification to the trace mappings resolves the issue for me:
Twister 2, seems to fail due to timeout rellated issues both in the |
- Shift the unix socket implementation to use the ring-buffer instead. Since it is already protected by semaphores, this approach would ensure correctness without needing availability functions.
OK. I do like that. Makes that code much more efficient as well as it
actually does not need the overhead of a full pipe facility
(doesn't want thread suspension).
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great.
Some more details: I'd suggest removing the wait_cond_t
abstraction
and simply deal with the pipe full/empty test at the wait_for()
call
site instead. Doing so will let the compiler inline them which is likely
to produce smaller and faster code compared to passing function pointers.
And with regards to tracing: you have many exit points invoking the same
trace facility. Might be worth consolidating them with something like:
[...]
if (rc == -EAGAIN) {
k_spin_unlock(&pipe->lock, key);
rc = read ? read : -EAGAIN;
goto exit;
} [...]
}
rc = read;
exit:
SYS_PORT_TRACING_OBJ_FUNC_EXIT(k_pipe, read, pipe, rc);
return rc;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good for me at this point, with one nitpick about being 100% sure we audited the synchronization in socketpair.
As I mention earlier there is a issue I don't know how to resolve.
Twister 1 needs a PR to the Twister 2 tests seems to fail on a specific target |
this is using renode, you can install it and run it as a simulator like qemu. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still an issue.
You have:
static inline int wait_for_data(_wait_q_t *waitq, struct k_pipe *pipe, k_spinlock_key_t *key,
k_timeout_t timeout)
{
[...]
} else if (!pipe_empty(pipe)) {
return 0;
}
return -EAGAIN;
}
Let's assume this is called by k_pipe_write()
because the pipe is full.
The thread goes to sleep until a reader has consumed data from the pipe.
Now if the pipe is completely empty you return -EAGAIN
even if there is
plenty of space to write data? This looks like a bug that the test suite
should catch.
I'd suggest moving the last test to k_pipe_read()
since this is the only
caller that cares about such condition and have the final return to be 0.
@@ -77,7 +77,7 @@ config OBJ_CORE_SEM | |||
|
|||
config OBJ_CORE_PIPE | |||
bool "Integrate pipe into object core framework" | |||
default y if PIPES | |||
default y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So pipes are now enabled by default, this might have some footprint implication on those not using the pipe, why does it need to be enabled by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does it need to be enabled by default?
Strictly speaking it deosn't need to be. However, similar data structures are typically enabled automatically when CONFIG_MULTITHREADING
is set. In this case, the pipe implementation and atomic services are the only data structures treated differently.
I assumed enabling it by default would have minimal impact, as the additional pipe structure would only slightly increase build time and should be optimized away during compilation.
Moreover, this approach eliminates the need for an extra Kconfig option to manage mutually exclusive configurations when selecting a pipe implementation during the deprecation process.
Do you prefer an alternative approach to handle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok... I need to revise my previous comment as I didn't realize there are
separate wait_for_data()
and wait_for_space()
now.
The obvious question is... why?
Those two functions are identical except for the last condition. Hence
my earlier suggestion to remove that last condition, return 0 at the end,
and perform that last condition in the caller instead
The `k_pipe_*` API has been reworked to provide a more consistent and intuitive interface. The new API aims to provide a simple to use byte stream interface that is more in line with the POSIX pipe API. The previous API has been deprecated and will be removed in a future release. Signed-off-by: Måns Ansgariusson <[email protected]>
This commit updates the k_pipe documentation to reflect the rewritten interface. Changes include: * Detailed explanations of k_pipe concepts, initialization, and usage. * Examples for defining, writing to, reading from, flushing, and closing a pipe. * Clarified behavior in edge cases and introduced enhanced error-handling details. Updated suggested use cases and API references. Signed-off-by: Måns Ansgariusson <[email protected]>
This commit adds polling support to the newly rewritten k_pipe interface. Changes include: * Removed ifdef CONFIG_POLL from kernel/poll.c to let both implementations coexist. * Added the needed datastructures to the new k_pipe struct. * k_pipe_write(..) now notifies the poll subsystem that data is available. Signed-off-by: Måns Ansgariusson <[email protected]>
This patch adds object_core support for the k_pipe api rework. Signed-off-by: Måns Ansgariusson <[email protected]>
This commit adds new test cases for the pipe API rework. * basic.c: Sanity check for pipe operations. * concurrency.c: Test pipe operations with multiple threads. * stress.c: Test pipe operations under stress conditions. And moves the old pipe test cases to the deprecated folder. Signed-off-by: Måns Ansgariusson <[email protected]>
Add tracing support for the reworked k_pipe API. Signed-off-by: Måns Ansgariusson <[email protected]>
The struct ring_buf is renamed to struct ring_buffer to be able to coexist with the sys/ring_buffer.h header file. Signed-off-by: Måns Ansgariusson <[email protected]>
The struct ring_buf is renamed to struct ring_buffer to be able to coexist with the sys/ring_buffer.h header file. Signed-off-by: Måns Ansgariusson <[email protected]>
The smp_shell module uses k_fifo, but does not include the kernel header file that defines it. This commit adds the missing include. Signed-off-by: Måns Ansgariusson <[email protected]>
Replaced the k_pipe-based implementation in sockpair with ring_buffer based implementation instead. The move to ring_buffer is done to avoid overhead of k_pipe and to align with the new k_pipe API. This does not pose any added risk to concurrency as the read and write functions are protected by semaphores for both spairs. Signed-off-by: Måns Ansgariusson <[email protected]>
Update tests to use the reworked k_pipe API. Signed-off-by: Måns Ansgariusson <[email protected]>
Thank you @peter-mitsis, @npitre, @andyross, and @nashif for your reviews! I think I've made the appropriate changes based on your feedback. The PR that resolves the Twister 1: zephyrproject-rtos/percepio#14 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Driving by to refresh +1. One naming nitpick I just noticed, but doesn't rise to a respin if no one else cares.
* | ||
* @param pipe Address of the pipe. | ||
*/ | ||
__syscall void k_pipe_flush(struct k_pipe *pipe); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Naming things" nitpick: this is more properly a "reset" operation and not a "flush". Normally "flushing" something causes the output to drain, blocking the caller if necessary. No data is expected to be dropped or lost (c.f. stdio fflush()).
The
k_pipe_*
API has been reworked to provide a more consistent andintuitive interface. The new API aims to provide a simple to use byte
stream interface that is more in line with the POSIX pipe API.
The previous API has been deprecated and will be removed in a future
release.
Signed-off-by: Måns Ansgariusson [email protected]