drivers: ramdisk: Add support for embedded disk images as backing store of RAM disks #82603
base: main
Conversation
Hello @hIAhmad, and thank you very much for your first pull request to the Zephyr project!
drivers/disk/ramdisk.c
Outdated
@@ -142,6 +142,23 @@ static const struct disk_operations ram_disk_ops = {
		.buf = UINT_TO_POINTER(DT_REG_ADDR(DT_INST_PHANDLE(n, ram_region))), \
	}

#define RAMDISK_DEVICE_CONFIG_DEFINE_DISKIMG(n) \
	__asm__( \
IMHO, this should be done with a modification to the linker scripts, and the linker should take care of this.
@de-nordic Linker scripts are, in general, even less portable and more cumbersome from a usability perspective than inline assembly, unless combined with sufficient external tooling. Zephyr does have a rich framework for leveraging linker scripts, so it can indeed be used to some extent. I was aware of it but hadn't explored it yet, so given the general concerns with linker scripts, I had gone with an entirely inline-assembler-based approach. Your comment got me thinking about it, so I have reworked the changeset to take advantage of Zephyr's linker script framework.
However, .incbin is still used minimally. Technically, we could get rid of it too, but the alternative on the linker script side - the TARGET/INPUT directives of GNU ld - would be more GNU-specific than .incbin and the other GNU linker syntax already used across the Zephyr source tree. It would also require adding extensions to the existing Zephyr linker script machinery, primarily because TARGET/INPUT directives must precede the site of embedding and must be placed outside the SECTIONS block; AFAIK, the existing machinery has no such provision. It can of course be added, but given that this is a rarely used feature of GNU ld, I am hesitant to rely on it. Let me know how you see it.
OK, so I was not clear enough: what I meant is that we have macros like ITERABLE_SECTION_RAM that allow you to put objects into sections defined with the help of the linker.
What I was trying to say is that maybe we need a macro - don't stick to the naming, it is just an example - like BINARY_BLOB_SECTION(section, name, path) that can be used to just inject a binary into a section, instead of doing the asm directly here.
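A minimal sketch of what such a helper could look like, assuming a GNU-compatible assembler and Zephyr's STRINGIFY() macro (the name BINARY_BLOB_SECTION and its exact shape are illustrative only, not an existing Zephyr API):

```c
#include <stdint.h>
#include <zephyr/toolchain.h>

/* Illustrative only: embed the file at `path` (a string literal) into
 * `section` (also a string literal, e.g. ".data") and expose it as the
 * C symbol `name`. Relies on GNU-style .section/.balign/.incbin syntax.
 */
#define BINARY_BLOB_SECTION(section, name, path)                \
	__asm__(".section " section "\n"                        \
		".balign 8\n"                                   \
		".global " STRINGIFY(name) "\n"                 \
		STRINGIFY(name) ":\n"                           \
		".incbin \"" path "\"\n"                        \
		".previous\n");                                 \
	extern uint8_t name[] __asm__(STRINGIFY(name))

/* Example use: BINARY_BLOB_SECTION(".data", disk_img, "ramdisk.img"); */
```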
yeah this code needs to go out of here, it has no place in the .c file and completely ignores that we have linker generator AND cmake generator for these, so this would completely break usage for anyone using the cmake option
@nordicjm This implementation has already been updated. But I am not sure what you mean by:
... so this would completely break usage for anyone using the cmake option
I guess you are referring to building using CMake directly, instead of building via west? i.e. something like:
cmake -Bbuild -GNinja -DBOARD=reel_board samples/hello_world
ninja -Cbuild
instead of:
west build -b reel_board samples/hello_world
Or are you referring to something else?
If it is just building via cmake instead of west, then I don't see any problem, at least in my testing - it builds and runs fine.
Unrelated to that, you can use LD templates (the default) or CMake generators; build with -DCONFIG_CMAKE_LINKER_GENERATOR=y, which uses CMake files, e.g. https://github.com/zephyrproject-rtos/zephyr/blob/main/cmake/linker_script/common/common-rom.cmake
@nordicjm Thanks for the pointer - I hadn't come across CMake linker generators yet in my exploration of Zephyr internals. I have tested with this now, and again, the approach in this changeset did work in both cases - i.e. with LD templates or with the CMake linker generator. But on a closer look, I see no problem only because I am still using GCC toolchains/GNU ld in both cases.
One problem I have noted is the direct use of .section/.global/.type directives. Zephyr's toolchain/linker abstraction would break here if I used some other toolchain that does not support this syntax and/or which (by default) does not recognize the section name used.
I have now updated the changeset to use the wrappers Zephyr toolchain abstraction provides for this purpose. To do so, I had to do away with use of inline assembly, however, because the relevant wrapper macros provided by Zephyr toolchain abstraction are not usable in inline assembler blocks. So now, equivalent assembly code is generated in a separate complementary assembly file that would be automatically included in the build, as needed (when GNU extensions are available).
Please have another look and share your thoughts.
@hIAhmad I've approved the devicetree. The quotation marks that needed to be added and the blank lines at the end of the file have been completed. I apologize for the misunderstanding :( The first PR went well, which is a good start.
defined for each disk instance. Use it with caution as it makes memory
contents easily accessible.

disk-image:
zephyr,disk-image-path
this is probably one of the worst offenses of DT mis-usage I've ever seen but, as usual, there is no alternative for multi-instance configuration, so I request you change the property name to this (the zephyr, prefix is used to track technical debt, i.e. project-specific software-configuration DT props, and -path is to make it clearer from just the name what it is).
@decsny I have made the requested updates.
you need to update the original commits to add these changes, not add more with those changes
@nordicjm I did make that specific update in the original commit. However, subsequent overall updates have been in separate commits. I was a bit unsure about replacing the initial/previous commit in these updates, as the contribution guidelines also mention multiple commits in a single PR. Anyway, I have now combined (squashed) the commits into a single commit that reflects the latest state of the proposed enhancement.
drivers/disk/ramdisk.c
Outdated
#define RAMDISK_DEVICE_CONFIG_DEFINE_DISKIMG(n) \
	__asm__( \
		" .section .data\n" \
		" .balign 8\n" \
		" .global disk_buf_" STRINGIFY(n) "\n" \
		"disk_buf_" STRINGIFY(n) ":\n" \
		" .incbin \"" DT_INST_PROP(n, disk_image) "\"\n"); \
	\
	extern uint8_t disk_buf_##n[] __asm__("disk_buf_" STRINGIFY(n)); \
	\
	static struct ram_disk_config disk_config_##n = { \
		.sector_size = DT_INST_PROP(n, sector_size), \
		.sector_count = DT_INST_PROP(n, sector_count), \
		.size = RAMDISK_DEVICE_SIZE(n), \
		.buf = disk_buf_##n, \
	}
Note that this implementation uses .incbin directive which is technically GNU-specific but is widely supported by other mainstream toolchains as well (for compatibility with GCC). C23 #embed directive can be a more portable option but toolchains supporting it are just coming out
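For illustration only, a hypothetical C23-based equivalent of the embedding step could look like the following (the array name and image path are placeholders, and toolchain support for #embed is still emerging):

```c
/* Hypothetical sketch: embed the disk image with C23 #embed instead of
 * .incbin. Requires a toolchain that implements #embed; the symbol name
 * and file name below are placeholders, not part of the actual patch. */
static unsigned char disk_buf_0[] = {
	#embed "ramdisk.img"
};
```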
I see no reason to use something GNU-specific here or to justify it with something that could happen in the distant future. There is also no need to introduce this linker script upstream; please define a memory region in your application and use the ram-region phandle, e.g. ram-region = <&my_fancy_binary_image>.
I already considered that but ram-region property does not solve the problem cleanly:
- Defining a ram-region is an intrusive change to a platform device-tree. You'll have to take memory out of your other existing memory regions, which makes it highly platform (board) specific. Implementation will also be comparatively ugly as you'd need board-specific device-tree overlays to achieve it.
- If disk image size changes during development, you may need to update the device-tree modification that was made to define the custom ram-region. Or you'll need to give it some generous size to account for changes in disk size (a liberty you won't have if RAM is not aplenty).
- You'll also need an application specific linker script which may end up being more GNU-specific than the use of .incbin in this changeset.
- And then you may have to separately map the region to make it accessible (HW dependency).
(Let me know if I misunderstood something and there is a simpler/cleaner way to use the ram-region property for the use case being addressed here.)
With the proposed changeset, all of this gets handled in a transparent manner. All you need is a (generic) device-tree overlay, and you are set. The embedded disk image will get placed in the data region and will automatically get adjusted even if its size changes - nothing extra needs to be done. And when you don't need to embed the disk image anymore, simply remove the overlay (or remove the disk-image-path property, if still using that RAM disk instance) and rebuild. For example:
ramdisk0 {
	compatible = "zephyr,ram-disk";
	disk-name = "RAM";
	sector-size = <512>;
	sector-count = <0>;
	zephyr,disk-image-path = "../ramdisk.img";
};
Note: sector-count will be automatically calculated from sector-size and the size of the disk image being embedded. (In fact, had the sector-count property been optional, there would have been no need to specify it at all for RAM disk instances using disk images as the backing store.)
IMHO, the present enhancement is orthogonal to the ram-region property (which may be more suitable for the use cases it was originally introduced for, not the one addressed here), so it would be a nice addition to the out-of-the-box features provided by Zephyr.
Regarding portability concerns, .incbin is already fairly portable. The feature itself is widely available, only the syntax may vary (if you move away from major toolchains). Still if this is a major concern, we can consider using a portable wrapper incbin.h (https://github.com/graphitemaster/incbin). IMHO though, given the toolchains that would be typically used with Zephyr, that would be a bit overkill.
@jfischer-no Please have another look and share your thoughts.
* Defining a ram-region is an intrusive change to a platform device-tree. You'll have to take memory out of your other existing memory regions, which makes it highly platform (board) specific. Implementation will also be comparatively ugly as you'd need board-specific device-tree overlays to achieve it.
No, it is not an intrusive change and there is no problem with being platform-specific as it would be done in your application.
* If disk image size changes during development, you may need to update the device-tree modification that was made to define the custom ram-region. Or you'll need to give it some generous size to account for changes in disk size (a liberty you won't have if RAM is not aplenty).
Sounds like a feature. Not a problem once your development is done.
* You'll also need an application specific linker script which may end up being more GNU-specific than the use of .incbin in this changeset.
But then it is your application problem, no one in the upstream tree has to deal with it.
* And then you may have to separately map the region to make it accessible (HW dependency).
Maybe, but I do not see any difference from your suggestion.
Regarding portability concerns, .incbin is already fairly portable. The feature itself is widely available, only the syntax may vary (if you move away from major toolchains). Still if this is a major concern, we can consider using a portable wrapper incbin.h (https://github.com/graphitemaster/incbin). IMHO though, given the toolchains that would be typically used with Zephyr, that would be a bit overkill.
My -1 stays. #82603 (comment)
if not used, and 'disk-image' property is also not used, a local buffer is
defined for each disk instance. Use it with caution as it makes memory
contents easily accessible.
Stray changes.
Not a stray change - if the new property gets accepted, the existing description of the 'ram-region' property should be updated because of its reference to the local buffer fallback.
The name of the property, however, has been updated since the original commit to zephyr,disk-image-path, so I'll update it here, too.
Or alternatively, the description can be updated to just describe the ram-region property itself, in isolation, with no reference to the local buffer fallback. It will be slightly less helpful to the user, though, e.g. change it to:
ram-region:
  type: phandle
  description: |
    Optional phandle to the memory region to be used as a RAM disk.
    Use it with caution as it makes memory contents easily accessible.
@decsny I just pushed a small update. My apologies for the inconvenience. Kindly have another look.
Thanks for the extra pieces of information. If performance is a concern when embedding larger files, then we should consider using a C string with hex escapes instead of byte arrays.
Adjusting file2hex.py to support C-string hex output in addition to hex arrays will give us a solution that both considers performance and can be used across toolchains. This will further allow us to keep the existing Zephyr CMake function, perhaps with an extra flag, and thus still have a single function to call.
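To illustrate the two output formats being discussed (hypothetical generated snippets; the actual file2hex.py output may differ in naming and layout):

```c
/* Hex-array form: one initializer per byte. Compilers tend to parse this
 * slowly once the input grows to megabytes. */
const unsigned char disk_img[] = {
	0xeb, 0x3c, 0x90, 0x4d, 0x53, 0x44, 0x4f, 0x53, /* ... */
};

/* C-string form: the same bytes written as a string literal with hex
 * escapes, which most compilers handle noticeably faster. Note that the
 * compiler appends a trailing NUL, so the array is one byte longer than
 * the embedded file. */
const unsigned char disk_img_str[] =
	"\xeb\x3c\x90\x4d\x53\x44\x4f\x53" /* ... */;
```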
So far a small observation in the code.
The main concern is the introduction of another GNU-specific solution.
We are working hard on better toolchain support, and if performance of binary file inclusion is a big concern, then I would much prefer to find a solution that performs well across toolchains, as well as a reusable Zephyr solution that can be used in other parts of the build system; see #82603 (comment).
drivers/disk/CMakeLists.txt
Outdated
if(${is_diskimg_path_absolute})
  list(APPEND ramdisk_diskimg_paths ${dt_diskimage_path})
else()
  # Relative paths are assumed to be relative to Zephyr build directory.
Please provide some more details / reasoning of this choice.
What is the rationale that a relative path in this particular case should be relative to ZEPHYR_BINARY_DIR and not to APPLICATION_SOURCE_DIR, which we use in other places?
afaict the disk image path may be a pre-built file already existing in the file system (for example a bin blob), and such files can be hard to define in dt as an absolute file because it would put restrictions on where to place the workspace.
Also, too much variation in what a relative path is treated as relative to makes it harder to understand the build system, so we need to be careful and have good reasons for decisions.
Note, this comment is atm not a rejection of the proposal.
What is the rationale that a relative path in this particular case should be relative to ZEPHYR_BINARY_DIR and not to APPLICATION_SOURCE_DIR, which we use in other places?
Being new to the Zephyr build system, I wasn't aware of the APPLICATION_SOURCE_DIR build variable. This is actually closer to what I wanted. Thanks for pointing it out. Will update accordingly.
afaict the disk image path may be a pre-built file already existing in the file system (for example a bin blob), and such files can be hard to define in dt as an absolute file because it would put restrictions on where to place the workspace.
The prebuilt disk image may be created by some tool/SDK that lives outside of the Zephyr source tree and build system, so having a provision for specifying an absolute path would allow more flexibility where needed. If the tool allows the output directory for the disk image to be specified arbitrarily, a location inside the Zephyr tree can easily be chosen, but this may not always be the case. In the latter case, the user will have to make sure to copy the image into the application directory (or at least somewhere inside the Zephyr source tree) every time it gets updated.
For in-tree usage of this feature, such as the sample/test I am going to add, a relative path will be the proper choice. The provision for an absolute path is primarily intended for use cases involving external tools/workflows.
@tejlmand Thoughts?
@tejlmand I have updated the build script to use the APPLICATION_SOURCE_DIR build variable instead.
Interesting find - I agree here; if we can create a performant solution that does not rely on GNU extensions, we should do that rather than having a "fallback method".
Question, so do not alter the PR yet: I know that you can create an object file from any file using the ld command; that command will give you an object file with one section named after the linked binary file. You still have to use objcopy (at least I think ld does not have options here) to change the section name and properties (like flags), but afterwards you can access the binary contents of the linked file via extern names. I have made an experiment, just to remind myself of part of the process:
Note that I did not do the objcopy, but you can already see that the binary file can be accessed as an array between the symbols binary_wu_txt_start and binary_wu_txt_end. (BTW, who remembers using binobj in Turbo Pascal? :) )
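A sketch of how the contents of such an object file could then be consumed from C (the symbol names follow GNU ld's usual `_binary_<mangled-filename>_start`/`_end` convention and are assumed here, as is the `wu.txt` input file from the experiment above):

```c
#include <stddef.h>
#include <stdint.h>

/* Symbols typically emitted by something like `ld -r -b binary -o wu.o wu.txt`;
 * the exact names depend on the input file name and the toolchain. */
extern const uint8_t _binary_wu_txt_start[];
extern const uint8_t _binary_wu_txt_end[];

/* The embedded file is simply the byte range between the two symbols. */
static inline size_t wu_txt_size(void)
{
	return (size_t)(_binary_wu_txt_end - _binary_wu_txt_start);
}
```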
@de-nordic Direct conversion of a binary to object format via ld could indeed be a (relatively more generic) option, but I didn't use it as it is also technically GNU-specific (though not exactly GCC-specific) - it would result in a hard dependency on GNU binutils. binutils would then be required even if some non-GCC toolchain is being used. If this dependency is fine/acceptable in the Zephyr build system, then this can also be a viable option (it would be on par with .incbin in terms of build-time efficiency). (Note that, strictly speaking, a fallback may still theoretically be required even in this case: if binutils does not support some architecture/ABI, an equivalent approach or a custom "bin2obj" kind of tool would be needed. The practical likelihood is pretty low, though.) CC: @tejlmand
This all depends on the gains vs. the costs. We should strive to avoid overly GNU-centric solutions, especially as we have support for non-GNU toolchains, like the ARM Compiler, and support for other non-GNU toolchains, such as IAR, is being worked on. Therefore the build-time gain must be significant enough to justify a GNU-centric solution. I have tested with two sizes of bin files, 1 MB and 5 MB.
Comparing those numbers with a build of the bluetooth peripheral_hids sample for the nrf52840dk, which takes:
Evaluating the different solutions, and taking into account that a common solution regardless of the toolchain has a high value in itself, then based on those numbers, and compared to the overall time of a build, a hex array does not perform well. The .incbin and C-string approaches are even at the same level for clang for a 1 MB file. @hIAhmad Feel free to provide more info if there are use cases / file sizes I have not considered. I did not test the ld / objcopy proposal, but I assume such a solution would perform between the .incbin and C-string solutions.
Please provide a test case or sample for embedding binary data which is loaded into RAM.
This will make it easier for users to test the solution and also provide an example of how to use the feature.
I wasn't sure if a new test case/sample would be accepted, so I didn't include one. I will consider adding one soon.
When creating new features, test cases are always appreciated.
@tejlmand I have added a sample application as per your suggestion. It is a bit different from other existing samples in that it creates the test disk image on the fly as part of the application build, as I don't think adding a binary disk image to the Zephyr source tree just for a demo would be a good idea, especially as it wouldn't be configurable either. The sample is disabled for twister runs because it requires the Python pyfatfs module (to create FAT disk images in a portable manner), which is not part of the standard build dependencies of Zephyr. If this module is installed (as mentioned in the README), twister runs will also work fine (after enabling the test in sample.yaml). Kindly have a look and let me know if it is good to go.
Add an optional devicetree binding, and corresponding support in the RAM disk driver, to allow using a disk image - embedded as a binary blob in the application image - directly as the backing store of a RAM disk instance. This significantly improves usability of deploying a RAM disk in scenarios where it has to be populated with some initial content provided at build time, especially where that content is provided as a disk image created during the overall system-level build steps.

Moreover, it will help ease migration for users moving to Zephyr from some legacy RTOS (such as Nucleus) that supported creating RAM disks from existing disk images, directly using the disk image as a RAM disk afterwards. Zephyr's existing RAM disk implementation can also support this usecase with this small update.

To embed binary data, .incbin directive is used or, as a generic fallback, binary data is converted into hex character form and embedded as array data. CMake build system is used to select the appropriate approach in a transparent manner. When GNU extensions are not available, the generic fallback method is used, embedding binary data as array data. When GNU extensions are available for use, .incbin directive is used, as it is much more efficient, with respect to build time (especially as size of the binary data to be embedded increases).

Signed-off-by: Irfan Ahmad <[email protected]>
Add a new sample application to demonstrate creating and using RAM disks using embedded disk images as the backing store.

Signed-off-by: Irfan Ahmad <[email protected]>
disk_size = int(sys.argv[2]) * 1024

# Allocate the disk space for the disk image file.
with open(disk_image_path, 'wb') as f:
More of a suggestion:
It is always good to check for exceptions (except IOError as e:) around file access.
(In this case, if the disk space allocation doesn't happen because of a file open failure, we will not get the required disk size.)