Merge pull request #127 from bmribler/format_release_specific
Fix format
bmribler authored Mar 16, 2024
2 parents 4cda61d + edf07fe commit 077acbb
Showing 4 changed files with 194 additions and 343 deletions.
36 changes: 18 additions & 18 deletions documentation/hdf5-docs/release_specifics/new_features_1_10.md
@@ -1,6 +1,6 @@
---
title: New Features in HDF5 1.10
redirect_from:
redirect_from:
- display/HDF5/New+Features+in+HDF5+Release+1.10
---

@@ -17,7 +17,7 @@ HDF5 1.10 introduces several new features in the HDF5 library. These new feature
~~~
This release includes changes in the HDF5 storage format. For detailed information on the changes, see: Changes to the File Format Specification
PLEASE NOTE that HDF5-1.8 cannot read files created with the new features described below that are marked with \*.
PLEASE NOTE that HDF5-1.8 cannot read files created with the new features described below that are marked with \*.
These changes come into play when one or more of the new features is used or when an application calls for use of the latest storage format (H5P_SET_LIBVER_BOUNDS). See the RFC for more details.
@@ -38,23 +38,22 @@ HDF5 now supports building with the AEC library as a replacement library for SZi
Addition of the Splitter and Mirror VFDs
Two VFDs were added in this release:

The Splitter VFD maintains separate R/W and W/O channels for concurrent file writes to two files using a single HDF5 file handle.
The Splitter VFD maintains separate R/W and W/O channels for "concurrent" file writes to two files using a single HDF5 file handle.
The Mirror VFD uses TCP/IP sockets to perform write-only (W/O) file I/O on a remote machine.
Improvements to Performance
Performance has continued to improve in this release. Please see the images under Compatibility and Performance Issues on the Software Changes from Release to Release page.

Addition of Hyperslab Selection Functions
Several hyperslab selection routines introduced in HDF5-1.12 were ported to 1.10. See the Software Changes from Release to Release page for details.

### New Features Introduced in HDF5 1.10.6
### New Features Introduced in HDF5 1.10.6
The following important new features and changes were introduced in HDF5-1.10.6. For complete details see the Release Notes and the Software Changes from Release to Release page:

#### Improvements to the CMake Support
Several improvements were added to the CMake support, including:

Support was added for VS 2019 on Windows (with CMake 3.15).
Support was added for MinGW using a toolchain file on Linux (C only).
#### Virtual File Drivers - S3 and HDFS
#### Virtual File Drivers - S3 and HDFS
Two Virtual File Drivers (VFDs) have been introduced in 1.10.6:

* The S3 VFD enables access to an HDF5 file via the Amazon Simple Storage Service (Amazon S3).
@@ -76,8 +75,7 @@ The following APIs were introduced to support this feature:
H5F_GET_DSET_NO_ATTRS_HINT

Retrieves the setting for determining whether the specified file does or does not create minimized dataset object headers

H5F_SET_DSET_NO_ATTRS_HINT
H5F_SET_DSET_NO_ATTRS_HINT

Sets the flag to create minimized dataset object headers

@@ -109,18 +107,18 @@ Series 4 is unmodified CGNS develop
Compact is using compact storage
Compact 192 is also using compact storage
Compact 384 is also using compact storage
The last 3 compact curves are just three different batch jobs on 192, 384, and 552 nodes (with 36 core/node). The Series 2 and 3 curves are not related to the CGNS benchmark, but give a qualitative indication on the scaling behavior of MPI_Bcast. Both read-proc0-and-bcast and compact storage follow MPI_Bcasts trend, which makes sense since both methods rely on MPI_Bcast. (See the RFC for better resolution.)
The last 3 "compact" curves are just three different batch jobs on 192, 384, and 552 nodes (with 36 core/node). The Series 2 and 3 curves are not related to the CGNS benchmark, but give a qualitative indication on the scaling behavior of MPI_Bcast. Both read-proc0-and-bcast and compact storage follow MPI_Bcast's trend, which makes sense since both methods rely on MPI_Bcast. (See the RFC for better resolution.)

#### OpenMPI Support
Support for OpenMPI was added. For known problems and issues please see OpenMPI Build Issues. To better support OpenMPI, all MPI-1 API calls were replaced by MPI-2 equivalents.

#### Chunk Query Functions (RFC)
New functions were added to find locations, sizes and filters applied to chunks of a dataset. This functionality is useful for applications that need to read chunks directly from the file, bypassing the HDF5 library.

H5D_GET_CHUNK_INFO Retrieves information about a chunk specified by the chunk index
H5D_GET_CHUNK_INFO_BY_COORD Retrieves information about a chunk specified by its coordinates
H5D_GET_NUM_CHUNKS Retrieves number of chunks that have nonempty intersection with a specified selection
H5D_GET_CHUNK_INFO Retrieves information about a chunk specified by the chunk index
H5D_GET_CHUNK_INFO_BY_COORD Retrieves information about a chunk specified by its coordinates
H5D_GET_NUM_CHUNKS Retrieves number of chunks that have nonempty intersection with a specified selection
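
To make "chunks that have nonempty intersection with a specified selection" concrete, here is a minimal sketch in plain C that counts the chunks touched by a single contiguous block selection. It illustrates the concept only and is not the library's implementation; the function name `chunks_touched` and its parameters are hypothetical, and it assumes a regular chunk grid anchored at the origin.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the chunks that a contiguous block selection
 * touches in a dataset with a regular chunk grid.
 *   rank     - number of dimensions
 *   start[d] - index of the first selected element in dimension d
 *   count[d] - number of selected elements in dimension d
 *   chunk[d] - chunk edge length in dimension d
 * Returns 0 for an empty selection or an invalid (zero) chunk size. */
uint64_t chunks_touched(size_t rank, const uint64_t *start,
                        const uint64_t *count, const uint64_t *chunk)
{
    uint64_t total = 1;

    for (size_t d = 0; d < rank; d++) {
        if (count[d] == 0 || chunk[d] == 0)
            return 0;

        /* Chunk-grid index of the first and last selected element. */
        uint64_t first = start[d] / chunk[d];
        uint64_t last  = (start[d] + count[d] - 1) / chunk[d];

        /* The block covers every chunk between them, inclusive. */
        total *= last - first + 1;
    }
    return total;
}
```

For example, a 10 x 10 block starting at the origin of a dataset with 4 x 4 chunks touches 3 chunks along each dimension, so 9 chunks in total.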

### New Features Introduced in HDF5 1.10.2
Several important features and changes were added to HDF5 1.10.2. See the release announcement and blog for complete details. Following are the major new features:

@@ -130,25 +128,27 @@ In HDF5 1.8.0, the H5P_SET_LIBVER_BOUNDS function was introduced for specifying
#### Performance Optimizations for HDF5 Parallel Applications
Optimizations were introduced to parallel HDF5 for improving the performance of open, close and flush operations at scale.


#### Using Compression with HDF5 Parallel Applications
HDF5 parallel applications can now write data using compression (and other filters such as the Fletcher32 checksum filter).


### New Features Introduced in HDF5 1.10.1

#### Metadata Cache Image ( RFC ) » Fine-tuning the Metadata Cache *
#### Metadata Cache Image ( RFC ) -> Fine-tuning the Metadata Cache *
HDF5 metadata is typically small, and scattered throughout the HDF5 file. This can affect performance, particularly on large HPC systems. The Metadata Cache Image feature can improve performance by writing the metadata cache in a single block on file close, and then populating the cache with the contents of this block on file open, thus avoiding the many small I/O operations that would otherwise be required on file open and close.

#### Metadata Cache Evict on Close » Fine-tuning the Metadata Cache
#### Metadata Cache Evict on Close -> Fine-tuning the Metadata Cache
The HDF5 library's metadata cache is fairly conservative about holding on to HDF5 object metadata (object headers, chunk index structures, etc.), which can cause the cache size to grow, resulting in memory pressure on an application or system. The "evict on close" property will cause all metadata for an object to be evicted from the cache as long as metadata is not referenced from any other open object.

#### Paged Aggregation ( RFC ) » File Space Management *
#### Paged Aggregation ( RFC ) -> File Space Management *
The current HDF5 file space allocation accumulates small pieces of metadata and raw data in aggregator blocks which are not page aligned and vary widely in sizes. The paged aggregation feature was implemented to provide efficient paged access of these small pieces of metadata and raw data.

#### Page Buffering ( RFC )
Small and random I/O accesses on parallel file systems result in poor performance for applications. Page buffering in conjunction with paged aggregation can improve performance by giving an application control of minimizing HDF5 I/O requests to a specific granularity and alignment.





### New Features Introduced in HDF5 1.10.0

@@ -188,4 +188,4 @@ See the HDF5 File Format Specification for complete details on the changes. This
* The Data Layout Message was changed: the name was changed, and version 4 of the data layout message was added for the virtual type.
* Additional types of indexes were added for dataset chunks.

HDF5-1.8 cannot read files created with the new features described on this page that are marked with *.
HDF5-1.8 cannot read files created with the new features described on this page that are marked with *.
Original file line number Diff line number Diff line change
@@ -14,7 +14,7 @@ is performing other tasks.

* [Subfiling VFD](http://docs.hdfgroup.org/hdf5/rfc/RFC_VFD_subfiling_200424.pdf)
The basic idea behind sub-filing is to find the middle ground between
single shared file and one file per process ¿ thereby avoiding some
single shared file and one file per process - thereby avoiding some
of the complexity of one file per process, and minimizing the locking
issues of a single shared file on a parallel file system.

30 changes: 15 additions & 15 deletions documentation/hdf5-docs/release_specifics/sw_changes_1.10.md
@@ -329,9 +329,9 @@ In the C Interface (main library)
The following are new C functions in this release:

H5D_GET_CHUNK_STORAGE_SIZE Returns storage amount allocated within a file for a raw data chunk in a dataset
H5F_GET_EOA Retrieves the file’s EOA
H5F_GET_EOA Retrieves the file's EOA
H5F_INCREMENT_FILESIZE
Sets the file’s EOA to the maximum of (EOA, EOF) + increment
Sets the file's EOA to the maximum of (EOA, EOF) + increment

H5F_SET_LIBVER_BOUNDS Enables the switch of version bounds setting for a file
H5FDdriver_query Queries a VFL driver for its feature flags when a file is not available (not documented in Reference Manual)
@@ -423,9 +423,9 @@ See the Release.txt file for details.
Tools
New options were added to the h5clear utility:

--filesize Print the file’s EOA and EOF
--filesize Print the file's EOA and EOF
--increment=C
Set the file’s EOA to the maximum of (EOA, EOF) + C for the file
Set the file's EOA to the maximum of (EOA, EOF) + C for the file

C is >= 0; C is optional and will default to 1M when not set
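
The arithmetic behind `--increment` (and `H5F_INCREMENT_FILESIZE`) can be sketched in a few lines of C. This is a worked example of the formula "maximum of (EOA, EOF) + C" only, not the tool's source; the name `incremented_eoa` is hypothetical, and reading the "1M" default as 1 MiB (1024 * 1024 bytes) is an assumption.

```c
#include <stdint.h>

/* Assumption: the documented "1M" default means 1 MiB. */
#define DEFAULT_INCREMENT_BYTES (1024u * 1024u)

/* Hypothetical helper: new end-of-allocation address after applying the
 * rule "maximum of (EOA, EOF) + increment". Overflow is not checked. */
uint64_t incremented_eoa(uint64_t eoa, uint64_t eof, uint64_t increment)
{
    uint64_t base = (eoa > eof) ? eoa : eof;
    return base + increment;
}
```

With an EOA of 2048, an EOF of 4096, and an increment of 512, the new EOA is 4608, because the larger of the two addresses is used as the base.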

@@ -579,7 +579,7 @@ hid_t

Changed from a 32-bit to a 64-bit value.

hid_t is the type used for all HDF5 identifiers. This change, which is necessary to accommodate the capacities of modern computing systems, therefore affects all HDF5 applications. If an application has been using HDF5’s hid_t type, recompilation will normally be sufficient to take advantage of HDF5 Release 1.10.0. If an application uses an integer type instead of HDF5’s hid_t type, those identifiers must be changed to a 64-bit type when the application is ported to the 1.10.x series.
hid_t is the type used for all HDF5 identifiers. This change, which is necessary to accommodate the capacities of modern computing systems, therefore affects all HDF5 applications. If an application has been using HDF5's hid_t type, recompilation will normally be sufficient to take advantage of HDF5 Release 1.10.0. If an application uses an integer type instead of HDF5's hid_t type, those identifiers must be changed to a 64-bit type when the application is ported to the 1.10.x series.
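
The porting hazard described above can be illustrated without the library. The sketch below uses a hypothetical stand-in type, `example_id_t`, defined as a 64-bit signed integer (an assumption made for illustration), and checks whether an identifier value would survive being stored in a 32-bit `int`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit identifier type. */
typedef int64_t example_id_t;

/* Returns true if the identifier can be stored in a 32-bit signed integer
 * without losing information. Code that keeps identifiers in plain int
 * variables relies on this always being true, which a 64-bit identifier
 * type no longer guarantees. */
bool fits_in_int32(example_id_t id)
{
    return id >= INT32_MIN && id <= INT32_MAX;
}
```

A small value such as 42 fits, while a value above 2^31 - 1 does not, which is why identifiers held in `int` variables need to move to the library's identifier type or another 64-bit type.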

New Features and Feature Sets
Several new features are introduced in HDF5 Release 1.10.0.
@@ -627,7 +627,7 @@ Retrieves the values of the append property that is set up in the dataset access

H5Pset_append_flush

Sets two actions to perform when the size of a dataset’s dimension being appended reaches a specified boundary.
Sets two actions to perform when the size of a dataset's dimension being appended reaches a specified boundary.

H5Pget_object_flush_cb

@@ -657,19 +657,19 @@ Globally prevents dirty metadata entries from being flushed from the metadata ca

H5Fenable_mdc_flushes

Returns a file’s metadata cache to the standard eviction and flushing algorithm.
Returns a file's metadata cache to the standard eviction and flushing algorithm.

H5Fare_mdc_flushes_disabled



Determines if flushes have been globally disabled for a file’s metadata cache.
Determines if flushes have been globally disabled for a file's metadata cache.

H5Fget_mdc_flush_disabled_obj_ids



Returns a list of all object identifiers for which flushes have been disabled in a file’s metadata cache.
Returns a list of all object identifiers for which flushes have been disabled in a file's metadata cache.


Command-line Tools:
@@ -1100,13 +1100,13 @@ A new version of the function, H5Rdereference2, is introduced.
The compatibility macro H5Rdereference is introduced.

Autotools Configuration and Large File Support
Autotools configuration has been extensively reworked and autotool’s handling of large file support has been overhauled in this release.
Autotools configuration has been extensively reworked and autotool's handling of large file support has been overhauled in this release.

See the following sections in RELEASE.txt:

“Autotools Configuration Has Been Extensively Reworked”
“LFS Changes”
RELEASE.txt is found in the release_docs/ subdirectory at the root level of the HDF5 code distribution.
[Autotools Configuration Has Been Extensively Reworked](Autotools Configuration Has Been Extensively Reworked)
[LFS Changes](LFS Changes)
RELEASE.txt can be found in the release_docs/ subdirectory at the root level of the HDF5 code distribution.

Compatibility Report and Comments
Compatibility report for Release 1.10.0 versus Release 1.8.16
@@ -1115,9 +1115,9 @@ Compatibility report for Release 1.10.0 versus Release 1.8.16

Comments regarding the report

In the C interface, the hid_t change from 32-bit to 64-bit was made in order to address a performance problem that arose when the library “ran out” of valid object identifiers to issue and thus needed to employ an expensive algorithm to find previously issued identifiers that could be re-issued. This problem is avoided by switching the size of the hid_t type to 64-bit integers instead of 32-bit integers in order to make the pool of available integers significantly larger. (H5E_major_t and H5E_minor_t are aliased to hid_t which is why they changed size as well). (An alternate solution to this problem was applied in release HDF5 1.8.5 but this is the cleaner/preferred solution and had to wait until 1.10.0 to be included).
In the C interface, the hid_t change from 32-bit to 64-bit was made in order to address a performance problem that arose when the library "ran out" of valid object identifiers to issue and thus needed to employ an expensive algorithm to find previously issued identifiers that could be re-issued. This problem is avoided by switching the size of the hid_t type to 64-bit integers instead of 32-bit integers in order to make the pool of available integers significantly larger. (H5E_major_t and H5E_minor_t are aliased to hid_t which is why they changed size as well). (An alternate solution to this problem was applied in release HDF5 1.8.5 but this is the cleaner/preferred solution and had to wait until 1.10.0 to be included).

hbool_t will now be defined as a _Bool type when configure determines that it’s available.
hbool_t will now be defined as a _Bool type when configure determines that it's available.

Public structs that have members of type hid_t or hbool_t are affected by the above changes accordingly.
