CHANGES IN VERSION 1.34.0
-------------------------
NEW FEATURES
o Add 'as.vector' argument to h5mread().
SIGNIFICANT USER-VISIBLE CHANGES
o Improvements to coercions from CSC_H5SparseMatrixSeed, H5SparseMatrix,
TENxMatrix, or H5ADMatrix to SparseArray:
- should be significantly more efficient, thanks to various tweaks that
happened in the SparseArray and DelayedArray packages;
- support coercing an object with more than 2^31 nonzero values.
o Coercion from any of the classes above to a sparseMatrix derivative
now fails early if the object to coerce has >= 2^31 nonzero values.
o All *Seed classes in the package now extend the new OutOfMemoryObject
class defined in BiocGenerics (virtual class with no slots).
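For context on the 2^31 threshold mentioned above: R integers are 32-bit, and sparseMatrix derivatives such as dgCMatrix store their row indices and column pointers in integer vectors. A minimal base-R sketch of an early check of this kind follows. check_nzcount() is a hypothetical name used for illustration only, not a function from the package.

```r
# Hypothetical helper (not part of HDF5Array): fail early when the number of
# nonzero values exceeds what a 32-bit R integer can hold.
check_nzcount <- function(nzcount) {
    if (nzcount > .Machine$integer.max)
        stop("object to coerce has >= 2^31 nonzero values")
    invisible(TRUE)
}

.Machine$integer.max == 2^31 - 1   # TRUE: the largest R integer is 2^31 - 1

check_nzcount(2^31 - 1)            # passes silently
msg <- tryCatch(check_nzcount(2^31), error = function(e) conditionMessage(e))
msg
```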
BUG FIXES
o Fix long-standing bug in t() methods for CSC_H5SparseMatrixSeed and
CSR_H5SparseMatrixSeed objects.
o Replace internal calls to rhdf5::H5Fopen(), rhdf5::H5Dopen(), and
rhdf5::H5Gopen(), with calls to new internal helpers .H5Fopen(),
.H5Dopen(), and .H5Gopen(), respectively.
See commit 31a7e06 for more information.
CHANGES IN VERSION 1.32.0
-------------------------
NEW FEATURES
o Some light refactoring of the HDF5 dump management utilities:
- All the settings controlled by the get/setHDF5Dump*() functions are
now formally treated as global options (i.e. they're stored in the
global .Options vector). The benefit is that the settings will always
get passed to the workers in the context of parallel evaluation, even
when using a parallel back-end like BiocParallel::SnowParam.
In other words, all the workers are now guaranteed to use the same
settings as the main R process.
- In addition, getHDF5DumpFile() was further modified to make sure that
it will generate unique "automatic dump files" across workers.
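The options-based approach described above can be sketched in base R as follows. The option name 'example.dump.dir' and all three helpers are hypothetical. The PID-based file name is only one possible way to keep names distinct across worker processes; it is an assumption, not a description of what getHDF5DumpFile() does.

```r
# Hypothetical setter/getter pair backed by R's global options.
set_example_dump_dir <- function(dir) {
    options(example.dump.dir = dir)
    invisible(dir)
}
get_example_dump_dir <- function() {
    getOption("example.dump.dir", default = tempdir())
}

# Hypothetical: embed the process ID so that two processes never
# produce the same file name.
example_dump_file <- function() {
    file.path(get_example_dump_dir(),
              sprintf("auto%d.h5", Sys.getpid()))
}

set_example_dump_dir(tempdir())
example_dump_file()
```

Because the setting is stored with options(), any mechanism that copies the global options to another R process carries the setting along with them.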
SIGNIFICANT USER-VISIBLE CHANGES
o Change 'with.dimnames' default to TRUE (was FALSE) in writeHDF5Array().
BUG FIXES
o Make sure that chunkdim(x) on a TENxRealizationSink,
CSC_H5SparseMatrixSeed, or CSR_H5SparseMatrixSeed object 'x'
**always** returns dimensions that are at most dim(x), even
when 'x' has 0 rows and/or columns.
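One way to guarantee the property described above is to cap each chunk dimension at the corresponding array dimension. A base-R sketch, where clamp_chunkdim() is a hypothetical helper and not the package's implementation:

```r
# Hypothetical helper: cap each chunk dimension at the array dimension.
clamp_chunkdim <- function(chunkdim, dim) {
    stopifnot(length(chunkdim) == length(dim))
    pmin(as.integer(chunkdim), as.integer(dim))
}

# A one-column chunk of a 5 x 0 matrix must not claim 1 column.
clamp_chunkdim(c(5L, 1L), dim = c(5L, 0L))
```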
CHANGES IN VERSION 1.30.0
-------------------------
NEW FEATURES
o Add 'dim' and 'sparse.layout' args to H5SparseMatrixSeed().
SIGNIFICANT USER-VISIBLE CHANGES
o HDF5Array now imports S4Arrays.
CHANGES IN VERSION 1.28.0
-------------------------
- No changes in this version.
CHANGES IN VERSION 1.26.0
-------------------------
SIGNIFICANT USER-VISIBLE CHANGES
o Try harder to find and load the matrix rownames of a 10x Genomics dataset.
See commit abafbb9e99ad54a64e5013305486b97daa9442bc.
BUG FIXES
o Handle HDF5 sparse matrices where shape is not an integer vector.
When the shape returned by internal helper .read_h5sparse_dim() is a
double vector it is now coerced to an integer vector. Integer overflows
resulting from this coercion trigger an error with an informative error
message.
See GitHub issue #48.
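The coercion described in the fix above can be sketched in base R as follows. coerce_shape() is a hypothetical stand-in for illustration. It is not the internal helper itself, and the error message is made up.

```r
# Hypothetical helper: return 'shape' as an integer vector, or fail with an
# informative error if a value does not fit in a 32-bit R integer.
coerce_shape <- function(shape) {
    if (is.integer(shape))
        return(shape)
    stopifnot(is.numeric(shape))
    if (anyNA(shape) || any(abs(shape) > .Machine$integer.max))
        stop("dimensions in 'shape' are too big to be stored as integers")
    as.integer(shape)
}

shape <- coerce_shape(c(2000, 300))   # double vector in, integer vector out
is.integer(shape)
```

Checking the range before calling as.integer() matters because as.integer() on an out-of-range double returns NA with a warning rather than raising an error.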
CHANGES IN VERSION 1.24.0
-------------------------
SIGNIFICANT USER-VISIBLE CHANGES
o Improve error reporting in internal helper .h5openlocalfile().
BUG FIXES
o Make sure updateObject() handles very old HDF5ArraySeed instances.
CHANGES IN VERSION 1.22.0
-------------------------
- No changes in this version.
CHANGES IN VERSION 1.20.0
-------------------------
NEW FEATURES
o Implement the H5SparseMatrix class and H5SparseMatrix() constructor
function. H5SparseMatrix is a DelayedMatrix subclass for representing
and operating on an HDF5 sparse matrix stored in CSR/CSC/Yale format.
o Implement the H5ADMatrix class and H5ADMatrix() constructor function.
H5ADMatrix is a DelayedMatrix subclass for representing and operating
on the central matrix of an 'h5ad' file, or any matrix in its '/layers'
group.
o Implement H5File objects. The H5File class provides a formal
representation of an HDF5 file (local or remote, including a file
stored in an Amazon S3 bucket).
o HDF5Array objects now work with files on Amazon S3 (via use of H5File()).
BUG FIXES
o Remove "global counter" files at unload time (commit f7913043).
CHANGES IN VERSION 1.18.0
-------------------------
NEW FEATURES
o Add 'as.sparse' argument to h5mread(), HDF5Array(), HDF5ArraySeed(),
writeHDF5Array(), saveHDF5SummarizedExperiment(), and
HDF5RealizationSink().
Even though it won't change how the data is stored in the HDF5 file
(data will still be stored the usual dense way), the 'as.sparse'
argument allows the user to control whether the HDF5 dataset should
be considered sparse (and treated as such) or not. More precisely,
when HDF5Array() is called with 'as.sparse=TRUE', the returned object
will be considered sparse i.e. blocks in the object will be loaded as
sparse objects during block processing. This should lead to less
memory usage and hopefully overall better performance.
o Add is_sparse() setter for HDF5Array and HDF5ArraySeed objects.
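A short usage sketch of the 'as.sparse' argument and the is_sparse() setter described above. It assumes the HDF5Array package is installed and does nothing otherwise. The matrix and the dataset name "m" are made up for illustration.

```r
if (requireNamespace("HDF5Array", quietly = TRUE)) {
    library(HDF5Array)

    # A mostly-zero matrix, written to a temporary HDF5 file the usual dense way.
    m <- matrix(0L, nrow = 100, ncol = 50)
    m[cbind(1:10, 1:10)] <- 1:10
    path <- tempfile(fileext = ".h5")
    writeHDF5Array(m, path, name = "m")

    A <- HDF5Array(path, "m")                    # treated as dense
    B <- HDF5Array(path, "m", as.sparse = TRUE)  # same data, treated as sparse
    is_sparse(A)
    is_sparse(B)

    # The setter flips the flag on an existing object.
    is_sparse(A) <- TRUE
    is_sparse(A)
}
```

Both objects point to the same on-disk dataset; only the way blocks are loaded during block processing differs.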
SIGNIFICANT USER-VISIBLE CHANGES
o Change default value of 'verbose' argument from FALSE to NA for
writeHDF5Array(), saveHDF5SummarizedExperiment(), and writeTENxMatrix().
BUG FIXES
o Fix handling of logical NAs in h5mread().
o Fix bug in saveHDF5SummarizedExperiment() when 'chunkdim' is specified.
CHANGES IN VERSION 1.16.0
-------------------------
NEW FEATURES
o New h5writeDimnames()/h5readDimnames() functions for writing/reading
the dimnames of an HDF5 dataset to/from the HDF5 file.
See ?h5writeDimnames for more information.
o Add full support for HDF5Array objects of type "raw":
- writeHDF5Array() now works on a DelayedArray object of type "raw" (it
creates an H5 dataset of type H5T_STD_U8LE).
- The HDF5Array() constructor should now return an HDF5Array object of
type "raw" when pointed to an H5 dataset with an 8-bit width type (e.g.
H5T_STD_U8LE, H5T_STD_U8BE, H5T_STD_I8LE, H5T_STD_I8BE, H5T_STD_B8LE,
H5T_STD_B8BE).
o Add 'H5type' argument to writeHDF5Array().
o h5mread() now supports contiguous (i.e. unchunked) string data.
SIGNIFICANT USER-VISIBLE CHANGES
o HDF5Array objects now find their dimnames in the HDF5 file.
writeHDF5Array() and as(x, "HDF5Array") know how to write the dimnames
to the HDF5 file, and the HDF5Array() constructor knows how to find
them. See ?writeHDF5Array for more information.
BUG FIXES
o Fix bug causing character data to be truncated when written to HDF5 file.
o Fix h5mread() inefficiency when the user selection covers full chunks.
o h5mread() now handles character NAs consistently with rhdf5::h5read().
o Fix writeHDF5Array() error on character array filled with NAs.
CHANGES IN VERSION 1.14.0
-------------------------
NEW FEATURES
o Add coercions from TENxMatrix (or TENxMatrixSeed) to dgCMatrix.
SIGNIFICANT USER-VISIBLE CHANGES
o h5mread() argument 'starts' now defaults to NULL.
BUG FIXES
o h5mread() now supports datasets with contiguous layout (i.e. not chunked).
CHANGES IN VERSION 1.12.0
-------------------------
NEW FEATURES
o Add 'prefix' arg to save/loadHDF5SummarizedExperiment().
o Add quickResaveHDF5SummarizedExperiment() for fast re-saving after
initial saveHDF5SummarizedExperiment().
See ?quickResaveHDF5SummarizedExperiment for more information.
o Add h5mread() as a faster alternative to rhdf5::h5read(). It is now
the workhorse behind the extract_array() method for HDF5ArraySeed
objects. This change should significantly speed up block processing
of HDF5ArraySeed-based DelayedArray objects (including HDF5Array
objects).
CHANGES IN VERSION 1.10.0
-------------------------
NEW FEATURES
o Implement the TENxMatrix container (DelayedArray backend for the
HDF5-based sparse matrix representation used by 10x Genomics).
Also add writeTENxMatrix() and coercion to TENxMatrix.
SIGNIFICANT USER-VISIBLE CHANGES
o By default, automatic HDF5 datasets (e.g. the dataset that gets written
to disk when calling 'as(x, "HDF5Array")') are now created with chunks
of 1 million array elements (previous default was 1/75 of
'getAutoBlockLength(x)'). This can be controlled with new low-level
utilities get/setHDF5DumpChunkLength().
o By default, automatic HDF5 datasets are now created with chunks of
shape "scale" instead of "first-dim-grows-first". This can be
controlled with new low-level utilities get/setHDF5DumpChunkShape().
o getHDF5DumpChunkDim() loses the 'type' and 'ratio' arguments (only 'dim'
is left).
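To illustrate the chunk length and chunk shape defaults described in this section, here is a base-R sketch. It assumes that shape "scale" means a chunk with roughly the same proportions as the array. scale_chunkdim() is a hypothetical helper, not the package's algorithm, and it does not handle very elongated arrays, where rounding a dimension up to 1 can push the chunk past the target length.

```r
# Hypothetical helper: shrink all dimensions by the same ratio so that the
# chunk holds about 'chunk_len' elements.
scale_chunkdim <- function(dim, chunk_len = 1e6) {
    if (prod(dim) <= chunk_len)
        return(as.integer(dim))        # the whole array fits in one chunk
    ratio <- (chunk_len / prod(dim))^(1 / length(dim))
    as.integer(pmax(1, floor(dim * ratio)))
}

cd <- scale_chunkdim(c(10000, 5000))
cd          # roughly the same 2:1 aspect ratio as the array
prod(cd)    # no more than 1e6 for this input
```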