\documentclass[preprint]{iucr}
\papertype{CP}
\journalcode{J}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\begin{document}
\title{FabIO: easy access to 2D X-ray detector images in Python}
\shorttitle{FabIO}
\author[a]{Erik B.}{Knudsen}
\author[b]{Henning O.}{S{\o}rensen}
\author[c]{Jonathan P.}{Wright}
\author[c]{Ga\"el}{Goret}
\cauthor[c]{J\'er\^ome}{Kieffer}{jerome.kieffer@esrf.fr}{}
\aff[a]{Department of Physics, Technical University of Denmark,
\city{Kongens Lyngby} \country{Denmark}}
\aff[b]{Nano-Science Center, Department of Chemistry, University of Copenhagen,
Universitetsparken 5, \city{Copenhagen}, \country{Denmark} }
\aff[c]{European Synchrotron Radiation Facility, \city{Grenoble}, \country{France}}
\shortauthor{Knudsen et al.}
\maketitle
\begin{synopsis}
A Python module for reading and handling data from two-dimensional X-ray detectors.
\end{synopsis}
\begin{abstract}
FabIO is a Python module written for easy and transparent reading
of raw two-dimensional data from various X-ray detectors. The module provides a
function for reading any image and returning a fabioimage object which
contains both metadata (header information) and the raw data.
All fabioimage objects offer additional methods to extract
information about the image and to open other detector images from
the same data series.
\end{abstract}
\section{Introduction}
One obstacle when writing software to analyse data collected from a
two-dimensional detector is to read the raw data into the program,
not least because the data can be stored in many different formats
depending on the instrument used.
To overcome this problem we decided to develop a general module,
FabIO (FABle I/O), to handle reading and writing of two-dimensional
data.
The code-base was initiated by merging parts of our fabian imageviewer
\cite{fabian} and ImageD11 \cite{imaged11} peak-search programs and has
been developed since 2007 as part of the TotalCryst \cite{totalcryst}
program suite for analysis of 3DXRD microscopy data \cite{3dxrd}.
During integration into a range of scientific programs, such as the FABLE graphical
interface \cite{fable}, EDNA \cite{edna} and
the fast azimuthal integration library pyFAI \cite{pyfai}, FabIO has gained
several features, including the handling of multi-frame image formats and the
writing of many of the file formats.
We believe FabIO is now ready for a wider audience and could save other
researchers from repeating the work involved in decoding a
binary file format. Table~\ref{format} shows the list of file formats that
FabIO can currently (ver. 0.1.0) read.
\section{FabIO Python module}
Python \cite{python} is a scripting language that is very popular among scientists
and which also allows well structured applications and libraries to be developed.
\subsection{Philosophy}
The intention behind this development was to create a Python module which would
enable easy reading of 2D data images from any detector, without having to
worry about the file format.
FabIO therefore needs only a file name to open a file: it determines the
file format automatically and handles gzip \cite{gzip} and bzip2
\cite{bzip2} compression transparently.
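The transparent handling of compressed files can be illustrated with a minimal
sketch. It assumes that the compression is recognised from the first bytes of
the file; the helper {\em open\_maybe\_compressed} is hypothetical and is not
part of FabIO, whose actual implementation may differ.\\
\begin{verbatim}
import bz2
import gzip

# Leading "magic" bytes that identify each compression container.
GZIP_MAGIC = b"\x1f\x8b"
BZIP2_MAGIC = b"BZh"

def open_maybe_compressed(path):
    """Hypothetical helper: open a file for binary reading,
    decompressing gzip or bzip2 data on the fly."""
    with open(path, "rb") as raw:
        start = raw.read(3)          # enough bytes for both signatures
    if start.startswith(GZIP_MAGIC):
        return gzip.open(path, "rb")
    if start.startswith(BZIP2_MAGIC):
        return bz2.open(path, "rb")
    return open(path, "rb")          # not compressed: plain file object
\end{verbatim}
Because the decision is based on the file content rather than on the file
name, a format reader built on such a helper sees the same byte stream whether
or not the file on disk is compressed.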
Opening a file returns an object which stores the image
{\em data} in memory as a 2D NumPy array \cite{numpy} and the metadata,
called {\em header}, in a Python dictionary. Besides the
{\em data} and {\em header} attributes, some methods are provided for reading
the {\em previous} or {\em next} image in a series of images as well as jumping
to a specific file number.
For the user, these auxiliary methods are intended to be independent of
the image format (as far as is reasonably possible).
FabIO is written in an object-oriented style (with classes) but aims at being
used in a scripting environment: special care has been taken to ensure the
library remains easy to use.
Therefore no knowledge of object-oriented programming is required to get
full benefits of the library.
As the development is done in a collaborative and decentralized way, a
comprehensive test suite has been added to reduce the number of regressions
when new features are added or old problems are repaired.
The software is modular, so new classes for handling
other data formats can be added easily.
FabIO and its source-code are freely available to everyone on-line \cite{fabio},
licensed under the GNU General Public License version 3 (GPLv3). FabIO is also
available directly from popular Linux distributions like Debian and Ubuntu.
\subsection{Implementation}
The main language used in the development of FabIO is Python \cite{python};
however, some image formats are compressed and require
compression algorithms for reading and writing data.
When such algorithms could not be implemented efficiently using Python or NumPy,
native modules were developed, i.e. standard C code callable from Python
(sometimes generated using Cython \cite{cython}).
This code has to be compiled for each computer architecture but offers
excellent performance.
FabIO depends only on the NumPy module and offers extra features if two
optional Python modules are available.
The lxml module \cite{lxml} is required for reading XML files (which are used
in EDNA), and the Python Imaging Library, PIL \cite{pil}, is needed for producing
a PIL image, which can be displayed in graphical user interfaces and used for several
image-processing operations that are not re-implemented in FabIO.
A variety of useful image-processing functions are also available in the scipy.ndimage
module \cite{scipy} and in scikits-image \cite{skimage}.
Images can also be displayed in a convenient interactive manner using
matplotlib \cite{matplotlib} and an IPython shell \cite{ipython}, which is
mainly used for developing data analysis algorithms.
The reading and writing procedures for the various TIFF \cite{tiff} formats are based
on the TiffIO code from PyMCA \cite{pymca}.
In the Python shell, the {\em fabio} module must be imported prior to reading an
image in one of the supported file formats (see Table \ref{format}).
The {\em fabio.open} function creates an instance of the Python class {\em fabioimage},
from the name of a file. This instance, named {\em img} hereafter, stores the
image data in {\em img.data} as a 2D NumPy array. Often the image file contains
more information than just the intensities of the pixels, e.g.
information about how the image is stored and the instrument parameters at the
time of the image acquisition; these metadata are usually stored in
the file header.
Header information is available in {\em img.header} as a Python
dictionary where keys are strings and values are usually strings or
numeric values.
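As an illustration of how a text header can end up in such a dictionary, the
following sketch parses a simplified header made of `key = value ;' lines
enclosed in braces, loosely modelled on the header shown in the EDF example of
this paper. The function {\em parse\_text\_header} is hypothetical and the
layout is an assumption; real formats need more careful handling.\\
\begin{verbatim}
def parse_text_header(block):
    """Hypothetical helper: turn 'key = value ;' lines into a dict.

    block is the raw header as bytes; all values are kept as strings."""
    text = block.decode("ascii", errors="replace")
    header = {}
    for line in text.splitlines():
        line = line.strip().rstrip(";").strip()
        if "=" not in line:
            continue                 # skips the '{' and '}' delimiter lines
        key, value = line.split("=", 1)
        header[key.strip()] = value.strip()
    return header
\end{verbatim}
Values are left as strings in this sketch, so a numeric entry has to be
converted explicitly before use, for instance with {\em float()}.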
Information in the header about the binary part of the image (compression,
endianness, shape) is interpreted; other metadata, however, are exposed as
they are recorded in the file. FabIO allows the user to modify
and, where possible, to save this information (Table \ref{format} summarizes
writable formats).
Automatic translation between file formats, even if desirable, is sometimes
impossible because not all formats have the capability to be extended with
additional metadata.
Nevertheless FabIO is capable of converting one
image data-format into another by taking care of the numerical specifics:
for example float arrays are converted to integer arrays if the output format only
accepts integers.
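A minimal sketch of such a float-to-integer conversion is given below. It
assumes that values are rounded to the nearest integer and clipped to the range
of the target type; the helper {\em to\_integer\_image} and the default choice
of 16-bit unsigned integers are illustrative assumptions, and the rules applied
by FabIO itself may differ.\\
\begin{verbatim}
import numpy as np

def to_integer_image(data, dtype=np.uint16):
    """Hypothetical helper: round a float image and clip it
    to the range representable by an integer dtype."""
    info = np.iinfo(dtype)
    rounded = np.rint(np.nan_to_num(data))   # NaN -> 0, then round
    return np.clip(rounded, info.min, info.max).astype(dtype)
\end{verbatim}
Clipping before the cast avoids the wrap-around that a plain {\em astype} call
would produce for values outside the integer range.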
\subsection{FabIO methods}
One strength of the implementation in an object-oriented language
is the possibility to combine functions (or methods) with the data
appropriate for specific formats.
In addition to the header information and image data, every {\em fabioimage}
instance (returned by {\em fabio.open}) has methods inherited from fabioimage
which provide information about the image minimum, maximum and mean values.
In addition there are methods which return the file number, name etc.
Some of the most important methods are specific for certain formats because
the methods are related to how frames in a sequence are handled; these methods
are {\em img.next()}, {\em img.previous()}, and {\em img.getframe(n)}.
The behaviour of such methods varies depending on the
image format: for single-frame formats (like mar345), {\em img.next()} will
return the image in the next file; for multi-frame formats (like GE), {\em
img.next()} will return the next frame within the same file. For formats which are possibly
multi-framed, like EDF, the behaviour depends on the actual number of frames per
file (accessible via the {\em img.nframes} attribute).
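For single-frame formats, moving through a series amounts to deriving a new
file name from the current one. The sketch below assumes that the file number
is the run of digits at the end of the file name, just before the extension;
the helper {\em shift\_filename} is hypothetical and does not reproduce the
naming rules implemented in FabIO.\\
\begin{verbatim}
import os
import re

def shift_filename(filename, step):
    """Hypothetical helper: change the trailing file number by step,
    keeping the zero padding of the original name."""
    stem, ext = os.path.splitext(filename)
    match = re.search(r"\d+$", stem)
    if match is None:
        raise ValueError("no file number found in %r" % filename)
    digits = match.group()
    number = int(digits) + step
    if number < 0:
        raise ValueError("file number would become negative")
    return stem[:match.start()] + str(number).zfill(len(digits)) + ext
\end{verbatim}
With this convention, {\em shift\_filename('Quartz\_0100.tif', 1)} gives
{\em Quartz\_0101.tif}.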
\section{Installation and usage}
FabIO can, like any Python module, be installed from its sources, available on
SourceForge \cite{fabio}, but we advise using the binary packages provided there
for the most common platforms: Windows, MacOSX and Linux.
Moreover, FabIO is part of the common Linux distributions Ubuntu (since 11.10)
and Debian 7, where the package is named {\em python-fabio} and can be installed
via {\em \# apt-get install python-fabio}.
\subsection{Examples}
In this section we have collected some basic examples of how FabIO can be employed.
Opening an image:\\
\begin{verbatim}
import fabio
im100 = fabio.open('Quartz_0100.tif') # Open image file
print(im100.data[1024,1024]) # Check a pixel value
im101 = im100.next() # Open next image
im270 = im100.getframe(270) # Jump to file number 270: Quartz_0270.tif
\end{verbatim}
Normalising the intensity to a value in the header:\\
\begin{verbatim}
img = fabio.open('exampleimage0001.edf')
print(img.header)
{'ByteOrder': 'LowByteFirst',
'DATE (scan begin)': 'Mon Jun 28 21:22:16 2010',
'ESRFCurrent': '198.099',
...
}
# Normalise to beam current and save data
srcur = float(img.header['ESRFCurrent'])
# Build a new float array: in-place *= can fail on integer data
img.data = img.data * (200.0 / srcur)
img.write('normed_0001.edf')
\end{verbatim}
Interactive viewing with matplotlib:\\
\begin{verbatim}
from matplotlib import pyplot # Load matplotlib
pyplot.imshow(img.data) # Display as an image
pyplot.show() # Show GUI window
\end{verbatim}
\section{Future and perspectives}
The Hierarchical Data Format version 5 \cite{hdf5} is a data format which is
increasingly popular for storage of X-ray and neutron data. To name a few
facilities the synchrotron Soleil \cite{tub05} and the neutron sources
ISIS, SNS and SINQ already use HDF extensively through the NeXus \cite{nexus}
format.
For now, mainly processed or curated data are stored in this format but new detectors are
rumoured to provide native output in HDF5.
FabIO will rely on H5Py \cite{h5py}, which already
provides a good HDF5 binding for Python, as an external dependency, to be able
to read and write such HDF5 files.
In the near future FabIO will be upgraded to work with Python3 (a new version of
Python); this change of version will affect some internals of FabIO, since string and
file handling have been altered.
This change is already ongoing, as many parts of the native code in C have already
been translated into Cython \cite{cython} to smooth the transition, since
Cython generates code compatible with Python3.
This also makes it easier to retain backwards compatibility with the earlier
Python versions.
\section{Conclusion}
FabIO gives an easy way to read and write 2D images when using the
Python programming language.
It was originally developed for X-ray diffraction data but now gives
an easy way for scientists to access and manipulate
their data from a wide range of 2D X-ray detectors.
We welcome contributions to further improve the code and hope to add
more file formats in the future as well as port the existing code base
to the emerging Python3.
\ack{Acknowledgements}
We acknowledge Andy G\"otz and Kenneth Evans for extensive testing when including
the FabIO reader in the Fable ImageViewer \cite{fable}.
We also thank V. Armando Sol\'e for assistance with his TiffIO reader and
Carsten Gundlach for deploying FabIO at the beamlines i711 and i811 at
MAX IV and for providing bug reports.
We finally acknowledge our colleagues who have reported bugs and helped to
improve FabIO.
Financial support was granted by the EU 6th Framework NEST/ADVENTURE project
TotalCryst \cite{totalcryst}.
\bibliographystyle{iucr}
\bibliography{biblio}
%\referencelist[biblio]
\begin{table}
\caption{List of file formats that FabIO can read and write (in
alphabetical order). The listed filename extensions are typical examples.
FabIO tries to deduce the actual format from the file itself and only
uses extensions as a fallback if that fails.}
\label{format}
\vspace{1mm}
\begin{center}
\begin{tabular}{llcccc}
Python Module & Detector / Format & Extension & Read & Multi-image & Write\\
\hline
ADSC & ADSC Quantum & .img & $\surd$& - & $\surd$ \\
Bruker & Bruker formats & .sfrm & $\surd$& - & $\surd$ \\
DM3 & Gatan Digital Micrograph & .dm3 & $\surd$& - & - \\
EDF & ESRF data format & .edf & $\surd$& $\surd$ & $\surd$ \\
EDNA-XML & Used by EDNA \cite{edna} & .xml & $\surd$& - & - \\
CBF & CIF binary files & .cbf & $\surd$& - & $\surd$ \\
kcd & Nonius KappaCCD & .kccd & $\surd$& - & - \\
fit2dmask & Used by Fit2D \cite{fit2d} & .msk & $\surd$& - & $\surd$ \\
fit2dspreadsheet & Used by Fit2D \cite{fit2d} & .spr & $\surd$& - & $\surd$ \\
GE & General Electric & - & $\surd$& $\surd$ & - \\
HiPiC & Hamamatsu CCD & .tif & $\surd$& - & - \\
marccd & MarCCD/Mar165 & .mccd & $\surd$& - & $\surd$ \\
mar345 & Mar345 image plate & .mar3450 & $\surd$& - & $\surd$ \\
OXD & Oxford Diffraction & .img & $\surd$& - & $\surd$ \\
pilatus & Dectris Pilatus Tiff & .tif & $\surd$& - & $\surd$ \\
PNM & Portable aNy Map & .pnm & $\surd$& - & - \\
TIFF & Tagged Image File Format & .tif & $\surd$& - & $\surd$ \\
\end{tabular}
\end{center}
\end{table}
\end{document}