H5py-like API to access BLISS scan data¶
Scan data is saved in NeXus-compliant HDF5 files. When reading these files during acquisition, read failures occur frequently, in which case the file needs to be closed and reopened. To avoid having to deal with this issue, blissdata provides an h5py-like API which can be used to read scan data during and after the experiment without changes in the reader code.
In the future the h5py-like API will also support fetching data from memory (Redis or Lima) when possible.
In the examples below we will use this function to process scan data:
def process_scan_data(nxentry):
    # Detectors and motors from which to read data
    datasets = zip(
        nxentry["instrument/samy/value"],
        nxentry["instrument/diode1/data"],
        nxentry["instrument/eiger1/data"],
    )
    # Loop over all points of the scan
    for y, I0, image in datasets:
        print("samy", y)
        print("iodet", I0)
        print("eiger1", image)
HDF5 files during the experiment¶
from blissdata.h5api import dynamic_hdf5

filename = "/tmp/scans/inhouse/id002211/id00/20221101/sample/sample_0001/sample_0001.h5"
with dynamic_hdf5.File(filename, lima_names=["eiger1"]) as root:
    for scan in root:  # loops indefinitely
        print("\nScan", scan)
        process_scan_data(root[scan])
The lima_names argument is the only deviation from the h5py API and can be omitted when the dataset is closed (i.e. nothing is being written to it anymore). Other non-h5py arguments that can be provided are:

- retry_period: period in seconds between retries of failed HDF5 read operations
- retry_timeout: time in seconds during which failed HDF5 read operations are retried, after which a RetryTimeoutException will be raised
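As an illustration, here is a minimal sketch passing these arguments to dynamic_hdf5.File; the numeric values are arbitrary examples, not recommended defaults:

from blissdata.h5api import dynamic_hdf5

filename = "/tmp/scans/inhouse/id002211/id00/20221101/sample/sample_0001/sample_0001.h5"

# Retry failed HDF5 reads every 0.5 seconds and give up after 60 seconds,
# after which a RetryTimeoutException is raised (values are arbitrary examples).
with dynamic_hdf5.File(
    filename,
    lima_names=["eiger1"],
    retry_period=0.5,
    retry_timeout=60,
) as root:
    for scan in root:
        process_scan_data(root[scan])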
HDF5 files after the experiment¶
When the dataset is closed, no writer will access it anymore. You can use blissdata.h5api.dynamic_hdf5 as shown above, or you can use h5py directly without changing the code (except for the lima_names argument):
import h5py

filename = "/tmp/scans/inhouse/id002211/id00/20221101/sample/sample_0001/sample_0001.h5"
with h5py.File(filename) as root:
    for scan in root:  # loops over all scans in the file
        print("\nScan", scan)
        process_scan_data(root[scan])
Alternatively you can do this (same code, different import):
from blissdata.h5api import static_hdf5

filename = "/tmp/scans/inhouse/id002211/id00/20221101/sample/sample_0001/sample_0001.h5"
with static_hdf5.File(filename) as root:
    for scan in root:  # loops over all scans in the file
        print("\nScan", scan)
        process_scan_data(root[scan])
Static vs. Dynamic¶
The static and dynamic APIs are identical and mimic the read-only part of the h5py API. They mainly provide the classes Group(Mapping) and Dataset(Sequence). The values of a Group are of type Group or Dataset. Note that a Mapping mainly provides item getting and iteration, while a Sequence mainly provides slicing and iteration.
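As a minimal sketch of this behaviour, using the dataset path from the example above:

from blissdata.h5api import static_hdf5

filename = "/tmp/scans/inhouse/id002211/id00/20221101/sample/sample_0001/sample_0001.h5"
with static_hdf5.File(filename) as root:
    for scan in root:                              # Group: iteration over keys
        nxentry = root[scan]                       # Group: item getting (Group or Dataset)
        diode = nxentry["instrument/diode1/data"]  # Dataset
        print(diode[0:5])                          # Sequence: slicing
        for value in diode:                        # Sequence: iteration over data points
            print(value)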
Although the APIs are the same, they behave differently:
|  | Static HDF5 | Dynamic HDF5 |
|---|---|---|
| group[name] | Returns immediately | Blocks until the key is present or the scan is marked as “FINISHED” |
| dataset[idx] | Returns immediately | Blocks until the entire slice is available or the scan is marked as “FINISHED” |
| for name in group | Stops when all names are yielded | Stops when the scan is marked as “PREPARED” |
| for data in dataset | Stops when all data points are yielded | Stops when the scan is marked as “FINISHED” |
The only exception to this is the top-level Group in the dynamic HDF5 API. As no scan is associated with the top-level group, the loop for name in group never exits and group[name] blocks forever until the key is present.
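To illustrate these semantics, here is a minimal sketch reading a scan while it is being acquired; the scan name "1.1" is a hypothetical example and the dataset path is taken from the examples above. The comments indicate where each call may block:

from blissdata.h5api import dynamic_hdf5

filename = "/tmp/scans/inhouse/id002211/id00/20221101/sample/sample_0001/sample_0001.h5"

with dynamic_hdf5.File(filename, lima_names=["eiger1"]) as root:
    # Top-level group: blocks until the key appears in the file
    # ("1.1" is a hypothetical scan name used for illustration).
    nxentry = root["1.1"]

    # Group lookup: blocks until the key is present or the scan is marked as FINISHED.
    diode = nxentry["instrument/diode1/data"]

    # Slicing: blocks until the first 10 points are available or the scan is marked as FINISHED.
    first_points = diode[0:10]

    # Iteration: yields points as they are written, stops when the scan is marked as FINISHED.
    for value in diode:
        print(value)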