Data policy

The data policy is enabled and configured by adding a dedicated scan_saving section to the BLISS configuration, either in the __init__.yml file at the beamline configuration root:

scan_saving:
    class: MyScanSaving
    ... # policy dependent configuration

or together with a session configuration, which is particularly useful when the same Beacon configuration is used by multiple endstations:

- class: Session
  name: my_session
  scan_saving:
    class: MyScanSaving
    ... # policy dependent configuration

BLISS currently provides BasicScanSaving and ESRFScanSaving. Adding new data policies is described here.

Basic data policy

The Basic data policy does not require any configuration. You can (but don’t have to) specify the data policy in the scan saving configuration.

scan_saving:
    class: BasicScanSaving

ESRF data policy

A minimal configuration requires enabling ESRFScanSaving in the scan saving configuration and a configuration of the communication with external ESRF data policy services (commonly referred to as ICAT).

In addition, the data directories can be configured, as well as the dataset metadata sent to one of the data policy services.

Enable policy

The minimal scan saving configuration for the ESRF data policy:

scan_saving:
    class: ESRFScanSaving
    beamline: id00

Configure services

In order for BLISS to communicate with the ESRF data policy services, add the following configuration to the __init__.yml file located at the beamline configuration root:

icat_servers:
    metadata_urls: [URL1, URL2]
    elogbook_url: URL3
    elogbook_token: elogbook-00000000-0000-0000-0000-000000000000
    elogbook_timeout: 0.1  # optional
    feedback_timeout: 0.1  # optional
    queue_timeout: 1  # optional
    disable: False # optional

When disable is True, all e-logbook messages are lost, but dataset metadata are kept in Redis until the services are enabled again or until you switch to a different proposal.

The different timeouts are optional:

  • elogbook_timeout: time to wait for elogbook message confirmation
  • feedback_timeout: time to wait for retrieving ICAT feedback on the current proposal
  • queue_timeout: time to wait for connection to metadata_urls
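As a rough illustration of which keys are required and which are optional, the sketch below checks an icat_servers section that has already been parsed into a plain Python dict. The helper name check_icat_servers is hypothetical and not part of BLISS. The fallback values simply reuse the numbers from the example above and are not claimed to be the defaults BLISS applies.

```python
# Hypothetical helper, not part of BLISS: checks a parsed icat_servers section.
REQUIRED_KEYS = ("metadata_urls", "elogbook_url", "elogbook_token")

# Assumption: fallback values copied from the example configuration above,
# not necessarily the defaults that BLISS itself uses.
OPTIONAL_KEYS = {
    "elogbook_timeout": 0.1,
    "feedback_timeout": 0.1,
    "queue_timeout": 1,
    "disable": False,
}


def check_icat_servers(section):
    """Return a copy of the section with the optional keys filled in.

    Raises ValueError when one of the required keys is missing.
    """
    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise ValueError(f"missing required icat_servers keys: {missing}")
    # Values given in the section take precedence over the fallbacks.
    return {**OPTIONAL_KEYS, **section}


section = {
    "metadata_urls": ["URL1", "URL2"],
    "elogbook_url": "URL3",
    "elogbook_token": "elogbook-00000000-0000-0000-0000-000000000000",
    "queue_timeout": 2,
}
checked = check_icat_servers(section)
```

Here checked keeps the explicit queue_timeout and gains the remaining optional keys, while a section without elogbook_token is rejected.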

Data diagram

Root directories

The ESRF data policy allows configuring the root directory based on proposal type:

scan_saving:
    class: ESRFScanSaving
    beamline: id00
    tmp_data_root: /data/{beamline}/tmp
    visitor_data_root: /data/visitor
    inhouse_data_root: /data/{beamline}/inhouse
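To make the role of the {beamline} placeholder concrete, here is a minimal sketch of how such a root template could be expanded for a given proposal type. The function data_root is hypothetical and only mimics the idea with str.format; it is not the BLISS implementation.

```python
# Hypothetical illustration, not BLISS code: the scan_saving section above
# represented as a plain dict.
SCAN_SAVING_CONFIG = {
    "class": "ESRFScanSaving",
    "beamline": "id00",
    "tmp_data_root": "/data/{beamline}/tmp",
    "visitor_data_root": "/data/visitor",
    "inhouse_data_root": "/data/{beamline}/inhouse",
}


def data_root(config, proposal_type):
    """Expand the root directory template for 'tmp', 'visitor' or 'inhouse'."""
    template = config[f"{proposal_type}_data_root"]
    # Templates without a placeholder (e.g. the visitor root) are returned as-is.
    return template.format(beamline=config["beamline"])


for proposal_type in ("tmp", "visitor", "inhouse"):
    print(proposal_type, "->", data_root(SCAN_SAVING_CONFIG, proposal_type))
```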

Multiple mount points

Multiple mount points can be configured for each proposal type (visitor, inhouse and tmp) and optionally for the ICAT services:

scan_saving:
    ...
    inhouse_data_root:
        nfs: /data/{beamline}/inhouse
        lsb: /lsbram/{beamline}/inhouse
    icat_inhouse_data_root: /data/{beamline}/inhouse

The active mount point can be selected in BLISS:

DEMO [1]: SCAN_SAVING.mount_point = "lsb"

The default is SCAN_SAVING.mount_point == "", which selects the first mount point in the configuration.
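The selection rule can be sketched as follows. The function select_root is hypothetical and not BLISS code. Raising KeyError for an unknown mount point name is an assumption of this sketch; the actual behaviour of BLISS in that case is not described here.

```python
# Hypothetical illustration of the mount point selection rule, not BLISS code.
INHOUSE_DATA_ROOT = {
    "nfs": "/data/{beamline}/inhouse",
    "lsb": "/lsbram/{beamline}/inhouse",
}


def select_root(data_root, mount_point=""):
    """Pick the root template that belongs to the active mount point."""
    if isinstance(data_root, str):
        # A single root was configured, so there is nothing to select.
        return data_root
    if mount_point == "":
        # Default: the first mount point in configuration order
        # (dicts preserve insertion order).
        return next(iter(data_root.values()))
    # Assumption: an unknown mount point name is an error (KeyError).
    return data_root[mount_point]


print(select_root(INHOUSE_DATA_ROOT))
print(select_root(INHOUSE_DATA_ROOT, "lsb"))
```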

Directory structure

Legacy directory structures can be enabled with:

scan_saving:
    ...
    directory_structure_version: 1

The different versions are:

  1. {base_path}/{proposal_dirname}/{beamline}/{proposal_session_name}/{collection_name}/{collection_name}_{dataset_name}
  2. {base_path}/{proposal_dirname}/{beamline}/{proposal_session_name}/raw/{collection_name}/{collection_name}_{dataset_name}
  3. {base_path}/{proposal_dirname}/{beamline}/{proposal_session_name}/RAW_DATA/{collection_name}/{collection_name}_{dataset_name}
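The three layouts differ only in the directory inserted between the session and the collection. The sketch below expands each template with str.format to show this. The field values (proposal hg123, session 20240101, and so on) are invented for illustration, and dataset_directory is a hypothetical helper, not a BLISS function.

```python
# Templates copied from the list above, keyed by directory_structure_version.
DIRECTORY_TEMPLATES = {
    1: "{base_path}/{proposal_dirname}/{beamline}/{proposal_session_name}"
       "/{collection_name}/{collection_name}_{dataset_name}",
    2: "{base_path}/{proposal_dirname}/{beamline}/{proposal_session_name}"
       "/raw/{collection_name}/{collection_name}_{dataset_name}",
    3: "{base_path}/{proposal_dirname}/{beamline}/{proposal_session_name}"
       "/RAW_DATA/{collection_name}/{collection_name}_{dataset_name}",
}


def dataset_directory(version, **fields):
    """Expand the dataset directory template of the given version."""
    return DIRECTORY_TEMPLATES[version].format(**fields)


# Invented example values, for illustration only.
fields = dict(
    base_path="/data/visitor",
    proposal_dirname="hg123",
    beamline="id00",
    proposal_session_name="20240101",
    collection_name="sample",
    dataset_name="0001",
)
for version in (1, 2, 3):
    print(version, dataset_directory(version, **fields))
```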

Dataset metadata

Under the ESRF data policy, scans are grouped together in a dataset. Each dataset has metadata, which are sent to one of the data policy services and are meant for searching datasets in the data portal.

The ICAT database stores this metadata under a predefined set of database fields, which need to be mapped to properties from BLISS objects. This can be configured in the session configuration:

- class: Session
  name: my_session
  config-objects:
    ...
  icat-metadata:
    definitions: "https://gitlab.esrf.fr/icat/hdf5-master-config/-/raw/master/hdf5_cfg.xml"  # optional
    default:
      secondary_slit: $secondary_slits
      sample.positioners: [$sy, $sz]
      variables: $sx
      optics.positioners: [$robx, $roby]
      detector05: $lima_simulator
      detector06: $beamviewer
      detector07: $fluo_diode.counter
      detector08: $diode1
      detector09: $diode2
      attenuator02: $att2
    techniques:
      TOMO:
        detector01: $tomocam
      XRD:
        detector02: $diffcam
        attenuator01: $att1
      FLUO:
        detector03: $mca1  # metadata group provided by `HasMetadataForDataset.dataset_metadata()` of controller `mca1`
        detector04.name: $mca2.name  # metadata field provided by the `name` attribute of controller `mca2`

The ICAT database fields are retrieved from the definitions URL when it is specified, or from BLISS otherwise.

All BLISS controllers in the session that implement the HasMetadataForDataset protocol will be used when gathering dataset metadata. There are several reasons why you would want to specify a controller explicitly under icat-metadata:

  • the controller is not part of the session (i.e. not listed under config-objects)
  • the controller does not have default metadata groups (HasMetadataForDataset.dataset_metadata_groups() == list())
  • you want to change the default metadata groups (e.g. ["secondary_slit"] instead of ["slits"])
  • you want to select specific controller attributes as metadata instead of what HasMetadataForDataset.dataset_metadata() returns
  • the controller only needs to be included for a specific technique

The metadata of a dataset are a combination of metadata from

  • the controllers under default
  • optionally, the controllers under one or more techniques (see SCAN_SAVING.dataset.techniques on how to select techniques)
  • the controllers in the BLISS session that are not specified explicitly under icat-metadata and with default metadata groups (HasMetadataForDataset.dataset_metadata_groups() != list())
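The first two sources in this list can be pictured as merging the default mapping with the mappings of the selected techniques. The sketch below does this on an abridged copy of the example configuration, held as a plain dict. The function explicit_mapping is hypothetical, the third source (controllers with default metadata groups that are not listed explicitly) is not modelled, and letting a technique entry override a default entry with the same key is an assumption of this sketch rather than documented BLISS behaviour.

```python
# Abridged copy of the icat-metadata example above, as a plain dict.
ICAT_METADATA = {
    "default": {
        "secondary_slit": "$secondary_slits",
        "attenuator02": "$att2",
    },
    "techniques": {
        "TOMO": {"detector01": "$tomocam"},
        "XRD": {"detector02": "$diffcam", "attenuator01": "$att1"},
    },
}


def explicit_mapping(icat_metadata, techniques=()):
    """Combine the default entries with those of the selected techniques."""
    mapping = dict(icat_metadata.get("default", {}))
    for name in techniques:
        # Assumption: on a key collision the technique entry wins.
        mapping.update(icat_metadata.get("techniques", {}).get(name, {}))
    return mapping


print(explicit_mapping(ICAT_METADATA))
print(explicit_mapping(ICAT_METADATA, techniques=["XRD"]))
```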

The allowed keys can be listed with demo_session.icat_metadata.available_icat_groups (to be used for controllers) or demo_session.icat_metadata.available_icat_fields (to be used for controller attributes). A key can be shortened to its ending as long as that ending is unique (for example, secondary_slit can be used instead of the full key instrument.secondary_slit).

DEMO_SESSION [1]: print(demo_session.icat_metadata.available_icat_groups)

    ['SAXS',
    'MX',
    'EM',
    'PTYCHO',
    'PTYCHO.Axis1',
    'PTYCHO.Axis2',
    'FLUO',
    'FLUO.measurement',
    'TOMO',
    'MRT',
    'HOLO',
    'WAXS',
    'sample',
    'sample.notes',
    'sample.positioners',
    'sample.patient',
    'sample.environment',
    'sample.environment.sensors',
    'instrument',
    'instrument.variables',
    'instrument.positioners',
    'instrument.monochromator',
    'instrument.monochromator.crystal',
    'instrument.source',
    'instrument.primary_slit',
    'instrument.secondary_slit',
    'instrument.slits',
    'instrument.xraylens01',
    'instrument.xraylens02',
    'instrument.xraylens03',
    'instrument.xraylens04',
    'instrument.xraylens05',
    'instrument.xraylens06',
    'instrument.xraylens07',
    'instrument.xraylens08',
    'instrument.xraylens09',
    'instrument.xraylens10',
    'instrument.attenuator01',
    'instrument.attenuator01.positioners',
    'instrument.attenuator02',
    'instrument.attenuator02.positioners',
    'instrument.attenuator03',
    'instrument.attenuator03.positioners',
    'instrument.attenuator04',
    'instrument.attenuator04.positioners',
    'instrument.attenuator05',
    'instrument.attenuator05.positioners',
    'instrument.attenuator06',
    'instrument.attenuator06.positioners',
    'instrument.attenuator07',
    'instrument.attenuator07.positioners',
    'instrument.attenuator08',
    'instrument.attenuator08.positioners',
    'instrument.attenuator09',
    'instrument.attenuator09.positioners',
    'instrument.attenuator10',
    'instrument.attenuator10.positioners',
    'instrument.attenuator11',
    'instrument.attenuator11.positioners',
    'instrument.attenuator12',
    'instrument.attenuator12.positioners',
    'instrument.attenuator13',
    'instrument.attenuator13.positioners',
    'instrument.attenuator14',
    'instrument.attenuator14.positioners',
    'instrument.attenuator15',
    'instrument.attenuator15.positioners',
    'instrument.insertion_device',
    'instrument.insertion_device.gap',
    'instrument.insertion_device.taper',
    'instrument.optics',
    'instrument.optics.positioners',
    'instrument.environment',
    'instrument.environment.sensors',
    'instrument.detector01',
    'instrument.detector01.positioners',
    'instrument.detector01.rois',
    'instrument.detector02',
    'instrument.detector02.positioners',
    'instrument.detector02.rois',
    'instrument.detector03',
    'instrument.detector03.positioners',
    'instrument.detector03.rois',
    'instrument.detector04',
    'instrument.detector04.positioners',
    'instrument.detector04.rois',
    'instrument.detector05',
    'instrument.detector05.positioners',
    'instrument.detector05.rois',
    'instrument.detector06',
    'instrument.detector06.positioners',
    'instrument.detector06.rois',
    'instrument.detector07',
    'instrument.detector07.positioners',
    'instrument.detector07.rois',
    'instrument.detector08',
    'instrument.detector08.positioners',
    'instrument.detector08.rois',
    'instrument.detector09',
    'instrument.detector09.positioners',
    'instrument.detector09.rois',
    'instrument.detector10',
    'instrument.detector10.positioners',
    'instrument.detector10.rois',
    'notes']
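The rule that a key may be shortened to its ending when that ending is unique can be sketched as a lookup against the list of available keys. The function resolve_key is hypothetical and not the BLISS implementation. Preferring an exact match over suffix matches (for example notes over sample.notes) is an assumption of this sketch.

```python
# A small subset of the groups printed above, enough to show the rule.
AVAILABLE_GROUPS = [
    "sample",
    "sample.notes",
    "sample.positioners",
    "instrument.positioners",
    "instrument.secondary_slit",
    "instrument.slits",
    "notes",
]


def resolve_key(short_key, available_keys):
    """Return the full key that a (possibly shortened) key refers to."""
    if short_key in available_keys:
        # Assumption: an exact match takes precedence over suffix matches.
        return short_key
    matches = [key for key in available_keys if key.endswith("." + short_key)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown key: {short_key!r}")
    raise ValueError(f"ambiguous key {short_key!r}: matches {matches}")


print(resolve_key("secondary_slit", AVAILABLE_GROUPS))
```

With this subset, positioners is rejected as ambiguous because it is the ending of both sample.positioners and instrument.positioners.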