Adding Support for Beamlines or Lab Facilities ============================================== **Note:** This is an advanced section, you can skip this unless you need or want to extend PyARPES to cover more data formats. Before continuing here, read our :doc:`intro tutorial on adding data sources `. One of the overarching design goals of PyARPES is to provide a completely uniform, pragmatic, and understandable approach to loading and performing common analyses of ARPES data. Practically, this consists of at least Motivation for PyARPES Plugins ------------------------------ 1. Programmatic or automated data loading + saving, including from networked locations 2. Data assuming the same ultimate format and structure, where this is reasonable (excluding SARPES) 3. k-space conversion “just working”, without requiring additional effort from either the PyARPES authors or users once implemented 4. Backward compatibility with Igor Pro experiments and binary data types In order to meet these constraints in light of the large variety of data formats, PyARPES first requires that data is normalized to a single data type (NetCDF) which is well supported by ``xarray`` our data primitive of choice. All plugins are loaded at configuration time (IPython kernel startup or invocation of ``arpes.setup``) from ``arpes.endstations.plugin``. You can add more at runtime by calling .. code:: python arpes.endstations.add_endstation(MyPlugin) How a Plugin Loads Data ----------------------- In order to load data, a plugin need only support the function ``load(scan_description: dict, **kwargs)``. In practice though, most plugins subclass ``arpes.endstations.EndstationBase``, whose ``load`` function looks like this: .. code:: python def load(self, scan_desc: dict=None, **kwargs): """ Loads a scan from a single file or a sequence of files. :param scan_desc: :param kwargs: :return: """ resolved_frame_locations = self.resolve_frame_locations(scan_desc) resolved_frame_locations = [f if isinstance(f, str) else str(f) for f in resolved_frame_locations] frames = [self.load_single_frame(fpath, scan_desc, **kwargs) for fpath in resolved_frame_locations] frames = [self.postprocess(f) for f in frames] concatted = self.concatenate_frames(frames, scan_desc) concatted = self.postprocess_final(concatted, scan_desc) if 'id' in scan_desc: concatted.attrs['id'] = scan_desc['id'] return concatted The core steps are: 1. Find associated files (corresponding to the whole dataset or to “frames” of the dataset) with ``resolve_frame_locations`` 2. Load each frame individually with ``load_single_frame`` 3. Perform some additional work on each frame with ``postprocess`` 4. Concatenate the frames with ``concatenate_frames`` 5. Perform some final work on the constructed dataset with ``postprocess_final`` 6. Add an ``id`` Additionally, during the step of reading the experiment spreadsheet, file shortnames (i.e. ``1``) are translated to full paths by using ``EndstationClass.find_first_file``. If you can get away with default behavior you just need to adjust the class attributes ``_TOLERATED_EXTENSIONS`` and ``_SEARCH_PATTERNS``. Otherwise, you can look at the definition for this function and do something appropriate in your use case. The reason for the “frames” concept is that some beamlines split datasets up over many files (MERLIN at the ALS, as an example), while others produce just one. In the case that only one file is present, ``concatenate_frames`` will return just this data. Writing Your Own Plugin ----------------------- To write your own plugin to be included in PyARPES, make a file containing a single class, subclassing ``EndstationBase``. If it represents a instrument with a hemispherical electron analyzer, subclass as well ``HemisphericalEndstation``. If it is associated with a synchrotron, subclass ``SynchrotronEndstation``. .. code:: python class MySamplePlugin(SynchrotronEndstation, EndstationBase, HemisphericalEndstation): # use this plugin for any data associated with the locations "AMAZING-ARPES-LAB", # "Best lab", or "AAL" PRINCIPAL_NAME = 'AMAZING-ARPES-LAB' ALIASES = ['Best lab', 'AAL',] _TOLERATED_EXTENSIONS = {'.pxt'} # only allow .pxt files _SEARCH_PATTERNS = [ # regex matching names like # "data_Conrad_4.pxt" and "data_Oct19_1.pxt" # # the file number is injected into the `{}` pattern. r'data_[a-zA-Z0-9]+_{}', # You can provide as many as you need. ] RENAME_KEYS = { # Our LabView software weirdly calls the temperature "ThermalEnergy", and # "SFE_0" is the spectrometer center binding energy 'ThermalEnergy': 'temp', 'SFE_0': 'binding_offset', } def load_single_frame(self, frame_path: str=None, scan_desc: dict=None, **kwargs): # data loading logic here... pass In the above, you should fill in ``load_single_frame`` so that it returns a ``xr.Dataset`` with a ``spectrum`` data variable. For examples of how the actual loading code might look, have a look at the definitions of the currently implemented plugins in ``merlin.py`` (SES binary multiframe format), ``MAESTRO.py`` (FITS single frame format), and ``ALG_main.py`` (FITS single frame format). Finally, ensure your plugin is exported in your module’s ``__all__`` attribute .. code:: python __all__ = ('MySamplePlugin',) You can register a plugin after import-time with ``arpes.endstations.add_endstation(MySamplePlugin)``, in which case the code can be anywhere. By contrast if you install from source and place the plugin in the ``arpes/endstations/plugins`` folder they will be loaded automatically. Renaming attributes ~~~~~~~~~~~~~~~~~~~ ``RENAME_KEYS`` can be used to rename attributes in the event that your VIs or spectrometer drivers produce. In the example above, we rename “ThermalEnergy” to “temp” and “SFE_0” to “binding_offset”. You can include as many of these key renamings as you like, in addition to the standard ones performed automatically.