Adding Data Sources: Basic Plugins
==================================

This guided section shows how to add a rudimentary data loading plugin, as a demonstration of how to extend PyARPES to work with your lab’s data. It is the first of a two-part section on data loading and plugins in PyARPES. If your needs are more advanced, see the :doc:`second page ` for more details.

Loading CSV Files into the PyARPES Format
-----------------------------------------

For all analysis work, PyARPES assumes that the data to be manipulated is an `xarray datatype `__, typically an `xarray.Dataset `__ or an `xarray.DataArray `__. Additionally, ARPES data must be labeled with enough :doc:`standard coordinates ` that we can convert to momentum.

Let’s assume that our data comes in two files: one providing the data, ``{name}.csv``, and one providing the coordinates, ``{name}.coords.csv``. A typical data file might look like

.. code:: text

   Analyzer Spectrum
   35, 41, 43, 112, 229, 433, 654, 584, 262, 105
   89, 153, 207, 281, 529, 969, 1061, 602, 236, 101
   295, 180, 249, 522, 833, 911, 856, 536, 236, 98
   261, 226, 379, 509, 613, 787, 777, 522, 224, 94
   271, 268, 338, 397, 568, 746, 703, 478, 217, 93
   233, 204, 327, 464, 557, 691, 682, 477, 216, 93
   185, 142, 203, 412, 681, 792, 732, 494, 223, 94
   189, 130, 141, 206, 395, 740, 934, 615, 233, 96
   146, 142, 151, 169, 238, 364, 531, 501, 267, 110
   36, 40, 49, 89, 144, 214, 288, 256, 161, 96

and a typical coordinates file might look like

.. code:: text

   energy angle
   -0.425 0.221
   -0.369 0.263
   -0.313 0.305
   -0.258 0.347
   -0.202 0.389
   -0.146 0.431
   -0.090 0.472
   -0.034 0.514
   0.020 0.556
   0.076 0.598

Barebones, Function-based Approach
----------------------------------

The simplest way to handle this task is to write a function that loads the CSVs. As a first pass, we can load just the data file and turn it into an ``xarray.DataArray``.

.. code:: python

   import xarray as xr
   import numpy as np

   def load_csv_datatype(path_to_file: str) -> xr.DataArray:
       # skip the one-line "Analyzer Spectrum" header
       loaded_data = np.loadtxt(path_to_file, delimiter=',', skiprows=1)
       return xr.DataArray(loaded_data)

All we need to do now is attach the coordinates. Let’s modify the function so that it also loads the coordinate columns from the second file

.. code:: python

   import xarray as xr
   import numpy as np
   from pathlib import Path

   def load_csv_datatype(path_to_file: str) -> xr.DataArray:
       data_path = Path(path_to_file)

       # skip the one-line "Analyzer Spectrum" header
       loaded_data = np.loadtxt(data_path, delimiter=',', skiprows=1)

       # the coordinates file sits next to the data file: {name}.csv -> {name}.coords.csv
       coordinates_file = data_path.with_name(data_path.stem + '.coords.csv')

       # get the dimension names from the first line of the coordinates file
       with open(coordinates_file) as f:
           dim_names = f.readline().split()

       raw_coordinates = np.loadtxt(coordinates_file, skiprows=1)

       return xr.DataArray(
           loaded_data,
           coords={d: raw_coordinates[:, i] for i, d in enumerate(dim_names)},
           dims=dim_names,
           # attrs={...} <- attributes here
       )

Writing the Plugin
------------------

You can use the above code as-is to load this data, with the caveat noted earlier that momentum conversion requires the standard coordinates to be attached. Alternatively, we can integrate it into a plugin. A plugin registers the data loading code against a labeled “location” that the data comes from, and makes it easier to fill in missing values, ensure a standard representation, and vary behavior between similar but differing data formats.

To do this, we subclass ``arpes.endstations.SingleFileEndstation``

.. code:: python

   ...
   from arpes.endstations import SingleFileEndstation, add_endstation

   class CSVDataEndstation(SingleFileEndstation):
       PRINCIPAL_NAME = 'csv'  # allows us to use this code to refer to data labeled with location="csv"
       _TOLERATED_EXTENSIONS = {'.csv',}  # allow only .csv files!

       def load_single_frame(self, frame_path: str = None, scan_desc: dict = None, **kwargs):
           data = load_csv_datatype(frame_path)
           return xr.Dataset({'spectrum': data})

   # register it
   add_endstation(CSVDataEndstation)

Now you can load data with ``CSVDataEndstation.load_from_path`` or with ``CSVDataEndstation.load``. You can also use the standard data loading function by passing ``location='csv'``. Because ours is the only plugin registered against the ``.csv`` file format, loading data without the ``location`` keyword will use our new class by default.

The data loading plugins provide a number of features that make it simpler to write data loading code for ARPES, especially in normalizing coordinate units (mm for all distances, rad for all angular measures) and in ensuring that the coordinates necessary for momentum conversion are attached.

If you want to learn more about writing data plugins, have a look at the in-depth description of how they work in the :doc:`second part ` of this tutorial.
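
Before wiring the loader into a plugin, it can help to check the function-based approach on its own. The following is a minimal sketch that needs only ``numpy`` and ``xarray``, not PyARPES. It writes a trimmed, hypothetical 3x3 version of the example files above into a temporary directory and loads them back. The file name ``scan.csv`` and the trimmed values are assumptions chosen for illustration, and the loader is restated so that the block runs by itself.

```python
import tempfile
from pathlib import Path

import numpy as np
import xarray as xr


def load_csv_datatype(path_to_file: str) -> xr.DataArray:
    """Load {name}.csv and attach coordinates from the sibling {name}.coords.csv."""
    data_path = Path(path_to_file)
    # The data file has a one-line header followed by comma-separated rows.
    loaded_data = np.loadtxt(data_path, delimiter=",", skiprows=1)

    # The coordinates file is assumed to sit next to the data file.
    coordinates_file = data_path.with_name(data_path.stem + ".coords.csv")
    with open(coordinates_file) as f:
        dim_names = f.readline().split()
    raw_coordinates = np.loadtxt(coordinates_file, skiprows=1)

    return xr.DataArray(
        loaded_data,
        coords={d: raw_coordinates[:, i] for i, d in enumerate(dim_names)},
        dims=dim_names,
    )


# Hypothetical, trimmed-down example files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "scan.csv"
    csv_path.write_text(
        "Analyzer Spectrum\n"
        "35, 41, 43\n"
        "89, 153, 207\n"
        "295, 180, 249\n"
    )
    (Path(tmp) / "scan.coords.csv").write_text(
        "energy angle\n"
        "-0.425 0.221\n"
        "-0.369 0.263\n"
        "-0.313 0.305\n"
    )
    spectrum = load_csv_datatype(str(csv_path))

print(spectrum.dims)
# Select by coordinate value rather than by integer position.
print(spectrum.sel(energy=-0.369, angle=0.305, method="nearest").item())
```

Because both coordinate columns come from a single table, this file layout only works when every dimension has the same length as the number of rows in the coordinates file. A format with axes of different lengths would need one coordinate list per axis.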