{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0",
   "metadata": {},
   "source": [
    "# Data Abstraction\n",
    "\n",
    "The core data primitive in PyARPES is the `xarray.DataArray`. However, adding additional scientific functionality is needed since `xarray` provides only very general functionality. The approach that we take is described in some detail in the `xarray` documentation at [extending xarray](http://xarray.pydata.org/en/stable/internals.html#extending-xarray), which allows putting additional functionality on all arrays and datasets on particular, registered attributes.\n",
    "\n",
    "In PyARPES we use a few of these:\n",
    "\n",
    "1. `.S` attribute: functionality associated with spectra (physics here)\n",
    "2. `.G` attribute: general abstract functionality that could reasonably be a part of xarray core\n",
    "3. `.F` attribute: functionality associated with curve fitting\n",
    "\n",
    "Caveat: In general these accessors can and do behave slightly differently between datasets and arrays, depending on what makes contextual sense.\n",
    "\n",
    "This section will describe just some of the functionality provided by the `.S` attribute, while the following section will describe some of the functionality on `.G` and the section on curve fitting describes much of what is available through `.F`.\n",
    "\n",
    "Much more can be learned about them by viewing the definitions in `arpes.xarray_extensions`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1",
   "metadata": {},
   "source": [
    "## Data selection\n",
    "\n",
    "### `select_around` and `select_around_data`\n",
    "\n",
    "As an alternative to interpolating, you can integrate in a small rectangular or ellipsoidal region around a point using `.S.select_around`. You can also do this for a sequence of points using `.S.select_around_data`.\n",
    "\n",
    "These functions can be run in either summing or averaging mode using either `mode='sum'` or `mode='mean'` respectively. Using the radius parameter you can specify the integration radius in pixels (`int`) or in unitful (`float`) values for all (pass a single value) or for specific (`dict`) axes.\n",
    "\n",
    "`select_around_data` operates in the same way, except that instead of passing a single point, `select_around_data` expects a dictionary or Dataset mapping axis names to iterable collections of coordinates.\n",
    "\n",
    "As a concrete example, let's consider the `example_data.temperature_dependence` dataset with axes `(eV, phi, T)` consisting of cuts at different temperatures. Suppose we wish to obtain EDCs at the Fermi momentum for each value of the temperature.\n",
    "\n",
    "First we will load the data, and combine datasets to get a full temperature dependence."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2",
   "metadata": {},
   "outputs": [],
   "source": [
    "import arpes\n",
    "from arpes.io import example_data\n",
    "from matplotlib import pyplot as plt\n",
    "\n",
    "temp_dep = example_data.temperature_dependence\n",
    "near_ef = temp_dep.sel(eV=slice(-0.05, 0.05), phi=slice(-0.2, None)).sum(\"eV\").spectrum\n",
    "near_ef.S.plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3",
   "metadata": {},
   "source": [
    "### Finding $\\phi_F$/$k_F$\n",
    "\n",
    "Now, we want to find the location of the peak in each slice of temperature so we know where to take EDCs.\n",
    "\n",
    "We will do this in two ways:\n",
    "\n",
    "1. Taking the argmax across `phi`\n",
    "2. Curve fitting MDCs\n",
    "\n",
    "#### Argmax"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4",
   "metadata": {},
   "outputs": [],
   "source": [
    "argmax_indices = near_ef.argmax(dim=\"phi\")\n",
    "argmax_phis = near_ef.phi[argmax_indices]\n",
    "\n",
    "fig, ax = plt.subplots()\n",
    "near_ef.S.plot(ax=ax)\n",
    "# ax.scatter(*argmax_phis.G.to_arrays()[::-1], color=\"red\")  # G.to_arrays is deprecated.\n",
    "ax.scatter(argmax_phis.coords[\"phi\"].values, argmax_phis.coords[\"temperature\"].values, color=\"red\")\n",
    "fig"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5",
   "metadata": {},
   "source": [
    "#### Curve fitting\n",
    "\n",
    "This might be okay depending on what we are doing, but curve fitting is also straightforward and gives better results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6",
   "metadata": {},
   "outputs": [],
   "source": [
    "from lmfit.models import LinearModel, LorentzianModel\n",
    "\n",
    "# Fit with two components, a linear background and a Lorentzian peak\n",
    "model = LinearModel(prefix=\"a_\") + LorentzianModel(prefix=\"b_\")\n",
    "lorents_params = LorentzianModel(prefix=\"b_\").guess(\n",
    "    near_ef.sel(temperature=20, method=\"nearest\").values, near_ef.coords[\"phi\"].values\n",
    ")\n",
    "phis = near_ef.S.modelfit(\"phi\", model, params=lorents_params)\n",
    "\n",
    "phis"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7",
   "metadata": {},
   "source": [
    "There's a lot here to digest, the result of our curve fit also produced a `xr.Dataset`! This is because it bundles the fitting results (a 1D array of the fitting instances), the original data, and the residual together.\n",
    "\n",
    "We will see more about all of this in the curve fitting section to follow. For now, let's just get the peak centers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "# ax.scatter(*phis.results.F.p(\"b_center\").G.to_arrays()[::-1], color=\"red\")\n",
    "near_ef.S.plot(ax=ax)\n",
    "ax.scatter(\n",
    "    phis.modelfit_results.F.p(\"b_center\").values,\n",
    "    phis.modelfit_results.F.p(\"b_center\").coords[\"temperature\"].values,\n",
    "    color=\"red\",\n",
    ")\n",
    "fig"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9",
   "metadata": {},
   "source": [
    "This looks a lot cleaner, although our peaks may be biased too far to negative `phi` due to the asymmetric background.\n",
    "\n",
    "Here, note that prefix \"b_\" is added to the parameter \"center\" of the LorentzianModel.  (\"b\" means the \"second\" fitting model in this procedure)\n",
    "\n",
    "With the values of the Fermi momentum (in angle space) now in hand, we can select EDCs at the appropriate momentum for each value of the temperature.\n",
    "\n",
    "Let's average in a 10 mrad window."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "10",
   "metadata": {},
   "outputs": [],
   "source": [
    "# take the Lorentzian (component `b`) center parameter\n",
    "phi_values = phis.modelfit_results.F.p(\"b_center\")\n",
    "fig, ax = plt.subplots()\n",
    "ax = temp_dep.spectrum.S.select_around_data(\n",
    "    {\"phi\": phi_values}, mode=\"mean\", radius={\"phi\": 0.005}\n",
    ").S.plot(ax=ax)\n",
    "fig"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11",
   "metadata": {},
   "source": [
    "## Exercises\n",
    "\n",
    "1. Change the `phi` range of the selection to see how the fit responds. Can we deal with the asymmetric background this way?\n",
    "2. Inspect the first fit with `phis.results[0].item()`. What can you tell about the fit?\n",
    "2. Select a region of the temperature dependent data away from the band. Perform a broadcast fit for the Fermi edge using `arpes.fits.fit_models.AffineBroadenedFD`. Does the edge position shift at all? Does the edge width change at all? Look at the previous exercise to determine which parameters to look at."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}