MTS Collections for Developers

Some of the repositories mentioned on this page are hosted on ORNL GitLab; access can be requested from Yuanpeng Zhang (zhangy3@ornl.gov).

MantidTotalScattering (MTS) is a total scattering data reduction routine that wraps up a series of Mantid algorithms for reducing neutron total scattering data. Here at SNS, ORNL, we have already put the MTS data reduction framework into practical use for total scattering data processing on both NOMAD and POWGEN. From a high-level view, we believe that as long as the neutron event data files are stored in the NeXus format and are compatible with the Mantid framework, MTS should be usable for any time-of-flight (TOF) instrument. Here, by compatible, we mean that the instrument geometry, sample logs, etc. are incorporated in the data NeXus files and are readable by Mantid.

On the user side, MTS takes a single JSON file as input, which contains all the information needed for running the data reduction. This includes the raw data file(s), the calibration file (in HDF5 format, containing the calibration constants for the TOF-\(d\) conversion), the characterization files (empty container, empty instrument, and vanadium measurements), the sample and vanadium specific information (composition, geometry, etc.), among other necessary information. Currently, we are still building the detailed instructions for the available entries in the MTS input JSON file; once available, the documentation will be posted on the current powder diffraction website.
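As a rough illustration only (the entry names below are assumptions based on typical MTS inputs, not the authoritative schema, which should be taken from the forthcoming documentation and the examples shipped with the repository), an MTS input JSON file might look something like this:

```json
{
  "Facility": "SNS",
  "Instrument": "NOM",
  "Title": "sample_reduction_example",
  "Sample": {
    "Runs": "12345",
    "Background": { "Runs": "12340" },
    "Material": "Si O2",
    "MassDensity": 2.2,
    "PackingFraction": 0.6,
    "Geometry": { "Radius": 0.3, "Height": 4.0 }
  },
  "Normalization": {
    "Runs": "12300",
    "Background": { "Runs": "12301" },
    "Material": "V"
  },
  "Calibration": { "Filename": "/path/to/calibration.h5" },
  "Merging": { "QBinning": [0.0, 0.02, 40.0] },
  "OutputDir": "/path/to/output"
}
```

The sample and normalization (vanadium) blocks carry the composition and geometry information used, among other things, for the absorption correction, while the calibration entry points at the HDF5 file holding the TOF-\(d\) calibration constants.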

For the NOMAD instrument at SNS, ORNL, we have a GUI called ADDIE that serves as the frontend interface to the underlying MTS engine for running the data reduction and post-processing. Detailed instructions can be found here and here.

On the developer side, although we expect the MTS framework to be generic and adaptable to general neutron TOF instruments, in practice there are still many instrument-specific tweaks that need to be implemented in the MTS routine. Such instrument-specific tweaks are quite often entangled with other parts of the implementation, such as the absorption correction and how the caching should be done accordingly. On this page, we put down some high-level notes about how the whole pipeline for the typical TOF instruments at SNS, ORNL is constructed. The notes are mainly for developers who may want to incorporate the MTS routine into their own data processing pipeline.

  • The MTS source code is hosted on GitHub; here is the repository: neutrons/mantid_total_scattering.

    GitHub Actions is in place for CI/CD. Depending on the push and tagging action, either some simple checks are performed or a conda package is built automatically.

    The main part of the source code is located in the total_scattering directory of the repo. In the total_scattering/params.py file, the instrument-specific configuration is hard-coded. Any time a new instrument needs to be added, a corresponding entry has to be introduced into that list (a schematic sketch of such an entry is given after this list).

    The MTS workflow is summarized in a set of diagrams, which are available here.

  • The source code of the calibration routine for NOMAD is available here.

    Calibration of the sample environment with a standard sample (usually diamond for total scattering calibration purposes) is quite often the first step in the data reduction workflow. The repo above contains the source code for running the calibration specifically for NOMAD at SNS, ORNL.

    nom_cal is a wrapper bash script that runs the bottom-level Python script (see utils/nom_cal.py) for performing the calibration.

    N.B. My personal feeling is that the calibration process is strongly instrument dependent and is therefore very difficult to generalize. We did try to develop a generic routine to reliably create the calibration file for multiple instruments with the same workflow, but failed.

    The calibration routine can run independently, taking an input JSON file (see the repo README for an example). We have also incorporated the calibration routine into the auto reduction workflow for both NOMAD and POWGEN at SNS, ORNL. As soon as a data file becomes available, the auto reduction routine checks the sample composition and the run title; once all the set criteria are satisfied and the run is identified as a standard diamond run, the calibration routine is started automatically (a sketch of such a check is given after this list).

  • For the absorption correction, the calculation can be done for a specific instrument even before the data collection, as long as the sample information (composition and geometry) is available. We created a utility, abs_pre_calc, to perform the pre-calculation of the absorption correction, and the source code can be found here. The main script is abs_pre_calc, a wrapper bash script which calls several routines for doing the absorption calculation.

    The routine talks to the ITEMS database here at ORNL, which stores the sample information for each experiment, and fetches the sample information from it. The sample information is populated into a LibreOffice sheet file, which is brought up automatically so that we can manually correct the sample information if needed. Each row of the sheet corresponds to a unique sample ID and contains all the information needed for performing a successful absorption calculation.

    This sample information sheet file is also used at the auto reduction stage. Once a data file is available after the data collection, we can obtain the sample ID from the sample log embedded in the data NeXus file, and then use the sample ID to find all the relevant characterization measurements. It is through the sample information sheet prepared with the abs_pre_calc routine that we know which container was used for holding the sample (the container information is needed for the absorption calculation).

    For the absorption calculation, we are using the numerical integration approach within the Paalman-Pings framework. More details can be found here, and the basic Paalman-Pings relation is recalled after this list.

    It turns out that the absorption spectra for some detectors are very similar to each other, so for those detectors we only need to calculate the absorption correction once. We implemented an automatic grouping mechanism in the abs_pre_calc routine so that detectors are grouped according to the similarity of their absorption spectra (see the grouping sketch after this list).

    The pre-calculated absorption and detector grouping files are saved to dedicated, instrument-specific locations; the corresponding implementation in MTS knows where to find those pre-calculated absorption and detector grouping files.

  • Through the pre-calculation of the absorption and the subsequent detector grouping, we obtain the detector grouping file, and we can then cache the processed data according to that specific grouping. At SNS (and probably for some instruments at HFIR as well), we have a live reduction service running, so the streamed live data can be processed on the fly while they are being collected. At this stage, we can grab the identified grouping of detectors, perform align & focus (into the identified groups), and cache the align-and-focus outcome, throwing away the events and keeping only the histogram data (with a binning fine enough to allow any coarser binning that may be needed further down the road). At the data reduction stage, the cached file can be loaded directly to boost the performance dramatically: typically, the data processing time is reduced to about 1/10 of the raw event data processing time. The live reduction script that performs the align & focus and caching can be found here (a minimal caching sketch is given after this list).

  • At SNS and HFIR, ORNL, all four powder instruments (NOMAD and POWGEN at SNS, HB-2A and HB-2C at HFIR) have a working auto reduction workflow, so that data are reduced automatically once they are available on disk. The auto reduction script for NOMAD can be found here. The main script is reduce_NOM.py, and a central configuration file in JSON format, auto_config.json, hosts the parameters and controls for the auto reduction.

    At the bottom level, the auto reduction routine runs MTS, and a lot of the work in the routine goes into collecting all the information and preparing the input for running MTS. Regarding the characterization runs, we have a central CSV file, auto_exp.csv, hosting the information about the characterization runs to be used for a certain range of run numbers. This is the look-up table used to pin down the characterization runs for a given data run.

    The auto reduction script also contains the loop over the packing fraction. In MTS, we have an implementation that grabs the high \(Q\) region of the reduced structure factor and performs a linear fit so that the high \(Q\) level can be estimated. This level is written into the output log file together with the expected self-scattering level given the sample composition. The auto reduction routine then reads the output log file and cycles through the packing fraction tweaking until the actual high \(Q\) level agrees with the expected self-scattering level within a pre-defined threshold (a sketch of this loop is given after this list).
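To make a few of the points above more concrete, some schematic sketches follow; they are illustrative only and are not taken from the actual repositories.

First, regarding the hard-coded instrument configuration in total_scattering/params.py: adding a new instrument means adding a new entry to a Python structure of roughly the following shape. The keys, field names, and paths below are made up for illustration; the real structure should be checked in the repository.

```python
# Schematic sketch only -- the keys, values and paths are illustrative and do
# not reproduce the actual content of total_scattering/params.py.
_INSTRUMENT_PARAMS = {
    "NOM": {
        "facility": "SNS",
        "cache_dir": "/SNS/NOM/shared/mts_cache",         # hypothetical path
        "abs_cache_dir": "/SNS/NOM/shared/abs_pre_calc",  # hypothetical path
    },
    "PG3": {
        "facility": "SNS",
        "cache_dir": "/SNS/PG3/shared/mts_cache",         # hypothetical path
        "abs_cache_dir": "/SNS/PG3/shared/abs_pre_calc",  # hypothetical path
    },
    # A new instrument gets a corresponding new entry here.
}


def get_instrument_params(instrument: str) -> dict:
    """Look up the hard-coded configuration for the given instrument."""
    try:
        return _INSTRUMENT_PARAMS[instrument]
    except KeyError:
        raise ValueError(f"Instrument '{instrument}' is not configured")
```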
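Next, the automatic triggering of the calibration: the sketch below illustrates the kind of check the auto reduction performs to decide whether a run is a standard diamond run. The NeXus paths, the helper, and the criteria (a title mentioning diamond and a pure-carbon composition) are assumptions for illustration; the real criteria live in the auto reduction scripts.

```python
import h5py


def _read_string(dataset) -> str:
    """Read an HDF5 string dataset that may be stored as a scalar or a
    length-1 array of bytes/str."""
    value = dataset[()]
    if isinstance(value, (bytes, str)):
        return value.decode() if isinstance(value, bytes) else str(value)
    first = value[0]
    return first.decode() if isinstance(first, bytes) else str(first)


def looks_like_diamond_run(nexus_file: str) -> bool:
    """Heuristic sketch only: decide from the metadata in an event NeXus file
    whether a run is a standard diamond (calibration) run. The HDF5 paths and
    criteria below are illustrative assumptions, not the actual logic used for
    NOMAD or POWGEN."""
    with h5py.File(nexus_file, "r") as f:
        title = _read_string(f["/entry/title"]).lower()                    # assumed path
        formula = _read_string(f["/entry/sample/chemical_formula"]).lower()  # assumed path
    return "diamond" in title and formula.strip() in ("c", "carbon")
```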
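For the absorption correction itself, the Paalman-Pings framework mentioned above expresses the container-corrected sample intensity (sample S measured inside container C) as

\[
I_S = \frac{1}{A_{S,SC}} \left( I_{S+C} - \frac{A_{C,SC}}{A_{C,C}}\, I_C \right),
\]

where \(I_{S+C}\) and \(I_C\) are the measured sample-plus-container and empty-container intensities, and \(A_{i,j}\) is the attenuation factor for radiation scattered in region \(i\) and attenuated by region(s) \(j\). The \(A_{i,j}\) factors are what the numerical integration over the sample and container geometry evaluates, which is what abs_pre_calc pre-computes.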
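The automatic detector grouping by absorption-spectrum similarity can be sketched as a simple greedy clustering. The implementation below, with a hypothetical relative-deviation threshold, only illustrates the idea and is not the actual abs_pre_calc algorithm.

```python
import numpy as np


def group_by_similarity(spectra: np.ndarray, rel_tol: float = 0.01) -> np.ndarray:
    """Illustrative sketch: greedily assign detectors to groups such that,
    within a group, every absorption spectrum deviates from the group's
    reference spectrum by less than `rel_tol` (maximum relative deviation).
    `spectra` has shape (n_detectors, n_wavelength_points) and is assumed to
    hold strictly positive attenuation factors. Returns one group ID per
    detector."""
    n_det = spectra.shape[0]
    group_ids = np.empty(n_det, dtype=int)
    references = []  # one reference spectrum per group
    for det in range(n_det):
        for gid, ref in enumerate(references):
            if np.max(np.abs(spectra[det] - ref) / ref) < rel_tol:
                group_ids[det] = gid
                break
        else:
            # No existing group is close enough; start a new one.
            references.append(spectra[det])
            group_ids[det] = len(references) - 1
    return group_ids

# Detectors falling into the same group then share a single absorption
# calculation, e.g. groups = group_by_similarity(absorption_spectra, 0.01).
```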
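For the live-reduction caching, a minimal sketch using the Mantid Python API is given below. Load, AlignAndFocusPowder, and SaveNexusProcessed are standard Mantid algorithms, but the file names, the binning, and the way the grouping is supplied here are assumptions for illustration; the live reduction script itself is the reference.

```python
# Minimal sketch, assuming a Mantid installation and illustrative file paths.
from mantid.simpleapi import Load, AlignAndFocusPowder, SaveNexusProcessed

# Load the (live or on-disk) event data; the run number is illustrative.
ws = Load(Filename="NOM_12345")

# Align (TOF -> d) and focus into the pre-determined detector groups, keeping
# histograms only (events are dropped) with a binning fine enough that any
# coarser binning can still be produced later. Property names should be
# checked against the AlignAndFocusPowder documentation.
focused = AlignAndFocusPowder(
    InputWorkspace=ws,
    CalFileName="/path/to/calibration.h5",   # illustrative path
    GroupFilename="/path/to/grouping.xml",   # illustrative path
    Params="-0.0008",                        # fine logarithmic binning (illustrative)
    PreserveEvents=False,
)

# Cache the focused histograms; the reduction stage loads this file directly.
SaveNexusProcessed(InputWorkspace=focused,
                   Filename="/path/to/cache/NOM_12345_focused.nxs")
```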
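Finally, the packing fraction tweaking in the auto reduction can be summarized as a simple iterative adjustment. The sketch below only illustrates the control flow; run_reduction is a hypothetical hook standing in for running MTS and parsing the fitted high-\(Q\) level from its output log, and the update rule is an illustrative choice rather than the one used in reduce_NOM.py.

```python
from typing import Callable


def tune_packing_fraction(run_reduction: Callable[[float], float],
                          expected_self_scattering: float,
                          initial_pf: float = 1.0,
                          rel_threshold: float = 0.02,
                          max_iter: int = 10) -> float:
    """Illustrative sketch of the packing-fraction tweaking loop.
    `run_reduction(pf)` is a hypothetical hook that runs MTS with the given
    packing fraction and returns the fitted high-Q level from the output log."""
    pf = initial_pf
    for _ in range(max_iter):
        high_q_level = run_reduction(pf)
        if abs(high_q_level - expected_self_scattering) <= rel_threshold * expected_self_scattering:
            break  # agreement within the pre-defined threshold
        # Under a simple per-atom normalization the high-Q level scales roughly
        # as 1/(packing fraction), hence this illustrative update rule.
        pf *= high_q_level / expected_self_scattering
    return pf
```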