Isoreader supports several dual inlet IRMS data formats. This vignette shows some of the functionality for scan data files. For additional information on operations more generally (caching, combining read files, data export, etc.), please consult the operations vignette. For details on downstream data processing and visualization, see the isoprocessor package.
Note: this vignette is still a work in progress.
Reading scan files is as simple as passing one or multiple file or folder paths to the iso_read_scan() function. If folders are provided, any files with a recognized scan file extension within those folders will be processed (e.g. all .scn). Here we read several files that are bundled with the package as examples (and whose paths can be retrieved using the iso_get_reader_example() function).
# read scan examples
scan_files <-
  iso_read_scan(
    iso_get_reader_example("peak_shape_scan_example.scn"),
    iso_get_reader_example("background_scan_example.scn"),
    iso_get_reader_example("full_scan_example.scn"),
    iso_get_reader_example("time_scan_example.scn")
  )
#> Info: preparing to read 4 data files (all will be cached)...
#> Info: reading file 'peak_shape_scan_example.scn' from cache...
#> Info: reading file 'background_scan_example.scn' from cache...
#> Info: reading file 'full_scan_example.scn' from cache...
#> Info: reading file 'time_scan_example.scn' from cache...
#> Info: finished reading 4 files in 0.42 secs
The scan_files variable now contains a set of isoreader objects, one for each file. Take a look at what information was retrieved from the files using the iso_get_data_summary() function.
scan_files %>% iso_get_data_summary() %>% rmarkdown::paged_table()
#> Info: aggregating data summary from 4 data file(s)
In case there was any trouble with reading any of the files, the following functions provide an overview summary as well as details of all errors and warnings, respectively. The examples here contain no errors but if you run into any unexpected file read problems, please file a bug report in the isoreader issue tracker.
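A minimal sketch of such a check, assuming the functions in question are isoreader's iso_get_problems_summary() (overview per file) and iso_get_problems() (individual warnings and errors). Both exist in the package, but that they are the exact calls intended here is an assumption:

```r
library(isoreader)

# read two of the bundled example files (same helper as used above)
scan_files <- iso_read_scan(
  iso_get_reader_example("peak_shape_scan_example.scn"),
  iso_get_reader_example("time_scan_example.scn")
)

# assumption: these are the two functions the text refers to
# overview of files that encountered problems while being read
problems_summary <- iso_get_problems_summary(scan_files)
# details of the individual warnings and errors
problems <- iso_get_problems(scan_files)

print(problems_summary)
print(problems)
```

Both calls return data frames, so the results can be passed on to rmarkdown::paged_table() like the other summaries in this vignette.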
Detailed file information can be aggregated for all isofiles using the iso_get_file_info() function, which supports the full select syntax of the dplyr package to specify which columns are of interest (by default, all file information is retrieved).
# all file information
scan_files %>% iso_get_file_info(select = c(-file_root)) %>% rmarkdown::paged_table()
#> Info: aggregating file info from 4 data file(s), selecting info columns 'c(-file_root)'
# select + rename specific file info columns
scan_files2 <- scan_files %>%
  iso_select_file_info(-file_root) %>%
  iso_rename_file_info(`Date & Time` = file_datetime)
#> Info: selecting/renaming the following file info across 4 data file(s): '-file_root'
#> Info: renaming the following file info across 4 data file(s): 'file_datetime'->'Date & Time'

# fetch all file info
scan_files2 %>% iso_get_file_info() %>% rmarkdown::paged_table()
#> Info: aggregating file info from 4 data file(s)
Any collection of isofiles can also be filtered based on the available file information using the function
iso_filter_files. This function can operate on any column available in the file information and supports full dplyr syntax.
# find files whose scan type is 'High Voltage'
scan_files2 %>%
  iso_filter_files(type == "High Voltage") %>%
  iso_get_file_info() %>%
  rmarkdown::paged_table()
#> Info: applying file filter, keeping 2 of 4 files
#> Info: aggregating file info from 2 data file(s)
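Since the filter condition is written like a dplyr filter expression, other columns and helper functions should work the same way. A small sketch, assuming the file_id column holds the file names as they appear in the read messages earlier in this vignette:

```r
library(isoreader)

# read two of the bundled example files
scan_files <- iso_read_scan(
  iso_get_reader_example("peak_shape_scan_example.scn"),
  iso_get_reader_example("time_scan_example.scn")
)

# keep only files whose ID contains "peak_shape"
peak_files <- iso_filter_files(scan_files, grepl("peak_shape", file_id))

# inspect which file IDs remain after filtering
peak_info <- iso_get_file_info(peak_files)
print(peak_info$file_id)
```

Pattern matching on file_id is handy when file names encode sample or run identifiers that are not stored in a dedicated file info column.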
The file information in any collection of isofiles can also be mutated using the function iso_mutate_file_info. This function can introduce new columns and operate on any existing columns available in the file information (even if they do not exist in all files) and supports full dplyr syntax.
scan_files3 <- scan_files2 %>%
  iso_mutate_file_info(
    # introduce new column
    `Run in 2019?` = `Date & Time` > "2019-01-01" & `Date & Time` < "2020-01-01"
  )
#> Info: mutating file info for 4 data file(s)

scan_files3 %>% iso_get_file_info() %>% rmarkdown::paged_table()
#> Info: aggregating file info from 4 data file(s)
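The condition in the mutate call above is an ordinary vectorised comparison. A minimal base R sketch of the same logic, using made-up timestamps (hypothetical, not taken from the example files) and explicit POSIXct cutoffs in UTC (the time zone is an assumption made to keep the example reproducible):

```r
# hypothetical timestamps standing in for the `Date & Time` column
date_time <- as.POSIXct(
  c("2018-11-03 10:15:00", "2019-06-21 08:30:00", "2020-02-01 12:00:00"),
  tz = "UTC"
)

# explicit cutoffs for the start and end of the window
start_2019 <- as.POSIXct("2019-01-01", tz = "UTC")
start_2020 <- as.POSIXct("2020-01-01", tz = "UTC")

# TRUE only for timestamps strictly between the two cutoffs
run_in_2019 <- date_time > start_2019 & date_time < start_2020
print(run_in_2019)
```

Both bounds are strict, so a run stamped exactly at midnight on 2019-01-01 would not be flagged; using `>=` on the lower bound would include it.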
Additionally, some IRMS data files contain resistor information that is useful for downstream calculations (see e.g. section on signal conversion later in this vignette):
scan_files %>% iso_get_resistors() %>% rmarkdown::paged_table()
#> Info: aggregating resistors info from 4 data file(s)
The raw data read from the scan files can be retrieved similarly using the
iso_get_raw_data() function. Most data aggregation functions also allow for inclusion of file information using the
include_file_info parameter, which functions identically to the
select parameter of the
iso_get_file_info function discussed earlier.
# get raw data with default selections (all raw data, no additional file info)
scan_files %>% iso_get_raw_data() %>% head(n = 10) %>% rmarkdown::paged_table()
#> Info: aggregating raw data from 4 data file(s)
# get specific raw data and add some file information
scan_files %>%
  iso_get_raw_data(
    # select just the x values, their units, and the two ion signals
    select = c(x, x_units, v44.mV, v45.mV),
    # include the scan type and rename the column
    include_file_info = c(`Scan Type` = type)
  ) %>%
  # look at first few records only
  head(n = 10) %>% rmarkdown::paged_table()
#> Info: aggregating raw data from 4 data file(s), selecting data columns 'c(x, x_units, v44.mV, v45.mV)', including file info 'c(`Scan Type` = type)'
For users familiar with the nested data frames from the tidyverse (particularly tidyr’s
unnest), there is an easy way to retrieve all data from the iso file objects in a single nested data frame:
all_data <- scan_files %>% iso_get_all_data()
#> Info: aggregating all data from 4 data file(s)
# not printed out because this data frame is very big
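For readers less familiar with nested data frames, here is a small self-contained sketch of the idea using tidyr::unnest(). The tibble below is made up for illustration: its column names and values are hypothetical and only loosely mimic the structure of the real all_data object.

```r
library(tibble)
library(tidyr)

# hypothetical nested data frame: one row per file,
# with each file's raw data stored as a tibble in a list column
nested <- tibble(
  file_id = c("a.scn", "b.scn"),
  raw_data = list(
    tibble(x = c(1, 2), v44.mV = c(10.5, 11.0)),
    tibble(x = c(1, 2, 3), v44.mV = c(9.8, 9.9, 10.1))
  )
)

# expand the list column into regular rows and columns,
# repeating file_id for every measurement of that file
flat <- unnest(nested, cols = raw_data)
print(flat)
```

The same pattern should apply to whichever list columns the real nested data frame contains; checking names(all_data) first shows which ones are available.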
Saving entire collections of isofiles for retrieval at a later point is easily done using the iso_save function, which stores collections or individual isoreader file objects in the efficient R data storage format .rds (if not specified, the extension .scan.rds will be automatically appended). These saved collections can be conveniently read back using the same iso_read_scan command used for raw data files.
# export to R data archive
scan_files %>% iso_save("scan_files_export.scan.rds")
#> Info: exporting data from 4 iso_files into R Data Storage 'scan_files_export.scan.rds'

# read back the exported R data storage
iso_read_scan("scan_files_export.scan.rds")
#> Info: preparing to read 1 data files (all will be cached)...
#> Info: reading file 'scan_files_export.scan.rds' with '.scan.rds' reader...
#> Info: loaded 4 data files from R Data Storage
#> Info: finished reading 1 files in 0.16 secs
#> Data from 4 scan iso files:
#> # A tibble: 4 x 5
#>   file_id         raw_data                file_info method_info file_path
#>   <chr>           <glue>                  <chr>     <chr>       <chr>
#> 1 peak_shape_sca… 220 measurements, 3 io… 8 entries resistors   peak_shape_scan…
#> 2 background_sca… 525 measurements, 7 io… 8 entries resistors   background_scan…
#> 3 full_scan_exam… 799 measurements, 3 ch… 8 entries resistors   full_scan_examp…
#> 4 time_scan_exam… 5532 measurements, 2 i… 8 entries resistors   time_scan_examp…
At the moment, isoreader supports export of all data to Excel and the Feather file format (a Python/R cross-over format). Note that both export methods have similar syntax and append the appropriate file extension for each type of export file (.scan.xlsx and .scan.feather, respectively).
# export to excel
scan_files %>% iso_export_to_excel("scan_files_export")
#> Info: exporting data from 4 iso_files into Excel 'scan_files_export.scan.xlsx'
#> Info: aggregating all data from 4 data file(s)

# data sheets available in the exported data file:
readxl::excel_sheets("scan_files_export.scan.xlsx")
#> [1] "file info" "raw data" "resistors" "problems"
# export to feather
scan_files %>% iso_export_to_feather("scan_files_export")
#> Info: exporting data from 4 iso_files into .scan.feather files at 'scan_files_export'
#> Info: aggregating all data from 4 data file(s)

# exported feather files
list.files(pattern = ".scan.feather")
#> [1] "scan_files_export_file_info.scan.feather"
#> [2] "scan_files_export_problems.scan.feather"
#> [3] "scan_files_export_raw_data.scan.feather"
#> [4] "scan_files_export_resistors.scan.feather"