Scan Examples

Introduction

Isoreader supports several IRMS data formats. This vignette shows some of the functionality for scan data files. For general operations (caching, combining read files, data export, etc.), please consult the operations vignette. For downstream data processing and visualization, see the isoprocessor package.

Note: this vignette is still a work in progress.

# load isoreader package
library(isoreader)

Reading files

Reading scan files is as simple as passing one or more file or folder paths to the iso_read_scan() function. If folders are provided, any files within those folders that have a recognized scan file extension will be processed (e.g. all .scn files). Here we read several files that are bundled with the package as examples (their paths can be retrieved using the iso_get_reader_example() function).

# all available examples
iso_get_reader_examples() |> knitr::kable()

filename	type	software	description
continuous_flow_example.cf	continuous flow	Isodat	Continuous Flow file format (older)
continuous_flow_example.dxf	continuous flow	Isodat	Continuous Flow file format (newer)
continuous_flow_example.iarc	continuous flow	ionOS	Continuous Flow data archive
dual_inlet_example.caf	dual inlet	Isodat	Dual Inlet file format (older)
dual_inlet_example.did	dual inlet	Isodat	Dual Inlet file format (newer)
dual_inlet_nu_example.txt	dual inlet	Nu	Dual Inlet file format
background_scan_example.scn	scan	Isodat	Scan file format
full_scan_example.scn	scan	Isodat	Scan file format
peak_shape_scan_example.scn	scan	Isodat	Scan file format
time_scan_example.scn	scan	Isodat	Scan file format

# read scan examples
scan_files <-
  iso_read_scan(
    iso_get_reader_example("peak_shape_scan_example.scn"),
    iso_get_reader_example("background_scan_example.scn"),
    iso_get_reader_example("full_scan_example.scn"),
    iso_get_reader_example("time_scan_example.scn")
  )
#> Info: preparing to read 4 data files (all will be cached)...
#> Info: reading file 'peak_shape_scan_example.scn' with '.scn' reader...
#> Info: reading file 'background_scan_example.scn' with '.scn' reader...
#> Info: reading file 'full_scan_example.scn' with '.scn' reader...
#> Info: reading file 'time_scan_example.scn' with '.scn' reader...
#> Info: finished reading 4 files in 1.21 secs
#> Warning: file creation date could not be accessed for all files because this
#> information is not available on some Linux systems, reporting last modified
#> time for file_datetime instead. To turn these warnings off, call
#> iso_turn_datetime_warnings_off() and reread these files with
#> iso_reread_all_files().
#> Warning: encountered 4 problems.
#> # | FILE                        | PROBLEM | OCCURRED IN                      ...
#> 1 | peak_shape_scan_example.scn | warning | extract_os_file_creation_datetime...
#> 2 | background_scan_example.scn | warning | extract_os_file_creation_datetime...
#> 3 | full_scan_example.scn       | warning | extract_os_file_creation_datetime...
#> 4 | time_scan_example.scn       | warning | extract_os_file_creation_datetime...
#> Use iso_get_problems(...) for more details.
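
Folder paths work the same way as file paths. The following is a minimal sketch rather than part of the original examples: the folder name scan_folder_demo is hypothetical and lives in a temporary directory. It copies two scan examples and one dual inlet example into that folder and reads the folder, which should pick up only the files with a scan extension.

```r
library(isoreader)

# hypothetical scratch folder, used only for this illustration
scan_folder <- file.path(tempdir(), "scan_folder_demo")
dir.create(scan_folder, showWarnings = FALSE)

# copy two bundled scan examples plus one non-scan file into the folder
file.copy(
  c(
    iso_get_reader_example("time_scan_example.scn"),
    iso_get_reader_example("full_scan_example.scn"),
    iso_get_reader_example("dual_inlet_example.did")
  ),
  scan_folder,
  overwrite = TRUE
)

# pass the folder instead of individual files
folder_files <- iso_read_scan(scan_folder)

# file IDs of what was actually read
names(folder_files)
```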

File summary

The scan_files variable now contains a set of isoreader objects, one for each file. Take a look at what information was retrieved from the files using the iso_get_data_summary() function.

scan_files |> iso_get_data_summary() |> knitr::kable()
#> Info: aggregating data summary from 4 data file(s)

file_id	file_path_	file_subpath	raw_data	file_info	method_info
peak_shape_scan_example.scn	peak_shape_scan_example.scn	NA	220 measurements, 3 ions (44,45,46)	7 entries	resistors
background_scan_example.scn	background_scan_example.scn	NA	525 measurements, 7 ions (44,45,46,47,54,48,49)	8 entries	resistors
full_scan_example.scn	full_scan_example.scn	NA	799 measurements, 3 channels (2,4,6)	8 entries	resistors
time_scan_example.scn	time_scan_example.scn	NA	5532 measurements, 2 ions (38,40)	8 entries	resistors

Problems

If any of the files could not be read cleanly, the following functions provide a summary and the full details of all errors and warnings, respectively. The examples here contain no errors (only the file datetime warnings shown above), but if you run into any unexpected file read problems, please file a bug report in the isoreader issue tracker.

scan_files |> iso_get_problems_summary() |> knitr::kable()

file_id	warning	error
background_scan_example.scn	1	0
full_scan_example.scn	1	0
peak_shape_scan_example.scn	1	0
time_scan_example.scn	1	0

scan_files |> iso_get_problems() |> knitr::kable()

file_id	type	func	details
peak_shape_scan_example.scn	warning	extract_os_file_creation_datetime	file creation date cannot be accessed on this Linux system, using last modified time for file_datetime instead
background_scan_example.scn	warning	extract_os_file_creation_datetime	file creation date cannot be accessed on this Linux system, using last modified time for file_datetime instead
full_scan_example.scn	warning	extract_os_file_creation_datetime	file creation date cannot be accessed on this Linux system, using last modified time for file_datetime instead
time_scan_example.scn	warning	extract_os_file_creation_datetime	file creation date cannot be accessed on this Linux system, using last modified time for file_datetime instead
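
Because iso_get_problems() returns a regular data frame, it can be inspected with ordinary R tools. The sketch below uses only base R on the columns shown in the table above (type, func); it tallies problems and isolates errors, which is often the first thing to check after reading a large batch of files.

```r
library(isoreader)

scan_files <- iso_read_scan(
  iso_get_reader_example("peak_shape_scan_example.scn"),
  iso_get_reader_example("time_scan_example.scn")
)

# full problem table: one row per warning or error
problems <- iso_get_problems(scan_files)

# tally problems by type and by the function that reported them
table(problems$type, problems$func)

# keep only the errors (none are expected for the bundled examples)
errors_only <- problems[problems$type == "error", ]
nrow(errors_only)
```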

File Information

Detailed file information can be aggregated for all isofiles using the iso_get_file_info() function which supports the full select syntax of the dplyr package to specify which columns are of interest (by default, all file information is retrieved).

# all file information except the file_root column
scan_files |> iso_get_file_info(select = c(-file_root)) |> knitr::kable()
#> Info: aggregating file info from 4 data file(s), selecting info columns 'c(-file_root)'

file_id	file_path	file_subpath	file_datetime	file_size	type	comment
peak_shape_scan_example.scn	peak_shape_scan_example.scn	NA	2023-07-31 00:35:05	36411	High Voltage	NA
background_scan_example.scn	background_scan_example.scn	NA	2023-07-31 00:35:05	62303	High Voltage	NA
full_scan_example.scn	full_scan_example.scn	NA	2023-07-31 00:35:05	57551	MagnetCurrent	NA
time_scan_example.scn	time_scan_example.scn	NA	2023-07-31 00:35:05	140550	Clock	NA
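
The select parameter can also narrow the result to a few columns and rename them in the same step. A short sketch, assuming the type and file_size columns shown above are present; the new name scan_type is chosen here purely for illustration.

```r
library(isoreader)

scan_files <- iso_read_scan(
  iso_get_reader_example("peak_shape_scan_example.scn"),
  iso_get_reader_example("time_scan_example.scn")
)

# keep only two info columns and rename one on the fly (dplyr select syntax)
scan_info <- scan_files |>
  iso_get_file_info(select = c(scan_type = type, file_size))

scan_info
```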

Select/Rename

File information can also be modified across an entire collection of isofiles using the iso_select_file_info() and iso_rename_file_info() functions:

# select + rename specific file info columns
scan_files2 <- scan_files |>
  iso_select_file_info(-file_root) |>
  iso_rename_file_info(`Date & Time` = file_datetime)
#> Info: selecting/renaming the following file info across 4 data file(s): '-file_root'
#> Info: renaming the following file info across 4 data file(s): 'file_datetime'->'Date & Time'

# fetch all file info
scan_files2 |> iso_get_file_info() |> knitr::kable()
#> Info: aggregating file info from 4 data file(s)

file_id	file_path	file_subpath	Date & Time	file_size	type	comment
peak_shape_scan_example.scn	peak_shape_scan_example.scn	NA	2023-07-31 00:35:05	36411	High Voltage	NA
background_scan_example.scn	background_scan_example.scn	NA	2023-07-31 00:35:05	62303	High Voltage	NA
full_scan_example.scn	full_scan_example.scn	NA	2023-07-31 00:35:05	57551	MagnetCurrent	NA
time_scan_example.scn	time_scan_example.scn	NA	2023-07-31 00:35:05	140550	Clock	NA

Filter

Any collection of isofiles can also be filtered based on the available file information using the function iso_filter_files. This function can operate on any column available in the file information and supports full dplyr syntax.

# keep only files whose scan type is 'High Voltage'
scan_files2 |>
  iso_filter_files(type == "High Voltage") |>
  iso_get_file_info() |>
  knitr::kable()
#> Info: applying file filter, keeping 2 of 4 files
#> Info: aggregating file info from 2 data file(s)

file_id	file_path	file_subpath	Date & Time	file_size	type	comment
peak_shape_scan_example.scn	peak_shape_scan_example.scn	NA	2023-07-31 00:35:05	36411	High Voltage	NA
background_scan_example.scn	background_scan_example.scn	NA	2023-07-31 00:35:05	62303	High Voltage	NA
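
Several conditions can be combined in one call. The sketch below assumes that iso_filter_files() treats multiple conditions like dplyr::filter() does, i.e. that a file must satisfy all of them; the 50000 byte threshold is arbitrary and only serves as an illustration.

```r
library(isoreader)

scan_files <- iso_read_scan(
  iso_get_reader_example("peak_shape_scan_example.scn"),
  iso_get_reader_example("background_scan_example.scn"),
  iso_get_reader_example("full_scan_example.scn"),
  iso_get_reader_example("time_scan_example.scn")
)

# keep voltage and magnet scans that are larger than 50000 bytes
large_scans <- scan_files |>
  iso_filter_files(
    type %in% c("High Voltage", "MagnetCurrent"),
    file_size > 50000
  )

large_scans |> iso_get_file_info(select = c(type, file_size))
```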

Mutate

The file information in any collection of isofiles can also be mutated using the function iso_mutate_file_info. This function can introduce new columns and operate on any existing columns available in the file information (even if they do not exist in all files) and supports full dplyr syntax.

scan_files3 <- scan_files2 |>
  iso_mutate_file_info(
    # introduce new column
    `Run in 2019?` = `Date & Time` > "2019-01-01" & `Date & Time` < "2020-01-01"
  )
#> Info: mutating file info for 4 data file(s)

scan_files3 |>
  iso_get_file_info() |>
  knitr::kable()
#> Info: aggregating file info from 4 data file(s)

file_id	file_path	file_subpath	Date & Time	file_size	type	comment	Run in 2019?
peak_shape_scan_example.scn	peak_shape_scan_example.scn	NA	2023-07-31 00:35:05	36411	High Voltage	NA	FALSE
background_scan_example.scn	background_scan_example.scn	NA	2023-07-31 00:35:05	62303	High Voltage	NA	FALSE
full_scan_example.scn	full_scan_example.scn	NA	2023-07-31 00:35:05	57551	MagnetCurrent	NA	FALSE
time_scan_example.scn	time_scan_example.scn	NA	2023-07-31 00:35:05	140550	Clock	NA	FALSE
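
Mutations are not limited to logical flags. As a further sketch, the following derives a numeric column from an existing one; the column name file_size_kb is made up for this example.

```r
library(isoreader)

scan_files <- iso_read_scan(
  iso_get_reader_example("peak_shape_scan_example.scn"),
  iso_get_reader_example("time_scan_example.scn")
)

# derive the file size in kilobytes from the existing file_size column
scan_files_kb <- scan_files |>
  iso_mutate_file_info(file_size_kb = round(file_size / 1024, 1))

scan_files_kb |> iso_get_file_info(select = c(file_size, file_size_kb))
```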

Resistors

Additionally, some IRMS data files contain resistor information that is useful for downstream calculations such as signal conversion:

scan_files |> iso_get_resistors() |> knitr::kable()
#> Info: aggregating resistors info from 4 data file(s)

file_id	cup	R.Ohm	mass
peak_shape_scan_example.scn	3	3e+08	44
peak_shape_scan_example.scn	4	3e+10	45
peak_shape_scan_example.scn	6	1e+11	46
background_scan_example.scn	1	3e+08	44
background_scan_example.scn	2	3e+10	45
background_scan_example.scn	3	1e+11	46
background_scan_example.scn	4	1e+13	47
background_scan_example.scn	6	1e+13	54
background_scan_example.scn	7	1e+13	48
background_scan_example.scn	8	1e+13	49
full_scan_example.scn	2	3e+08	NA
full_scan_example.scn	4	3e+10	NA
full_scan_example.scn	6	1e+11	NA
time_scan_example.scn	2	3e+08	38
time_scan_example.scn	4	3e+10	40
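
To illustrate why the resistor values matter, the sketch below converts the mass 44 voltage signal of one file into a current by hand using Ohm's law (I = V / R). This is a manual illustration in base R, not the package's own conversion routine; the column name i44.nA is made up here. With V in mV and R in Ohm, V / R is in mA, and 1 mA equals 1e6 nA.

```r
library(isoreader)

peak_scan <- iso_read_scan(
  iso_get_reader_example("peak_shape_scan_example.scn")
)

resistors <- iso_get_resistors(peak_scan)
raw_data <- iso_get_raw_data(peak_scan)

# resistance (in Ohm) of the cup that measures mass 44
R44 <- resistors$R.Ohm[resistors$mass == "44"]

# Ohm's law: mV / Ohm gives mA, times 1e6 gives nA
raw_data$i44.nA <- raw_data$v44.mV / R44 * 1e6

head(raw_data[, c("x", "x_units", "v44.mV", "i44.nA")])
```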

Raw Data

The raw data read from the scan files can be retrieved similarly using the iso_get_raw_data() function. Most data aggregation functions also allow for inclusion of file information using the include_file_info parameter, which functions identically to the select parameter of the iso_get_file_info function discussed earlier.

# get raw data with default selections (all raw data, no additional file info)
scan_files |> iso_get_raw_data() |> head(n=10) |> knitr::kable()
#> Info: aggregating raw data from 4 data file(s)

file_id	step	x	x_units	v44.mV	v45.mV	v46.mV	v47.mV	v54.mV	v48.mV	v49.mV	vC2.mV	vC4.mV	vC6.mV	v38.mV	v40.mV
peak_shape_scan_example.scn	61600	9.399710	KV	-1.650841	0.5759007	0.2701459	NA	NA	NA	NA	NA	NA	NA	NA	NA
peak_shape_scan_example.scn	61603	9.400167	KV	-1.639444	0.5949929	0.3998433	NA	NA	NA	NA	NA	NA	NA	NA	NA
peak_shape_scan_example.scn	61606	9.400624	KV	-1.631846	0.6140866	0.3197281	NA	NA	NA	NA	NA	NA	NA	NA	NA
peak_shape_scan_example.scn	61609	9.401081	KV	-1.612849	0.6293626	0.3006569	NA	NA	NA	NA	NA	NA	NA	NA	NA
peak_shape_scan_example.scn	61612	9.401538	KV	-1.605250	0.6026302	0.3120995	NA	NA	NA	NA	NA	NA	NA	NA	NA
peak_shape_scan_example.scn	61615	9.401995	KV	-1.609050	0.6522784	0.3616900	NA	NA	NA	NA	NA	NA	NA	NA	NA
peak_shape_scan_example.scn	61618	9.402452	KV	-1.586252	0.7158523	0.3578750	NA	NA	NA	NA	NA	NA	NA	NA	NA
peak_shape_scan_example.scn	61621	9.402909	KV	-1.571052	0.7272921	0.2587052	NA	NA	NA	NA	NA	NA	NA	NA	NA
peak_shape_scan_example.scn	61624	9.403366	KV	-1.563451	0.7959418	0.3197281	NA	NA	NA	NA	NA	NA	NA	NA	NA
peak_shape_scan_example.scn	61627	9.403823	KV	-1.555851	0.7730565	0.3197281	NA	NA	NA	NA	NA	NA	NA	NA	NA

# get specific raw data and add some file information
scan_files |>
  iso_get_raw_data(
    # select just the scan variable (x), its units, and two ions
    select = c(x, x_units, v44.mV, v45.mV),
    # include the scan type and rename the column
    include_file_info = c(`Scan Type` = type)
  ) |>
  # look at first few records only
  head(n=10) |> knitr::kable()
#> Info: aggregating raw data from 4 data file(s), selecting data columns 'c(x, x_units, v44.mV, v45.mV)', including file info 'c(`Scan Type` = type)'

file_id	Scan Type	x	x_units	v44.mV	v45.mV
peak_shape_scan_example.scn	High Voltage	9.399710	KV	-1.650841	0.5759007
peak_shape_scan_example.scn	High Voltage	9.400167	KV	-1.639444	0.5949929
peak_shape_scan_example.scn	High Voltage	9.400624	KV	-1.631846	0.6140866
peak_shape_scan_example.scn	High Voltage	9.401081	KV	-1.612849	0.6293626
peak_shape_scan_example.scn	High Voltage	9.401538	KV	-1.605250	0.6026302
peak_shape_scan_example.scn	High Voltage	9.401995	KV	-1.609050	0.6522784
peak_shape_scan_example.scn	High Voltage	9.402452	KV	-1.586252	0.7158523
peak_shape_scan_example.scn	High Voltage	9.402909	KV	-1.571052	0.7272921
peak_shape_scan_example.scn	High Voltage	9.403366	KV	-1.563451	0.7959418
peak_shape_scan_example.scn	High Voltage	9.403823	KV	-1.555851	0.7730565
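
When files with different ions are aggregated together, the combined table contains NA for every column that a given file does not have, as in the first table above. One way to avoid this, sketched here under the assumption that file_id can be used in iso_filter_files() like any other file info column, is to narrow the collection to a single file before aggregating.

```r
library(isoreader)

scan_files <- iso_read_scan(
  iso_get_reader_example("peak_shape_scan_example.scn"),
  iso_get_reader_example("time_scan_example.scn")
)

# aggregate raw data for the time scan only
time_scan_data <- scan_files |>
  iso_filter_files(file_id == "time_scan_example.scn") |>
  iso_get_raw_data()

# only the columns that exist in this file remain
names(time_scan_data)
```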

For expert users: retrieving all data

For users familiar with the nested data frames from the tidyverse (particularly tidyr’s nest and unnest), there is an easy way to retrieve all data from the iso file objects in a single nested data frame:

all_data <- scan_files |> iso_get_all_data()
#> Info: aggregating all data from 4 data file(s)
# not printed out because this data frame is very big
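
A quick way to explore the nested result is to look at its column names and unnest one list column at a time. The sketch below assumes that the nested raw data sits in a list column named raw_data; that name is an assumption, so the code checks for it before unnesting.

```r
library(isoreader)

scan_files <- iso_read_scan(
  iso_get_reader_example("peak_shape_scan_example.scn"),
  iso_get_reader_example("time_scan_example.scn")
)

all_data <- iso_get_all_data(scan_files)

# one row per file; the list columns hold the nested data frames
names(all_data)
nrow(all_data)

# assumption: the raw data is nested in a list column called `raw_data`
if ("raw_data" %in% names(all_data)) {
  raw_long <- tidyr::unnest(all_data[, c("file_id", "raw_data")], raw_data)
  print(head(raw_long))
}
```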

Saving collections

Entire collections of isofiles can be saved for later retrieval using the iso_save function. It stores collections or individual isoreader file objects in the efficient R data storage format .rds (if not specified, the extension .scan.rds is appended automatically). Saved collections can be read back with the same iso_read_scan command used for raw data files.

# export to R data archive
scan_files |> iso_save("scan_files_export.scan.rds")
#> Info: exporting data from 4 iso_files into R Data Storage 'scan_files_export.scan.rds'

# read back the exported R data storage
iso_read_scan("scan_files_export.scan.rds")
#> Info: preparing to read 1 data files (all will be cached)...
#> Info: reading file 'scan_files_export.scan.rds' with '.scan.rds' reader...
#> Info: loaded 4 data files from R Data Storage
#> Info: finished reading 1 files in 0.08 secs
#> Warning: file creation date could not be accessed for all files because this
#> information is not available on some Linux systems, reporting last modified
#> time for file_datetime instead. To turn these warnings off, call
#> iso_turn_datetime_warnings_off() and reread these files with
#> iso_reread_all_files().
#> Warning: encountered 4 problems.
#> # | FILE                        | PROBLEM | OCCURRED IN                      ...
#> 1 | peak_shape_scan_example.scn | warning | extract_os_file_creation_datetime...
#> 2 | background_scan_example.scn | warning | extract_os_file_creation_datetime...
#> 3 | full_scan_example.scn       | warning | extract_os_file_creation_datetime...
#> 4 | time_scan_example.scn       | warning | extract_os_file_creation_datetime...
#> Use iso_get_problems(...) for more details.
#> Data from 4 scan iso files: 
#> # A tibble: 4 × 6
#>   file_id                 file_path_ file_subpath raw_data file_info method_info
#>   <chr>                   <chr>      <chr>        <glue>   <chr>     <chr>      
#> 1 peak_shape_scan_exampl… peak_shap… NA           220 mea… 7 entries resistors  
#> 2 background_scan_exampl… backgroun… NA           525 mea… 8 entries resistors  
#> 3 full_scan_example.scn   full_scan… NA           799 mea… 8 entries resistors  
#> 4 time_scan_example.scn   time_scan… NA           5532 me… 8 entries resistors  
#> 
#> Problem summary:
#> # A tibble: 4 × 3
#>   file_id                     warning error
#>   <chr>                         <int> <int>
#> 1 background_scan_example.scn       1     0
#> 2 full_scan_example.scn             1     0
#> 3 peak_shape_scan_example.scn       1     0
#> 4 time_scan_example.scn             1     0

Data Export

At the moment, isoreader supports export of all data to Excel and the Feather file format (a Python/R cross-over format). Note that both export methods have similar syntax and append the appropriate file extension for each type of export file (.scan.xlsx and .scan.feather, respectively).

# export to excel
scan_files |> iso_export_files_to_excel("scan_files_export")

# data sheets available in the exported data file:
readxl::excel_sheets("scan_files_export.scan.xlsx")

# export to feather
scan_files |> iso_export_files_to_feather("scan_files_export")

# exported feather files (dots escaped so they match literally)
list.files(pattern = "\\.scan\\.feather$")

2023-07-31