Isoreader supports several IRMS data formats. This vignette shows some of the functionality for scan data files. For additional information on more general operations (caching, combining read files, data export, etc.), please consult the operations vignette. For details on downstream data processing and visualization, see the isoprocessor package.
Note: this vignette is still a work in progress.
Reading scan files is as simple as passing one or multiple file or folder paths to the iso_read_scan() function. If folders are provided, any files within those folders that have a recognized scan file extension will be processed (e.g. all .scn files).
Here we read several files that are bundled with the package as examples (and whose paths can be retrieved using the iso_get_reader_example() function).
# all available examples
iso_get_reader_examples() |> knitr::kable()
filename | type | software | description |
---|---|---|---|
continuous_flow_example.cf | continuous flow | Isodat | Continuous Flow file format (older) |
continuous_flow_example.dxf | continuous flow | Isodat | Continuous Flow file format (newer) |
continuous_flow_example.iarc | continuous flow | ionOS | Continuous Flow data archive |
dual_inlet_example.caf | dual inlet | Isodat | Dual Inlet file format (older) |
dual_inlet_example.did | dual inlet | Isodat | Dual Inlet file format (newer) |
dual_inlet_nu_example.txt | dual inlet | Nu | Dual Inlet file format |
background_scan_example.scn | scan | Isodat | Scan file format |
full_scan_example.scn | scan | Isodat | Scan file format |
peak_shape_scan_example.scn | scan | Isodat | Scan file format |
time_scan_example.scn | scan | Isodat | Scan file format |
# read scan examples
scan_files <-
  iso_read_scan(
    iso_get_reader_example("peak_shape_scan_example.scn"),
    iso_get_reader_example("background_scan_example.scn"),
    iso_get_reader_example("full_scan_example.scn"),
    iso_get_reader_example("time_scan_example.scn")
  )
#> Info: preparing to read 4 data files (all will be cached)...
#> Info: reading file 'peak_shape_scan_example.scn' with '.scn' reader...
#> Info: reading file 'background_scan_example.scn' with '.scn' reader...
#> Info: reading file 'full_scan_example.scn' with '.scn' reader...
#> Info: reading file 'time_scan_example.scn' with '.scn' reader...
#> Info: finished reading 4 files in 1.21 secs
#> Warning: file creation date could not be accessed for all files because this
#> information is not available on some Linux systems, reporting last modified
#> time for file_datetime instead. To turn these warnings off, call
#> iso_turn_datetime_warnings_off() and reread these files with
#> iso_reread_all_files().
#> Warning: encountered 4 problems.
#> # | FILE | PROBLEM | OCCURRED IN ...
#> 1 | peak_shape_scan_example.scn | warning | extract_os_file_creation_datetime...
#> 2 | background_scan_example.scn | warning | extract_os_file_creation_datetime...
#> 3 | full_scan_example.scn | warning | extract_os_file_creation_datetime...
#> 4 | time_scan_example.scn | warning | extract_os_file_creation_datetime...
#> Use iso_get_problems(...) for more details.
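Folder paths work the same way as file paths. The following is a minimal sketch, assuming the bundled example files all sit in one package folder, which is derived here with dirname() from the path of a known example. The variable names examples_folder and folder_scans are illustrative only.

```r
library(isoreader)

# folder that holds the bundled example files
# (assumption: derived from the path of one known example file)
examples_folder <- dirname(iso_get_reader_example("full_scan_example.scn"))

# passing the folder reads every file in it that has a recognized scan extension
folder_scans <- iso_read_scan(examples_folder)

# one summary row per scan file found in the folder
folder_scans |> iso_get_data_summary()
```

Files with other extensions in the same folder (e.g. the continuous flow and dual inlet examples) should simply be skipped by the scan reader.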
The scan_files variable now contains a set of isoreader objects, one for each file. Take a look at what information was retrieved from the files using the iso_get_data_summary() function.
scan_files |> iso_get_data_summary() |> knitr::kable()
#> Info: aggregating data summary from 4 data file(s)
file_id | file_path_ | file_subpath | raw_data | file_info | method_info |
---|---|---|---|---|---|
peak_shape_scan_example.scn | peak_shape_scan_example.scn | NA | 220 measurements, 3 ions (44,45,46) | 7 entries | resistors |
background_scan_example.scn | background_scan_example.scn | NA | 525 measurements, 7 ions (44,45,46,47,54,48,49) | 8 entries | resistors |
full_scan_example.scn | full_scan_example.scn | NA | 799 measurements, 3 channels (2,4,6) | 8 entries | resistors |
time_scan_example.scn | time_scan_example.scn | NA | 5532 measurements, 2 ions (38,40) | 8 entries | resistors |
In case there was any trouble with reading any of the files, the following functions provide an overview summary as well as details of all errors and warnings, respectively. The examples here contain no errors but if you run into any unexpected file read problems, please file a bug report in the isoreader issue tracker.
scan_files |> iso_get_problems_summary() |> knitr::kable()
file_id | warning | error |
---|---|---|
background_scan_example.scn | 1 | 0 |
full_scan_example.scn | 1 | 0 |
peak_shape_scan_example.scn | 1 | 0 |
time_scan_example.scn | 1 | 0 |
scan_files |> iso_get_problems() |> knitr::kable()
file_id | type | func | details |
---|---|---|---|
peak_shape_scan_example.scn | warning | extract_os_file_creation_datetime | file creation date cannot be accessed on this Linux system, using last modified time for file_datetime instead |
background_scan_example.scn | warning | extract_os_file_creation_datetime | file creation date cannot be accessed on this Linux system, using last modified time for file_datetime instead |
full_scan_example.scn | warning | extract_os_file_creation_datetime | file creation date cannot be accessed on this Linux system, using last modified time for file_datetime instead |
time_scan_example.scn | warning | extract_os_file_creation_datetime | file creation date cannot be accessed on this Linux system, using last modified time for file_datetime instead |
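The datetime warnings listed here can be silenced the way the warning message itself suggests. This sketch reads two of the bundled examples, turns the warnings off, and rereads the files. Whether any problems remain afterwards depends on the operating system, so no output is shown.

```r
library(isoreader)

# read two of the bundled example files
scan_files <- iso_read_scan(
  iso_get_reader_example("peak_shape_scan_example.scn"),
  iso_get_reader_example("time_scan_example.scn")
)

# suppress the file creation datetime warnings for subsequent reads
iso_turn_datetime_warnings_off()

# reread the files so the stored problems are refreshed
scan_files <- scan_files |> iso_reread_all_files()

# check whether any problems are still reported
scan_files |> iso_get_problems_summary()
```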
Detailed file information can be aggregated for all isofiles using the iso_get_file_info() function, which supports the full select syntax of the dplyr package to specify which columns are of interest (by default, all file information is retrieved).
# all file information
scan_files |> iso_get_file_info(select = c(-file_root)) |> knitr::kable()
#> Info: aggregating file info from 4 data file(s), selecting info columns 'c(-file_root)'
file_id | file_path | file_subpath | file_datetime | file_size | type | comment |
---|---|---|---|---|---|---|
peak_shape_scan_example.scn | peak_shape_scan_example.scn | NA | 2023-07-31 00:35:05 | 36411 | High Voltage | NA |
background_scan_example.scn | background_scan_example.scn | NA | 2023-07-31 00:35:05 | 62303 | High Voltage | NA |
full_scan_example.scn | full_scan_example.scn | NA | 2023-07-31 00:35:05 | 57551 | MagnetCurrent | NA |
time_scan_example.scn | time_scan_example.scn | NA | 2023-07-31 00:35:05 | 140550 | Clock | NA |
File information can also be modified across an entire collection of isofiles using the iso_select_file_info() and iso_rename_file_info() functions:
# select + rename specific file info columns
scan_files2 <- scan_files |>
  iso_select_file_info(-file_root) |>
  iso_rename_file_info(`Date & Time` = file_datetime)
#> Info: selecting/renaming the following file info across 4 data file(s): '-file_root'
#> Info: renaming the following file info across 4 data file(s): 'file_datetime'->'Date & Time'
# fetch all file info
scan_files2 |> iso_get_file_info() |> knitr::kable()
#> Info: aggregating file info from 4 data file(s)
file_id | file_path | file_subpath | Date & Time | file_size | type | comment |
---|---|---|---|---|---|---|
peak_shape_scan_example.scn | peak_shape_scan_example.scn | NA | 2023-07-31 00:35:05 | 36411 | High Voltage | NA |
background_scan_example.scn | background_scan_example.scn | NA | 2023-07-31 00:35:05 | 62303 | High Voltage | NA |
full_scan_example.scn | full_scan_example.scn | NA | 2023-07-31 00:35:05 | 57551 | MagnetCurrent | NA |
time_scan_example.scn | time_scan_example.scn | NA | 2023-07-31 00:35:05 | 140550 | Clock | NA |
Any collection of isofiles can also be filtered based on the available file information using the function iso_filter_files(). This function can operate on any column available in the file information and supports full dplyr syntax.
# keep only the files whose scan type is 'High Voltage'
scan_files2 |>
  iso_filter_files(type == "High Voltage") |>
  iso_get_file_info() |>
  knitr::kable()
#> Info: applying file filter, keeping 2 of 4 files
#> Info: aggregating file info from 2 data file(s)
file_id | file_path | file_subpath | Date & Time | file_size | type | comment |
---|---|---|---|---|---|---|
peak_shape_scan_example.scn | peak_shape_scan_example.scn | NA | 2023-07-31 00:35:05 | 36411 | High Voltage | NA |
background_scan_example.scn | background_scan_example.scn | NA | 2023-07-31 00:35:05 | 62303 | High Voltage | NA |
The file information in any collection of isofiles can also be mutated using the function iso_mutate_file_info(). This function can introduce new columns and operate on any existing columns available in the file information (even if they do not exist in all files) and supports full dplyr syntax.
scan_files3 <- scan_files2 |>
  iso_mutate_file_info(
    # introduce new column
    `Run in 2019?` = `Date & Time` > "2019-01-01" & `Date & Time` < "2020-01-01"
  )
#> Info: mutating file info for 4 data file(s)
scan_files3 |>
  iso_get_file_info() |>
  knitr::kable()
#> Info: aggregating file info from 4 data file(s)
file_id | file_path | file_subpath | Date & Time | file_size | type | comment | Run in 2019? |
---|---|---|---|---|---|---|---|
peak_shape_scan_example.scn | peak_shape_scan_example.scn | NA | 2023-07-31 00:35:05 | 36411 | High Voltage | NA | FALSE |
background_scan_example.scn | background_scan_example.scn | NA | 2023-07-31 00:35:05 | 62303 | High Voltage | NA | FALSE |
full_scan_example.scn | full_scan_example.scn | NA | 2023-07-31 00:35:05 | 57551 | MagnetCurrent | NA | FALSE |
time_scan_example.scn | time_scan_example.scn | NA | 2023-07-31 00:35:05 | 140550 | Clock | NA | FALSE |
Additionally, some IRMS data files contain resistor information that is useful for downstream calculations (see e.g. section on signal conversion later in this vignette):
scan_files |> iso_get_resistors() |> knitr::kable()
#> Info: aggregating resistors info from 4 data file(s)
file_id | cup | R.Ohm | mass |
---|---|---|---|
peak_shape_scan_example.scn | 3 | 3e+08 | 44 |
peak_shape_scan_example.scn | 4 | 3e+10 | 45 |
peak_shape_scan_example.scn | 6 | 1e+11 | 46 |
background_scan_example.scn | 1 | 3e+08 | 44 |
background_scan_example.scn | 2 | 3e+10 | 45 |
background_scan_example.scn | 3 | 1e+11 | 46 |
background_scan_example.scn | 4 | 1e+13 | 47 |
background_scan_example.scn | 6 | 1e+13 | 54 |
background_scan_example.scn | 7 | 1e+13 | 48 |
background_scan_example.scn | 8 | 1e+13 | 49 |
full_scan_example.scn | 2 | 3e+08 | NA |
full_scan_example.scn | 4 | 3e+10 | NA |
full_scan_example.scn | 6 | 1e+11 | NA |
time_scan_example.scn | 2 | 3e+08 | 38 |
time_scan_example.scn | 4 | 3e+10 | 40 |
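To illustrate why the resistor values matter, the following sketch converts a few voltage signals into currents with Ohm's law (I = V / R). It is plain base R arithmetic under stated assumptions, not the package's own conversion routine: the resistor values are copied from the peak_shape_scan_example.scn rows above, the signals are the first raw data record of that file, and the data frame and column names (signals, currents, signal_mV, current_pA) are made up for this example.

```r
# resistor values for peak_shape_scan_example.scn (copied from the table above)
resistors <- data.frame(
  mass  = c(44, 45, 46),
  R.Ohm = c(3e8, 3e10, 1e11)
)

# example signals in mV (first raw data record of the same file)
signals <- data.frame(
  mass      = c(44, 45, 46),
  signal_mV = c(-1.650841, 0.5759007, 0.2701459)
)

# match each signal with its resistor by mass
currents <- merge(signals, resistors, by = "mass")

# Ohm's law: I = V / R, with mV converted to V and A converted to pA
currents$current_pA <- currents$signal_mV * 1e-3 / currents$R.Ohm * 1e12

currents
```

Because the three cups use resistors that differ by orders of magnitude, signals of similar voltage correspond to very different ion currents.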
The raw data read from the scan files can be retrieved similarly using the iso_get_raw_data() function. Most data aggregation functions also allow for inclusion of file information using the include_file_info parameter, which functions identically to the select parameter of the iso_get_file_info() function discussed earlier.
# get raw data with default selections (all raw data, no additional file info)
scan_files |> iso_get_raw_data() |> head(n=10) |> knitr::kable()
#> Info: aggregating raw data from 4 data file(s)
file_id | step | x | x_units | v44.mV | v45.mV | v46.mV | v47.mV | v54.mV | v48.mV | v49.mV | vC2.mV | vC4.mV | vC6.mV | v38.mV | v40.mV |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
peak_shape_scan_example.scn | 61600 | 9.399710 | KV | -1.650841 | 0.5759007 | 0.2701459 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
peak_shape_scan_example.scn | 61603 | 9.400167 | KV | -1.639444 | 0.5949929 | 0.3998433 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
peak_shape_scan_example.scn | 61606 | 9.400624 | KV | -1.631846 | 0.6140866 | 0.3197281 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
peak_shape_scan_example.scn | 61609 | 9.401081 | KV | -1.612849 | 0.6293626 | 0.3006569 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
peak_shape_scan_example.scn | 61612 | 9.401538 | KV | -1.605250 | 0.6026302 | 0.3120995 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
peak_shape_scan_example.scn | 61615 | 9.401995 | KV | -1.609050 | 0.6522784 | 0.3616900 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
peak_shape_scan_example.scn | 61618 | 9.402452 | KV | -1.586252 | 0.7158523 | 0.3578750 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
peak_shape_scan_example.scn | 61621 | 9.402909 | KV | -1.571052 | 0.7272921 | 0.2587052 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
peak_shape_scan_example.scn | 61624 | 9.403366 | KV | -1.563451 | 0.7959418 | 0.3197281 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
peak_shape_scan_example.scn | 61627 | 9.403823 | KV | -1.555851 | 0.7730565 | 0.3197281 | NA | NA | NA | NA | NA | NA | NA | NA | NA |
# get specific raw data and add some file information
scan_files |>
  iso_get_raw_data(
    # select just the scan variable (x), its units, and two of the ions
    select = c(x, x_units, v44.mV, v45.mV),
    # include the scan type and rename the column
    include_file_info = c(`Scan Type` = type)
  ) |>
  # look at first few records only
  head(n = 10) |> knitr::kable()
#> Info: aggregating raw data from 4 data file(s), selecting data columns 'c(x, x_units, v44.mV, v45.mV)', including file info 'c(`Scan Type` = type)'
file_id | Scan Type | x | x_units | v44.mV | v45.mV |
---|---|---|---|---|---|
peak_shape_scan_example.scn | High Voltage | 9.399710 | KV | -1.650841 | 0.5759007 |
peak_shape_scan_example.scn | High Voltage | 9.400167 | KV | -1.639444 | 0.5949929 |
peak_shape_scan_example.scn | High Voltage | 9.400624 | KV | -1.631846 | 0.6140866 |
peak_shape_scan_example.scn | High Voltage | 9.401081 | KV | -1.612849 | 0.6293626 |
peak_shape_scan_example.scn | High Voltage | 9.401538 | KV | -1.605250 | 0.6026302 |
peak_shape_scan_example.scn | High Voltage | 9.401995 | KV | -1.609050 | 0.6522784 |
peak_shape_scan_example.scn | High Voltage | 9.402452 | KV | -1.586252 | 0.7158523 |
peak_shape_scan_example.scn | High Voltage | 9.402909 | KV | -1.571052 | 0.7272921 |
peak_shape_scan_example.scn | High Voltage | 9.403366 | KV | -1.563451 | 0.7959418 |
peak_shape_scan_example.scn | High Voltage | 9.403823 | KV | -1.555851 | 0.7730565 |
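Because every file contributes its own signal columns, the combined raw data table is wide and mostly NA for columns that a given file does not have. One way to deal with this is to reshape it into long format with tidyr. The sketch below uses a small hand-built table (values copied from the first two records above, plus one all-NA column) rather than the real output, and the names raw_wide, raw_long, signal and value_mV are illustrative.

```r
library(dplyr)
library(tidyr)

# toy stand-in for the wide raw data table
raw_wide <- tibble(
  file_id = c("peak_shape_scan_example.scn", "peak_shape_scan_example.scn"),
  x       = c(9.399710, 9.400167),
  x_units = "KV",
  v44.mV  = c(-1.650841, -1.639444),
  v45.mV  = c(0.5759007, 0.5949929),
  v46.mV  = c(0.2701459, 0.3998433),
  v38.mV  = NA_real_  # signal column that belongs to a different file
)

# one row per measurement and signal, dropping signals the file does not have
raw_long <- raw_wide |>
  pivot_longer(
    cols = matches("^v.*\\.mV$"),
    names_to = "signal",
    values_to = "value_mV",
    values_drop_na = TRUE
  )

raw_long
```

The same pivot_longer() call should work on the data frame returned by iso_get_raw_data(), since it only relies on the signal columns ending in .mV.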
For users familiar with the nested data frames from the tidyverse (particularly tidyr’s nest and unnest), there is an easy way to retrieve all data from the iso file objects in a single nested data frame:
all_data <- scan_files |> iso_get_all_data()
#> Info: aggregating all data from 4 data file(s)
# not printed out because this data frame is very big
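For readers less familiar with nested data frames, this sketch shows the nest and unnest round trip on a toy table. The file names, column names and values are hypothetical and are not taken from the example files. The column names in the actual all_data object are not listed in this vignette, so inspect them with names(all_data) before unnesting.

```r
library(dplyr)
library(tidyr)

# toy flat table with hypothetical file ids and measurements
flat <- tibble(
  file_id   = c("a.scn", "a.scn", "b.scn"),
  x         = c(1, 2, 1),
  signal_mV = c(10, 12, 7)
)

# nest: one row per file, measurements stored in a list column
nested <- flat |> nest(raw_data = c(x, signal_mV))

# unnest: expand the list column back into regular rows
roundtrip <- nested |> unnest(raw_data)

nested
roundtrip
```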
Saving entire collections of isofiles for retrieval at a later point is easily done using the iso_save() function, which stores collections or individual isoreader file objects in the efficient R data storage format .rds (if not specified, the extension .scan.rds will be automatically appended). These saved collections can be conveniently read back using the same iso_read_scan() command used for raw data files.
# export to R data archive
scan_files |> iso_save("scan_files_export.scan.rds")
#> Info: exporting data from 4 iso_files into R Data Storage 'scan_files_export.scan.rds'
# read back the exported R data storage
iso_read_scan("scan_files_export.scan.rds")
#> Info: preparing to read 1 data files (all will be cached)...
#> Info: reading file 'scan_files_export.scan.rds' with '.scan.rds' reader...
#> Info: loaded 4 data files from R Data Storage
#> Info: finished reading 1 files in 0.08 secs
#> Warning: file creation date could not be accessed for all files because this
#> information is not available on some Linux systems, reporting last modified
#> time for file_datetime instead. To turn these warnings off, call
#> iso_turn_datetime_warnings_off() and reread these files with
#> iso_reread_all_files().
#> Warning: encountered 4 problems.
#> # | FILE | PROBLEM | OCCURRED IN ...
#> 1 | peak_shape_scan_example.scn | warning | extract_os_file_creation_datetime...
#> 2 | background_scan_example.scn | warning | extract_os_file_creation_datetime...
#> 3 | full_scan_example.scn | warning | extract_os_file_creation_datetime...
#> 4 | time_scan_example.scn | warning | extract_os_file_creation_datetime...
#> Use iso_get_problems(...) for more details.
#> Data from 4 scan iso files:
#> # A tibble: 4 × 6
#> file_id file_path_ file_subpath raw_data file_info method_info
#> <chr> <chr> <chr> <glue> <chr> <chr>
#> 1 peak_shape_scan_exampl… peak_shap… NA 220 mea… 7 entries resistors
#> 2 background_scan_exampl… backgroun… NA 525 mea… 8 entries resistors
#> 3 full_scan_example.scn full_scan… NA 799 mea… 8 entries resistors
#> 4 time_scan_example.scn time_scan… NA 5532 me… 8 entries resistors
#>
#> Problem summary:
#> # A tibble: 4 × 3
#> file_id warning error
#> <chr> <int> <int>
#> 1 background_scan_example.scn 1 0
#> 2 full_scan_example.scn 1 0
#> 3 peak_shape_scan_example.scn 1 0
#> 4 time_scan_example.scn 1 0
At the moment, isoreader supports export of all data to Excel and the Feather file format (a Python/R cross-over format). Note that both export methods have similar syntax and append the appropriate file extension for each type of export file (.scan.xlsx and .scan.feather, respectively).
# export to excel
scan_files |> iso_export_files_to_excel("scan_files_export")
# data sheets available in the exported data file:
readxl::excel_sheets("scan_files_export.scan.xlsx")
# export to feather
scan_files |> iso_export_files_to_feather("scan_files_export")
# exported feather files
list.files(pattern = "\\.scan\\.feather$")