This is a convenience function to capture substrings from textual data.
It uses str_match_all internally, but instead of returning everything it always returns a single part of the match, selected by the parameters.
extract_substring( string, pattern, capture_n = 1, capture_bracket = 0, missing = NA_character_ )
string: string to extract substrings from
pattern: regular expression pattern to search for
capture_n: within each string, which match of the pattern should be extracted? For example, if the pattern searches for words, should the first, second, or third word be captured?
capture_bracket: for the selected match, which capture group should be extracted, i.e. which parentheses-enclosed segment of the pattern? By default the whole match is returned (capture_bracket = 0).
missing: the value to use for missing results. A result can be missing because the string has fewer than capture_n matches or because the requested capture group is empty.
A character vector of the same length as string, containing the extracted substrings.
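
The behaviour described above can be approximated with stringr directly. The following is a minimal sketch, not the package's actual source: the name extract_substring_sketch is hypothetical, and treating both NA and "" as an empty capture group is an assumption.

```r
library(stringr)

# Hypothetical re-implementation for illustration only; the real
# extract_substring() may differ in its details.
extract_substring_sketch <- function(string, pattern, capture_n = 1,
                                     capture_bracket = 0,
                                     missing = NA_character_) {
  # str_match_all() returns one character matrix per input string:
  # one row per match, column 1 is the whole match, columns 2.. are the
  # capture groups.
  matches <- str_match_all(string, pattern)
  vapply(matches, function(m) {
    col <- capture_bracket + 1
    # Not enough matches, or no such capture group in the pattern
    if (nrow(m) < capture_n || ncol(m) < col) return(missing)
    value <- m[capture_n, col]
    # Assumption: an NA or empty capture counts as missing
    if (is.na(value) || value == "") missing else value
  }, character(1))
}

# Second word of each string; strings with fewer than two words give NA
words <- extract_substring_sketch(c("apple pie", "kiwi", ""), "\\w+",
                                  capture_n = 2)
# words is c("pie", NA, NA)

# Second capture group of the first match
pair <- extract_substring_sketch("key=value", "(\\w+)=(\\w+)",
                                 capture_bracket = 2)
# pair is "value"
```

The sketch selects a row (capture_n) and a column (capture_bracket + 1) from each match matrix, which is why the result always has the same length as string.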