On Tue, 17 Sep 2019 08:48:43 +0200 Ivan Calandra <calan...@rgzm.de> wrote:
> CSVs <- list.files(path=..., pattern="\\.csv$")
> w.files <- CSVs[grep(pattern="_w_", CSVs)]
>
> Of course, what I would like to do is list only the interesting files
> from the beginning, rather than subsetting the whole list of files.

One way to express that would be "_w_.*\\.csv$", meaning that the
filename has to have "_w_" in it, followed by anything (any character
repeated any number of times, including 0), followed by ".csv" at the
end of the file name.

> 2) The units of the variables are given in the original headers. I
> would like to extract the units. This is what I did:
>
> headers <- c("dist to origin on curve [mm]","segment on section [mm]",
>              "angle 1 [degree]", "angle 2 [degree]","angle 3 [degree]")
> units.var <- gsub(pattern="^.*\\[|\\]$", "", headers)
>
> It seems to be overly complicated using gsub(). Isn't there a way
> to extract what is interesting rather than deleting what is not?

Pure-R way: use regmatches() + regexpr(). Both regmatches() and
regexpr() take the character vector as an argument, so duplication is
hard to avoid:

    units <- regmatches(headers, regexpr('\\[.*\\]', headers))

The stringr package has an str_match() function with a nicer interface:

    str_match(headers, '\\[.*\\]') -> units

(str_match() returns a character matrix; str_extract() takes the same
arguments and returns a plain character vector.) Note that, unlike your
gsub() call, both of these keep the surrounding brackets in the result.

Such "greedy" patterns containing ".*" present a few pitfalls, e.g.
looking for text in parentheses using the pattern "\\(.*\\)" in
"...(abc)...(def)..." will match the whole "(abc)...(def)" instead of
the single groups "(abc)" and "(def)", but with your examples the
pattern should work as presented. One other option would be to ask for
"[", followed by zero or more characters that are not "]", followed by
"]": '\\[[^]]*\\]'.

-- 
Best regards,
Ivan
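P.S. A minimal base-R sketch that ties the patterns above together. The file names in `files` are hypothetical, made up only so that the example runs without touching the disk; the same regular expression would go into the `pattern` argument of list.files(). The extra gsub() step that strips the brackets is my own addition, not something required by regmatches().

```r
# Hypothetical file names, used only for illustration.
files <- c("run1_w_a.csv", "run1_x_a.csv", "run2_w_b.csv", "notes_w_.txt")

# Same regular expression that would be passed as list.files(pattern = ...):
# "_w_" somewhere in the name, and ".csv" at the very end.
w.files <- grep("_w_.*\\.csv$", files, value = TRUE)

headers <- c("dist to origin on curve [mm]", "segment on section [mm]",
             "angle 1 [degree]", "angle 2 [degree]", "angle 3 [degree]")

# Extract the bracketed part with the non-greedy-by-construction pattern,
# then strip the leading "[" and trailing "]".
bracketed <- regmatches(headers, regexpr("\\[[^]]*\\]", headers))
units.var <- gsub("^\\[|\\]$", "", bracketed)

# The greedy pitfall: ".*" runs to the last ")" in the string, while a
# negated character class stops at the first one.
x <- "...(abc)...(def)..."
greedy <- regmatches(x, regexpr("\\(.*\\)", x))
groups <- regmatches(x, gregexpr("\\([^)]*\\)", x))[[1]]
```

One thing to keep in mind: regmatches() combined with regexpr() returns only the matched substrings, so a header without any "[...]" part would be dropped and the result would be shorter than `headers`.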