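This seems to do the job, but there are probably more elegant solutions:

  # Mark each "+ " with a temporary "@" (assumes no "@" in the data),
  # split on the marker, then drop the leading space left on each piece.
  f <- function(s) {
    sub("^ ", "", unlist(strsplit(gsub("\\+ ", "+@ ", s), "@")))
  }

  # Same idea for "- ".
  g <- function(s) {
    sub("^ ", "", unlist(strsplit(gsub("- ", "-@ ", s), "@")))
  }

  # Split after "+" first, then split each piece after "-".
  h <- function(s) {
    g(f(s))
  }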
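One possibly more compact alternative is a single strsplit() call with a lookbehind, so that only the whitespace following a delimiter is consumed and the delimiter stays attached to the piece on its left. The following is a minimal sketch rather than a tested general solution: the helper name split_keep is made up for illustration, and it assumes the delimiters are the ASCII characters "+" and "-" followed by whitespace.

```r
# Hypothetical helper (not from the thread): split on the whitespace that
# directly follows a "+" or "-", so the delimiter stays with the left piece.
split_keep <- function(x) {
  # (?<=[+-]) is a lookbehind: it requires a preceding "+" or "-"
  # without making that character part of the split match.
  strsplit(x, "(?<=[+-])\\s+", perl = TRUE)
}

x1 <- "leucocyten + gramnegatieve staven +++ grampositieve staven ++"
x2 <- "leucocyten - grampositieve coccen +"

# strsplit() is vectorised, so this returns a list with one character
# vector per input string.
res <- split_keep(c(x1, x2))
print(res)
```

Because the split pattern matches at least one whitespace character, this avoids the zero-length matches that a bare lookahead produces in strsplit().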
To try h() out:

  s <- "leucocyten + gramnegatieve staven +++ grampositieve staven ++"
  t <- "leucocyten - grampositieve coccen +"
  h(s)
  h(t)

Note that g() matches only the ASCII hyphen "-", so t above uses a plain hyphen. If the data really contain an en dash ("–"), as displayed in the quoted message, the pattern in g() would need to be adjusted to match it.

HTH,
Eric

On Wed, Apr 12, 2023 at 7:56 PM Emily Bakker <emilybak...@outlook.com> wrote:

> Hello List,
>
> I have a dataset consisting of strings that I want to split while saving
> the delimiter.
>
> Some example data:
> "leucocyten + gramnegatieve staven +++ grampositieve staven ++"
> "leucocyten – grampositieve coccen +"
>
> I want to split the strings such that I get the following result:
> c("leucocyten +", "gramnegatieve staven +++", "grampositieve staven ++")
> c("leucocyten –", "grampositieve coccen +")
>
> I have tried strsplit with a regular expression with a positive lookahead,
> but I am not able to achieve the results that I want.
>
> I have tried:
> as.list(strsplit(x, split = "(?=[\\+-]{1,3}\\s)+", perl=TRUE))
>
> Which results in:
> c("leucocyten ", "+", "gramnegatieve staven ", "+", "+", "+",
> "grampositieve staven ++")
> c("leucocyten ", "–", "grampositieve coccen +")
>
> Is there a function or regular expression that will make this possible?
>
> Kind regards,
> Emily
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.