On Jul 5, 2010, at 9:04 AM, Kunzler, Andreas wrote:

> Dear list,
>
> I'm looking for a way to count the number of "|" within an object.
> The character "|" is used to separate ids.
>
> Assume a data (d) structure like
>
> Var
> NA
> NA
> NA
> NA
> NA
> 1
> 1|2
> 1|22|45
> 3
> 4b|24789
>
> I need to know the maximum number of ids within one object. In this case 3
> (1|22|45)
>
>
> Does anybody know a better way?
>
> Thanks

Presuming that your column is in a data frame called 'DF', where the 'Var' column is likely imported as a factor:

> DF
        Var
1      <NA>
2      <NA>
3      <NA>
4      <NA>
5      <NA>
6         1
7       1|2
8   1|22|45
9         3
10 4b|24789

> max(sapply(strsplit(as.character(DF$Var), split = "\\|"), length))
[1] 3

The above uses strsplit() to split each line using the "|" as the split character. Since "|" has a special meaning for regular expressions, it needs to be escaped using the double backslash:

> strsplit(as.character(DF$Var), split = "\\|")
[[1]]
[1] NA

[[2]]
[1] NA

[[3]]
[1] NA

[[4]]
[1] NA

[[5]]
[1] NA

[[6]]
[1] "1"

[[7]]
[1] "1" "2"

[[8]]
[1] "1"  "22" "45"

[[9]]
[1] "3"

[[10]]
[1] "4b"    "24789"

Then you just loop through each line getting the length:

> sapply(strsplit(as.character(DF$Var), split = "\\|"), length)
 [1] 1 1 1 1 1 1 2 3 1 2

and of course get the max value.

HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
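
For anyone wanting to reproduce the approach above, here is a minimal self-contained sketch. It assumes the data can be rebuilt as a factor column as shown; the name n_ids is only illustrative and does not come from the thread. It uses fixed = TRUE as an alternative to escaping the "|", and lengths() (available in R 3.2.0 and later) in place of sapply(..., length). Note that strsplit() returns a single NA for a missing value, so NA rows count as 1 unless they are reset to 0, as is done here.

```r
# Rebuild the example data from the original post (assumed to be a factor column).
DF <- data.frame(
  Var = factor(c(NA, NA, NA, NA, NA, "1", "1|2", "1|22|45", "3", "4b|24789"))
)

# Work on character values rather than factor codes.
x <- as.character(DF$Var)

# Split on a literal "|" and count the pieces per element.
# fixed = TRUE treats "|" literally, so no regex escaping is needed.
n_ids <- lengths(strsplit(x, split = "|", fixed = TRUE))

# strsplit() yields a single NA for missing values; treat those rows as 0 ids.
n_ids[is.na(x)] <- 0L

n_ids       # 0 0 0 0 0 1 2 3 1 2
max(n_ids)  # 3
```

Whether NA rows should count as 0 or 1 depends on what you do with the counts afterwards; for the maximum alone it makes no difference as long as at least one row is non-missing.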