Hi list,

I'm currently processing textual data and I would really appreciate some help with one of my problems.
I have a set of strings and I want to count how often each of these strings appears in the set. This is not very difficult and can be done as:

  TB <- table(my_set)
  plot(TB)

However, I also want to collapse across sub-strings. That is, I want a sub-string ss of a string S to be counted as an occurrence of S. So 'abab' should be included in the count of 'ababaaa' and should not be listed as a separate entry in the frequency table.

Does somebody have a pointer to a way to do this? I have been checking out the CRAN packages for handling DNA sequences, but this has not really brought me closer to a solution.

Thanks,
Dieter Vanderelst

------------------------------------------
Dieter Vanderelst
Eindhoven University of Technology
Faculty of Industrial Design
Designed Intelligence Group
Den Dolech 2
5612 AZ Eindhoven
The Netherlands
Tel +31 40 247 91 11

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
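To make the intended counting concrete, here is a minimal base-R sketch of one naive way the collapsing could work. It is only an illustration under stated assumptions, not a tested solution: the function name collapse_substring_counts is hypothetical, matching is literal (fixed = TRUE, no regular expressions), and a string that is contained in several longer strings is counted toward each of them, which may or may not be the desired behaviour. It compares every pair of unique strings, so it would only suit moderately sized sets.

```r
# Hypothetical helper (not from any package): fold the counts of strings that
# are sub-strings of other strings into the counts of the containing strings.
collapse_substring_counts <- function(my_set) {
  tb     <- table(my_set)      # plain frequency table, as with table(my_set)
  u      <- names(tb)          # unique strings
  counts <- as.vector(tb)      # their frequencies
  n      <- length(u)

  # contains[i, j] is TRUE when u[j] occurs literally inside u[i]
  contains <- matrix(
    vapply(u, function(s) grepl(s, u, fixed = TRUE), logical(n)),
    nrow = n
  )

  # A string is "maximal" if no *other* unique string contains it
  is_sub  <- vapply(seq_len(n), function(j) any(contains[-j, j]), logical(1))
  maximal <- !is_sub

  # Assumption: a sub-string adds its count to every string that contains it
  collapsed <- vapply(seq_len(n),
                      function(i) as.numeric(sum(counts[contains[i, ]])),
                      numeric(1))
  names(collapsed) <- u
  collapsed[maximal]
}

# Example data (made up for illustration)
my_set <- c("ababaaa", "abab", "abab", "xyz", "ababaaa", "yz")
TB <- collapse_substring_counts(my_set)
print(TB)
```

With this made-up input, 'abab' and 'yz' no longer appear as separate entries; their occurrences are added to 'ababaaa' and 'xyz' respectively. The result is a named numeric vector, so barplot(TB) could be used in place of plot(TB) on a table.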