Hello, I am wondering about the behaviour of strsplit. When the pattern matches the beginning of the search string, the empty string is added to the result, but that is not the case when the pattern matches the end of the search string:

    strsplit(" hello dolly ", " ")
    [[1]]
    [1] ""      "hello" "dolly"

The man page for strsplit explains the algorithm:

  "The algorithm applied to each input string is

      repeat {
          if the string is empty
              break.
          if there is a match
              add the string to the left of the match to the output.
              remove the match and all to the left of it.
          else
              add the string to the output.
              break.
      }

  Note that this means that if there is a match at the beginning of a (non-empty) string, the first element of the output is '""', but if there is a match at the end of the string, the output is the same as with the match removed."

I do not see how this algorithm specifies that there should be no empty string at the end of the output when the pattern matches the end of the input string. If the pattern matches (the second "if" above), the string to the left of the match is added to the output, and the match and everything to its left are removed from the input, which after this step is the empty string. In the next step there is no match (the "else" above), so the rest of the input string (= the empty string) *should* be added, but that is not what happens.

I think that the implementation of the algorithm (and the explanation that "if there is a match at the end of the string, the output is the same as with the match removed") is both unintuitive (I see no good reason for including the empty string at the beginning of the output but not at the end; no other language I know does it this way) and actually wrong with respect to the algorithm.

Any opinion? What were the grounds for this design?
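
For anyone who wants to check this, here is a minimal sketch that transcribes the quoted steps literally into R, so that its result can be compared with what strsplit returns. split_by_doc is a hypothetical helper name (it is not part of R), and the sketch assumes a single input string and a fixed, non-empty separator rather than a regular expression.

```r
# Literal transcription of the algorithm quoted from the strsplit man page.
# Assumptions: 'x' is one string, 'sep' is a fixed, non-empty separator.
# split_by_doc is a hypothetical name used only for this illustration.
split_by_doc <- function(x, sep) {
  out <- character(0)
  repeat {
    # "if the string is empty break."
    if (nchar(x) == 0) break
    # Position of the first match, or -1 if there is none.
    pos <- regexpr(sep, x, fixed = TRUE)[1]
    if (pos > 0) {
      # "add the string to the left of the match to the output."
      out <- c(out, substr(x, 1, pos - 1))
      # "remove the match and all to the left of it."
      x <- substr(x, pos + nchar(sep), nchar(x))
    } else {
      # "add the string to the output. break."
      out <- c(out, x)
      break
    }
  }
  out
}

# Compare the transcription with the built-in function.
print(split_by_doc(" hello dolly ", " "))
print(strsplit(" hello dolly ", " ", fixed = TRUE)[[1]])
```

Comparing the two printed vectors shows whether the trailing empty string is dropped by the documented steps themselves (note the initial "if the string is empty" check) or only by the implementation.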

vQ

-- 
Wacek Kusnierczyk, MD PhD
Department of Computer and Information Science (IDI)
Norwegian University of Science and Technology (NTNU)