Oh, wait a second. I misread your original post. Please ignore my truly incorrect suggestion.
-- Bert

On Fri, Mar 1, 2024 at 7:57 AM Bert Gunter <bgunter.4...@gmail.com> wrote:
>
> Here's another *incorrect* way to do it -- incorrect because it will
> not always work, unlike Iris's correct solution. But it does not
> require Perl-type matching. The idea: separate the two vowels in the
> regex by a character that you know cannot appear (if there is one)
> and match it optionally, e.g. with the '*' repetition specifier. I used
> "?" for the optional character below (which must be escaped).
>
> > gsub("([aeiouAEIOU])\\?*([aeiouAEIOU])", "\\1_\\2", "aerioue")
> [1] "a_eri_ou_e"
>
> Cheers,
> Bert
>
>
> On Fri, Mar 1, 2024 at 3:59 AM Iago Giné Vázquez <iago.g...@sjd.es> wrote:
> >
> > Hi Iris,
> >
> > Thank you. Also, a very nice solution.
> >
> > Best,
> >
> > Iago
> >
> > On 01/03/2024 12:49, Iris Simmons wrote:
> > > Hi Iago,
> > >
> > >
> > > This is not a bug. It is expected. Matches may not overlap. However,
> > > there is a way to get the result you want using perl = TRUE:
> > >
> > > ```R
> > > gsub("([aeiouAEIOU])(?=[aeiouAEIOU])", "\\1_", "aerioue", perl = TRUE)
> > > ```
> > >
> > > The specific change I made is called a positive lookahead; you can read
> > > more about it here:
> > >
> > > https://www.regular-expressions.info/lookaround.html
> > >
> > > It's a way to check for a piece of text without consuming it in the match.
> > >
> > > Also, since you don't care about character case, it might be more legible
> > > to add ignore.case = TRUE and remove the upper case characters:
> > >
> > > ```R
> > > gsub("([aeiou])(?=[aeiou])", "\\1_", "aerioue", perl = TRUE, ignore.case = TRUE)
> > >
> > > ## or
> > >
> > > gsub("(?i)([aeiou])(?=[aeiou])", "\\1_", "aerioue", perl = TRUE)
> > > ```
> > >
> > > I hope this helps!
> > >
> > >
> > > On Fri, Mar 1, 2024, 06:37 Iago Giné Vázquez <iago.g...@sjd.es> wrote:
> > >
> > >> Hi all,
> > >>
> > >> I tested the following command:
> > >>
> > >> gsub("([aeiouAEIOU])([aeiouAEIOU])", "\\1_\\2", "aerioue")
> > >>
> > >> with the following output:
> > >>
> > >> [1] "a_eri_ou_e"
> > >>
> > >> So, there are two consecutive vowels where an underscore is not added.
> > >>
> > >> Could it be a bug? Is it expected (bug or not)? Is there any way to get
> > >> what I want (an underscore between each pair of consecutive vowels)?
> > >>
> > >>
> > >> Thank you!
> > >>
> > >> Best regards,
> > >>
> > >> Iago
> > >>
> > >> ______________________________________________
> > >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
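
For anyone reading this thread later, here is a minimal sketch of why the original call leaves "ou" untouched and why the lookahead version does not. It only uses base R (gregexpr, regmatches, gsub); the variable names x, pairs, starts, original and fixed are mine, chosen for illustration, and do not appear in the posts above.

```R
x <- "aerioue"

# Without a lookahead, every match consumes both vowels. The search for the
# next match resumes after the second vowel, so "o" is already used up by
# "io" and the pair "ou" is never examined.
pairs <- regmatches(x, gregexpr("[aeiou][aeiou]", x))[[1]]
pairs
# [1] "ae" "io" "ue"

original <- gsub("([aeiou])([aeiou])", "\\1_\\2", x)
original
# [1] "a_eri_ou_e"

# With a positive lookahead, only the first vowel is consumed. The second
# vowel is checked but stays available as the start of the next match.
starts <- as.integer(gregexpr("[aeiou](?=[aeiou])", x, perl = TRUE)[[1]])
starts
# [1] 1 4 5 6

fixed <- gsub("([aeiou])(?=[aeiou])", "\\1_", x, perl = TRUE, ignore.case = TRUE)
fixed
# [1] "a_eri_o_u_e"
```

The match start positions make the difference visible: the lookahead pattern finds a match at position 5 (the "o"), which the two-vowel pattern skips because that character was already part of the previous match.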