The attached patch provides a modification to the recent fix/improvement to do_grep already included in the most recent development version.
The original fix added new functionality to the grep function by adding a new parameter, 'invert'. In the source code for the underlying do_grep, the value of the parameter is used to invert the logical match/no-match flag vector ind. That modification is distributed across several lines of code.

The patch systematizes the solution by inverting the logical match flag vector in place, once for each element of the character vector passed to grep as the argument 'x'. In the patched version, the inversion appears just once in the code.

The patch does not modify the functionality of grep in any way. If the respective documentation was updated to cover the new functionality introduced by the original modification, it still applies to the patched version.

The patch does not solve any immediate problem. However, by replacing the redundantly distributed original modification with a one-line modification, it is intended to make the source code easier to understand, maintain, and further modify.

The patch also renames the variable 'invert' introduced in the original modification to 'invert_opt', for consistency with how (almost) all other logical flag parameters in do_grep are named. This modification is again functionally transparent and requires no modifications to the documentation.

The patch was prepared as follows:

    svn co https://svn.R-project.org/R/trunk/
    cd trunk
    tools/rsync-recommended
    # modifications made to src/main/character.c
    svn diff > do_grep.diff

The patched sources were successfully compiled and tested as follows:

    svn revert -R .
    patch -p0 < do_grep.diff
    ./configure
    make
    make check

Assuming that appropriate tests for the extended version of grep were added along with the original modification, the patched version passes them.
The patched grep was also tested as follows:

    bin/R --no-save -q <<END
    x = replicate(10, paste(sample(letters, 10, replace=TRUE), collapse=''))
    pattern = paste(sample(letters, 3), collapse='')
    matched = grep(pattern=pattern, x=x, invert=FALSE)
    unmatched = grep(pattern=pattern, x=x, invert=TRUE)
    print(all.equal(1:length(x), sort(c(matched, unmatched))))
    print(version)
    END

with the output:

    [1] TRUE
    platform       i686-pc-linux-gnu
    arch           i686
    os             linux-gnu
    system         i686, linux-gnu
    status         Under development (unstable)
    major          2
    minor          9.0
    year           2009
    month          02
    day            12
    svn rev        47904
    language       R
    version.string R version 2.9.0 Under development (unstable) (2009-02-12 r47904)

vQ

-- 
-------------------------------------------------------------------------------
Wacek Kusnierczyk, MD PhD

Email: w...@idi.ntnu.no
Phone: +47 73591875, +47 72574609

Department of Computer and Information Science (IDI)
Faculty of Information Technology, Mathematics and Electrical Engineering (IME)
Norwegian University of Science and Technology (NTNU)
Sem Saelands vei 7, 7491 Trondheim, Norway
Room itv303

Bioinformatics & Gene Regulation Group
Department of Cancer Research and Molecular Medicine (IKM)
Faculty of Medicine (DMF)
Norwegian University of Science and Technology (NTNU)
Laboratory Center, Erling Skjalgsons gt. 1, 7030 Trondheim, Norway
Room 231.05.060
-------------------------------------------------------------------------------
Index: src/main/character.c
===================================================================
--- src/main/character.c	(revision 47904)
+++ src/main/character.c	(working copy)
@@ -1015,7 +1015,7 @@
     SEXP pat, vec, ind, ans;
     regex_t reg;
     int i, j, n, nmatches = 0, cflags = 0, ov, erroffset, ienc, rc;
-    int igcase_opt, extended_opt, value_opt, perl_opt, fixed_opt, useBytes, invert;
+    int igcase_opt, extended_opt, value_opt, perl_opt, fixed_opt, useBytes, invert_opt;
     const char *cpat, *errorptr;
     pcre *re_pcre = NULL /* -Wall */;
     pcre_extra *re_pe = NULL;
@@ -1031,14 +1031,14 @@
     perl_opt = asLogical(CAR(args)); args = CDR(args);
     fixed_opt = asLogical(CAR(args)); args = CDR(args);
     useBytes = asLogical(CAR(args)); args = CDR(args);
-    invert = asLogical(CAR(args));
+    invert_opt = asLogical(CAR(args));
     if (igcase_opt == NA_INTEGER) igcase_opt = 0;
     if (extended_opt == NA_INTEGER) extended_opt = 1;
     if (value_opt == NA_INTEGER) value_opt = 0;
     if (perl_opt == NA_INTEGER) perl_opt = 0;
     if (fixed_opt == NA_INTEGER) fixed_opt = 0;
     if (useBytes == NA_INTEGER) useBytes = 0;
-    if (invert == NA_INTEGER) invert = 0;
+    if (invert_opt == NA_INTEGER) invert_opt = 0;
     if (fixed_opt && igcase_opt)
         warning(_("argument '%s' will be ignored"), "ignore.case = TRUE");
     if (fixed_opt && perl_opt)
@@ -1148,7 +1148,8 @@
                     INTEGER(ind)[i] = 1;
             } else if (regexec(&reg, s, 0, NULL, 0) == 0) LOGICAL(ind)[i] = 1;
         }
-        if (invert ^ LOGICAL(ind)[i]) nmatches++;
+        LOGICAL(ind)[i] ^= invert_opt;
+        if (LOGICAL(ind)[i]) nmatches++;
     }
     if (fixed_opt);
     else if (perl_opt) {
@@ -1167,13 +1168,13 @@
         SEXP nmold = getAttrib(vec, R_NamesSymbol), nm;
         ans = allocVector(STRSXP, nmatches);
         for (i = 0, j = 0; i < n ; i++)
-            if (invert ^ LOGICAL(ind)[i])
+            if (LOGICAL(ind)[i])
                 SET_STRING_ELT(ans, j++, STRING_ELT(vec, i));
         /* copy across names and subset */
         if (!isNull(nmold)) {
             nm = allocVector(STRSXP, nmatches);
             for (i = 0, j = 0; i < n ; i++)
-                if (invert ^ LOGICAL(ind)[i])
+                if (LOGICAL(ind)[i])
                     SET_STRING_ELT(nm, j++, STRING_ELT(nmold, i));
             setAttrib(ans, R_NamesSymbol, nm);
         }
@@ -1181,7 +1182,7 @@
         ans = allocVector(INTSXP, nmatches);
         j = 0;
         for (i = 0 ; i < n ; i++)
-            if (invert ^ LOGICAL(ind)[i]) INTEGER(ans)[j++] = i + 1;
+            if (LOGICAL(ind)[i]) INTEGER(ans)[j++] = i + 1;
     }
     UNPROTECT(1);
     return ans;
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel