Re: [R] grep

Steven Yen Thu, 01 Aug 2024 23:33:28 -0700

Thanks!

On 8/2/2024 12:28 PM, Rui Barradas wrote:

At 02:10 on 02/08/2024, Steven Yen wrote:

Good morning. Below I use a statement like


j<-grep(".r\\b",colnames(mydata),value=TRUE); j

with the \\b option, which I read about long ago and have found useful.

Are there more of these options, other than those in ?grep? Thanks.

dstat is just my own descriptive routine.

 > x
  [1] "age"          "sleep"        "primary"      "middle"
  [5] "high"         "somewhath"    "veryh"        "somewhatm"
  [9] "verym"        "somewhatc"    "veryc"        "somewhatl"
[13] "veryl"        "village"      "married"      "social"
[17] "agricultural" "communist"    "minority"     "religious"
 > colnames(mydata)
  [1] "depression"     "sleep"          "female"         "village"
  [5] "agricultural"   "married"        "communist"      "minority"
  [9] "religious"      "social"         "no"             "primary"
[13] "middle"         "high"           "veryh"          "somewhath"
[17] "notveryh"       "verym"          "somewhatm"      "notverym"
[21] "veryc"          "somewhatc"      "notveryc"       "veryl"
[25] "somewhatl"      "notveryl"       "age"            "village.r"
[29] "married.r"      "social.r"       "agricultural.r" "communist.r"
[33] "minority.r"     "religious.r"    "male.r"         "education.r"
 > j<-grep(".r\\b",colnames(mydata),value=TRUE); j
[1] "village.r"      "married.r"      "social.r"       "agricultural.r"
[5] "communist.r"    "minority.r"     "religious.r"    "male.r"
[9] "education.r"
 > j<-c(x,j); j
  [1] "age"            "sleep"          "primary"        "middle"
  [5] "high"           "somewhath"      "veryh"          "somewhatm"
  [9] "verym"          "somewhatc"      "veryc"          "somewhatl"
[13] "veryl"          "village"        "married"        "social"
[17] "agricultural"   "communist"      "minority"       "religious"
[21] "village.r"      "married.r"      "social.r"       "agricultural.r"
[25] "communist.r"    "minority.r"     "religious.r"    "male.r"
[29] "education.r"
 > data<-mydata[j]
 > cbind(
+   dstat(subset(data,male.r==1))[,1:2],
+   dstat(subset(data,male.r==0))[,1:2]
+ )
Sample statistics (Weighted =  FALSE )

Sample statistics (Weighted =  FALSE )

                 Mean Std.dev  Mean Std.dev
age            6.279   0.841 6.055   0.813
sleep          6.483   1.804 6.087   2.045
primary        0.452   0.498 0.408   0.491
middle         0.287   0.453 0.176   0.381
high           0.171   0.377 0.082   0.275
somewhath      0.522   0.500 0.447   0.497
veryh          0.254   0.435 0.250   0.433
somewhatm      0.419   0.493 0.460   0.498
verym          0.544   0.498 0.411   0.492
somewhatc      0.376   0.484 0.346   0.476
veryc          0.593   0.491 0.615   0.487
somewhatl      0.544   0.498 0.504   0.500
veryl          0.390   0.488 0.389   0.487
village        0.757   0.429 0.752   0.432
married        0.936   0.245 0.906   0.291
social         0.538   0.499 0.528   0.499
agricultural   0.780   0.414 0.826   0.379
communist      0.178   0.383 0.038   0.190
minority       0.071   0.256 0.081   0.273
religious      0.088   0.284 0.102   0.302
village.r      0.243   0.429 0.248   0.432
married.r      0.064   0.245 0.094   0.291
social.r       0.462   0.499 0.472   0.499
agricultural.r 0.220   0.414 0.174   0.379
communist.r    0.822   0.383 0.962   0.190
minority.r     0.929   0.256 0.919   0.273
religious.r    0.912   0.284 0.898   0.302
male.r         1.000   0.000 0.000   0.000
education.r    0.090   0.286 0.334   0.472
 >

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.

Hello,

The reference for regex metacharacters is the help page ?regex.
If you want to know whether there are more metacharacters similar to \b,
there are \< and \>. Below are examples of using them instead of \b.

Also, the pattern '.r' does not match a period followed by an 'r': an
unescaped period matches any single character. To match a literal period
you must escape it. The correct regex is '\\.r'.
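
To illustrate the point about the unescaped period, here is a minimal sketch. The vector nms is made up for this illustration and is not part of the original data; the names "far" and "stir" were added only to show what the looser pattern lets through.

```r
# Hypothetical names for illustration; not taken from mydata
nms <- c("village.r", "married.r", "far", "stir", "r")

# Unescaped '.' matches any single character, so "far" and "stir" match too
loose <- grep(".r\\b", nms, value = TRUE)
loose
#> [1] "village.r" "married.r" "far"       "stir"

# Escaped '\\.' matches only a literal period
strict <- grep("\\.r\\b", nms, value = TRUE)
strict
#> [1] "village.r" "married.r"
```

With the column names in the original post both patterns happen to return the same nine names, because no other column ends in 'r'; the difference only shows up once such a name exists.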




x <- c("age", "sleep", "primary", "middle", "high", "somewhath", "veryh",
       "somewhatm", "verym", "somewhatc", "veryc", "somewhatl", "veryl",
       "village", "married", "social", "agricultural", "communist",
       "minority", "religious")
colnms <- c("depression", "sleep", "female", "village", "agricultural",
            "married", "communist", "minority", "religious", "social",
            "no", "primary", "middle", "high", "veryh", "somewhath",
            "notveryh", "verym", "somewhatm", "notverym", "veryc",
            "somewhatc", "notveryc", "veryl", "somewhatl", "notveryl",
            "age", "village.r", "married.r", "social.r", "agricultural.r",
            "communist.r", "minority.r", "religious.r",
            "male.r", "education.r")

grep("\\.r\\b", colnms, value = TRUE)
#> [1] "village.r"      "married.r"      "social.r"       "agricultural.r"
#> [5] "communist.r"    "minority.r"     "religious.r"    "male.r"
#> [9] "education.r"
# the same as above
# \\> matches the empty string at the end of a word,
# \\b matches the empty string at both ends of a word
grep("\\.r\\>", colnms, value = TRUE)
#> [1] "village.r"      "married.r"      "social.r"       "agricultural.r"
#> [5] "communist.r"    "minority.r"     "religious.r"    "male.r"
#> [9] "education.r"

# 4 column names contain an 'm' and end in '.r', hence 4 matches
grep("m.*\\.r\\>", colnms, value = TRUE)
#> [1] "married.r"   "communist.r" "minority.r"  "male.r"
# only the strings starting with 'm'
grep("\\bm.*\\.r\\b", colnms, value = TRUE)
#> [1] "married.r"  "minority.r" "male.r"
grep("\\<m.*\\.r\\>", colnms, value = TRUE)
#> [1] "married.r"  "minority.r" "male.r"
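
One more point that may be worth checking: a word boundary is not the same as the end of the string. The sketch below uses a made-up vector nms (the name "village.r.old" is hypothetical) and compares \\b with the '$' anchor and with base R's endsWith(). It is offered as an illustration only, not as the recommended approach for the original data.

```r
# Hypothetical names for illustration; not taken from mydata
nms <- c("village.r", "village.r.old", "male.r")

# '\\b' is satisfied between 'r' and the following '.', so all three match
at_boundary <- grep("\\.r\\b", nms, value = TRUE)

# '$' anchors at the end of the string, so "village.r.old" is excluded
at_end <- grep("\\.r$", nms, value = TRUE)
at_end
#> [1] "village.r" "male.r"

# endsWith() (base R, 3.3.0 or later) gives the same result without a regex
identical(at_end, nms[endsWith(nms, ".r")])
#> [1] TRUE
```

If the goal is strictly "names ending in .r", the '$' anchor or endsWith() states that intent more directly than a word boundary does.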


Hope this helps,

Rui Barradas

