Also not answering your question directly, but this may provide some useful ideas or results:

library( gsubfn )

DF <- setNames( data.frame( t( strapply( ID
                                       , "^[^_]+_([A-Z]+)_([A-Z]+)([0-9]+)$"
                                       , c
                                       , simplify=TRUE
                                       )
                             )
                          , stringsAsFactors = FALSE
                          )
              , c( "Type", "Group", "Number" )
              )
str( DF )
'data.frame':   100 obs. of  3 variables:
 $ Type  : chr  "MSM" "MSM" "MSM" "MSM" ...
 $ Group : chr  "HN" "HN" "HN" "HN" ...
 $ Number: chr  "01209" "01210" "01211" "10212" ...
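
For anyone without gsubfn installed, roughly the same split can be sketched in base R with regexec() and regmatches(). This is only an illustration: the three IDs below are a made-up subset in the format described in the thread, and the code assumes every ID matches the pattern (a non-matching ID would yield an empty element and break the rbind step).

```r
# Illustrative IDs only, in the IBBS3_Type_Group##### format from the thread
ID <- c("IBBS3_MSM_HN01209", "IBBS3_MSM_HMC44285", "IBBS3_PWID_HN25275")

# regexec() reports the full match plus each capture group;
# regmatches() extracts them as one character vector per ID.
m <- regmatches(ID, regexec("^[^_]+_([A-Z]+)_([A-Z]+)([0-9]+)$", ID))

# Drop the full match (first element) and bind the three groups row-wise.
DF <- setNames(
  data.frame(do.call(rbind, lapply(m, `[`, -1)), stringsAsFactors = FALSE),
  c("Type", "Group", "Number")
)
str(DF)
```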

On Tue, 3 Nov 2015, Peter Alspach wrote:

Tena koe Jen

Not answering your question: if you are after these locations in order to split 
the IDs in columns, then you might like to consider strsplit; e.g.,

t(sapply(strsplit(ID, '_'), rbind))

You could then split the last column.  You state that there is a 5-digit number at the 
end.  If this is correct, then use this feature (i.e., nchar(ID)-4) as you'd want 
"IBBS3_MSM_HN104213" (the fifth element in ID) to split to IBBS3, MSM, HN1 and 
04213.  However, if it isn't always 5 digits then split at the first number (i.e., HN and 
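
The two ways of splitting the last column might be sketched as follows. The short ID vector is illustrative (it includes the "IBBS3_MSM_HN104213" case mentioned above), and the column names are placeholders chosen for this example, not anything from the original data.

```r
ID <- c("IBBS3_MSM_HN01209", "IBBS3_MSM_HMC44285", "IBBS3_MSM_HN104213")

# One row per ID, one column per underscore-separated piece
parts <- do.call(rbind, strsplit(ID, "_", fixed = TRUE))
last  <- parts[, 3]

# Option 1: assume the number is always the final 5 characters
n <- nchar(last)
fixed_width <- data.frame(
  Prefix = parts[, 1],
  Type   = parts[, 2],
  Group  = substr(last, 1, n - 5),
  Number = substr(last, n - 4, n),
  stringsAsFactors = FALSE
)

# Option 2: split at the first digit of the last piece instead
first_digit <- regexpr("[0-9]", last)
at_first_digit <- data.frame(
  Group  = substr(last, 1, first_digit - 1),
  Number = substr(last, first_digit, nchar(last)),
  stringsAsFactors = FALSE
)
```

For "IBBS3_MSM_HN104213" the first option gives HN1 and 04213, the second gives HN and 104213, so which one is right depends on whether the number really is always 5 digits.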

HTH .....

Peter Alspach

-----Original Message-----
From: R-help [] On Behalf Of Jennifer 
Sent: Tuesday, 3 November 2015 7:39 a.m.
Subject: [R] Locating the starting position of the first number in a string


So, I've got a vector of strings that look like this:
ID <- c("IBBS3_MSM_HN01209","IBBS3_MSM_HN01210","IBBS3_MSM_HN01211",
"IBBS3_MSM_HN25275","IBBS3_MSM_HN25276", "IBBS3_MSM_HN25277", 
"IBBS3_MSM_HN25284","IBBS3_MSM_HMC44285",  "IBBS3_MSM_HMC44286", 

This is an ID that is in the following format:  IBBS3_Type_Group#####

What I want to do is locate the starting position of Type, which is anywhere 
from 3 to 4 letters long (in this example it's either MSM or PWID), the 
starting position of Group which is 2-3 letters long (either HN or HMC), and 
finally the starting position of the 5-digit number.

I'm able to get Type and Group using the following:

TYPE_s <- sapply(c("MSM", "PWID"), regexpr, ID,

GROUP_s <- (sapply(c("HN", "HMC"), regexpr, ID,

What I am having trouble with is getting the starting position of the 5-digit number.

I am trying:

DIGITS_s <- sapply("([0:9])", regexpr, ID,

But that just seems to look for the position of the first 0:



 [1,]      13
 [2,]      13
 [3,]      13
 [4,]      14
 [5,]      14
 [6,]      14
 [7,]      -1
 [8,]      -1
 [9,]      -1
[10,]      -1
[11,]      17
[12,]      17
[13,]      -1
[14,]      -1
[15,]      -1
[16,]      -1
[17,]      -1
[18,]      -1
[19,]      -1
[20,]      -1
[21,]      17
[22,]      17
[23,]      -1
[24,]      -1
[25,]      -1
[26,]      -1
[27,]      -1
[28,]      -1
[29,]      -1
[30,]      -1
[31,]      17
[32,]      17
[33,]      -1
[34,]      -1
[35,]      -1
[36,]      -1
[37,]      -1
[38,]      -1
[39,]      -1
[40,]      -1
[41,]      17
[42,]      17
[43,]      -1
[44,]      -1
[45,]      -1
[46,]      -1
[47,]      -1
[48,]      -1
[49,]      -1
[50,]      -1
[51,]      17
[52,]      17
[53,]      -1
[54,]      -1
[55,]      -1
[56,]      -1
[57,]      -1
[58,]      -1
[59,]      -1
[60,]      -1
[61,]      17
[62,]      17
[63,]      -1
[64,]      -1
[65,]      -1
[66,]      -1
[67,]      -1
[68,]      -1
[69,]      -1
[70,]      -1
[71,]      17
[72,]      17
[73,]      -1
[74,]      -1
[75,]      -1
[76,]      -1
[77,]      -1
[78,]      -1
[79,]      -1
[80,]      -1
[81,]      18
[82,]      17
[83,]      17
[84,]      17
[85,]      17
[86,]      17
[87,]      17
[88,]      17
[89,]      17
[90,]      17
[91,]      17
[92,]      17
[93,]      17
[94,]      17
[95,]      17
[96,]      17
[97,]      17
[98,]      17
[99,]      17
[100,]      17

So, clearly, this is wrong.  I just would like to find the starting position of 
the first digit, no matter what it is.

It's probably easy, isn't it?
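
A note on why the attempt above misbehaves, with a possible fix. In a regular expression, "[0:9]" is a character class containing just the three characters "0", ":" and "9"; a digit range is written with a hyphen, "[0-9]" (or "[[:digit:]]"). Even with that corrected, the first digit in these IDs is the 3 in "IBBS3", so the sketch below anchors the match to the run of digits at the end of the string. The three IDs are an illustrative subset, and the variable names are made up for this example.

```r
ID <- c("IBBS3_MSM_HN01209", "IBBS3_MSM_HN25275", "IBBS3_MSM_HMC44285")

# "[0:9]" only matches "0", ":" or "9", so IDs without those give -1
buggy <- as.integer(regexpr("[0:9]", ID))

# "[0-9]" matches any digit, but the first one is the 3 in "IBBS3"
naive <- as.integer(regexpr("[0-9]", ID))

# "[0-9]+$" matches the run of digits at the end of the string;
# regexpr() returns the position where that run starts
digit_start <- as.integer(regexpr("[0-9]+$", ID))

print(buggy)
print(naive)
print(digit_start)
```

If the number is not guaranteed to be the last thing in the ID, another option is to search only the part after the final underscore and add the offset back.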




______________________________________________ mailing list -- To UNSUBSCRIBE and more, see
PLEASE do read the posting guide
and provide commented, minimal, self-contained, reproducible code.


Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
