Here is a start; you can change the column names:

> x
   chr  start    end              peak_loc cluster_TC strand peak_TC
1 chr1 564620 564649 chr1:564644..564645,+         94      +      10
2 chr1 565369 565404 chr1:565371..565372,+        217      +       8
3 chr1 565463 565541 chr1:565480..565481,+       1214      +      15
4 chr1 565653 565697 chr1:565662..565663,+       1031      +      28
5 chr1 565861 565922 chr1:565883..565884,+        316      +      12
6 chr1 566537 566573 chr1:566564..566565,+        119      +      11
> y <- sub("^.*:([[:digit:]]+)..([[:digit:]]+).*", "\\1 \\2", x$peak_loc)
> y
[1] "564644 564645" "565371 565372" "565480 565481" "565662 565663" "565883 565884" "566564 566565"
> y <- strsplit(y, ' ')
> y
[[1]]
[1] "564644" "564645"

[[2]]
[1] "565371" "565372"

[[3]]
[1] "565480" "565481"

[[4]]
[1] "565662" "565663"

[[5]]
[1] "565883" "565884"

[[6]]
[1] "566564" "566565"

> x.new <- cbind(x, do.call(rbind, y))
> x.new
   chr  start    end              peak_loc cluster_TC strand peak_TC      1      2
1 chr1 564620 564649 chr1:564644..564645,+         94      +      10 564644 564645
2 chr1 565369 565404 chr1:565371..565372,+        217      +       8 565371 565372
3 chr1 565463 565541 chr1:565480..565481,+       1214      +      15 565480 565481
4 chr1 565653 565697 chr1:565662..565663,+       1031      +      28 565662 565663
5 chr1 565861 565922 chr1:565883..565884,+        316      +      12 565883 565884
6 chr1 566537 566573 chr1:566564..566565,+        119      +      11 566564 566565

On Mon, Jun 6, 2011 at 8:22 PM, ads pit <deconstructed.morn...@gmail.com> wrote:
> Hi all,
> I am given a data frame in which one of the columns holds several pieces
> of information together (see column 4, peak_loc):
>
>    chr  start    end              peak_loc cluster_TC strand peak_TC
> 1 chr1 564620 564649 chr1:564644..564645,+         94      +      10
> 2 chr1 565369 565404 chr1:565371..565372,+        217      +       8
> 3 chr1 565463 565541 chr1:565480..565481,+       1214      +      15
> 4 chr1 565653 565697 chr1:565662..565663,+       1031      +      28
> 5 chr1 565861 565922 chr1:565883..565884,+        316      +      12
> 6 chr1 566537 566573 chr1:566564..566565,+        119      +      11
>
> I am trying to find out if there's a way to extract the coordinates given
> in the 4th column and replace this column with two others that would have
> the start coord and the end coord. So instead of chr1:564644..564645,+
> I would obtain:
>
> start_peak end_peak
>     564644   564645
>
> Best,
> nanami
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
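
A possible follow-up to the session above, as a minimal sketch rather than the definitive way to do it: the columns appended by cbind() end up named "1" and "2" and hold text, while the original question asked for numeric start_peak and end_peak columns in place of peak_loc. The code below assumes the data frame looks exactly like the six rows shown (rebuilt by hand here so the example is self-contained). The helper names `pat` and `pos` are illustrative only, and the dots in the pattern are escaped so that ".." is matched literally instead of as "any two characters".

```r
# Rebuild the six example rows from the post (assumed to be representative)
x <- data.frame(
  chr        = rep("chr1", 6),
  start      = c(564620, 565369, 565463, 565653, 565861, 566537),
  end        = c(564649, 565404, 565541, 565697, 565922, 566573),
  peak_loc   = c("chr1:564644..564645,+", "chr1:565371..565372,+",
                 "chr1:565480..565481,+", "chr1:565662..565663,+",
                 "chr1:565883..565884,+", "chr1:566564..566565,+"),
  cluster_TC = c(94, 217, 1214, 1031, 316, 119),
  strand     = rep("+", 6),
  peak_TC    = c(10, 8, 15, 28, 12, 11),
  stringsAsFactors = FALSE
)

# Capture the two coordinates; "\\.\\." matches the two literal dots
pat <- "^.*:([[:digit:]]+)\\.\\.([[:digit:]]+).*$"
start_peak <- as.numeric(sub(pat, "\\1", x$peak_loc))
end_peak   <- as.numeric(sub(pat, "\\2", x$peak_loc))

# Put the two new numeric columns where peak_loc used to be
pos <- match("peak_loc", names(x))
x.new <- cbind(x[seq_len(pos - 1)],
               start_peak = start_peak,
               end_peak   = end_peak,
               x[-seq_len(pos)])
x.new
```

Extracting each capture group with its own sub() call avoids the strsplit()/do.call(rbind, ...) round trip, and as.numeric() keeps the coordinates usable for arithmetic. If some rows might not match the pattern, sub() returns them unchanged and as.numeric() would then give NA with a warning, so it may be worth checking for NA afterwards.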