On 6-Mar-11, at 7:13 PM, Eric Fail wrote:

Dear R-list,

I have a partly comma-separated, partly underscore-separated string that I am trying to parse into R.

Furthermore I have a bunch of them, and they are quite long. I have now spent most of my Sunday trying to figure this out and thought I would try the list to see if someone here would be able to get me started.

My data structure looks like this,

(in an example.txt file)
Subject ID,ExperimentName,2010-04-23,32:34:23,Version 0.4, 640 by 960 pixels, On Device M, M, 3.2.4,zz_373_462_488_...@9z.svg, 592,820,3.35,zz_032_288_436_...@9z.svg, 332,878,3.66,zz_384_204_433_...@9z.svg, 334,824,3.28,zz_365_575_683_...@9z.svg, 598,878,3.50,zz_005_480_239_...@9z.svg, 630,856,8.03,zz_030_423_394_...@9z.svg, 98,846,4.09,zz_033_596_398_...@9z.svg, 636,902,3.28,zz_263_064_320_...@9z.svg,570,894,1.26,bl...@9z.svg, 322,842,32.96,zz_004_088_403_...@9z.svg, 606,908,3.32,zz_703_546_434_...@9z.svg, 624,934,2.58,zz_712_348_543_...@9z.svg, 20,828,5.36,zz_005_48_239_...@9z.svg, 580,830,4.36,zz_310_444_623_...@9z.svg, 586,806,0.08,zz_030_423_394_...@9z.svg, 350,854,3.84,zz_340_382_539_...@9z.svg,570,894,1.26,bl...@9z.svg, 542,840,4.44,zz_345_230_662_...@9z.svg, 632,844,2.47,zz_006_335_309_...@9z.svg, 96,930,3.63,zz_782_346_746_...@9z.svg, 306,850,2.58,zz_334_200_333_...@9z.svg, 304,842,3.34,zz_383_506_726_...@9z.svg, 622,884,3.84,zz_294_360_448_...@9z.svg, 90,858,3.56,zz_334_335_473_...@9z.svg,570,894,1.26,bl...@9z.svg, 320,852,4.04,
(end of example.txt file)

The above is approximately 5% of the length of a full file, and I have about 100 of them. Please note that the strings end with a comma.

I am trying to parse it into something like this

ID          ImgNam  BLOCK  RUN   Tx   Ty  Treatment    x    y      Y
Subject ID     373      1    1  462  488  TRT        592  820   3.35
Subject ID      32      1    2  288  436  CON        332  878   3.66
Subject ID     384      1    3  204  433  TRT        334  824   3.28
Subject ID     365      1    4  575  683  TRT        598  878    3.5
Subject ID       5      1    5  480  239  CON        630  856   8.03
Subject ID      30      1    6  423  394  CON         98  846   4.09
Subject ID      33      1    7  596  398  CON        636  902   3.28
Subject ID     263      1    8   64  320  TRT        570  894   1.26
Subject ID       4      2    1   88  403  CON        606  908   3.32
Subject ID     703      2    2  546  434  CON        624  934   2.58
Subject ID     712      2    3  348  543  CON         20  828   5.36
Subject ID       5      2    4   48  239  CON        580  830   4.36
Subject ID     310      2    5  444  623  TRT        586  806   0.08
Subject ID      30      2    6  423  394  CON        350  854   3.84
Subject ID     340      2    7  382  539  TRT        570  894   1.26
Subject ID     345      3    1  230  662  TRT        632  844   2.47
Subject ID       6      3    2  335  309  CON         96  930   3.63
Subject ID     782      3    3  346  746  TRT        306  850   2.58
Subject ID     334      3    4  200  333  TRT        304  842   3.34
Subject ID     383      3    5  506  726  TRT        622  884   3.84
Subject ID     294      3    6  360  448  TRT         90  858   3.56
Subject ID     334      3    7  335  473  TRT        570  894   1.26

I could do it in Excel, but it would take me a week, and it would be stupid. If someone could please help me get started, I would very much appreciate it. It would not only benefit me, but my colleagues would see the benefit of R and the R-list in particular.

Thanks in advance!

Eric


In a good text editor it would be one command per file. If you are on UNIX or Mac OS X, you could loop through the files with (probably) an awk command. I don't remember the syntax (it's been too long), but it should be just a few lines of shell script. On Windows I'm not sure, but there should be something similar.

Maybe that "gets you started". Probably one of the list jocks will have it nailed if you wait.
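
For the R side of the question, here is a minimal sketch of one possible approach, not a tested solution for the real files. It rests on several assumptions read off the example above: each file holds a single line; the header ends where the first field ending in ".svg" appears; every trial after that is exactly four fields (image name, x, y, Y); and image names starting with "bl" mark the break between blocks. The function name parse_trials is made up for illustration. The Treatment column is left out because nothing in the names as they appear in this post shows where TRT/CON comes from.

```r
# Hypothetical helper: turn one line of the example format into a data frame.
parse_trials <- function(line) {
  # Split on commas and strip the stray spaces around some fields.
  fields <- trimws(strsplit(line, ",", fixed = TRUE)[[1]])

  # Assumption: the header is everything before the first image name.
  first <- grep("\\.svg$", fields)[1]

  # Assumption: after the header, every trial is 4 fields: name, x, y, Y.
  rec <- matrix(fields[first:length(fields)], ncol = 4, byrow = TRUE)

  # Assumption: "bl..." images separate blocks and are not trials themselves.
  is_blank <- grepl("^bl", rec[, 1])
  block <- cumsum(is_blank) + 1
  keep <- !is_blank

  # Names look like zz_<ImgNam>_<Tx>_<Ty>_<rest>; split on the underscores.
  parts <- do.call(rbind, strsplit(rec[keep, 1], "_", fixed = TRUE))

  out <- data.frame(
    ID     = fields[1],
    ImgNam = as.integer(parts[, 2]),
    BLOCK  = block[keep],
    Tx     = as.integer(parts[, 3]),
    Ty     = as.integer(parts[, 4]),
    x      = as.integer(rec[keep, 2]),
    y      = as.integer(rec[keep, 3]),
    Y      = as.numeric(rec[keep, 4]),
    stringsAsFactors = FALSE
  )

  # RUN counts trials within each block.
  out$RUN <- ave(seq_len(nrow(out)), out$BLOCK, FUN = seq_along)
  out[, c("ID", "ImgNam", "BLOCK", "RUN", "Tx", "Ty", "x", "y", "Y")]
}

# A shortened line built from pieces of the example above.
example_line <- paste0(
  "Subject ID,ExperimentName,2010-04-23,32:34:23,Version 0.4,",
  " 640 by 960 pixels, On Device M, M, 3.2.4,",
  "zz_373_462_488_...@9z.svg, 592,820,3.35,",
  "zz_263_064_320_...@9z.svg,570,894,1.26,",
  "bl...@9z.svg, 322,842,32.96,",
  "zz_004_088_403_...@9z.svg, 606,908,3.32,"
)
trials <- parse_trials(example_line)
print(trials)
```

For the full set of files, the same function could be applied to readLines() of each path returned by list.files(), with the pieces stacked by do.call(rbind, ...). If a file does not split into a whole number of four-field trials, matrix() will warn, which is a useful sign that one of the assumptions above does not hold for that file.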
--

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Why does the universe go to all the bother of existing?
-- Stephen Hawking

#define QUESTION ((bb) || !(bb))
-- William Shakespeare



Don McKenzie, Research Ecologist
Pacific Wildland Fire Sciences Lab
US Forest Service

Affiliate Professor
School of Forest Resources, College of the Environment
CSES Climate Impacts Group
University of Washington

desk: 206-732-7824
cell: 206-321-5966
d...@uw.edu
donaldmcken...@fs.fed.us
