Re: [R] How to remove square brackets, etc. from address strings?

Sabina Arndt Fri, 25 May 2012 13:32:43 -0700

Hello r-help members,

the solutions which Sarah Goslee and arun sent to me in such a prompt and helpful manner work well with the examples I cut from the data.frame I'm analyzing. Thank you very much for that!

I incorporated them into my R-script and discovered that it still doesn't work properly, unfortunately. I have no idea why that's the case.

You see, I want to extract country names from the contents of tab-delimited text files.

This is an example of the data I'm using:
http://pastebin.com/mYZNDXg6

This is the script I'm using to import the data:
http://pastebin.com/Z10UUH3z
(It requires the text files to be in a folder which doesn't contain any other .txt files.)

This is the script I'm using to extract the country names:
http://pastebin.com/G37fuPba

This is the string that's in the relevant field of the first record I'm working on:

[Engel, Kathrin M. Y.; Schroeck, Kristin; Schoeneberg, Torsten; Schulz, Angela] Univ Leipzig, Fac Med, Inst Biochem, Leipzig, Germany; [Teupser, Daniel; Holdt, Lesca Miriam; Thiery, Joachim] Univ Leipzig, Fac Med, Inst Lab Med Clin Chem & Mol Diagnost, Leipzig, Germany; [Toenjes, Anke; Kern, Matthias; Blueher, Matthias; Stumvoll, Michael] Univ Leipzig, Fac Med, Dept Internal Med, Leipzig, Germany; [Dietrich, Kerstin; Kovacs, Peter] Univ Leipzig, Fac Med, Interdisciplinary Ctr Clin Res, Leipzig, Germany; [Kruegel, Ute] Univ Leipzig, Fac Med, Rudolf Boehm Inst Pharmacol & Toxicol, Leipzig, Germany; [Scheidt, Holger A.; Schiller, Juergen; Huster, Daniel] Univ Leipzig, Fac Med, Inst Med Phys & Biophys, Leipzig, Germany; [Brockmann, Gudrun A.] Humboldt Univ, Inst Anim Sci, D-10099 Berlin, Germany; [Augustin, Martin] Ingenium Pharmaceut AG, Martinsried, Germany

This is the incorrect result my extraction script gives me for the first record:


> C1s[1]
 [1] "[ENGEL,  KATHRIN M. Y." "KRISTIN"                "TORSTEN"
 [4] "GERMANY"                "DANIEL"                 "LESCA MIRIAM"
 [7] "GERMANY"                "ANKE"                   "MATTHIAS"
[10] "MATTHIAS"               "GERMANY"                "KERSTIN"
[13] "GERMANY"                "GERMANY"                "[SCHEIDT,HOLGER A."
[16] "JUERGEN"                "GERMANY"                "HUMBOLDT"
[19] "GERMANY"

For some reason the first and sixth pair of the eight square brackets are not removed ... Do you understand why?

Instead I'd like to get this result, though:

> C1s[1]
 [1] "GERMANY"        "GERMANY"        "GERMANY"
 [4] "GERMANY"        "GERMANY"        "GERMANY"
 [7] "HUMBOLDT"       "GERMANY"
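
For illustration only, here is a minimal sketch of one way this kind of extraction might be approached in base R. It is not the pastebin script, whose contents are not reproduced in this message. It assumes that every bracketed author list can be dropped whole, that affiliations are then separated by semicolons, and that the country is whatever follows the last comma of each affiliation. The function name `extract_countries` and the shortened `address` string are hypothetical, and `trimws()` requires R 3.2.0 or later. Under these assumptions the Humboldt Univ entry would also come out as "GERMANY", not "HUMBOLDT".

```r
# Hypothetical helper (an assumption, not taken from the original scripts):
# return the country of each affiliation in one address field.
extract_countries <- function(address) {
  # Remove every bracketed author list, e.g. "[Engel, Kathrin M. Y.; ...]".
  # "[^]]*" matches any run of characters that is not a closing bracket.
  no_authors <- gsub("\\[[^]]*\\]", "", address)
  # With the author lists gone, semicolons only separate affiliations.
  affiliations <- strsplit(no_authors, ";", fixed = TRUE)[[1]]
  # Assume the country is the text after the last comma of each affiliation.
  countries <- sub("^.*,", "", affiliations)
  # Strip surrounding whitespace and convert to upper case.
  toupper(trimws(countries))
}

# Shortened example built from two of the affiliations quoted above.
address <- paste(
  "[Engel, Kathrin M. Y.; Schulz, Angela] Univ Leipzig, Fac Med, Leipzig, Germany;",
  "[Brockmann, Gudrun A.] Humboldt Univ, Inst Anim Sci, D-10099 Berlin, Germany"
)
extract_countries(address)
```

Removing the bracketed lists before splitting matters here, because the author lists themselves contain semicolons and commas that would otherwise be treated as field separators.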

What am I doing wrong? What are the errors in my R-script?
Would anybody be so kind as to take a look and help me out, please?
Thank you very much in advance!

Faithfully yours,

Sabina Arndt

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
