On Jul 15, 2010, at 11:27 AM, AndrewPage wrote:

>
> Actually I have one more question that's somewhat related-- I'm starting out
> by importing a .txt file that isn't divided into vectors and is at times
> inconsistent with regards to spacing, indents, etc., so I can't rely on
> those. It looks something like this:
>
>
> "Drink=Coffee:Location=Office:Time=Morning:Market=Flat
>
> Drink=Water:Location=Office:Time=Afternoon:Market=Up
>
> Drink=Water:Location=Gym:Time=Evening:Market=Closed
> Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed
> Drink=Coffee:Location=Office:Time=Morning:Market=Flat
> Drink=Water:Location=Office:Time=Afternoon:Market=Up
>
> Drink=Water:Location=Gym:Time=Evening:Market=Closed
> Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed
> Drink=Coffee:Location=Office:Time=Morning:Market=Flat
>
> Drink=Water:Location=Office:Time=Afternoon:Market=Up
>
> Drink=Water:Location=Gym:Time=Evening:Market=Closed
>
> Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"
>
>
>
> How can I take a single string like this and divide it into twelve vectors,
> like this:
>
> FixedData
>  [1] "Drink=Coffee:Location=Office:Time=Morning:Market=Flat"
>  [2] "Drink=Water:Location=Office:Time=Afternoon:Market=Up"
>  [3] "Drink=Water:Location=Gym:Time=Evening:Market=Closed"
>  [4] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"
>  [5] "Drink=Coffee:Location=Office:Time=Morning:Market=Flat"
>  [6] "Drink=Water:Location=Office:Time=Afternoon:Market=Up"
>  [7] "Drink=Water:Location=Gym:Time=Evening:Market=Closed"
>  [8] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"
>  [9] "Drink=Coffee:Location=Office:Time=Morning:Market=Flat"
> [10] "Drink=Water:Location=Office:Time=Afternoon:Market=Up"
> [11] "Drink=Water:Location=Gym:Time=Evening:Market=Closed"
> [12] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"
>
> Thanks again for all of the help!

If each of the text lines in the file is in fact on a separate line, then
they will be separated by line endings (for example, CR/LF sequences) and
can be read by R on a line-by-line basis using readLines().

Having done so, by copying the above from the clipboard, I get the
following, presuming that the quotes are not part of the file input:

> Lines
 [1] "Drink=Coffee:Location=Office:Time=Morning:Market=Flat "
 [2] ""
 [3] "Drink=Water:Location=Office:Time=Afternoon:Market=Up "
 [4] ""
 [5] "Drink=Water:Location=Gym:Time=Evening:Market=Closed "
 [6] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed "
 [7] " Drink=Coffee:Location=Office:Time=Morning:Market=Flat "
 [8] "Drink=Water:Location=Office:Time=Afternoon:Market=Up "
 [9] ""
[10] " Drink=Water:Location=Gym:Time=Evening:Market=Closed "
[11] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"
[12] "Drink=Coffee:Location=Office:Time=Morning:Market=Flat "
[13] ""
[14] "Drink=Water:Location=Office:Time=Afternoon:Market=Up "
[15] ""
[16] "Drink=Water:Location=Gym:Time=Evening:Market=Closed "
[17] ""
[18] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"

Even with this irregular structure, you can still use:

Res1 <- gsub(".*Location=(.+):Time=.*", "\\1", Lines)

> Res1
 [1] "Office"     ""           "Office"     ""           "Gym"
 [6] "Restaurant" "Office"     "Office"     ""           "Gym"
[11] "Restaurant" "Office"     ""           "Office"     ""
[16] "Gym"        ""           "Restaurant"

The empty strings come from the blank lines, which the pattern does not
match and which gsub() therefore returns unchanged. I can get rid of them
by using:

> Res1[Res1 != ""]
 [1] "Office"     "Office"     "Gym"        "Restaurant" "Office"
 [6] "Office"     "Gym"        "Restaurant" "Office"     "Office"
[11] "Gym"        "Restaurant"

If you do want to get just the fixed data as you have above:

# Get rid of all spaces
Res2 <- gsub(" +", "", Lines)

# Get rid of blank lines
> Res2[Res2 != ""]
 [1] "Drink=Coffee:Location=Office:Time=Morning:Market=Flat"
 [2] "Drink=Water:Location=Office:Time=Afternoon:Market=Up"
 [3] "Drink=Water:Location=Gym:Time=Evening:Market=Closed"
 [4] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"
 [5] "Drink=Coffee:Location=Office:Time=Morning:Market=Flat"
 [6] "Drink=Water:Location=Office:Time=Afternoon:Market=Up"
 [7] "Drink=Water:Location=Gym:Time=Evening:Market=Closed"
 [8] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"
 [9] "Drink=Coffee:Location=Office:Time=Morning:Market=Flat"
[10] "Drink=Water:Location=Office:Time=Afternoon:Market=Up"
[11] "Drink=Water:Location=Gym:Time=Evening:Market=Closed"
[12] "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"

HTH,

Marc
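
For reference, the steps in the reply (read the lines, remove the spaces, drop the blank lines, pull out one field) can be put together into a small self-contained sketch. The sample vector `raw` and the use of textConnection() in place of a real file path are assumptions made only so that the example runs on its own. It also assumes that no field value legitimately contains a space, since gsub(" +", "", ...) removes every space and not just the leading and trailing ones.

```r
# Hypothetical stand-in for the file contents, with the same kinds of
# stray spaces and blank lines as in the original post.
raw <- c(
  "Drink=Coffee:Location=Office:Time=Morning:Market=Flat ",
  "",
  "  Drink=Water:Location=Gym:Time=Evening:Market=Closed ",
  "",
  "Drink=Wine:Location=Restaurant:Time=LateEvening:Market=Closed"
)

# readLines() accepts any connection; with a real file one would pass
# the file path instead of a text connection.
con <- textConnection(raw)
Lines <- readLines(con)
close(con)

# Remove all spaces, then drop the lines that are now empty.
FixedData <- gsub(" +", "", Lines)
FixedData <- FixedData[FixedData != ""]

# Extract the Location value from each cleaned line.
Location <- sub(".*Location=(.+):Time=.*", "\\1", FixedData)

print(FixedData)
print(Location)
```

Cleaning the lines before extracting the field means the blank entries never reach the regular expression, so no second filtering step is needed on the result.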