On 08-Sep-09 16:17:00, David Winsemius wrote:
> On Sep 8, 2009, at 12:00 PM, Lauri Nikkinen wrote:
>> Ok, I think that I have to give up and try to get this data separated
>> by some char. It seem pretty much impossible to separate those fields.
>> Thanks for your help and efforts.
>
> The solution that Henrique offered seems to be a complete one:
>
> read.table(textConnection(gsub("([0-9]+)", ";\\1;", "DF12 This is an example 1 This
> + DF12 This is an 1232 This is
> + DF14 This is 12334 This is an
> + DF15 This 23 This is an example
> + ")), sep = ";")
>   V1 V2                  V3    V4                 V5
> 1 DF 12 This is an example      1               This
> 2 DF 12         This is an   1232            This is
> 3 DF 14            This is  12334         This is an
> 4 DF 15               This     23 This is an example

Surely the above solution is ad-hoc? It is based on an assumption
that the fields alternate Text/Num/Text/Num/Text (hence the "gsub"
usage), and does not at all make use of the field-width information

  varlength <- c(2, 2, 18, 5, 18)

It simply puts a ";" separator at the start and end of every
sequence of digits.

If that is how Lauri's data really are organised, then the
solution could work. But, if not, ...

Ted.

> Versus what you wanted...
>
> structure(list(V1 = structure(c(1L, 1L, 1L, 1L), .Label = "DF",
> + class = "factor"),
> + V2 = c(12L, 12L, 14L, 15L), V3 = structure(c(4L, 3L, 2L,
> + 1L), .Label = c("This", "This is", "This is an",
> + "This is an example"), class = "factor"),
> + V4 = c(1L, 1232L, 12334L, 23L), V5 = structure(1:4,
> + .Label = c("This", "This is", "This is an", "This is an example"),
> + class = "factor")), .Names = c("V1", "V2", "V3", "V4", "V5"),
> + class = "data.frame", row.names = c(NA, -4L))
>   V1 V2                  V3    V4                 V5
> 1 DF 12 This is an example      1               This
> 2 DF 12         This is an   1232            This is
> 3 DF 14            This is  12334         This is an
> 4 DF 15               This     23 This is an example
>
> Unless you can be any clearer ... than you have been to this hour.
>
>>
>> -L
>>
>> 2009/9/8 Lauri Nikkinen <lauri.nikki...@iki.fi>:
>>> This is the file (see the attachment) that represents the problem I'm
>>> facing with the original file. I'm looking for some generic way to
>>> solve this problem. Thank you for your time.
>>>
>>> -L
>>>
>>> 2009/9/8 Barry Rowlingson <b.rowling...@lancaster.ac.uk>:
>>>> On Tue, Sep 8, 2009 at 1:52 PM, Lauri Nikkinen
>>>> <lauri.nikki...@iki.fi> wrote:
>>>>
>>>>> But this is not the solution I was looking for. Thanks.
>>>>
>>>> I think the only way you'll get the solution you are looking for is
>>>> if you can let us have a copy of the original input file, or at least
>>>> the first few lines - and not pasted into an email because special
>>>> characters like spaces and tabs get smushed up and confuse things.
>>>>
>>>
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 08-Sep-09                                       Time: 17:39:27
------------------------------ XFMail ------------------------------
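
For what it is worth, here is a rough sketch of the fixed-width route
hinted at in the reply above. It assumes the records really are laid
out according to the widths c(2, 2, 18, 5, 18) quoted in the thread.
The two records are made up for illustration (they are not Lauri's
data), and the helper name pad is hypothetical. The second record has
digits inside a text field, which is the case where the gsub() approach
would be expected to split the line into too many fields.

```r
# Field widths quoted in the thread
varlength <- c(2, 2, 18, 5, 18)

# Hypothetical records (not Lauri's data); the second one has digits
# inside a text field, breaking the Text/Num/Text/Num/Text assumption
recs <- list(
  c("DF", "12", "This is an example", "1",     "This"),
  c("DF", "14", "Room 101 is free",   "12334", "This is an")
)

# Left-justify and pad each field to its width, then glue into one line
pad <- function(r) {
  paste(sprintf(paste0("%-", varlength, "s"), r), collapse = "")
}
lines <- vapply(recs, pad, character(1))

# Fixed-width read: fields are cut by position, so digits inside a
# text field do no harm
fw <- read.fwf(textConnection(lines), widths = varlength,
               strip.white = TRUE, stringsAsFactors = FALSE)
print(fw)

# For comparison: number of ";"-separated fields the gsub() approach
# would produce for each line
n_fields <- lengths(strsplit(gsub("([0-9]+)", ";\\1;", lines),
                             ";", fixed = TRUE))
print(n_fields)
```

With these made-up records, read.fwf() should give five columns for
both lines, while the gsub() route gives five fields for the first line
and seven for the second. Whether this matches the real file depends
entirely on whether the real fields are padded to those widths.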