On 18/01/16 10:48, Uwe Ligges wrote:
This is not a tab delimited file (as you apparently assume given the
code), but a fixed width format, hence I'd try:

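# The file is fixed width, so give the column widths explicitly.
url <- "http://data.princeton.edu/wws509/datasets/divorce.dat"
widths <- c(9, 13, 10, 8, 10, 6)
# Read the body, skipping the header line.
f5 <- read.fwf(url, widths = widths, skip = 1, strip.white = TRUE)

# Read the header line with the same widths and use it for the column names.
names(f5) <- as.character(unlist(read.fwf(url, widths = widths,
    strip.white = TRUE, n = 1)))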

Not sure why reading it simply with header=TRUE does not work, but no
time to investigate this now.

Dear Uwe,

I have fiddled around a bit, and the situation looks to me like a bug in read.fwf. It seems that for header=TRUE to work, the entries of the header need to be separated by the sep delimiter, which defaults to "\t". In the case in question the entries are separated by blanks, so presumably the header gets read in as a single entry rather than six, leading to a mismatch between the length of the header and the number of columns.

It seems that the specified widths get ignored when the header line is dealt with.
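
The mismatch can be reproduced without the divorce data. What follows is a minimal sketch on a small made-up fixed-width file; the file contents, the widths, and the variable names are invented for illustration, and the only assumption about read.fwf is that the header line is split on sep rather than cut by widths.

```r
# Made-up fixed-width data: columns of width 4, 10 and 5, header padded with blanks.
widths <- c(4, 10, 5)
lines <- c(
  sprintf("%-4s%-10s%-5s", "id", "name", "score"),
  sprintf("%-4s%-10s%-5s", "1", "Ann Lee", "3.5"),
  sprintf("%-4s%-10s%-5s", "2", "Bob", "4.0")
)
tmp <- tempfile(fileext = ".dat")
writeLines(lines, tmp)

# With the default sep = "\t" the header contains no tab, so it counts as a
# single name while the body has three columns; the call is expected to fail.
res <- tryCatch(
  read.fwf(tmp, widths = widths, header = TRUE),
  error = function(e) e
)
print(res)
```

The error is caught with tryCatch so that the script keeps running and the condition object can be inspected.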

It also seems that if one specifies sep="" then the header gets read correctly, but strings of blanks then get interpreted as field separators throughout, so blanks within the fields result in the wrong number of columns.
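
The sep="" behaviour can be illustrated the same way. This sketch again uses an invented three-column file in which one field ("Ann Lee") contains a blank; the data and names are assumptions made purely for the example.

```r
# Made-up fixed-width data; the name field of the first record contains a blank.
widths <- c(4, 10, 5)
lines <- c(
  sprintf("%-4s%-10s%-5s", "id", "name", "score"),
  sprintf("%-4s%-10s%-5s", "1", "Ann Lee", "3.5"),
  sprintf("%-4s%-10s%-5s", "2", "Bob", "4.0")
)
tmp <- tempfile(fileext = ".dat")
writeLines(lines, tmp)

# Reference result: skip the header so that only the widths determine the columns.
ok <- read.fwf(tmp, widths = widths, skip = 1, strip.white = TRUE)

# With sep = "" the header splits on blanks, but so does "Ann Lee", so the
# first record has four fields instead of three. The call is expected either
# to fail or to return misaligned columns.
bad <- tryCatch(
  read.fwf(tmp, widths = widths, header = TRUE, sep = ""),
  error = function(e) e
)
str(ok)
print(bad)
```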

I think that the code of read.fwf is easy enough to fix; a slight adjustment will make the header get treated the same way as the body of the file.

I don't see any problems or drawbacks with doing so, and experimenting with my modified function resulted in the divorce data being read in with header=TRUE with no problems.
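
The idea of treating the header like the body can be sketched without touching read.fwf itself. The wrapper below is not the modified function mentioned above; read_fwf_header is a hypothetical name, the test file is made up, and the sketch assumes positive widths and one line per record.

```r
# Hypothetical helper (not part of utils): cut the header line with the same
# widths as the body instead of splitting it on sep.
read_fwf_header <- function(file, widths, ...) {
  header_line <- readLines(file, n = 1L)
  last  <- cumsum(widths)          # end position of each field
  first <- last - widths + 1       # start position of each field
  col_names <- trimws(substring(header_line, first, last))
  body <- read.fwf(file, widths = widths, skip = 1, strip.white = TRUE, ...)
  names(body) <- col_names
  body
}

# Made-up fixed-width data for a quick check.
widths <- c(4, 10, 5)
lines <- c(
  sprintf("%-4s%-10s%-5s", "id", "name", "score"),
  sprintf("%-4s%-10s%-5s", "1", "Ann Lee", "3.5"),
  sprintf("%-4s%-10s%-5s", "2", "Bob", "4.0")
)
tmp <- tempfile(fileext = ".dat")
writeLines(lines, tmp)

df <- read_fwf_header(tmp, widths)
str(df)
```

Because the names come from substring() on the raw header line, blanks inside a field never act as separators, either in the header or in the body.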

If this mod is made, I see no reason to keep the "sep" argument in read.fwf --- except maybe for backward compatibility issues, and I don't think there would be any since it never worked properly anyhow.

cheers,

Rolf

P. S. I can send you my modified version of read.fwf off-list if this would be of any use to you.

R.

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
