On 2024-09-07 4:52 p.m., Jeff Newmiller via R-help wrote:
When you specify LE in the encoding type, you are logically telling the decoder 
that you know the two-byte pairs are in little-endian order... which could 
override whatever the byte-order-mark was indicating. If the BOM indicated 
big-endian then the file decoding would break. If there is a BOM, don't 
override it unless you have to (e.g. for a wrong BOM)... leave off the LE 
unless you really need it.

That sounds like good advice, but it doesn't work:

 > read.delim(
 +     'https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt',
 +     fileEncoding = "UTF-16"
 + )
[1] time
[2] vendor.洀攀琀愀氀........㐀㐀........㜀.㐀㐀........㤀.㐀㐀.㐀..㐀.....㐀..㐀..㔀...㜀.㐀..㠀..㘀...㠀.㐀㐀....㜀...㔀.㐀㐀.

and so on.
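
One way to see what the BOM actually says before picking an encoding is to look at the first two bytes of the file. The following is only a sketch: `detect_utf16_bom` is a hypothetical helper name, it expects a local path (a remote file would first need to be saved locally, e.g. with `download.file(..., mode = "wb")`), and the demonstration file is made up.

```r
# Hypothetical helper: report the byte order implied by a UTF-16 BOM, if any.
detect_utf16_bom <- function(path) {
  con <- file(path, open = "rb")          # binary mode, no decoding
  on.exit(close(con))
  first <- readBin(con, what = "raw", n = 2L)
  if (length(first) < 2L) return(NA_character_)
  if (identical(first, as.raw(c(0xff, 0xfe)))) return("UTF-16LE")
  if (identical(first, as.raw(c(0xfe, 0xff)))) return("UTF-16BE")
  NA_character_                           # no UTF-16 BOM found
}

# Self-contained demonstration on a small temporary file (made-up content):
# BOM FF FE followed by "t" and a newline, each as a little-endian 2-byte unit.
tmp <- tempfile(fileext = ".txt")
writeBin(as.raw(c(0xff, 0xfe, 0x74, 0x00, 0x0a, 0x00)), tmp)
bom <- detect_utf16_bom(tmp)
print(bom)
```

The `<ff><fe>` shown in the original error message corresponds to the first branch, i.e. a little-endian BOM.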

On September 7, 2024 1:22:23 PM PDT, Enrico Schumann <e...@enricoschumann.net> wrote:
On Sun, 08 Sep 2024, Christofer Bogaso writes:

Hi,

I am trying to read the data from
https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt
without any success. Below is the error I am getting:

read.delim('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt')

Error in make.names(col.names, unique = TRUE) :
   invalid multibyte string at '<ff><fe>t'
In addition: Warning messages:
1: In read.table(file = file, header = header, sep = sep, quote = quote,  :
   line 1 appears to contain embedded nulls
2: In read.table(file = file, header = header, sep = sep, quote = quote,  :
   line 2 appears to contain embedded nulls
3: In read.table(file = file, header = header, sep = sep, quote = quote,  :
   line 3 appears to contain embedded nulls
4: In read.table(file = file, header = header, sep = sep, quote = quote,  :
   line 4 appears to contain embedded nulls
5: In read.table(file = file, header = header, sep = sep, quote = quote,  :
   line 5 appears to contain embedded nulls

Is there any way to read this data directly into R?

Thanks for your time


The <ff><fe> looks like a byte-order mark
(https://en.wikipedia.org/wiki/Byte_order_mark).
Try this:

    fn <- file('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt',
               encoding = "UTF-16LE")
    read.delim(fn)
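
The suggestion above can also be tried offline. The sketch below is not the original poster's data: it writes a tiny made-up tab-delimited file as UTF-16LE with a leading FF FE byte-order mark, then reads it back through a connection that declares `encoding = "UTF-16LE"`. The names `txt`, `body`, `tmp` and `dat` are illustrative only, and the result may depend on the platform's iconv.

```r
# Build a tiny UTF-16LE file with a BOM (FF FE); the contents are made up.
txt  <- "time\tvendor\n1\t2\n"
# For ASCII text, each UTF-16LE code unit is the character byte followed by 0x00.
body <- as.raw(c(rbind(utf8ToInt(txt), 0L)))
tmp  <- tempfile(fileext = ".txt")
writeBin(c(as.raw(c(0xff, 0xfe)), body), tmp)

# Read through a connection that states the encoding explicitly.
# read.delim() opens the connection and closes it again when done.
con <- file(tmp, encoding = "UTF-16LE")
dat <- read.delim(con)
print(names(dat))
print(dat)
```

If the column names come back clean, the declared encoding matches the file; names with stray characters in front of the first column would suggest the BOM was not consumed.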



______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
