I tried it with R 4.4.1 on Linux Mint 21.3 just before I posted, and just now with R 3.4.2 on Ubuntu 16.04 and R 4.3.2 on Windows 11. It works on all of them.
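For anyone who wants to check what a file actually starts with on their own machine, here is a rough sketch rather than anything definitive. It reads the first few bytes and maps a recognised byte-order mark to an encoding name that file() accepts. The helper name detect_bom and the small temporary file standing in for employee.txt are made up for illustration, and UTF-32 marks are deliberately not handled.

```r
# Hypothetical helper (not from the thread): report the encoding implied by a
# file's byte-order mark, or NA if the file does not start with one.
# Assumption: only UTF-8 and UTF-16 marks matter here; UTF-32 is ignored.
detect_bom <- function(path) {
  b <- readBin(path, what = "raw", n = 3L)
  starts_with <- function(sig) {
    length(b) >= length(sig) && identical(b[seq_along(sig)], as.raw(sig))
  }
  if (starts_with(c(0xEF, 0xBB, 0xBF))) return("UTF-8-BOM")
  if (starts_with(c(0xFF, 0xFE))) return("UTF-16LE")
  if (starts_with(c(0xFE, 0xFF))) return("UTF-16BE")
  NA_character_
}

# Build a tiny stand-in file: the <ff><fe> mark followed by UTF-16LE text.
tmp <- tempfile(fileext = ".txt")
payload <- iconv("time\tvendor\n1\tA\n", from = "UTF-8", to = "UTF-16LE",
                 toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xFF, 0xFE)), payload), tmp)

# Let the mark pick the encoding instead of hard-coding it.
enc <- detect_bom(tmp)
print(enc)
dat <- read.delim(file(tmp, encoding = enc))
print(dat)
```

If detect_bom() returns NA there is no mark to go on, and the encoding has to be chosen some other way.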
I don't have a big-endian machine to test on, but the Unicode spec says to honor the BOM and, if there isn't one, to assume that the data is big-endian. In this case there is a BOM, so perhaps your machine has a buggy decoder?

On September 7, 2024 2:43:24 PM PDT, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>On 2024-09-07 4:52 p.m., Jeff Newmiller via R-help wrote:
>> When you specify LE in the encoding type, you are logically telling the
>> decoder that you know the two-byte pairs are in little-endian order... which
>> could override whatever the byte-order-mark was indicating. If the BOM
>> indicated big-endian then the file decoding would break. If there is a BOM,
>> don't override it unless you have to (e.g. for a wrong BOM)... leave off the
>> LE unless you really need it.
>
>That sounds like good advice, but it doesn't work:
>
>  > read.delim(
>  +   'https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt',
>  +   fileEncoding = "UTF-16"
>  + )
>  [1] time
>  [2] vendor.洀攀琀愀氀........㐀㐀........㜀.㐀㐀........㤀.㐀㐀.㐀..㐀.....㐀..㐀..㔀...㜀.㐀..㠀..㘀...㠀.㐀㐀....㜀...㔀.㐀㐀.
>
>and so on.
>>
>> On September 7, 2024 1:22:23 PM PDT, Enrico Schumann
>> <e...@enricoschumann.net> wrote:
>>> On Sun, 08 Sep 2024, Christofer Bogaso writes:
>>>
>>>> Hi,
>>>>
>>>> I am trying to read the data from
>>>> https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt
>>>> without any success.
>>>> Below is the error I am getting:
>>>>
>>>>> read.delim('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt')
>>>>
>>>> Error in make.names(col.names, unique = TRUE) :
>>>>   invalid multibyte string at '<ff><fe>t'
>>>> In addition: Warning messages:
>>>> 1: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>>>   line 1 appears to contain embedded nulls
>>>> 2: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>>>   line 2 appears to contain embedded nulls
>>>> 3: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>>>   line 3 appears to contain embedded nulls
>>>> 4: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>>>   line 4 appears to contain embedded nulls
>>>> 5: In read.table(file = file, header = header, sep = sep, quote = quote, :
>>>>   line 5 appears to contain embedded nulls
>>>>
>>>> Is there any way to read this data directly onto R?
>>>>
>>>> Thanks for your time
>>>>
>>>
>>> The <ff><fe> looks like a byte-order mark
>>> (https://en.wikipedia.org/wiki/Byte_order_mark).
>>> Try this:
>>>
>>> fn <-
>>> file('https://online.stat.psu.edu/onlinecourses/sites/stat501/files/ch15/employee.txt',
>>> encoding = "UTF-16LE")
>>> read.delim(fn)
>>>
>>
>
--
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.