Have you considered reading the file in as binary/raw, finding the offending character, replacing it with a blank (or whatever), and then writing the file back out? You can then probably process it using read.table.
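Something along these lines might work as a starting point. It is only a sketch: I am assuming the stray character is byte 0x1A (Ctrl-Z, the old DOS end-of-file marker, which shows up as a right arrow in the DOS code page), so check the actual byte in your files first. The function name clean_file and the temporary demo file are made up for illustration.

```r
# Hypothetical helper: replace one offending byte in a file and write a cleaned copy.
# ASSUMPTION: the bad character is byte 0x1A (Ctrl-Z); change 'bad' if yours differs.
clean_file <- function(infile, outfile,
                       bad = as.raw(0x1a), replacement = as.raw(0x20)) {
  size  <- file.info(infile)$size
  bytes <- readBin(infile, what = "raw", n = size)  # whole file as raw bytes
  hits  <- bytes == bad                             # where the offending byte occurs
  bytes[hits] <- replacement                        # replace it with a blank
  writeBin(bytes, outfile)                          # write the cleaned copy
  invisible(sum(hits))                              # number of bytes replaced
}

# Small self-contained demo on a made-up file containing one 0x1A byte.
tmp_in  <- tempfile(fileext = ".dat")
tmp_out <- tempfile(fileext = ".dat")
writeBin(c(charToRaw("1,a\n2,b"), as.raw(0x1a), charToRaw("c\n3,d\n")), tmp_in)

n_fixed <- clean_file(tmp_in, tmp_out)
dat <- read.csv(tmp_out, header = FALSE)
```

For thousands of files you could loop over the result of list.files() and call the helper on each one, writing the cleaned copies to a separate directory so the originals stay untouched.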

On Thu, Mar 4, 2010 at 12:50 PM, jonas garcia <garcia.jona...@googlemail.com> wrote:
> Thank you so much for your reply.
>
> I can identify the characters very easily in a couple of files. The reason I
> am worried is that I have thousands of files to read in. The files were
> produced in a very old MS-DOS software that records information on
> oceanographic data and geographic position during a survey.
>
> My main goal is to read all these files into R for further analysis. Most of
> the files are cleared of these EOL markers but some are not. I only noticed
> the problem by chance when I was looking and comparing one of them. I wonder
> if I can solve this problem using R, without having to go for text editors
> separately.
>
> Help on this would be much appreciated.
>
> Thanks again
>
> J
>
> On 3/4/10, David Winsemius <dwinsem...@comcast.net> wrote:
>>
>> On Mar 3, 2010, at 2:22 PM, jonas garcia wrote:
>>
>>> Dear R users,
>>>
>>> I am trying to read a huge file in R. For some reason, only a part of the
>>> file is read. When I further investigated, I found that in one of my
>>> non-numeric columns, there is one odd character responsible for this,
>>> which I reproduce below:
>>>
>>> In case you cannot see it, it looks like a right arrow, but it is not the
>>> one you get from Microsoft Word in menu "insert symbol".
>>>
>>> I think my dat file is broken and that funny character is an EOL marker
>>> that makes R not read the rest of the file. I am sure the character is
>>> there by chance but I fear that it might be present in some other big
>>> files I have to work with as well. So, is there any clever way to remove
>>> this inconvenient character in R avoiding having to edit the file in
>>> notepad and remove it manually?
>>>
>>> Code I am using:
>>>
>>> read.csv("new3.dat", header=F)
>>>
>>> Warning message:
>>> In read.table(file = file, header = header, sep = sep, quote = quote, :
>>>   incomplete final line found by readTableHeader on 'new3.dat'
>>
>> I think you should identify the offending line by using the count.fields
>> function and fix it with an editor.
>>
>> --
>> David
>>
>>> I am working with R 2.10.1 in Windows XP.
>>>
>>> Thanks in advance
>>>
>>> Jonas
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?