On Sun, 18 Sep 2016, 19:04 Philippe de Rochambeau <phi...@free.fr> wrote:
> Please find below code that attempts to read ints, longs and floats from a
> binary file (which is a simplification of my original program).
> Please disregard the R inefficiencies, such as using rbind, for now.
> I’ve also included Java code to generate the binary file.
> The output shows that, at one point, anInt becomes undefined.
> Unfortunately, I couldn’t find the correct R function to determine whether
> anInt is undefined or not, as is.null, is.nan, and is.infinite don’t work.
> Any help would be much appreciated.
> Many thanks in advance.
> Philippe
>
> -------
> [1] "anInt = 1"
> [1] "is.null FALSE"
> [1] "is.nan FALSE"
> [1] "is.infinite FALSE"
> [1] "aLong = 2"
> [1] "aFloat = 3.44440007209778"
> [1] "--------------------------"
> [1] "anInt = 2"
> [1] "is.null FALSE"
> [1] "is.nan FALSE"
> [1] "is.infinite FALSE"
> [1] "aLong = 22"
> [1] "aFloat = 13.4644002914429"
> [1] "--------------------------"
> [1] "anInt = 3"
> [1] "is.null FALSE"
> [1] "is.nan FALSE"
> [1] "is.infinite FALSE"
> [1] "aLong = 55"
> [1] "aFloat = 45.4444007873535"
> [1] "--------------------------"
> [1] "anInt = "
> [1] "is.null FALSE"
> [1] "is.nan "
> [1] "is.infinite "
> [1] "aLong = "
> [1] "aFloat = "
> [1] "--------------------------"
>      [,1]      [,2]      [,3]
> [1,] 1         2         3.4444
> [2,] 2         22        13.4644
> [3,] 3         55        45.4444
> [4,] Integer,0 Integer,0 Numeric,0
>
> -----------
>
> ---------------------
>
> readFile <- function(inputPath) {
>   URL <- file(inputPath, "rb")
>   PLT <- matrix(nrow=0, ncol=3)
>   counte <- 0
>   max <- 4
>   while (counte < max) {
>     anInt <- readBin(con=URL, what=integer(), size=4, n=1, endian="big")
>     print(paste("anInt =", anInt))
>     #if (! (anInt == 0)) { print(paste("empty int")); break }
>     print(paste("is.null ", is.null(anInt)))
>     print(paste("is.nan ", is.nan(anInt)))
>     print(paste("is.infinite ", is.infinite(anInt)))
>     aLong <- readBin(URL, integer(), size=8, n=1, endian="big")
>     print(paste("aLong =", aLong))
>     aFloat <- readBin(URL, numeric(), size=4, n=1, endian="big")
>     print(paste("aFloat =", aFloat))
>     print("--------------------------")
>     PLT <- rbind(PLT, list(anInt, aLong, aFloat))
>     counte <- counte + 1
>   } # end while
>   close(URL)
>   PLT
> }
> fichier <- "/Users/philippe/Desktop/datatests/data0.bin"
> PLT2 <- readFile(fichier)
> print(PLT2)
>
> ---------------------
>
> import java.io.*;
>
> public class Main {
>
>     Main() {
>         writeData();
>     }
>
>     public static void main(String[] args) {
>         new Main();
>     }
>
>     public void writeData() {
>
>         final String path = "/Users/philippe/Desktop/datatests/data0.bin";
>
>         DataOutputStream dos;
>         try {
>             dos = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(path)));
>             // big endian write! ("high byte first"), see
>             // https://docs.oracle.com/javase/7/docs/api/java/io/DataOutputStream.html
>             dos.writeInt(1);
>             dos.writeLong(2L);
>             dos.writeFloat(3.4444F);
>
>             dos.writeInt(2);
>             dos.writeLong(22L);
>             dos.writeFloat(13.4644F);
>
>             dos.writeInt(3);
>             dos.writeLong(55L);
>             dos.writeFloat(45.4444F);
>
>             dos.close();
>         } catch (FileNotFoundException e) {
>             e.printStackTrace();
>         } catch (IOException ioe) {
>             ioe.printStackTrace();
>         }
>
>     }
>
> }
>
> ---------------------
>
> > On 17 Sep 2016, at 20:45, Philippe de Rochambeau <phi...@free.fr> wrote:
> >
> > Hi Jim,
> > this is exactly the answer I was looking for. Many thanks. I didn’t know R had a
> > pack function, as in Perl.
> > To answer your earlier question, I am trying to update legacy code to
> > read a binary file of unknown size, over a network, slice it up into rows
> > each containing an integer, an integer, a long, a short, a float and a
> > float, and stuff the rows into a matrix.
>

It's possible to read all rows fast as raw(), then parse in a vectorised
way with matrix indexing to group the bytes appropriately. There is an
example on the mailing list somewhere, but otherwise I can show an example
if that's of interest.

Cheers, Mike

> > Best regards,
> > Philippe
> >
> >> On 17 Sep 2016, at 20:38, jim holtman <jholt...@gmail.com> wrote:
> >>
> >> Here is an example of how to do it:
> >>
> >> x <- 1:10  # integer values
> >> xf <- seq(1.0, 2, by = 0.1)  # floating point
> >>
> >> setwd("d:/temp")
> >>
> >> # create file to write to
> >> output <- file('integer.bin', 'wb')
> >> writeBin(x, output)  # write integer
> >> writeBin(xf, output)  # write reals
> >> close(output)
> >>
> >>
> >> library(pack)
> >> library(readr)
> >>
> >> # read all the data at once
> >> allbin <- read_file_raw('integer.bin')
> >>
> >> # decode the data into a list
> >> (result <- unpack("V V V V V V V V V V d d d d d d d d d d", allbin))
> >>
> >>
> >> Jim Holtman
> >> Data Munger Guru
> >>
> >> What is the problem that you are trying to solve?
> >> Tell me what you want to do, not how you want to do it.
> >>
> >> On Sat, Sep 17, 2016 at 11:04 AM, Ismail SEZEN <sezenism...@gmail.com> wrote:
> >> I noticed the same issue but didn't care much :)
> >>
> >> On Sat, Sep 17, 2016, 18:01 jim holtman <jholt...@gmail.com> wrote:
> >> Your example was not reproducible. Also how do you "break" out of the
> >> "while" loop?
> >>
> >>
> >> Jim Holtman
> >> Data Munger Guru
> >>
> >> What is the problem that you are trying to solve?
> >> Tell me what you want to do, not how you want to do it.
> >>
> >> On Sat, Sep 17, 2016 at 8:05 AM, Philippe de Rochambeau <phi...@free.fr>
> >> wrote:
> >>
> >>> Hello,
> >>> the following function, which stores numeric values extracted from a
> >>> binary file, into an R matrix, is very slow, especially when the said file
> >>> is several MB in size.
> >>> Should I rewrite the function in inline C or in C/C++ using Rcpp? If the
> >>> latter case is true, how do you « readBin » in Rcpp (I’m a total Rcpp
> >>> newbie)?
> >>> Many thanks.
> >>> Best regards,
> >>> phiroc
> >>>
> >>>
> >>> -------------
> >>>
> >>> # inputPath is something like
> >>> # http://myintranet/getData?pathToFile=/usr/lib/xxx/yyy/data.bin
> >>>
> >>> PLTreader <- function(inputPath){
> >>>   URL <- file(inputPath, "rb")
> >>>   PLT <- matrix(nrow=0, ncol=6)
> >>>   compteurDePrints = 0
> >>>   compteurDeLignes <- 0
> >>>   maxiPrints = 5
> >>>   displayData <- FALSE
> >>>   while (TRUE) {
> >>>     periodIndex <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
> >>>     eventId <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
> >>>     dword1 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
> >>>     dword2 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
> >>>     if (dword1 < 0) {
> >>>       dword1 = dword1 + 2^32-1;
> >>>     }
> >>>     eventDate = (dword2*2^32 + dword1)/1000
> >>>     repNum <- readBin(URL, integer(), size=2, n=1, endian="little") # short (2 bytes)
> >>>     exp <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes, strangely enough, would expect 8)
> >>>     loss <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes)
> >>>     PLT <- rbind(PLT, c(periodIndex, eventId, eventDate, repNum, exp, loss))
> >>>   } # end while
> >>>   return(PLT)
> >>>   close(URL)
> >>> }
> >>>
> >>> ----------------
> >>> [[alternative HTML version deleted]]
> >>>
> >>> ______________________________________________
> >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.

-- 
Dr. Michael Sumner
Software and Database Engineer
Australian Antarctic Division
203 Channel Highway
Kingston Tasmania 7050 Australia
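
A minimal sketch of the approach suggested above (read everything as raw(), then decode the fields with matrix indexing), offered as one possible reading rather than the definitive method. It assumes the 16-byte big-endian record layout of the quoted Java example (int, long, float); the names parse_records, field and rec are hypothetical and do not come from the thread. It also touches the "undefined" anInt from the quoted output: the printed `Integer,0` indicates that readBin returned a zero-length vector once the file was exhausted, which length() can detect where is.null and is.nan cannot.

```r
# Hypothetical helper (not from the thread): decode fixed-width big-endian
# records of int32, int64 and float32, i.e. 16 bytes per record.
parse_records <- function(bytes, record_size = 16L) {
  n <- length(bytes) %/% record_size  # complete records only; a partial tail is ignored
  if (n == 0L) {
    return(cbind(anInt = integer(0), aLong = numeric(0), aFloat = numeric(0)))
  }
  # One column per record, one row per byte offset within the record.
  m <- bytes[seq_len(n * record_size)]
  dim(m) <- c(record_size, n)
  # Pull the same byte rows out of every record so each field decodes in one call.
  field <- function(rows) as.vector(m[rows, , drop = FALSE])

  an_int <- readBin(field(1:4), "integer", n = n, size = 4L, endian = "big")

  # R has no native 64-bit integer, so the long is rebuilt as a double from a
  # signed high word and an unsigned low word (exact only up to 2^53).
  hi <- readBin(field(5:8), "integer", n = n, size = 4L, endian = "big")
  lo <- colSums(matrix(as.numeric(field(9:12)), nrow = 4L) * 256^(3:0))
  a_long <- hi * 2^32 + lo

  a_float <- readBin(field(13:16), "numeric", n = n, size = 4L, endian = "big")

  cbind(anInt = an_int, aLong = a_long, aFloat = a_float)
}

# Build the same three records as the Java example, entirely in memory.
# rec() is a hypothetical test helper; the long is written as two 32-bit words.
rec <- function(i, hi, lo, f) c(
  writeBin(i, raw(), size = 4L, endian = "big"),
  writeBin(c(hi, lo), raw(), size = 4L, endian = "big"),
  writeBin(f, raw(), size = 4L, endian = "big")
)
bytes <- c(rec(1L, 0L, 2L, 3.4444),
           rec(2L, 0L, 22L, 13.4644),
           rec(3L, 0L, 55L, 45.4444))
print(parse_records(bytes))

# Reading past the end of the data yields a zero-length vector, not NULL or NaN.
past_end <- readBin(raw(0), "integer", n = 1L, size = 4L, endian = "big")
print(length(past_end) == 0L)
```

For a file on disk, the bytes could be obtained with readBin(path, "raw", n = file.size(path)); for a connection of unknown size, one option is to read raw chunks in a loop until a zero-length result comes back, then concatenate them before parsing.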