Hi,

I need to serialize a 20K x 20K matrix and save it as a binary file. This 
is significantly slower in R than in Python (4x slower).

I'm not sure how best to optimize the code below. Is it possible to 
parallelize the serialization step to improve performance?


  m <- 20000   # matrix dimension (m x m)
  n <- m^2     # total number of elements
  cat("Generating matrices ... \n")
  INI.TIME <- proc.time()
  A <- matrix(runif(n), ncol = m)
  END_GEN.TIME <- proc.time()
  arg_ser <- serialize(object = A, connection = NULL)

  END_SER.TIME <- proc.time()
  con <- file(description = "matrix_file", open = "wb")
  writeBin(object = arg_ser, con = con)
  close(con)
  END_WRITE.TIME <- proc.time()
  con <- file(description = "matrix_file", open = "rb")
  par_raw <- readBin(con, what = raw(), n = file.info("matrix_file")$size)
  END_READ.TIME <- proc.time()
  B <- unserialize(connection = par_raw)
  close(con)
  END_DES.TIME <- proc.time()
  # Report elapsed (wall-clock) time for each stage
  TIME <- END_GEN.TIME - INI.TIME
  cat("Generation time", TIME[3], "seconds.\n")

  TIME <- END_SER.TIME - END_GEN.TIME
  cat("Serialization time", TIME[3], "seconds.\n")

  TIME <- END_WRITE.TIME - END_SER.TIME
  cat("Writing time", TIME[3], "seconds.\n")

  TIME <- END_READ.TIME - END_WRITE.TIME
  cat("Read time", TIME[3], "seconds.\n")

  TIME <- END_DES.TIME - END_READ.TIME
  cat("Deserialize time", TIME[3], "seconds.\n")
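
For comparison, here is a minimal sketch of the same round trip without the intermediate raw vector. It is not benchmarked and makes several assumptions: the small size m = 200, the tempfile() paths, and the choice of xdr = FALSE (native byte order) are illustrative only. Option 1 serializes straight to a binary connection. Option 2 writes the dimensions and the raw doubles with writeBin(). At the full 20000 x 20000 size, a single writeBin() call may hit size limits in some R versions, so chunked writes might be needed.

```r
# Sketch: round-trip a numeric matrix without building a raw vector first.
m <- 200L                               # small size for illustration only
A <- matrix(runif(m * m), ncol = m)

# Option 1: serialize directly to a binary file connection.
path_ser <- tempfile("matrix_ser")      # hypothetical temporary path
con <- file(path_ser, open = "wb")
serialize(A, connection = con, xdr = FALSE)  # native byte order, no XDR conversion
close(con)

con <- file(path_ser, open = "rb")
B <- unserialize(con)
close(con)

# Option 2: write the dimensions, then the raw doubles.
path_bin <- tempfile("matrix_bin")      # hypothetical temporary path
con <- file(path_bin, open = "wb")
writeBin(dim(A), con)                   # two integers: nrow, ncol
writeBin(as.vector(A), con)             # column-major doubles
close(con)

con <- file(path_bin, open = "rb")
d <- readBin(con, what = integer(), n = 2L)
C <- matrix(readBin(con, what = double(), n = prod(d)),
            nrow = d[1], ncol = d[2])
close(con)
```

Both variants should reproduce the original matrix exactly. Whether either one narrows the gap with Python would need to be measured on the full-size matrix.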




Best,
--Sameh

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel