Re: [Rcpp-devel] efficient ingestion of "sparse csv"

2021-05-10 Thread Vincent Carey
Thanks Dirk, lots of useful information there. I wonder whether the sparse ingestion problem would best be solved with multiple passes -- it seems one would want to learn the dimensions and the number of nonzero elements per row to allocate the index vectors, and then populate them and the data ve
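The multiple-pass idea above could be sketched roughly as follows in plain C++, without Rcpp. This is only an illustration under stated assumptions: the input is purely numeric, comma-delimited, has no header or quoting, and sits in a seekable stream. The names `CsrMatrix`, `for_each_nonzero` and `read_sparse_csv_two_pass` are hypothetical and are not taken from the thread or from any package. The first pass learns the dimensions and the cumulative nonzero count per row; the second pass fills index and data vectors that are allocated exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Compressed sparse row storage (hypothetical struct for illustration).
struct CsrMatrix {
    std::size_t nrow = 0, ncol = 0;
    std::vector<std::size_t> row_ptr;  // length nrow + 1, cumulative nonzero counts
    std::vector<std::size_t> col_idx;  // length nnz, zero-based column indices
    std::vector<double> values;        // length nnz
};

// Calls f(col, value) for each nonzero field of one record and
// returns the number of fields seen. Empty fields are treated as zero.
template <typename F>
std::size_t for_each_nonzero(const std::string& line, F f) {
    std::istringstream fields(line);
    std::string field;
    std::size_t col = 0;
    while (std::getline(fields, field, ',')) {
        const double v = field.empty() ? 0.0 : std::stod(field);
        if (v != 0.0) f(col, v);
        ++col;
    }
    return col;
}

CsrMatrix read_sparse_csv_two_pass(std::istream& in) {
    CsrMatrix m;
    std::string line;
    std::size_t nnz = 0;
    m.row_ptr.push_back(0);

    // Pass 1: learn the dimensions and the nonzero count per row.
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        const std::size_t width =
            for_each_nonzero(line, [&](std::size_t, double) { ++nnz; });
        if (width > m.ncol) m.ncol = width;
        m.row_ptr.push_back(nnz);
        ++m.nrow;
    }

    // Allocate the index and data vectors once, now that nnz is known.
    m.col_idx.resize(nnz);
    m.values.resize(nnz);

    // Pass 2: rewind the stream and populate the vectors.
    in.clear();
    in.seekg(0);
    std::size_t k = 0;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        for_each_nonzero(line, [&](std::size_t col, double v) {
            m.col_idx[k] = col;
            m.values[k] = v;
            ++k;
        });
    }
    assert(k == nnz);  // both passes must agree on the nonzero count
    return m;
}
```

The trade-off is that the file is parsed twice in exchange for exact allocation; a single pass with growing vectors avoids the second read but may reallocate along the way.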

Re: [Rcpp-devel] efficient ingestion of "sparse csv"

2021-05-10 Thread Dirk Eddelbuettel
Vincent, In the broad terms of the question the best answer may be a simple "sure". More seriously, there have been many approaches. Consider for example the recent Rcpp Gallery post led by Zach (with some edits by me): https://gallery.rcpp.org/articles/sparse-matrix-class/ It's focus on not

[Rcpp-devel] efficient ingestion of "sparse csv"

2021-05-10 Thread Vincent Carey
This problem has been discussed in various places but I don't see a clear solution. Certain applications are generating large comma-delimited files with mostly zero entries. The aim is to ingest efficiently, converting to sparse representation a record at a time. Presumably a triplet format woul
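A record-at-a-time conversion to triplet form might look something like the sketch below. It is plain C++ rather than Rcpp, and it assumes numeric, unquoted, comma-delimited records with no header. `Triplets` and `append_record` are hypothetical names introduced here for illustration only. Indices are zero-based, so they would need shifting before being handed to an R-side constructor that expects one-based indices.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Triplet (coordinate) storage: one (i, j, x) entry per nonzero value.
// Hypothetical struct for illustration.
struct Triplets {
    std::vector<std::size_t> i;  // zero-based row indices
    std::vector<std::size_t> j;  // zero-based column indices
    std::vector<double> x;       // nonzero values
    std::size_t nrow = 0, ncol = 0;
};

// Parses one comma-delimited record and appends only its nonzero fields,
// so zero entries are never stored. Empty fields are treated as zero.
void append_record(Triplets& t, const std::string& record) {
    std::istringstream fields(record);
    std::string field;
    std::size_t col = 0;
    while (std::getline(fields, field, ',')) {
        const double v = field.empty() ? 0.0 : std::stod(field);
        if (v != 0.0) {
            t.i.push_back(t.nrow);
            t.j.push_back(col);
            t.x.push_back(v);
        }
        ++col;
    }
    if (col > t.ncol) t.ncol = col;
    ++t.nrow;
    // The three vectors always grow together.
    assert(t.i.size() == t.j.size() && t.j.size() == t.x.size());
}
```

Calling `append_record` once per line read from the file keeps memory proportional to the number of nonzero entries rather than to the full dense dimensions.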