Yes, I'm running my code with the --release flag.

I've been looking everywhere but I can't find a way to make the writing
faster. I don't know whether it's a mistake I'm making with the structs or
whether the Parquet crate itself needs optimization.

Fernando,

On Thu, Jan 28, 2021 at 12:02 PM Andrew Lamb <al...@influxdata.com> wrote:

> The first thing I would check is that you are using a release build (`cargo
> build --release`)
>
> If you are, there may be additional optimizations needed in the Rust
> implementations
>
> Andrew
>
> On Thu, Jan 28, 2021 at 6:19 AM Fernando Herrera <
> fernando.j.herr...@gmail.com> wrote:
>
> > Hi,
> >
> > What is the writing speed that we should expect from the Arrow Parquet
> > writer?
> >
> > I'm writing a RecordBatch with two columns and 1,000,000 records and it
> > takes a lot of time to write the batch to the file (close to 2 secs).
> >
> > This is what I'm doing
> >
> > let schema = Schema::new(vec![
> > >     Field::new("col_1", DataType::Utf8, false),
> > >     Field::new("col_2", DataType::Utf8, false),
> > > ]);
> > > let batch = RecordBatch::try_new(
> > >     Arc::new(schema.clone()),
> > >     vec![Arc::new(array_1), Arc::new(array_2)],
> > > )
> > > .unwrap();
> > > let mut writer = ArrowWriter::try_new(file, Arc::new(schema.clone()),
> > > None).unwrap();
> > > writer.write(&batch).unwrap();
> > > writer.close().unwrap();
> >
> >
> >  I'm comparing a similar operation with Pandas and that is almost
> > immediate.
> >
> > Is there something I'm missing?
> >
> > Thanks,
> >
>
