Re: Comparing with Parquet

2016-02-26 Thread Sourav Mazumder

riginal Message- > From: Reynold Xin [mailto:r...@databricks.com] > Sent: Thursday, February 25, 2016 2:46 PM > To: dev@arrow.apache.org > Subject: Re: Comparing with Parquet > > To put it even more layman, on-disk formats are typically designed for > more permanent storage on di

Re: Comparing with Parquet

2016-02-25 Thread Venkat Krishnamurthy

but the performance couldn't support fast query. > > > > So for PB level data and interactively query(second level), both couldn't > > solve? > > > > Regards > > Liang > > -Original Message----- > > From: Henry Robinson [mailto:he...@cloudera.com] > >

Re: Comparing with Parquet

2016-02-25 Thread Pedro Miguel Duarte

rds > Liang > -Original Message- > From: Henry Robinson [mailto:he...@cloudera.com] > Sent: February 26, 2016 0:20 > To: dev@arrow.apache.org > Subject: Re: Comparing with Parquet > > Think of Parquet as a format well-suited to writing very large datasets to > disk, whereas Arrow is a for

Re: Comparing with Parquet

2016-02-25 Thread Jason Altekruse

] > Sent: Thursday, February 25, 2016 2:46 PM > To: dev@arrow.apache.org > Subject: Re: Comparing with Parquet > > To put it even more layman, on-disk formats are typically designed for > more permanent storage on disks/ssds, and as a result the format would want > to reduce the size,

RE: Comparing with Parquet

2016-02-25 Thread Andrew Brust

Also extremely helpful; thank you! -Original Message- From: Reynold Xin [mailto:r...@databricks.com] Sent: Thursday, February 25, 2016 2:46 PM To: dev@arrow.apache.org Subject: Re: Comparing with Parquet To put it even more layman, on-disk formats are typically designed for more

Re: Comparing with Parquet

2016-02-25 Thread Reynold Xin

t mostly around aligning things for > SIMD/vectorization? > > > > There is probably some ignorance in my question, but I'm comfortable > > with that. :-) > > > > -Original Message- > > From: Wes McKinney [mailto:w...@cloudera.com] > > Sent: Th

RE: Comparing with Parquet

2016-02-25 Thread Andrew Brust

That's extremely helpful, thank you Todd. (And nice to "see" you again. I interviewed you years ago.) -Original Message- From: Todd Lipcon [mailto:t...@cloudera.com] Sent: Thursday, February 25, 2016 2:23 PM To: dev@arrow.apache.org Subject: Re: Comparing with Parquet I

Re: Comparing with Parquet

2016-02-25 Thread Todd Lipcon

. :-) > > -Original Message- > From: Wes McKinney [mailto:w...@cloudera.com] > Sent: Thursday, February 25, 2016 12:12 PM > To: dev@arrow.apache.org > Subject: Re: Comparing with Parquet > > We wrote about this in a recent blog post: > > http://blog.cloudera.com/blog/

RE: Comparing with Parquet

2016-02-25 Thread Andrew Brust

inney [mailto:w...@cloudera.com] Sent: Thursday, February 25, 2016 12:12 PM To: dev@arrow.apache.org Subject: Re: Comparing with Parquet We wrote about this in a recent blog post: http://blog.cloudera.com/blog/2016/02/introducing-apache-arrow-a-fast-interoperable-in-memory-columnar-data-structure-sta

Re: Comparing with Parquet

2016-02-25 Thread Wes McKinney

We wrote about this in a recent blog post: http://blog.cloudera.com/blog/2016/02/introducing-apache-arrow-a-fast-interoperable-in-memory-columnar-data-structure-standard/ "Apache Parquet is a compact, efficient columnar data storage designed for storing large amounts of data stored in HDFS. Arrow

Re: Comparing with Parquet

2016-02-25 Thread Henry Robinson

Think of Parquet as a format well-suited to writing very large datasets to disk, whereas Arrow is a format most suited to efficient storage in memory. You might read Parquet files from disk, and then materialize them in memory in Arrow's format. Both formats are designed around the idiosyncras
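
The split described in this message (a compact format on disk, materialized into a directly addressable layout in memory) can be sketched roughly as follows. This is only an illustration using Python's standard library as a stand-in: it does not use the real Parquet or Arrow formats, zlib merely plays the role of a size-oriented on-disk encoding, array.array plays the role of a contiguous in-memory column, and the function names are hypothetical.

```python
import array
import zlib


def to_disk_format(values):
    """Pack 64-bit integers and compress them (hypothetical helper).

    The result is small, but individual values cannot be read
    without decoding the buffer first.
    """
    raw = array.array("q", values).tobytes()
    return zlib.compress(raw)


def to_memory_format(disk_bytes):
    """Decode once into a contiguous fixed-width column (hypothetical helper).

    The result is larger, but any element can be read by index
    with no further decoding.
    """
    column = array.array("q")
    column.frombytes(zlib.decompress(disk_bytes))
    return column


# A column with a long run of repeats followed by a simple sequence.
values = [7] * 10_000 + list(range(10_000))

on_disk = to_disk_format(values)        # compact representation for storage
in_memory = to_memory_format(on_disk)   # materialized for processing

compressed_size = len(on_disk)
materialized_size = in_memory.itemsize * len(in_memory)
```

Here the stored bytes are smaller than the materialized column, while only the materialized column supports direct indexed access such as in_memory[10_005].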


11 matches
