subject:"Manually reading parquet files."

Re: Manually reading parquet files.

2019-03-22 Thread Wenchen Fan

ix.com"
> *Date:* Thursday, March 21, 2019 at 3:32 PM
> *To:* "Long, Andrew"
> *Cc:* "dev@spark.apache.org", "u...@spark.apache.org", "horizon-...@amazon.com" <horizon-...@amazon.com>
> *Subject:* Re: Manually reading parquet f

Re: Manually reading parquet files.

2019-03-21 Thread Long, Andrew

o: "Long, Andrew"
Cc: "dev@spark.apache.org", "u...@spark.apache.org", "horizon-...@amazon.com"
Subject: Re: Manually reading parquet files.

You're getting InternalRow instances. They probably have the data you want, but the toString representation

Re: Manually reading parquet files.

2019-03-21 Thread Ryan Blue

You're getting InternalRow instances. They probably have the data you want, but the toString representation doesn't match the data for InternalRow.

On Thu, Mar 21, 2019 at 3:28 PM Long, Andrew wrote:
> Hello Friends,
>
> I’m working on a performance improvement that reads additional parquet
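As an illustration of the InternalRow point in the reply above, here is a minimal Scala sketch of reading values out of an InternalRow through its typed getters instead of relying on toString. The two-column schema, the object name, and the sample values are hypothetical and do not come from the thread. The classes under org.apache.spark.sql.catalyst are internal Spark APIs, so this should be read as a sketch that may need adjusting for a given Spark version, not as the approach the thread settled on.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}
import org.apache.spark.unsafe.types.UTF8String

object InternalRowSketch {
  // Hypothetical two-column schema, chosen only for illustration.
  val schema: StructType = StructType(Seq(
    StructField("id", IntegerType),
    StructField("name", StringType)))

  def main(args: Array[String]): Unit = {
    // Strings inside an InternalRow are stored as UTF8String, not java.lang.String.
    val row: InternalRow = InternalRow(1, UTF8String.fromString("a"))

    // Read individual fields by ordinal with the typed getters.
    val id: Int = row.getInt(0)
    val name: String = row.getUTF8String(1).toString

    // Or materialize every field at once, using the schema's data types.
    val values: Seq[Any] = row.toSeq(schema)

    println(s"id=$id name=$name values=$values")
  }
}
```

The getters need the ordinal and the expected type of each column, so the schema of the file being read has to be known wherever the rows are consumed.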

Manually reading parquet files.

2019-03-21 Thread Long, Andrew

Hello Friends,

I’m working on a performance improvement that reads additional parquet files in the middle of a lambda and I’m running into some issues. This is what I’d like to do:

ds.mapPartitions(x => {
  // read parquet file in and perform an operation with x
})

Here’s my current POC code but