Cheng: I only see user@spark in the CC. FYI
On Sun, Dec 6, 2015 at 8:01 PM, Cheng Lian <l...@databricks.com> wrote:

> cc parquet-dev list (it would be nice to always do so for these general
> questions.)
>
> Cheng
>
> On 12/6/15 3:10 PM, Shushant Arora wrote:
>
>> Hi
>>
>> I have a few questions about the Parquet file format.
>>
>> 1. Does Parquet keep min/max statistics like ORC does? How can I see the
>> Parquet version (whether it's 1.1, 1.2, or 1.3) of a Parquet file generated
>> using Hive, custom MR, or AvroParquetOutputFormat?
>>
> Yes, Parquet also keeps row group statistics. You may check the Parquet
> file using the parquet-meta CLI tool in parquet-tools (see
> https://github.com/Parquet/parquet-mr/issues/321 for details), then look
> for the "creator" field of the file. For programmatic access, check
> o.a.p.hadoop.metadata.FileMetaData.createdBy.
>
>>
>> 2. How do I sort Parquet records while generating a Parquet file using
>> AvroParquetOutputFormat?
>>
> AvroParquetOutputFormat is not a format. It's just responsible for
> converting Avro records to Parquet records. How are you using
> AvroParquetOutputFormat? Any example snippets?
>
>>
>> Thanks
>>
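
For anyone who wants to act on the "creator" / createdBy hint above, here is a minimal sketch of pulling the writer name and version out of that string. It assumes the string follows the layout suggested in the parquet-format Thrift definition, "<application> version <version> (build <hash>)". Not every writer is guaranteed to follow it, so unparseable input returns None. The names `parse_created_by` and `CreatedBy` are hypothetical helpers, not part of any Parquet library, and the sample string is made up for illustration. Note this only identifies the writer; it does not read the row group statistics themselves.

```python
import re
from typing import NamedTuple, Optional


class CreatedBy(NamedTuple):
    """Pieces of a Parquet created_by string (hypothetical helper type)."""
    application: str
    version: str
    build: Optional[str]


# Assumed layout: "<application> version <version> (build <hash>)".
# The "(build ...)" suffix is treated as optional.
_CREATED_BY = re.compile(
    r"^(?P<app>.+?)\s+version\s+(?P<version>\S+)"
    r"(?:\s+\(build\s+(?P<build>[^)]*)\))?\s*$"
)


def parse_created_by(created_by: str) -> Optional[CreatedBy]:
    """Split a created_by string into its parts, or return None if it does not fit."""
    match = _CREATED_BY.match(created_by.strip())
    if match is None:
        return None
    return CreatedBy(match["app"], match["version"], match["build"])


if __name__ == "__main__":
    # Made-up sample value; a real one comes from the file footer
    # (parquet-meta output or FileMetaData.createdBy).
    print(parse_created_by("parquet-mr version 1.6.0 (build abc123)"))
```

The string itself still has to come from the file footer, either via parquet-meta or FileMetaData.createdBy as described above.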