Re: Spark 1.3.1 and Parquet Partitions

2015-05-07 Thread in4maniac
Hi V, I am assuming that each of the three .parquet paths you mentioned has multiple partitions in it. For example: [/dataset/city=London/data.parquet/part-r-0.parquet, /dataset/city=London/data.parquet/part-r-1.parquet]. I haven't personally used this with "hdfs", but I've worked with a similar

Re: Spark 1.3.1 and Parquet Partitions

2015-05-07 Thread Yana Kadiyska
- Original message > From: Vaxuki > Date: 05/07/2015 7:38 AM (GMT-05:00) > To: Olivier Girardot > Cc: user@spark.apache.org > Subject: Re: Spark 1.3.1 and Parquet Partitions > > Olivier, > nope. Wildcard extensions don't work. I am debugging the code to figu

Re: Spark 1.3.1 and Parquet Partitions

2015-05-07 Thread yana
MT-05:00) To: Olivier Girardot Cc: user@spark.apache.org Subject: Re: Spark 1.3.1 and Parquet Partitions Olivier, nope. Wildcard extensions don't work. I am debugging the code to figure out what's wrong. I know I am using 1.3.1 for sure. Pardon typos... On May 7, 2015, at 7:06 AM, O

Re: Spark 1.3.1 and Parquet Partitions

2015-05-07 Thread Vaxuki
Olivier, nope. Wildcard extensions don't work. I am debugging the code to figure out what's wrong. I know I am using 1.3.1 for sure. Pardon typos... > On May 7, 2015, at 7:06 AM, Olivier Girardot wrote: > > "hdfs://some ip:8029/dataset/*/*.parquet" doesn't work for you? > >> On Thu, May 7, 2015

Re: Spark 1.3.1 and Parquet Partitions

2015-05-07 Thread Olivier Girardot
"hdfs://some ip:8029/dataset/*/*.parquet" doesn't work for you ? Le jeu. 7 mai 2015 à 03:32, vasuki a écrit : > Spark 1.3.1 - > i have a parquet file on hdfs partitioned by some string looking like this > /dataset/city=London/data.parquet > /dataset/city=NewYork/data.parquet > /dataset/city=Pari

Spark 1.3.1 and Parquet Partitions

2015-05-06 Thread vasuki
Spark 1.3.1 - I have a Parquet file on HDFS partitioned by some string, looking like this: /dataset/city=London/data.parquet /dataset/city=NewYork/data.parquet /dataset/city=Paris/data.parquet …. I am trying to load it using sqlContext, with sqlContext.parquetFile( "hdfs://some ip:8029/datas
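
For readers following the thread: the layout in question uses key=value directory names (city=London, city=NewYork, ...), and the idea behind partitioned reads is that the value of the partition column can be recovered from the directory name rather than from the file contents. The snippet below is only a minimal sketch of that idea in plain Python, not Spark's implementation. The helper name partition_values is hypothetical, the two London part files are the ones quoted in the reply above, and the NewYork part file name is assumed for illustration.

```python
def partition_values(path, base):
    """Return {column: value} for every key=value directory under base.

    Illustrative only: this mimics the naming convention discussed in
    the thread and is not how Spark itself discovers partitions.
    """
    # Drop the base directory so only the partition segments remain.
    rel = path[len(base):] if path.startswith(base) else path
    values = {}
    for segment in rel.strip("/").split("/"):
        if "=" in segment:
            key, _, value = segment.partition("=")
            values[key] = value
    return values


paths = [
    "/dataset/city=London/data.parquet/part-r-0.parquet",
    "/dataset/city=London/data.parquet/part-r-1.parquet",
    # Assumed file name; only the city=NewYork directory appears in the thread.
    "/dataset/city=NewYork/data.parquet/part-r-0.parquet",
]

# Collect the distinct values of the "city" partition column.
cities = sorted({partition_values(p, "/dataset")["city"] for p in paths})
print(cities)
```

Every part file under the same city=... directory maps to the same column value, which is why the directory layout matters when a single path or wildcard is passed to the reader.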