Hi V,
I am assuming that each of the three .parquet paths you mentioned has
multiple part files in it,
e.g. [/dataset/city=London/data.parquet/part-r-0.parquet,
/dataset/city=London/data.parquet/part-r-1.parquet].
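As an aside, the key=value directory names in paths like these are what Spark's partition discovery turns into partition columns. A minimal, Spark-free sketch of that parsing, using the example paths above (this is an illustration of the convention, not Spark's actual implementation):

```python
# Illustrative sketch of key=value partition parsing, as used by Spark's
# partition discovery on layouts like /dataset/city=London/data.parquet.
# Pure Python; no Spark required.
import os

def partition_values(path, base="/dataset"):
    """Extract partition columns encoded as key=value path segments."""
    rel = os.path.relpath(path, base)
    cols = {}
    for segment in rel.split(os.sep):
        if "=" in segment:
            key, _, value = segment.partition("=")
            cols[key] = value
    return cols

print(partition_values("/dataset/city=London/data.parquet/part-r-0.parquet"))
# -> {'city': 'London'}
```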
I haven't personally used this with "hdfs", but I've worked with a similar
- Original message
> From: Vaxuki
> Date:05/07/2015 7:38 AM (GMT-05:00)
> To: Olivier Girardot
> Cc: user@spark.apache.org
> Subject: Re: Spark 1.3.1 and Parquet Partitions
>
> Olivier
> Nope. Wildcard extensions don't work. I am debugging the code to figure out
> what's wrong. I know I am using 1.3.1 for sure.
> Pardon typos...
>
> On May 7, 2015, at 7:06 AM, Olivier Girardot wrote:
>
> "hdfs://some ip:8029/dataset/*/*.parquet" doesn't work for you ?
>
>> On Thu, 7 May 2015 at 03:32, vasuki wrote:
> Spark 1.3.1 -
> I have a parquet file on HDFS, partitioned by a string key, looking like this:
> /dataset/city=London/data.parquet
> /dataset/city=NewYork/data.parquet
> /dataset/city=Paris/data.parquet
> ...
> I am trying to load it using sqlContext.parquetFile(
> "hdfs://some ip:8029/datas
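When the "*" wildcard misbehaves, one workaround is to expand the partition directories yourself and pass the explicit list of paths. If memory serves, PySpark 1.3's sqlContext.parquetFile accepted multiple path arguments, but treat that as an assumption worth checking. Below is a Spark-free sketch of the expansion step against a local mirror of the layout; directory names are hypothetical:

```python
# Expand a partitioned layout to explicit paths instead of relying on a
# "*" wildcard. Builds a temporary local mirror of the HDFS layout from
# the question; directory names are hypothetical.
import glob
import os
import tempfile

root = tempfile.mkdtemp()
for city in ("London", "NewYork", "Paris"):
    os.makedirs(os.path.join(root, "dataset", "city=%s" % city, "data.parquet"))

paths = sorted(glob.glob(os.path.join(root, "dataset", "*", "data.parquet")))
print(len(paths))  # -> 3

# With the explicit list in hand, the Spark 1.3 call would look roughly
# like the following (assumption: varargs form in PySpark 1.3):
#   df = sqlContext.parquetFile(*paths)
```

On a real cluster the same enumeration would go through the HDFS client (e.g. `hdfs dfs -ls`) rather than local glob; the point is only that an explicit path list sidesteps whatever is breaking wildcard resolution.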