I'd open an issue on the GitHub repo to ask us to let you use Hadoop's glob pattern syntax for the path.
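In the meantime, one workaround might be to expand the glob yourself with Hadoop's FileSystem API and union the per-path loads. Untested sketch, assuming sc and sqlContext are already in scope, the usual com.databricks.spark.avro._ import, and a made-up path:

import org.apache.hadoop.fs.{FileSystem, Path}
import com.databricks.spark.avro._  // adds avroFile to SQLContext

val fs = FileSystem.get(sc.hadoopConfiguration)
// expand the Hadoop glob into concrete file paths
val paths = fs.globStatus(new Path("hdfs:///data/2015-01-*/part-*.avro")).map(_.getPath.toString)
// load each path and union the resulting SchemaRDDs
val records = paths.map(p => sqlContext.avroFile(p)).reduce((a, b) => a.unionAll(b))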
On Thu, Jan 15, 2015 at 4:57 AM, David Jones wrote:
I've tried this now. Spark can load multiple Avro files from the same directory by passing the path to that directory. However, passing multiple paths separated by commas didn't work.
Is there any way to load all Avro files in multiple directories using sqlContext.avroFile?
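For reference, this is the kind of thing I tried (paths are just examples, not my real ones):

sqlContext.avroFile("hdfs:///data/2015-01-14")                          // a single directory works
sqlContext.avroFile("hdfs:///data/2015-01-14,hdfs:///data/2015-01-15")  // comma-separated paths didn't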
On Wed, Jan 14, 2015 at
Should I be able to pass multiple paths separated by commas? I haven't tried it, but I didn't think it'd work. I'd expected a function that accepted a list of strings.
On Wed, Jan 14, 2015 at 3:20 PM, Yana Kadiyska wrote:
If the wildcard path you have doesn't work, you should probably open a bug -- I had a similar problem with Parquet, and it was a bug which recently got closed. Not sure if sqlContext.avroFile shares a codepath with .parquetFile... you can try running with bits that have the fix for .parquetFile or look
Hi,
I have a program that loads a single Avro file using Spark SQL, queries it, transforms it, and then outputs the data. The file is loaded with:
val records = sqlContext.avroFile(filePath)
records.registerTempTable("data")
...
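The querying and output steps are just the usual SQL-then-save pattern, roughly like this (the column name and output path below are placeholders, not the real ones):

val result = sqlContext.sql("SELECT someField, COUNT(*) FROM data GROUP BY someField")
result.map(row => row.mkString("\t")).saveAsTextFile("hdfs:///out/report")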
Now I want to run it over tens of thousands of Avro files