I updated the PR for SPARK-6352 to be more like SPARK-3595.
I added a new setting, "spark.sql.parquet.output.committer.class", in the Hadoop
configuration to allow a custom implementation of ParquetOutputCommitter.
Can someone take a look at the PR?
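A minimal sketch of how such a setting might be consulted. This is a stand-in only: a plain Map takes the place of the Hadoop Configuration, and both class names below are placeholders, not the actual classes from the PR.

```scala
// Sketch only: a mutable Map stands in for the Hadoop Configuration,
// and both class names below are placeholder assumptions.
val conf = scala.collection.mutable.Map[String, String]()
conf("spark.sql.parquet.output.committer.class") =
  "com.example.MyParquetOutputCommitter"

// The setting is optional, so resolution falls back to the stock committer.
def committerClass(c: scala.collection.Map[String, String]): String =
  c.getOrElse("spark.sql.parquet.output.committer.class",
    "parquet.hadoop.ParquetOutputCommitter")
```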
On Mon, Mar 16, 2015 at 5:23 PM, Pei-Lun
JIRA and PR for the first issue:
https://issues.apache.org/jira/browse/SPARK-6408
https://github.com/apache/spark/pull/5087
On Thu, Mar 19, 2015 at 12:20 PM, Pei-Lun Lee wrote:
> Hi,
>
> I am trying the JDBC data source in Spark SQL 1.3.0 and found some issues.
>
> First, the syntax
Hi,
I am trying the JDBC data source in Spark SQL 1.3.0 and found some issues.
First, the syntax "where str_col='value'" gives an error on both
PostgreSQL and MySQL:
psql> create table foo(id int primary key,name text,age int);
bash> SPARK_CLASSPATH=postgresql-9.4-1201-jdbc41.jar spark/bin/spark-s
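The symptom is consistent with the string literal in the pushed-down filter losing its quotes when the WHERE clause is built for the database (an assumption based on the error, not a confirmed diagnosis of Spark's code). A self-contained illustration of the difference:

```scala
// Hypothetical illustration (not Spark's actual code): if the string
// value of a pushed-down filter is spliced into the WHERE clause without
// quotes, the database parses it as a column name and errors out.
def naiveWhere(col: String, value: String): String =
  s"WHERE $col = $value"      // broken: literal ends up unquoted

def quotedWhere(col: String, value: String): String =
  s"WHERE $col = '$value'"    // what the database expects
```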
> path contains an actual comma in it. In your case, you may do something like
> this:
>
> val s3nDF = parquetFile("s3n://...")
> val hdfsDF = parquetFile("hdfs://...")
> val finalDF = s3nDF.union(hdfsDF)
>
> Cheng
>
> On 3/16/15 4:03 PM, Pei-Lun Lee wrote:
ect dependency makes this injection much more
> difficult for saveAsParquetFile.
>
> On Thu, Mar 5, 2015 at 12:28 AM, Pei-Lun Lee wrote:
>
>> Thanks for the DirectOutputCommitter example.
>> However, I found it only works for saveAsHadoopFile. What about
>> saveAsParquetFile?
Hi,
I am using Spark 1.3.0, and I cannot load parquet files from more than
one file system, say one s3n://... and another hdfs://..., which worked in
older versions, or if I set spark.sql.parquet.useDataSourceApi=false in 1.3.
One way to fix this is, instead of getting a single FileSystem from the defaul
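The per-path idea can be sketched without Hadoop on the classpath: key file-system resolution on each path's own scheme and authority rather than one default. (In Hadoop terms the fix would presumably use `Path.getFileSystem(conf)` for each path; that is stated as an assumption about the fix, not the actual patch.)

```scala
import java.net.URI

// Sketch: derive a file-system key from each path's own URI. With one
// default FileSystem, "s3n://bucket/x" and "hdfs://namenode/x" would
// wrongly share an instance; keying per (scheme, authority) keeps them
// distinct.
def fsKey(path: String): (String, String) = {
  val uri = new URI(path)
  (Option(uri.getScheme).getOrElse("file"),
   Option(uri.getAuthority).getOrElse(""))
}
```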
on
> work
> >> when we want to upgrade to Spark 1.3.
> >>
> >> Is there anyone can help me?
> >>
> >>
> >> Thanks
> >>
> >> Wisely Chen
> >>
> >>
> >> On Tue, Mar 10, 2015 at 5:06
Hi,
I found that if I try to read a parquet file generated by Spark 1.1.1 using
1.3.0-rc3 with default settings, I get this error:
com.fasterxml.jackson.core.JsonParseException: Unrecognized token
'StructType': was expecting ('true', 'false' or 'null')
at [Source: StructType(List(StructField(a,Integ
Thanks for the DirectOutputCommitter example.
However, I found it only works for saveAsHadoopFile. What about
saveAsParquetFile?
It looks like Spark SQL is using ParquetOutputCommitter, which is a subclass
of FileOutputCommitter.
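For context, the usual shape of the trick is a committer whose commit step is a no-op because tasks write straight to the final location. Modeled here with plain stand-in types rather than Hadoop's actual OutputCommitter/FileOutputCommitter classes, so this is only an illustration of the idea:

```scala
// Stand-in interface: the real classes are Hadoop's OutputCommitter /
// FileOutputCommitter; this only illustrates the shape of the trick.
trait Committer {
  def needsTaskCommit: Boolean
  def commitTask(): Unit
}

// A "direct" committer writes task output straight to the final location,
// so there is nothing to move on commit.
class DirectCommitter extends Committer {
  def needsTaskCommit: Boolean = false
  def commitTask(): Unit = ()   // no rename from a temporary directory
}
```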
On Fri, Feb 27, 2015 at 1:52 AM, Thomas Demoor
wrote:
> FYI. We're cur
Hi,
We have a PR to support fixed-length byte arrays in parquet files.
https://github.com/apache/spark/pull/1737
Can someone help verify it?
Thanks.
2014-07-15 19:23 GMT+08:00 Pei-Lun Lee :
> Sorry, should be SPARK-2489
>
>
> 2014-07-15 19:22 GMT+08:00 Pei-Lun Lee :
>
>