Hi Pramod
Is your data compressed? I encountered a similar problem; however, even after turning codegen on, the GC time was still very long. The input for my map task is an LZO file of about 100 MB.
My query is "select ip, count(*) as c from stage_bitauto_adclick_d group by ip sort by c limit 10".
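For reference, here is a minimal sketch of how the test was set up; the codegen switch is the standard Spark 1.x SQLContext setting, and only the query and table name come from the message above:

// Turn on Spark SQL expression code generation (Spark 1.x switch).
sqlContext.setConf("spark.sql.codegen", "true")

// The aggregation from the message; a heavy group-by over compressed
// input can still be GC-bound even with codegen enabled.
val top10 = sqlContext.sql(
  "select ip, count(*) as c from stage_bitauto_adclick_d " +
  "group by ip sort by c limit 10")
top10.collect().foreach(println)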
Hi
I was trying to create an external table named "adclicktable" through the API "def createExternalTable(tableName: String, path: String)". I could then retrieve the schema of the table successfully, as shown below, and the table could be queried normally. The data files are all Parquet files.
sqlContext.sql("describ
Hi
Actually I did not use Tachyon 0.6.3; I just compiled Spark with Tachyon 0.5.0 via make-distribution.sh. When I pulled the Spark code from GitHub, the Tachyon version was still 0.5.0 in pom.xml.
Regards
Zhang
At 2015-04-29 04:19:20, "sara mustafa" wrote:
>Hi Zhang,
>
>How did you compile Spark 1.3.1 with
Hi,
I ran some tests on Parquet files with the Spark SQL DataFrame API.
I generated 36 gzip-compressed Parquet files with Spark SQL and stored them on Tachyon. Each file is about 222 MB. Then I read them with the code below.
val tfs = sqlContext.parquetFile("tachyon://datanode8.bitauto.dmp:19998/apps
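A sketch of the full read path (the URI below is shortened; the real path is the truncated one above):

// Load the 36 gzip-compressed Parquet files from Tachyon as one
// DataFrame (Spark 1.3 API; later versions use sqlContext.read.parquet).
val tfs = sqlContext.parquetFile("tachyon://datanode8.bitauto.dmp:19998/apps/...")

// Force a full scan so the read cost is actually measured.
println(tfs.count())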
JIRA opened: https://issues.apache.org/jira/browse/SPARK-6921

At 2015-04-15 00:57:24, "Cheng Lian" wrote:
>Would you mind opening a JIRA for this?
>
>I think your suspicion makes sense. Will have a look at this tomorrow.
>Thanks for reporting!
>
>Cheng
Hi experts
I ran the code below in the Spark shell to access Parquet files in Tachyon.
1. First, I created a DataFrame by loading a bunch of Parquet files from Tachyon:
val ta3 = sqlContext.parquetFile("tachyon://tachyonserver:19998/apps/tachyon/zhangxf/parquetAdClick-6p-256m");
2. Second, set the "fs.local.block
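The message is cut off at the configuration key; judging by the prefix it looks like Hadoop's "fs.local.block.size", but that completion is my assumption. A minimal sketch of setting it from the shell:

// Assumed completion of the truncated step 2: Hadoop's local block
// size, which feeds into the split size Spark uses when planning the
// Parquet scan. The key name and the 256 MB value are assumptions.
sc.hadoopConfiguration.setLong("fs.local.block.size", 256L * 1024 * 1024)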