to me.
I am a big data engineer and like to contribute to open source. I have already submitted two PRs for Apache Flink (FLINK-26609, FLINK-26728), and they were merged/closed.
So I think if the JIRA ticket can be assigned to me, I can implement it fairly well.
Thanks.
javaca...@163.com
Hi,
How can I persist a database/table created in a Spark application?
object TestPersistentDB {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("Create persistent table")
      .config("spark.
Hi,
When I use a DataFrame with this table schema, it goes wrong:
val test_schema = StructType(Array(
  StructField("id", IntegerType, false),
  StructField("flag", CharType(1), false),
  StructField("time", DateType, false)))
val df = spark.read.format("com.databricks.spark.csv")
  .schema(test_s
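If the error comes from CharType(1), that is expected on most Spark versions: CharType is not accepted in user-specified read schemas, and the usual workaround is to declare the column as StringType. A sketch under that assumption (the input path is hypothetical):

import org.apache.spark.sql.types._

// Assumes an existing SparkSession named `spark`
val test_schema = StructType(Array(
  StructField("id", IntegerType, false),
  StructField("flag", StringType, false), // was CharType(1)
  StructField("time", DateType, false)))

val df = spark.read
  .format("com.databricks.spark.csv")
  .schema(test_schema)
  .load("/path/to/data.csv") // hypothetical path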
I changed the UDF, but the performance still seems slow. What else can I do?
> On 14 Jul 2017, at 8:34 PM, Wenchen Fan wrote:
>
> Try to replace your UDF with Spark built-in expressions; it should be as
> simple as `$"x" * (lit(1) - $"y")`.
>
>> On 14 Jul 2017, at 5:46 PM, 163 wrote:
I rewrote TPC-H query 5 with the DataFrame API:
val forders = spark.read
  .parquet("hdfs://dell127:20500/SparkParquetDoubleTimestamp100G/orders")
  .filter("o_orderdate < '1995-01-01' and o_orderdate >= '1994-01-01'")
  .select("o_custkey", "o_orderkey")
val flineitem = spark.read
  .parquet("hdfs://dell127:20500/Spa
>
> I modify the tech query5 to DataFrame:
> val forders =
> spark.read.parquet("hdfs://dell127:20500/SparkParquetDoubleTimestamp100G/orders
>
> ”).filter("o_orderdate
> < 1995-01-01 and o_orderdate >= 1994-01-01").select("o_custkey",
> "o_orderkey")
> val flineitem =
> spark.read.parquet("
How can I add a new Kafka topic without restarting the streaming context?
r7raul1...@163.com
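One common answer, assuming the Kafka 0.10 direct stream integration (spark-streaming-kafka-0-10): subscribe with ConsumerStrategies.SubscribePattern, which matches topics by regex, so topics created later that match the pattern are picked up without restarting the StreamingContext. A sketch with an illustrative broker address and pattern:

import java.util.regex.Pattern
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

val ssc = new StreamingContext(new SparkConf().setAppName("dynamic-topics"), Seconds(10))

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092", // illustrative broker
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "example-group")

// New topics matching "events-.*" are consumed as they appear
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.SubscribePattern[String, String](
    Pattern.compile("events-.*"), kafkaParams))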
Hi,
Have you ever considered Cassandra as a replacement? Our usage is now almost the same as your engine's, e.g. using MySQL to store initial aggregated data. Can you share more about your kind of cube queries?
We are very interested in that architecture too : )
Best,
Sun.
fightf...@163.com
Thanks for the prompt response.
fightf...@163.com
From: tsh
Date: 2015-11-10 02:56
To: fightf...@163.com; user; dev
Subject: Re: OLAP query using spark dataframe with cassandra
Hi,
I'm in the same position right now: we are going to implement something like
OLAP BI + Machine Learning explorations on the
of OLAP architecture.
And we are happy to hear more use cases from this community.
Best,
Sun.
fightf...@163.com
From: Jörn Franke
Date: 2015-11-09 14:40
To: fightf...@163.com
CC: user; dev
Subject: Re: OLAP query using spark dataframe with cassandra
Is there any distributor supporting
-apache-cassandra-and-spark
fightf...@163.com
Hi there,
Which version are you using? Actually the problem seems to be gone after we changed our Spark version from 1.2.0 to 1.3.0.
Not sure which internal changes made the difference.
Best,
Sun.
fightf...@163.com
From: Night Wolf
Date: 2015-05-12 22:05
To: fightf...@163.com
CC: Patrick Wendell; user; dev
import org.apache.spark.sql.catalyst.expressions._
import java.util.{ArrayList => JavaArrayList}

val values: JavaArrayList[Any] = new JavaArrayList()
val computedValues = Row(values.get(0), values.get(1)) // indexing with get(i) is clumsy
How can I create a Row from a List or Array in Spark using Scala?
r7raul1...@163.com
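One answer that fits here: Row.fromSeq builds a Row from any Scala Seq, so a java.util.List only needs a conversion first. A small sketch (the values are illustrative):

import org.apache.spark.sql.Row
import scala.collection.JavaConverters._

// From a Scala collection directly
val computedValues = Row.fromSeq(Seq[Any](1, "flag", java.sql.Date.valueOf("2015-05-12")))

// From a java.util.List, convert to a Scala Seq first
val javaValues = new java.util.ArrayList[Any]()
javaValues.add(1)
javaValues.add("flag")
val rowFromJava = Row.fromSeq(javaValues.asScala)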
application? Does Spark provide such configs for achieving that goal?
We know that this is tricky to get working. We just want to know how it could be resolved, or about other possible channels for the cases
we did not cover.
Looking forward to your kind advice.
Thanks,
Sun.
fightf...@163.com
Hi,
We really have no adequate solution for this issue yet. Any available analysis, rules, or hints would be appreciated.
Thanks,
Sun.
fightf...@163.com
From: fightf...@163.com
Date: 2015-02-09 11:56
To: user; dev
Subject: Re: Sort Shuffle performance issues about using AppendOnlyMap for large data