+1
On Wed, Mar 2, 2016 at 2:45 PM, Michael Armbrust wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 1.6.1!
>
> The vote is open until Saturday, March 5, 2016 at 20:00 UTC and passes if
> a majority of at least 3+1 PMC votes are cast.
>
> [ ] +1 Release this package as Apache Spark 1.6.1
Hi, all:
Sometimes a task will fail with the exception "Received LaunchTask command
but executor was null", and I find it is a common problem:
https://issues.apache.org/jira/browse/SPARK-13112
https://issues.apache.org/jira/browse/SPARK-13060
I have a question
SQL is very common and even some business analysts learn it. Scala and
Python are great, but the easiest language to use is often the language a
user already knows. And for a lot of users, that is SQL.
On Wednesday, March 2, 2016, Jerry Lam wrote:
> Hi guys,
>
> FYI... this wiki page (StreamSQL: https://en.wikipedia.org/wiki/StreamSQL)
Please vote on releasing the following candidate as Apache Spark version
1.6.1!
The vote is open until Saturday, March 5, 2016 at 20:00 UTC and passes if a
majority of at least 3+1 PMC votes are cast.
[ ] +1 Release this package as Apache Spark 1.6.1
[ ] -1 Do not release this package because ...
I see, we could reduce the memory by moving the copy out of HashedRelation;
then we should do the copy before calling HashedRelation for the shuffle hash join.
Another thing is that when we do broadcasting, we will have another
serialized copy of the hash table.
For the table that's larger than 100M, w
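The copy-before-building point above is language-agnostic. Here is a minimal Python sketch of what goes wrong when rows coming from a reused mutable buffer are stored in a hash table without a copy first; the class and function names are illustrative, not Spark's actual API:

```python
class MutableRow:
    """Stands in for a row buffer that an upstream operator reuses."""
    def __init__(self):
        self.key = None
        self.value = None

    def copy(self):
        r = MutableRow()
        r.key, r.value = self.key, self.value
        return r

def build_hash_table(records, do_copy):
    buf = MutableRow()  # one buffer, reused for every record
    table = {}
    for k, v in records:
        buf.key, buf.value = k, v
        # Without a copy, every entry aliases the same shared buffer.
        table[k] = buf.copy() if do_copy else buf
    return table

records = [(1, "a"), (2, "b"), (3, "c")]
aliased = build_hash_table(records, do_copy=False)
copied = build_hash_table(records, do_copy=True)
print(aliased[1].value)  # "c": every entry shows the buffer's final state
print(copied[1].value)   # "a": each entry kept its own values
```

Doing the copy before the rows reach the hash table, as suggested above, keeps each stored entry independent of the shared buffer.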
Jay, thanks for the response.
Regarding the new consumer API for 0.9, I've been reading through the code
for it and thinking about how it fits in to the existing Spark integration.
So far I've seen some interesting challenges, and if you (or anyone else on
the dev list) have time to provide some h
I would expect the memory pressure to grow because not only are we storing
the backing array to the iterator of the rows on the driver, but we’re
also storing a copy of each of those rows in the hash table. Whereas if we
didn’t do the copy on the driver side then the hash table would only have
to st
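The memory trade-off described in this message can be made concrete by counting object instances by identity. This is a rough plain-Python illustration with hypothetical names, not Spark's actual data structures:

```python
def distinct_instances(rows):
    """Count distinct row objects by identity, not by equality."""
    return len({id(r) for r in rows})

# Rows collected to the driver, backing the iterator.
collected = [[i, "payload"] for i in range(4)]

# Driver-side copy: the hash table holds fresh instances next to the list.
with_copy = {i: list(r) for i, r in enumerate(collected)}
# No driver-side copy: the table just references the collected rows.
without_copy = {i: r for i, r in enumerate(collected)}

print(distinct_instances(collected + list(with_copy.values())))     # 8: each row stored twice
print(distinct_instances(collected + list(without_copy.values())))  # 4: instances shared
```

With the copy, row storage roughly doubles while the backing array is still alive; without it, the array and the table share one instance per row.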
UnsafeHashedRelation and HashedRelation could also be used in the Executor
(for non-broadcast hash join); then the UnsafeRow could come from
UnsafeProjection, so we should copy the rows for safety.
We could have a smarter copy() for UnsafeRow (avoid the copy if it's
already copied),
but I don't think
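The "smarter copy()" idea mentioned above can be sketched as follows: track whether a row already owns its bytes and make copy() a no-op when it does. This is a minimal Python mock with a hypothetical `owned` flag, not UnsafeRow's real implementation:

```python
class Row:
    def __init__(self, data, owned=False):
        self.data = data    # for a real UnsafeRow this points into a shared buffer
        self.owned = owned  # True once this row has its own private bytes

    def copy(self):
        if self.owned:
            return self  # already a private copy: skip the allocation
        return Row(bytearray(self.data), owned=True)

shared = bytearray(b"\x01\x02\x03")
r = Row(shared)
c1 = r.copy()   # real copy: c1 owns its bytes
c2 = c1.copy()  # no-op: the same object comes back
print(c1 is r)   # False
print(c2 is c1)  # True
```

With this scheme, repeated defensive copies by downstream operators cost nothing after the first one, while rows still pointing into a shared buffer are always duplicated.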
Hi guys,
FYI... this wiki page (StreamSQL: https://en.wikipedia.org/wiki/StreamSQL)
has some history related to Event Stream Processing and SQL.
Hi Steve,
It is difficult to ask your customers to learn a new language when they
are not programmers :)
I don't know where/why they learn
-dev +user
> StructType(StructField(data,ArrayType(StructType(StructField(
> stuff,ArrayType(StructType(StructField(onetype,ArrayType(StructType(StructField(id,LongType,true),
> StructField(name,StringType,true)),true),true), StructField(othertype,
> ArrayType(StructType(StructField(company,String
> On 1 Mar 2016, at 22:25, Jerry Lam wrote:
>
> Hi Reynold,
>
> You are right. It is about the audience. For instance, in many of my cases,
> the SQL style is very attractive if not mandatory for people with minimum
> programming knowledge.
but SQL skills instead. Which is just relational se
When you create a DataFrame using the sqlContext.read.schema() API, if you pass
in a schema that's compatible with some of the records but incompatible with
others, it seems you can't do a .select on the problematic columns; instead you
get an AnalysisException.
I know loading the wrong