Take a look at the implementation of typed sum/avg:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/expressions/scalalang/typed.scala
You can implement a typed max/min.
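Roughly along these lines (an untested sketch; TypedMax is just an illustrative name, not something that exists in that file):

  import org.apache.spark.sql.{Encoder, Encoders}
  import org.apache.spark.sql.expressions.Aggregator

  // Typed max over Long, modeled on the sum/avg aggregators in typed.scala.
  class TypedMax[IN](val f: IN => Long) extends Aggregator[IN, Long, Long] {
    override def zero: Long = Long.MinValue
    override def reduce(b: Long, a: IN): Long = math.max(b, f(a))
    override def merge(b1: Long, b2: Long): Long = math.max(b1, b2)
    override def finish(reduction: Long): Long = reduction
    override def bufferEncoder: Encoder[Long] = Encoders.scalaLong
    override def outputEncoder: Encoder[Long] = Encoders.scalaLong
  }

  // usage: ds.groupByKey(_._1).agg(new TypedMax[(Int, Int)](_._2.toLong).toColumn)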
On Tue, Jun 7, 2016 at 4:31 PM, Alexander Pivovarov wrote:
> Ted, it does not work like that
Please go ahead.
On Tue, Jun 7, 2016 at 4:45 PM, franklyn wrote:
> Thanks for reproducing it, Ted. Should I make a JIRA issue?
Thanks for reproducing it, Ted. Should I make a JIRA issue?
Ted, it does not work like that;
you have to .map(toAB).toDS
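i.e. something like this (AB and toAB are just names I made up for the example; assumes import spark.implicits._ and org.apache.spark.sql.functions.max):

  case class AB(a: Int, b: Int)
  val toAB = (t: (Int, Int)) => AB(t._1, t._2)

  // map to a case class first, then the columns have names
  val ds = Seq(1->2, 1->5, 3->6).map(toAB).toDS
  ds.groupBy("a").agg(max($"b")).show()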
On Tue, Jun 7, 2016 at 4:07 PM, Ted Yu wrote:
> Have you tried the following ?
>
> Seq(1->2, 1->5, 3->6).toDS("a", "b")
>
> then you can refer to columns by name.
>
> FYI
>
>
> On Tue, Jun 7, 2016 at 3:58 PM, Alexander Pivovarov wrote:
I built with Scala 2.10
>>> df.select(add_one(df.a).alias('incremented')).collect()
The above just hung.
On Tue, Jun 7, 2016 at 3:31 PM, franklyn wrote:
> Thanks, Ted!
>
> I'm using
>
> https://github.com/apache/spark/commit/8f5a04b6299e3a47aca13cbb40e72344c0114860
> and building with scala-2.10
Have you tried the following ?
Seq(1->2, 1->5, 3->6).toDS("a", "b")
then you can refer to columns by name.
FYI
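(Or, if toDS doesn't accept column names, toDF should; untested:)

  import spark.implicits._
  import org.apache.spark.sql.functions.max

  // name the columns via toDF, then group and aggregate by name
  Seq(1->2, 1->5, 3->6).toDF("a", "b")
    .groupBy("a")
    .agg(max("b"))
    .show()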
On Tue, Jun 7, 2016 at 3:58 PM, Alexander Pivovarov wrote:
> I'm trying to switch from RDD API to Dataset API
> My question is about reduceByKey method
>
> e.g. in the following example
I'm trying to switch from RDD API to Dataset API
My question is about reduceByKey method
e.g. in the following example I'm trying to rewrite
sc.parallelize(Seq(1->2, 1->5, 3->6)).reduceByKey(math.max).take(10)
using DS API. That is what I have so far:
Seq(1->2, 1->5, 3->6).toDS.groupBy(_._1).ag
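(Spelled out in full, the closest working version I have is below; I think the typed grouping is called groupByKey in the 2.0 preview. It works, but it is noticeably more verbose than reduceByKey:)

  Seq(1->2, 1->5, 3->6).toDS
    .groupByKey(_._1)
    .reduceGroups((a, b) => if (a._2 >= b._2) a else b)
    .map(_._2)    // reduceGroups returns (key, reducedValue) pairs; keep just the value
    .take(10)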
Thanks, Ted!
I'm using
https://github.com/apache/spark/commit/8f5a04b6299e3a47aca13cbb40e72344c0114860
and building with scala-2.10
I can confirm that it works with scala-2.11
With commit 200f01c8fb15680b5630fbd122d44f9b1d096e02 using Scala 2.11:
Using Python version 2.7.9 (default, Apr 29 2016 10:48:06)
SparkSession available as 'spark'.
>>> from pyspark.sql import SparkSession
>>> from pyspark.sql.types import IntegerType, StructField, StructType
>>> from pyspark.sql.
I've built spark-2.0-preview (8f5a04b) with scala-2.10 using the following:

  ./dev/change-version-to-2.10.sh
  ./dev/make-distribution.sh -DskipTests -Dzookeeper.version=3.4.5 -Dcurator.version=2.4.0 -Dscala-2.10 -Phadoop-2.6 -Pyarn -Phive

and then ran the following code in a pyspark shell.
As far as I know the process is just to copy docs/_site from the build
to the appropriate location in the SVN repo (i.e.
site/docs/2.0.0-preview).
Thanks
Shivaram
On Tue, Jun 7, 2016 at 8:14 AM, Sean Owen wrote:
> As a stop-gap, I can edit that page to have a small section about
> preview releases and point to the nightly docs.
Hi,
I'm searching for how and where Spark allocates cores per executor in the
source code.
Is it possible to programmatically control the number of allocated cores in
standalone cluster mode?
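Something along these lines is what I have in mind (property names taken from the standalone configuration docs; the master URL and values are just illustrative, untested):

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setMaster("spark://master:7077")      // illustrative standalone master URL
    .setAppName("core-allocation-test")    // illustrative app name
    .set("spark.executor.cores", "2")      // cores per executor
    .set("spark.cores.max", "8")           // cap on total cores for the application
  val sc = new SparkContext(conf)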
Regards,
Matteo
Thanks Sean, you were right, hard refresh made it show up.
Seems like we should at least link to the preview docs from
http://spark.apache.org/documentation.html.
Tom
On Tuesday, June 7, 2016 10:04 AM, Sean Owen wrote:
It's there (refresh maybe?). See the end of the downloads dropdown.
As a stop-gap, I can edit that page to have a small section about
preview releases and point to the nightly docs.
Not sure who has the power to push 2.0.0-preview to site/docs, but, if
that's done then we can symlink "preview" in that dir to it and be
done, and update this section about preview docs.
It's there (refresh maybe?). See the end of the downloads dropdown.
For the moment you can see the docs in the nightly docs build:
https://home.apache.org/~pwendell/spark-nightly/spark-branch-2.0-docs/latest/
I don't know what the best way to put this into the main site is.
Under a /preview root?
I just checked and I don't see the 2.0 preview release at all anymore on
http://spark.apache.org/downloads.html. Is it in transition? The only place
I can see it is at http://spark.apache.org/news/spark-2.0.0-preview.html
I would like to see docs there too. My opinion is it should be as eas
Congrats!!
On Mon, Jun 6, 2016, 8:12 AM Gayathri Murali wrote:
> Congratulations Yanbo Liang! Well deserved.
>
>
> On Sun, Jun 5, 2016 at 7:10 PM, Shixiong(Ryan) Zhu <shixi...@databricks.com> wrote:
>
>> Congrats, Yanbo!
>>
>> On Sun, Jun 5, 2016 at 6:25 PM, Liwei Lin wrote:
>>
>>> Congratul
Hi,
I don't know if it is a bug or a feature, but one thing in streaming error
handling seems confusing to me. I create a streaming context, start it, and
call #awaitTermination like this:

  try {
    ssc.awaitTermination();
  } catch (Exception e) {
    LoggerFactory.getLogger(getClass()).error("Job failed. S