I am trying to create UDFs with improved performance. So I decided to compare
several ways of doing it.
In general I created a dataframe using range with 50M elements, cached it and
counted it to materialize it.
I then implemented a simple predicate (x<10) in 4 different ways, counted the
elements
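
For readers following along, the setup described above might look roughly like the sketch below. It is not the poster's code: the snippet is cut off before the four variants are listed, so only two plausible ones (a built-in column expression and a Scala UDF) are shown. The object name, helper names and the timing helper are hypothetical, the column name `smaller` is borrowed from the plan output later in the thread, and the Spark 2.x `SparkSession` API is assumed.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical names throughout; assumes Spark 2.x on the classpath.
object UdfPredicateComparison {

  // Build the input with range, cache it, and count once so the cache
  // is populated before anything is timed.
  def build(spark: SparkSession, n: Long): DataFrame = {
    val df = spark.range(n).toDF("smaller").cache()
    df.count()
    df
  }

  // Variant 1: the predicate as a built-in column expression.
  def countNative(df: DataFrame): Long =
    df.filter(col("smaller") < 10).count()

  // Variant 2: the same predicate wrapped in a Scala UDF.
  private val lessThanTen = udf((x: Long) => x < 10)
  def countUdf(df: DataFrame): Long =
    df.filter(lessThanTen(col("smaller"))).count()

  // Crude wall-clock timer; good enough for a rough comparison only.
  def timed[A](label: String)(body: => A): A = {
    val start = System.nanoTime()
    val result = body
    println(f"$label%-8s ${(System.nanoTime() - start) / 1e6}%.1f ms")
    result
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("udf-predicate-comparison")
      .getOrCreate()
    val df = build(spark, 50000000L) // 50M elements, as in the post
    timed("native")(countNative(df))
    timed("udf")(countUdf(df))
    spark.stop()
  }
}
```

Both variants should return the same count; only the elapsed time is expected to differ, and a single wall-clock measurement like this is noisy, so repeated runs would be needed before drawing conclusions.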
Hi Dushyant,
I saw this same error with an older Hana JDBC driver, but the error went
away when I tried a later ngdbc.jar driver file (dated May 2016). I've not
tried to
Here's an example I did using the later driver with Spark 1.6.2 running
standalone.
http://scn.sap.com/community/hana-in-memor
Hi,
Can we use GZIP compression for internal data such as RDD partitions,
broadcast variables and shuffle outputs, so that users have more
choice than the available LZ4, LZF and Snappy? Is there any
specific reason we are not supporting the JDK's built-in compression? If
not, shall I
Hi,
I've stumbled upon CatalogImpl.makeDataset [1] -- the only
private[sql] method in the CatalogImpl object -- that looks like
SparkSession.createDataset [2].
What do you think about removing CatalogImpl.makeDataset? If not,
what's so special about one over the other to keep them both?
I'd appr
Hi,
Have you seen https://issues.apache.org/jira/browse/SPARK-4633 ?
// maropu
On Mon, Sep 12, 2016 at 11:00 PM, Nasser Ebrahim wrote:
> Hi,
>
> Can we use GZIP compression for internal data such as RDD partitions,
> broadcast variables and shuffle outputs so that user will have more choice
>
Never actually got around to doing this - do folks still think it
worthwhile?
On Thu, 21 Apr 2016 at 00:10 Joseph Bradley wrote:
> Sounds good to me. I'd request we be strict during this process about
> requiring *no* changes to the example itself, which will make review easier.
>
> On Tue, Apr
Hi,
I think you'd be better off comparing the gen'd code of `df.filter` and your
gen'd code by using .debugCodegen().
// maropu
On Mon, Sep 12, 2016 at 7:43 PM, assaf.mendelson
wrote:
> I am trying to create UDFs with improved performance. So I decided to
> compare several ways of doing it.
>
> I
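
The suggestion above can be tried along these lines. This is a sketch only: the object and method names are hypothetical, the UDF stands in for whatever the original poster wrote, and it assumes Spark 2.x, where importing `org.apache.spark.sql.execution.debug._` adds `debugCodegen()` to a Dataset.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.execution.debug._
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical names; assumes Spark 2.x on the classpath.
object CompareGeneratedCode {

  // Returns the same x < 10 filter twice: once as a built-in column
  // expression and once as a Scala UDF.
  def filters(spark: SparkSession): (DataFrame, DataFrame) = {
    val df = spark.range(0, 50).toDF("smaller")
    val lessThanTen = udf((x: Long) => x < 10)
    (df.filter(col("smaller") < 10), df.filter(lessThanTen(col("smaller"))))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("compare-generated-code")
      .getOrCreate()
    val (native, viaUdf) = filters(spark)

    // Each call prints the whole-stage generated Java source to stdout,
    // so the two outputs can be compared side by side.
    native.debugCodegen()
    viaUdf.debugCodegen()

    spark.stop()
  }
}
```

Unlike `explain(true)`, which shows the logical and physical plans, this prints the generated Java source itself.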
Yes: will you have cycles to do it?
2016-09-12 9:09 GMT-07:00 Nick Pentreath:
> Never actually got around to doing this - do folks still think it
> worthwhile?
>
> On Thu, 21 Apr 2016 at 00:10 Joseph Bradley wrote:
>
>> Sounds good to me. I'd request we be strict during this process about
>> r
our weekly backups failed due to a hung job. even though i tried to
change the backup scheduler (internal to jenkins) to run tonite, it's
still insisting that it needs to run immediately and is continually
putting jenkins in to quiet mode.
short of killing all of the current jobs and restarting
I did, they look the same:
scala> my_func.explain(true)
== Parsed Logical Plan ==
Filter smaller#3L < 10
+- Project [id#0L AS smaller#3L]
   +- Range (0, 5, splits=1)
== Analyzed Logical Plan ==
smaller: bigint
Filter smaller#3L < 10
+- Project [id#0L AS smaller#3L]
   +- Range (0, 50
the backup is done and we're building again!
On Mon, Sep 12, 2016 at 9:31 AM, shane knapp wrote:
> our weekly backups failed due to a hung job. even though i tried to
> change the backup scheduler (internal to jenkins) to run tonite, it's
> still insisting that it needs to run immediately and i
Thank you Takeshi for sharing the info. I agree with Patrick and you
that there is no point in adding more codecs unless they show better
performance results (at least with some workloads on some platforms).
The performance of GZIP depends upon its implementation on the
platforms. Will do s
Not sure if this is why but perhaps the constraint framework?
On Tuesday, September 13, 2016, Mendelson, Assaf
wrote:
> I did, they look the same:
>
>
>
> scala> my_func.explain(true)
>
> == Parsed Logical Plan ==
>
> Filter smaller#3L < 10
>
> +- Project [id#0L AS smaller#3L]
>
>+- Range (0
What is the constraint framework?
How would I add the same optimization to the sample function I created?
From: rxin [via Apache Spark Developers List]
[mailto:ml-node+s1001551n18932...@n3.nabble.com]
Sent: Tuesday, September 13, 2016 3:37 AM
To: Mendelson, Assaf
Subject: Re: UDF and native func