[DISCUSS] [Spark SQL, PySpark] Combining StructTypes into a new StructType

2022-08-09 Thread Tim
re any reasons why this is not yet part of StructType's functionality? If you support this idea, I could create a first PR for further and deeper discussion. Best Tim - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Dropping SortExec from SortMergeJoins on presorted data

2019-03-29 Thread tim
rom SortOrder expressions. This breaks in cases where our processing has caused the data to *lose* its sortedness. Have we missed something simple or do we have an exotic use-case unlike other users? Thanks! Tim -- Sent from: http://apache-spark-developers-l

Re: Honor ParseMode in AvroFileFormat

2019-03-07 Thread tim
/facepalm Here we go: https://issues.apache.org/jira/browse/SPARK-27093 Tim

Re: Honor ParseMode in AvroFileFormat

2019-03-07 Thread tim
Thanks Xiao, it's good to have that validated. I've created a ticket here: https://issues.apache.org/jira/browse/AVRO-2342

Honor ParseMode in AvroFileFormat

2019-03-07 Thread tim
ting users. Is there any reason why this behavior doesn't exist, or an obvious workaround that I missed? If not, are there any further details needed to consider adding this capability to Spark's Avro reader? I’m happy to propose a solution and contribute this update if somebody isn't alrea

Re: eager execution and debuggability

2018-05-09 Thread Tim Hunter
ard trick in lazy environments and languages. Tim On Wed, May 9, 2018 at 3:26 AM, Reynold Xin wrote: > Yes would be great if possible but it’s non-trivial (might be impossible > to do in general; we already have stacktraces that point to line numbers > when an error occurs in UDFs but cl

[ml] Deep learning talks at the Spark Summit Europe

2017-10-10 Thread Tim Hunter
TensorFlow as a service, by Jim Dowling If you have not gotten your ticket yet, there is still time! You can use the promo code DatabricksEU for a 15% discount. Looking forward to meeting the dev community on the East side of the Atlantic. Tim

Re: [VOTE][SPIP] SPARK-21866 Image support in Apache Spark

2017-09-28 Thread Tim Hunter
> On Sep 23, 2017, at 7:27 AM, Yanbo Liang wrote: >>> > >>> > +1 >>> > >>> > On Sat, Sep 23, 2017 at 7:08 PM, Noman Khan >>> wrote: >>> > +1 >>> > >>> > Regards >>> > Noman >>>

[VOTE][SPIP] SPARK-21866 Image support in Apache Spark

2017-09-21 Thread Tim Hunter
Hello community, I would like to call for a vote on SPARK-21866. It is a short proposal that has important applications for image processing and deep learning. Joseph Bradley has offered to be the shepherd. JIRA ticket: https://issues.apache.org/jira/browse/SPARK-21866 PDF version: https://issues

SPIP: SPARK-21866 Image support in Apache Spark

2017-09-05 Thread Tim Hunter
Hello community, I would like to start a discussion about adding support for images in Spark. We will follow up with a formal vote in two weeks. Please feel free to comment on the JIRA ticket too. JIRA ticket: https://issues.apache.org/jira/browse/SPARK-21866 PDF version: https://issues.apache.or

Re: Question on Spark's graph libraries roadmap

2017-03-13 Thread Tim Hunter
popular demand. Along these lines, GraphBLAS could be added on top of it if someone is willing to step up. Tim [1] https://spark-summit.org/east-2016/events/graphframes-graph-queries-in-spark-sql/ On Mon, Mar 13, 2017 at 2:58 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

Re: [Spark Namespace]: Expanding Spark ML under Different Namespace?

2017-02-24 Thread Tim Hunter
Regarding logging, GraphFrames makes a simple wrapper this way: https://github.com/graphframes/graphframes/blob/master/src/main/scala/org/graphframes/Logging.scala Regarding the UDTs, they have been hidden to be reworked for Datasets, the reasons being detailed here [1]. Can you describe your us

Re: Feedback on MLlib roadmap process proposal

2017-02-23 Thread Tim Hunter
works well in practice. In the meantime, though, there are plenty of things that we could do to help developers of other libraries to have a great experience with Spark. Matei alluded to that in his Spark Summit keynote when he mentioned better integration with low-level libraries. Tim On Thu, Feb 23

Re: Design document - MLlib's statistical package for DataFrames

2017-02-17 Thread Tim Hunter
Hi Brad, this task is focusing on moving the existing algorithms, so that we are held up by parity issues. Do you have some paper suggestions for cardinality? I do not think there is a feature request on JIRA either. Tim On Thu, Feb 16, 2017 at 2:21 PM, bradc wrote: > Hi, > > While i

Design document - MLlib's statistical package for DataFrames

2017-02-16 Thread Tim Hunter
rapidly approaching, and it would be great if we could claim parity for this release! Cheers Tim

Re: Spark Improvement Proposals

2017-02-16 Thread Tim Hunter
is probably not going to receive much traction in the first place. Tim On Thu, Feb 16, 2017 at 9:17 AM, Cody Koeninger wrote: > Reynold, thanks, LGTM. > > Sean, great concerns. I agree that behavior is largely cultural and > writing down a process won't necessarily solve any pr

Re: Spark Improvement Proposals

2017-01-05 Thread Tim Hunter
opinion on these, but why not make a pick and reevaluate this decision later? This is not a binding process at this point. Tim On Tue, Jan 3, 2017 at 3:16 PM, Cody Koeninger wrote: > I don't have a concern about voting vs consensus. > > I have a concern that whatever the decision

GraphFrames 0.2.0 released

2016-08-16 Thread Tim Hunter
the DataFrame API, combined with a new API for motif finding. The user also benefits from DataFrame performance optimizations within the Spark SQL engine. Cheers Tim

Re: [VOTE] Release Apache Spark 1.6.2 (RC2)

2016-06-22 Thread Tim Hunter
+1 This release passes all tests on the graphframes and tensorframes packages. On Wed, Jun 22, 2016 at 7:19 AM, Cody Koeninger wrote: > If we're considering backporting changes for the 0.8 kafka > integration, I am sure there are people who would like to get > > https://issues.apache.org/jira/br

Request for comments: Tensorframes, an integration library between TensorFlow and Spark DataFrames

2016-03-19 Thread Tim Hunter
Tim Hunter

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-03 Thread Tim Preece
Regarding the failure in org.apache.spark.streaming.kafka.DirectKafkaStreamSuite ("offset recovery"): We have been seeing the very same problem with the IBM JDK for quite a long time (since at least July 2015). It is intermittent and we had dismissed it as a testcase problem. -- View this mess

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-03 Thread Tim Preece
I just created the following pull request (against master, but I would like it on 1.6.1) for the isolated classloader fix (SPARK-13648): https://github.com/apache/spark/pull/11495 -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-

Re: [VOTE] Release Apache Spark 1.6.1 (RC1)

2016-03-03 Thread Tim Preece
I have been testing 1.6.1RC1 using the IBM Java SDK. I notice a problem ( with the org.apache.spark.sql.hive.client.VersionsSuite tests ) after a recent Spark 1.6.1 change. Pull request - https://github.com/apache/spark/commit/f7898f9e2df131fa78200f6034508e74a78c2a44 The change introduced a depe

Introducing spark-sklearn, a scikit-learn integration package for Spark

2016-02-10 Thread Tim Hunter
ions. Also, documentation or code contributions are much welcome (Apache 2.0 license). Cheers Tim

Re: Tungsten in a mixed endian environment

2016-01-15 Thread Tim Preece
So if Spark does not support heterogeneous-endianness clusters, should Spark at least always support homogeneous-endianness clusters? I ask because I just noticed https://issues.apache.org/jira/browse/SPARK-12785 which appears to be introducing a new feature designed for Little Endian only. -
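
For readers unfamiliar with the endianness concern raised in this message, here is a minimal Python sketch of why byte order matters once raw bytes are shared between machines. It is a generic illustration only and assumes nothing about Tungsten's actual memory layout; the helper name `reinterpret` is hypothetical.

```python
import struct
import sys

# Serialize the 32-bit integer 1 in both byte orders.
value = 1
little = struct.pack("<i", value)  # little-endian layout
big = struct.pack(">i", value)     # big-endian layout


def reinterpret(raw: bytes, byte_order: str) -> int:
    """Hypothetical helper: read raw bytes as a signed int in the given order."""
    return int.from_bytes(raw, byteorder=byte_order, signed=True)


print("native byte order:", sys.byteorder)
print("little-endian bytes:", little.hex())
print("big-endian bytes:", big.hex())
# The same bytes yield different values depending on the reader's byte order.
print(reinterpret(little, "little"), reinterpret(little, "big"))
```

Bytes written by a little-endian writer and read back as big-endian no longer represent the original value, which is the kind of mismatch a cluster with mixed byte orders would have to guard against.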

Re: A proposal for Spark 2.0

2015-11-11 Thread Tim Preece
Considering Spark 2.x will run for 2 years, would moving up to Scala 2.12 ( pencilled in for Jan 2016 ) make any sense ? - although that would then pre-req Java 8. -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/A-proposal-for-Spark-2-0-tp15122p15153.h

Re: Block Transfer Service encryption support

2015-11-10 Thread Tim Preece
So it appears the tests fail because of an SSLHandshakeException. Tracing the failure I see: 3,0001,Using SSLEngineImpl.\0A 3,0001,\0AIs initial handshake: true\0A 3,0001,Ignoring unsupported cipher suite: SSL_RSA_WITH_DES_CBC_SHA for TLSv1.2\0A 3,0001,No available cipher suite for TLSv1.2\0A 3,0

Re: Block Transfer Service encryption support

2015-11-10 Thread Tim Preece
hunkFetchIntegrationSuite.fetchFileChunk:184 expected:<[]> but was:<[1]> SslTransportClientFactorySuite>TransportClientFactorySuite.neverReturnInactiveClients:165 null SslTransportClientFactorySuite>TransportClientFactorySuite.returnDifferentClientsForDifferentServers:145 null T

Re: Some spark apps fail with "All masters are unresponsive", while others pass normally

2015-11-09 Thread Tim Preece
Searching shows several people hit this same NPE in AppClient.scala line 160 (perhaps because appID was null; could the application have been stopped before it was registered?) -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Some-spark-apps-fail-with-All-master

Re: Anyone has perfect solution for spark source code compilation issue on intellij

2015-11-09 Thread Tim Preece
I've had success building with maven ( 3.3.3 ) with: Intellij 14.1.5 scala 2.10.4 openjdk 7 (1.7.0_79) What OS/Platform are you on ? -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Anyone-has-perfect-solution-for-spark-source-code-compilation-issue-

Intermittent timeout failure org/apache/spark/sql/hive/thriftserver/CliSuite.scala

2015-08-12 Thread Tim Preece
er if some fixes ( e.g https://issues.apache.org/jira/browse/SPARK-7973 ) may be a result of this Scala issue.   I am new to the Spark community. Is there a preferred way to track the fact the Spark testcase CliSuite has a dependency on the above Scala issue ?  

Re: [ANNOUNCE] Ending Java 6 support in Spark 1.5 (Sep 2015)

2015-05-19 Thread Tim Ellison
Sean, Did the JIRA get created? If so I can't find it so a pointer would be helpful. Regards, Tim On 06/05/15 06:59, Reynold Xin wrote: > Sean - Please do. > > On Tue, May 5, 2015 at 10:57 PM, Sean Owen wrote: > >> OK to file a JIRA to scrape out a few Java 6-specifi

Re: running the Terasort example

2014-12-17 Thread Tim Harsch
On 12/16/14, 11:42 PM, "Ewan Higgs" wrote: >Hi Tim, > >> On 16 Dec 2014, at 19:27, Tim Harsch wrote: >> >> Hi Ewan, >> Thanks, I think I was just a bit confused at the time, I was looking at >> the spark-perf repo when there was the problem

Re: running the Terasort example

2014-12-16 Thread Tim Harsch
/terasort/TeraOutputFormat.scala:76: value hsync is not a member of org.apache.hadoop.fs.FSDataOutputStream [ERROR] out.hsync(); [ERROR] ^ I can get past this by setting hadoop.version to 2.5.0 in the parent pom. Thanks, Tim On 12/16/14, 12:38 AM, "Ewan Higgs" w
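
The workaround described in this message might look like the fragment below. Only the property name `hadoop.version` and the value 2.5.0 come from the message; the surrounding `<properties>` layout is an assumption about how the parent pom is organised.

```xml
<!-- Parent pom.xml (fragment). The <properties> wrapper is assumed;
     hadoop.version and 2.5.0 are taken from the message above. -->
<properties>
  <hadoop.version>2.5.0</hadoop.version>
</properties>
```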

running the Terasort example

2014-12-11 Thread Tim Harsch
changes weren't pushed? Thanks for any help, Tim

Cassandra Examples Don't Work

2014-06-05 Thread Tim Kellogg
Hi, I’ve tried running the CassandraTest example against several versions of Cassandra and I can’t get it to work. I’m wondering if I’m doing something wrong, or if they simply don’t work. Please help! http://stackoverflow.com/q/24069039/503826 Much Thanks! Tim Kellogg Sr. Software Engineer

Re: Updating docs for running on Mesos

2014-05-13 Thread Tim St Clair
Perhaps linking to a Mesos page, which then can list the various package incantations. Cheers, Tim - Original Message - > From: "Matei Zaharia" > To: dev@spark.apache.org > Sent: Tuesday, May 13, 2014 2:59:42 AM > Subject: Re: Updating docs for running on Mesos