Integrating Spark with Ignite File System

2015-04-11 Thread Dmitriy Setrakyan
Hello Everyone, I am one of the committers to Apache Ignite and have noticed some talks on this dev list about integrating Ignite In-Memory File System (IgniteFS) with Spark. We definitely like the idea. If you have any questions about Apache Ignite at all, feel free to forward them to the Ignite

Re: Integrating Spark with Ignite File System

2015-04-11 Thread Reynold Xin
Welcome, Dmitriy, to the Spark dev list! On Sat, Apr 11, 2015 at 1:14 AM, Dmitriy Setrakyan wrote: > Hello Everyone, > > I am one of the committers to Apache Ignite and have noticed some talks on > this dev list about integrating Ignite In-Memory File System (IgniteFS) > with Spark. We definite

Re: Integrating Spark with Ignite File System

2015-04-11 Thread Devl Devel
Hi Dmitriy, Thanks for the input. As per my previous email, I think it would be good to have a bridge project that, for example, creates an IgniteFS RDD, similar to the JDBC or HDFS ones, from which we can extract blocks and populate RDD partitions. I'll post this proposal on your list. Thanks Devl O
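A bridge RDD along the lines Devl describes would derive one RDD partition per file block, the way Spark's Hadoop input RDD derives partitions from input splits. Below is a rough, hypothetical sketch of just the partition-planning logic in plain Python — no Ignite or Spark APIs are used, and names like `BlockInfo` and `hosts_for` are illustrative, not part of any real project:

```python
# Hypothetical sketch: plan one logical RDD partition per file block,
# carrying preferred host locations for locality-aware scheduling.
from dataclasses import dataclass, field

@dataclass
class BlockInfo:
    offset: int              # byte offset of the block within the file
    length: int              # block length in bytes
    hosts: list = field(default_factory=list)  # preferred locations

def plan_partitions(file_size: int, block_size: int, hosts_for=lambda i: []):
    """Split a file of `file_size` bytes into block-aligned partitions,
    mirroring how a JDBC/HDFS-style bridge RDD might compute getPartitions()."""
    partitions = []
    offset = 0
    index = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        partitions.append(BlockInfo(offset, length, hosts_for(index)))
        offset += length
        index += 1
    return partitions
```

In a real bridge, each `BlockInfo` would back a `compute()` call that reads that block from IgniteFS and yields its records.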

Re: [VOTE] Release Apache Spark 1.3.1 (RC3)

2015-04-11 Thread Sean Owen
+1 same result as last time. On Sat, Apr 11, 2015 at 7:05 AM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.3.1! > > The tag to be voted on is v1.3.1-rc2 (commit 3e83913): > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=3e8

Re: [VOTE] Release Apache Spark 1.3.1 (RC3)

2015-04-11 Thread Reynold Xin
+1 On Fri, Apr 10, 2015 at 11:07 PM -0700, "Patrick Wendell" wrote: Please vote on releasing the following candidate as Apache Spark version 1.3.1! The tag to be voted on is v1.3.1-rc2 (commit 3e83913): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=3e8391327ba586

Re: wait time between start master and start slaves

2015-04-11 Thread Shivaram Venkataraman
Yeah from what I remember it was set defensively. I don't know of a good way to check if the master is up though. I guess we could poll the Master Web UI and see if we get a 200/OK response. Shivaram On Fri, Apr 10, 2015 at 8:24 PM, Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > Check th

Re: [VOTE] Release Apache Spark 1.3.1 (RC3)

2015-04-11 Thread Krishna Sankar
+1. All tests OK (same as RC2) Cheers On Fri, Apr 10, 2015 at 11:05 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.3.1! > > The tag to be voted on is v1.3.1-rc2 (commit 3e83913): > > https://git-wip-us.apache.org/repos/asf?p=spark.git;a

Re: wait time between start master and start slaves

2015-04-11 Thread Nicholas Chammas
So basically, to tell if the master is ready to accept slaves, just poll http://master-node:4040 for an HTTP 200 response? On Sat, Apr 11, 2015 at 2:42 PM Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > Yeah from what I remember it was set defensively. I don't know of a good > way

Re: [VOTE] Release Apache Spark 1.3.1 (RC3)

2015-04-11 Thread Denny Lee
+1 (non-binding) On Sat, Apr 11, 2015 at 11:48 AM Krishna Sankar wrote: > +1. All tests OK (same as RC2) > Cheers > > > On Fri, Apr 10, 2015 at 11:05 PM, Patrick Wendell > wrote: > > > Please vote on releasing the following candidate as Apache Spark version > > 1.3.1! > > > > The tag to be vo

Re: wait time between start master and start slaves

2015-04-11 Thread Shivaram Venkataraman
Yeah, that's the best I can think of -- not sure if there is a better way to do it. On Sat, Apr 11, 2015 at 2:38 PM, Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > So basically, to tell if the master is ready to accept slaves, just poll > http://master-node:4040 for an HTTP 200 response? >

Re: wait time between start master and start slaves

2015-04-11 Thread Ted Yu
From SparkUI.scala: def getUIPort(conf: SparkConf): Int = { conf.getInt("spark.ui.port", SparkUI.DEFAULT_PORT) } Better to retrieve the effective UI port before probing. Cheers On Sat, Apr 11, 2015 at 2:38 PM, Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > So basically, to tell if t
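The poll-until-200 readiness check discussed in this thread can be sketched in pure Python. This is a hedged illustration, not Spark code: the URL, the timeout values, and the helper name are assumptions, and (per Ted's point) the port should be read from the effective `spark.ui.port` configuration rather than hard-coded:

```python
# Sketch: poll a web UI URL until it answers HTTP 200 or a deadline passes,
# e.g. to decide when a just-launched master is ready to accept slaves.
import time
import urllib.error
import urllib.request

def wait_for_http_ok(url, timeout_s=60, interval_s=1.0):
    """Return True once `url` responds with HTTP 200, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet (connection refused etc.); keep polling
        time.sleep(interval_s)
    return False
```

A launcher script could call `wait_for_http_ok("http://master-node:PORT/")` with the configured UI port before invoking start-slaves.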

Integrating D3 with Spark

2015-04-11 Thread shroffpradyumn
I'm working on adding a data-graph to the Spark jobs page (rendered by stagePage.scala) to help users analyze the different job phases visually. I've already made a mockup using "dummy data" and D3.js but I'm having some difficulties integrating my JavaScript code with the Scala code of Spark. Ess

Parquet File Binary column statistics error when reuse byte[] among rows

2015-04-11 Thread Yijie Shen
Hi, Suppose I create a dataRDD which extends RDD[Row], and each row is GenericMutableRow(Array(Int, Array[Byte])). The same Array[Byte] object is reused among rows but has different content each time. When I convert it to a DataFrame and save it as a Parquet file, the file's row group statistics (max &
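The behavior Yijie describes is a classic aliasing hazard: if a writer records a *reference* to a reused buffer instead of copying its contents, every recorded value — including any min/max statistics derived from them — ends up reflecting whatever the buffer held last. A minimal pure-Python illustration of the hazard (not Parquet or Spark code; the names are illustrative):

```python
# Demonstrates collecting a reused mutable buffer by reference vs. by copy.
buf = bytearray(4)  # one shared buffer, overwritten for every "row"

def collect(values, copy_on_collect):
    """Simulate a writer tracking column values while the producer reuses
    one buffer across rows (like reusing the same Array[Byte] per Row)."""
    collected = []
    for v in values:
        buf[:] = v.to_bytes(4, "big")  # overwrite shared buffer in place
        collected.append(bytes(buf) if copy_on_collect else buf)
    return collected

aliased = collect([1, 2, 3], copy_on_collect=False)
copied = collect([1, 2, 3], copy_on_collect=True)
# All `aliased` entries point at the same buffer, so every one of them now
# reads as 3 -- a min/max computed over them would wrongly report min == max.
# The `copied` entries preserve the distinct per-row values 1, 2, 3.
```

The fix on either side is a defensive copy: either the producer allocates a fresh byte array per row, or the statistics collector copies the bytes it retains.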

Re: Query regarding infering data types in pyspark

2015-04-11 Thread Suraj Shetiya
Humble reminder. On Sat, Apr 11, 2015 at 12:16 PM, Suraj Shetiya wrote: > Hi, > > Below is one line from the JSON file. > I have highlighted the field that represents the date. > > "YEAR":2015,"QUARTER":1,"MONTH":1,"DAY_OF_MONTH":31,"DAY_OF_WEEK":6, > *"FL_DATE":"2015-01-31"*,"UNIQUE_CARRIER":"NK
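Since JSON itself has no date literal, schema inference will read a field like FL_DATE as a plain string; one pragmatic workaround is to parse such fields explicitly after loading. A hedged, stdlib-only sketch (the field name comes from the sample above; the helper is illustrative and not a PySpark API — in PySpark the same idea would be applied per record or via a cast):

```python
# Sketch: parse one JSON line and convert known ISO-8601 date strings
# (which JSON cannot represent natively) into real date objects.
import json
from datetime import datetime

def parse_record(line, date_fields=("FL_DATE",)):
    """Parse a JSON line, upgrading listed string fields to datetime.date."""
    rec = json.loads(line)
    for f in date_fields:
        if isinstance(rec.get(f), str):
            rec[f] = datetime.strptime(rec[f], "%Y-%m-%d").date()
    return rec

sample = '{"YEAR": 2015, "QUARTER": 1, "MONTH": 1, "FL_DATE": "2015-01-31"}'
```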