Re: Query regarding inferring data types in pyspark

2015-04-11 Thread Suraj Shetiya
Humble reminder On Sat, Apr 11, 2015 at 12:16 PM, Suraj Shetiya wrote: > Hi, > > Below is one line from the json file. > I have highlighted the field that represents the date. > > "YEAR":2015,"QUARTER":1,"MONTH":1,"DAY_OF_MONTH":31,"DAY_OF_WEEK":6, > *"FL_DATE":"2015-01-31"*,"UNIQUE_CARRIER":"NK

Parquet File Binary column statistics error when reuse byte[] among rows

2015-04-11 Thread Yijie Shen
Hi, Suppose I create a dataRDD which extends RDD[Row], and each row is GenericMutableRow(Array(Int, Array[Byte])). The same Array[Byte] object is reused among rows but has different content each time. When I convert it to a DataFrame and save it as a Parquet file, the file's row group statistics (max &
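The message above is cut off, so the exact failure is not shown here. As a rough illustration only, the sketch below shows one plausible way that reusing a single mutable byte buffer across rows can corrupt min/max statistics: a tracker that keeps references instead of copies ends up with both values pointing at the last row's content. The names MinMaxStats and collect_stats are hypothetical and are not Spark or Parquet APIs.

```python
class MinMaxStats:
    """Tracks the smallest and largest value seen so far (hypothetical helper)."""

    def __init__(self, copy_values):
        self.copy_values = copy_values
        self.min = None
        self.max = None

    def update(self, value):
        # Copying takes a snapshot; without it we only keep a reference
        # to the caller's buffer, which may be overwritten later.
        candidate = bytes(value) if self.copy_values else value
        if self.min is None or candidate < self.min:
            self.min = candidate
        if self.max is None or candidate > self.max:
            self.max = candidate


def collect_stats(payloads, copy_values):
    """Feeds every payload through ONE reused buffer, as in the report above."""
    buffer = bytearray(1)  # single buffer shared by all rows
    stats = MinMaxStats(copy_values)
    for payload in payloads:
        buffer[:] = payload  # overwrite the content in place
        stats.update(buffer)
    return bytes(stats.min), bytes(stats.max)


rows = [b"b", b"a", b"c"]
print(collect_stats(rows, copy_values=False))  # min and max alias the same buffer
print(collect_stats(rows, copy_values=True))   # snapshots give the true min and max
```

If this is indeed the mechanism, the usual remedies are to copy the bytes when recording a statistic or to give each row its own array.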

Integrating D3 with Spark

2015-04-11 Thread shroffpradyumn
I'm working on adding a data-graph to the Spark jobs page (rendered by stagePage.scala) to help users analyze the different job phases visually. I've already made a mockup using "dummy data" and D3.js but I'm having some difficulties integrating my JavaScript code with the Scala code of Spark. Ess

Re: wait time between start master and start slaves

2015-04-11 Thread Ted Yu
From SparkUI.scala: def getUIPort(conf: SparkConf): Int = { conf.getInt("spark.ui.port", SparkUI.DEFAULT_PORT) } Better to retrieve the effective UI port before probing. Cheers On Sat, Apr 11, 2015 at 2:38 PM, Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > So basically, to tell if t

Re: wait time between start master and start slaves

2015-04-11 Thread Shivaram Venkataraman
Yeah, that's the best I can think of -- not sure if there is a better way to do it. On Sat, Apr 11, 2015 at 2:38 PM, Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > So basically, to tell if the master is ready to accept slaves, just poll > http://master-node:4040 for an HTTP 200 response? >

Re: [VOTE] Release Apache Spark 1.3.1 (RC3)

2015-04-11 Thread Denny Lee
+1 (non-binding) On Sat, Apr 11, 2015 at 11:48 AM Krishna Sankar wrote: > +1. All tests OK (same as RC2) > Cheers > > > On Fri, Apr 10, 2015 at 11:05 PM, Patrick Wendell > wrote: > > > Please vote on releasing the following candidate as Apache Spark version > > 1.3.1! > > > > The tag to be vo

Re: wait time between start master and start slaves

2015-04-11 Thread Nicholas Chammas
So basically, to tell if the master is ready to accept slaves, just poll http://master-node:4040 for an HTTP 200 response? On Sat, Apr 11, 2015 at 2:42 PM Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > Yeah from what I remember it was set defensively. I don't know of a good > way
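A minimal sketch of the polling idea discussed in this thread, assuming a plain HTTP GET against the master's web UI is an acceptable readiness signal. The function names, timeouts, and the injectable fetch and sleep parameters are illustrative assumptions, not part of Spark or its launch scripts. The URL is the one quoted in the thread; in practice the port should be whatever the UI is actually configured to listen on.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, request_timeout_s=5.0):
    """Returns the HTTP status code for url, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=request_timeout_s) as response:
            return response.status
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP error.
        return None


def wait_for_http_200(url, timeout_s=60.0, interval_s=1.0,
                      fetch=fetch_status, sleep=time.sleep):
    """Polls url until it answers 200 or timeout_s elapses; returns True on success."""
    deadline = time.monotonic() + timeout_s
    while True:
        if fetch(url) == 200:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical host name taken from the thread; adjust host and port as needed.
    ready = wait_for_http_200("http://master-node:4040", timeout_s=120.0)
    print("master ready" if ready else "gave up waiting for master")
```

Passing fetch and sleep as parameters keeps the loop testable without a running master; a launch script would only start the slaves once the call returns True.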

Re: [VOTE] Release Apache Spark 1.3.1 (RC3)

2015-04-11 Thread Krishna Sankar
+1. All tests OK (same as RC2) Cheers On Fri, Apr 10, 2015 at 11:05 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.3.1! > > The tag to be voted on is v1.3.1-rc2 (commit 3e83913): > > https://git-wip-us.apache.org/repos/asf?p=spark.git;a

Re: wait time between start master and start slaves

2015-04-11 Thread Shivaram Venkataraman
Yeah, from what I remember it was set defensively. I don't know of a good way to check if the master is up though. I guess we could poll the Master Web UI and see if we get a 200/OK response. Shivaram On Fri, Apr 10, 2015 at 8:24 PM, Nicholas Chammas < nicholas.cham...@gmail.com> wrote: > Check th

Re: [VOTE] Release Apache Spark 1.3.1 (RC3)

2015-04-11 Thread Reynold Xin
+1 On Fri, Apr 10, 2015 at 11:07 PM -0700, "Patrick Wendell" wrote: Please vote on releasing the following candidate as Apache Spark version 1.3.1! The tag to be voted on is v1.3.1-rc2 (commit 3e83913): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=3e8391327ba586

Re: [VOTE] Release Apache Spark 1.3.1 (RC3)

2015-04-11 Thread Sean Owen
+1 same result as last time. On Sat, Apr 11, 2015 at 7:05 AM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.3.1! > > The tag to be voted on is v1.3.1-rc2 (commit 3e83913): > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=3e8

Re: Integrating Spark with Ignite File System

2015-04-11 Thread Devl Devel
Hi Dmitriy, Thanks for the input. As per my previous email, I think it would be good to have a bridge project that, for example, creates an IgniteFS RDD, similar to the JDBC or HDFS one, in which we can extract blocks and populate RDD partitions. I'll post this proposal on your list. Thanks Devl O

Re: Integrating Spark with Ignite File System

2015-04-11 Thread Reynold Xin
Welcome, Dmitriy, to the Spark dev list! On Sat, Apr 11, 2015 at 1:14 AM, Dmitriy Setrakyan wrote: > Hello Everyone, > > I am one of the committers to Apache Ignite and have noticed some talks on > this dev list about integrating Ignite In-Memory File System (IgniteFS) > with Spark. We definite

Integrating Spark with Ignite File System

2015-04-11 Thread Dmitriy Setrakyan
Hello Everyone, I am one of the committers to Apache Ignite and have noticed some talks on this dev list about integrating Ignite In-Memory File System (IgniteFS) with Spark. We definitely like the idea. If you have any questions about Apache Ignite at all, feel free to forward them to the Ignite