is there any api in spark like getInstance(className:String):AnyRef

2015-03-11 Thread Zhang, Liyun
Hi all: I'm a newbie to Spark and Scala, and I am now working on SPARK-5682 (add encrypted shuffle in Spark). I ran into a problem: is there any API in Spark like getInstance(className: String): AnyRef? I saw org.apache.spark.sql.hive.thriftserver.Refl…
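Spark does not expose a public `getInstance(className: String): AnyRef` method, but the usual way to get this behavior on the JVM is plain Java reflection. The helper below is a hypothetical sketch (the object name `ReflectionHelper` is not a Spark API), assuming the target class has a no-arg constructor:

```scala
// Hypothetical helper, NOT a Spark API: instantiate a class by its
// fully qualified name via Java reflection. Assumes the class is on
// the classpath and has a public no-arg constructor.
object ReflectionHelper {
  def getInstance(className: String): AnyRef = {
    val clazz = Class.forName(className)
    clazz.getDeclaredConstructor().newInstance().asInstanceOf[AnyRef]
  }
}

// Usage: build a java.util.ArrayList without naming the type statically.
val list = ReflectionHelper.getInstance("java.util.ArrayList")
println(list.getClass.getName) // prints "java.util.ArrayList"
```

For classes whose constructors take arguments, `clazz.getConstructor(...)` with the argument types can be used instead.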

Re: Spark Streaming - received block allocation to batch

2015-03-11 Thread Tathagata Das
See responses inline. On Wed, Mar 11, 2015 at 6:58 AM, Zoltán Zvara wrote: > I'm trying to understand the block allocation mechanism Spark uses to > generate batch jobs and a JobSet. > > JobGenerator.generateJobs tries to allocate received blocks to a batch, > effectively in ReceivedBlockTrack…

Apache Spark GSOC 2015

2015-03-11 Thread Tamer TAS
Hello everyone, I'm a senior-year computer engineering student in Turkey. My main areas of interest are cloud computing and machine learning. I've been working with Apache Spark's Scala API for a few months. My projects involved using MLlib for a movie recommendation system and a stock pr…

Re: enum-like types in Spark

2015-03-11 Thread RJ Nowling
How do these proposals affect PySpark? I think compatibility with PySpark through Py4J should be considered. On Mon, Mar 9, 2015 at 8:39 PM, Patrick Wendell wrote: > Does this matter for our own internal types in Spark? I don't think > any of these types are designed to be used in RDD records,
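One of the enum-like patterns under discussion in this thread is a sealed trait with case objects. The sketch below is illustrative (the type `StorageFormat` and its values are made up, not from Spark); the string round-trip matters for the Py4J concern raised above, since Py4J interop is simplest when values can be passed as plain strings:

```scala
// Illustrative enum-like type: sealed trait + case objects.
// Names here are hypothetical, not actual Spark types.
sealed trait StorageFormat { def name: String }

object StorageFormat {
  case object Parquet extends StorageFormat { val name = "parquet" }
  case object Json    extends StorageFormat { val name = "json" }

  // String round-trip keeps the type usable across Py4J, which is
  // most comfortable exchanging plain strings.
  def fromString(s: String): StorageFormat = s match {
    case "parquet" => Parquet
    case "json"    => Json
    case other     => throw new IllegalArgumentException(s"Unknown format: $other")
  }
}

println(StorageFormat.fromString("json").name) // prints "json"
```

The sealed trait gives exhaustiveness checking in `match` expressions on the Scala side, while PySpark callers would only ever see the string names.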

Re: [SparkSQL] Reuse HiveContext to different Hive warehouse?

2015-03-11 Thread Michael Armbrust
That val is not really your problem. In general, there is a lot of global state throughout the Hive codebase that makes it unsafe to try to connect to more than one Hive installation from the same JVM. On Tue, Mar 10, 2015 at 11:36 PM, Haopu Wang wrote: > Hao, thanks for the response. …

Using Log4j2 in spark executors

2015-03-11 Thread lior.c
Hi, I'd like to allow using log4j2 in executor code. Since Spark currently depends on log4j 1.2, I would like to support a Spark build with log4j2 instead of log4j 1.2. To accomplish that, I suggest creating a new profile for log4j2 in spark-parent. The default profile (log4j12) would include dep…
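A sketch of what such a profile in the parent POM might look like. This is an assumption about the proposal's shape, not its actual text; the version number is illustrative, and the `log4j-1.2-api` bridge is included so existing code written against the log4j 1.2 API still logs through log4j2:

```xml
<!-- Hypothetical log4j2 profile for spark-parent; the id, version,
     and dependency set are assumptions, not the proposal's text. -->
<profile>
  <id>log4j2</id>
  <dependencies>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-api</artifactId>
      <version>2.3</version>
    </dependency>
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-core</artifactId>
      <version>2.3</version>
    </dependency>
    <!-- Bridge: routes log4j 1.2 API calls to log4j2. -->
    <dependency>
      <groupId>org.apache.logging.log4j</groupId>
      <artifactId>log4j-1.2-api</artifactId>
      <version>2.3</version>
    </dependency>
  </dependencies>
</profile>
```

Builds would then select the logging backend with `-Plog4j2`, while the default profile keeps the existing log4j 1.2 dependencies.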

Re: GitHub Syncing Down

2015-03-11 Thread Sean Owen
(I have been able to push over the last few hours and see the commits in github) On Wed, Mar 11, 2015 at 2:38 PM, Ted Yu wrote: > Looks like github is functioning again (I no longer encounter this problem > when pushing to hbase repo). > > Do you want to give it a try ? > > Cheers > > On Tue, Mar

Re: GitHub Syncing Down

2015-03-11 Thread Ted Yu
Looks like github is functioning again (I no longer encounter this problem when pushing to the hbase repo). Do you want to give it a try? Cheers On Tue, Mar 10, 2015 at 6:54 PM, Michael Armbrust wrote: > FYI: https://issues.apache.org/jira/browse/INFRA-9259

Spark Streaming - received block allocation to batch

2015-03-11 Thread Zoltán Zvara
I'm trying to understand the block allocation mechanism Spark uses to generate batch jobs and a JobSet. JobGenerator.generateJobs tries to allocate received blocks to a batch; effectively, ReceivedBlockTracker.allocateBlocksToBatch creates a streamIdToBlocks map, where stream IDs (Int) are mapped to S…
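A simplified model of the allocation step being asked about. This is an illustrative sketch, not Spark's actual `ReceivedBlockTracker` code (the class and field names below are made up): unallocated blocks are queued per stream as they arrive, and allocating a batch drains every queue into an immutable per-batch snapshot keyed by stream ID:

```scala
import scala.collection.mutable

// Simplified, hypothetical model of block-to-batch allocation.
// NOT Spark's implementation; it only illustrates the shape of the
// streamIdToBlocks map built for each batch.
case class BlockInfo(blockId: Long)

class BlockTrackerSketch {
  // Blocks received but not yet assigned to any batch, per stream.
  private val unallocated = mutable.Map[Int, mutable.Queue[BlockInfo]]()
  // batchTime -> (streamId -> blocks allocated to that batch).
  private val batches = mutable.Map[Long, Map[Int, Seq[BlockInfo]]]()

  def addBlock(streamId: Int, block: BlockInfo): Unit =
    unallocated.getOrElseUpdate(streamId, mutable.Queue()).enqueue(block)

  // Drain every stream's queue into an immutable snapshot for this batch.
  def allocateBlocksToBatch(batchTime: Long): Map[Int, Seq[BlockInfo]] = {
    val streamIdToBlocks = unallocated.map { case (id, q) =>
      val blocks = q.toList // FIFO order
      q.clear()
      id -> (blocks: Seq[BlockInfo])
    }.toMap
    batches(batchTime) = streamIdToBlocks
    streamIdToBlocks
  }
}
```

Usage: after `addBlock(0, BlockInfo(1))`, calling `allocateBlocksToBatch(1000L)` returns a map whose entry for stream 0 holds that block, and a subsequent allocation for a later batch sees only blocks received in between.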