You can do a join between a streaming dataset and a static dataset. I would
prefer your first approach. The problem with it, though, is performance:
unless you cache the static dataset, every join query will fetch the latest
records from the table all over again.
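Something along these lines should work (a rough sketch, not your exact
code; the table path, source, and join column are all illustrative):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("StreamStaticJoin").getOrCreate()

// Cache the static side so each micro-batch reuses it instead of
// re-reading the table from storage for every join.
val staticDf = spark.read.parquet("/path/to/lookup").cache()

// "rate" is just a toy source here; it emits (timestamp, value: Long).
val streamDf = spark.readStream.format("rate").load()

// Assumes the lookup table also has a Long "value" column to join on.
val joined = streamDf.join(staticDf, Seq("value"))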
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra
Hi All,
I tried to configure Spark to work with a MapR hadoop cluster. For that I
built Spark 2.0 from source with the hadoop-provided option. Then, as per
the documentation, I set my hadoop libraries in spark-env.sh.
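For reference, this is roughly what I have in spark-env.sh (the explicit
hadoop path is illustrative, not necessarily what a MapR install uses):

# spark-env.sh: point a hadoop-provided build at the cluster's Hadoop jars
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
# or with an explicit hadoop binary, e.g.:
# export SPARK_DIST_CLASSPATH=$(/opt/mapr/hadoop/hadoop-2.7.0/bin/hadoop classpath)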
However, I get an error while the SessionCatalog is being created. Please
refer below for the exception.
That's the default number of shuffle partitions in Spark (200). You can
tune it using spark.sql.shuffle.partitions.
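For example (a sketch; the value 8 is arbitrary):

import org.apache.spark.sql.SparkSession

// Set it when building the session...
val spark = SparkSession.builder()
  .config("spark.sql.shuffle.partitions", "8") // default is 200
  .getOrCreate()

// ...or change it at runtime:
spark.conf.set("spark.sql.shuffle.partitions", "8")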
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra
On Tue, Aug 16, 2016 at 11:31 AM, Niranda Perera
wrote:
> Hi,
>
> I ran the follo
I am trying out StatefulNetworkWordCount from the latest Spark master
branch. When I run this example I see an odd behaviour: if a key is
repeated within a batch, the output stream prints one record per
repetition. E.g. if I key in "ab" five times as input, it shows:
(ab,1)
(ab,2)
(ab,3)
(ab,4)
(ab,5)
Is this the expected behaviour?
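For reference, this is the mapWithState core of the example as I am
running it (trimmed down; the host/port are my local test values):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, State, StateSpec, StreamingContext}

val ssc = new StreamingContext(
  new SparkConf().setAppName("StatefulNetworkWordCount"), Seconds(1))
ssc.checkpoint(".")

val wordDstream = ssc.socketTextStream("localhost", 9999)
  .flatMap(_.split(" "))
  .map(w => (w, 1))

// The mapping function runs once per input record, so a word entered
// five times in one batch emits five outputs with the running count.
val mappingFunc = (word: String, one: Option[Int], state: State[Int]) => {
  val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
  state.update(sum)
  (word, sum)
}

wordDstream.mapWithState(StateSpec.function(mappingFunc)).print()

ssc.start()
ssc.awaitTermination()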
Hi Davies,
When you say *"UnsafeRow could come from UnsafeProjection, so We should
copy the rows for safety." *do you intend to say that the underlying state
might change , because of some state update APIs ?
Or its due to some other rationale ?
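To illustrate my current understanding (a rough sketch against Catalyst
internals; the schema and values are made up):

import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

val schema = StructType(Seq(StructField("x", IntegerType)))
val proj = UnsafeProjection.create(schema)

// The projection reuses a single output buffer, so holding on to the
// returned rows without copy() leaves them all aliasing the last result.
val aliased = Seq(InternalRow(1), InternalRow(2)).map(proj)
val copied  = Seq(InternalRow(1), InternalRow(2)).map(r => proj(r).copy())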
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra