You can do a join between a streaming dataset and a static dataset. I would
prefer your first approach. But the problem with this approach is
performance: unless you cache the dataset, every time you fire a join query
it will fetch the latest records from the table.
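A minimal sketch of that pattern, assuming a Parquet lookup table and a
JSON file stream that share a "key" column (all paths and names here are
hypothetical):

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

  val spark = SparkSession.builder().appName("StreamStaticJoin").getOrCreate()

  // Static lookup table, cached so each micro-batch reuses the in-memory
  // copy instead of re-reading the table.
  val staticDf = spark.read.parquet("/data/lookup").cache()

  // Streaming side (hypothetical JSON file source and schema).
  val eventSchema = StructType(Seq(
    StructField("key", StringType),
    StructField("value", LongType)))
  val streamDf = spark.readStream.schema(eventSchema).json("/stream/input")

  // Stream-static inner join on the shared "key" column.
  val joined = streamDf.join(staticDf, "key")

  val query = joined.writeStream.format("console").start()
  query.awaitTermination()

The trade-off is exactly the one noted above: the cached static side will
not see rows added to the table after it is materialized; to pick up new
records you would have to unpersist and rebuild the cache.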
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra
That's the default number of shuffle partitions in Spark. You can tune it
using spark.sql.shuffle.partitions.
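For example (the default is 200; 64 here is an arbitrary value, pick one
based on data volume and core count):

  // Lower the shuffle parallelism for this session.
  spark.conf.set("spark.sql.shuffle.partitions", "64")

  // Equivalently, at submit time:
  //   spark-submit --conf spark.sql.shuffle.partitions=64 ...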
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra
On Tue, Aug 16, 2016 at 11:31 AM, Niranda Perera
wrote:
> Hi,
>
> I ran the follo
Hi All,
I tried to configure Spark to work with a MapR Hadoop cluster. For that I
built Spark 2.0 from source with the hadoop-provided option. Then, as per
the documentation, I set my Hadoop libraries in spark-env.sh.
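Roughly, the spark-env.sh setup follows the "Hadoop free build" section of
the Spark docs (the exact classpath depends on the MapR installation):

  # Point Spark at the cluster-provided Hadoop jars.
  export SPARK_DIST_CLASSPATH=$(hadoop classpath)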
However, I get an error while the SessionCatalog is being created. Please
refer below for the exception stack trace.
Hi Davies,
When you say *"UnsafeRow could come from UnsafeProjection, so We should
copy the rows for safety,"* do you intend to say that the underlying state
might change because of some state-update APIs?
Or is it due to some other rationale?
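For context, the pattern I am referring to is roughly the following. This
is internal Catalyst API, so it is only a sketch and may differ across
versions:

  import org.apache.spark.sql.catalyst.InternalRow
  import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
  import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

  val schema = StructType(Seq(StructField("v", IntegerType)))
  val proj = UnsafeProjection.create(schema)

  // UnsafeProjection reuses a single UnsafeRow buffer across calls, so
  // buffering its output without copy() would leave every buffered element
  // pointing at the bytes of the last row produced.
  def materialize(rows: Iterator[InternalRow]): Iterator[InternalRow] =
    rows.map(r => proj(r).copy())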
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra
I am trying out StatefulNetworkWordCount from the latest Spark master branch.
When I run this example I see an odd behaviour: if a key is repeated within
a batch, the output stream prints an entry for each repetition. E.g., if I
key in "ab" five times as input, it shows
(ab,1)
(ab,2)
(ab,3)
(ab,4)
(ab,5)
Is this the expected behaviour?
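For reference, the gist of the example's mapping function is roughly the
following (paraphrased; wordDstream is assumed to be a DStream[(String, Int)]
of (word, 1) pairs). It shows why each repetition emits its own line: the
function returns a mapped record per input element, not per key:

  import org.apache.spark.streaming.{State, StateSpec}

  val mappingFunc = (word: String, one: Option[Int], state: State[Int]) => {
    val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
    state.update(sum)
    (word, sum)  // emitted once for every incoming record of this key
  }

  val stateDstream = wordDstream.mapWithState(StateSpec.function(mappingFunc))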