You can do a join between a streaming dataset and a static dataset. I would
prefer your first approach. The problem with it, though, is performance:
unless you cache the static dataset, every join query will fetch the latest
records from the table all over again.
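Something along these lines should work (a rough sketch, not your exact
code; the table path, source, and join column are all illustrative):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("StreamStaticJoin").getOrCreate()

// Cache the static side so each micro-batch reuses it instead of
// re-reading the table from storage for every join.
val staticDf = spark.read.parquet("/path/to/lookup").cache()

// "rate" is just a toy source here; it emits (timestamp, value: Long).
val streamDf = spark.readStream.format("rate").load()

// Assumes the lookup table also has a Long "value" column to join on.
val joined = streamDf.join(staticDf, Seq("value"))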
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra
Hi All,
I tried to configure Spark to work with a MapR hadoop cluster. For that I
built Spark 2.0 from source with the hadoop-provided option. Then, as per
the documentation, I set my hadoop libraries in spark-env.sh.
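For reference, this is roughly what I have in spark-env.sh (the explicit
hadoop path is illustrative, not necessarily what a MapR install uses):

# spark-env.sh: point a hadoop-provided build at the cluster's Hadoop jars
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
# or with an explicit hadoop binary, e.g.:
# export SPARK_DIST_CLASSPATH=$(/opt/mapr/hadoop/hadoop-2.7.0/bin/hadoop classpath)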
However, I get an error while the SessionCatalog is being created. Please
refer below for the exception.
That's the default number of shuffle partitions in Spark (200). You can
tune it using spark.sql.shuffle.partitions.
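For example (a sketch; the value 8 is arbitrary):

import org.apache.spark.sql.SparkSession

// Set it when building the session...
val spark = SparkSession.builder()
  .config("spark.sql.shuffle.partitions", "8") // default is 200
  .getOrCreate()

// ...or change it at runtime:
spark.conf.set("spark.sql.shuffle.partitions", "8")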
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra
On Tue, Aug 16, 2016 at 11:31 AM, Niranda Perera
wrote:
> Hi,
>
> I ran the follo
I am trying out StatefulNetworkWordCount from the latest Spark master
branch. When I run this example I see an odd behaviour: if a key is
repeated within a batch, the output stream prints one record per
repetition. E.g. if I key in "ab" five times as input, it shows:
(ab,1)
(ab,2)
(ab,3)
(ab,4)
(ab,5)
Is this the expected behaviour?
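For reference, this is the mapWithState core of the example as I am
running it (trimmed down; the host/port are my local test values):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, State, StateSpec, StreamingContext}

val ssc = new StreamingContext(
  new SparkConf().setAppName("StatefulNetworkWordCount"), Seconds(1))
ssc.checkpoint(".")

val wordDstream = ssc.socketTextStream("localhost", 9999)
  .flatMap(_.split(" "))
  .map(w => (w, 1))

// The mapping function runs once per input record, so a word entered
// five times in one batch emits five outputs with the running count.
val mappingFunc = (word: String, one: Option[Int], state: State[Int]) => {
  val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
  state.update(sum)
  (word, sum)
}

wordDstream.mapWithState(StateSpec.function(mappingFunc)).print()

ssc.start()
ssc.awaitTermination()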
Hi Davies,
When you say *"UnsafeRow could come from UnsafeProjection, so We should
copy the rows for safety." *do you intend to say that the underlying state
might change , because of some state update APIs ?
Or its due to some other rationale ?
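To illustrate my current understanding (a rough sketch against Catalyst
internals; the schema and values are made up):

import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.UnsafeProjection
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

val schema = StructType(Seq(StructField("x", IntegerType)))
val proj = UnsafeProjection.create(schema)

// The projection reuses a single output buffer, so holding on to the
// returned rows without copy() leaves them all aliasing the last result.
val aliased = Seq(InternalRow(1), InternalRow(2)).map(proj)
val copied  = Seq(InternalRow(1), InternalRow(2)).map(r => proj(r).copy())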
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra