I think your understanding is correct. There will be external libraries
that allow you to use the Twitter streaming DStream API even in 2.0, though.
On Sat, May 28, 2016 at 8:37 AM, Ricardo Almeida <
ricardo.alme...@actnowib.com> wrote:
> As far as I could understand...
> 1. Using Python (PySpark
Hi Ralph,
You could look at https://spark-packages.org/ and see if there's anything
you want on there and, if not, release your packages there.
Constraint programming might benefit from integration into Spark, though.
Marcin
On Mon, May 30, 2016 at 7:12 AM, Debusmann, Ralph
wrote:
> Hi,
>
>
>
I have to clarify something…
In SparkSQL, we can query against both existing immutable RDDs and
Hive/HBase/MapRDB, which are mutable.
So we have to keep this in mind when we talk about secondary indexing.
(It’s not just RDDs.)
I think the only advantage to being immutable is that once
I’m not sure where to post this since it’s a bit of a philosophical question in
terms of design and vision for Spark.
If we look at SparkSQL and performance… where does secondary indexing fit in?
The reason this is a bit awkward is that if you view Spark as querying RDDs
which are temporary, i
Hi,
I am still a Spark newbie who'd like to contribute.
There are two topics which I am most interested in:
1) Deep NLP (Syntactic/Semantic analysis)
2) Constraint Programming
For both, I see no built-in support in Spark yet. Or is there?
Cheers,
Ralph
I think I saw this one already as the first indication that something is
wrong and it was related to
https://issues.apache.org/jira/browse/SPARK-13516
2016-05-28 1:34 GMT+02:00 Koert Kuipers:
> it seemed to be related to an Aggregator, so for tests we replaced it with
> an ordinary Dataset.reduc