Hi Sandeep,
Would you be interested in joining my open source project?
https://github.com/tribbloid/spookystuff
IMHO Spark is indeed not meant for general-purpose crawling, where the
distributed job is highly homogeneous. But it is good enough for directional
scraping, which involves heterogeneous input and
Hi Daniil,
I have to do some processing of the results, as well as push the data to
the front end. Currently I'm using Akka for this application, but I was
thinking Spark Streaming might be a better fit, and I could also use
MLlib for processing the results. Any specific reasons
Thanks.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Crawler-and-Scraper-with-different-priorities-tp13645.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.