Spark is a parallel computing framework. There are many ways to feed it
data. If you don't know why you would need HDFS, then you don't need it.
The same goes for ZooKeeper. Spark works fine without either.

Much of what we read online comes from people with specialized problems and
requirements (such as maintaining a highly available service, or accessing
an existing HDFS). It can be extremely confusing for someone who just needs
to do some parallel computing.

Pete

On Wed, Aug 24, 2016 at 3:54 PM, kant kodali <kanth...@gmail.com> wrote:

> What do I lose if I run Spark without using HDFS or ZooKeeper? Which of
> them is almost a must in practice?
>
