Hey Chris,
Apologies for the delayed reply. Your responses are always insightful and
appreciated :-)
However, I have a few more questions.
"also, it looks like you're writing to S3 per RDD. you'll want to broaden
that out to write DStream batches"
I assume you mean "dstream.saveAsTextFiles(...
Mike:
Once Hadoop 2.7.0 is released, you should be able to enjoy the enhanced
performance of s3a.
See HADOOP-11571
Cheers
On Sat, Mar 21, 2015 at 8:09 AM, Chris Fregly wrote:
> hey mike!
>
> you'll definitely want to increase your parallelism by adding more shards
> to the stream - as well as s
hey mike!
you'll definitely want to increase your parallelism by adding more shards to
the stream - as well as spinning up 1 receiver per shard and unioning all the
shards per the KinesisWordCount example that is included with the kinesis
streaming package.
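
for reference, here is a rough sketch of that receiver-per-shard pattern, loosely modeled on the KinesisWordCount example. the stream name, endpoint, shard count, and batch interval are placeholders rather than values from this thread, and the createStream signature is assumed to be the Spark 1.3-era one (later releases changed it), so treat it as an illustration and not a drop-in job.

```scala
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStream
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kinesis.KinesisUtils

object ShardUnionSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical placeholders: substitute your own stream, endpoint, and shard count.
    val streamName = "my-stream"
    val endpointUrl = "https://kinesis.us-east-1.amazonaws.com"
    val numShards = 4
    val batchInterval = Seconds(2)

    val conf = new SparkConf().setAppName("ShardUnionSketch")
    val ssc = new StreamingContext(conf, batchInterval)

    // Create one Kinesis receiver (input DStream) per shard.
    // Assumes the Spark 1.3-era createStream signature; the batch interval
    // is reused here as the Kinesis checkpoint interval.
    val shardStreams = (0 until numShards).map { _ =>
      KinesisUtils.createStream(ssc, streamName, endpointUrl, batchInterval,
        InitialPositionInStream.LATEST, StorageLevel.MEMORY_AND_DISK_2)
    }

    // Union the per-shard streams into a single DStream of raw records.
    val unioned = ssc.union(shardStreams)

    // Decode each record's bytes as UTF-8 text and print a sample of every batch.
    unioned.map(bytes => new String(bytes, "UTF-8")).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

each receiver occupies a core for as long as the job runs, so the number of receivers has to stay below the number of cores available to the application.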
you'll need more cores (cluster) or t