I should qualify that by saying there is boto support for DynamoDB - but not
for the InputFormat. You could roll your own Python-based connection, but that
involves figuring out how to split the data in DynamoDB yourself - the
InputFormat takes care of this for you, so it should be the easier approach.
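
For what it's worth, a hand-rolled version would probably lean on DynamoDB's
parallel scan to do the splitting. A minimal, untested sketch (assuming a boto
version that exposes the segment/total_segments arguments on Table.scan, and a
table named "my-table" - adjust names and credentials for your own setup):

# Untested sketch: parallel-scan a DynamoDB table into an RDD using boto.
from pyspark import SparkContext

NUM_SEGMENTS = 8  # how many parallel scan segments to split the table into

def scan_segment(segment):
    # Import and connect inside the task so nothing boto-related gets
    # pickled and shipped from the driver to the workers.
    from boto.dynamodb2.table import Table
    table = Table("my-table")
    for item in table.scan(segment=segment, total_segments=NUM_SEGMENTS):
        yield dict(item.items())

sc = SparkContext(appName="dynamo-boto-scan")
items = sc.parallelize(range(NUM_SEGMENTS), NUM_SEGMENTS).flatMap(scan_segment)
print(items.count())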
Sent from Mailbox

On Fri, Jul 4, 2014 at 8:51 AM, Ian Wilkinson <ia...@me.com> wrote:

> Excellent. Let me get browsing on this.
> Huge thanks,
> ian
> On 4 Jul 2014, at 16:47, Nick Pentreath <nick.pentre...@gmail.com> wrote:
>> No boto support for that. 
>> 
>> In master there is Python support for loading Hadoop InputFormats. Not sure 
>> if it will be in 1.0.1 or 1.1.
>> 
>> In the master docs, under the programming guide, there are instructions, and 
>> under the examples project there are PySpark examples of using Cassandra and 
>> HBase. These should hopefully give you enough to get started. 
>> 
>> Depending on how easy it is to use the DynamoDB format, you may have to 
>> write a custom converter (see the mentioned examples for more details).
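>> 
>> Putting the pieces together, I'd expect the PySpark side to look roughly like
>> the sketch below. It is untested: the class names are the ones the EMR
>> DynamoDB connector uses (check its docs, the config keys may differ), and
>> "MyDynamoDBConverter" is a hypothetical converter class you would write and
>> register yourself.
>> 
>> # Untested sketch: read a DynamoDB table via its Hadoop InputFormat in PySpark.
>> # Assumes `sc` is an existing SparkContext (e.g. the pyspark shell) and that
>> # the EMR DynamoDB connector jar (plus any converter class) is on the classpath.
>> conf = {
>>     "dynamodb.input.tableName": "my-table",
>>     "dynamodb.endpoint": "dynamodb.us-east-1.amazonaws.com",
>> }
>> rdd = sc.hadoopRDD(
>>     inputFormatClass="org.apache.hadoop.dynamodb.read.DynamoDBInputFormat",
>>     keyClass="org.apache.hadoop.io.Text",
>>     valueClass="org.apache.hadoop.dynamodb.DynamoDBItemWritable",
>>     valueConverter="MyDynamoDBConverter",  # hypothetical custom converter
>>     conf=conf)
>> print(rdd.first())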
>> 
>> Sent from my iPhone
>> 
>> On 4 Jul 2014, at 08:38, Ian Wilkinson <ia...@me.com> wrote:
>> 
>>> Hi Nick,
>>> 
>>> I’m going to be working with Python primarily. Are you aware of
>>> comparable boto support?
>>> 
>>> ian
>>> 
>>> On 4 Jul 2014, at 16:32, Nick Pentreath <nick.pentre...@gmail.com> wrote:
>>> 
>>>> You should be able to use DynamoDBInputFormat (I think this should be part 
>>>> of the AWS libraries for Java) and create a HadoopRDD from that.
>>>> 
>>>> 
>>>> On Fri, Jul 4, 2014 at 8:28 AM, Ian Wilkinson <ia...@me.com> wrote:
>>>> Hi,
>>>> 
>>>> I noticed mention of DynamoDB as an input source in
>>>> http://ampcamp.berkeley.edu/wp-content/uploads/2012/06/matei-zaharia-amp-camp-2012-advanced-spark.pdf.
>>>> 
>>>> Unfortunately, Google is not coming to my rescue in finding
>>>> further mention of this support.
>>>> 
>>>> Any pointers would be well received.
>>>> 
>>>> Big thanks,
>>>> ian
>>>> 
>>> 
