Re: An extremely fast cassandra table full scan utility

siddharth verma Mon, 03 Oct 2016 13:08:09 -0700

Hi Jon,
It wasn't allowed.
Moreover, someone who isn't familiar with Spark, and who may be new to
map, filter, reduce, etc., could also use the utility for simple
operations that only need a sequential scan of the Cassandra table.


Regards
Siddharth Verma

On Tue, Oct 4, 2016 at 1:32 AM, Jonathan Haddad <[email protected]> wrote:

> Couldn't set up as in couldn't get it working, or it's not allowed?
>
> On Mon, Oct 3, 2016 at 3:23 PM Siddharth Verma <
> [email protected]> wrote:
>
>> Hi Jon,
>> We couldn't set up a Spark cluster.
>>
>> Some of our use cases required a Spark cluster, but for some reason we
>> couldn't create one. In that situation, one may use this utility to
>> iterate through the entire table at very high speed.
>>
>> We had to find a workaround that would be faster than paging over a
>> result set.
>>
>> Regards
>>
>> Siddharth Verma
>>
>> On Tue, Oct 4, 2016 at 12:41 AM, Jonathan Haddad <[email protected]>
>> wrote:
>>
>> It almost sounds like you're duplicating all the work of both spark and
>> the connector. May I ask why you decided to not use the existing tools?
>>
>> On Mon, Oct 3, 2016 at 2:21 PM siddharth verma <
>> [email protected]> wrote:
>>
>> Hi DuyHai,
>> Thanks for your reply.
>> A few more features are planned for the next version (if there is one),
>> such as:
>> a custom policy that takes into account which nodes hold the replicas of
>> each token range,
>> finer-grained token ranges (for more speedup),
>> and a few more.
>>
>> On fine-graining the token range: I think that if one token range is
>> split further into, say, 2-3 parts and divided among threads, this would
>> exploit the parallelism available on a large, scaled-out cluster.
>>
>> And, as you mentioned with the JIRA on streaming of requests, that would
>> be of huge help with further splitting the range.
>>
>> Thanks once again for your valuable comments. :-)
>>
>> Regards,
>> Siddharth Verma
>>
>>
>>
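
The idea discussed above, splitting one token range into a few sub-ranges and dividing them among threads, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the utility's actual code: the names split_range, parallel_scan and fetch are hypothetical, the token bounds assume Murmur3Partitioner, and fetch stands in for whatever driver call runs a query of the form SELECT ... WHERE token(pk) > lo AND token(pk) <= hi.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed token bounds for Murmur3Partitioner (signed 64-bit range).
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_range(start, end, parts):
    """Split the range (start, end] into up to `parts` contiguous sub-ranges."""
    if parts < 1 or end <= start:
        raise ValueError("need parts >= 1 and start < end")
    width = end - start
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    # Drop empty sub-ranges, which can occur when parts > width.
    return [(bounds[i], bounds[i + 1])
            for i in range(parts) if bounds[i] < bounds[i + 1]]


def parallel_scan(fetch, start, end, parts, workers):
    """Run the hypothetical fetch(lo, hi) on each sub-range using a thread pool.

    fetch is a placeholder for a driver query restricted to
    token(pk) > lo AND token(pk) <= hi; it must return an iterable of rows.
    """
    subranges = split_range(start, end, parts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(lambda r: fetch(*r), subranges)
        # pool.map yields results in sub-range order.
        return [row for chunk in chunks for row in chunk]


if __name__ == "__main__":
    # In-memory stand-in for a table: each "row" is just its token.
    tokens = [-5, 0, 7, 100]
    fake_fetch = lambda lo, hi: [t for t in tokens if lo < t <= hi]
    print(split_range(-10, 100, 4))
    print(parallel_scan(fake_fetch, -10, 100, parts=4, workers=2))
```

Because each sub-range is half-open at the lower end and closed at the upper end, adjacent sub-ranges do not overlap, so no row is fetched twice. How many parts and workers actually help would depend on the cluster size and replication, as noted in the thread.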
