ate previously computed spark results.
>
> Regards,
> Yohann
>
>
> --
> *From:* Rick Moritz
> *Sent:* Thursday, March 16, 2017 10:37
> *To:* user
> *Subject:* Re: RE: Fast write datastore...
>
> If you have enough RAM/SSDs available, maybe ti
/aggregate previously computed spark results.
>
> Regards,
> Yohann
>
> From: Rick Moritz
> Sent: Thursday, March 16, 2017 10:37
> To: user
> Subject: Re: RE: Fast write datastore...
>
> If you have enough RAM/SSDs available, maybe tiered HDFS storage and Parquet
> might al
Subject: Re: RE: Fast write datastore...
If you have enough RAM/SSDs available, maybe tiered HDFS storage and Parquet
might also be an option. Of course, management-wise it has much more overhead
than using ES, since you need to manually define partitions and buckets, which
is suboptimal. On the
>>
>> *From:* Vova Shelgunov [mailto:vvs...@gmail.com]
>> *Sent:* Wednesday, March 15, 2017 11:51 PM
>> *To:* Muthu Jayakumar
>> *Cc:* vincent gromakowski ; Richard
>> Siebeling ; user ; Shiva
>> Ramagopal
>> *Subject:* Re: Fast write datastore...
>>
Hi,
Will MongoDB not fit this solution?
From: Vova Shelgunov [mailto:vvs...@gmail.com]
Sent: Wednesday, March 15, 2017 11:51 PM
To: Muthu Jayakumar
Cc: vincent gromakowski ; Richard Siebeling
; user ; Shiva Ramagopal
Subject: Re: Fast write datastore...
Hi Muthu,
I did not catch from
>Reading your original question again, it seems to me probably you don't
need a fast data store
Shiva, you are right. I only asked about fast writes and never mentioned
reads :). For us, Cassandra may not be a good choice for reads because of its
a. limitations on pagination support on the server side
b.
Hi,
The choice of ES vs Cassandra should really be made depending on your query
use-cases. ES and Cassandra have their own strengths which should be
matched to what you want to do rather than making a choice based on their
respective feature sets.
Reading your original question again, it seems to
we are using elasticsearch for this.
the issue of elasticsearch falling over if the number of partitions/cores
in spark writing to it is too high does suck indeed. and the answer every
time i asked about it on elasticsearch mailing list has been to reduce
spark tasks or increase elasticsearch node
Hi Muthu,
I did not catch from your message, what performance do you expect from
subsequent queries?
Regards,
Uladzimir
On Mar 15, 2017 9:03 PM, "Muthu Jayakumar" wrote:
> Hello Uladzimir / Shiva,
>
> From ElasticSearch documentation (i have to see the logical plan of a
> query to confirm), t
Hello Uladzimir / Shiva,
From the Elasticsearch documentation (I have to see the logical plan of a query
to confirm), the richness of filters (like regex, ...) is pretty good
compared to Cassandra. As for aggregates, I think Spark DataFrames are
rich enough to handle them.
Let me know your thoug
Probably Cassandra is a good choice if you are mainly looking for a
datastore that supports fast writes. You can ingest the data into a table
and define one or more materialized views on top of it to support your
queries. Since you mention that your queries are going to be simple you can
define you
Hello Vincent,
Cassandra may not fit my bill if I need to define my partition and other
indexes upfront. Is this right?
Hello Richard,
Let me evaluate Apache Ignite. I did evaluate it 3 months back and back
then the connector to Apache Spark did not support Spark 2.0.
Another drastic thought ma
maybe Apache Ignite does fit your requirements
On 15 March 2017 at 08:44, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:
> Hi
> If queries are static and filters are on the same columns, Cassandra is a
> good option.
>
> On March 15, 2017 7:04 AM, "muthu" wrote:
>
> Hello there,
>
Hi
If queries are static and filters are on the same columns, Cassandra is a
good option.
On March 15, 2017 7:04 AM, "muthu" wrote:
Hello there,
I have one or more Parquet files to read, and I need to run some aggregate
queries on them using Spark DataFrames. I would like to find a reasonably fast datastore