Bear in mind that you won't be able to merely "tune" your schema; you will
need to completely redesign your data model. Step one is to list all of the
queries you need to perform and work out what flat, denormalized data model
they will need in order to execute efficiently in a NoSQL database. No
JOINs. No ad hoc queries. Secondary indexes are supported, but not advised.
The general model is that you have a "query table" for each form of query,
with the primary key adapted to the needs of that query. That means a lot of
denormalization and repetition of data. The new, automated Materialized
View feature of Cassandra 3.0 can help with that a lot, but it is a new
feature and not quite stable enough for production (there is no DataStax
Enterprise (DSE) release with 3.0 yet). Triggers are supported, but not
advised; it is better to do that processing at the application level. DSE
also supports Hadoop and Spark for batch/analytics and Solr for search and
ad hoc queries (or use Stratio or Stargate for Lucene queries).
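
To make the "query table" idea concrete, here is a minimal sketch in plain
Python. It does not talk to Cassandra at all: the two dicts are hypothetical
in-memory stand-ins for two query tables, and the names (orders_by_customer,
orders_by_product, record_order) are made up for illustration. The point is
only that the application writes the same row once per query it needs to
serve, and each read is then a lookup by that table's key, with no JOIN.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for two Cassandra "query tables".
# Each dict maps a partition key to the list of rows stored under it.
orders_by_customer = defaultdict(list)  # serves: "all orders for a customer"
orders_by_product = defaultdict(list)   # serves: "all orders for a product"


def record_order(order_id, customer_id, product_id, amount):
    """Write one order into every query table (application-level denormalization)."""
    row = {
        "order_id": order_id,
        "customer_id": customer_id,
        "product_id": product_id,
        "amount": amount,
    }
    # The same data is deliberately duplicated, once per query shape.
    orders_by_customer[customer_id].append(dict(row))
    orders_by_product[product_id].append(dict(row))


def orders_for_customer(customer_id):
    """Single-key lookup; returns a copy of the rows for that customer."""
    return list(orders_by_customer.get(customer_id, []))


def orders_for_product(product_id):
    """Single-key lookup; returns a copy of the rows for that product."""
    return list(orders_by_product.get(product_id, []))
```

Calling record_order() once per incoming order keeps both tables in step, and
that is roughly the job a Materialized View would otherwise do for you. In a
real cluster each dict would be its own table whose primary key starts with
the column you look up by.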

Best to start with a basic proof of concept implementation to get your feet
wet and learn the ins and outs before making a full commitment.

Is this a Java app? The Java Driver is where you need to get started in
terms of ingesting and querying data. It's a bit more sophisticated than
a simple JDBC interface. Even though the CQL syntax does look a lot like
SQL, most of your queries will need to be rewritten anyway, largely because
your data model will need to be made NoSQL-compatible.

That should get you started.


-- Jack Krupansky

On Tue, Jan 5, 2016 at 10:52 AM, Bhuvan Rawal <bhu1ra...@gmail.com> wrote:

> I understand, Ravi,  we have our application layers well defined. The
> major changes will be in database access layers and entities will be
> changed. Schema will be modified to tune the efficiency of the data store
> chosen.
>
> We have been using mongo as a cache for a long time now, but as it's a
> document store and since we have a crisp, well-defined schema, we chose to
> go with a columnar database.
>
> Our data size has been growing very rapidly. Currently it is 200GB with
> indexes; in a couple of years it will grow to approx 5 TB. And we may need
> to run procedures to aggregate data and update tables.
>
> On Tue, Jan 5, 2016 at 6:54 PM, Ravi Krishna <sravikrish...@gmail.com>
> wrote:
>
>> You are moving from a SQL database to C* ??? I hope you are aware of the
>> differences between a nosql like C* and a RDBMS. To keep it short, the app
>> has to change significantly.
>>
>> Please read documentation on differences between nosql and RDBMS.
>>
>> thanks.
>>
>> On Tue, Jan 5, 2016 at 6:20 AM, Bhuvan Rawal <bhu1ra...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I'm planning to shift from a SQL database to a columnar NoSQL database;
>>> we have narrowed our choices to Cassandra and HBase. I would really
>>> appreciate it if someone with decent experience with both could give me
>>> an honest comparison on the parameters below (links to neutral
>>> benchmarks/blogs also appreciated):
>>>
>>> 1. Data Consistency (Eventual consistency allowed but define "eventual")
>>> 2. Ease of Scaling Up
>>> 3. Manageability
>>> 4. Failure Recovery options
>>> 5. Secondary Indexing
>>> 6. Data Aggregation
>>> 7. Query Language (3rd party wrapper solutions also allowed)
>>> 8. Security
>>> 9. *Commercial Support for quick solutions to issues*.
>>> 10. Run batch job on data like map reduce or some common aggregation
>>> functions using row scan. Any other packages for cassandra to achieve this?
>>> 11. Trigger specific updates on tables used for secondary index.
>>> 12. Please consider that our DB will be the source of truth, with no
>>> specific requirement of immediate data consistency amongst nodes.
>>>
>>> Regards,
>>> Bhuvan Rawal
>>> SDE
>>>
>>
>>
>
