Re: Rationale for using Hazelcast in front of Cassandra?

2016-10-07 Thread Peter Lin
Cassandra is a database, not an in-memory cache. Please don't abuse Cassandra like that when there are plenty of existing distributed cache products designed for that purpose. That's like asking "why can't I drag race with a school bus?" You could and it might be fun, but that's not what it was des

Re: Cassandra data model right definition

2016-10-03 Thread Peter Lin
I've met clients that read the cassandra docs and then said in a big meeting "it's just like relational database, it has tables just like sqlserver/oracle." I'm not putting words in other people's mouth either, but I've heard that said enough times to want to puke. Do the docs claim cassandra is

Re: Cassandra data model right definition

2016-10-03 Thread Peter Lin
Whether a storage engine requires schema isn't really critical for row oriented storage. How about CSV that doesn't have a header row? CSV is probably the most commonly used row oriented storage and tons of businesses still use it for B2B transactions. As you pointed out, some traditional RDBMS ha

Re: Cassandra data model right definition

2016-10-01 Thread Peter Lin
I'll second Ed's comment. The documentation should be more careful when using phrases "like relational databases". When we look at the history of relational databases, people expect certain things like ACID transactions, primary/foreign key constraints, query planners, joins and relational algebra

Re: Help on temporal data modeling

2016-09-23 Thread Peter Lin
yes it would. Whether next_billing_date is timestamp or date wouldn't make any difference on scanning all partitions. If you want them to be on the same node, you can use composite key, but there's a trade off. The nodes may get unbalanced, so you have to do the math to figure out if your specif

Re: Help on temporal data modeling

2016-09-23 Thread Peter Lin
Ignoring noSql for a minute, the standard way of modeling this in car and health insurance is with effective/expiration day. Commonly called bi-temporal data modeling. How people model bi-temporal models varies quite a bit from first hand experience, but the common thing is to have transaction tim

Re: Question about hector api documentation

2016-06-25 Thread Peter Lin
Object friendly APIs are a good fit for many use cases. Text-based languages are nice, but I personally prefer thrift and hector. Haven't we learned anything from RDBMS and ORM? Sent from my iPhone > On Jun 25, 2016, at 3:46 PM, Nate McCall wrote: > > >> I used to be surprised that people stil

Re: ScyllaDB, a new open source, Cassandra-compatible NoSQL

2015-09-23 Thread Peter Lin
Looking at the architecture and what scylladb does, I'm not surprised they got 10x improvement. SeaStar skips a lot of the overhead of copying stuff and it gives them CPU core affinity. Anyone that's listened to Cliff Click talk about cache misses, locks and other low level stuff would recognize the

Re: ScyllaDB, a new open source, Cassandra-compatible NoSQL

2015-09-22 Thread Peter Lin
very interesting. I'm glad to see someone building a drop in replacement for Cassandra. On Tue, Sep 22, 2015 at 5:40 PM, Tzach Livyatan wrote: > Hi Sachin > > On Tue, Sep 22, 2015 at 11:40 PM, Sachin Nikam wrote: > >> Tzach, >> Can you point to any documentation on scylladb site which talks abo

Re: Some love for multi-partition LWT?

2015-09-08 Thread Peter Lin
I would caution using paxos for distributed transaction in an inappropriate way. The model has to be logically and mathematically correct, otherwise you end up with corrupt data. In the worst case, it could cause cascading failure that brings down the cluster. I've seen distributed systems come to

Re: Support for ad-hoc query

2015-06-10 Thread Peter Lin
> required if I want to do analysis on the data stored in cassandra, do you >> have any better ideas)? >> >> Regards, >> Seenu. >> >> On Tue, Jun 9, 2015 at 5:57 PM, Peter Lin wrote: >> >>> >>> what do you mean by ad-hoc queries? >>

Re: Support for ad-hoc query

2015-06-09 Thread Peter Lin
what do you mean by ad-hoc queries? Do you mean simple queries against a single column family aka table? Or do you mean MDX style queries that look at multiple tables? if it's MDX style queries, many people extract data from Cassandra into a data warehouse that supports multi-dimensional cubes.

Re: Arbitrary nested tree hierarchy data model

2015-03-28 Thread Peter Lin
that's neat, thanks for sharing. sounds like the solution is partly inspired by merkle tree to make lookup fast and easy. peter On Fri, Mar 27, 2015 at 10:07 PM, Robert Wille wrote: > Okay, this is going to be a pretty long post, but I think its an > interesting data model, and hopefully some

Re: Documentation of batch statements

2015-03-03 Thread Peter Lin
I agree with jonathan haddad. For a traditional ACID transaction following the classic definition, isolation is necessary. Having said that, there are different levels of isolation. http://en.wikipedia.org/wiki/Isolation_%28database_systems%29#Isolation_levels Saying the distinction is pedantic is wr

Re: how to make unique coloumns in cassandra

2015-03-02 Thread Peter Lin
Use a RDBMS There is a reason constraints were created and why Cassandra doesn't have it Sent from my iPhone > On Mar 2, 2015, at 2:23 AM, Rahul Srivastava > wrote: > > but what if i want to fetch the value using on table then this idea might fail > >> On Mon, Mar 2, 2015 at 12:46 PM, Ajaya

Re: how to make unique constraints in cassandra

2015-02-28 Thread Peter Lin
synthetic PK are best, > e.g. UUIDs or TimeUUIDs. > > On Sat, Feb 28, 2015 at 8:42 AM, Peter Lin wrote: > >> >> Hate to be the one to point this out, but that is not the ideal use case >> for Cassandra. >> >> If you really want to brute force it and "

Re: how to make unique constraints in cassandra

2015-02-28 Thread Peter Lin
Hate to be the one to point this out, but that is not the ideal use case for Cassandra. If you really want to brute force it and "make it fit" cassandra, the easiest way is to create a class called Index. The index class would have name, phone and address fields. The hashcode and equals method wou
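
The message above is cut off before it says what the hashcode and equals methods would do, so the following is only a sketch under stated assumptions, not the approach the post goes on to describe. It assumes equality is defined over the three fields and that a digest of those fields serves as a row key. The `row_key` helper, the separator byte and the sample values are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen dataclasses get field-based __eq__ and __hash__
class Index:
    name: str
    phone: str
    address: str

    def row_key(self) -> str:
        """Stable digest of the three fields (hypothetical helper).

        Python's built-in hash() of a str is salted per process, so a
        cryptographic digest is used for any value that gets stored.
        """
        # The unit-separator byte keeps ("ab", "c") distinct from ("a", "bc").
        joined = "\x1f".join((self.name, self.phone, self.address))
        return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Made-up sample values: a and b are the same triple, c differs by phone.
a = Index("Ann Lee", "555-0100", "1 Main St")
b = Index("Ann Lee", "555-0100", "1 Main St")
c = Index("Ann Lee", "555-0199", "1 Main St")
```

With a key like this, two records carrying the same name, phone and address map to the same row, so a second write overwrites the first instead of creating a duplicate. Rejecting the second write outright would need a conditional write on top, which this sketch does not attempt.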

Re: Storing bi-temporal data in Cassandra

2015-02-20 Thread Peter Lin
record. > > About your comment on having valid_time in the keys, do I have a choice in > Cassandra, unless you are suggesting to use secondary indexes. > > I am new to bi-temporal data modeling. So please advise if you think > building this on top of Cassandra is a stupid idea.

Re: Accessing Cassandra Data from Excel / Tableau / R

2015-02-17 Thread Peter Lin
Hive can connect to Cassandra, so that means you can point Tableau to hive using JDBC. As long as you map Hive to cassandra, you should be able to query data just like regular hive On Tue, Feb 17, 2015 at 7:29 PM, Ashic Mahtab wrote: > What's a good way to load some cassandra data (perhaps resu

Re: Storing bi-temporal data in Cassandra

2015-02-15 Thread Peter Lin
I've built several different bi-temporal databases over the years for a variety of applications, so I have to ask "why are you modeling it this way?" Having a temperatures table doesn't make sense to me. Normally a bi-temporal database has transaction time and valid time. The transaction time is th
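
To make the two time axes concrete, here is a minimal in-memory sketch. It assumes the usual meaning of the terms: valid time is the period during which a value applies in the real world, and transaction time is when the row was recorded. The `Version` class, its field names, the `as_of` function and the sample policy data are illustrative assumptions, not taken from the thread, and nothing here is specific to Cassandra.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class Version:
    """One recorded version of a fact (all names are illustrative)."""
    key: str
    value: str
    valid_from: date   # first day the value applies in the real world
    valid_to: date     # first day it no longer applies (exclusive)
    recorded_on: date  # transaction time: the day this row was written


def as_of(versions: Sequence[Version], key: str,
          valid_on: date, known_on: date) -> Optional[Version]:
    """Return what was believed on `known_on` about `key` for the day `valid_on`."""
    candidates = [
        v for v in versions
        if v.key == key
        and v.valid_from <= valid_on < v.valid_to
        and v.recorded_on <= known_on
    ]
    # Among matching versions, the most recently recorded one wins.
    return max(candidates, key=lambda v: v.recorded_on, default=None)


history = [
    Version("policy-1", "basic", date(2015, 1, 1), date(2016, 1, 1), date(2014, 12, 15)),
    # A correction recorded later for the same valid period.
    Version("policy-1", "premium", date(2015, 1, 1), date(2016, 1, 1), date(2015, 2, 10)),
]
```

Asking about 1 March 2015 with knowledge as of 31 January 2015 yields the "basic" version; asking the same question with knowledge as of 1 March 2015 yields the later correction. Rows are only ever appended, which is why the earlier belief stays queryable.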

Re: Re: Dynamic Columns

2015-01-22 Thread Peter Lin
mn" in Thrift. The point in CQL3 > is not to eliminate a useful feature, dynamic column, but to repackage the > feature to make a lot more sense for the vast majority of use cases. Maybe > there are some cases that doesn't exactly fit as well as desired, but feel > free to spec

Re: Re: Dynamic Columns

2015-01-21 Thread Peter Lin
int. I don't know when that is, but every piece of software eventually dies or is abandoned. Except for Cobol. That thing will be around 200 yrs from now On Wed, Jan 21, 2015 at 6:57 PM, Robert Coli wrote: > On Wed, Jan 21, 2015 at 2:09 PM, Peter Lin wrote: > >> on the topic of m

Re: Re: Dynamic Columns

2015-01-21 Thread Peter Lin
rote: > On Wed, Jan 21, 2015 at 9:19 AM, Peter Lin wrote: > >> >> I consistently recommend new users learn and understand both Thrift and >> CQL. >> > > FWIW, I consider this a disservice to new users. New users should use CQL, > and not deploy against a d

Re: Re: Dynamic Columns

2015-01-21 Thread Peter Lin
tity. The mailing list isn't the right place to go into the theory and practice of temporal databases, but a lot of the design choices I made is based on formal logic. On Wed, Jan 21, 2015 at 4:06 PM, Sylvain Lebresne wrote: > On Wed, Jan 21, 2015 at 6:19 PM, Peter Lin wrote: >

Re: Re: Dynamic Columns

2015-01-21 Thread Peter Lin
isn't to put down CQL. It's because I care and want to help improve Cassandra by sharing my experience. I consistently recommend new users learn and understand both Thrift and CQL. On Wed, Jan 21, 2015 at 11:45 AM, Sylvain Lebresne wrote: > On Wed, Jan 21, 2015 at 4:44 PM, Peter Lin

Re: Re: Dynamic Columns

2015-01-21 Thread Peter Lin
I don't remember other people's examples in detail due to my shitty memory, so I'd rather not misquote. In my case, I mix static and dynamic columns in a single column family with primitives and objects. The objects are temporal object graphs with a known type. Doing this type of stuff is basicall

Re: Re: Dynamic Columns

2015-01-21 Thread Peter Lin
g about open source. Should thrift go away permanently I'll just fork Cassandra and do my own thing. On Wed, Jan 21, 2015 at 8:53 AM, Sylvain Lebresne wrote: > On Wed, Jan 21, 2015 at 3:46 AM, Peter Lin wrote: > >> >> I don't understand why people [...] pretend i

Re: Re: Dynamic Columns

2015-01-20 Thread Peter Lin
QL 3? > > At 2015-01-21 09:41:02, "Peter Lin" wrote: > > > I think that table example misses the point of chetan's functional > requirement. he actually needs dynamic columns. > > On Tue, Jan 20, 2015 at 8:12 PM, Xu Zhongxing > wrote: > >> Mayb

Re: Dynamic Columns

2015-01-20 Thread Peter Lin
I think that table example misses the point of chetan's functional requirement. he actually needs dynamic columns. On Tue, Jan 20, 2015 at 8:12 PM, Xu Zhongxing wrote: > Maybe this is the closest thing to "dynamic columns" in CQL 3. > > create table reivew ( > product_id bigint, > create

Re: Storing PDF data on Cassandra db

2015-01-13 Thread Peter Lin
you want to store the raw bytes, so look at examples for saving raw bytes. I generally recommend using Thrift if you're going to do a lot of read/write of binary data. CQL is good for primitive types, and maps/lists of primitive types. I'm biased, but it's simpler and easier to use thrift for storin

Re: Best approach in Cassandra (+ Spark?) for Continuous Queries?

2015-01-03 Thread Peter Lin
hence the value of > seasoned advice. > > > Best > > -- > Hugo José Pinto > > No dia 03/01/2015, às 23:43, Peter Lin escreveu: > > > listen to colin's advice, avoid the temptation of anti-patterns. > > On Sat, Jan 3, 2015 at 6:10 PM, Colin wrote:

Re: Best approach in Cassandra (+ Spark?) for Continuous Queries?

2015-01-03 Thread Peter Lin
listen to colin's advice, avoid the temptation of anti-patterns. On Sat, Jan 3, 2015 at 6:10 PM, Colin wrote: > Use a message bus with a transactional get, get the message, send to > cassandra, upon write success, submit to esp, commit get on bus. Messaging > systems like rabbitmq support this
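
The ordering in the quoted advice (get from the bus, write to Cassandra, submit to the ESP only after the write succeeds, then commit the get) can be sketched with in-memory stand-ins. Everything below is an assumption made for illustration: `InMemoryBus`, `pump` and the two callbacks are hypothetical names, and no real broker, Cassandra driver or ESP API is used.

```python
from collections import deque


class InMemoryBus:
    """Toy stand-in for a broker with acknowledged delivery."""

    def __init__(self, messages):
        self._pending = deque(messages)
        self._in_flight = None

    def get(self):
        # Hand out the next message but remember it until it is committed.
        self._in_flight = self._pending.popleft() if self._pending else None
        return self._in_flight

    def commit(self):
        self._in_flight = None

    def rollback(self):
        # Put the unacknowledged message back at the front of the queue.
        if self._in_flight is not None:
            self._pending.appendleft(self._in_flight)
            self._in_flight = None


def pump(bus, write_to_store, submit_to_esp):
    """Process messages until the bus is empty or a write fails."""
    while (message := bus.get()) is not None:
        try:
            write_to_store(message)  # persistence path first
        except Exception:
            bus.rollback()           # message stays on the bus for a retry
            return False
        submit_to_esp(message)       # event path only after a successful write
        bus.commit()
    return True
```

A failed write leaves the message on the bus, so a later call to `pump` picks it up again. The sketch does not handle a failure inside `submit_to_esp`; a real deployment would have to decide whether to roll back or retry there.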

Re: Best approach in Cassandra (+ Spark?) for Continuous Queries?

2015-01-03 Thread Peter Lin
It looks like you're using the wrong tool and architecture. If the use case really needs continuous query like event processing, use an ESP product to do that. You can still store data in Cassandra for persistence. The design you want is to have two paths: event stream and persistence. At the

Re: CQL3 vs Thrift

2014-12-29 Thread Peter Lin
ouldn't have had the same > opinion). The more I use it, the more I have come to like it. > > I started as a skeptic, and became a convert. > > On Mon, Dec 29, 2014 at 12:04 PM, Peter Lin wrote: > >> >> In my biased opinion something else should replace CQL and i

Re: CQL3 vs Thrift

2014-12-29 Thread Peter Lin
00% of the features that exist today Sent from my iPhone > On Dec 29, 2014, at 1:34 PM, Robert Coli wrote: > >> On Tue, Dec 23, 2014 at 10:26 AM, Peter Lin wrote: >> >> I'm biased in favor of using both thrift and CQL3, though many people on the >> list probab

Re: CQL3 vs Thrift

2014-12-24 Thread Peter Lin
and there are some really awesome features already released only for CQL, > and more are coming. Find a path that works for you in CQL; we had to > change our thinking about a number of things, but it's worth the effort. > > On Wed, Dec 24, 2014 at 8:48 AM, Peter Lin wrote: >

Re: CQL3 vs Thrift

2014-12-24 Thread Peter Lin
basically any time you want to store maps of maps, lists of lists or actual java objects, CQL is not a good fit. CQL is really only good for primitive types, flat lists, maps and sets. Using Cassandra pure with static columns is perfectly valid, but I don't live in that world. Most of what I do re

Re: CQL3 vs Thrift

2014-12-24 Thread Peter Lin
> corner cases, but it's also possible I have a modeling alternative that you > may not have considered yet, regardless it's good practice and background for > me. > >> On Tue, Dec 23, 2014 at 12:26 PM, Peter Lin wrote: >> >> I'm biased in favor of us

Re: CQL3 vs Thrift

2014-12-23 Thread Peter Lin
I'm biased in favor of using both thrift and CQL3, though many people on the list probably think I'm crazy. CQL3 is good if what you need fits nicely in static columns, but it doesn't if you want to use dynamic columns and/or mix & match both in the same columnFamily. For a lot of what I use Cassand

Re: Cassandra for Analytics?

2014-12-18 Thread Peter Lin
like Esper have offered joins for years. > > What hasnt are systems like storm, spark, etc which I dont really classify > as stream processors anyway. > > > > -- > *Colin Clark* > +1-320-221-9531 > > > On Dec 18, 2014, at 1:52 PM, Peter Lin wrote: > > that

Re: Cassandra for Analytics?

2014-12-18 Thread Peter Lin
hu, Dec 18, 2014 at 8:18 AM, Ryan Svihla >> wrote: >>> >>> I'll decline to continue the commentary on spark, as again this probably >>> belongs on another list, other than to say, microbatches is an intentional >>> design tradeoff that has notable benefi

Re: Cassandra for Analytics?

2014-12-18 Thread Peter Lin
(fraud using windows of various sizes, live aggregation of data, and > joins), typically pulling from a Kafka topic, but it can be adapted to > pretty much any source. > > I'd argue you were correct about everything at one time, but you're saying > it can't do things

Re: Cassandra for Analytics?

2014-12-18 Thread Peter Lin
ffs, it's a > bit harsh to dismiss as "basic" something that was chosen and provides some > improvements over say..the Storm model. > > On Thu, Dec 18, 2014 at 7:13 AM, Peter Lin wrote: >> >> >> some of the most common types of use cases in stream process

Re: Cassandra for Analytics?

2014-12-18 Thread Peter Lin
nalytics cases with > Spark?" the answer is absolutely yes (and Storm for that matter). If the > question is "Can you do your analytics queries on Cassandra while you have > Spark sitting there doing nothing?" then of course the answer is no, but > that'd be

Re: Cassandra for Analytics?

2014-12-18 Thread Peter Lin
that depends on what you mean by real-time analytics. For things like continuous data streams, neither are appropriate platforms for doing analytics. They're good for storing the results (aka output) of the streaming analytics. I would suggest before you decide cassandra vs hbase, first figure out

Re: Spark SQL Vs CQL performance on Cassandra

2014-12-11 Thread Peter Lin
Spark is an in-memory architecture, so you're not going to see it go faster than CQL for a simple select from 1 table on a few keys. Where you'll see a benefit is loading lots of data into memory and doing some "report like" query where you join data from multiple tables. On Thu, Dec 11, 2014 at 8

Re: Why is Quorum not sufficient for Linearization?

2014-10-16 Thread Peter Lin
To the best of my knowledge, the only guaranteed way is with an ACID compliant system. The examples others have already provided should give you a decent idea. If that's not enough, you would need to read papers on CRDTs and how they compare to ACID systems. http://highscalability.com/blog/2010/12/23

Re: Dynamic schema modification an anti-pattern?

2014-10-07 Thread Peter Lin
Statically defining columns using EAV table approach is totally a wrong fit for Cassandra. Taking a step back, EAV tables generally don't scale no matter the database. I've done this on SqlServer, Oracle and DB2. Many products that use EAV approach like master data management products suffer fr

Re: [ANN] SparkSQL support for Cassandra with Calliope

2014-10-03 Thread Peter Lin
it's nice to see spark + cassandra work. This gives users an alternative to CQL that has more SQL functionality On Fri, Oct 3, 2014 at 2:16 PM, Rohit Rai wrote: > Hi All, > > An year ago we started this journey and laid the path for Spark + > Cassandra stack. We established the ground work and di

Re: Machine Learning With Cassandra

2014-08-30 Thread Peter Lin
there are other machine learning frameworks that scale better than hadoop + mahout http://hunch.net/~vw/ if the kind of machine learning you're doing is really large and speed matters, take a look at vowpal wabbit On Sat, Aug 30, 2014 at 4:58 PM, Adaryl "Bob" Wakefield, MBA < adaryl.wakefi...

Re: Why is the cassandra documentation such poor quality?

2014-07-24 Thread Peter Lin
there's quite a few blog entries on Datastax blog that really should be included in the docs On Thu, Jul 24, 2014 at 5:32 PM, Hao Cheng wrote: > I second this, especially since the version association for blog posts is > often vague. This makes looking at historical blog posts quite annoying >

Re: Why is the cassandra documentation such poor quality?

2014-07-24 Thread Peter Lin
for example, this old blog entry from way back in 2012 http://www.datastax.com/dev/blog/cql3-for-cassandra-experts On Thu, Jul 24, 2014 at 12:07 PM, Tyler Hobbs wrote: > > On Thu, Jul 24, 2014 at 3:55 AM, Nicholas Okunew > wrote: > >> most of the important stuff being in blog format > > > Whi

Re: Why is the cassandra documentation such poor quality?

2014-07-24 Thread Peter Lin
for starters all of the blog entries related to CQL3, like the change in terminology and compact storage. the last time I looked at the datastax documentation on CQL3, it wasn't nearly as detailed as the blog entries by jonathan ellis and sylvain. On Thu, Jul 24, 2014 at 12:07 PM, Tyler Hobbs w

Re: Why is the cassandra documentation such poor quality?

2014-07-23 Thread Peter Lin
ength > in all projects, often by vocal individuals such as yourselves who are > unhappy in some way with how the project is being run. However it is very > hard to please everyone - most of the time we can't even please all the > committers, and that is a much smaller and more

Re: Why is the cassandra documentation such poor quality?

2014-07-23 Thread Peter Lin
r driver, but I think it got > lost in the discussion of whether it supported CQL. If you say it supports > CQL and native protocol, I’m sure it will get very prompt attention. > > -- Jack Krupansky > > *From:* Peter Lin > *Sent:* Wednesday, July 23, 2014 8:30 AM > *To:* u

Re: Why is the cassandra documentation such poor quality?

2014-07-23 Thread Peter Lin
ng', or a 'slap in the face', or that it is > even particularly onerous. It is a slight psychological barrier, but in my > personal experience when a psychological barrier as low as this prevents me > from taking action, it's usually because I don't have as much de

Re: Why is the cassandra documentation such poor quality?

2014-07-23 Thread Peter Lin
y > add you within a day. It may be a psychological barrier, but it isn't > really a practical one. Still, if you feel the policy is incorrect, raise > this on the dev list also. > > > On Wed, Jul 23, 2014 at 1:33 PM, Peter Lin wrote: > >> >> I've tried

Re: Why is the cassandra documentation such poor quality?

2014-07-23 Thread Peter Lin
I've tried to contribute docs to Cassandra wiki in the past, but there's an obstacle. currently wiki.apache.org/cassandra is locked down, so only committers can edit it. I really wish that wasn't the case, since it wastes time. the committers are busy writing code. Having to email a committer and ask

Re: Dynamic Columns in Cassandra 2.X

2014-06-13 Thread Peter Lin
ecent discussion on cassandra dev and the choice was not to move to > it) > > I think the binary protocol is the way forward; CQL3 needs some new > features, or there need to be some other types of requests you can make > over the binary protocol > > On Jun 13, 2014, at 5:51

Re: Dynamic Columns in Cassandra 2.X

2014-06-13 Thread Peter Lin
on. >> >> Long story short I think Thrift may have appropriate usage but only in >> very few use cases. Recently a lot of improvement and features have been >> added to CQL3 so that it should be considered as the first choice for most >> users and if they fall into

Re: Dynamic Columns in Cassandra 2.X

2014-06-13 Thread Peter Lin
lastname text, > last_connection timestamp, > ); > > C* will create a column family with validation type = bytes to > accommodate the timestamp and text types for the firstname, lastname and > last_connection columns. Basically the CQL3 engine is doing the > s

Re: Dynamic Columns in Cassandra 2.X

2014-06-13 Thread Peter Lin
feature, which does not exist (and > probably won't) in CQL3, I don't see how you can have columns with > "different types" on the same row/partition > > > On Fri, Jun 13, 2014 at 11:06 PM, Peter Lin wrote: > >> >> when I say dynamic column, I mean no

Re: Dynamic Columns in Cassandra 2.X

2014-06-13 Thread Peter Lin
when I say dynamic column, I mean non-static columns of different types within the same row. Some could be an object or one of the defined datatypes. with thrift I use the appropriate serializer to handle these dynamic columns. On Fri, Jun 13, 2014 at 4:55 PM, DuyHai Doan wrote: > Well, before

Re: Dynamic Columns in Cassandra 2.X

2014-06-13 Thread Peter Lin
Like you, I make extensive use of dynamic columns for similar reasons. In our project, one of the goals is to give "end users" the ability to design their own schema without having to alter a table. If people really want strong schema, then just use old Sql or NewSql. RDB gives you the full power

Re: Dynamic Columns in Cassandra 2.X

2014-06-13 Thread Peter Lin
I like CQL, but it's not a hammer. If thrift is more appropriate for you, then use it. If Cassandra gets to the point where Thrift is removed, I'll just fork Cassandra. That's what's great about open source. On Fri, Jun 13, 2014 at 3:47 PM, DuyHai Doan wrote: > This strikes me as bad practice

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Peter Lin
for > .net, java, or python. Those firms that do are now starting to wrap those > drivers with any specific functionality they might require, like Netflix, > for example. Have you looked at DataStax's .NET driver? > > -- > Colin > +1 320 221 9531 > > > > On

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Peter Lin
n 2, 2014 at 8:29 AM, Benedict Elliott Smith < belliottsm...@datastax.com> wrote: > The native protocol specification has always been in the Apache Cassandra > repository. The implementations are not. > > > On 2 June 2014 13:25, Peter Lin wrote: > >> >> Ther

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Peter Lin
ife going forward. > > -- > Colin > +1 320 221 9531 > > > > On Mon, Jun 2, 2014 at 7:10 AM, Peter Lin wrote: > >> >> it is using thrift. I've updated the project page to state that info. >> >> >> On Mon, Jun 2, 2014 at 8:08 AM, Colin Cla

Re: Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Peter Lin
it is using thrift. I've updated the project page to state that info. On Mon, Jun 2, 2014 at 8:08 AM, Colin Clark wrote: > Is your version of Hector using native protocol or thrift? > > -- > Colin > +1 320 221 9531 > > > > On Mon, Jun 2, 2014 at 6:41 AM, Peter

Nectar client - New Cassandra Client for .Net

2014-06-02 Thread Peter Lin
I'm happy to announce Concord has decided to open source our port of Hector to .Net. The project is hosted on google code https://code.google.com/p/nectar-client/ I'm still adding code documentation and wiki pages. It has been tested against 1.1.x, 2.0.x thanks peter

Re: Migrate from Hector(unmaintained) to Astyanax for Cassandra 2.0.7, (delaying thrift to CQL migration plan) ?

2014-05-28 Thread Peter Lin
I don't think anyone can predict the future. CQL is nice, but there's still lots of room for improvement. There's a reason why projects like spark, shark, impala and presto exist. I would expect something to replace CQL in the future as things evolve. Plus, the type safety that thrift clients shou

Re: Migrate from Hector(unmaintained) to Astyanax for Cassandra 2.0.7, (delaying thrift to CQL migration plan) ?

2014-05-28 Thread Peter Lin
I contribute to Hector. It is still being maintained. I still see benefits of using thrift over CQL. On Wed, May 28, 2014 at 10:19 AM, user 01 wrote: > Currently I am using Hector which is no longer maintained by its > developers. So, for the past few days I have been looking at Astyanax & to > be

Re: Cassandra CSV & JSON uploader

2014-05-28 Thread Peter Lin
I think it's important to remember that distributed caches are different than NoSql databases. As much as people like to think both of them are hammers, they're not. The kinds of workloads each is good at are different, so let's not recommend people misuse and abuse cassandra, dse or coherence. On T

Re: What % of cassandra developers are employed by Datastax?

2014-05-23 Thread Peter Lin
I think we can all agree that DataStax has been a positive for Cassandra. There's no point arguing that in my mind. A separate but important consideration is long term health of a project. Many apache projects face this issue. When a project doesn't continually grow the contributors and committers

Re: What % of cassandra developers are employed by Datastax?

2014-05-23 Thread Peter Lin
misleading. >> >> Why wouldn't a company want to hire people who have shown a desire and >> aptitude to work on products that they care about? It's just rational. And >> damn genius, actually. >> >> I'm sure they'd be happy to have an influx o

Re: What % of cassandra developers are employed by Datastax?

2014-05-17 Thread Peter Lin
if you look at the new committers since 2012 they are mostly datastax On Fri, May 16, 2014 at 9:14 PM, Kevin Burton wrote: > so 30%… according to that data. > > > On Thu, May 15, 2014 at 4:59 PM, Michael Shuler wrote: > >> On 05/14/2014 03:39 PM, Kevin Burton wrote: >> >>> I'm curious what % of

Re: What % of cassandra developers are employed by Datastax?

2014-05-16 Thread Peter Lin
perhaps the committers should invite other developers that have shown an interest in contributing to Cassandra. the rate of adding new non-Datastax committers "appears" to be low the last 2 years. I have no data to support it, it's just a feeling based on personal observations the last 3 years.

Re: Select with filtering

2014-04-25 Thread Peter Lin
Other people have expressed an interest and there's existing jira ticket for this type of feature. Unfortunately it hasn't gotten much traction and the tickets are basically dead Sent from my iPhone > On Apr 25, 2014, at 12:03 PM, Mikhail Mazursky wrote: > > Hello Paco, > > thanks for respo

Re: Thrift -> CQL

2014-03-26 Thread Peter Lin
Hector has round robin and failover. Is there a particular kind of failover you're looking for? by default Hector will try another node if the first node it connects to is down. It's been that way since the 1.x client if I'm not mistaken. On Wed, Mar 26, 2014 at 9:41 AM, rubbish me wrote: > Hi

Re: Serial Consistency and Thrift API

2014-03-15 Thread Peter Lin
thanks for sharing that info. I haven't needed to use CAS yet and haven't bothered to look at it. I'll have to document that for hector. On Sat, Mar 15, 2014 at 5:45 AM, Sylvain Lebresne wrote: > On Fri, Mar 14, 2014 at 7:59 PM, Panagiotis Garefalakis < > panga...@gmail.com> wrote: > >> >> Hello

Re: Serial Consistency and Thrift API

2014-03-14 Thread Peter Lin
Recently I added CQL3 support to Hector, but I haven't had time to try out serial writes. On Fri, Mar 14, 2014 at 3:34 PM, Robert Coli wrote: > On Fri, Mar 14, 2014 at 11:59 AM, Panagiotis Garefalakis < > panga...@gmail.com> wrote: > >> I am running some tests in my cluster and I wanted to try

Re: CQL Select Map using an IN relationship

2014-03-13 Thread Peter Lin
ndra unit library I'm using for testing [1] I will >>>> try to fix my build dependencies and retry, thx. >>>> >>>> /Dave >>>> >>>> [1] https://github.com/jsevellec/cassandra-unit >>>> >>>> >>>> On Thu

Re: CQL Select Map using an IN relationship

2014-03-13 Thread Peter Lin
it's not clear to me if your "id" column is the KEY or just a regular column with secondary index. queries that have IN on non primary key columns aren't supported yet. not sure if that answers your question. On Thu, Mar 13, 2014 at 7:12 AM, David Savage wrote: > Hi there, > > I'm experimenting

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
al of "make writing complex > queries less painful and more efficient." by providing a deep integration > mechanism to host that code. It's very much a "enough rope to hang > ourselves" approach, but badly needed, IMO > > -Tupshin > On Mar 12, 2014 12

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
tions. > > Anybody interested in working on a coherent proposal with me? > > -Tupshin > > > On Wed, Mar 12, 2014 at 10:12 AM, Brian O'Neill wrote: > >> >> just when you thought the thread died... >> >> >> First, let me say we are *WAY* off topic. B

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
@Nate I don't want to change the separation of components in cassandra. My ultimate goal is "make writing complex queries less painful and more efficient." How that becomes reality is anyone's guess. There's different ways to get there. I also like having a pluggable transport layer, which is why I

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
ists about how a driver should handle UDTs), > but it shows a problem with the the-spec-is-the-truth argument. I think > we'll be fine as long as the spec is the truth, but that requires the spec > to be the truth and new features to not be bolted on outside of the spec. > > T#

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
love this community because there are a ton of passionate, smart people. > (often with differing perspectives ;) > > RE: Reporting against C* (@Peter Lin) > We've had the same experience. Pig + Hadoop is painful. We are > experimenting with Spark/Shark, operating directly again

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
es with subqueries, like, group by and joins" --> Did you have > a look at Intravert ? I think it does union & intersection on server side > for you. Not sure about join though.. > > > On Wed, Mar 12, 2014 at 12:44 PM, Peter Lin wrote: > >> >> Hi Ed, >

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Peter Lin
take a while. Back in the day no one wanted cassandra to > be heavy-weight and rejected ideas like read-before write operations. The > common advice was "do them client side". Now in the case of collections > sometimes they do read-before-write and it is the "stuff users want&

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-11 Thread Peter Lin
e time to do it." >>> >>> I see what your saying. CQL started as a way to make slice easier but it >>> is not even a query language, retrofitting these things is going to be very >>> hard. >>> >>> >>> >>> On Tue, Mar 11, 2014 at 7:

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-11 Thread Peter Lin
y optimizers. All of these things can be done, it's >> just a matter of people finding the time to do it." >> >> I see what your saying. CQL started as a way to make slice easier but it >> is not even a query language, retrofitting these things is going to be v

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-11 Thread Peter Lin
is a clear > message: The committers are unwilling to accept new thrift features even if > said features are contributed by others. > > Edward > > > > On Tue, Mar 11, 2014 at 5:51 PM, Peter Lin wrote: > >> >> My bias opinion, just because some member of cassand

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-11 Thread Peter Lin
My biased opinion: even though some Cassandra developers want to abandon Thrift, I see benefits of continuing to improve it. The great thing about open source is that as long as some people want to keep working on it and improve it, it can happen. I plan to do my best to keep Thrift going, s

Re: How expensive are additional keyspaces?

2014-03-11 Thread Peter Lin
if I have time this summer, I may work on that, since I like having thrift. On Tue, Mar 11, 2014 at 12:05 PM, Edward Capriolo wrote: > This mistake is not a thrift limitation. In 0.6.X you could switch > keyspaces without calling setKeyspace(String) methods specified the > keyspace in every oper

Re: How expensive are additional keyspaces?

2014-03-11 Thread Peter Lin
I couldn't resist responding. Having done some experiments with lots of keyspaces and purposely created lots of keyspaces versus 1 keyspace, the only good reasons I see for many keyspaces are: 1. each keyspace needs a different replication factor. Even in this case, I personally can't justify having

Re: Driver documentation questions

2014-03-07 Thread Peter Lin
you would create a new session. Don't create a new cluster, that will quickly exhaust the connections to the servers On Fri, Mar 7, 2014 at 3:42 PM, Green, John M (HP Education) < john.gr...@hp.com> wrote: > I've been tinkering with both the C++ and Java drivers but in neither > case have I got

Re: Query on blob col using CQL3

2014-02-28 Thread Peter Lin
why are you trying to view a blob with CQL3? and what kind of blob is it? if the blob is an object, there's no way to view that in CQL3. You'd need to do extra work like user defined types, but I don't know of anyone that's actually using that. On Fri, Feb 28, 2014 at 12:14 PM, Senthil, Athinant

Re: CQL decimal encoding

2014-02-26 Thread Peter Lin
You may need to bit shift if that is the case Sent from my iPhone > On Feb 26, 2014, at 2:53 AM, Ben Hood <0x6e6...@gmail.com> wrote: > > Hey Colin, > >> On Tue, Feb 25, 2014 at 10:26 PM, Colin Blower wrote: >> It looks like you are trying to implement the Decimal type. You might want >> to s

Re: CQL decimal encoding

2014-02-25 Thread Peter Lin
: > On Tue, Feb 25, 2014 at 12:50 PM, Peter Lin wrote: > > > > if I have time this week, I'll try to make a patch for the spec. Can't > > promise I can get to it this week, but having come across this issue with > > FluentCassandra, I'd like to help others a
