On Thu, Jul 28, 2011 at 10:59 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:
> On Thu, Jul 28, 2011 at 4:00 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> >
> > On Thu, Jul 28, 2011 at 9:35 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >>
> >> I'm talking about data compatibility, which is more important than cli
> >> statement compatibility.
> >>
> >> Consider someone with a python program that creates a CF with the
> >> default settings and inserts some (say) uuid columns and long data.
> >>
> >> If we changed CF creation to default to ascii we would break this program.
> >>
> >> So we had to leave CF comparator defaulting to BytesType, when we
> >> changed the CLI to respect comparator/validator definitions when
> >> parsing user input.
> >>
> >> You could argue that CLI should continue to parse BytesType as ascii
> >> but then how could a user input actual binary data? The lesser evil
> >> here is to educate users that "if you want to use ascii column names,
> >> that is how you should declare the comparator."
> >>
> >> On Thu, Jul 28, 2011 at 8:23 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> >> >
> >> > On Thu, Jul 28, 2011 at 8:46 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> >> >>
> >> >> It defaults to hex because that is how bytestype is represented. The
> >> >> default remains bytestype to provide the kind of backwards
> >> >> compatibility you are complaining about. :)
> >> >>
> >> >> On Thu, Jul 28, 2011 at 6:56 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> >> >> >
> >> >> > On Thursday, July 28, 2011, Sasha Dolgy <sdo...@gmail.com> wrote:
> >> >> >> Unfortunately, the perception that I have as a business consumer and
> >> >> >> night-time hack, is that more importance and effort is placed on
> >> >> >> ensuring information is up to date and correct on the
> >> >> >> http://www.datastax.com/docs/0.8/index website and less on keeping the
> >> >> >> wiki up to date or relevant...
> >> >> >> which forces people to be introduced to
> >> >> >> a for-profit company to get relevant information ... which just so
> >> >> >> happens to employ a substantial amount of Apache Cassandra
> >> >> >> contributors ... not that there's anything wrong with that, right?
> >> >> >>
> >> >> >> On Thu, Jul 28, 2011 at 10:46 AM, David Boxenhorn <da...@citypath.com> wrote:
> >> >> >>> This is part of a much bigger problem, one which has many parts, among them:
> >> >> >>>
> >> >> >>> 1. Cassandra is complex. Getting a gestalt understanding of it makes me
> >> >> >>> think I understand how Alzheimer's patients must feel.
> >> >> >>> 2. There is no official documentation. Perhaps everything is out there
> >> >> >>> somewhere, who knows?
> >> >> >>> 3. Cassandra is a moving target. Books are out of date before they hit the
> >> >> >>> press.
> >> >> >>> 4. Most of the important knowledge about Cassandra exists in a kind of oral
> >> >> >>> history, that is hard to keep up with, and even harder to understand once
> >> >> >>> it's long past.
> >> >> >>>
> >> >> >>> I think it is clear that we need a better one-stop-shop for good
> >> >> >>> documentation. What hasn't been talked about much - but I think it's just as
> >> >> >>> important - is a good one-stop-shop for Cassandra's oral history.
> >> >> >>>
> >> >> >>> (You might think this list is the place, but it's too noisy to be useful,
> >> >> >>> except at the very tip of the cowcatcher. Cassandra needs a canonized
> >> >> >>> version of its oral history.)
> >> >> >>
> >> >> >
> >> >> > Well the problem is not lack of documentation but changing things that
> >> >> > probably do not matter and thus invalidating all documentation.
> >> >> >
> >> >> > To stay on point.
> >> >> > Why does the cli default to hex? Come on, who is doing
> >> >> > inserts in hex? Would it be more natural for the cli to do this:
> >> >> >
> >> >> > 'ascii' auto function call ascii
> >> >> > "utf8" auto function utf8
> >> >> > 0xafaf auto function hex
> >> >> >
> >> >> > Or really do not change get; add a new statement
> >> >> > Typedget
> >> >> > And leave get alone
> >> >> >
> >> >> > The argument to have two methods that almost do the same thing is a bad one,
> >> >> > but it is no worse than invalidating tons of docs. But really I can't
> >> >> > support a hex default, I know no one with a hex keyboard.
> >> >> >
> >> >>
> >> >> --
> >> >> Jonathan Ellis
> >> >> Project Chair, Apache Cassandra
> >> >> co-founder of DataStax, the source for professional Cassandra support
> >> >> http://www.datastax.com
> >> >
> >> > I am a little confused. How can it be backwards compatible if the same
> >> > statements don't work across versions?
> >> >
> >> > I am sure there is a good reason, but isn't there some clever way this can
> >> > be done on the CLI without forcing me to create the column family with meta
> >> > data or wrapping everything in ascii('')? Something out of the box that is
> >> > easy and makes both worlds happy?
> >> >
> >> > Remember I left the rdbms world to cure my addictions to schemas, don't be
> >> > a 'schema pusher' :)
> >> >
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder of DataStax, the source for professional Cassandra support
> >> http://www.datastax.com
> >
> > I agree that defaulting a column family to ascii to make the CLI happy is
> > the wrong thing to do.
> >
> > But I think that the CLI is for users, users are almost always working with
> > human readable data.
> >
> > I feel that most CLIs do not force users to wrap CLI strings in ascii('')
> > and are capable of working with binary data.
> > http://dev.mysql.com/doc/refman/5.0/en/string-syntax.html
> >
> > Maybe I am wrong, but I feel like this change could have been done without
> > being disruptive and forcing users to re-educate. Can the antlr grammar be
> > re-worked in any such way?
>
> I'll play devil's advocate here but it seems to me that changing it again would
> do exactly what you complain about here. Another change would confuse
> users even more.
>
> We could argue that what you propose is vastly superior to what we have
> now and would thus justify yet one more change. But it's another debate, one
> that imho is debatable: I'm personally very happy when I use the CLI that
> provided I have set the right comparator, I don't have to care about quoting my
> strings at all (having to care whether I need a single or double quote would be
> even more annoying). Also, making sure that people understand quickly that there
> are column comparators, that those are useful and that they better use AsciiType or
> UTF8Type if this is what they want is not entirely a bad thing in my book. Again,
> just saying that it's debatable and the current way the CLI does things doesn't seem
> so retarded to me.
>
> That the change was confusing to users, I agree. But it's done, let's avoid doing
> it again without a very good reason.
>
> --
> Sylvain

> I'll play devil's advocate here but it seems to me that changing it again would
> do exactly what you complain about here. Another change would confuse
> users even more.

Who are your users? Cassandra 0.8.1 was released one month ago:
http://data.story.lu/2011/06/29/cassandra-0-8-1

Sure those people that downloaded 0.8.1 for the first time using cassandra
will be confused if we change it again. But those people are new and likely
confused by a lot of things. On the other hand all those who have been using
Cassandra for a while with substantial deployments will not be confused
because they likely are still on 0.7!
and probably waiting a month to see how stable 0.8.X is before even playing
with it.

Not to rip apart the CLI, and get too far off topic but you mentioned:

"I don't have to care about quoting my strings at all"

[default@letsgo] set Users[jsmith][la[st] = 'Smith';
Command not found: `set Users[jsmith][la[st] = 'Smith';`. Type 'help;' or '?' for help.
[default@letsgo] set Users[jsmith][la;st] = 'Smith';
Command not found: `set Users[jsmith][la;st] = 'Smith';`. Type 'help;' or '?' for help.

So unless someone just drops a month into bullet proofing the CLI and the
antlr grammar, it is going to end up changing anyway.

ascii('edward')
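
For reference, the quote-style dispatch suggested earlier in the thread (single quotes for ascii, double quotes for utf8, a 0x prefix for hex) can be sketched in a few lines of Python. This is only an illustration under assumptions: parse_literal, display, and is_ascii are hypothetical names, not part of cassandra-cli or any Cassandra client, and the UUID value is arbitrary. The sketch also illustrates Jonathan's point that uuid and long values are generally not valid ASCII, which is why an untyped (BytesType) column can only be shown safely as hex.

```python
import struct
import uuid


def parse_literal(token: str) -> bytes:
    """Hypothetical rule: the literal's syntax selects the encoding."""
    if len(token) >= 2 and token[0] == token[-1] == "'":
        return token[1:-1].encode("ascii")   # 'single quotes' -> ASCII
    if len(token) >= 2 and token[0] == token[-1] == '"':
        return token[1:-1].encode("utf-8")   # "double quotes" -> UTF-8
    if token[:2].lower() == "0x":
        return bytes.fromhex(token[2:])      # 0x... -> raw bytes
    raise ValueError(f"unrecognized literal: {token!r}")


def display(value: bytes) -> str:
    """Render bytes of unknown type as hex, the only lossless default."""
    return value.hex()


def is_ascii(value: bytes) -> bool:
    """Return True if the bytes could be shown as ASCII text."""
    try:
        value.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


# A string column name round-trips, but is displayed as hex without a comparator.
print(display(parse_literal("'Smith'")))      # 536d697468

# Binary values written by a client program are not ASCII.
column_name = uuid.UUID("f47ac10b-58cc-4372-a567-0e02b2c3d479").bytes
column_value = struct.pack(">q", 255)         # 8-byte big-endian long
print(is_ascii(column_name), is_ascii(column_value))   # False False
print(display(column_value))                  # 00000000000000ff
```

The trade-off Sylvain raises still applies: with a rule like this the user has to remember which quote style means which encoding, whereas a declared comparator lets the CLI infer it.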