Re: Zurich / Swiss / Alps meetup

2012-07-11 Thread Benoit Perroud
Coming back to this thread, we are proud to announce that we have opened a Swiss BigData UserGroup: http://www.bigdata-usergroup.ch/ The next meetup is on July 16, on the topic "NoSQL Storage: War Stories and Best Practices". Hope to meet you there! Benoit. 2012/5/17 Sasha Dolgy : > All, > > A year ago I made

Small typo in conf/cassandra.yaml

2011-05-10 Thread Benoit Perroud
Hi all, I found a small typo in cassandra.yaml, which can confuse an inattentive copy-paster. Here is the patch. Index: conf/cassandra.yaml === --- conf/cassandra.yaml (revision 1101465) +++ conf/cassandra.yaml (working copy) @@ -8

Re: Cassandra start/stop scripts

2011-08-02 Thread Benoit Perroud
kill -9 (SIGKILL) is the worst signal to use. It has the advantage of killing the process quickly, but no shutdown hooks are called. You should rather use kill -15 (SIGTERM, which is the default). 2011/7/26 mcasandra : > I need to write cassandra start/stop script. Currently I run "cassandra" to > start

Re: Sample Cassandra project in Tomcat

2011-08-03 Thread Benoit Perroud
I suppose what you are looking for is an example of interacting with a java app. You should have a look at the high(er) level client hector https://github.com/rantav/hector/ You should find what you are looking for there. If you are looking for a tomcat (and .war) example, you should send an email

Re: Killing cassandra is not working

2011-08-03 Thread Benoit Perroud
It seems you already have a Cassandra instance running, so the second instance cannot open the same port. I would suggest you kill all instances of Cassandra and start it again. 2011/8/3 Nilabja Banerjee > try to use *grep* command to check the port where your cassandra was > runni
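The underlying failure can be reproduced without Cassandra. The sketch below (plain Python sockets; `try_bind` and `port_conflict_demo` are hypothetical helper names) binds a listening socket on the loopback interface and then tries to bind a second one to the same port, which is what a second instance would attempt.

```python
import socket


def try_bind(port, host="127.0.0.1"):
    """Return a listening socket, or None if the address is already in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock
    except OSError:
        sock.close()
        return None


def port_conflict_demo():
    """Bind once, then try again on the same port while the first is open."""
    first = try_bind(0)  # port 0 lets the OS pick a free port
    port = first.getsockname()[1]
    second = try_bind(port)
    result = (first is not None, second is not None)
    first.close()
    if second is not None:
        second.close()
    return result


if __name__ == "__main__":
    print(port_conflict_demo())
```

As long as the first socket stays open, the second bind is refused, so stopping the old process is the only way to free the port.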

Re: Killing cassandra is not working

2011-08-03 Thread Benoit Perroud
So use netstat to find out which process has opened the port. 2011/8/3 CASSANDRA learner > Thnks for the reply Nila > > When i did PS command, I could not able to find any process related to > cassandra. Thts the problem.. > > > On Wed, Aug 3, 2011 at 4:12 PM,

Re: Significance of java_pidxxx.hprof

2011-08-03 Thread Benoit Perroud
When an OutOfMemoryError is thrown, a heap dump file named java_pidxxx.hprof (xxx being the process id) is created automatically if you run your Java app with -XX:+HeapDumpOnOutOfMemoryError. 2011/8/3 CASSANDRA learner : > As per subject, Please explain me what is the significance of > java_pidxxx.hprof >

Re: Sample Cassandra project in Tomcat

2011-08-03 Thread Benoit Perroud
2011/8/3 CASSANDRA learner : > Hi, >  can you please send me the mailing list address of tomcat http://tomcat.apache.org/lists.html > On Wed, Aug 3, 2011 at 4:07 PM, Benoit Perroud wrote: >> >> I suppose what you are looking for is an example of interacting with a >>

Fewer wide rows vs. more smaller rows

2011-08-04 Thread Benoit Perroud
Hi All, From a conceptual point of view, I'm wondering what the pros & cons are, mainly in terms of access efficiency, of the two approaches: - Grouping row keys together to reduce the number of keys, but having wider rows (with more columns) - One object in one row Let's illustrate with an example : I
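To make the two layouts concrete, here is a sketch using plain Python dictionaries rather than a Cassandra client. The object ids, the `bucket_of` grouping function and all helper names are assumptions for illustration; the sketch says nothing about which layout performs better.

```python
def one_object_per_row(objects):
    """Layout 1: every object id is its own row key."""
    return {obj_id: dict(fields) for obj_id, fields in objects.items()}


def grouped_wide_rows(objects, bucket_of):
    """Layout 2: objects sharing a bucket live in one wider row.

    Column names become (object id, field name) pairs.
    """
    rows = {}
    for obj_id, fields in objects.items():
        row = rows.setdefault(bucket_of(obj_id), {})
        for name, value in fields.items():
            row[(obj_id, name)] = value
    return rows


def read_object(wide_rows, bucket_of, obj_id):
    """Read one object back out of the grouped layout."""
    row = wide_rows.get(bucket_of(obj_id), {})
    return {name: value for (oid, name), value in row.items() if oid == obj_id}


if __name__ == "__main__":
    objects = {
        "ch-zurich": {"population": 400000},
        "ch-bern": {"population": 130000},
        "fr-paris": {"population": 2100000},
    }
    by_country = lambda obj_id: obj_id.split("-")[0]
    print(len(one_object_per_row(objects)), "row keys")
    print(len(grouped_wide_rows(objects, by_country)), "row keys")
```

The grouped layout has fewer row keys but each read has to address a column range inside a larger row.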

Re: Sample Cassandra project in Tomcat

2011-08-04 Thread Benoit Perroud
Or directly what you are looking at (tomcat + cassandra using hector client) : https://github.com/riptano/twissjava 2011/8/3 Benoit Perroud : > 2011/8/3 CASSANDRA learner : >> Hi, >>  can you please send me the mailing list address of tomcat > > http://tomcat.apache.org/l

Re: HOW TO select a column or all columns that start with X

2011-08-04 Thread Benoit Perroud
https://github.com/edanuff/CassandraIndexedCollections 2011/8/4 CASSANDRA learner : > Can you please gimme an example on this using hector client > > On Thu, Aug 4, 2011 at 7:18 AM, Boris Yen wrote: >> >> It seems to me that your column name consists of two components. If you >> have the luxury t

Re: Fewer wide rows vs. more smaller rows

2011-08-04 Thread Benoit Perroud
Thanks for your advice. Makes sense. And without sticking to my dummy example, conceptually, which has a smaller memory footprint: 1M rows of 1 column or 1 row with 1M columns? And if the row key and column name are known, is there any performance difference between the two scenarios? Thanks

Re: How to solve this kind of schema disagreement...

2011-08-05 Thread Benoit Perroud
Based on http://wiki.apache.org/cassandra/FAQ#schema_disagreement, 75eece10-bf48-11e0--4d205df954a7 owns the majority, so shut down and remove the schema* and migration* sstables from both 192.168.1.28 and 192.168.1.27 2011/8/5 Dikang Gu : > [default@unknown] describe cluster; > Cluster Informa

Re: Fewer wide rows vs. more smaller rows

2011-08-07 Thread Benoit Perroud
performance http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ There is no magic number. The best advice is to follow Jonathan's advice. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 5 Aug 2011, at 08:22, Benoit Pe

Re: Setup Cassandra0.8 in Eclipse

2011-08-07 Thread Benoit Perroud
Make sure svn is on the PATH. If you open a terminal (or cmd), running svn command should work. On 07. 08. 11 23:39, Alvin UW wrote: It seems svn wasn't installed, but i did install it.

Re: Need help in CF design

2011-08-11 Thread Benoit Perroud
You can apply this query really simply using cassandra and secondary indexes. You will have a CF "TABLE", where row keys are your PK. Just to be sure of my understanding, your SQL query will either return 1 row or no row, right ? 3) SliceQuery returns a range of columns for a given key, it m

Re: CompositeType

2011-08-15 Thread Benoit Perroud
You should have a look at https://github.com/edanuff/CassandraIndexedCollections This is a rather good starting point for Composites. 2011/8/15 Stephen Pope : >  Hey, is there any documentation or examples of how to use the CompositeType? > I can't find anything about it on the wiki or the datas

Re: The way to query a CF with "start > 10 and end < 100"

2011-08-29 Thread Benoit Perroud
Queries like start > 10 and end < 100 are not straightforward to model; you should use the value of start as the column name, and check the second condition on the client side. Just for comparison, modeling 10 < value < 100 is much easier if you set your values as column names, or using CompositeType
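The suggestion can be sketched in Python with a sorted list standing in for one row. This is an in-memory model, not client code: `query_intervals` is a hypothetical name, the column name is the `start` value and the column value is the `end` value.

```python
import bisect


def query_intervals(columns, start_gt, end_lt):
    """Return (start, end) pairs with start > start_gt and end < end_lt.

    `columns` is a list of (start, end) pairs sorted by start, mimicking a row
    whose column names are the start values.
    """
    names = [start for start, _ in columns]
    # Slice on the column name: this is the part the store can do for us.
    first = bisect.bisect_right(names, start_gt)
    sliced = columns[first:]
    # The second condition is checked on the client side.
    return [(start, end) for start, end in sliced if end < end_lt]


if __name__ == "__main__":
    row = [(5, 50), (10, 20), (11, 99), (12, 100), (40, 60), (200, 90)]
    print(query_intervals(row, 10, 100))
```

Only the first condition narrows what is read; the second one discards columns after they have been fetched, so the slice can be much larger than the final result.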

SSTableSimpleUnsortedWriter take long time when inserting big rows

2011-09-02 Thread Benoit Perroud
Hi All, I started using SSTableSimpleUnsortedWriter to load data, and my data has few rows but a lot of columns in each row. I call SSTableSimpleUnsortedWriter.newRow every 10'000 columns inserted. But the time taken to insert columns increases as the column family grows. The

Re: SSTableSimpleUnsortedWriter take long time when inserting big rows

2011-09-02 Thread Benoit Perroud
Thanks for your answer. 2011/9/2 Sylvain Lebresne : > On Fri, Sep 2, 2011 at 10:29 AM, Benoit Perroud wrote: >> Hi All, >> >> I started using SSTableSimpleUnsortedWriter to load data, and my data >> has a few rows but a lot of column name

Re: import data into cassandra

2011-09-18 Thread Benoit Perroud
There is no direct way to do that, but reading a CSV and inserting rows in Java is really easy. But you may want to have a look at the new bulk loading tool, sstableloader, described here : http://www.datastax.com/dev/blog/bulk-loading Small detail, it seems you are still writing to the incubator ML

Re: Possibility of going OOM using get_count

2011-09-19 Thread Benoit Perroud
The workaround for 0.7 is calling get_slice and counting on the client side. It's heavier, sure, but you will then be able to set the start column accordingly. 2011/9/19 Tharindu Mathew : > Thanks Aaron and Jake for the replies. > Any chance of a possible workaround to use for Cassandra 0.7? > > On Mon, Se
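A sketch of that workaround, in Python against an in-memory row. The `get_slice` function here is a stand-in written for this example (it is not the Thrift signature); it assumes, as the slice call does, that the start column is inclusive.

```python
import bisect


def get_slice(row, start, count):
    """Stand-in for a slice call: up to `count` column names >= `start`.

    `row` is a sorted list of column names; `start=None` means "from the
    beginning of the row".
    """
    idx = 0 if start is None else bisect.bisect_left(row, start)
    return row[idx:idx + count]


def count_columns(row, page_size=100):
    """Count the columns of a row page by page, never holding more than one page."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2")
    total = 0
    start = None
    while True:
        page = get_slice(row, start, page_size)
        if start is not None:
            page = page[1:]  # the start column was already counted
        if not page:
            return total
        total += len(page)
        start = page[-1]


if __name__ == "__main__":
    wide_row = ["col%05d" % i for i in range(250)]
    print(count_columns(wide_row, page_size=100))
```

Each round trip costs more than a single count call, but memory use is bounded by the page size instead of by the width of the row.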

Re: Bulk uploader issue on multi-node cluster

2011-09-23 Thread Benoit Perroud
On the sstableloader config, make sure you have the seed set and rpc_address and rpc_port pointing to your cassandra instance (127.0.0.2) 2011/9/23 Thamizh > Hi All, > > I am using bulk-loading to upload data(from lab02) to multi-node cluster of > 3 machines(lab02,lab03 & lab04) with sigle eth

Re: Multiple Keyword Lookup Indexes

2011-11-07 Thread Benoit Perroud
You could directly use secondary indexes on the other fields instead of maintaining your indexes yourself: define your global id (can be a UUID), and have columns loginName, email etc. with a secondary index. Retrieval will then be fast. 2011/11/7 Felix Sprick : > Hallo, > > We are implementing a Cass

Off-heap caching through ByteBuffer.allocateDirect when JNA not available ?

2011-11-09 Thread Benoit Perroud
Hi, I wonder if you have already discussed ByteBuffer.allocateDirect as an alternative to JNA memory allocation? If so, would someone mind sending me a pointer? Thanks ! Benoit.

Re: Off-heap caching through ByteBuffer.allocateDirect when JNA not available ?

2011-11-10 Thread Benoit Perroud
ra/browse/CASSANDRA-3271 > > On Wed, Nov 9, 2011 at 5:54 AM, Benoit Perroud wrote: >> Hi, >> >> I wonder if you have already discussed about ByteBuffer.allocateDirect >> alternative to JNA memory allocation ? >> >> If so, do someone mind send me a pointer ? >

Re: need help with choosing correct tokens for ByteOrderedPartitioner

2011-11-28 Thread Benoit Perroud
You may want to add 29991231 instead of appending. Le lundi 28 novembre 2011, Piavlo a écrit : > Anyone can help with this? > > Thanks > > On 11/24/2011 11:55 AM, Piavlo wrote: >> >> Hi, >> >> We need help with choosing correct tokens for ByteOrderedPartitioner >> Originally the key where suppo

Re: Counters and Top 10

2011-12-25 Thread Benoit Perroud
With a composite column name, you can even have columns composed of score (int) and userid (uuid or whatever). Leave the column value empty to avoid repeating the user UUID. 2011/12/22 R. Verlangen : > I would suggest you to create a CF with a single row (or multiple for > historical data) with a date as key (utf8
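The idea can be modeled in Python with a sorted list of `(score, user_id)` tuples playing the role of composite column names in a single row. The `Leaderboard` class and its `_score_of` lookup are assumptions of this sketch: something has to remember a user's previous score so that the old column can be removed when the score changes.

```python
import bisect


class Leaderboard:
    """One row whose column names are (score, user_id) composites, values empty."""

    def __init__(self):
        self._columns = []   # kept sorted, like column names within a row
        self._score_of = {}  # user_id -> current score (hypothetical helper index)

    def set_score(self, user_id, score):
        old = self._score_of.get(user_id)
        if old is not None:
            # A column name cannot be updated in place: delete, then insert.
            self._columns.remove((old, user_id))
        bisect.insort(self._columns, (score, user_id))
        self._score_of[user_id] = score

    def top(self, n=10):
        """Highest scores first: a reversed slice over the column names."""
        return self._columns[::-1][:n]


if __name__ == "__main__":
    board = Leaderboard()
    board.set_score("alice", 10)
    board.set_score("bob", 30)
    board.set_score("carol", 20)
    board.set_score("alice", 40)
    print(board.top(2))
```

Because the score is the first component of the name, the top N is simply the last N columns of the row read in reverse.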

Re: Cassandra 1.1 row isolation cross datacenter replication

2012-02-21 Thread Benoit Perroud
The isolation is guaranteed locally to the node. If two clients are reading / writing to the same node, the one that reads will not see partial mutations. 2012/2/21 Allen Servedio : > Hi, > > I saw that row level isolation was added in the beta of Cassandra 1.1 and I > have the following question: gi

Re: design that mimics twitter tweet search

2012-03-18 Thread Benoit Perroud
The simplest model you could have uses the keyword as key, a timestamp/time UUID as column name and the tweetid as value -> cf['keyword']['timestamp'] = tweetid then you do a range query to get all tweetid sorted by time (you may want them in reverse order) and you can limit to the number
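A sketch of this model in Python, with an in-memory dictionary in place of the column family. `KeywordIndex`, `latest` and the `before` parameter are names invented for the example; integer timestamps stand in for time UUIDs.

```python
import bisect
from collections import defaultdict


class KeywordIndex:
    """cf[keyword][timestamp] = tweet_id, with columns kept sorted by timestamp."""

    def __init__(self):
        self._rows = defaultdict(list)  # keyword -> sorted [(timestamp, tweet_id)]

    def add(self, keyword, timestamp, tweet_id):
        bisect.insort(self._rows[keyword], (timestamp, tweet_id))

    def latest(self, keyword, limit=20, before=None):
        """Newest tweet ids first, optionally only those older than `before`."""
        columns = self._rows.get(keyword, [])
        if before is None:
            end = len(columns)
        else:
            # (before,) sorts ahead of every (before, tweet_id) pair.
            end = bisect.bisect_left(columns, (before,))
        window = columns[max(0, end - limit):end]
        return [tweet_id for _, tweet_id in reversed(window)]


if __name__ == "__main__":
    index = KeywordIndex()
    index.add("cassandra", 1, "t1")
    index.add("cassandra", 3, "t3")
    index.add("cassandra", 2, "t2")
    print(index.latest("cassandra", limit=2))
```

Passing the oldest timestamp of one page as `before` on the next call gives simple paging through older tweets.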

Re: Link in Wiki broken

2012-03-18 Thread Benoit Perroud
http://blip.tv/datastax/getting-to-know-the-cassandra-codebase-4034648 2012/3/18 Tharindu Mathew : > Hi, > > It seems that [1] is broken. Wonder if it exists somewhere else? > > [1] - > http://www.channels.com/episodes/show/11765800/Getting-to-know-the-Cassandra-Codebase > > -- > Regards, > > Tha

Re: Cassandra - crash with “free() invalid pointer”

2012-03-22 Thread Benoit Perroud
Sounds like a race condition in the off heap caching while calling Unsafe.free(). Do you use cache ? What is your use case when you encounter this error ? Are you able to reproduce it ? 2012/3/22 Maciej Miklas : > Hi *, > > My Cassandra installation runs on flowing system: > > Linux with Kernel

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-27 Thread Benoit Perroud
Hi All, Thanks a lot for the release. I just upgraded my 1.1-beta1 to 1.1-beta2, and I get the following error : INFO 10:56:17,089 Opening /app/cassandra/data/data/system/LocationInfo/system-LocationInfo-hc-18 (74 bytes) INFO 10:56:17,092 Opening /app/cassandra/data/data/system/LocationInfo/sys

Re: [BETA RELEASE] Apache Cassandra 1.1.0-beta2 released

2012-03-27 Thread Benoit Perroud
orry for any inconvenience. > > -- > Sylvain > > On Tue, Mar 27, 2012 at 12:57 PM, Benoit Perroud wrote: >> Hi All, >> >> Thanks a lot for the release. >> I just upgraded my 1.1-beta1 to 1.1-beta2, and I get the following error : >> >>  INFO 10:56

Bulk loading errors with 1.0.8

2012-04-05 Thread Benoit Perroud
Hi All, I'm experiencing the following errors while bulk loading data into a cluster ERROR [Thread-23] 2012-04-05 09:58:12,252 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-23,5,main] java.lang.RuntimeException: Insufficient disk space to flush 781359405649475491

Re: unsubscribe

2012-04-07 Thread Benoit Perroud
http://wiki.apache.org/cassandra/FAQ#unsubscribe Le 7 avril 2012 14:37, Jeffrey Fass a écrit : > unsubscribe > > -- sent from my Nokia 3210

Re: unsubscribe

2012-04-27 Thread Benoit Perroud
http://wiki.apache.org/cassandra/FAQ#unsubscribe Le 27 avril 2012 19:20, Ramkumar Vaidyanathan (PDF) a écrit : > unsubscribe > > > > > The information in this email and any attachments to it may be confidential > and/or privileged. Unless you are the intended recipient (or authorized to > receive

Re: Building SSTables with SSTableSimpleUnsortedWriter

2012-04-29 Thread Benoit Perroud
A big buffer size will use more heap memory when creating the tables. Not sure about the impact on the server side, but it shouldn't make a big difference. I personally use 512 MB. 2012/4/28 sj.climber : > Can anyone comment on best practices for setting the buffer size used by > SSTableSimpleUnsortedWriter?  I'm

Re: Bulkload into a different CF

2012-05-01 Thread Benoit Perroud
!! Without any guarantee. I know it works but I never used this in production !! You can copy the sstables (renaming them accordingly) and call nodetool refresh. Don't forget to create your column family CF2 before. 2012/5/1 Oleg Proudnikov : > Hello, > > Is it possible to create an exact repli

Re: Bulkload into a different CF

2012-05-01 Thread Benoit Perroud
I would just try copying instead of moving first, and drop the old CF or the unneeded snapshot, if necessary, once everything is OK. 2012/5/1 Oleg Proudnikov : > Benoit Perroud noisette.ch> writes: > >> >> You can copy the sstables (renaming them accordingly) and >

SSTableWriter and Bulk Loading life cycle enhancement

2012-05-03 Thread Benoit Perroud
Hi All, I'm bulk loading (a lot of) data from Hadoop into Cassandra 1.0.x. The provided CFOutputFormat is not the best case here, I wanted to use the bulk loading feature. I know 1.1 comes with a BulkOutputFormat but I wanted to propose a simple enhancement to SSTableSimpleUnsortedWriter that coul

Re: sstableloader 1.1 won't stream

2012-05-07 Thread Benoit Perroud
You may want to upgrade all your nodes to 1.1. The streaming process connects to every living node of the cluster (you can explicitly disable some nodes), so all nodes need to speak 1.1. 2012/5/7 Pieter Callewaert : > Hi, > > > > I’m trying to upgrade our bulk load process in our testing env. >

Re: Zurich / Swiss / Alps meetup

2012-05-18 Thread Benoit Perroud
+1 ! 2012/5/17 Sasha Dolgy : > All, > > A year ago I made a simple query to see if there were any users based in and > around Zurich, Switzerland or the Alps region, interested in participating > in some form of Cassandra User Group / Meetup.  At the time, 1-2 replies > happened.  I didn't do mu

Usage Pattern : "unique" value of a key.

2011-01-12 Thread Benoit Perroud
Hi ML, I wonder if someone has already experimented with some kind of unique index on a column family key. Let's go for a short example : the key is the username. What happens if 2 users want to sign up at the same time with the same username ? So has someone already addressed this "pattern" in Cassandr

Re: Usage Pattern : "unique" value of a key.

2011-01-13 Thread Benoit Perroud
rite, then both nodes think the key belongs to them. So my idea of writing a lock is not well suited... Does anyone have another idea to share regarding this topic ? Thanks, Kind regards, Benoit. 2011/1/13 Oleg Anastasyev : > Benoit Perroud noisette.ch> writes: > >> >> My

Re: Quick Poll: Server names

2010-07-27 Thread Benoit Perroud
We use names of (European) cities for "logical" functionalities : - berlin01, berlin02, berlin03 are part of the mysql cluster, - zurich1 and zurich2 are AD, - roma01, roma02, and so on are the Cassandra cluster for the Roma project - and so on. We found this way a good tradeoff. Regards, Benoit. 2010/7/

Re: Nodes Timing Out

2010-03-28 Thread Benoit Perroud
Does ulimit -n return unlimited for you ? 2010/3/28 James Golick : > unlimited > > On Sat, Mar 27, 2010 at 12:09 PM, Chris Goffinet wrote: >> >> what's the ulimit set to? >> -Chris >> On Mar 27, 2010, at 10:29 AM, James Golick wrote: >> >> Hey, >> I put our first cluster in to production (writing but no

Re: 0.5.1 exception: java.io.IOException: Reached an EOL or something bizzare occured

2010-03-28 Thread Benoit Perroud
I got the same error when the nodes are using a lot of I/O, e.g. during compaction. 2010/3/28 Eric Yu : > I have not restart my nodes. > OK, may be I should give 0.6 a try. > > On Sun, Mar 28, 2010 at 9:53 AM, Jonathan Ellis wrote: >> >> It means that a MessagingService socket closed unexpectedly.  

Re: get_range_slice leads to java.lang.OutOfMemoryError?

2010-04-02 Thread Benoit Perroud
A way to read all the db without having an OOM is to limit the number of rows to be returned, and to iterate over the query, the starting key being the last returned key. Note that doing it that way, the first key of the next iteration is the same as the last key of the previous iteration. The warning
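The iteration can be sketched as a Python generator over an in-memory key list. `get_range` is a stand-in written for this example, not the real range call; it assumes the keys come back in a stable order and that the start key is inclusive, which is why the first key of every page after the first one is skipped.

```python
import bisect


def get_range(ordered_keys, start_key, limit):
    """Stand-in for a range query: up to `limit` keys starting at `start_key`."""
    idx = 0 if start_key is None else bisect.bisect_left(ordered_keys, start_key)
    return ordered_keys[idx:idx + limit]


def iterate_all_keys(ordered_keys, page_size=3):
    """Yield every key exactly once, fetching at most `page_size` keys per query."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2")
    start = None
    while True:
        page = get_range(ordered_keys, start, page_size)
        # The start key is the last key of the previous page: drop the duplicate.
        fresh = page if start is None else page[1:]
        if not fresh:
            return
        yield from fresh
        start = fresh[-1]


if __name__ == "__main__":
    keys = ["k%02d" % i for i in range(10)]
    print(list(iterate_all_keys(keys, page_size=4)))
```

A page size below 2 would never make progress, since every page would consist only of the repeated start key.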

Re: Heap sudden jump during import

2010-04-03 Thread Benoit Perroud
There are tools other than jhat to browse a heap dump, which stream the heap dump instead of loading it fully in memory like jhat does. Kind regards, Benoit. 2010/4/3 Weijun Li : > I'm running a test to write 30 million columns (700bytes each) to Cassandra: > the process ran smoothly for about 20mi

Re: Heap sudden jump during import

2010-04-03 Thread Benoit Perroud
: > Thank you Benoit. I did a search but couldn't find any that you mentioned. > Both jhat and netbean load entire map file int memory. Do you know the name > of the tools that requires less memory to view map file? > Thanks, > -Weijun > > On Sat, Apr 3, 2010 at 12:55 AM, Benoi

Re: multinode cluster wiki page

2010-04-03 Thread Benoit Perroud
Hi, Nice work. I guess there is just a small mistake: the second 192.168.1.1 should be 192.168.2.34. And I would suggest adding a small part on making the thrift interface listen on more than localhost. Kind regards, Benoit. 2010/4/3 Benjamin Black : > Just added this to the wiki as it seemed a ver

Re: 0.5.1 exception: java.io.IOException: Reached an EOL or something bizzare occured

2010-04-03 Thread Benoit Perroud
s, Benoit. 2010/4/3 Anty : > Does anyone have solve the problem?I encounter the same error too. > > On Mon, Mar 29, 2010 at 12:12 AM, Benoit Perroud wrote: >> >> I got the same error when the nodes are using lot of I/O, i.e during >> compaction. >> >> 201

Re: How many KeySpace will you use in a single application?

2010-04-10 Thread Benoit Perroud
One point in using several keyspaces is that the replication factor is per keyspace. If a part of your application generates a lot of data which can be lost (some non-critical logs?), then a dedicated keyspace with a smaller replication factor can be a good thing. Kind regards, Benoit.

Re: ORM in Cassandra?

2010-04-23 Thread Benoit Perroud
I understand the question more like : Is there already a lib which helps to get rid of writing hardcoded and hard to maintain lines like : MyClass data; String[] myFields = {"name", "label", ...} List columns; for (String field : myFields) { if ("name".equals(field)) { columns.add(new Column(
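What such a library might do instead of the hardcoded branches can be sketched with introspection. The example below uses Python dataclasses; only the class name `MyClass` and the field names `name` and `label` come from the snippet above, while `to_columns` and `from_columns` are hypothetical helpers and columns are reduced to `(name, value)` pairs.

```python
from dataclasses import dataclass, fields


@dataclass
class MyClass:
    name: str
    label: str


def to_columns(obj):
    """Turn every declared field into a (column name, value) pair."""
    return [(f.name, getattr(obj, f.name)) for f in fields(obj)]


def from_columns(cls, columns):
    """Rebuild an object from columns, ignoring names the class does not declare."""
    known = {f.name for f in fields(cls)}
    return cls(**{name: value for name, value in columns if name in known})


if __name__ == "__main__":
    data = MyClass(name="zurich", label="city")
    columns = to_columns(data)
    print(columns)
    print(from_columns(MyClass, columns))
```

Adding a field to the class is then enough; no per-field branch has to be written or kept in sync.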

Re: Does anybody work about transaction on cassandra ?

2010-04-24 Thread Benoit Perroud
"orthogonal" means "go to the opposite direction, but without going back". Including "transaction" in Cassandra needs to turn 90 degrees the design of Cassandra. Kind regards, Benoit. 2010/4/24 dir dir : >>Transactions are orthogonal to the design of Cassandra > > Sorry, Would you want to tell

Re: Does anybody work about transaction on cassandra ?

2010-04-24 Thread Benoit Perroud
>>the design of Cassandra > > I do not understand what is the meaning of "needs to turn 90 degrees"?? > > Thank you. > > On Sun, Apr 25, 2010 at 12:30 AM, Benoit Perroud wrote: >> >> "orthogonal" means "go to the opposite direct

Re: Does anybody work about transaction on cassandra ?

2010-04-24 Thread Benoit Perroud
OK, in this particular context it means no dependencies. Thanks for the clarification. Kind regards, Benoit. 2010/4/24 Jonathan Ellis : > On Sat, Apr 24, 2010 at 12:44 PM, Benoit Perroud wrote: >> "orthogonal" means "90 degrees".  Two lines are orthogonal if the >