Re: Modeling troubles

2011-07-21 Thread sridhar basam
On Thu, Jul 21, 2011 at 9:18 AM, Stephen Pope wrote:

>  For a side project I’m working on I want to store the entire set of
> possible Reversi boards. There are an estimated 10^28 possible boards. Each
> board (from the best way I could think of to implement it) is made up of two
> 64-bit numbers (black pieces, white pieces…pieces in neither of those are
> empty spaces) and a bit to indicate whose turn it is. I’ve thought of a few
> possible ways to do it:
>
> - Entire board as row key, in an array of bytes. I’m not sure
> how well Cassandra can handle 10^28 rows. I could also break this up into
> separate cfs for each depth of move (initially there are 4 pieces on the
> board in total. I could make a cf for 5 pieces, 6, etc. to 64). I’m not sure
> if there’s any advantage to doing that.
>
> - 64-bit number for the black pieces as row key, with 65-bit
> column names (white pieces + turn). I’ve read somewhere that there’s a rough
> limit of 2 billion columns, so this will be problematic for certain. This
> can also be broken into separate cfs, but I’m still going to hit the column
> limit.
>
> Is there a better way to achieve what I’m trying to do, or will either of
> these approaches surprise me and work properly?
>

Short answer, it is just not possible to store or even compute the kind of
information you want to. You can do the math on how many years/centuries it
would take to compute that many combinations, not to mention what it would
take to store well over 1000 YB.

 sridhar
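
A back-of-envelope sketch of "doing the math" in Python. The 10^28 board count and the 129-bit layout come from the question; the generation rate of one billion boards per second is purely an assumption for illustration, and the storage figure counts raw board bytes only (no replication, indexes, or per-row overhead).

```python
import math

BOARDS = 10**28                    # estimate quoted in the question
BITS_PER_BOARD = 64 + 64 + 1       # black bitboard + white bitboard + turn bit
BYTES_PER_BOARD = math.ceil(BITS_PER_BOARD / 8)   # 129 bits round up to 17 bytes

YOTTABYTE = 10**24                 # decimal yottabyte, in bytes
SECONDS_PER_YEAR = 365 * 24 * 3600

# Assumption (not from the thread): one billion boards generated per second.
ASSUMED_BOARDS_PER_SECOND = 10**9


def storage_yottabytes(boards=BOARDS, bytes_per_board=BYTES_PER_BOARD):
    """Raw bytes needed for every board, expressed in yottabytes."""
    return boards * bytes_per_board / YOTTABYTE


def enumeration_years(boards=BOARDS, rate=ASSUMED_BOARDS_PER_SECOND):
    """Years needed just to generate every board at the assumed rate."""
    return boards / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"storage: {storage_yottabytes():,.0f} YB")
    print(f"compute: {enumeration_years():.2e} years")
```

Under these assumptions the raw data alone comes to roughly 170,000 YB and the enumeration to hundreds of billions of years, so splitting the data across column families does not change the conclusion.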


Re: split large sstable

2011-11-21 Thread sridhar basam
On Mon, Nov 21, 2011 at 10:34 AM, Edward Capriolo wrote:

>
>
> On Mon, Nov 21, 2011 at 10:07 AM, Dan Hendry wrote:
>
>> Pretty sure your argument about indirect blocks making large files
>> inefficient only pertains to ext2/3 and not ext4. It seems ext4 replaces
>> the 'indirect block' approach with extents
>> (http://kernelnewbies.org/Ext4#head-7c5fd53118e8b888345b95cc11756346be4268f4,
>> http://en.wikipedia.org/wiki/Ext4#Features).
>>
>> I was not aware of this difference in the file systems and it seems to be
>> a compelling reason ext4 should be chosen (over ext3) for Cassandra - at
>> least when using size tiered compaction.
>>
>>
If you are using a Red Hat distribution, at least in the 5.x series, make
sure that you pass in the '-O extent' option when you create the filesystem.
Extents are not enabled by default there.


> IMHO there is only one good reason left to use ext3: a 100MB /boot
> partition, since the boot loaders have an easier time with it.
>
> EXT4 is better than EXT3 in every way. It is the default formatting for
> RHEL. Do not fight the future.
>
>
> http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/a_great_reason_to_use
>
>
I agree that ext4 is superior to ext3, but here is some constructive feedback
about your graphs.

You might want to add a legend or point out the before and after if you
want to show the difference between ext3 and ext4. I can kind of see that
something might have changed on the Friday, but without a legend it is
hard to see the point you are trying to make.

 Sridhar


Re: Disable Nagle algoritm in thrift i.e. TCP_NODELAY

2012-01-26 Thread sridhar basam
There is no global setting in linux to turn off nagle.

 Sridhar
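
Since Nagle has to be turned off per socket, the application (or its client library) has to make the setsockopt call quoted from tcp(7) below. A minimal sketch in Python, assuming you control the code that creates the socket; the helper names are made up for illustration and are not part of Cassandra or Thrift.

```python
import socket


def enable_nodelay(sock):
    """Disable the Nagle algorithm on one TCP socket (hypothetical helper)."""
    # IPPROTO_TCP is the portable spelling of the SOL_TCP level named in tcp(7).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nodelay_enabled(sock):
    """Read the option back with getsockopt to confirm it took effect."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("before:", nodelay_enabled(s))
    enable_nodelay(s)
    print("after: ", nodelay_enabled(s))
    s.close()
```

The option can be set before or after connect(); it applies only to that one socket, which is why there is no system-wide switch to look for.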


2012/1/26 Jeffrey Kesselman :
> You know... there ought to be a command line command to set it.  There is in
> Solaris and Windows.  But I'm having trouble finding it for Linux.
>
>
> 2012/1/26 ruslan usifov 
>>
>> Sorry, but you misunderstood me. I am asking whether Cassandra has any
>> option to control TCP_NODELAY behaviour, so that we don't need to patch
>> the Cassandra or Thrift code.
>>
>> I found this article
>> https://wiki.cs.columbia.edu:8443/pages/viewpage.action?pageId=12585536,
>> which mentions coreTransport.TcpClient.NoDelay, but I don't understand
>> what this is.
>>
>>
>>
>> 2012/1/26 Jeffrey Kesselman 
>>>
>>> "
>>> To set or get a TCP socket option, call getsockopt(2) to read
>>> or setsockopt(2) to write the option with the option level argument set
>>> to SOL_TCP. In addition, most SOL_IP socket options are valid on TCP
>>> sockets. For more information see ip(7).
>>> ...
>>> TCP_NODELAY If set, disable the Nagle algorithm. This means that segments
>>> are always sent as soon as possible, even if there is only a small amount of
>>> data. When not set, data is buffered until there is a sufficient amount to
>>> send out, thereby avoiding the frequent sending of small packets, which
>>> results in poor utilization of the network. This option cannot be used at
>>> the same time as the option TCP_CORK." http://bit.ly/zpvLbP
>>>
>>>
>>> On Thu, Jan 26, 2012 at 12:10 PM, ruslan usifov wrote:
>>>>
>>>> 2012/1/26 Jeffrey Kesselman
>>>>>
>>>>> Most operating systems have a way to do this at the OS level.
>>>>
>>>> Could you please provide this way for Linux, for a particular
>>>> application? Maybe some sysctl?
>>>>
>>>>> On Thu, Jan 26, 2012 at 8:17 AM, ruslan usifov wrote:
>>>>>>
>>>>>> Hello
>>>>>>
>>>>>> Is it possible to set TCP_NODELAY on the thrift socket in Cassandra?
>>>>>
>>>>> --
>>>>> It's always darkest just before you are eaten by a grue.


Re: Disable Nagle algoritm in thrift i.e. TCP_NODELAY

2012-01-26 Thread sridhar basam
Which socket API?

http://www.php.net/manual/en/function.socket-set-option.php

It is possible to make the appropriate setsockopt call to disable Nagle.

 Sridhar

2012/1/26 ruslan usifov :
>
>
> On 27 January 2012 at 1:19, aaron morton wrote:
>>
>> Outgoing TCP connections between nodes have TCP_NODELAY on, so do server
>> side THRIFT sockets.
>>
> Thanks for the exhaustive answer.
>
>
>>
>> I would assume your client will be setting it as well.
>>
>
> No, the PHP client doesn't set TCP_NODELAY, because PHP stream sockets don't
> allow setting socket options, i.e. there is no such API.
>
>>
>> Cheers
>>
>>
>> -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 27/01/2012, at 6:54 AM, sridhar basam wrote:
>>
>> There is no global setting in linux to turn off nagle.
>>
>> Sridhar
>>
>


Re: reduced cached mem; resident set size growth

2011-01-28 Thread sridhar basam
On Thu, Jan 27, 2011 at 12:23 PM, Chris Burroughs  wrote:

> java -version
> java version "1.6.0_20"
> Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)
>
> cmd line arg (paths edited):
> /usr/java/jdk1.6.0_20/bin/java -Xms1500M -cXmx1500M -ea -XX:+UseParNewGC
>

Is the above option for setting the max heap size a typo: "-cXmx1500M"?

 sridhar


Re: reduced cached mem; resident set size growth

2011-01-28 Thread sridhar basam
What about your permgen usage? Do you track that? Use something like "jstat
-gc -t <pid> 5s 100" to track it. Or turn up verbose GC in your command
line options to see what is happening.

 Sridhar
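
If you capture the jstat output to a file, a few lines of Python can turn it into a permgen utilisation series. This is only a sketch: it assumes the Java 6 "jstat -gc -t" layout, in which PC and PU are the permgen capacity and usage in KB, and the sample text is made up for illustration rather than taken from the reporter's system.

```python
def parse_jstat_gc(text):
    """Turn `jstat -gc -t` output (one header line plus samples) into dicts."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, (float(v) for v in row))) for row in rows]


def permgen_fraction(sample):
    """Fraction of permgen capacity (PC, in KB) currently in use (PU, in KB)."""
    return sample["PU"] / sample["PC"]


# Made-up sample in the assumed Java 6 column layout, for illustration only.
SAMPLE = """\
Timestamp S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
12.5 1024.0 1024.0 0.0 512.0 8192.0 4096.0 20480.0 10240.0 21248.0 20000.0 3 0.050 1 0.100 0.150
"""

if __name__ == "__main__":
    for s in parse_jstat_gc(SAMPLE):
        print(f"t={s['Timestamp']:.1f}s permgen {permgen_fraction(s):.1%} used, "
              f"{int(s['FGC'])} full GCs so far")
```

A permgen fraction that creeps towards 1.0 while the heap columns stay flat would point at class metadata rather than cached rows.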

On Fri, Jan 28, 2011 at 11:38 AM, Chris Burroughs  wrote:

> On 01/28/2011 11:29 AM, Jake Luciani wrote:
> > Are you using a row cache? If so, what is it set to? In general it should
> > not be a percentage.
> >
>
>
> KeysCached="0" KeyCacheSavePeriodInSeconds="0"
>RowsCached="40" RowCacheSavePeriodInSeconds="1800"
>/>
>
> row_cache_size == row_cache_capacity before the start of RSS data
> collection.  According to jconsole heap size is not growing larger than
> the maximum set at the command line.
>


Re: Tracking down read latency

2011-02-03 Thread sridhar basam
The data provided is also an average value since boot time. Run it with -x as
suggested below, but at an interval of around 5 seconds. You could very well
have an I/O issue; it is hard to tell from the overall average value
you provided. Collect "iostat -x 5" during the times when you see slow reads
and see how busy the disks are.


 Sridhar

On Thu, Feb 3, 2011 at 3:21 AM, Peter Schuller
wrote:

> > $ iostat
>
> As rcoli already mentioned you don't seem to have an I/O problem, but
> as a point of general recommendation: When determining whether you are
> blocking on disk I/O, pretty much *always* use "iostat -x" rather than
> the much less useful default mode of iostat. The %util and queue
> wait/average time columns are massively useful/important; without them
> one is much more blind as to whether or not storage devices are
> actually saturated.
>
> --
> / Peter Schuller
>


Re: Using Cassandra to store files

2011-02-04 Thread sridhar basam
For the number of files the OP has, why not just use a traditional filesystem
and Solr to index the PDF data? You would then get to search inside the files
for relevant information.

 Sri

On Fri, Feb 4, 2011 at 12:47 PM, buddhasystem  wrote:

>
> Even when storage is in NFS, Cassandra can still be quite useful as a file
> catalog. Your physical storage can change, move etc. Therefore, it's a good
> idea to provide mapping of logical names to physical store points (which in
> fact can be many). This is a standard technique used in mass storage.
>
> --
> View this message in context:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Using-Cassandra-to-store-files-tp5988698p5993357.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at
> Nabble.com.
>


Re: Tracking down read latency

2011-02-04 Thread sridhar basam
On Fri, Feb 4, 2011 at 2:44 PM, David Dabbs  wrote:

>
> Our data is on sdb, commit logs on sdc.
> So do I read this correctly that we're 'await'ing 6+millis on average for
> data drive (sdb)
> requests to be serviced?
>
>
That is right. Those numbers look pretty good for rotational media. What
sort of read latencies do you see? Have you also looked into GC?

 Sridhar


Re: OOM during batch_mutate

2011-02-07 Thread sridhar basam
Looks like you don't have a big enough working set from your GC logs; there
doesn't seem to be a lot being reclaimed in the GC process. The process is
reclaiming a few hundred MB and is running every few seconds. How big are
your caches? The probable reason that it works the first couple of times when
you create it is that nothing is in the cache yet, as it is still being built
up.

 Sridhar



On Mon, Feb 7, 2011 at 8:31 AM, Patrik Modesto wrote:

> Just tried current 0.7.1 from cassandra-0.7 branch and it does the
> same. OOM after three runs.
>
> -Xm* setting is computed by cassandra-env.sh like this:  -Xms8022M
> -Xmx8022M -Xmn2005M
>
> What am I doing wrong?
>
> Thanks,
> Patrik
>
> On Mon, Feb 7, 2011 at 14:18, Patrik Modesto 
> wrote:
> > I forgot to mention I use 0.7.0 stable version.
> >
> > HTH,
> > Patrik
> >
>


Re: mutator.execute() timings - big variance noted - pointers needed on understanding/improving it

2011-03-10 Thread sridhar basam
Sounds like GC from your description of fast->slow->fast. Collect GC times
from both the client and server side and plot against your application
timing.

 If you uncomment the verbose GC entries in the cassandra-env.sh file you
should get timing for the server side, pass in the same arguments for your
client. Align time across the 3 files and plot to see if GC is the cause.

 Sridhar
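
Once the GC pauses and the application timings are on a common clock, the comparison can also be done without plotting. A minimal sketch in Python, assuming you have already extracted (start, duration) pairs from the GC logs and (start, latency) pairs from your client timings, all in seconds; the function names and the sample numbers are made up for illustration.

```python
def overlaps_gc(request_start, request_latency, gc_pauses):
    """True if the request's time window intersects any GC pause window."""
    request_end = request_start + request_latency
    return any(
        gc_start < request_end and request_start < gc_start + gc_duration
        for gc_start, gc_duration in gc_pauses
    )


def slow_requests_vs_gc(requests, gc_pauses, slow_threshold):
    """Return (all slow requests, the slow requests that overlap a GC pause)."""
    slow = [(start, lat) for start, lat in requests if lat >= slow_threshold]
    during_gc = [r for r in slow if overlaps_gc(r[0], r[1], gc_pauses)]
    return slow, during_gc


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the thread.
    requests = [(0.0, 0.03), (10.0, 1.2), (20.0, 0.9)]
    gc_pauses = [(10.5, 0.8)]
    slow, during_gc = slow_requests_vs_gc(requests, gc_pauses, slow_threshold=0.5)
    print(f"{len(during_gc)} of {len(slow)} slow calls overlap a GC pause")
```

If most of the slow calls overlap a pause on either the client or the server, GC is the likely cause; if few do, look elsewhere.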



On Thu, Mar 10, 2011 at 9:30 AM, Roshan Dawrani wrote:

> Hi,
>
> I am in the middle of some load testing on a 1-node Cassandra setup. We are
> not on very high loads yet. We have recorded the timings taken up by
> mutator.execute() calls and we see this kind of variation during the test
> run:
>
> So, 25% of the time, execute() calls come back within about 25 milliseconds,
> but the longest calls take up to 4 seconds.
>
> Can someone please provide some pointers on what and where to focus on in
> my Hector / Cassandra setup? We are mostly on the default Cassandra
> configuration at this time - only change is the max connection pool size
> (CassandraHostConfigurator.maxActive) is changed to 300 from a default of
> 50.
>
> I would also like to add that the time increase is not linear - it starts
> fast, goes slow, very slow, and becomes faster again.
>
>
>   Percentile   Time (ms)
>      25%            29
>      50%           105
>      66%           185
>      70%           208
>      75%           240
>      80%           297
>      90%           510
>      95%           854
>      98%          1075
>      99%          1215
>     100%          4442
>
>
> --
> Roshan
> Blog: http://roshandawrani.wordpress.com/
> Twitter: @roshandawrani 
> Skype: roshandawrani
>
>


Re: Deleting "old" SSTables

2011-03-22 Thread sridhar basam
Force a GC to remove the unused sstables. Use something like jconsole or, from
the command line, "jmap -histo:live <pid>". You would run the jmap command as
the cassandra user or root. The jmap will give you a bunch of output on live
objects in the heap if you choose to look at it.

 Sridhar

On Tue, Mar 22, 2011 at 8:30 AM, Jonathan Colby wrote:

> According to the Wiki Page on compaction:  once compaction is finished, the
> old SSTable files may be deleted*
>
> * http://wiki.apache.org/cassandra/MemtableSSTable
>
> I thought the old SSTables would be deleted automatically, but this wiki
> page got me thinking otherwise.
>
> Question is,  if it is true that old SSTables must be manually deleted, how
> can one safely identify which SSTables can be deleted??
>
> Jon
>
>
>
>
>
>


Re: running compaction from a machine outside the cluster

2011-04-05 Thread sridhar basam
If you can reach your jmx ip/port, you can use any jmx client to start a
compaction. Use jconsole to connect to your jmx ip/port and then navigate to
mbeans->org.apache.cassandra.db->columnfamilies-> ->operations

Underneath there you can invoke a bunch of methods including compaction.

 Sridhar

On Tue, Apr 5, 2011 at 4:38 PM, Peter Schuller
wrote:

> > Is there a way I can run compaction on the cassandra cluster from a
> > machine where cassandra is not installed. I have a cluster of 6 machines
> > but I want to run compaction on them from a different machine which does
> > not have cassandra installed.
>
> "nodetool -h your-remote-host compact", assuming firewall allows.
> Unless "does not have cassandra installed" means that you can't use
> nodetool either?
>
> --
> / Peter Schuller
>


Re: cassandra as an email store ...

2011-04-29 Thread sridhar basam
Have you already looked at some research out of IBM about this use case?
The paper is at

http://ewh.ieee.org/r6/scv/computer/nfic/2009/IBM-Jun-Rao.pdf

 Sridhar


Re: using too much RAM

2010-10-14 Thread sridhar basam
Yes, on Linux at least, lsof would show you that: lsof -d mem -p <pid>. You
can also look at /proc/<pid>/maps, again Linux-centric.

 Sridhar
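
The /proc route is easy to script. A minimal sketch in Python, assuming the standard Linux maps layout (address, perms, offset, dev, inode, pathname); the function names are made up for illustration.

```python
def mapped_files(maps_text):
    """Return the sorted set of file paths found in /proc/<pid>/maps content."""
    paths = set()
    for line in maps_text.splitlines():
        # The pathname is the optional sixth field and may contain spaces.
        fields = line.split(None, 5)
        # Skip anonymous mappings and pseudo-entries such as [heap] or [stack].
        if len(fields) == 6 and fields[5].startswith("/"):
            paths.add(fields[5])
    return sorted(paths)


def mapped_files_for_pid(pid):
    """Read /proc/<pid>/maps for a live process (Linux only)."""
    with open(f"/proc/{pid}/maps") as maps:
        return mapped_files(maps.read())


if __name__ == "__main__":
    import os
    for path in mapped_files_for_pid(os.getpid()):
        print(path)
```

If data files from the Cassandra data directory show up in the list for the Cassandra pid, the process is mmap'ing them regardless of what the config says.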

On Thu, Oct 14, 2010 at 3:44 PM, B. Todd Burruss  wrote:

>  thx, it does say that in the log, but that is probably just a reflection
> of whatever is read from cassandra.yaml.
>
> i am wondering if some unix tool can tell me if my process is mmap'ing
> files.  maybe lsof?
>
>
> On 10/14/2010 12:07 PM, Rob Coli wrote:
>
>> On 10/14/10 10:59 AM, B. Todd Burruss wrote:
>>
>>> 0.7.0-beta2
>>>
>>> top is reporting my cassandra process as using 11g. i have set
>>> "disk_access_mode: standard" and Xmx8G (verified via JMX)
>>>
>>> i have only noticed using more RAM than Xmx when using mmap i/o. this
>>> leads me to believe that disk_access_mode was not set properly, even
>>> though it is in the config. is there a way to verify this via JMX? (or
>>> some other way)
>>>
>> There is a log message at startup which will tell you the DiskAccessMode
>> and IndexAccessMode in use. It looks like..
>>
>> INFO 16:46:06,875 DiskAccessMode 'auto' determined to be mmap,
>> indexAccessMode is mmap
>>
>> =Rob
>>
>>


-- 
Sridhar