A memory-mapped file is "just" memory, so it's accessed using a
ByteBuffer pointing to off-heap memory. It works the same as if you had
mapped in some anonymous memory.
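For anyone following along, a minimal sketch (file name and length are made up)
of what that looks like in plain Java: FileChannel.map() hands back a
MappedByteBuffer backed by the page cache, and a get() on a page that isn't
resident turns into a page fault instead of an explicit read call.

import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class MmapRead {
    public static void main(String[] args) throws Exception {
        try (FileChannel ch = FileChannel.open(Paths.get("/tmp/data.db"),
                                               StandardOpenOption.READ)) {
            // Map the first 4 KiB (or the whole file if it is smaller).
            MappedByteBuffer buf =
                ch.map(FileChannel.MapMode.READ_ONLY, 0, Math.min(4096, ch.size()));
            // If the page isn't in the page cache yet, this get() blocks in a
            // page fault while the thread is still "in Java" as far as the JVM
            // is concerned - which is what the safepoint discussion is about.
            System.out.println("first byte = " + buf.get(0));
        }
    }
}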
It's a lot of cache and TLB misses without prefetching though.

There is a system call to page the memory in which might be better for
larger reads. Still no guarantee things stay cached though.

Ariel
Not sure what you mean here. Aren't there going to be cache and TLB misses for
any I/O, whether via mmap or syscall?
> There is a system call to page the memory in which might be better for
> larger reads. Still no guarantee things stay cached though.
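For reference, a hedged sketch of what "a system call to page the memory in"
could look like from Java. This is not Cassandra code: it assumes JNA 5.x, a
64-bit Linux box (so size_t fits in a Java long and POSIX_MADV_WILLNEED is 3),
and that you already have the page-aligned address of the mapped region (for
example via the JVM-internal sun.nio.ch.DirectBuffer.address()).

import com.sun.jna.Library;
import com.sun.jna.Native;
import com.sun.jna.Pointer;

public class PageIn {
    // Linux value of POSIX_MADV_WILLNEED (see <bits/mman-linux.h>).
    private static final int POSIX_MADV_WILLNEED = 3;

    public interface CLib extends Library {
        CLib INSTANCE = Native.load("c", CLib.class);
        // int posix_madvise(void *addr, size_t len, int advice)
        int posix_madvise(Pointer addr, long len, int advice);
    }

    // Ask the kernel to start reading the mapped range into the page cache
    // before we actually touch it, so the later reads (hopefully) don't fault.
    public static void willNeed(long addr, long len) {
        int rc = CLib.INSTANCE.posix_madvise(new Pointer(addr), len, POSIX_MADV_WILLNEED);
        if (rc != 0) {
            System.err.println("posix_madvise(WILLNEED) returned " + rc);
        }
    }
}

As the quote says, this is only a hint: the kernel may start readahead, but
nothing guarantees the pages are still resident by the time you read them.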
From: Benedict Elliott Smith
Reply-To: "user@cassandra.apache.org"
Date: Sunday, October 9, 2016 at 2:39 AM
To: "user@cassandra.apache.org"
Subject: Re: JVM safepoints, mmap, an[...]

Potentially relevant reading:
https://issues.apache.org/jira/browse/CASSANDRA-10249
On Sat, Oct 8, 2016, at 08:21 PM, Graham Sanderson wrote:

I haven’t studied the read path that carefully, but there might be a spot at
the C* level rather than JVM level where you could effectively do a JNI touch
of the mmap region you’re going to need next.
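A sketch of the "touch the region you're going to need next" idea in plain Java
(not Cassandra code; 4 KiB pages are assumed). Reading one byte per page forces
the page faults onto whatever thread runs the touch instead of onto the read
path. A pure-Java toucher still blocks safepoints while it faults, which is why
the suggestion is to do the touch from JNI/native code; for whole buffers,
MappedByteBuffer.load() performs a similar touch.

import java.nio.MappedByteBuffer;

public final class Toucher {
    private static final int PAGE_SIZE = 4096;  // assumption: 4 KiB pages

    public static int touch(MappedByteBuffer region) {
        int sink = 0;
        for (int pos = 0; pos < region.limit(); pos += PAGE_SIZE) {
            sink += region.get(pos);   // may page-fault; that's the point
        }
        return sink;                   // return the sum so the JIT can't drop the loop
    }
}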
On Oct 8, 2016, at 7:17 PM, Graham Sanderson wrote:
We don’t use Azul’s Zing, but it does have the nice feature that all threads
don’t have to reach safepoints at the same time. That said we make heavy use of
Cassandra (with off heap memtables - not directly related but allows us a lot
more GC headroom) and SOLR where we switched to mmap because [...]
2) have fewer safepoints

Two of the biggest sources of safepoints are garbage collection and revocation
of biased locks. Evidence points toward biased locking being unhelpful for
Cassandra's purposes, so turning it off (-XX:-UseBiasedLocking) is a [...]
Sacrificing page cache would increase page fault frequency, which is another
thing we're trying to avoid! I don't view this as a serious option.

>>>> In this way, JVM safepoints become a powerful weapon for transmuting a
>>>> single thread's slow I/O into the entire JVM's lockup.
>>>>
>>>> Does all of the above sound correct?
3) use a different IO strategy

Looking at the Cassandra source code, there appears to be an un(der)documented
configuration parameter called disk_access_mode. It appears that changing this
to 'standard' would switch to using pread() and pwrite() for I/O, instead of
mmap. I imagine there would be a throughput penalty here for the case when
pages are in the disk cache.

Is this a serious option? It seems far too [...]
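For contrast with the mmap path, a positional read in Java goes through
FileChannel.read(dst, position), which maps to pread() on Linux. The wait for
the disk then happens inside a native syscall, and a thread blocked in native
code counts as already being at a safepoint rather than holding everyone else
up. A minimal sketch (file name made up), not a statement about what
disk_access_mode: standard actually does internally:

import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class PositionalRead {
    public static void main(String[] args) throws Exception {
        try (FileChannel ch = FileChannel.open(Paths.get("/tmp/data.db"),
                                               StandardOpenOption.READ)) {
            ByteBuffer dst = ByteBuffer.allocateDirect(4096);
            // pread()-style read at an absolute offset; the channel's position
            // is not touched, so concurrent readers don't need to coordinate.
            int n = ch.read(dst, 0L);
            System.out.println("read " + n + " bytes");
        }
    }
}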
This issue has to be looked at from a micro and a macro level. On the micro
level the "best" way is workload specific. On the macro level this mostly boils
down to data and memory size.

Compactions are going to churn cache, this is unavoidable. Imho solid state
makes the micro optimization meaningless in th[...]
On Thu, Dec 6, 2012 at 7:36 PM, aaron morton wrote:
> So for memory mapped files, compaction can do a madvise SEQUENTIAL instead
> of current DONTNEED flag after detecting appropriate OS versions. Will this
> help?
>
>
> AFAIK Compaction does use memory mapped file access.
The history :
https://
[...] Technically it uses posix_fadvise if you want to look it up.
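To make the madvise/fadvise discussion above concrete, here is a hedged JNA
sketch of calling posix_fadvise from Java. This is not Cassandra's actual
native-access code; the constants are the Linux values, a 64-bit platform is
assumed (off_t as a Java long), and you still need the integer file descriptor
from somewhere (typically reflection on java.io.FileDescriptor).

import com.sun.jna.Library;
import com.sun.jna.Native;

public class Fadvise {
    public static final int POSIX_FADV_SEQUENTIAL = 2; // expect sequential reads
    public static final int POSIX_FADV_WILLNEED   = 3; // start paging data in
    public static final int POSIX_FADV_DONTNEED   = 4; // drop from the page cache

    public interface CLib extends Library {
        CLib INSTANCE = Native.load("c", CLib.class);
        // int posix_fadvise(int fd, off_t offset, off_t len, int advice)
        int posix_fadvise(int fd, long offset, long len, int advice);
    }

    // Advise the kernel about the whole file behind fd (len 0 means "to EOF").
    public static int advise(int fd, int advice) {
        return CLib.INSTANCE.posix_fadvise(fd, 0L, 0L, advice);
    }
}

The question above is essentially whether a compaction-style sequential scan
is better served by SEQUENTIAL (more aggressive readahead) than by DONTNEED
(evict what was just read) on the files involved.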
Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 5/12/2012, at 11:04 PM, Ravikumar Govindarajan
<ravikumar.govindara...@gmail.com> wrote:
Thanks Aaron,

I am not quite clear on how MMap loads SSTables, other than the fact that it
kicks in only during a first-time access.

Is it going to load only relevant pages per SSTable on read, or is it going
to load an entire SSTable on first access?

Say suppose compaction kicks in. Will it then [...]
> Will MMapping data files be detrimental for reads, in this case?
No.
> In general, when should we opt for MMap data files and what are the factors
> that need special attention when enabling the same?
mmapping is the default, so I would say use it until you have a reason not to.
This FAQ entry and the linked document provide a pretty good explanation:
http://wiki.apache.org/cassandra/FAQ#mmap
By the way, you should almost always turn off swap.
On Thu, Nov 17, 2011 at 1:16 AM, Jaesung Lee wrote:
> I am running a 7 node cassandra (v1.0.2) cluster.
> I am putting 20[...]
>
> using mmap:                  VIRT: 566g   RES: 36g  SHR: 12g
> standard disk access mode:   VIRT: 24.7g  RES: 24g  SHR: 68m
>
> I allocated 24g memory for the JVM heap.
>
> I have some questions about mmap.
> It is easy to analyze standard disk access mode's memory result.
> I know cassandra uses huge virtual m[...]
[...] it works fine?

2 - I read a lot about mmap without understanding clearly if I should keep
it "auto" or "mmap_index_only". Anyway, I didn't find the
"disk_access_mode" option in my cassandra.yaml, which led me to think that it
was removed and I have no other choi[...]
disk_access_mode: standard

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
Bye,
Norman

2011/9/30 Yi Yang :
> It is meaningless to release such memory. The counting includes the data
> you reached in the SSTable. That data lives on your hard drive, so it is
> not RAM space you have actually used.
>
> -Y.
>
> --Original Message--
> From: Yang
> To: user@cassandra.apache.org
> ReplyTo: user@cassandra.apache.org
> Subject: release mmap memory through jconsole?
> Sent: Oct 1, 2011 12:40 AM
>
> I gave an -Xmx50G to my Cassandra java process, now "top" shows its
> virtual memory address space is 82G, is there
> a way to release that memory through JMX ?
>
> Thanks
> Yang
Is it? Heard that twitter uses 60G, if I have remembered correctly.
--Original Message--
From: Norman Maurer
To: user@cassandra.apache.org
To: i...@iyyang.com
Subject: Re: release mmap memory through jconsole?
Sent: Oct 1, 2011 12:55 AM
I would also not use such a big heap. I think most [...]
On 20/09/2011, at 6:55 AM, Jonathan Ellis wrote:

You should start with scrub.

On Mon, Sep 19, 2011 at 1:04 PM, Eric Czech wrote:
> I'm getting a lot of errors that look something like "java.io.IOError:
> java.io.IOException: mmap segment underflow; remaining is 348268797
> but 892417075 requested" on one node in a 10 node cluster. I'm
> currently running version 0.8.4 but this is data that was carried ov[...]
[...] tips from developers on the cassandra list in this regard
would also be much appreciated. Specifically, I'd like to dissect the sstable
and figure out what the key is to the bad row and what is wrong with the
columns/supercolumns in that row.

The only issue I've found WRT mmap [...]
Try the patch at https://issues.apache.org/jira/browse/CASSANDRA-2417

On Mon, Apr 4, 2011 at 11:18 AM, Or Yanay wrote:
> user@cassandra.apache.org
> Subject: mmap segment underflow
>
> Hi All,
>
> I have upgraded from 0.7.0 to 0.7.4, and while running scrub I get the
> following exception quite a lot:
>
> java.lang.AssertionError: mmap segment underflow; remaining is 73936639 but
> 1970430821 requested
>         at org.apache.cassandra.io.util.MappedFileDataInput.readBytes
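For readers hitting the same assertion: an illustration (not the actual
org.apache.cassandra.io.util.MappedFileDataInput source) of the kind of check
that produces this message. It fires when a read asks for more bytes than
remain in the current mmap segment, which is why the advice in these threads
is to scrub the data or apply the linked patch rather than to tune mmap.

import java.nio.MappedByteBuffer;

public class SegmentReadSketch {
    // Read 'length' bytes from the current position of a mapped segment,
    // asserting first that the segment actually has that many bytes left.
    static byte[] readBytes(MappedByteBuffer segment, int length) {
        assert segment.remaining() >= length
            : "mmap segment underflow; remaining is " + segment.remaining()
              + " but " + length + " requested";
        byte[] out = new byte[length];
        segment.get(out);
        return out;
    }
}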
On Tue, Mar 22, 2011 at 3:44 PM, ruslan usifov wrote:
>
> 2011/3/22 Adi
>> 1) Use JNA (done but)
>> Are these steps also required:
>> - Start Cassandra with CAP_IPC_LOCK (or as "root"). (not done)
>
> And what is CAP_IPC_LOCK?
I have been going through the mailing list and compiling suggestions to
address the swapping due to mmap issue.

1) Use JNA (done but)
Are these steps also required:
- Start Cassandra with CAP_IPC_LOCK (or as "root"). (not done)
  grep Unevictable /proc/meminfo
- set /proc/sys/vm/swa[...]
mcasandra <... at gmail.com> writes:
>
> Thanks! I think it still is a good idea to enable HugePages and use the
> UseLargePageSize option in the JVM. What do you think?

I experimented with it. It was about a 10% performance improvement. But this
was at a 100% row cache hit rate. On smaller cache hit ratios the perfor[...]
On Mon, Mar 14, 2011 at 3:01 PM, mcasandra wrote:
>
> Jonathan Ellis-3 wrote:
>>
>> Wrong. The recommendation is to leave it on auto.
>>
> this is where I see mmap recommended for index.
> http://wiki.apache.org/cassandra/StorageConfiguration

FTFY.
HugePages has nothing to do with disk access mode.

--
Jonathan Ellis
Project Chair, Apache Cassandra
Currently, in cassandra.yaml disk_access_mode is set to "auto" but the
recommendation seems to be to use 'mmap_index_only'.

If we use HugePages then do we still need to worry about setting
disk_access_mode to mmap? I am planning to enable HugePages and use the
-XX:+UseLargePages option in JVM. I had a very good experience using
HugePages with Oracle.
Hello Peter,

So, more information on that problem:

Yes, I am using this node with very little data; it is used to design requests
so I don't need a very large dataset.

I am running Apache Cassandra 0.6.6 on Debian Stable, with java version
"1.6.0_22".

I recently restarted cassandra, thus I have thi[...]

> vic...@:~$ sudo ps aux | grep "cassandra"
> cassandra 11034 0.2 22.9 1107772 462764 ? Sl Dec17 6:13
> /usr/bin/java -ea -Xms128M -Xmx512M -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOc[...]
>> I'll paste it to the open-jdk mailing list to seek help.
>>
>> Zhu,
>>> Couple of quick questions:
>>> How many threads are in your JVM?
There are hundreds of threads. Here are the settings of Cassandra:
1) *8
128*

The thread stack size on this server is 1MB, so I observe hundreds of single
1MB mmap segments.

> Can you also post the full commandline as well?

Sure. All of them are default settings.

/usr/bi[...]
> I have a test node with apache-cassandra-0.6.8 on ubuntu 10.4. The
> hardware environment is an OpenVZ container. JVM settings:
>
> # java -Xmx128m -version
> java version "1.6.0_18"
> OpenJDK Runtime Environment (IcedTea6 1.8.2) (6b18-1.8.2-4ubuntu2)
> OpenJDK 64-Bit Server VM (build 16.0-b13, mixed mode)
>
> These are the memory settings:
> Sorry for spam again. :-)
No, thanks a lot for tracking that down and reporting details!
Presumably a significant number of users are on that version of Ubuntu
running with openjdk.
--
/ Peter Schuller
> PID  USER  PR  NI  VIRT   RES   SHR  S  %CPU  %MEM  TIME+    COMMAND
> 7836 root  15  0   3300m  2.4g  13m  S  0     26.0  2:58.51  java
>
> The jvm heap utilization is quite normal:
>
> # sudo jstat -gc -J"-Xmx128m" 7836
>  S0C     S1C     S0U    S1U   EC       EU      OC        OU
>  8512.0  8512.0  372.8  0.0   68160.0  5225.7  963392.0  508200.7
>  PC       PU       YGC  YGCT   FGC  FGCT   GCT
>  30604.0  18373.4  480  3.979  2    0.005  3.984
>
> And then I try "pmap" to see the native memory mapping. There are two
> large anonymous mmap regions:
>
> 080dc000 1573568K rw
> Not 100% relevant but I found this to be interesting if your nodes are
> doing heavy disk I/O:
>
> http://rackerhacker.com/2008/08/07/reduce-disk-io-for-small-reads-using-memory/

There are some pitfalls though, or at least there were the last time I
was tweaking such stuff for a PostgreSQL da[...]
-Chris

On Jul 15, 2010, at 11:47 PM, Peter Schuller wrote:
> This would require that Cassandra run as root on Linux systems, as 'man
> mlockall' states:

IIRC, mlock() (as opposed to mlockall()) does not require root
privileges - but is subject to resource limitations.

However, given a lack of control of how memory is allocated in the JVM,
I suppose mlock[...]
On Thu, Jul 15, 2010 at 5:46 PM, Clint Byrum wrote:
> One other approach that works on Linux is to use HugeTLB. This post details
> the process for doing so with a jvm:
>
> http://andrigoss.blogspot.com/2008/02/jvm-performance-tuning.html
>
> Basically when mmapping using HUGETLB you don't have t[...]
> [...]isting on zeroing out any buffer you create, which is a big hit to
> performance when you're allocating buffers for file i/o on each request
> instead of just mmaping things. Re-using those buffers would be possible but
> difficult; I think using mlockall to "fix" the mmap approach is more
> promising.

Sorry if it is a silly question, but what would be the approach? Issue a
mlockall with the current set (MCL_CURRENT) before mmap'ing the files?
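A sketch of what that could look like with JNA direct mapping (this is not
Cassandra's CLibrary, just an illustration; the MCL_* values are the Linux
ones, and the call needs root or CAP_IPC_LOCK plus a big enough memlock
ulimit):

import com.sun.jna.LastErrorException;
import com.sun.jna.Native;

public class MlockAll {
    // From <sys/mman.h> on Linux: MCL_CURRENT = 1, MCL_FUTURE = 2.
    private static final int MCL_CURRENT = 1;
    private static final int MCL_FUTURE  = 2;

    static {
        Native.register("c");  // bind the native method below against libc
    }

    private static native int mlockall(int flags) throws LastErrorException;

    public static void main(String[] args) {
        try {
            // Lock current *and* future mappings: MCL_CURRENT alone only covers
            // pages already mapped at the time of the call, so files mmapped
            // afterwards would not be pinned.
            mlockall(MCL_CURRENT | MCL_FUTURE);
            System.out.println("mlockall succeeded");
        } catch (LastErrorException e) {
            System.out.println("mlockall failed, errno=" + e.getErrorCode());
        }
    }
}

That also answers the MCL_CURRENT question above: locking only the current set
before mapping the files would not cover the mappings created later.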
On Thu, Jul 15, 2010 at 11:41 AM, Peter Schuller wrote:
> Not really. That is, the intent of mmap is to let the OS dynamically
> choose what gets swapped in and out. The practical problem is that the
> OS will often tend to swap too much. I got the impression jbellis
> wasn't [...]
I found that, for a large dataset in a long-term random reading test, the
performance with mmap is very bad. See the attached chart in
https://issues.apache.org/jira/browse/CASSANDRA-1214.

On Fri, Jul 16, 2010 at 12:41 AM, Peter Schuller
<peter.schul...@infidyne.com> wrote:
> Can someone please explain the mmap issue?
> mmap is the default for all storage files on 64-bit machines.
> According to https://issues.apache.org/jira/browse/CASSANDRA-1214
> it might not be a good thing.
> Is it right to say that you should use mmap only if your MAX expected data
> is smaller than [...]