Unsubscribe 

> On Jun 28, 2019, at 11:00, riak-users-requ...@lists.basho.com wrote:
> 
> Send riak-users mailing list submissions to
>    riak-users@lists.basho.com
> 
> To subscribe or unsubscribe via the World Wide Web, visit
>    http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> or, via email, send a message with subject or body 'help' to
>    riak-users-requ...@lists.basho.com
> 
> You can reach the person managing the list at
>    riak-users-ow...@lists.basho.com
> 
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of riak-users digest..."
> 
> 
> Today's Topics:
> 
>   1. Re: Riak 2.9.0 - Update Available (Martin Sumner)
>   2. Re: Riak 2.9.0 - Update Available (Martin Sumner)
>   3. Re: Riak 2.9.0 - Update Available (Russell Brown)
>   4. Re: Riak 2.9.0 - Update Available (Bryan Hunt)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Fri, 28 Jun 2019 09:34:58 +0100
> From: Martin Sumner <martin.sum...@adaptip.co.uk>
> To: riak-users@lists.basho.com
> Subject: Re: Riak 2.9.0 - Update Available
> Message-ID:
>    <canzjuxcj9nbwlpqfyavyexjzzza9atmnsvxpcisvr1xf3ec...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
> 
> There is now a second update available for 2.9.0:
> https://github.com/basho/riak/tree/riak-2.9.0p2.
> 
> This patch, like the patch before, resolves a memory management issue in
> leveled, which this time could be triggered by sending many large objects
> in a short period of time.  The underlying problem is described a bit
> further here https://github.com/martinsumner/leveled/issues/285, and is
> resolved by leveled working more sympathetically with the beam binary
> memory management.
> 
> Switching to the patched version is not urgent unless you are using the
> leveled backend, and may send a large number of large objects in a burst.
> 
> Updated packages are available (thanks to Nick Adams at TI Tokyo) -
> https://files.tiot.jp/riak/kv/2.9/2.9.0p2/
> 
> Thanks again to the testing team at the NHS Spine project, Aaron Gibbon
> (BJSS) and Ramen Sen, who discovered the problem.  The issue was discovered
> in a handoff scenario where there were tens of thousands of 2MB objects
> stored in a portion of the keyspace at the end of the handoff - which led
> to memory issues until either more PUTs were received (to force a persist
> to disk) or a restart occurred.
> 
> Regards
> 
> 
> On Sat, 25 May 2019 at 09:35, Martin Sumner <martin.sum...@adaptip.co.uk>
> wrote:
> 
>> Unfortunately, Riak 2.9.0 was released with an issue whereby a race
>> condition in heavy-PUT scenarios (e.g. handoffs) could cause a leak of
>> file descriptors.
>> 
>> The issue is described here - https://github.com/basho/riak_kv/issues/1699,
>> and the underlying issue here -
>> https://github.com/martinsumner/leveled/issues/278.
>> 
>> There is a new patched version of the release available (2.9.0p1) at
>> https://github.com/basho/riak/tree/riak-2.9.0p1.  This should be used in
>> preference to the original release of 2.9.0.
>> 
>> Updated packages are available (thanks to Nick Adams at TI Tokyo) -
>> https://files.tiot.jp/riak/kv/2.9/2.9.0p1/
>> 
>> Thanks also to the testing team at the NHS Spine project, Aaron Gibbon
>> (BJSS) and Ramen Sen, who discovered the problem.
>> 
>> Regards
>> 
>> Martin
>> 
>> 
>> 
>> 
>> 
> 
> ------------------------------
> 
> Message: 2
> Date: Fri, 28 Jun 2019 10:24:28 +0100
> From: Martin Sumner <martin.sum...@adaptip.co.uk>
> To: b h <bryanhuntwit...@gmail.com>
> Cc: riak-users@lists.basho.com
> Subject: Re: Riak 2.9.0 - Update Available
> Message-ID:
>    <CANzjUxAYQdimTt9EFj23=4kkab2pdgf3hf4e9_hes-sh30s...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
> 
> Bryan,
> 
> We saw that Riak was using much more memory than expected at the end of
> the handoffs.  Using `riak-admin top` we could see that this wasn't process
> memory, but binaries.  First, we did some work via attach, looping over
> processes and running GC, to confirm that this wasn't a failure to collect
> garbage - the references to memory were real.  Then we wrote some functions
> in attach to analyse process_info/2 for each process (looking at binary and
> memory), and discovered that there were penciller processes holding many
> references to large binaries (which accounted for all the unexpected memory
> use), and that the penciller was the only process with a reference to each
> binary.  This made no sense initially, as the penciller should only hold
> small binaries (metadata).  We then looked at the running state of the
> penciller processes and could see no large binaries in the state, but could
> see that many of the active keys in the penciller were keys known to have
> large object values (but small amounts of metadata) - and that the sizes of
> the object values matched the sizes of the binary references found on the
> penciller process via process_info/2.
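
A minimal sketch of the kind of helper described above, not the functions actually used in the investigation. It is written as a module for clarity; the module name `bin_refs` and the function `top/1` are hypothetical. It relies on `erlang:process_info(Pid, binary)`, whose result (a list of `{Id, Size, RefCount}` tuples) is an internal format that may change between OTP releases.

```erlang
-module(bin_refs).
-export([top/1]).

%% Return up to N processes ranked by the total size of the
%% reference-counted binaries they hold, largest first, as
%% {Pid, TotalBytes, NumberOfReferences} tuples.
top(N) ->
    Stats = [S || Pid <- erlang:processes(), S <- stats(Pid)],
    lists:sublist(lists:reverse(lists:keysort(2, Stats)), N).

%% Summarise the off-heap binaries referenced by one process.
%% Assumption: each entry is an {Id, Size, RefCount} tuple; entries of
%% any other shape are skipped by the list comprehension.
stats(Pid) ->
    case erlang:process_info(Pid, binary) of
        {binary, Bins} ->
            Bytes = lists:sum([Size || {_Id, Size, _RefCount} <- Bins]),
            [{Pid, Bytes, length(Bins)}];
        undefined ->
            []  %% the process exited before it could be inspected
    end.
```

Calling `bin_refs:top(10)` lists the ten processes holding the most binary data. Running `erlang:garbage_collect(Pid)` on each process beforehand helps rule out binaries that are merely awaiting collection. The same binary may appear more than once in a process's list, so the totals are best read as an upper bound.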
> 
> I then recalled the first part of this:
> https://dieswaytoofast.blogspot.com/2012/12/erlang-binaries-and-garbage-collection.html.
> It was obvious that in extracting the metadata the beam was naturally
> retaining a reference to the whole binary, as long as the sub-binary was
> retained by a process (the Penciller).  Forcing a binary copy resolved
> this referencing issue.  It was nice that the same tools used to detect the
> issue made it quite easy to write a test to confirm resolution -
> https://github.com/martinsumner/leveled/blob/master/test/end_to_end/riak_SUITE.erl#L1214-L1239
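
The retention effect and the fix can be illustrated with a small sketch. This is an illustration only, not the leveled code: the module name `subbin_demo`, the 2 MB zero-filled object and the 16-byte metadata prefix are all assumptions. `binary:referenced_byte_size/1` reports the size of the underlying binary that a term keeps alive.

```erlang
-module(subbin_demo).
-export([run/0]).

run() ->
    %% A 2 MB binary standing in for a large object (hypothetical data).
    Object = binary:copy(<<0>>, 2 * 1024 * 1024),
    %% Pattern matching yields a sub-binary that points into Object,
    %% so holding Meta keeps the whole 2 MB binary referenced.
    <<Meta:16/binary, _Value/binary>> = Object,
    %% binary:copy/1 allocates a fresh binary containing only the
    %% 16 bytes of metadata, with no link back to Object.
    MetaCopy = binary:copy(Meta),
    {binary:referenced_byte_size(Meta),
     binary:referenced_byte_size(MetaCopy)}.
```

`subbin_demo:run()` should return `{2097152, 16}`: the matched sub-binary still pins the full 2 MB object, while the copy references only its own 16 bytes.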
> 
> The memory leak section of Fred Hebert's http://www.erlang-in-anger.com/ is
> great reading for helping with these types of issues.
> 
> Thanks
> 
> Martin
> 
> 
>> On Fri, 28 Jun 2019 at 09:46, b h <bryanhuntwit...@gmail.com> wrote:
>> 
>> Nice work - I've read issue / PR - how did you discover / track it down -
>> tools or just reading the code ?
>> 
> 
> ------------------------------
> 
> Message: 3
> Date: Fri, 28 Jun 2019 10:26:47 +0100
> From: Russell Brown <russell.br...@mac.com>
> To: riak-users@lists.basho.com
> Subject: Re: Riak 2.9.0 - Update Available
> Message-ID: <c5d9e864-b9b1-ae36-a197-0ba3e5038...@mac.com>
> Content-Type: text/plain; charset=utf-8; format=flowed
> 
> Good job on finding and fixing so fast.
> 
> I have to ask. What's with the naming scheme? Why not 2.9.2 instead of 
> 2.9.0p2?
> 
> Cheers
> 
> Russell
> 
> 
> 
> 
> ------------------------------
> 
> Message: 4
> Date: Fri, 28 Jun 2019 11:04:56 +0100
> From: Bryan Hunt <bryan.h...@erlang-solutions.com>
> To: Martin Sumner <martin.sum...@adaptip.co.uk>
> Cc: b h <bryanhuntwit...@gmail.com>, riak-users@lists.basho.com
> Subject: Re: Riak 2.9.0 - Update Available
> Message-ID:
>    <6d5955f8-8255-4a51-9bc0-04cee1bdb...@erlang-solutions.com>
> Content-Type: text/plain; charset="iso-8859-1"
> 
> Top quality spelunking - always fun to read - thanks Martin !
> 
> 
> 
> 
> ------------------------------
> 
> Subject: Digest Footer
> 
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> 
> 
> ------------------------------
> 
> End of riak-users Digest, Vol 118, Issue 1
> ******************************************
