You could try setting it to run with SimpleMessenger instead of AsyncMessenger -- the default changed across those releases.

I imagine the root of the problem, though, is that with BlueStore the OSD is using a lot more memory than it used to, so we're overflowing the 32-bit address space. A more permanent solution might therefore require turning down the memory tuning options; Sage has discussed those in various places.
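If you want to experiment with that, here is a minimal ceph.conf sketch. The option names and values are my assumptions about the relevant Luminous knobs (ms_type and the bluestore_cache_size* settings), not something confirmed in this thread, so treat it as a starting point rather than a recommendation:

    # Illustrative ceph.conf fragment -- values are assumptions, not tested on 32-bit.
    [osd]
    # fall back to the older SimpleMessenger instead of the AsyncMessenger default
    ms type = simple
    # shrink the BlueStore cache so the OSD stays well inside a 32-bit address space
    bluestore cache size     = 536870912   # 512 MB
    bluestore cache size hdd = 536870912
    bluestore cache size ssd = 536870912

Note that switching messengers only works around the crash; shrinking the BlueStore cache is the part that actually addresses running out of 32-bit address space.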
On Sun, Sep 10, 2017 at 11:52 PM Dyweni - Ceph-Users <6exbab4fy...@dyweni.com> wrote:

> Hi,
>
> Is anyone running Ceph Luminous (12.2.0) on 32bit Linux? Have you seen
> any problems?
>
> My setup has been 1 MON and 7 OSDs (no MDS, RGW, etc), all running Jewel
> (10.2.1), on 32bit, with no issues at all.
>
> I've upgraded everything to the latest version of Jewel (10.2.9) and still
> no issues.
>
> Next I upgraded my MON to Luminous (12.2.0) and added an MGR to it. Still
> no issues.
>
> Next I removed one node from the cluster, wiped it clean, upgraded it to
> Luminous (12.2.0), and created a new BlueStore data area. Now this node
> crashes with a segmentation fault, usually within a few minutes of starting
> up. I've loaded symbols and used GDB to examine backtraces. From what I
> can tell, the seg faults are happening randomly, and the stack is
> corrupted, so traces from GDB are unusable (even with all symbols
> installed for all packages on the system). However, in all cases, the
> seg fault is occurring in the 'msgr-worker-<n>' thread.
>
> My data is fine; I just would like to get Ceph 12.2.0 running stably on
> this node, so I can upgrade the remaining nodes and switch everything
> over to BlueStore.
>
> Thanks,
> Dyweni
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
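For reference, this is roughly how the per-thread backtraces described above can be gathered with GDB; the binary and core file paths are illustrative assumptions, not taken from the original report:

    # Assumed paths -- adjust to wherever your ceph-osd binary and core dump live.
    gdb /usr/bin/ceph-osd /path/to/core
    (gdb) set pagination off
    (gdb) info threads              # identify the msgr-worker-<n> threads
    (gdb) thread apply all bt full  # full backtrace from every thread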