Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-06 Thread Hector Martin
be degraded until you can bring the host back, and will not be able to recover those chunks anywhere (since the ruleset prevents so), so any further failure of an OSD while a host is down will necessarily lose data.
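For reference, a minimal sketch of an erasure-code profile sized so that hosts >= k + m holds on a three-host cluster (k=2, m=1, host failure domain); the profile and pool names are illustrative:

    ceph osd erasure-code-profile set ec21 k=2 m=1 crush-failure-domain=host
    ceph osd pool create ecpool 64 64 erasure ec21

With this layout a single host failure leaves the affected PGs degraded (as described above) but still readable, since the remaining k chunks survive.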

Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-06 Thread Hector Martin
On 06/02/15 21:07, Udo Lembke wrote:
> Am 06.02.2015 09:06, schrieb Hector Martin:
>> On 02/02/15 03:38, Udo Lembke wrote:
>>> With 3 hosts only you can't survive an full node failure, because for
>>> that you need
>>> host >= k + m.
>>
>> Su

[ceph-users] pg_num docs conflict with Hammer PG count warning

2015-08-06 Thread Hector Martin
everything by 12? The cluster is currently very overprovisioned for space, so we're probably not going to be adding OSDs for quite a while, but we'll be adding pools.
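For context, the heuristic that both the pg_num docs and the Hammer warning derive from is roughly the following (numbers illustrative, not this cluster's):

    # target on the order of ~100 PGs per OSD, counting all replicas:
    #   sum over pools of (pg_num * pool_size) / num_osds  should stay near 100
    # e.g. 16 OSDs, replica 3: 100 * 16 / 3 ~= 533, so ~512 PGs split across all pools

Adding pools eats into that same per-OSD budget, which is what eventually trips the warning.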

Re: [ceph-users] pg_num docs conflict with Hammer PG count warning

2015-08-06 Thread Hector Martin
ely larger factors as you add pools. We are following the hardware recommendations for RAM: 1GB per 1TB of storage, so 16GB for each OSD box (4GB per OSD daemon, each OSD being one 4TB drive).

[ceph-users] CephFS dropping data with rsync?

2018-06-15 Thread Hector Martin
thout leaving any evidence behind. Any ideas what might've happened here? If this happens again / is reproducible I'll try to see if I can do some more debugging...

Re: [ceph-users] CephFS dropping data with rsync?

2018-06-15 Thread Hector Martin
On 2018-06-16 13:04, Hector Martin wrote:
> I'm at a loss as to what happened here.

Okay, I just realized CephFS has a default 1TB file size... that explains what triggered the problem. I just bumped it to 10TB. What that doesn't explain is why rsync didn't complain about an
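For reference, the limit mentioned here is the filesystem's max_file_size setting, which can be raised with ceph fs set (filesystem name illustrative):

    ceph fs set cephfs max_file_size 10995116277760   # 10 TiB in bytes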

Re: [ceph-users] Filestore to Bluestore migration question

2018-10-31 Thread Hector Martin
-data hdd1/data1 --block.db ssd/db1
...
ceph-volume lvm activate --all

I think it might be possible to just let ceph-volume create the PV/VG/LV for the data disks and only manually create the DB LVs, but it shouldn't hurt to do it on your own and just give ready-made LVs to ceph-volume
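A rough sketch of the full sequence being described, assuming illustrative VG/LV names (ssd, hdd1) and one 40G DB LV per OSD:

    lvcreate -L 40G -n db1 ssd
    lvcreate -l 100%FREE -n data1 hdd1
    ceph-volume lvm prepare --bluestore --data hdd1/data1 --block.db ssd/db1
    ceph-volume lvm activate --all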

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-04 Thread Hector Martin
f you just try to start the OSDs again? Maybe check the overall system log with journalctl for hints.
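For reference, the kind of journal queries meant here (OSD id illustrative):

    journalctl -u ceph-osd@60 --no-pager -n 100
    journalctl -b -p err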

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
systemd is still trying to mount the old OSDs, which used disk partitions. Look in /etc/fstab and in /etc/systemd/system for any references to those filesystems and get rid of them. /dev/sdh1 and company no longer exist, and nothing should reference them.
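A minimal sketch of hunting down such stale references, using the device names from this thread:

    grep sdh /etc/fstab
    grep -rl sdh1 /etc/systemd/system
    systemctl list-units --all | grep -i -e ceph -e sdh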

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
On 11/6/18 1:08 AM, Hector Martin wrote:
> On 11/6/18 12:42 AM, Hayashida, Mami wrote:
>> Additional info -- I know that /var/lib/ceph/osd/ceph-{60..69} are not
>> mounted at this point (i.e. mount | grep ceph-60, and 61-69, returns
>> nothing.). They don't show

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
0G  0 lvm
> ├─ssd0-db61    252:1    0    40G  0 lvm
> ├─ssd0-db62    252:2    0    40G  0 lvm
> ├─ssd0-db63    252:3    0    40G  0 lvm
> ├─ssd0-db64    252:4    0    40G  0 lvm
> ├─

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
hat references any of the old partitions that don't exist (/dev/sdh1 etc) should be removed. The disks are now full-disk LVM PVs and should have no partitions.

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
On 11/6/18 3:21 AM, Alfredo Deza wrote:
> On Mon, Nov 5, 2018 at 11:51 AM Hector Martin wrote:
>>
>> Those units don't get triggered out of nowhere, there has to be a
>> partition table with magic GUIDs or a fstab or something to cause them
>> to be triggered. The

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
;change", SUBSYSTEM=="block", ENV{DEVTYPE}=="disk", \ ENV{DM_LV_NAME}=="db*", ENV{DM_VG_NAME}=="ssd0", \ OWNER="ceph", GROUP="ceph", MODE="660" Reboot after that and see if the OSDs come up without further action. -- Hec

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Hector Martin
s with symlinks to block devices. I'm not sure what happened there.

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin
> sdh              8:112  0   3.7T  0 disk
> └─hdd60-data60 252:1    0   3.7T  0 lvm
>
> and "ceph osd tree" shows
> 60   hdd    3.63689         osd.60         up  1.0 1.0

That looks correct as far as the weight goes, but I'm really confused as to why you have

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin
" and "mount | grep osd" instead and see if ceph-60 through ceph-69 show up.

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin
adata is in LVM, it's safe to move or delete all those OSD directories for BlueStore OSDs and try activating them cleanly again, which hopefully will do the right thing. In the end this all might fix your device ownership woes too, making the udev rule unnecessary. If it all works ou
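A hedged sketch of what moving the directories aside and re-activating could look like for one OSD (id illustrative; only sensible for BlueStore OSDs whose metadata lives in the LVM tags):

    systemctl stop ceph-osd@60
    mv /var/lib/ceph/osd/ceph-60 /root/ceph-60.bak
    ceph-volume lvm activate --all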

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin
en care of by the ceph-volume activation.

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-07 Thread Hector Martin
d to be safe, and might avoid trouble if some FileStore remnant tries to mount phantom partitions.

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-07 Thread Hector Martin
g to wipe because there is a backup at the end of the device, but wipefs *should* know about that as far as I know.
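For reference, clearing both the primary GPT and the backup copy at the end of the device is usually done like this (device name illustrative; destructive):

    wipefs -a /dev/sdh
    sgdisk --zap-all /dev/sdh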

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-07 Thread Hector Martin
-- 1 ceph ceph 6 Oct 28 16:12 ready
-rw--- 1 ceph ceph 10 Oct 28 16:12 type
-rw--- 1 ceph ceph 3 Oct 28 16:12 whoami

(lockbox.keyring is for encryption, which you do not use)

Re: [ceph-users] Unexplainable high memory usage OSD with BlueStore

2018-11-08 Thread Hector Martin
ot always viable. Right now it seems that besides the cache, OSDs will creep up in memory usage up to some threshold, and I'm not sure what determines what that baseline usage is or whether it can be controlled.
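On releases that have it, the knob that bounds this (cache included) is osd_memory_target; a minimal ceph.conf sketch, value illustrative:

    [osd]
    osd_memory_target = 4294967296   # ~4 GiB per OSD daemon; the cache auto-tunes to fit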

[ceph-users] Effects of restoring a cluster's mon from an older backup

2018-11-08 Thread Hector Martin
ata: http://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds Would this be preferable to just restoring the mon from a backup? What about the MDS map?
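For context, an abridged sketch of the linked recovery-using-OSDs procedure (paths and keyring location illustrative; the full doc loops over every OSD on every host):

    ms=/tmp/mon-store
    for osd in /var/lib/ceph/osd/ceph-*; do
        ceph-objectstore-tool --data-path "$osd" --op update-mon-db --mon-store-path "$ms"
    done
    ceph-monstore-tool "$ms" rebuild -- --keyring /path/to/admin.keyring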

Re: [ceph-users] Effects of restoring a cluster's mon from an older backup

2018-11-12 Thread Hector Martin
ow. I'll see if I can do some DR tests when I set this up, to prove to myself that it all works out :-) -- Hector Martin (hec...@marcansoft.com) Public Key: https://marcan.st/marcan.asc ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] CephFS file contains garbage zero padding after an unclean cluster shutdown

2018-11-23 Thread Hector Martin
nly? Or only if several things go down at once?)

Re: [ceph-users] CephFS file contains garbage zero padding after an unclean cluster shutdown

2018-11-25 Thread Hector Martin
hose pages are flushed?

Re: [ceph-users] CephFS file contains garbage zero padding after an unclean cluster shutdown

2018-11-25 Thread Hector Martin
On 26/11/2018 11.05, Yan, Zheng wrote: > On Mon, Nov 26, 2018 at 4:30 AM Hector Martin wrote: >> >> On 26/11/2018 00.19, Paul Emmerich wrote: >>> No, wait. Which system did kernel panic? Your CephFS client running rsync? >>> In this case this would be expect

[ceph-users] RBD snapshot atomicity guarantees?

2018-12-18 Thread Hector Martin
M (suspend/resume), which has higher impact but also probably a much lower chance of messing up (or having excess latency), since it doesn't involve the guest OS or the qemu agent at all...
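For context, the freeze-based flow under discussion looks roughly like this with the qemu guest agent (domain, pool and image names illustrative):

    virsh domfsfreeze vm1
    rbd snap create rbd/vm1-disk@backup-$(date +%Y%m%d)
    virsh domfsthaw vm1

The suspend/resume alternative mentioned above would swap the freeze/thaw steps for virsh suspend / virsh resume.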

Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-18 Thread Hector Martin
'm increasing our timeout to 15 minutes, we'll see if the problem recurs. Given this, it makes even more sense to just avoid the freeze if at all reasonable. There's no real way to guarantee that a fsfreeze will complete in a "reasonable" amount of time as far as I ca

Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-18 Thread Hector Martin
out/retries, then switch to unconditionally reset the VM if thawing fails. Ultimately this whole thing is kind of fragile, so if I can get away without freezing at all it would probably make the whole process a lot more robust.

Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-20 Thread Hector Martin
On 21/12/2018 03.02, Gregory Farnum wrote:
> RBD snapshots are indeed crash-consistent. :)
> -Greg

Thanks for the confirmation! May I suggest putting this little nugget in the docs somewhere? This might help clarify things for others :)

[ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2018-12-27 Thread Hector Martin
e happy to test this again with osd.1 if needed and see if I can get it fixed. Otherwise I'll just re-create it and move on.

# ceph --version
ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic (stable)
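For reference, the kind of steps involved in such an expansion (VG/LV and OSD id illustrative; per this thread, not yet safe on 13.2.1):

    lvextend -L +10G ssd/osd1-db
    systemctl stop ceph-osd@1
    ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-1
    systemctl start ceph-osd@1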

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2019-01-11 Thread Hector Martin
ith CEPH_ARGS="--debug-bluestore 20 --debug-bluefs 20 --log-file bluefs-bdev-expand.log" Perhaps it makes sense to open a ticket at ceph bug tracker to proceed... Thanks, Igor

On 12/27/2018 12:19 PM, Hector Martin wrote:
Hi list, I'm slightly expanding the underlying LV for two

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2019-01-11 Thread Hector Martin
o problem then, good to know it isn't *supposed* to work yet :-)

Re: [ceph-users] Boot volume on OSD device

2019-01-18 Thread Hector Martin
). So the OSDs get set up with some custom code, but then normal usage just uses ceph-disk (it certainly doesn't care about extra partitions once everything is set up). This was formerly FileStore and now BlueStore, but it's a legacy setup. I expect to move this over to ceph-volume at

Re: [ceph-users] Suggestions/experiences with mixed disk sizes and models from 4TB - 14TB

2019-01-18 Thread Hector Martin
m to work well so far in my home cluster, but I haven't finished setting things up yet. Those are definitely not SMR.

Re: [ceph-users] dropping python 2 for nautilus... go/no-go

2019-01-18 Thread Hector Martin
ision thing) to hopefully squash more lurking Python 3 bugs. (just my 2c - maybe I got unlucky and otherwise things work well enough for everyone else in Py3; I'm certainly happy to get rid of Py2 ASAP).

Re: [ceph-users] dropping python 2 for nautilus... go/no-go

2019-01-18 Thread Hector Martin
On 18/01/2019 22.33, Alfredo Deza wrote: > On Fri, Jan 18, 2019 at 7:07 AM Hector Martin wrote: >> >> On 17/01/2019 00:45, Sage Weil wrote: >>> Hi everyone, >>> >>> This has come up several times before, but we need to make a final >>> decis

Re: [ceph-users] Boot volume on OSD device

2019-01-18 Thread Hector Martin
On 19/01/2019 02.24, Brian Topping wrote: > > >> On Jan 18, 2019, at 4:29 AM, Hector Martin wrote: >> >> On 12/01/2019 15:07, Brian Topping wrote: >>> I’m a little nervous that BlueStore assumes it owns the partition table and >>> will not be happy tha

Re: [ceph-users] Boot volume on OSD device

2019-01-20 Thread Hector Martin
d-raid-on-lvm, which as you can imagine required some tweaking of startup scripts to make work with LVM on both ends!) Ultimately a lot of this is dictated by whatever tools you feel comfortable using :-)

[ceph-users] CephFS performance vs. underlying storage

2019-01-30 Thread Hector Martin
ring raw storage performance. * Ceph has a slight disadvantage here because its chunk of the drives is logically after the traditional RAID, and HDDs get slower towards higher logical addresses, but this should be on the order of a 15-20% hit at most.

Re: [ceph-users] Bluestore deploys to tmpfs?

2019-02-04 Thread Hector Martin
ho 'rc_need="ceph-mon.0"' > /etc/conf.d/ceph-osd The Gentoo initscript setup for Ceph is unfortunately not very well documented. I've been meaning to write a blogpost about this to try to share what I've learned :-) -- Hector Martin (hec...@marcansoft.com) Public

[ceph-users] CephFS overwrite/truncate performance hit

2019-02-06 Thread Hector Martin
dance, if I can guarantee they're atomic. Is there any documentation on what write operations incur significant overhead on CephFS like this, and why? This particular issue isn't mentioned in http://docs.ceph.com/docs/master/cephfs/app-best-practices/ (which seems like it mostly deals

Re: [ceph-users] I get weird ls pool detail output 12.2.11

2019-02-07 Thread Hector Martin
though. There's some discussion on this here: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/020510.html

Re: [ceph-users] I get weird ls pool detail output 12.2.11

2019-02-07 Thread Hector Martin
600 or so? You might want to go through your snapshots and check that you aren't leaking old snapshots forever, or deleting the wrong ones.

Re: [ceph-users] CephFS overwrite/truncate performance hit

2019-02-07 Thread Hector Martin
or writing to an existing one without truncation does not.

Re: [ceph-users] I get weird ls pool detail output 12.2.11

2019-02-07 Thread Hector Martin
ying pools, one apparently created on deletion (I wasn't aware of this). So for ~700 snapshots the output you're seeing is normal. It seems that using a "rolling snapshot" pattern in CephFS inherently creates a "one present, one deleted" pattern in the underlying pools.

Re: [ceph-users] change OSD IP it uses

2019-02-08 Thread Hector Martin
and trying to connect via the external IP of that node. Does your ceph.conf have the right network settings? Compare it with the other nodes. Also check that your network interfaces and routes are correctly configured on the problem node, of course.
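A minimal ceph.conf sketch of the settings worth comparing across nodes (subnets illustrative):

    [global]
    public network  = 192.168.1.0/24
    cluster network = 10.0.0.0/24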

Re: [ceph-users] Downsizing a cephfs pool

2019-02-08 Thread Hector Martin
t data pool. The FSMap seems to store pools by ID, not by name, so renaming the pools won't work. This past thread has an untested procedure for migrating CephFS pools: https://www.spinics.net/lists/ceph-users/msg29536.html

Re: [ceph-users] change OSD IP it uses

2019-02-08 Thread Hector Martin
'), e.g. 'ceph osd purge --yes-i-really-mean-it' and make sure there isn't a spurious entry for it in ceph.conf, then re-deploy. Once you do that there is no possible other place for the OSD to somehow remember its old IP.

[ceph-users] Controlling CephFS hard link "primary name" for recursive stat

2019-02-08 Thread Hector Martin
s formula and then just do the above dance for every hardlinked file to move the primaries off, but this seems fragile and likely to break in certain situations (or do needless work). Any other ideas? Thanks,

Re: [ceph-users] Downsizing a cephfs pool

2019-02-08 Thread Hector Martin
h` back to the cluster seed.
>
> I appreciate small clusters are not the target use case of Ceph, but
> everyone has to start somewhere!

Re: [ceph-users] CephFS overwrite/truncate performance hit

2019-02-12 Thread Hector Martin
It's just a 128-byte flag file (formerly variable length, now I just pad it to the full 128 bytes and rewrite it in-place). This is good information to know for optimizing things :-) -- Hector Martin (hec...@marcansoft.com) Public Key: https://mrcn.st/pub

Re: [ceph-users] Controlling CephFS hard link "primary name" for recursive stat

2019-02-12 Thread Hector Martin
tat(), right. (I only just realized this :-)) Are there Python bindings for what ceph-dencoder does, or at least a C API? I could shell out to ceph-dencoder but I imagine that won't be too great for performance.

Re: [ceph-users] Files in CephFS data pool

2019-02-26 Thread Hector Martin
to know about all the files in a pool. As far as I can tell you *can* read the ceph.file.layout.pool xattr on any files in CephFS, even those that haven't had it explicitly set.
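For reference, reading that virtual xattr looks like this (path illustrative):

    getfattr -n ceph.file.layout.pool /mnt/cephfs/some/file
    getfattr -n ceph.file.layout /mnt/cephfs/some/file   # full layout, including the pool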

Re: [ceph-users] Cephfs recursive stats | rctime in the future

2019-02-27 Thread Hector Martin
odified time of 2 Sept 2028, the day and month are also wrong. Obvious question: are you sure the date/time on your cluster nodes and your clients is correct? Can you track down which files (if any) have the ctime in the future by following the rctime
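The recursive ctime can be read per directory, which makes it possible to walk down toward whichever file carries the future timestamp (path illustrative):

    getfattr -n ceph.dir.rctime /mnt/cephfs/suspect/dir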

Re: [ceph-users] Mimic and cephfs

2019-02-27 Thread Hector Martin
4 for months now without any issues in two single-host setups. I'm also in the process of testing and migrating a production cluster workload from a different setup to CephFS on 13.2.4 and it's looking good.

Re: [ceph-users] Erasure coded pools and ceph failure domain setup

2019-03-04 Thread Hector Martin
m OSDs without regard for the hosts; you will be able to use effectively any EC widths you want, but there will be no guarantees of data durability if you lose a whole host.
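A sketch of the profile that produces that behaviour, trading host-level durability for wider encodings (names and widths illustrative):

    ceph osd erasure-code-profile set ec82-osd k=8 m=2 crush-failure-domain=osd
    ceph osd pool create ecwide 128 128 erasure ec82-osd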

Re: [ceph-users] mount cephfs on ceph servers

2019-03-06 Thread Hector Martin
nt, I have been doing this on two machines (single-host Ceph clusters) for months with no ill effects. The FUSE client performs a lot worse than the kernel client, so I switched to the latter, and it's been working well with no deadlocks.
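For reference, the kernel-client mount being referred to is roughly (monitor address and secret path illustrative):

    mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret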

Re: [ceph-users] mount cephfs on ceph servers

2019-03-12 Thread Hector Martin
snapshots (one per day), have one active metadata server, and change several TB daily - it's much, *much* faster than with fuse. Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2 coding. ta ta Jake

On 3/6/19 11:10 AM, Hector Martin wrote:
On 06/03/2019 12:07, Zhenshi Zhou

Re: [ceph-users] mount cephfs on ceph servers

2019-03-12 Thread Hector Martin
oit.io> > Tel: +49 89 1896585 90 > > On Tue, Mar 12, 2019 at 10:07 AM Hector Martin > mailto:hec...@marcansoft.com>> wrote: > > > > It's worth noting that most containerized deployments can effectively > > limit RAM for containers (cg

Re: [ceph-users] Rebuild after upgrade

2019-03-17 Thread Hector Martin
/ In particular, you turned on CRUSH_TUNABLES5, which causes a large amount of data movement: http://docs.ceph.com/docs/master/rados/operations/crush-map/#jewel-crush-tunables5 Going from Firefly to Hammer has a much smaller impact (see the CRUSH_V4 section).
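For reference, the active tunables can be inspected, and pinned to an older profile to limit data movement (profile name per the linked doc):

    ceph osd crush show-tunables
    ceph osd crush tunables hammer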

Re: [ceph-users] Erasure Coding failure domain (again)

2019-04-02 Thread Hector Martin
ith such a wide EC encoding, but if you do lose a PG you'll lose more data because there are fewer PGs. Feedback on my math welcome.

Re: [ceph-users] Erasure Coding failure domain (again)

2019-04-02 Thread Hector Martin
peek and comment. https://www.memset.com/support/resources/raid-calculator/ I'll take a look tonight :)

Re: [ceph-users] Erasure Coding failure domain (again)

2019-04-10 Thread Hector Martin
n, and you need to hit all 3). This is marginally higher than the ~0.00891% with uniformly distributed PGs, because you've eliminated all sets of OSDs which share a host.

[ceph-users] MDS getattr op stuck in snapshot

2019-06-12 Thread Hector Martin
, "event": "dispatched" }, { "time": "2019-06-12 16:15:59.096318", "event": "failed to rdlock, waiting" }, { "time": "2019-06-12 16:15:59.268368", "event": "failed to rdlock, waiting" } ] } } ], "num_ops": 1 } My guess is somewhere along the line of this process there's a race condition and the dirty client isn't properly flushing its data. A 'sync' on host2 does not clear the stuck op. 'echo 1 > /proc/sys/vm/drop_caches' does not either, while 'echo 2 > /proc/sys/vm/drop_caches' does fix it. So I guess the problem is a dentry/inode that is stuck dirty in the cache of host2? -- Hector Martin (hec...@marcansoft.com) Public Key: https://mrcn.st/pub ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] MDS getattr op stuck in snapshot

2019-06-12 Thread Hector Martin
NDING_CAPSNAP))
>     cap->mark_needsnapflush();
> }

That was quick, thanks! I can build from source but I won't have time to do so and test it until next week, if that's okay.

[ceph-users] Broken mirrors: hk, us-east, de, se, cz, gigenet

2019-06-16 Thread Hector Martin
://mirrors.gigenet.com/ceph/ This one is *way* behind on sync, it doesn't even have Nautilus. Perhaps there should be some monitoring for public mirror quality?

Re: [ceph-users] Ceph Clients Upgrade?

2019-06-18 Thread Hector Martin

Re: [ceph-users] Protecting against catastrophic failure of host filesystem

2019-06-18 Thread Hector Martin
ure you test that they work (not sure if they need to be base64 decoded or what have you) if you really want to go this route.

Re: [ceph-users] MDS getattr op stuck in snapshot

2019-06-19 Thread Hector Martin
On 13/06/2019 14.31, Hector Martin wrote:
> On 12/06/2019 22.33, Yan, Zheng wrote:
>> I have tracked down the bug. thank you for reporting this. 'echo 2 >
>> /proc/sys/vm/drop_cache' should fix the hang. If you can compile ceph
>> from source, please try follo

Re: [ceph-users] MDS getattr op stuck in snapshot

2019-06-27 Thread Hector Martin
roperly and tested and everything seems fine. I deployed it to production and got rid of the drop_caches hack and I've seen no stuck ops for two days so far. If there is a bug or PR opened for this can you point me to it so I can track when it goes into a release? Thanks!

[ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-13 Thread Hector Martin
nings and cephfs talking about reconnections and such) and seems to be fine. I can't find these errors anywhere, so I'm guessing they're not known bugs?

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-15 Thread Hector Martin
neg    %edx
0xd788 <+536>: mov    %edx,0x48(%r15)

That means req->r_reply_info.filelock_reply was NULL.

[ceph-users] Stray count increasing due to snapshots (?)

2019-09-05 Thread Hector Martin
side: is there any good documentation about the on-RADOS data structures used by CephFS? I would like to get more familiar with everything to have a better chance of fixing problems should I run into some data corruption in the future)
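For reference, the stray count in question can be watched via the MDS perf counters (daemon name illustrative):

    ceph daemon mds.a perf dump mds_cache | grep num_strays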

Re: [ceph-users] Stray count increasing due to snapshots (?)

2019-09-05 Thread Hector Martin
't involve keeping two months worth of snapshots? That CephFS can't support this kind of use case (and in general that CephFS uses the stray subdir persistently for files in snapshots that could remain forever, while the stray dirs don't scale) sounds like a bug.

Re: [ceph-users] Ceph for "home lab" / hobbyist use?

2019-09-10 Thread Hector Martin
oding and dm-crypt (AES-NI) under the OSDs. Since you'd be running a single OSD per host, I imagine you should be able to get reasonable aggregate performance out of the whole thing, but I've never tried a setup like that. I'm actually considering this kind of thing in the fut

[ceph-users] CephFS deletion performance

2019-09-13 Thread Hector Martin
e on the MDS much at that time, so I'm not sure what the bottleneck is here. Is this expected for CephFS? I know data deletions are asynchronous, but not being able to delete metadata/directories without an undue impact on the whole filesystem performance is somewhat problematic.

Re: [ceph-users] CephFS deletion performance

2019-09-14 Thread Hector Martin
On 13/09/2019 16.25, Hector Martin wrote:
> Is this expected for CephFS? I know data deletions are asynchronous, but
> not being able to delete metadata/directories without an undue impact on
> the whole filesystem performance is somewhat problematic.

I think I'm getting a feeli

Re: [ceph-users] CephFS deletion performance

2019-09-18 Thread Hector Martin
d some strays are never getting cleaned up. I guess I'll see once I catch up on snapshot deletions.

Re: [ceph-users] ceph df shows global-used more than real data size

2019-12-24 Thread Hector Martin
, you need to reduce bluestore_min_alloc_size.
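Note that min_alloc_size is baked into an OSD at creation time, so lowering it only affects OSDs provisioned afterwards; a minimal ceph.conf sketch (value illustrative):

    [osd]
    bluestore_min_alloc_size_hdd = 4096

Existing OSDs would have to be redeployed to pick the new value up.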

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin "marcan"
hould zap these 10 osds and start over although at
> this point I am afraid even zapping may not be a simple task
>
> On Tue, Nov 6, 2018 at 3:44 PM, Hector Martin wrote:
>> On 11/7/18 5:27 AM, Hayashida, Mami wrote:
>> > 1. Stopped osd.60-69:
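For reference, zapping an OSD's device (including any LVs ceph-volume created on it) for a clean re-deploy is typically done with (device name illustrative; destructive):

    ceph-volume lvm zap /dev/sdh --destroy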