Hi all!
I’m new to CephFS. My test file system uses a replicated pool on NVMe SSDs for
metadata and an erasure coded pool on HDDs for data. All OSDs use BlueStore.
I used Ceph version 16.2.6 for all daemons - created with this version and
running this version. The Linux kernel that I used f
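For context, a minimal sketch of how such a layout can be created (pool names, PG counts and the EC parameters are illustrative placeholders, not my exact setup):

$ ceph osd crush rule create-replicated repl_nvme default host nvme   # replicated rule limited to NVMe OSDs
$ ceph osd pool create cephfs_meta 32 32 replicated repl_nvme         # metadata pool on the SSDs
$ ceph osd erasure-code-profile set ec_hdd k=4 m=2 crush-device-class=hdd crush-failure-domain=host
$ ceph osd pool create cephfs_data 128 128 erasure ec_hdd             # data pool on the HDDs
$ ceph osd pool set cephfs_data allow_ec_overwrites true              # needed before CephFS can use an EC pool
$ ceph fs new testfs cephfs_meta cephfs_data --force                  # --force because the default data pool is EC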
Hi Zakhar,
I don't have much experience with Ceph, so you should read my words with
reasonable skepticism.
If your failure domain should be the host level, then k=4, m=2 is your most
space-efficient option for 6 servers that still allows you to do write IO when
one of the servers has failed. Assumin
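If I understand the EC defaults correctly, a sketch of such a profile would look like this (profile and pool names are placeholders):

$ ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
$ ceph osd pool create ec_data 128 128 erasure ec42
# Raw-space overhead is (k+m)/k = 6/4 = 1.5x. With 6 hosts, each of the 6
# shards lands on a different host; with the default min_size = k+1 = 5,
# the pool keeps accepting writes while one host is down.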
Hi Stefan,
thank you for sharing your experience! After I read your mail I did some more
testing, and for me the issue is strictly related to snapshots and perfectly
reproducible. However, I made two new observations that were not clear to me
until now.
First, snapshots that were created before
Hi all,
after a reboot of the cluster, 3 OSDs cannot be started. The OSDs exit with the
following error message:
2021-12-21T01:01:02.209+0100 7fd368cebf00 4 rocksdb:
[db_impl/db_impl.cc:396] Shutdown: canceling all background work
2021-12-21T01:01:02.209+0100 7fd368cebf00 4 rock
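In case it is useful: an offline check of an affected OSD can be done with ceph-bluestore-tool while the daemon is stopped (the OSD id is a placeholder):

$ systemctl stop ceph-osd@3                                         # make sure the daemon is down
$ ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-3    # verify the BlueStore label is readable
$ ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-3          # consistency check of the OSD's BlueStore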
line at dmesg
Thanks,
Sebastian
> On 21.12.2021, at 19:29, c...@elchaka.de wrote:
>
> Hi,
> This
> > fsck failed: (5) Input/output error
>
> Sounds like a hardware issue.
> Did you have a look at "dmesg"?
>
> Hth
> Mehmet
>
> On 21 Decemb
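For what it is worth, a quick way to scan the kernel log for disk errors (the device name is a placeholder):

$ dmesg -T | grep -iE 'i/o error|ata[0-9]+|medium error'
$ smartctl -a /dev/sdb    # SMART status of the suspect drive, if smartmontools is installed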
ith a raised debug level for that specific OSD. The major
> problem with this bug debugging is that we can see its consequences - but we
> have no clue about what was happening when the actual corruption happened. Hence
> we need to reproduce that somehow. So please let me know if we can use your
Hi Mazzystr,
thank you very much for your suggestion! The OSDs did find the bluestore block
device and I do not use any USB drives. All failed OSDs are on SATA drives
connected to AMD CPUs / chipsets.
It now seems clear that the problem is that one of the RocksDBs is corrupted on
each of the fa
Hi Christoph,
I do not have any answer for you, but I find the question very interesting.
I wonder if it is possible to let the HDDs sleep or if the OSD daemons prevent
a halt of the spindle motors. Or can it even create some problems for the OSD
daemon if the HDD spins down?
However, it shoul
g new snapshots unexpectedly. It may have something to do with reboots
of the CephFS MDS, but I’m not sure.
Best regards,
Sebastian
> On 24.12.2021, at 13:05, Igor Fedotov wrote:
>
> Hey Sebastian,
>
> On 12/22/2021 1:53 AM, Sebastian Mazza wrote:
>>
>>> 9)
> On 21.01.2022, at 14:36, Marc wrote:
>
>>
>> I wonder if it is possible to let the HDDs sleep or if the OSD daemons
>> prevent a halt of the spindle motors. Or can it even create some problems
>> for the OSD daemon if the HDD spins down?
>> However, it should be easy to check on a cluster
> When having software raid solutions, I was also thinking about spinning them
> down and researching how to do this. I can't exactly remember, but a simple
> hdparm/sdparm command was not sufficient. Now I am a bit curious if you solved
> this problem with mdadm / software raid?
>
On the first
> The OSD daemon would crash I would assume.
Since I don't understand why the OSDs should crash just because a disk goes
into standby, I just tried it now.
The result is very unspectacular and fits perfectly with Gregory's great
explanation.
The drive goes into standby for around 2 or 3 seconds an
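For reference, this kind of test can be done with hdparm (the device name is a placeholder; I'm assuming hdparm is available on the OSD host):

$ hdparm -y /dev/sdc    # put the drive into standby immediately
$ hdparm -C /dev/sdc    # report the current power state (active/idle vs. standby)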
Hey Igor,
thank you for your response and your suggestions.
>> I've tried to simulate every imaginable load that the cluster might have
>> done before the three OSDs crashed.
>> I rebooted the servers many times while the cluster was under load. If more
>> than a single node was rebooted at the s
> Hmm, I see on the man page
>
> -B
> Get/set Advanced Power Management feature, if the drive
> supports it. A low value means aggressive power management
> and a high value means better performance. Possible
> settings range from values 1 through
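As a concrete illustration of those -B values (the device name is a placeholder): values 1-127 permit spin-down, 128-254 do not, and 255 disables APM on drives that support it.

$ hdparm -B /dev/sdc        # read the current APM level
$ hdparm -B 127 /dev/sdc    # highest level that still allows spin-down
$ hdparm -B 254 /dev/sdc    # maximum performance, no spin-down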
Hey Igor,
thank you for your response!
>>
>> Do you suggest disabling the HDD write caching and / or
>> bluefs_buffered_io for production clusters?
>>
> Generally the upstream recommendation is to disable disk write caching; there
> were multiple complaints that it might negatively impact the perf
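In case it helps, a sketch of what both changes would look like (the device name is a placeholder; note the hdparm setting is not persistent across reboots by itself):

$ hdparm -W 0 /dev/sdc                            # disable the drive's volatile write cache
$ ceph config set osd bluefs_buffered_io false    # cluster-wide; may need an OSD restart depending on the release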
I have a problem with the snap_schedule MGR module. It seems to forget at least
parts of the configuration after the active MGR is restarted.
The following CLI commands (lines starting with ‘$’) and their stdout (lines
starting with ‘>’) demonstrate the problem.
$ ceph fs snap-schedule add /shar
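A hedged sketch of the kind of sequence I mean (path and schedule are placeholders, not my exact values):

$ ceph fs snap-schedule add /some/dir 1h       # create an hourly schedule
$ ceph fs snap-schedule status /some/dir       # schedule shows up as expected
$ ceph mgr fail                                # fail over to a standby MGR
$ ceph fs snap-schedule status /some/dir       # parts of the configuration are gone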
Hey Venky,
thank you very much for your response!
> It would help if you could enable debug log for ceph-mgr, repeat the
> steps you mention above and upload the log in the tracker.
I have already collected log files after enabling the debug log by `ceph config
set mgr mgr/snap_schedule/log_le
:07, Venky Shankar wrote:
>
> On Fri, Jan 28, 2022 at 3:03 PM Sebastian Mazza wrote:
>>
>> Hey Venky,
>>
>> thank you very much for your response!
>>
>>> It would help if you could enable debug log for ceph-mgr, repeat the
>>> steps you ment
Hi Igor,
it happened again. One of the OSDs that crashed last time has a corrupted
RocksDB again. Unfortunately I do not have debug logs from the OSDs again. I
was collecting hundreds of gigabytes of OSD debug logs over the last two
months. But this week I disabled the debug logging because I d
.02.2022, at 12:18, Igor Fedotov wrote:
>
> Hi Sebastian,
>
> could you please share failing OSD startup log?
>
>
> Thanks,
>
> Igor
>
> On 2/20/2022 5:10 PM, Sebastian Mazza wrote:
>> Hi Igor,
>>
>> it happened again. One of the OSD
Hi Igor,
today (21-02-2022) at 13:49:28.452+0100, I crashed OSD 7 again. And this
time I have logs with “debug bluefs = 20” and “debug bdev = 20” for every OSD
in the cluster! It was the OSD with the ID 7 again. So that HDD has now failed
for the third time! Coincidence? Probably not…
The import
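For anyone who wants to capture the same data: the ceph.conf entries quoted above can also be set through the config database, e.g.:

$ ceph config set osd debug_bluefs 20
$ ceph config set osd debug_bdev 20
$ ceph config rm osd debug_bluefs    # revert afterwards, the logs grow very quickly
$ ceph config rm osd debug_bdev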
rits? E.g. something like a disk wiping procedure which writes
> all-zeros to an object followed by object truncate or removal comes to my
> mind. If you can identify something like that - could you please collect OSD
> log for such an operation (followed by OSD restart) with debug-bluest
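If it helps, a hedged sketch of such an artificial load (pool, object and OSD names are placeholders), run with bluestore debug logging raised before the OSD restart:

$ ceph config set osd.7 debug_bluestore 20     # raise logging on the suspect OSD
$ dd if=/dev/zero of=/tmp/zeros bs=4M count=8  # build an all-zero payload
$ rados -p testpool put wipe_obj /tmp/zeros    # write the zero-filled object
$ rados -p testpool truncate wipe_obj 0        # truncate it ...
$ rados -p testpool rm wipe_obj                # ... then remove it
$ systemctl restart ceph-osd@7                 # restart the OSD and collect its log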
memtest86
> or https://github.com/martinwhitaker/pcmemtest (which is a fork of
> memtest86+). Ignore the suggestion if you have ECC RAM.
>
> Tue, 22 Feb 2022 at 15:45, Igor Fedotov :
>>
>> Hi Sebastian,
>>
>> On 2/22/2022 3:01 AM, Sebastian Mazza wrote:
>
it down. I'm really looking forward to your
interpretation of the logs.
Best Regards,
Sebastian
> On 22.02.2022, at 11:44, Igor Fedotov wrote:
>
> Hi Sebastian,
>
> On 2/22/2022 3:01 AM, Sebastian Mazza wrote:
>> Hey Igor!
>>
>>
>>> thanks a
logs contain what you need?
Please tell me if you need more data from the OSD. If not, I would rebuild it.
Best wishes,
Sebastian
> On 26.02.2022, at 00:12, Igor Fedotov wrote:
>
> Sebastian,
>
> On 2/25/2022 7:17 PM, Sebastian Mazza wrote:
>> Hi Igor,
>>
>>>
it some
> benchmarking tool or any other artificial load generator? If so could you
> share job descriptions or scripts, if any?
>
>
> Thanks,
>
> Igor
>
>
> On 3/10/2022 10:36 PM, Sebastian Mazza wrote:
>> Hi Igor!
>>
>> I hope I've cr
cket (I missed that I already had one):
> https://tracker.ceph.com/issues/54547
>
> So please expect all the related news there.
>
>
> Kind regards,
>
> Igor
>
>
>
> On 3/14/2022 5:54 PM, Sebastian Mazza wrote:
>> Hello Igor,
>>
>> I'm glad
time though...
>
>
> Thanks,
>
> Igor
>
> On 3/14/2022 9:33 PM, Sebastian Mazza wrote:
>> Hi Igor,
>>
>> great that you were able to reproduce it!
>>
>> I did read your comments on issue #54547. Am I right that I probably
>> have hundreds