On Tue, Sep 25, 2018 at 2:23 AM Andras Pataki
wrote:
>
> The whole cluster, including ceph-fuse is version 12.2.7.
>
If this issue happens again, please set the "debug_objectcacher" option of
ceph-fuse to 15 (for 30 seconds) and send the ceph-fuse log to us.
Regards
Yan, Zheng
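A minimal sketch of how that debug bump could be applied through the ceph-fuse
admin socket (the socket path and client name below are assumptions and depend
on your setup):

  # raise objectcacher logging on the running ceph-fuse client
  ceph daemon /var/run/ceph/ceph-client.admin.asok config set debug_objectcacher 15
  # reproduce the problem for ~30 seconds, then lower it again
  ceph daemon /var/run/ceph/ceph-client.admin.asok config set debug_objectcacher 0

The resulting messages end up in the ceph-fuse log, typically under /var/log/ceph/.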
> A
> On Sep 25, 2018, at 20:24, Ilya Dryomov wrote:
>
> On Tue, Sep 25, 2018 at 2:05 PM 刘 轩 wrote:
>>
>> Hi Ilya:
>>
>> I have some questions about the commit
>> d84b37f9fa9b23a46af28d2e9430c87718b6b044 about the function
> handle_cap_export. In which case, issued != cap->implemented may occur
Thanks for the log. I think it's caused by http://tracker.ceph.com/issues/36192
Regards
Yan, Zheng
On Wed, Sep 26, 2018 at 1:51 AM Andras Pataki
wrote:
>
> Hi Zheng,
>
> Here is a debug dump:
> https://users.flatironinstitute.org/apataki/public_www/7f0011f676112cd4/
> I
Does the log show which assertion was triggered?
Yan, Zheng
On Mon, Oct 8, 2018 at 9:20 AM Alfredo Daniel Rezinovsky
wrote:
>
> Cluster with 4 nodes
>
> node 1: 2 HDDs
> node 2: 3 HDDs
> node 3: 3 HDDs
> node 4: 2 HDDs
>
> After a problem with upgrade from 13.2.1
On Fri, Oct 5, 2018 at 6:57 PM Burkhard Linke
wrote:
>
> Hi,
>
>
> a user just stumbled across a problem with directory content in cephfs
> (kernel client, ceph 12.2.8, one active, one standby-replay instance):
>
>
> root@host1:~# ls /ceph/sge-tmp/db/work/06/ | wc -l
> 224
> root@host1:~# uname -a
Sorry, this is caused by a wrong backport. Downgrading the mds to 13.2.1 and
marking the mds repaired can resolve this.
Yan, Zheng
On Sat, Oct 6, 2018 at 8:26 AM Sergey Malinin wrote:
>
> Update:
> I discovered http://tracker.ceph.com/issues/24236 and
> https://github.com/ceph/ceph/pull/22146
Sorry, there is a bug in 13.2.2 that breaks compatibility of the purge queue
disk format. Please downgrade the mds to 13.2.1, then run 'ceph mds
repaired cephfs_name:0'.
Regards
Yan, Zheng
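A minimal sketch of that recovery sequence, assuming a single-rank filesystem
named cephfs_name (the downgrade itself depends on how the MDS packages are
installed):

  # after downgrading the ceph-mds packages/daemons to 13.2.1:
  ceph mds repaired cephfs_name:0
  # confirm the rank comes back as active
  ceph fs status cephfs_name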
On Mon, Oct 8, 2018 at 9:20 AM Alfredo Daniel Rezinovsky
wrote:
>
> Cluster with 4 nodes
>
> n
There is a bug in the v13.2.2 mds which causes decoding of the purge queue to
fail. If the mds is already in a damaged state, please downgrade the mds to
13.2.1, then run 'ceph mds repaired fs_name:damaged_rank'.
Sorry for all the trouble I caused.
Yan, Zheng
un 'ceph mds repaired ...' ?
>
> Thanks for your help
>
> On 07/10/18 23:18, Yan, Zheng wrote:
> > Sorry, there is a bug in 13.2.2 that breaks compatibility of the purge queue
> > disk format. Please downgrade the mds to 13.2.1, then run 'ceph mds
> > repaired cephfs_name:0
I can
>> downgrade at all? I am using ceph with docker deployed with
>> ceph-ansible. I wonder if I should push a downgrade or basically wait for
>> the fix. I believe a fix needs to be provided.
>>
>> Thank you,
>>
>> On 10/7/2018 9:30 PM, Yan, Zheng wrote:
32 pg crush rule replicated_ssd
>
> test done by fio: fio --randrepeat=1 --ioengine=libaio --direct=1
> --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k
> --iodepth=64 --size=1G --readwrite=randrw --rwmixread=75
>
Kernel version? Maybe the cephfs driver in your kernel does not
On Mon, Oct 8, 2018 at 3:38 PM Tomasz Płaza wrote:
>
>
> On 08.10.2018 at 09:21, Yan, Zheng wrote:
> > On Mon, Oct 8, 2018 at 1:54 PM Tomasz Płaza wrote:
> >> Hi,
> >>
> >> Can someone please help me, how do I improve performance on our CephFS
at recursive
> scrub.
> After that I only mounted the fs read-only to backup the data.
> Would anything even work if I had mds journal and purge queue truncated?
>
Did you back up the whole metadata pool? Did you make any modification
to the original metadata pool? If you did, what mo
On Mon, Oct 8, 2018 at 5:43 PM Sergey Malinin wrote:
>
>
>
> > On 8.10.2018, at 12:37, Yan, Zheng wrote:
> >
> > On Mon, Oct 8, 2018 at 4:37 PM Sergey Malinin wrote:
> >>
> >> What additional steps need to be taken in order to (try to) regain acce
On Mon, Oct 8, 2018 at 9:07 PM Alfredo Daniel Rezinovsky
wrote:
>
>
>
> On 08/10/18 09:45, Yan, Zheng wrote:
> > On Mon, Oct 8, 2018 at 6:40 PM Alfredo Daniel Rezinovsky
> > wrote:
> >> On 08/10/18 07:06, Yan, Zheng wrote:
> >>> On Mon,
On Mon, Oct 8, 2018 at 9:46 PM Alfredo Daniel Rezinovsky
wrote:
>
>
>
> On 08/10/18 10:20, Yan, Zheng wrote:
> > On Mon, Oct 8, 2018 at 9:07 PM Alfredo Daniel Rezinovsky
> > wrote:
> >>
> >>
> >> On 08/10/18 09:45, Yan, Zheng wrote:
On Tue, Oct 9, 2018 at 5:39 AM Alfredo Daniel Rezinovsky
wrote:
>
> It seems my purge_queue journal is damaged. Even if I reset it keeps
> damaged.
>
> What does 'inotablev mismatch' mean?
>
>
> 2018-10-08 16:40:03.144 7f05b6099700 -1 log_channel(cluster) log [ERR] :
> journal replay inotablev mismatch
So far there is no way to do this.
On Thu, Oct 11, 2018 at 4:54 PM Felix Stolte wrote:
>
> Hey folks,
>
> I use nfs-ganesha to export cephfs to nfs. nfs-ganesha can talk to
> cephfs via libcephfs so there is no need for mounting cephfs manually. I
> also like to use directory quotas from cephfs. An
aggressive "read-ahead" play a role?
>
> Other thoughts on what root cause on the different behaviour could be?
>
> Clients are using 4.15 kernel.. Anyone aware of newer patches in this area
> that could impact ?
>
how many cephfs mounts that a
with
> >> "dotlock".
> >>
> >> It ends with an unresponsive MDS (100% CPU hang, switching to another MDS
> >> but always staying at 100% CPU usage). I can't even use the admin socket
> >> when the MDS is hung.
> >>
For issue like
On Thu, Oct 18, 2018 at 3:35 PM Florent B wrote:
>
> I'm not familiar with gdb, what do I need to do ? Install "-gdb" version
> of ceph-mds package ? Then ?
> Thank you
>
Install ceph with debug info and install gdb, then run 'gdb attach '
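A rough sketch of such a gdb session, assuming the debuginfo packages are
installed and substituting the real ceph-mds pid (everything below is a
placeholder, not from the original mail):

  gdb -p $(pidof ceph-mds)
  # inside gdb, dump all thread backtraces, then detach without killing the MDS
  (gdb) thread apply all bt
  (gdb) detach
  (gdb) quit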
> On 18/10/2018
-p > /sys/kernel/debug/dynamic_debug/control;
time for i in $(seq 0 3); do echo "dd if=test.$i.0 of=/dev/null
bs=1M"; done | parallel -j 4
If you can reproduce this issue, please send the kernel log to us.
Regards
Yan, Zheng
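The first line of the quoted reproducer is cut off; a hedged guess at the usual
way to enable ceph kernel-client debug output before running it would be:

  # turn on dynamic debug for the ceph kernel modules (assumed expressions)
  echo "module ceph +p" > /sys/kernel/debug/dynamic_debug/control
  echo "module libceph +p" > /sys/kernel/debug/dynamic_debug/control
  # run the parallel dd test, then capture dmesg / the kernel log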
On Mon, Oct 8, 2018 at 2:57 PM Dylan McCulloch wrote:
>
> Hi all,
>
>
> We have identified some unexpected blocking behaviour by the ceph-fs kernel
> client.
>
>
> When performing 'rm' on large files (100+GB), there appears to be a
> significant delay of 10 seconds or more, before a 'stat' opera
No action is required; the mds fixes this type of error automatically.
On Fri, Oct 19, 2018 at 6:59 PM Burkhard Linke
wrote:
>
> Hi,
>
>
> upon failover or restart, our MDS complains that something is wrong with
> one of the stray directories:
>
>
> 2018-10-19 12:56:06.442151 7fc908e2d700 -1 log_channel(c
illed with zeros.
>
Was your cluster near full when the issue happened?
Regards
Yan, Zheng
> I tried kernel client 4.x and ceph-fuse client with same result.
>
> I'm using erasure for cephfs data pool, cache tier and my storage is
> bluestore and filestore mixed.
>
>
On Sat, Mar 3, 2018 at 6:17 PM, Jan Pekař - Imatic wrote:
> On 3.3.2018 11:12, Yan, Zheng wrote:
>>
>> On Tue, Feb 27, 2018 at 2:29 PM, Jan Pekař - Imatic
>> wrote:
>>>
>>> I think I hit the same issue.
>>> I have corrupted data on cephf
to debug or what should I do to help to find the problem?
>
> With regards
> Jan Pekar
>
>
> On 14.12.2017 15:41, Yan, Zheng wrote:
>>
>> On Thu, Dec 14, 2017 at 8:52 PM, Florent B wrote:
>>>
>>> On 14/12/2017 03:38, Yan, Zheng wrote:
ceph-dencoder can dump individual metadata. Most cephfs metadata are
stored in omap headers/values. You can write a script that fetches
metadata objects' omap headers/values and dumps them using
ceph-dencoder.
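A rough sketch of such a script, with the pool name, object name and key used
as placeholders (a directory object's omap header holds the encoded fnode;
adjust the dencoder type to whatever structure you actually dump):

  #!/bin/sh
  POOL=metadata                 # assumed metadata pool name
  OBJ=10000000000.00000000      # assumed directory object (inode.frag)
  # dump the omap header (directory fnode)
  rados -p $POOL getomapheader $OBJ /tmp/hdr.bin
  ceph-dencoder type fnode_t import /tmp/hdr.bin decode dump_json
  # list dentry keys and fetch one raw value for further decoding
  rados -p $POOL listomapkeys $OBJ
  rados -p $POOL getomapval $OBJ somefile_head /tmp/dentry.bin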
On Fri, Mar 9, 2018 at 10:09 PM, Pavan, Krish wrote:
> Hi All,
>
> We have cephfs with lar
On Wed, Mar 14, 2018 at 3:17 AM, David C wrote:
> Hi All
>
> I have a Samba server that is exporting directories from a Cephfs Kernel
> mount. Performance has been pretty good for the last year but users have
> recently been complaining of short "freezes", these seem to coincide with
> MDS related
o reduce mds_log_events_per_segment back to its original
> value (1024), because performance is not optimal, and stutters a bit
> too much.
>
> Thanks for your advice!
>
This seems like a load balancer bug. Improving the load balancer is at the
top of our todo list.
Regards
Yan, Zheng
Did the fs have lots of mounts/umounts? We recently found a memory leak
bug in that area: https://github.com/ceph/ceph/pull/20148
Regards
Yan, Zheng
On Thu, Mar 22, 2018 at 5:29 PM, Alexandre DERUMIER wrote:
> Hi,
>
> I'm running cephfs since 2 months now,
>
> and my active
e else run into this? Am I missing something obvious?
>
ceph-fuse does permission checks according to the local host's configuration
of supplementary groups. That's why you see this behavior.
Regards
Yan, Zheng
> Thanks!
> Josh
>
>
On Fri, Mar 23, 2018 at 9:50 PM, Josh Haft wrote:
> On Fri, Mar 23, 2018 at 12:14 AM, Yan, Zheng wrote:
>>
>> On Fri, Mar 23, 2018 at 5:14 AM, Josh Haft wrote:
>> > Hello!
>> >
>> > I'm running Ceph 12.2.2 with one primary and one standby MDS.
On Sat, Mar 24, 2018 at 11:34 AM, Josh Haft wrote:
>
>
> On Fri, Mar 23, 2018 at 8:49 PM, Yan, Zheng wrote:
>>
>> On Fri, Mar 23, 2018 at 9:50 PM, Josh Haft wrote:
>> > On Fri, Mar 23, 2018 at 12:14 AM, Yan, Zheng wrote:
>> >>
>> >
On Fri, Mar 23, 2018 at 7:45 PM, Perrin, Christopher (zimkop1)
wrote:
> Hi,
>
> Last week out MDSs started failing one after another, and could not be
> started anymore. After a lot of tinkering I found out that MDSs crashed after
> trying to rejoin the Cluster. The only Solution I found that, l
upport/dir/
> foo
>
This is expected behaviour. When fuse_default_permissions=0, all
permission checks are done in ceph-fuse. In your case, ceph-fuse can't
find which groups the request initiator is in. This is due to a limitation
of the FUSE API. I have no idea how to fix it.
Regards
On Thu, Mar 29, 2018 at 3:16 PM, Zhang Qiang wrote:
> Hi,
>
> Ceph version 10.2.3. After a power outage, I tried to start the MDS
> deamons, but they stuck forever replaying journals, I had no idea why
> they were taking that long, because this is just a small cluster for
> testing purpose with on
ed to new format. Format conversion
requires traversing the whole filesystem tree. Not easy to
implement.
2. Ask users to delete all old snapshots before upgrading to mimic, and
make the mds just ignore old-format snaprealms.
Regards
Yan, Zheng
On Wed, Apr 11, 2018 at 10:10 AM, Sage Weil wrote:
> On Tue, 10 Apr 2018, Patrick Donnelly wrote:
>> On Tue, Apr 10, 2018 at 5:54 AM, John Spray wrote:
>> > On Tue, Apr 10, 2018 at 1:44 PM, Yan, Zheng wrote:
>> >> Hello
>> >>
>> >> To simpl
On Wed, Apr 11, 2018 at 3:34 AM, Gregory Farnum wrote:
> On Tue, Apr 10, 2018 at 5:54 AM, John Spray wrote:
>> On Tue, Apr 10, 2018 at 1:44 PM, Yan, Zheng wrote:
>>> Hello
>>>
>>> To simplify snapshot handling in multiple active mds setup, we changed
massive' slow down.
>>>
>>> This can probably be influenced by tuning the MDS balancer settings, but
>>> I am not sure yet where to start, any suggestions?
>>
>> Well, you can disable directory fragmentation, but if it's happening
>> automatic
On Sat, Apr 14, 2018 at 9:23 PM, Alexandre DERUMIER wrote:
> Hi,
>
> Still leaking again after update to 12.2.4, around 17G after 9 days
>
>
>
>
> USER PID %CPU %MEMVSZ RSS TTY STAT START TIME COMMAND
>
> ceph 629903 50.7 25.9 17473680 17082432 ? Ssl avril05 6498:21
>
ps - these are
> all 12.2.4 Fuse clients.
>
> Any ideas how to find the cause?
> It only happens since recently, and under high I/O load with many metadata
> operations.
>
Sounds like a bug in the readdir cache. Could you try the attached patch?
Regards
Yan, Zheng
> Cheers,
On Fri, Apr 27, 2018 at 11:49 PM, Oliver Freyermuth
wrote:
> Dear Yan Zheng,
>
> Am 27.04.2018 um 15:32 schrieb Yan, Zheng:
>> On Fri, Apr 27, 2018 at 7:10 PM, Oliver Freyermuth
>> wrote:
>>> Dear Yan Zheng,
>>>
>>> Am 27.04.2018 um 02:58 schrieb Y
On Fri, Apr 27, 2018 at 11:49 PM, Oliver Freyermuth
wrote:
> Dear Yan Zheng,
>
> Am 27.04.2018 um 15:32 schrieb Yan, Zheng:
>> On Fri, Apr 27, 2018 at 7:10 PM, Oliver Freyermuth
>> wrote:
>>> Dear Yan Zheng,
>>>
>>> Am 27.04.2018 um 02:58 schrieb Y
On Sat, Apr 28, 2018 at 10:25 AM, Oliver Freyermuth
wrote:
> Am 28.04.2018 um 03:55 schrieb Yan, Zheng:
>> On Fri, Apr 27, 2018 at 11:49 PM, Oliver Freyermuth
>> wrote:
>>> Dear Yan Zheng,
>>>
>>> Am 27.04.2018 um 15:32 schrieb Yan, Zheng:
Try running "rados -p  touch 1002fc5d22d."
before restarting the mds.
On Thu, May 3, 2018 at 2:31 AM, Pavan, Krish wrote:
>
>
> We have ceph 12.2.4 cephfs with two active MDS server and directory are
> pinned to MDS servers. Yesterday MDS server crashed. Once all fuse clients
> have unmounted,
ktrace
>
> * This left me with the attached backtraces (which I think are wrong as I
> see a lot of ?? yet gdb says
> /usr/lib/debug/.build-id/1d/23dc5ef4fec1dacebba2c6445f05c8fe6b8a7c.debug was
> loaded)
>
> kh10-8 mds backtrace -- https://pastebin.com/bwqZGcfD
> kh09-8 mds backtrace -- https://pastebin.co
It's controlled by the kernel. Basically, the more memory a program uses,
the more it uses swap. If you don't like this, try tuning
/proc/sys/vm/swappiness.
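A small illustration (the value 10 is only an example, not a recommendation
from the thread):

  # show the current setting
  sysctl vm.swappiness
  # make the kernel less eager to swap; not persistent across reboots
  sysctl -w vm.swappiness=10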
On Sat, May 5, 2018 at 12:18 PM, Marc Roos wrote:
>
> Should I then start increasing the mds_cache_memory_limit?
>
> PID=3909094 - Swap used: 8
On Mon, May 14, 2018 at 5:37 PM, Josef Zelenka
wrote:
> Hi everyone, we've encountered an unusual thing in our setup(4 nodes, 48
> OSDs, 3 monitors - ceph Jewel, Ubuntu 16.04 with kernel 4.4.0). Yesterday,
> we were doing a HW upgrade of the nodes, so they went down one by one - the
> cluster was
On Mon, May 21, 2018 at 3:22 AM, Philip Poten wrote:
> Hi,
>
> I managed to mess up the cache pool on an erasure coded cephfs:
>
> - I split pgs on the cache pool, and got a stray/unknown pg somehow
> - added a second cache pool in the hopes that I'll be allowed to remove the
> first, broken one
l. Is there something else I should try to get
> more info? I was hoping for something closer to a python trace where it says
> a variable is a different type or a missing delimiter. womp. I am definitely
> out of my depth but now is a great time to
ive": 0,
> "command_send": 0,
> "command_resend": 0,
> "map_epoch": 3907,
> "map_full": 0,
> "map_inc": 601,
> "osd_sessions": 18,
> "osd_session_open": 20,
; "op_pg": 0,
>> "osdop_stat": 1518341,
>> "osdop_create": 4314348,
>> "osdop_read": 79810,
>> "osdop_write": 59151421,
>> "osdop_writefull": 237358,
>> "osdop_writesame": 0,
&
>>>> "handle_slave_request": 0,
>>>> "req_create": 4116952,
>>>> "req_getattr": 18696543,
>>>> "req_getfilelock": 0,
>>>> "req_link": 6625,
>>>> "req_lookup": 142582473
On Fri, May 25, 2018 at 4:28 PM, Yan, Zheng wrote:
> I found some memory leak. could you please try
> https://github.com/ceph/ceph/pull/22240
>
The leak only affects multiple active mds setups; I think it's unrelated to your issue.
>
> On Fri, May 25, 2018 at 1:49 PM, Alexandr
end_bytes": 605992324653,
>> "op_resend": 22,
>> "op_reply": 197932421,
>> "op": 197932421,
>> "op_r": 116256030,
>> "op_w": 81676391,
>> "op_rmw": 0,
>> "op_pg": 0,
>> "osdop
Could you try the patch https://github.com/ceph/ceph/pull/22240/files?
The leakage of MMDSBeacon messages could explain your issue.
Regards
Yan, Zheng
On Mon, May 28, 2018 at 12:06 PM, Alexandre DERUMIER
wrote:
>>>could you send me full output of dump_mempools
>
> # ceph dae
Single or multiple active mds? Were there "Client xxx failing to
respond to capability release" health warnings?
On Mon, May 28, 2018 at 10:38 PM, Oliver Freyermuth
wrote:
> Dear Cephalopodians,
>
> we just had a "lockup" of many MDS requests, and also trimming fell behind,
> for over 2 days.
> O
g every single time that happens.
>
>
> From: ceph-users on behalf of Yan, Zheng
>
> Sent: Tuesday, 29 May 2018 9:53:43 PM
> To: Oliver Freyermuth
> Cc: Ceph Users; Peter Wienemann
> Subject: Re: [ceph-users] Ceph-fuse getting stuck wit
(revoke)"
warnings in the cluster log. Please send them to me if there were.
> Cheers,
> Oliver
>
> Am 30.05.2018 um 03:25 schrieb Yan, Zheng:
>> I could be http://tracker.ceph.com/issues/24172
>>
>>
>> On Wed, May 30, 2018 at 9:01 AM, Linh Vu wrote:
>>>
On Wed, May 30, 2018 at 5:17 PM, Oliver Freyermuth
wrote:
> Am 30.05.2018 um 10:37 schrieb Yan, Zheng:
>> On Wed, May 30, 2018 at 3:04 PM, Oliver Freyermuth
>> wrote:
>>> Hi,
>>>
> >>> in our case, there's only a single active MDS
>>> (+1
On Wed, May 30, 2018 at 5:17 PM, Oliver Freyermuth
wrote:
> Am 30.05.2018 um 10:37 schrieb Yan, Zheng:
>> On Wed, May 30, 2018 at 3:04 PM, Oliver Freyermuth
>> wrote:
>>> Hi,
>>>
>>> ij our case, there's only a single active MDS
>>> (+1
Bryan Henderson 于 2018年6月2日周六 10:23写道:
> >Luckily; it's not. I don't remember if the MDS maps contain entirely
> >ephemeral data, but on the scale of cephfs recovery scenarios that's just
> >about the easiest one. Somebody would have to walk through it; you
> probably
> >need to look up the table
1524578400.M820820P705532.dovecot-15-hgjlx,S=425674,W=431250:2,RS"
> },
> {
> "damage_type": "backtrace",
> "id": 121083841,
> "ino": 1099515215027,
> "path":
> "/path/to/mails/other_user/Maildir/.Junk/cur/1528189
Tob 于 2018年6月6日周三 22:21写道:
> Hi!
>
> Thank you for your reply.
>
> I just did:
>
> > The correct commands should be:
> >
> > ceph daemon scrub_path / force recursive repair
> > ceph daemon scrub_path '~mdsdir' force recursive repair
>
> They returned instantly and in the mds' logfile only the f
h
> snapshots in an active/standby MDS environment. It seems like a silly
> question since it is considered stable for multi-mds, but better safe than
> sorry.
>
Snapshots in an active/standby MDS environment are also supported.
Yan, Zheng
On Thu, Jun 7, 2018 at 2:44 PM, Tobias Florek wrote:
> Hi!
>
> Thank you for your help! The cluster is running healthily for a day now.
>
> Regarding the problem, I just checked in the release notes [1] and on
> docs.ceph.com and did not find the right invocation after an upgrade.
> Maybe that oug
On Wed, Jun 13, 2018 at 7:06 PM Alessandro De Salvo
wrote:
>
> Hi,
>
> I'm trying to migrate a cephfs data pool to a different one in order to
> reconfigure with new pool parameters. I've found some hints but no
> specific documentation to migrate pools.
>
> I'm currently trying with rados export
On Wed, Jun 13, 2018 at 3:34 AM Webert de Souza Lima
wrote:
>
> hello,
>
> is there any performance impact on cephfs for using file layouts to bind a
> specific directory in cephfs to a given pool? Of course, such pool is not the
> default data pool for this cephfs.
>
For each file, no matter w
On Wed, Jun 13, 2018 at 9:35 PM Alessandro De Salvo
wrote:
>
> Hi,
>
>
> Il 13/06/18 14:40, Yan, Zheng ha scritto:
> > On Wed, Jun 13, 2018 at 7:06 PM Alessandro De Salvo
> > wrote:
> >> Hi,
> >>
> >> I'm trying to migrate a cephfs dat
On Sat, Jun 16, 2018 at 12:23 PM Hector Martin wrote:
>
> I'm at a loss as to what happened here.
>
> I'm testing a single-node Ceph "cluster" as a replacement for RAID and
> traditional filesystems. 9 4TB HDDs, one single (underpowered) server.
> Running Luminous 12.2.5 with BlueStore OSDs.
>
> I
ubuntu/mirror/mirror.uni-trier.de/ubuntu/pool/universe/g/gst-fluendo-mp3
/damaged_dirs
ceph daemon mds. journal flush
ceph daemon mds. journal flush
ceph daemon mds. scrub_path /damaged_dirs force recursive repair
Now you should be able to rmdir directories in /damaged_dirs/
Regards
Yan, Zheng
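Spelled out with placeholders, and assuming the usual admin-socket spelling
(mds.a stands for the active MDS name, which is cut off in the commands above):

  ceph daemon mds.a flush journal
  ceph daemon mds.a scrub_path /damaged_dirs force recursive repair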
On Wed, Jun 27, 2018 at 6:16 PM Dennis Kramer (DT) wrote:
>
> Hi,
>
> Currently i'm running Ceph Luminous 12.2.5.
>
> This morning I tried running Multi MDS with:
> ceph fs set max_mds 2
>
> I have 5 MDS servers. After running above command,
> I had 2 active MDSs, 2 standby-active and 1 standby.
On Wed, Jun 27, 2018 at 8:04 PM Yu Haiyang wrote:
>
> Hi All,
>
> Using fio with job number ranging from 1 to 128, the random write speed for
> 4KB block size has been consistently around 1MB/s to 2MB/s.
> Random read of the same block size can reach 60MB/s with 32 jobs.
run fio on ceph-fuse? If
On Thu, Jun 28, 2018 at 10:30 AM Yu Haiyang wrote:
>
> Hi Yan,
>
> Thanks for your suggestion.
> No, I didn’t run fio on ceph-fuse. I mounted my Ceph FS in kernel mode.
>
command option of fio ?
> Regards,
> Haiyang
>
> > On Jun 27, 2018, at 9:45 PM, Yan, Zhe
On Fri, Jun 29, 2018 at 10:01 AM Yu Haiyang wrote:
>
> Ubuntu 16.04.3 LTS
>
4.4 kernel? AIO on cephfs is not supported by the 4.4 kernel; AIO is
actually synchronous IO. The 4.5 kernel is the first version that
supports AIO on cephfs.
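A hedged way to see this effect (not from the original thread; the mount point
and file name are placeholders): run the same small random-write job once with
libaio and once with a synchronous engine. On a pre-4.5 kernel the two should
perform about the same:

  fio --name=aio --ioengine=libaio --iodepth=64 --direct=1 --bs=4k --size=1G \
      --rw=randwrite --filename=/mnt/cephfs/fio.test
  fio --name=sync --ioengine=psync --iodepth=1 --direct=1 --bs=4k --size=1G \
      --rw=randwrite --filename=/mnt/cephfs/fio.test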
> On Jun 28, 2018, at 9:00 PM, Yan, Zheng wrote
6719msec, maxt=66719msec
>> -----------
>>
fio tests AIO performance in this case. cephfs does not handle AIO
properly; AIO is actually synchronous IO. That's why cephfs is so slow in
this case.
Regards
Yan, Zheng
>> It seems
red+known_if_redirected e193868) currently reached_pg
>
> It's not a single OSD misbehaving. It seems to be any OSD. The OSDs have
> plenty of disk space, and there's nothing in the osd logs that points to a
> problem.
Is there any suspected kernel message? (ceph or ne
. It might hiccup 1 or 2 times
> in the find across 10k files.
>
When the operation hangs, do you see any 'slow request ...' log messages in
the cluster log? Besides, do you have multiple clients accessing the
filesystem? Which version of ceph do you use?
Regards
Yan, Zheng
> This
On Mon, Dec 21, 2015 at 5:15 PM, Florent B wrote:
> Hi all,
>
> It seems I had an MDS crash being in standby-replay.
>
> Version is Infernalis, running on Debian Jessie (packaged version).
>
> Log is here (2.5MB) : http://paste.ubuntu.com/14126366/
>
> Has someone information about it ?
>
> Thank
On Mon, Dec 21, 2015 at 11:46 PM, Don Waterloo wrote:
> On 20 December 2015 at 22:47, Yan, Zheng wrote:
>>
>> >> ---
>> >>
>>
>>
>> fio tests AIO performance in this case. cephfs does
On Tue, Dec 22, 2015 at 7:18 PM, Don Waterloo wrote:
> On 21 December 2015 at 22:07, Yan, Zheng wrote:
>>
>>
>> > OK, so i changed fio engine to 'sync' for the comparison of a single
>> > underlying osd vs the cephfs.
>> >
>> > the
On Tue, Dec 22, 2015 at 9:29 PM, Francois Lafont wrote:
> Hello,
>
> On 21/12/2015 04:47, Yan, Zheng wrote:
>
>> fio tests AIO performance in this case. cephfs does not handle AIO
>> properly, AIO is actually SYNC IO. that's why cephfs is so slow in
>> this ca
On Mon, Dec 28, 2015 at 1:24 PM, Francois Lafont wrote:
> Hi,
>
> Sorry for my late answer.
>
> On 23/12/2015 03:49, Yan, Zheng wrote:
>
>>>> fio tests AIO performance in this case. cephfs does not handle AIO
>>>> properly, AIO is actually SYNC IO. that
On Wed, Dec 30, 2015 at 5:59 AM, Mykola Dvornik
wrote:
> Hi guys,
>
> I have 16 OSD/1MON/1MDS ceph cluster serving CephFS.
>
> The FS is mounted on 11 clients using ceph-fuse. In some cases there are
> multiple ceph-fuse processes per client, each with its own '-r' option.
>
> The problem is that
5-12-29 11:35 /ceph.conf
>
> This doesn't work (using user 'ceph' to list files)
>
> ubuntu@cephmaster:~/ceph-cluster$ sudo -u ceph hadoop fs -ls /
>
> ls: Permission denied
Can user 'ceph' access the ceph keyring file?
Yan, Zheng
>
>
> I thin
On Tue, Dec 29, 2015 at 5:20 PM, Francois Lafont wrote:
> Hi,
>
> On 28/12/2015 09:04, Yan, Zheng wrote:
>
>>> Ok, so in a client node, I have mounted cephfs (via ceph-fuse) and a rados
>>> block device formatted in XFS. If I have well understood, cephfs uses sync
&
hat this issue just went unnoticed, and it only being a
> infernalis problem is just a red herring. With that, it is oddly
> coincidental that we just started seeing issues.
This seems like a seekdir bug in the kernel client; could you try a 4.0+ kernel?
Besides, did you enable "mds bal frag
ait a while, the
second mds will go to standby state. Occasionally, the second MDS can
get stuck in the stopping state. If that happens, restart all MDSs, then repeat
step 3.
Regards
Yan, Zheng
>
>
> On Wed, Jan 13, 2016 at 7:05 PM, Yan, Zheng wrote:
>>
>> On Thu, Jan 14, 2016 at 3:3
om with the actual
> monitor commands you ran and as much of the backtrace/log as you can;
> we don't want to have commands which break the system! ;)
> -Greg
>
>>
>> Mike C
>>
>>
>>
>> On Thu, Jan 14, 2016 at 3:33 PM, Yan, Zheng wrote:
ph.com with the actual
> monitor commands you ran and as much of the backtrace/log as you can;
> we don't want to have commands which break the system! ;)
> -Greg
The problem is that he ran ‘ceph mds rmfailed 1’ and there is no command to
undo this. I think we need a command “ceph mds add
.
I’m writing the patch.
>
> On Thu, Jan 14, 2016 at 4:15 PM, Yan, Zheng wrote:
>
> > On Jan 15, 2016, at 08:01, Gregory Farnum wrote:
> >
> > On Thu, Jan 14, 2016 at 3:46 PM, Mike Carlson wrote:
> >> Hey Zheng,
> >>
> >> I've be
to our cluster?
>
> On Thu, Jan 14, 2016 at 4:19 PM, Yan, Zheng wrote:
>
> > On Jan 15, 2016, at 08:16, Mike Carlson wrote:
> >
> > Did I just loose all of my data?
> >
> > If we were able to export the journal, could we create a brand new mds out
> > of
On Fri, Jan 22, 2016 at 6:24 AM, Gregory Farnum wrote:
> On Fri, Jan 15, 2016 at 9:00 AM, HMLTH wrote:
>> Hello,
>>
>> I'm evaluating cephfs on a virtual machines cluster. I'm using Infernalis
>> (9.2.0) on debian Jessie as client and server.
>>
>> I'm trying to get some performance numbers on op
rados -p metadata rmomapkey 1. _head
Before running the above commands, please help us debug this issue: set
debug_mds = 10, restart the mds and access the bad file.
Regards
Yan, Zheng
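For the debug step, a small sketch (mds.a is a placeholder for the active MDS;
the level can also be set in ceph.conf followed by an mds restart, as suggested
above):

  ceph tell mds.a injectargs '--debug_mds 10'
  # access the bad file, then collect /var/log/ceph/ceph-mds.a.log
  ceph tell mds.a injectargs '--debug_mds 1/5'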
> Regards,
> Burkhard
>
On Mon, Jan 25, 2016 at 9:43 PM, Burkhard Linke
wrote:
> Hi,
>
> On 01/25/2016 01:05 PM, Yan, Zheng wrote:
>>
>> On Mon, Jan 25, 2016 at 3:43 PM, Burkhard Linke
>> wrote:
>>>
>>> Hi,
>>>
>>> there's a rogue file in our Ceph
On Mon, Jan 25, 2016 at 10:40 PM, Burkhard Linke
wrote:
> Hi,
>
>
> On 01/25/2016 03:27 PM, Yan, Zheng wrote:
>>
>> On Mon, Jan 25, 2016 at 9:43 PM, Burkhard Linke
>> wrote:
>>>
>>> Hi,
>>>
>>> On 01/25/2016 01:05 PM, Yan, Zhen
On Tue, Jan 26, 2016 at 3:16 PM, Burkhard Linke
wrote:
> Hi,
>
> On 01/26/2016 07:58 AM, Yan, Zheng wrote:
>
> *snipsnap*
>
>> I have a few questions
>>
>> Which version of ceph are you using? When was the filesystem created?
>> Did you manually delete 10
On Tue, Jan 26, 2016 at 5:58 PM, Burkhard Linke
wrote:
> Hi,
>
> On 01/26/2016 10:24 AM, Yan, Zheng wrote:
>>
>> On Tue, Jan 26, 2016 at 3:16 PM, Burkhard Linke
>> wrote:
>>>
>>> Hi,
>>>
>>> On 01/26/2016 07:58 AM, Yan,