FAILED assert(last_e.version.version < e.version.version)
common/HeartbeatMap.cc: 79: FAILED assert(0 == "hit suicide timeout")
Does anyone have any suggestions on how to recover our cluster?
Thanks!
Jeff
Udo,
Yes, the osd is mounted:
/dev/sda4  963605972  260295676  703310296  28%  /var/lib/ceph/osd/ceph-2
Thanks,
Jeff
Original Message
Subject: Re: [ceph-users] Power failure recovery woes
Date: 2015-02-17 04:23
From: Udo Lembke
To: Jeff , ceph-users
er=ceph -i 0 -f
Is there any way to get the cluster to recognize them as being up?
osd-1 has the "FAILED assert(last_e.version.version <
e.version.version)" errors.
Thanks,
Jeff
# id    weight  type name       up/down reweight
-1      10.22   root default
Should I infer from the silence that there is no way to recover from the
"FAILED assert(last_e.version.version < e.version.version)" errors?
Thanks,
Jeff
- Forwarded message from Jeff -
Date: Tue, 17 Feb 2015 09:16:33 -0500
From: Jeff
To: ceph-users@lists.ceph.com
ocked messages. Any idea(s) on what's wrong/where to look?
Thanks!
Jeff
: /var/log/ceph/ceph-osd.11.log 81
ceph5: /var/log/ceph/ceph-osd.12.log 393
I'll try to catch them while they're happening and see what I can
learn.
Thanks again!!
Jeff
On Thu, Nov 20, 2014 at 06:40:57AM -0800, Jean-Charles LOPEZ wrote:
> Hi Jeff,
>
> it would pro
sing files; e.g.:
/var/lib/ceph/mon/ceph-ceph4/store.db/4011258.ldb
2015-01-09 11:30:32.024445 b6ea1740 -1 failed to create new leveldb store
Does anyone have any suggestions for how to get these two monitors running
again?
Thanks!
Jeff
Thanks - ceph health is now reporting HEALTH_OK :-)
On Sat, Jan 10, 2015 at 02:55:01AM +, Joao Eduardo Luis wrote:
> On 01/09/2015 04:31 PM, Jeff wrote:
> >We had a power failure last night and our five node cluster has
> >two nodes with mon's that fail to start.
o do everything manually right now to get a better understanding of it
all.
The ceph docs seem to be version controlled but I can't seem to find the
repo to update, if you can point me to it I'd be happy to submit patches
to it.
Thnx in advance!
Jeff.
step emit
}
rule rule-district-2 {
ruleset 1
type replicated
min_size 2
max_size 3
step take district-2
step chooseleaf firstn 0 type osd
step emit
}
# end crush map
Does anyone have any insight into diagnosing this problem?
Jeff
of 3 this is
2200 pgs / OSD, which might be too much and unnecessarily increase the
load on your OSDs.
Best regards,
Lionel Bouton
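For reference, that per-OSD figure is straightforward arithmetic (the pool
and OSD counts here are illustrative, not measured):
# PG copies per OSD = sum of pg_num over pools x replica size / number of OSDs
# e.g. 22 pools x 100 PGs x 3 replicas / 3 OSDs = 2200
ceph osd dump | grep pg_num   # shows pg_num and replica size for every pool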
Our workload involves creating and destroying a lot of pools. Each pool
has 100 pgs, so it adds up. Could this be causing the problem? What
would you suggest inste
d various maps
updated cluster wide. Rince and repeat until all objects have been dealt
with.
Quite a bit more involved, but that's the price you have to pay when you
have a DISTRIBUTED storage architecture that doesn't rely on a single item
(like an inode) to reflect things for the w
a region as available and allowing it to be overwritten, as
would a traditional file system?
Jeff
Hi Christian
This sounds like the same problem we are having. We get long wait times
on ceph nodes, with certain commands (in our case, mainly mkfs) blocking
for long periods of time, stuck in a wait (and not read or write) state.
We get the same warning messages in syslog, as well.
Jeff
On 04/10/2015 10:10 AM, Lionel Bouton wrote:
On 04/10/15 15:41, Jeff Epstein wrote:
[...]
This seems highly unlikely. We get very good performance without
ceph. Requisitioning and manipulating block devices through LVM
happens instantaneously. We expect that ceph will be a bit slower by
, an outdated
kernel driver isn't out of the question; if anyone has any concrete
information, I'd be grateful.
Jeff
192.168.128.4:6800 socket closed (con state OPEN)
Jeff
On 04/23/2015 12:26 AM, Jeff Epstein wrote:
Do you have some idea how I can diagnose this problem?
I'll look at ceph -s output while you get these stuck process to see
if there's any unusual activity (scrub/deep
scrub/recovery/backfill
's a pastebin from an OSD experiencing the problem I described. I
set debug_osd to 5/5. If you can provide any insight, I'd be grateful.
http://pastebin.com/kLSwbVRb
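(For reference, debug levels can also be bumped on a live daemon without a
restart; a sketch, osd id illustrative:)
ceph tell osd.12 injectargs '--debug_osd 5/5'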
Also, if you have any more suggestions on how I can collect potentially
interesting debug info, please let me know. Tha
s now normal. Odd that no one here suggested this fix, and
all the messing about with various topologies, placement groups, and so
on, was for naught.
Jeff
On 04/09/2015 11:25 PM, Jeff Epstein wrote:
As a follow-up to this issue, I'd like to point out some other things
I've notice
ceph bits are up to date as of
yesterday (ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff).
Thanks for any help/suggestions!!
Jeff
This is the same issue as yesterday, but I'm still searching for a
solution. We have a lot of data on the cluster that we need and can't
get to it reasonably (It took over 12 hours to export a 2GB image).
The only thing that status reports as wrong is:
health HEALTH_WARN 1 pgs incomplete;
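(For reference, the usual next step in narrowing that down:)
ceph health detail            # names the incomplete pg and its acting set
ceph pg dump_stuck inactive   # lists pgs stuck in a non-active state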
7-30 06:08:18.883179 11127'11658123 12914'1506
[11,9] [11,9] 10321'11641837 2013-07-28 00:59:09.552640 10321'11641837
Thanks again!
Jeff
On Tue, Jul 30, 2013 at 11:44:58AM +0200, Jens Kristian Søgaard wrote:
> Hi,
>
>> This is
OK - so while things are definitely better, we still are not where we
were and "rbd ls -l" still hangs.
Any suggestions?
MB in blocks of 4096 KB in 240.974360 sec at 4351 KB/sec
2013-08-01 12:43:39.320462 osd.12 172.16.170.5:6801/1700 1348 : [INF]
bench: wrote 1024 MB in blocks of 4096 KB in 259.023646 sec at 4048 KB/sec
Jeff
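(Those [INF] lines are what the cluster log records for the built-in OSD
bench; for reference, it is invoked as below, osd id illustrative:)
ceph tell osd.12 bench   # writes 1 GB in 4 MB blocks by default and logs the rate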
HEALTH_WARN 32 pgs degraded; 86 pgs stuck unclean
Thanks!
Jeff
aded 2013-08-06 12:00:47.758742 21920'85238 21920'206648
[4,6] [4,6] 0'0 2013-08-05 06:58:36.681726 0'0 2013-08-05
06:58:36.681726
0.4e0 0 0 0 0 0 0 active+remapped
2013-08-06 12:00:47.765391
Thanks for the suggestion. I had tried stopping each OSD for 30
seconds, then restarting it, waiting 2 minutes and then doing the next
one (all OSDs eventually restarted). I tried this twice.
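For reference, the rolling restart described above amounts to roughly the
following (sysvinit-style service control; osd ids illustrative):
for id in 0 1 2 4 5; do
    service ceph stop osd.$id
    sleep 30
    service ceph start osd.$id
    sleep 120   # let peering settle before touching the next one
done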
Hi,
The activity on our ceph cluster has gone up a lot. We are using exclusively
RBD
storage right now.
Is there a tool/technique that could be used to find out which rbd images are
receiving the most activity (something like "rbdtop")?
Thanks,
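(A hedged aside: nothing like that shipped at the time, but later releases
grew exactly this. With the mgr rbd_support module enabled, Nautilus and
newer can do:)
rbd perf image iotop -p volumes    # top-style per-image ops/throughput
rbd perf image iostat -p volumes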
Sam,
I've attached both files.
Thanks!
Jeff
On Mon, Aug 12, 2013 at 01:46:57PM -0700, Samuel Just wrote:
> Can you attach the output of ceph osd tree?
>
> Also, can you run
>
> ceph osd getmap -o /tmp/osdmap
>
> and attach /tmp/osdmap?
> -Sam
>
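(The saved map can also be inspected offline; for reference:)
osdmaptool --print /tmp/osdmap   # dumps epoch, pools, and osd up/in state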
Sam,
3, 14 and 16 have been down for a while and I'll eventually replace
those drives (I could do it now)
but didn't want to introduce more variables.
We are using RBD with Proxmox, so I think the answer about kernel
clients is yes.
Jeff
On Mon, Aug 12, 2013 at
Sam,
Thanks that did it :-)
health HEALTH_OK
monmap e17: 5 mons at
{a=172.16.170.1:6789/0,b=172.16.170.2:6789/0,c=172.16.170.3:6789/0,d=172.16.170.4:6789/0,e=172.16.170.5:6789/0},
election epoch 9794, quorum 0,1,2,3,4 a,b,c,d,e
osdmap e23445: 14 osds: 13 up, 13 in
pgmap v1355
Giuseppe,
You could install the kernel from wheezy backports - it is currently at 3.9.
http://backports.debian.org/Instructions/
http://packages.debian.org/source/stable-backports/linux
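Roughly, assuming an amd64 box (a sketch, untested here):
echo "deb http://http.debian.net/debian wheezy-backports main" >> /etc/apt/sources.list
apt-get update
apt-get -t wheezy-backports install linux-image-amd64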
Regards,
Jeff
On 14 August 2013 10:08, Giuseppe 'Gippa' Paternò wrote:
> Hi Sage,
>
ncing everything is working fine :-)
(ceph auth del osd.x ; ceph osd crush rm osd.x ; ceph osd rm osd.x).
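Spelled out, with osd.x standing for the id being removed:
ceph auth del osd.x       # drop its cephx key
ceph osd crush rm osd.x   # take it out of the CRUSH map so data rebalances
ceph osd rm osd.x         # remove it from the osdmap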
Jeff
On Wed, Aug 14, 2013 at 01:54:16PM -0700, Gregory Farnum wrote:
> On Thu, Aug 1, 2013 at 9:57 AM, Jeff Moskow wrote:
> > Greg,
> >
> > Thanks for the hints.
t are the recommended ways of seeing who/what is consuming the largest
amount of disk/network bandwidth?
Thanks!
Jeff
Hi,
I am now occasionally seeing a ceph statuses like this:
health HEALTH_WARN 2 requests are blocked > 32 sec
They aren't always present even though the cluster is still slow, but
they may be a clue....
Jeff
On Sat, Aug 17, 2013 at 02:32:47PM -07
Hi,
More information. If I look in /var/log/ceph/ceph.log, I see 7893 slow
requests in the last 3 hours of which 7890 are from osd.4. Should I
assume a bad drive? SMART says the drive is healthy. Bad osd?
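(For reference, a one-liner along these lines produces that per-OSD tally:)
grep 'slow request' /var/log/ceph/ceph.log | grep -o 'osd\.[0-9]*' | sort | uniq -c | sort -rn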
Thanks,
Jeff
Martin,
Thanks for the confirmation about 3-replica performance.
dmesg | fgrep /dev/sdb # returns no matches
Jeff
Is there an issue ID associated with this? For those of us who made the
long jump and want to avoid any unseen problems.
Thanks,
Jeff
On Tue, Aug 20, 2013 at 7:57 PM, Sage Weil wrote:
> We've identified a problem when upgrading directly from bobtail to
> dumpling; please wait u
Previous experience with OCFS2 was that its actual performance was pretty
lackluster/awful. The bits Oracle threw on top of (I think) ext3 to make it
work as a multi-writer filesystem with all of the signalling that implies
brought the overall performance down.
Jeff
On Wed, Sep 11, 2013 at 9:58
/csHHjC2h
I have run the osds with the debug statements per the email, but I'm unsure
where to post them, they are 108M each without compression. Should I create a
bug on the tracker?
Thanks,
Jeff
We're running xfs on a 3.8.0-31-generic kernel
Thanks,
Jeff
On 10/21/13 1:54 PM, "Samuel Just" wrote:
>It looks like an xattr vanished from one of your objects on osd.3.
>What fs are you running?
>
>On Mon, Oct 21, 2013 at 9:58 AM, Jeff Williams
>wrote:
>&
What is the best way to do that? I tried ceph pg repair, but it only did
so much.
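(For reference, the usual escalation; pgid illustrative:)
ceph pg deep-scrub 3.f   # re-verify object data and attrs
ceph pg repair 3.f       # ask the primary to fix inconsistencies from replicas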
On 10/21/13 3:54 PM, "Samuel Just" wrote:
>Can you get the pg to recover without osd.3?
>-Sam
>
>On Mon, Oct 21, 2013 at 1:59 PM, Jeff Williams
>wrote:
>> We're runn
I apologize, I should have mentioned that both osd.3 and osd.11 crash
immediately and if I do not 'set noout', the crash cascades to the rest of the
cluster.
Thanks,
Jeff
Sent from my Samsung Galaxy Note™, an AT&T LTE smartphone
Original message
From: Sam
On 10/13/2014 4:56 PM, Sage Weil wrote:
On Mon, 13 Oct 2014, Eric Eastman wrote:
I would be interested in testing the Samba VFS and Ganesha NFS integration
with CephFS. Are there any notes on how to configure these two interfaces
with CephFS?
For ganesha I'm doing something like:
FSAL
{
CE
On 12/16/2013 2:36 PM, Dan Van Der Ster wrote:
On Dec 16, 2013 8:26 PM, Gregory Farnum wrote:
On Mon, Dec 16, 2013 at 11:08 AM, Dan van der Ster
wrote:
Hi,
Sorry to revive this old thread, but I wanted to update you on the current
pains we're going through related to clients' nproc (and now
just curious if this situation is rectified?
Thanks,
Jeff
n this host) and osd (on other hosts) bind
to 0.0.0.0 and a public IP, respectively.
At this point public/cluster addr/network are WAY overspecified in
ceph.conf, but the problem appeared with far less specification.
Any ideas? Thanks,
Jeff
If I understand correctly then, I should either not specify mon addr or
set it to an external IP?
Thanks for the clarification,
Jeff
On 01/15/2014 03:58 PM, John Wilkins wrote:
Jeff,
First, if you've specified the public and cluster networks in
[global], you don't need to
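(For reference, the [global] form under discussion looks like this; the
subnets are illustrative:)
[global]
public network = 192.168.128.0/24
cluster network = 10.0.0.0/24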
ead 7fa7524f67a0
The SRPM for what ended up on ceph-extras wasn't uploaded to the repo,
so I didn't check to see if it was the Basho patch being applied again
or something else. Downgrading back to leveldb 1.7.0-2 resolved my problem.
Is
the pg, but I'd prefer to learn enough of the innards to understand what
is going on, and possible means of fixing it.
Thanks for any help,
Jeff
question with the hope that the cluster would roll back epochs for 0.2f,
but all it does is recreate the pg directory (empty) on osd.4.
Jeff
On 05/05/2014 04:33 PM, Gregory Farnum wrote:
What's your cluster look like? I wonder if you can just remove the bad
PG from osd.4 and let it r
Thanks. That is a cool utility, unfortunately I'm pretty sure the pg in
question had a cephfs object instead of rbd images (because mounting
cephfs is the only noticeable brokenness).
Jeff
On 05/05/2014 06:43 PM, Jake Young wrote:
I was in a similar situation where I could see the PGs da
fter object recovery is as complete as
it's going to get.
At this point though I'm shrugging and accepting the data loss, but
ideas on how to create a new pg to replace the incomplete 0.2f would be
deeply useful. I'm supposing ceph pg force_create_pg 0.2f would suffice.
Jeff
't have any
examples of how.
Thanks for any help,
Jeff
Wow I'm an idiot for getting the wrong reweight command.
Thanks so much,
Jeff
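(For reference, the two commands that are easy to mix up; ids and weights
illustrative:)
ceph osd reweight 11 0.8              # temporary 0-1 override, not persistent across out/in
ceph osd crush reweight osd.11 1.81   # persistent CRUSH weight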
On May 9, 2014 11:06 AM, "Sage Weil" wrote:
> On Fri, 9 May 2014, Jeff Bachtel wrote:
> > I'm working on http://tracker.ceph.com/issues/8310 , basically by
> bringing
> > osds
I see the EL6 build on http://ceph.com/rpm-firefly/el6/x86_64/ but not
on gitbuilder (last build 07MAY). Is 0.80.1 considered a different
branch ref for purposes of gitbuilder?
Jeff
On 05/12/2014 05:31 PM, Sage Weil wrote:
This first Firefly point release fixes a few bugs, the most visible
basic premise even trying to do that, please let me know so I can wave
off (in which case, I believe I'd use ceph_filestore_dump to delete all
copies of this pg in the cluster so I can force create it, which is
failing at this time).
Thanks,
Jeff
host. Can this be the
source of the problem? If so, is there a workaround?
$ rbd -p platform showmapped|wc -l
248
Thanks.
Best,
Jeff
cking the unmap? Is
there a way to force unmap?
Best,
Jeff
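(A hedged note: later kernels grew a force option for exactly this; with a
sufficiently new krbd and rbd CLI,)
rbd unmap -o force /dev/rbd450   # fails outstanding I/O and tears down the mapping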
On 09/25/2015 12:38 PM, Ilya Dryomov wrote:
On Fri, Sep 25, 2015 at 7:17 PM, Jeff Epstein
wrote:
We occasionally have a situation where we are unable to unmap an rbd. This
occurs intermittently, with no obvious cause. For the most part, rbds can be
unmapped fine, but sometimes we get this
refcount, lsof wouldn't help.
Jeff
On 09/25/2015 02:28 PM, Jan Schermer wrote:
What about /sys/block/krbdX/holders? Nothing in there?
There is no /sys/block/krbd450, but there is /sys/block/rbd450. In our
case, /sys/block/rbd450/holders is empty.
Jeff
On 8/9/2016 10:43 AM, Wido den Hollander wrote:
Op 9 augustus 2016 om 16:36 schreef Александр Пивушков :
> >> Hello dear community!
I'm new to Ceph and took up building clusters not long ago, so your opinion
is very important to me. It is necessary to create a clus
c -l
237
VirtualHost servername matches fqdn. ceph.conf uses short hostname (both
are in /etc/hosts pointing to same IP).
Any ideas what might be causing the FastCGI errors? I saw the similar
problems originally with fcgid, which was what led me to install
mod_fastcgi.
Thanks,
Jeff
That configuration option is set, the results are the same. To clarify: do
I need to start radosgw from the command line if it is being spawned by
fastcgi? I've tried it both ways with the same result.
Thanks,
Jeff
On Tue, May 14, 2013 at 12:56 AM, Yehuda Sadeh wrote:
> On Mon, May
next
branch, things seem to be working (s3test.py is successful).
Thanks for the help,
Jeff
On Tue, May 14, 2013 at 6:35 AM, Jeff Bachtel <
jbach...@bericotechnologies.com> wrote:
> That configuration option is set, the results are the same. To clarify: do
> I need to start radosgw
Hijacking (because it's related): a couple weeks ago on IRC it was
indicated a repo with these (or updated) qemu builds for CentOS should be
coming soon from Ceph/Inktank. Did that ever happen?
Thanks,
Jeff
On Mon, Jun 3, 2013 at 10:25 PM, YIP Wai Peng wrote:
> Hi Andrel,
>
>
You need to fix your clocks (usually with ntp). According to the log
message they can be off by 50ms and yours seems to be about 85ms off.
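For reference, the quick checks (50ms is mon_clock_drift_allowed's default):
ntpq -p              # confirm each mon host has sane NTP peers and offsets
ceph health detail   # names the skewed mon(s) and the measured drift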
On 6/6/2013 8:40 PM, Joshua Mesilane wrote:
> Hi,
>
> I'm currently evaulating ceph as a solution to some HA storage that
> we're looking at. To test I have
On 9/6/2016 8:41 PM, Vlad Blando wrote:
Hi,
My replication count now is this
[root@controller-node ~]# ceph osd lspools
4 images,5 volumes,
Those aren't replica counts, they're pool ids.
[root@controller-node ~]#
and I made adjustment and made it to 3 for images and 2 to volumes to
3, it
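The replica count is the pool's "size" attribute; for reference, with the
pool names from the listing above:
ceph osd pool get images size
ceph osd pool set images size 3
ceph osd pool set volumes size 3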
- new to the list.
Thanks in advance!
--
Jeff Applewhite
Principal Product Manager
I haven't had any problems using 375GB P4800X's in R730 and R740xd
machines for DB+WAL. The iDRAC whines a bit on the R740 but everything
works fine.
On 9/6/2018 3:09 PM, Steven Vacaroaia wrote:
Hi ,
Just to add to this question, is anyone using Intel Optane DC P4800X on
DELL R630 ...or any
? What is the 0=a=up:active? Is
that saying rank 0 of file system a is up:active?
Jeff Smith
I have been removed twice.
On Sat, Oct 6, 2018 at 7:07 AM Elias Abacioglu
wrote:
>
> Hi,
>
> I'm bumping this old thread cause it's getting annoying. My membership gets
> disabled twice a month.
> Between my two Gmail accounts I'm in more than 25 mailing lists and I see
> this behavior only here.
I had to reboot my mds. The hot spare did not kick in and now I am
showing the filesystem is degraded and offline. Both mds are showing
as up:standby. I am not sure how to proceed.
cluster:
id: 188c7fba-288f-45e9-bca1-cc5fceccd2a1
health: HEALTH_ERR
1 filesystem is deg
;> > > On Sun, Oct 7, 2018 at 5:38 AM Svante Karlsson
>> > > wrote:
>> > >>
>> > >> I'm also getting removed but not only from ceph. I subscribe
>> > >> d...@kafka.apache.org list and the same thing happens there.
>>
On 1/12/2016 4:51 AM, Burkhard Linke wrote:
Hi,
On 01/08/2016 03:02 PM, Paweł Sadowski wrote:
Hi,
Quick results for 1/5/10 jobs:
*snipsnap*
Run status group 0 (all jobs):
WRITE: io=21116MB, aggrb=360372KB/s, minb=360372KB/s,
maxb=360372KB/s,
mint=6msec, maxt=6msec
*snipsnap*
- Is there any need to open ports other than TCP 6789 and 6800-6803?
- Any other suggestions?
ceph 0.94 on Debian Jessie
Best,
Jeff
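(For reference, OSDs bind anywhere in the ms_bind_port_min/max range,
6800-7300 by default, so a sketch of the iptables rules would be:)
iptables -A INPUT -p tcp --dport 6789 -j ACCEPT        # monitors
iptables -A INPUT -p tcp --dport 6800:7300 -j ACCEPT   # osd daemons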
Hi Steve
Thanks for your answer. I don't have a private network defined.
Furthermore, in my current testing configuration, there is only one OSD,
so communication between OSDs should be a non-issue.
Do you know how OSD up/down state is determined when there is only one OSD?
Best,
Jeff
-Original Message-
From: Jeff Epstein [mailto:jeff.epst...@commerceguys.com]
Sent: Monday, Jan
Your glance images need to be raw, also. A QCOW image will be
copied/converted.
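A quick way to check, and to convert if needed (file names illustrative):
qemu-img info image.img                         # look for "file format: raw"
qemu-img convert -f qcow2 -O raw image.qcow2 image.raw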
On 2/8/2016 3:33 PM, Jason Dillaman wrote:
If Nova and Glance are properly configured, it should only require a quick
clone of the Glance image to create your Nova ephemeral image. Have you
double-checked your c
g missing at this point is delegations in an active/active
configuration, but that's mainly because of the synchronous nature of
libcephfs. We have a potential fix for that problem but it requires work
in libcephfs that is not yet done.
Cheers,
--
Jeff Layton
On Thu, 2019-02-14 at 10:35 +0800, Marvin Zhang wrote:
> On Thu, Feb 14, 2019 at 8:09 AM Jeff Layton wrote:
> > > Hi,
> > > As http://docs.ceph.com/docs/master/cephfs/nfs/ says, it's OK to
> > > config active/passive NFS-Ganesha to use CephFs. My question is if
On Thu, 2019-02-14 at 19:49 +0800, Marvin Zhang wrote:
> Hi Jeff,
> Another question is about Client Caching when disabling delegation.
> I set a breakpoint on nfs4_op_read, which is the OP_READ processing function in
> nfs-ganesha. Then I read a file and found that it will hit only once on
>
When the v4 client does revalidate the cache, it relies heavily on the NFSv4
change attribute. Cephfs's change attribute is cluster-coherent too, so
if the client does revalidate it should see changes made on other
servers.
> On Thu, Feb 14, 2019 at 8:29 PM Jeff Layton wrote:
> > On Thu, 201
On Fri, 2019-02-15 at 15:34 +0800, Marvin Zhang wrote:
> Thanks Jeff.
> If I set Attr_Expiration_Time to zero in the conf, does it mean the timeout
> is zero? If so, every client will see the change immediately. Will it
> degrade performance badly?
> It seems that GlusterFS FSAL
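(For context, a sketch of where that knob lives in ganesha's config;
untested, syntax per ganesha 2.x:)
CACHEINODE {
    Attr_Expiration_Time = 0;   # always revalidate attributes with the FSAL
}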
when running fio on a single file from a
> single client.
>
>
NFS iops? I'd guess more READ ops in particular? Is that with a
FSAL_CEPH backend?
>
> >
> > > On Thu, Feb 14, 2019 at 9:04 PM Jeff Layton
> > > wrote:
> > > > On Thu, 2019-02-14
vides any
performance gain when the attributes are already cached in the libcephfs
layer.
If we did want to start using the mdcache, then we'd almost certainly
want to invalidate that cache when libcephfs gives up caps. I just don't
see how the extra layer of caching provides mu
We had several postgresql servers running these disks from Dell. Numerous
failures, including one server that had 3 die at once. Dell claims it is a
firmware issue and instructed us to upgrade to QDV1DP15 from QDV1DP12 (I am
not sure how these line up to the Intel firmwares). We lost several more
/deploying-a-cephnfs-server-cluster-with-rook/
I don't think that site has a way to post comments, but I'm happy to
answer questions about it via email.
--
Jeff Layton
like ganesha is probably just too swamped with write requests
to do much else, but you'll probably want to do the legwork starting
with the hanging application, and figure out what it's doing that
takes so long. Is it some syscall? Which one?
From there you can start looking at statisti
. With that, you can also use the rados_ng recovery backend,
which is more resilient in the face of multiple crashes.
In that configuration you would want to have the same config file on
both nodes, including the same nodeid so that you can potentially take
advantage of the RECLAIM_RESET interface to kill off the old session
quickly after the server restarts.
You also need a much longer grace period.
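A sketch of the stanzas involved (pool name and nodeid are placeholders):
NFSv4 {
    RecoveryBackend = rados_ng;
    Grace_Period = 180;       # longer than the 90 s default
}
RADOS_KV {
    pool = "nfs-ganesha";
    nodeid = "ganesha-a";     # identical on both nodes, as above
}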
Cheers,
--
Jeff Layton
ack down to the kernel which then wakes
up the original task so it can get the result.
FUSE is a wonderful thing, but it's not really built for speed.
--
Jeff Layton
This is almost certainly the same bug that is fixed here:
https://github.com/ceph/ceph/pull/28324
It should get backported soon-ish but I'm not sure which luminous
release it'll show up in.
Cheers,
Jeff
On Wed, 2019-07-17 at 10:36 +0100, David C wrote:
> Thanks for taking a
Ahh, I just noticed you were running nautilus on the client side. This
patch went into v14.2.2, so once you update to that you should be good
to go.
-- Jeff
On Wed, 2019-07-17 at 17:10 -0400, Jeff Layton wrote:
> This is almost certainly the same bug that is fixed here:
>
> https://g
el log from one of the hosts (the other two were similar):
> > https://mrcn.st/p/ezrhr1qR
> >
> > After playing some service failover games and hard rebooting the three
> > affected client boxes everything seems to be fine. The remaining FS
> > client box had no kernel err
On Thu, 2019-08-15 at 16:45 +0900, Hector Martin wrote:
> On 15/08/2019 03.40, Jeff Layton wrote:
> > On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote:
> > > Jeff, the oops seems to be a NULL dereference in ceph_lock_message().
> > > Please take a look.
> >
Actually, scratch that. I went ahead and opened this:
https://tracker.ceph.com/issues/43649
Feel free to watch that one for updates.
On Fri, 2020-01-17 at 07:43 -0500, Jeff Layton wrote:
> No problem. Can you let me know the tracker bug number once you've
> opened it?
>
&g
On Fri, 2020-01-17 at 17:10 +0100, Ilya Dryomov wrote:
> On Fri, Jan 17, 2020 at 2:21 AM Aaron wrote:
> > No worries, can definitely do that.
> >
> > Cheers
> > Aaron
> >
> > On Thu, Jan 16, 2020 at 8:08 PM Jeff Layton wrote:
> > > On T
u might also just be able to enable
the CRUSH tunables (http://ceph.com/docs/master/rados/operations/crush-map/#tunables).
I experienced this (stuck active+remapped) frequently with the stock
0.41 apt-get/Ubuntu version of ceph. Less so with Bobtail.
Jeff Anderson-Lee
John, this is becoming a more