Hi,
We are running hammer 0.94.2 and have an increasing number of
"heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f38c77e6700' had
timed out after 600" messages in our radosgw logs, with radosgw
eventually stalling. A restart of the radosgw helps for a few minutes,
but after that it hangs again.
I suspect these to be the cause:
rados ls -p .be-east.rgw.buckets | grep sanity
be-east.5436.1__:2bpm.1OR-cqyOLUHek8m2RdPVRZ.pDT__sanity
be-east.5436.1__sanity
be-east.5436.1__:2vBijaGnVQF4Q0IjZPeyZSKeUmBGn9X__sanity
be-east.5436.1__sanity
be-east.5436.1__:4JTCVFxB1qoDWPu1nhuMDuZ3QNPaq5n__sanity
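While it hangs, the radosgw admin socket can show what the gateway is actually
waiting on. A rough sketch, assuming the default socket location; the client
name (rgw.gateway1 here) is just a placeholder:

# in-flight requests the rgw objecter has outstanding against the OSDs
ceph daemon /var/run/ceph/ceph-client.rgw.gateway1.asok objecter_requests
# thread pool and request counters
ceph daemon /var/run/ceph/ceph-client.rgw.gateway1.asok perf dump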
On Thu, Aug 20, 2015 at 11:07 AM, Simon Hallam wrote:
> Hey all,
>
>
>
> We are currently testing CephFS on a small (3 node) cluster.
>
>
>
> The setup is currently:
>
>
>
> Each server has 12 OSDs, 1 Monitor and 1 MDS running on it:
>
> The servers are running: 0.94.2-0.el7
>
> The clients are r
tried removing, but no luck:
rados -p .be-east.rgw.buckets rm
"be-east.5436.1__:2bpm.1OR-cqyOLUHek8m2RdPVRZ.pDT__sanity"
error removing
.be-east.rgw.buckets>be-east.5436.1__:2bpm.1OR-cqyOLUHek8m2RdPVRZ.pDT__sanity:
(2)
anyone?
On 21-08-15 13:06, Sam Wouters wrote:
> I suspect these to be the cause:
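For what it's worth, an ENOENT on rm for a name that rados ls did print usually
means the object can only be reached with its original object locator. A sketch
of how one might narrow it down; the locator value is a placeholder, not
something taken from the listing above:

# does the object answer to exactly that name?
rados -p .be-east.rgw.buckets stat "be-east.5436.1__:2bpm.1OR-cqyOLUHek8m2RdPVRZ.pDT__sanity"
# if stat also returns (2), retry the removal with an explicit locator
# (assuming your rados build supports --object-locator; for rgw tail
# objects the head object's name is the usual candidate)
rados -p .be-east.rgw.buckets --object-locator "<locator>" rm "be-east.5436.1__:2bpm.1OR-cqyOLUHek8m2RdPVRZ.pDT__sanity"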
Hi,
First of all, we are sure that the return to the default configuration
fixed it. As soon as we restarted only one of the ceph nodes with the
default configuration, it sped up recovery tremendously. We had already
restarted before with the old conf and recovery was never that fast.
Regarding th
Thanks for the config,
A few comments inline (not really related to the issue):
> On 21 Aug 2015, at 15:12, J-P Methot wrote:
>
> Hi,
>
> First of all, we are sure that the return to the default configuration
> fixed it. As soon as we restarted only one of the ceph nodes with the
> default config
> filestore_fd_cache_random = true
not true
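If there is any doubt about what a daemon is actually running with, the admin
socket shows the live value, and injectargs flips it without a restart. A
sketch (run the daemon command on the node hosting that OSD; osd.0 is only an
example):

ceph daemon osd.0 config show | grep filestore_fd_cache
ceph tell 'osd.*' injectargs '--filestore_fd_cache_random=false'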
Shinobu
On Fri, Aug 21, 2015 at 10:20 PM, Jan Schermer wrote:
> Thanks for the config,
> A few comments inline (not really related to the issue):
>
> > On 21 Aug 2015, at 15:12, J-P Methot wrote:
> >
> > Hi,
> >
> > First of all, we are sure that the r
It sounds like you have rados CLI tool from an earlier Ceph release (< Hammer)
installed and it is attempting to use the librados shared library from a newer
(>= Hammer) version of Ceph.
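A quick way to confirm that would be to compare the binary and library versions
on the box; a sketch, assuming an RPM-based install for the package query:

rados --version
ceph --version
ldd "$(which rados)" | grep librados
rpm -q ceph-common librados2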
Jason
- Original Message -
> From: "Aakanksha Pudipeddi-SSI"
> To: ceph-us...@ceph.com
> Sent:
Odd, did you happen to capture osd logs?
-Sam
On Thu, Aug 20, 2015 at 8:10 PM, Ilya Dryomov wrote:
> On Fri, Aug 21, 2015 at 2:02 AM, Samuel Just wrote:
>> What's supposed to happen is that the client transparently directs all
>> requests to the cache pool rather than the cold pool when there is
On Fri, Aug 21, 2015 at 5:59 PM, Samuel Just wrote:
> Odd, did you happen to capture osd logs?
No, but the reproducer is trivial to cut & paste.
Thanks,
Ilya
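For anyone following along, the transparent redirection Sam describes comes
from the tier overlay. A minimal writeback cache tier looks roughly like this;
the pool names are made up:

ceph osd tier add coldpool cachepool
ceph osd tier cache-mode cachepool writeback
ceph osd tier set-overlay coldpool cachepool
ceph osd pool set cachepool hit_set_type bloom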
I think I found the bug -- need to whiteout the snapset (or decache
it) upon evict.
http://tracker.ceph.com/issues/12748
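If anyone wants to poke at the flush/evict path by hand, the rados tool can
drive it directly; pool and object names below are placeholders:

rados -p cachepool cache-flush myobject
rados -p cachepool cache-evict myobject
# or, for everything currently in the cache pool:
rados -p cachepool cache-flush-evict-all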
-Sam
On Fri, Aug 21, 2015 at 8:04 AM, Ilya Dryomov wrote:
> On Fri, Aug 21, 2015 at 5:59 PM, Samuel Just wrote:
>> Odd, did you happen to capture osd logs?
>
> No, but the re
We heavily use radosgw here for most of our work and we have seen a
weird truncation issue with radosgw/s3 requests.
We have noticed that if the time between the initial "ticket" to grab
the object key and grabbing the data is greater than 90 seconds, the
object returned is truncated to whatever
I saw this article on Linux Today and immediately thought of Ceph.
http://www.enterprisestorageforum.com/storage-management/object-storage-vs.-posix-storage-something-in-the-middle-please-1.html
I was thinking: would it theoretically be possible with RGW to do a GET and
set a BEGIN_SEEK and OFFSET
On Fri, Aug 21, 2015 at 10:27 PM, Scottix wrote:
> I saw this article on Linux Today and immediately thought of Ceph.
>
> http://www.enterprisestorageforum.com/storage-management/object-storage-vs.-posix-storage-something-in-the-middle-please-1.html
>
> I was thinking would it theoretically be pos
Shouldn't this already be possible with HTTP Range requests? I don't
work with RGW or S3 so please ignore me if I'm talking crazy.
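As far as I know RGW's S3 GET does honor Range headers, so with curl a partial
read is just the following; the URL and byte range are examples, and a private
bucket would need a signed or presigned URL:

# fetch the first mebibyte of the object
curl -H "Range: bytes=0-1048575" -o part.bin http://rgw.example.com/mybucket/myobject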
-
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Fri, Aug 21,
I just tried this (with some smaller objects, maybe 4.5 MB, as well as
with a 16 GB file) and it worked fine.
However, I am using the apache + fastcgi interface to rgw, rather than civetweb.
-Ben
On Fri, Aug 21, 2015 at 12:19 PM, Sean wrote:
> We heavily use radosgw here for most of our work and we
We are looking to purchase our next round of Ceph hardware and, based
on the work by Nick Fisk [1], our previous thought of cores over clock
is being revisited.
I have two camps of thought and would like to get some feedback, even
if it is only theoretical.
FWIW, we recently were looking at a couple of different options for the
machines in our test lab that run the nightly QA suite jobs via teuthology.
From a cost/benefit perspective, I think it really comes down to
something like a XEON E3-12XXv3 or the new XEON D-1540, each of which
have advant
Hi,
I have cross-posted this issue here and on GitHub,
but no response yet.
Any advice?
On Mon, Aug 10, 2015 at 10:21 AM, dahan wrote:
>
> Hi all, I have tried the reliability model:
> https://github.com/ceph/ceph-tools/tree/master/models/reliability
>
> I run the tool with default configuration,
Hi All,
Is it possible to give TRIM / DISCARD initiated by krbd low priority on the
OSDs?
I know it is possible to run fstrim at Idle priority on the rbd mount point,
e.g. ionice -c Idle fstrim -v $MOUNT.
But this Idle priority (it appears) only applies within the context of the node
executing the fstrim.
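I'm not aware of a discard-specific priority knob on the OSD side, but one way
to soften the impact is to trim in bounded chunks so the discards reach the
OSDs in smaller bursts. A rough sketch; chunk size, range and sleep are
arbitrary and have to fit the filesystem size:

for off in $(seq 0 10 490); do
    ionice -c 3 fstrim -v -o "${off}G" -l 10G "$MOUNT"
    sleep 30
done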