Hi Peter,
Thanks for those graphs and the thorough explanation. I guess a lot of your
performance increase is due to the fact that a fair amount of your workload is
cacheable?
I got my new node online late last week with 12x8TB drives and a 200GB bcache
partition, coupled with my very uncacheable workload.
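For reference, the wiring for that kind of bcache setup looks roughly like this
(a sketch only; the device names and cache-set UUID below are placeholders, not
the actual layout described above):

  make-bcache -C /dev/nvme0n1p1            # the ~200GB partition as the cache device
  make-bcache -B /dev/sdb                  # one of the 8TB backing drives
  echo /dev/sdb > /sys/fs/bcache/register  # register the backing device if udev hasn't
  bcache-super-show /dev/nvme0n1p1         # note the cset.uuid printed here
  echo <cset-uuid> > /sys/block/bcache0/bcache/attach   # attach the backing device to the cache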
On 04/06/2017 03:22 AM, yipik...@gmail.com wrote:
> On 06/04/2017 09:42, Nick Fisk wrote:
>>
>> I assume Brady is referring to the death spiral LIO gets into with
>> some initiators, including vmware, if an IO takes longer than about
>> 10s. I haven’t heard of anything, and can’t see any changes, s
On 04/06/2017 08:46 AM, David Disseldorp wrote:
> On Thu, 6 Apr 2017 14:27:01 +0100, Nick Fisk wrote:
> ...
>>> I'm not too sure what you're referring to WRT the spiral of death, but we did
>>> patch some LIO issues encountered when a command was aborted while
>>> outstanding at the LIO backstore la
I am trying to understand the cause of a problem we started
encountering a few weeks ago. There are 30 or so messages per hour on
the OSD nodes of this type:
ceph-osd.33.log:2017-04-10 13:42:39.935422 7fd7076d8700 0 bad crc in
data 2227614508 != exp 2469058201
and
2017-04-10 13:42:39.939284 7fd722c4270
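A quick way to confirm the per-hour rate (a sketch; the log path assumes the
default /var/log/ceph location, and the OSD id is just the one from the example
above):

  awk '/bad crc in data/ {split($2, t, ":"); print $1, t[1] ":00"}' \
      /var/log/ceph/ceph-osd.33.log | sort | uniq -c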
JFYI: today we got a totally stable Ceph + ESXi setup "without hacks", and
it passes stress tests.
1. Don't try to pass RBD directly to LIO; this setup is unstable.
2. Instead, use QEMU + KVM (I use Proxmox to create the VM).
3. Attach the RBD to the VM as a VIRTIO-SCSI disk (must be exported by target_co
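For anyone wanting to reproduce step 3 outside Proxmox, a bare-QEMU sketch of
the same idea (pool, image and user names are placeholders, and the LIO/target
export done inside the guest afterwards is not shown):

  qemu-system-x86_64 -enable-kvm -m 4096 \
      -device virtio-scsi-pci,id=scsi0 \
      -drive file=rbd:rbd/vmdisk:id=admin:conf=/etc/ceph/ceph.conf,format=raw,if=none,id=drive0 \
      -device scsi-hd,bus=scsi0.0,drive=drive0

Proxmox generates the equivalent configuration for you when you add an
RBD-backed disk on a VirtIO-SCSI controller.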
I tested it on Hammer and I can recreate what you are seeing. The good
news is that Infernalis and later releases behave correctly -- they
list the range 1M-4M as dirty. Since Hammer is approaching
end-of-life, I wouldn't realistically expect this to be fixed -- but I
did open a tracker ticket to d
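For context, the check involved looks something like this (a sketch, assuming
the report concerns dirty-extent listing via rbd diff; image and snapshot names
are hypothetical, output shown approximately):

  rbd diff --from-snap snap1 rbd/testimg
  # Offset    Length    Type
  # 1048576   3145728   data      <- the 1M-4M range reported as dirty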
On 04/10/2017 01:21 PM, Timofey Titovets wrote:
> JFYI: today we got a totally stable Ceph + ESXi setup "without hacks", and
> it passes stress tests.
>
> 1. Don't try to pass RBD directly to LIO; this setup is unstable.
> 2. Instead, use QEMU + KVM (I use Proxmox to create the VM).
> 3. Attach
Hi Xavier,
I still have the entries in my /etc/fstab file, and what I did to solve
the problem was to enable the "ceph-osd@XXX.service" service on all nodes,
where "XXX" is the OSD number.
I don't know why this was initially disabled in my installation...
As for the "ceph-disk lis
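For anyone hitting the same thing, enabling the units looks roughly like this
(a sketch; the OSD id is hypothetical and the loop assumes the default "ceph"
cluster name):

  systemctl enable ceph-osd@12.service
  systemctl start ceph-osd@12.service
  # or enable every OSD present on the node:
  for id in $(ls /var/lib/ceph/osd/ | sed 's/^ceph-//'); do
      systemctl enable "ceph-osd@${id}.service"
  done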
The main issue I see with OSDs not automatically mounting and starting is
that the partition type GUIDs of the OSD and journal partitions are not set
to the values expected by the udev rules for OSDs and journals. Running
ceph-disk activate-all might give you more information as to why the OSDs
aren't mounting properly.
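Checking and fixing that looks roughly like this (a sketch; device and
partition numbers are hypothetical, and the type code shown is the standard
Ceph OSD data GUID, so double-check it against the udev rules shipped with
your version):

  sgdisk --info=1 /dev/sdb          # shows the current "Partition GUID code"
  sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d /dev/sdb
  partprobe /dev/sdb                # re-read the table so udev can trigger activation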
I still see the issue where the space is not getting freed. The gc process
works sometimes, but sometimes it does nothing to clean up, as there are no
items in the GC list, yet the space is still used in the pool.
Any ideas what the ideal config is for automatic deletion of these objects after
the fi
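In case it helps others debugging the same thing, the checks I'd start with
(a sketch; zone and instance names are left at defaults, and the option names
below are the standard rgw gc settings, not recommended values):

  radosgw-admin gc list --include-all   # pending GC entries, including not-yet-due ones
  radosgw-admin gc process              # force a GC pass now
  # ceph.conf knobs worth reviewing: rgw_gc_obj_min_wait, rgw_gc_processor_period,
  # rgw_gc_processor_max_time, rgw_gc_max_objs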
Probably a question for @yehuda:
We have fairly strict user accountability requirements. The best way we
have found to meet them with S3 object storage on Ceph is by using RadosGW
subusers.
If we set up one user per bucket, then set up subusers to provide separate
individual S3 keys and access
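For concreteness, the pattern looks roughly like this (a sketch; the uid and
subuser names are hypothetical, and whether per-subuser S3 keys satisfy the
accountability requirement is exactly the question):

  radosgw-admin user create --uid=projectbucket --display-name="Project bucket owner"
  radosgw-admin subuser create --uid=projectbucket --subuser=projectbucket:alice \
      --key-type=s3 --access=full --gen-secret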
I've had this issue as well. In my case some or most OSDs on each host do
mount, but a few don't mount or start (I have 9 OSDs on each host).
My workaround is to run partprobe on the device that isn't mounted. This
causes the OSD to mount and start automatically. The OSDs then also mount
on subs
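Scripted up, the workaround looks something like this (a sketch; device names
are hypothetical):

  for dev in /dev/sdb /dev/sdc /dev/sdd; do
      partprobe "$dev"              # re-read the partition table so udev fires again
  done
  ceph-disk activate-all            # or let ceph-disk pick up anything still missing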
On Mon, Apr 10, 2017 at 2:16 PM, Alex Gorbachev wrote:
> I am trying to understand the cause of a problem we started
> encountering a few weeks ago. There are 30 or so messages per hour on
> the OSD nodes of this type:
>
> ceph-osd.33.log:2017-04-10 13:42:39.935422 7fd7076d8700 0 bad crc in
> data 22276
> On 8 April 2017 at 4:03, Gerald Spencer wrote:
>
>
> Do the rados bindings exist for python3?
> I see this sprinkled in various areas..
> https://github.com/ceph/ceph/pull/7621
> https://github.com/ceph/ceph/blob/master/debian/python3-rados.install
>
> This being said, I cannot find said p
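A quick way to check what is available on a given box (a sketch; the package
names assume a Debian/Ubuntu install, matching the debian/python3-rados.install
file linked above):

  apt-get install -y python3-rados python3-rbd
  python3 -c "import rados, rbd; print(rados.Rados)"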
On 04/10/2017 08:16 PM, Alex Gorbachev wrote:
> I am trying to understand the cause of a problem we started
> encountering a few weeks ago. There are 30 or so messages per hour on
> the OSD nodes of this type:
> ceph-osd.33.log:2017-04-10 13:42:39.935422 7fd7076d8700 0 bad crc in
> data 2227614508 != exp 2469058