Thanks a lot - problem fixed.
On 10.10.2017 16:58, Peter Linder wrote:
I think your failure domain within your rules is wrong.
step choose firstn 0 type osd
Should be:
step choose firstn 0 type host
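For reference, a minimal sketch of a full replicated rule with host as the failure domain (the rule name and root bucket below are assumptions, not taken from your map, and I'm using the more common chooseleaf form):

rule replicated_host {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}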
On 10/10/2017 5:05 PM, Konrad Riedel wrote:
Hello Ceph-users,
after switching to luminous
Hello everyone,
lately, we've had issues with buying SSDs that we use for
journaling (Kingston stopped making them) - Kingston V300 - so we decided
to start using a different model and started researching which one would
be the best price/value for us. We compared five models, to check if
they
On 17-10-11 09:50 AM, Josef Zelenka wrote:
Hello everyone,
lately, we've had issues with buying SSDs that we use for
journaling (Kingston stopped making them) - Kingston V300 - so we decided to
start using a different model and started researching which one would be the
best price/value for us.
Hi Gregory
You're right, when setting the object layout in libradosstriper, one should set
all three parameters (the number of stripes, the size of the stripe unit, and
the size of the striped object). The Ceph plugin for GridFTP has an example of
this at
https://github.com/stfc/gridFTPCephPlug
As far as I am able to understand, there are 2 ways of setting up iSCSI for Ceph:
1- using the kernel (lrbd), only available on SUSE, CentOS, Fedora...
2- using userspace (tcmu, ceph-iscsi-conf, ceph-iscsi-cli)
I don't know which one is better. I am seeing that official support is
pointing to tcmu, but I haven't
On Wed, Oct 11, 2017 at 1:42 AM, Bill Sharer wrote:
> I've been in the process of updating my gentoo based cluster both with
> new hardware and a somewhat postponed update. This includes some major
> stuff including the switch from gcc 4.x to 5.4.0 on existing hardware
> and using gcc 6.4.0 to ma
Hi, Gregory!
You are absolutely right! Thanks!
The following sequence solves the problem:
rados_striper_set_object_layout_stripe_unit(m_striper, stripe_unit);
rados_striper_set_object_layout_stripe_count(m_striper, stripe_count);
int stripe_size = stripe_unit * stripe_count;  /* object size = stripe_unit * stripe_count */
rados_striper_set_object_layout_object_size(m_striper, stripe_size);
Hi, Ian!
Thank you for your reference!
Could you comment on the following rule:
object_size = stripe_unit * stripe_count
Or is it not necessarily so?
I refer to page 8 in this report:
https://indico.cern.ch/event/531810/contributions/2298934/attachments/1358128/2053937/Ceph-Experience-at-RAL-f
Oh! I put a wrong link, sorry. The picture which explains stripe_unit and
stripe_count is here:
https://indico.cern.ch/event/330212/contributions/1718786/attachments/642384/883834/CephPluginForXroot.pdf
I tried to attach it in the mail, but it was blocked.
On Wed, Oct 11, 2017 at 3:16 PM, Alex
Hi Jorge,
On 10/10/2017 07:23 AM, Jorge Pinilla López wrote:
> Are .99 KV, .01 MetaData and .0 Data ratios right? They seem a little
> too disproportionate.
Yes, this is correct.
> Also .99 KV and Cache of 3GB for SSD means that almost the 3GB would
> be used for KV but there is also another att
On Wed, Oct 11, 2017 at 6:38 AM, Jorge Pinilla López
wrote:
> As far as I am able to understand there are 2 ways of setting iscsi for
> ceph
>
> 1- using kernel (lrbd) only able on SUSE, CentOS, fedora...
>
The target_core_rbd approach is only utilized by SUSE (and its derivatives
like PetaSAN)
Okay, thanks for the explanation. So from the 3GB of cache (the default
cache for SSD) only 0.5GB is going to K/V and 2.5GB is going to metadata.
Is there a way of knowing how much K/V, metadata and data is being stored, and
how full the cache is, so I can adjust my ratios? I was thinking of some ratios
(like 0.9 K/V, 0
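For reference, a sketch of the ceph.conf options I understand to control this split, using the Luminous defaults mentioned in this thread (option names assumed from the docs, values shown only as an illustration, not a recommendation):

[osd]
bluestore_cache_size_ssd = 3221225472     # 3GB of cache per SSD-backed BlueStore OSD
bluestore_cache_kv_ratio = 0.99           # share offered to the rocksdb block cache
bluestore_cache_meta_ratio = 0.01         # share for the BlueStore onode (metadata) cache
bluestore_cache_kv_max = 536870912        # K/V share is capped at 512MB; the rest spills over to metadata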
Hi Jorge,
I was sort of responsible for all of this. :)
So basically there are different caches in different places:
- rocksdb bloom filter and index cache
- rocksdb block cache (which can be configured to include filters and
indexes)
- rocksdb compressed block cache
- bluestore onode cache
These limits unfortunately aren’t very well understood or studied right
now. The biggest slowdown I’m aware of is that when using FileStore you see
an impact as it starts to create more folders internally (this is the
“collection splitting”) and requires more cached metadata to do fast lookups.
But
Careful when you're looking at documentation. You're looking at the master
branch which might have unreleased features or changes that your release
doesn't have. You'll want to change master in the URL to luminous to make
sure that you're looking at the documentation for your version of Ceph.
I
David, thanks.
I've switched the branch to Luminous and the doc is the same (thankfully).
No worries, I'll wait until someone who has hopefully done it already can give
me a hint.
thanks!
On Wed, Oct 11, 2017 at 11:00 AM, David Turner
wrote:
> Careful when you're looking at documentation. You're
I've managed an RBD cluster that had all of the RBDs configured to 1M objects
and filled the cluster up to 75% full with 4TB drives. Other than the
collection splitting (subfolder splitting, as I've called it before) we
didn't have any problems with object counts.
On Wed, Oct 11, 2017 at 9:47 AM Greg
Hi,
I had a "general protection fault: " with Ceph RBD kernel client.
Not sure how to read the call trace; is it Ceph-related?
Oct 11 16:15:11 lorunde kernel: [311418.891238] general protection fault:
[#1] SMP
Oct 11 16:15:11 lorunde kernel: [311418.891855] Modules linked in: cpuid
binfmt_
Just for the sake of putting this in the public forum,
In theory, by placing the primary copy of the object on an SSD medium, and
placing replica copies on HDD medium, it should still yield some improvement in
writes, compared to an all HDD scenario.
My logic here is rooted in the idea that the
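For completeness, a sketch of the kind of CRUSH rule usually used for this primary-on-SSD idea (bucket and rule names here are made up; firstn 1 takes the primary from the SSD root and firstn -1 fills the remaining replicas from the HDD root):

rule ssd_primary {
        id 5
        type replicated
        min_size 2
        max_size 3
        step take ssd
        step chooseleaf firstn 1 type host
        step emit
        step take hdd
        step chooseleaf firstn -1 type host
        step emit
}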
Christian is correct that min_size does not affect how many copies need to ACK the
write; it is responsible for how many copies need to be available for the
PG to be accessible. This is where SSD journals for filestore and SSD
DB/WAL partitions come into play. The write is considered ACK'd as soon as
th
On Wed, Oct 11, 2017 at 8:57 AM Jason Dillaman wrote:
> On Wed, Oct 11, 2017 at 6:38 AM, Jorge Pinilla López
> wrote:
>
>> As far as I am able to understand there are 2 ways of setting iscsi for
>> ceph
>>
>> 1- using kernel (lrbd) only able on SUSE, CentOS, fedora...
>>
>
> The target_core_rbd
Hi,
I have a cephfs cluster as follows:
1 15x HDD data pool (primary cephfs data pool)
1 2x SSD data pool (linked to a specific dir via xattrs)
1 2x SSD metadata pool
1 2x SSD cache tier pool
the cache tier pool consists of 2 hosts, with one SSD OSD on each host, with
size=2 replicated by host.
L
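For context, the "linked to a specific dir via xattrs" bit is the usual CephFS file layout attribute; a hypothetical example with made-up pool and path names:

setfattr -n ceph.dir.layout.pool -v cephfs-ssd-data /mnt/cephfs/fastdir
getfattr -n ceph.dir.layout /mnt/cephfs/fastdir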
The full ratio is based on the max bytes. If you say that the cache should
have a max bytes of 1TB and that the full ratio is .8, then it will aim to
keep it at 800GB. Without a max bytes value set, the ratios are a
percentage of unlimited... aka no limit themselves. The full_ratio should
be res
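To make that concrete, the two pool settings in question (pool name and values here are just an example):

ceph osd pool set cachepool target_max_bytes 1099511627776    # cap the cache tier at 1TB
ceph osd pool set cachepool cache_target_full_ratio 0.8       # aim to keep usage around 80% of that cap, i.e. ~800GB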
That sounds like it. Thanks David.
I wonder if that behavior of ignoring the OSD full_ratio is intentional.
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
On Wed, Oct 11, 2017 at 12:26 PM, David Turner
wrote:
> The full ratio is based on the max bytes. if yo
Hi Jason,
Thanks for the detailed write-up...
On Wed, 11 Oct 2017 08:57:46 -0400, Jason Dillaman wrote:
> On Wed, Oct 11, 2017 at 6:38 AM, Jorge Pinilla López
> wrote:
>
> > As far as I am able to understand there are 2 ways of setting iscsi for
> > ceph
> >
> > 1- using kernel (lrbd) only abl
Hi Travis,
This is reporting an error when sending data back to the client.
Generally it means that the client timed out and closed the connection.
Are you also seeing failures on the client side?
Casey
On 10/10/2017 06:45 PM, Travis Nielsen wrote:
In Luminous 12.2.1, when running a GET on
Hi to all,
What if you're using an iSCSI gateway based on LIO and KRBD (that is, an RBD
block device mounted on the iSCSI gateway and published through LIO)? The
LIO target portal (virtual IP) would fail over to another node. This would
theoretically provide support for PGRs since LIO does support S
On Wed, Oct 11, 2017 at 12:31 PM, Samuel Soulard
wrote:
> Hi to all,
>
> What if you're using an ISCSI gateway based on LIO and KRBD (that is, RBD
> block device mounted on the ISCSI gateway and published through LIO). The
> LIO target portal (virtual IP) would failover to another node. This wou
Hmmm, if you fail over the identity of the LIO configuration including PGRs
(I believe they are files on disk), this would work, no? Using 2 iSCSI
gateways which have shared storage to store the LIO configuration and PGR
data.
Also, you said another "fails over to another port", do you mean a po
On Wed, Oct 11, 2017 at 1:10 PM, Samuel Soulard
wrote:
> Hmmm, If you failover the identity of the LIO configuration including PGRs
> (I believe they are files on disk), this would work no? Using an 2 ISCSI
> gateways which have shared storage to store the LIO configuration and PGR
> data.
Are y
Ahh, so in this case only SUSE Enterprise Storage is able to provide iSCSI
connections for MS clusters if HA is required, be it Active/Standby,
Active/Active or Active/Failover.
On Wed, Oct 11, 2017 at 2:03 PM, Jason Dillaman wrote:
> On Wed, Oct 11, 2017 at 1:10 PM, Samuel Soulard
> wrote:
>
Hi all,
I just set up multisite replication according to the docs from
http://docs.ceph.com/docs/master/radosgw/multisite/ and everything works,
except that if a client uploads via multipart the files don't get replicated.
If I rename a file in one zone that was uploaded via multipart, it gets
replicated.
Multipart is a client-side setting when uploading. Multisite in and of
itself is a client and it doesn't use multipart (at least not by default).
I have a Jewel RGW Multisite cluster and one site has the object as
multi-part while the second site just has it as a single object. I had to
change fr
Hi David,
Yeah, it seems you are right; they are stored under different filenames in the
data bucket when using multisite upload. But anyway, it still doesn't get
replicated. As an example, I have files like
6a9448d2-bdba-4bec-aad6-aba72cd8eac6.21344646.1__multipart_Wireshark-win64-2.2.7.exe.2~0LAfq93OMdk7hrij
Thanks for your report. We're looking into it. You can try to see if
touching the object (e.g., modifying its permissions) triggers the sync.
Yehuda
On Wed, Oct 11, 2017 at 1:36 PM, Enrico Kern
wrote:
> Hi David,
>
> yeah seems you are right, they are stored as different filenames in the
> data
To the client they were showing up as a 500 error. Ty, do you know of any
client-side issues that could have come up during the test run? And there
was only a single GET happening at a time, right?
On 10/11/17, 9:27 AM, "ceph-users on behalf of Casey Bodley"
wrote:
>Hi Travis,
>
>This is repo
If I change permissions, the sync status shows that it is syncing 1 shard,
but no file ends up in the pool (testing with an empty data pool). After a
while it shows that data is back in sync, but there is no file
On Wed, Oct 11, 2017 at 11:26 PM, Yehuda Sadeh-Weinraub
wrote:
> Thanks for your report
In addition, I noticed that if you delete a bucket that contained multipart-uploaded
files which were not replicated, the files are not deleted from the pool:
while the bucket is gone, the data still remains in the pool where the
multipart upload was initiated.
On Thu, Oct 12, 2017 at 12:26 AM, Enric
What is the size of the object? Is it only this one?
Try this command: 'radosgw-admin sync error list'. Does it show anything
related to that object?
Thanks,
Yehuda
On Wed, Oct 11, 2017 at 3:26 PM, Enrico Kern
wrote:
> if i change permissions the sync status shows that it is syncing 1 shard,
It's 45MB, but it happens with all multipart uploads.
'radosgw-admin sync error list' shows:
{
    "shard_id": 31,
    "entries": [
        {
            "id": "1_1507761459.607008_8197.1",
            "section": "data",
            "name": "testbucket:6a9448d2-bdba-4bec-aad6-aba72cd8eac
or this:
{
    "shard_id": 22,
    "entries": [
        {
            "id": "1_1507761448.758184_10459.1",
            "section": "data",
            "name": "testbucket:6a9448d2-bdba-4bec-aad6-aba72cd8eac6.21344646.3/Wireshark-win64-2.2.7.exe",
            "timestam
I was wondering: if I can't get the second MDS back up, that offline
backward scrub check sounds like it should also be able to salvage what
it can of the two pools to a normal filesystem. Is there an option for
that, or has someone written some form of salvage tool?
On 10/11/2017 07:07 AM, John
As an aside, SCST iSCSI will support ALUA and does PGRs through the use of
DLM. We have been using that with Solaris and Hyper-V initiators for RBD
backed storage but still have some ongoing issues with ALUA (probably our
current config, we need to lab later recommendations).
Yes, I looked at this solution, and it seems interesting. However, one
point that often sticks with business requirements is commercial support.
With Red Hat or SUSE, you have support provided with the solution. I'm not
sure what support channel SCST offers.
Sam
On Oct 11, 2017 20:05, "Adri
It’s a fair point – in our case we are based on CentOS so self-support only
anyway (business does not like paying support costs). At the time we evaluated
LIO, SCST and STGT, with a directive to use ALUA support instead of IP
failover. In the end we went with SCST as it had more mature ALUA
Hello,
Setting up a new test lab: a single server with 5 disks/OSDs.
I want to run an EC pool that has more shards than available OSDs; is it
possible to force CRUSH to re-use an OSD for another shard?
I know this is normally bad practice, but it is for testing only on a single-server
setup.
Thanks,
Ash