[ceph-users] Re: Hadoop to Ceph

2020-11-06 Thread Jaroslaw Owsiewski
Hi, What protocol do you want to use to make this data available on Ceph? -- Jarek. On Fri, 6 Nov 2020 at 04:00, Szabo, Istvan (Agoda) wrote: > Hi, > > Has anybody tried to migrate data from Hadoop to Ceph? > If yes, what is the right way? > > Thank you

[ceph-users] Re: Mon went down and won't come back

2020-11-06 Thread Eugen Block
Hi, can you share your ceph.conf (mon section)? Quoting Paul Mezzanini: Hi everyone, I figure it's time to pull in more brainpower on this one. We had an NVMe mostly die in one of our monitors, and it caused the write latency for the machine to spike. Ceph did the RightThing(tm) and

[ceph-users] Re: high latency after maintenance

2020-11-06 Thread Marcel Kuiper
Hi Anthony, Thank you for your response. I am looking at the "OSDs highest latency of write operations" panel of the Grafana dashboard found in the Ceph source at ./monitoring/grafana/dashboards/osds-overview.json. It is a topk graph that uses ceph_osd_op_w_latency_sum / ceph_osd_op_w_latency_count
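For reference, a minimal PromQL sketch of that sum/count pattern; the exact dashboard query may differ, and the topk limit of 10 and the 5m window here are assumptions, not taken from the dashboard:

  # top 10 OSDs by average write latency over a 5-minute window
  topk(10, rate(ceph_osd_op_w_latency_sum[5m]) / rate(ceph_osd_op_w_latency_count[5m]))

Dividing the rate of the _sum counter by the rate of the _count counter gives the average per-operation latency over the window, per OSD.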

[ceph-users] Re: Low Memory Nodes

2020-11-06 Thread Hans van den Bogert
> I already ordered more RAM. Can I temporarily turn down the RAM usage of > the OSDs, to not get into that vicious cycle, and just suffer small but > stable performance? Hi, Look at https://docs.ceph.com/en/latest/rados/configuration/bluestore-config-ref/#bluestore-config-reference and then spec

[ceph-users] Re: Low Memory Nodes

2020-11-06 Thread Dan van der Ster
How much RAM do you have, and how many OSDs? This config should be considered close to the minimum: ceph config set osd osd_memory_target 1500000000 (1.5 GB per OSD -- remember the default is 4 GB per OSD) -- dan On Fri, Nov 6, 2020 at 11:52 AM Ml Ml wrote: > > Hello List, > > I think 3 of
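As a rough worked example (the node size and OSD count here are hypothetical, not from the thread):

  # assuming a 16 GB node running 6 OSDs:
  # 6 x 1.5 GB ~= 9 GB for OSD caches, leaving headroom for the OS and page cache
  ceph config set osd osd_memory_target 1500000000

Note that osd_memory_target is a target the OSD's cache autotuning aims for, not a hard limit, so actual usage can temporarily exceed it.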

[ceph-users] Multisite sync not working - permission denied

2020-11-06 Thread Michael Breen
Hi, radosgw-admin -v ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable) Multisite sync is something I had working with a previous cluster and an earlier Ceph version, but it doesn't work now, and I can't understand why. If anyone with an idea of a possible cause could giv

[ceph-users] Re: Multisite sync not working - permission denied

2020-11-06 Thread Michael Breen
I forgot to mention the debugging I attempted earlier: I believe this is not because the keys are wrong, but because it is looking for a user that does not exist on the secondary: debug 2020-11-03T16:37:47.330+ 7f32e9859700 5 req 60 0.00386s :post_period error reading user info, uid=ACCESS can't a
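A common cause of this symptom in multisite setups is that the system user whose keys are referenced in the zone config exists only on the master. A hedged troubleshooting sketch; the user name and keys below are placeholders, not from this thread:

  # on the secondary, check whether the system user the zone references exists
  radosgw-admin user info --uid=<sync-user>
  # if it is missing, recreate it with the same keys the master zone uses
  radosgw-admin user create --uid=<sync-user> --display-name="sync user" --access-key=<key> --secret=<secret> --system
  # make sure the zone carries those keys, then commit the period
  radosgw-admin zone modify --rgw-zone=zone-b --access-key=<key> --secret=<secret>
  radosgw-admin period update --commit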

[ceph-users] Re: Hadoop to Ceph

2020-11-06 Thread Jaroslaw Owsiewski
If S3, you can use distcp from HDFS to S3@Ceph. For example: hadoop distcp -Dmapred.job.queue.name=queue_name -Dfs.s3a.access.key= -Dfs.s3a.secret.key= -Dfs.s3a.endpoint= -Dfs.s3a.connection.ssl.enabled=false_or_true /hdfs_path/ s3a://path/ Regards -- Jarek. On Fri, 6 Nov 2020 at 12:29, Szabo, Istvan
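After the copy, a quick hedged way to sanity-check the destination from any machine with the AWS CLI installed (endpoint, bucket, and path are placeholders, not from the thread):

  # list what landed in the bucket via the RGW S3 endpoint
  aws --endpoint-url http://rgw.example.com:8080 s3 ls s3://bucket/path/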

[ceph-users] Re: high latency after maintenance

2020-11-06 Thread Wout van Heeswijk
Hi Marcel, The peering process is the process used by Ceph OSDs, on a per placement group basis, to agree on the state of that placement group on each of the involved OSDs. In your case, 2/3 of the placement group metadata that needs to be agreed upon/checked is on the nodes that did not undergo maintenance
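To watch this in practice, a minimal sketch (the PG id is a placeholder):

  ceph pg ls peering   # list any PGs currently negotiating state
  ceph pg 2.1f query   # per-PG detail, including which OSDs it is peering with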

[ceph-users] Re: Mon went down and won't come back

2020-11-06 Thread Paul Mezzanini
Relevant ceph.conf file lines:

[global]
mon initial members = ceph-mon-01,ceph-mon-03
mon host = IPFor01,IPFor03
mon max pg per osd = 400
mon pg warn max object skew = -1

[mon]
mon allow pool delete = true

ceph config has: global

[ceph-users] Re: Mon went down and won't come back

2020-11-06 Thread Eugen Block
So the mon_host line is without a port, correct, just the IP? Quoting Paul Mezzanini: Relevant ceph.conf file lines: [global] mon initial members = ceph-mon-01,ceph-mon-03 mon host = IPFor01,IPFor03 mon max pg per osd = 400 mon pg warn max object skew = -1 [mon] mon allow pool delete = true

[ceph-users] Re: Mon went down and won't come back

2020-11-06 Thread Paul Mezzanini
Correct, just comma-separated IP addresses -- Paul Mezzanini Sr Systems Administrator / Engineer, Research Computing Information & Technology Services Finance & Administration Rochester Institute of Technology o:(585) 475-3245 | pfm...@rit.edu
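For reference, with bare IPs the mons advertise the default messenger ports (v2 on 3300, v1 on 6789). A sketch of the explicit form that pins both, using placeholder documentation addresses:

  mon host = [v2:192.0.2.11:3300,v1:192.0.2.11:6789],[v2:192.0.2.13:3300,v1:192.0.2.13:6789]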

[ceph-users] Low Memory Nodes

2020-11-06 Thread Ml Ml
Hello List, I think 3 of my 6 nodes have too little memory. This triggers the effect that the nodes swap a lot and almost kill themselves. That causes OSDs to go down, which triggers a rebalance, which does not really help :D I already ordered more RAM. Can I temporarily turn down the RAM usage of the OSDs, to not get into that vicious cycle, and just suffer small but stable performance?
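Besides lowering the OSD memory target (see the replies above), a hedged stopgap to break the swap-and-rebalance spiral while waiting for the RAM to arrive:

  # prevent down OSDs from being marked out, so no new rebalance starts
  ceph osd set noout
  # optionally pause data movement from any rebalance already in flight
  ceph osd set norebalance
  # once the new RAM is installed and the OSDs are stable again:
  ceph osd unset noout
  ceph osd unset norebalance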

[ceph-users] Re: Hadoop to Ceph

2020-11-06 Thread Szabo, Istvan (Agoda)
Objectstore. On 2020. Nov 6., at 15:41, Jaroslaw Owsiewski wrote: Hi, What protocol do you want to use to make this data available on Ceph? -- Jarek. On Fri, 6 Nov 2020 at

[ceph-users] Re: Multisite sync not working - permission denied

2020-11-06 Thread Michael Breen
Continuing my fascinating conversation with myself: The output of radosgw-admin sync status indicates that only the metadata is a problem, i.e., the data itself is syncing, and I have confirmed that. There is no S3 access to the secondary, zone-b, so I could not check replication that way, but ha
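For anyone hitting the same thing, a hedged sketch of inspecting and restarting just the metadata side of sync on the secondary (this assumes, as described above, that data sync itself is healthy):

  radosgw-admin metadata sync status   # show metadata sync state and markers
  radosgw-admin metadata sync init     # re-initialize metadata sync from the master zone
  radosgw-admin metadata sync run      # run a foreground sync pass for debugging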

[ceph-users] Re: using msgr-v1 for OSDs on nautilus

2020-11-06 Thread Void Star Nill
Thanks Eugen. I will give it a try. Regards, Shridhar On Thu, 5 Nov 2020 at 23:52, Eugen Block wrote: > Hi, > > you could try to only bind to v1 [1] by setting > > ms_bind_msgr2 = false > > > Regards, > Eugen > > > [1] https://docs.ceph.com/en/latest/rados/configuration/msgr2/ > > > Quoting
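A minimal sketch of applying that setting; the restart step is an assumption about how it would be rolled out, since daemons only rebind on restart:

  # either in ceph.conf under [global]:
  #   ms_bind_msgr2 = false
  # or centrally via the config store:
  ceph config set global ms_bind_msgr2 false
  # then restart the daemons so they rebind, e.g. on an OSD host:
  systemctl restart ceph-osd.target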

[ceph-users] Debugging slow ops

2020-11-06 Thread Void Star Nill
Hello, I am trying to debug slow operations in our cluster running Nautilus 14.2.13. I am analysing the output of the "ceph daemon osd.N dump_historic_ops" command. I am noticing that most of the time is spent between the "header_read" and "throttled" events. For example, below is
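To pull the per-event timeline out of that JSON across many ops at once, a hedged jq sketch (assuming the .ops array and .type_data.events layout of the Nautilus output; osd.0 is a placeholder):

  ceph daemon osd.0 dump_historic_ops | jq '.ops[] | {description, duration, events: [.type_data.events[] | {time, event}]}'

Comparing the timestamps of consecutive events then shows exactly where each op stalled.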