Re: [ceph-users] Ceph-Deploy error on 15/71 stage

2018-08-31 Thread Eugen Block

Hi,

I'm not sure if there's a misunderstanding. You need to track the logs
during the OSD deployment step (stage.3): that is where it fails, and
that is where /var/log/messages could be useful. Since the deployment
failed, you have no systemd units (ceph-osd@.service) to log
anything.


Before running stage.3 again try something like

grep -C5 ceph-disk /var/log/messages (or messages-201808*.xz)

or

grep -C5 sda4 /var/log/messages (or messages-201808*.xz)

If that doesn't reveal anything run stage.3 again and watch the logs.
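
A note on the compressed rotated logs: plain grep won't read the .xz files
directly. A sketch of the same search over compressed logs (xzgrep ships
with xz), plus one way to watch live while stage.3 runs:

xzgrep -C5 ceph-disk /var/log/messages-201808*.xz
tail -f /var/log/messages | grep -iE 'ceph|salt'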

Regards,
Eugen


Quoting Jones de Andrade:


Hi Eugen.

Ok, edited the file /etc/salt/minion, uncommented the "log_level_logfile"
line and set it to "debug" level.

Turned off the computer, waited a few minutes so that the time frame would
stand out in the /var/log/messages file, and restarted the computer.

Using vi I grepped out the reboot section. From that, I also removed most
of what seemed totally unrelated to ceph, salt, minions, grafana,
prometheus, and so on.

I got the lines below. They do not seem to complain about anything that I
can see. :(


2018-08-30T15:41:46.455383-03:00 torcello systemd[1]: systemd 234 running
in system mode. (+PAM -AUDIT +SELINUX -IMA +APPARMOR -SMACK +SYSVINIT +UTMP
+LIBCRYPTSETUP +GCRYPT -GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID -ELFUTILS
+KMOD -IDN2 -IDN default-hierarchy=hybrid)
2018-08-30T15:41:46.456330-03:00 torcello systemd[1]: Detected architecture
x86-64.
2018-08-30T15:41:46.456350-03:00 torcello systemd[1]: nss-lookup.target:
Dependency Before=nss-lookup.target dropped
2018-08-30T15:41:46.456357-03:00 torcello systemd[1]: Started Load Kernel
Modules.
2018-08-30T15:41:46.456369-03:00 torcello systemd[1]: Starting Apply Kernel
Variables...
2018-08-30T15:41:46.457230-03:00 torcello systemd[1]: Started Alertmanager
for prometheus.
2018-08-30T15:41:46.457237-03:00 torcello systemd[1]: Started Monitoring
system and time series database.
2018-08-30T15:41:46.457403-03:00 torcello systemd[1]: Starting NTP
client/server...






2018-08-30T15:41:46.457425-03:00 torcello systemd[1]: Started Prometheus
exporter for machine metrics.
2018-08-30T15:41:46.457706-03:00 torcello prometheus[695]: level=info
ts=2018-08-30T18:41:44.797896888Z caller=main.go:225 msg="Starting
Prometheus" version="(version=2.1.0, branch=non-git, revision=non-git)"
2018-08-30T15:41:46.457712-03:00 torcello prometheus[695]: level=info
ts=2018-08-30T18:41:44.797969232Z caller=main.go:226
build_context="(go=go1.9.4, user=abuild@lamb69, date=20180513-03:46:03)"
2018-08-30T15:41:46.457719-03:00 torcello prometheus[695]: level=info
ts=2018-08-30T18:41:44.798008802Z caller=main.go:227 host_details="(Linux
4.12.14-lp150.12.4-default #1 SMP Tue May 22 05:17:22 UTC 2018 (66b2eda)
x86_64 torcello (none))"
2018-08-30T15:41:46.457726-03:00 torcello prometheus[695]: level=info
ts=2018-08-30T18:41:44.798044088Z caller=main.go:228 fd_limits="(soft=1024,
hard=4096)"
2018-08-30T15:41:46.457738-03:00 torcello prometheus[695]: level=info
ts=2018-08-30T18:41:44.802067189Z caller=web.go:383 component=web
msg="Start listening for connections" address=0.0.0.0:9090
2018-08-30T15:41:46.457745-03:00 torcello prometheus[695]: level=info
ts=2018-08-30T18:41:44.802037354Z caller=main.go:499 msg="Starting TSDB ..."
2018-08-30T15:41:46.458145-03:00 torcello smartd[809]: Monitoring 1
ATA/SATA, 0 SCSI/SAS and 0 NVMe devices
2018-08-30T15:41:46.458321-03:00 torcello systemd[1]: Started NTP
client/server.
2018-08-30T15:41:50.387157-03:00 torcello ceph_exporter[690]: 2018/08/30
15:41:50 Starting ceph exporter on ":9128"
2018-08-30T15:41:52.658272-03:00 torcello wicked[905]: lo  up
2018-08-30T15:41:52.658738-03:00 torcello wicked[905]: eth0up
2018-08-30T15:41:52.659989-03:00 torcello systemd[1]: Started wicked
managed network interfaces.
2018-08-30T15:41:52.660514-03:00 torcello systemd[1]: Reached target
Network.
2018-08-30T15:41:52.667938-03:00 torcello systemd[1]: Starting OpenSSH
Daemon...
2018-08-30T15:41:52.668292-03:00 torcello systemd[1]: Reached target
Network is Online.




2018-08-30T15:41:52.669132-03:00 torcello systemd[1]: Started Ceph cluster
monitor daemon.
2018-08-30T15:41:52.669328-03:00 torcello systemd[1]: Reached target ceph
target allowing to start/stop all ceph-mon@.service instances at once.
2018-08-30T15:41:52.670346-03:00 torcello systemd[1]: Started Ceph cluster
manager daemon.
2018-08-30T15:41:52.670565-03:00 torcello systemd[1]: Reached target ceph
target allowing to start/stop all ceph-mgr@.service instances at once.
2018-08-30T15:41:52.670839-03:00 torcello systemd[1]: Reached target ceph
target allowing to start/stop all ceph*@.service instances at once.
2018-08-30T15:41:52.671246-03:00 torcello systemd[1]: Starting Login and
scanning of iSCSI devices...
2018-08-30T15:41:52.672402-03:00 torcello systemd[1]: Starting Grafana
instance...
2018-08-30T15:41:52.678922-03:00 torcello systemd[1]: Started B

Re: [ceph-users] MDS not start. Timeout??

2018-08-31 Thread John Spray
On Fri, Aug 31, 2018 at 6:11 AM morf...@gmail.com  wrote:
>
> Hello all!
>
> I had an electric power problem. After this I have 2 incomplete PGs, but all 
> RBD volumes work.
>
> But my CephFS does not work. The MDS stops at the "replay" state, and MDS-related 
> commands hang:
>
> cephfs-journal-tool journal export backup.bin - freeze;
>
> cephfs-journal-tool event recover_dentries summary - freeze (no action in 
> strace);
>
> cephfs-journal-tool journal reset - freeze;
>
>

As you have noticed, you have two incomplete PGs.  They are presumably
metadata PGs, so CephFS can't read or write some parts of its
metadata, and its I/Os are blocking.

You need to investigate what's going on with those PGs and, longer
term, work out how your configuration allowed an electrical
problem to damage the cluster -- look into your drive controller
configuration (do you have e.g. writeback caches without battery
backup?) etc.
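
A minimal sketch of where to start digging (the pg id below is a
placeholder -- substitute the ones "ceph health detail" reports):

ceph health detail
ceph pg dump_stuck inactive
ceph pg <pgid> query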

John

> strace out:
>
>  
> [pid  6314] <... futex resumed> )   = -1 ETIMEDOUT (Connection timed out)
> [pid  6314] futex(0x55d342eea928, FUTEX_WAKE_PRIVATE, 1) = 0
> [pid  6314] futex(0x55d342eea97c, 
> FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 31, {1535692208, 64139948}, 
>  
> [pid  6318] <... futex resumed> )   = -1 ETIMEDOUT (Connection timed out)
> [pid  6318] futex(0x55d3430b6958, FUTEX_WAKE_PRIVATE, 1) = 0
> [pid  6318] futex(0x55d3430b6984, 
> FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 33, {1535692208, 80954445}, 
>  
> [pid  6324] <... futex resumed> )   = -1 ETIMEDOUT (Connection timed out)
> [pid  6324] futex(0x55d3430b7758, FUTEX_WAKE_PRIVATE, 1) = 0
> [pid  6324] write(12, "c", 1)   = 1
> [pid  6317] <... epoll_wait resumed> {{EPOLLIN, {u32=11, u64=11}}}, 5000, 
> 3) = 1
> [pid  6317] read(11, "c", 256)  = 1
> [pid  6317] read(11, 0x7f1558c32300, 256) = -1 EAGAIN (Resource temporarily 
> unavailable)
> [pid  6317] futex(0x55d3432269e0, FUTEX_WAIT_PRIVATE, 2, NULL 
> [pid  6324] futex(0x55d3432269e0, FUTEX_WAKE_PRIVATE, 1) = 1
> [pid  6317] <... futex resumed> )   = 0
> [pid  6317] futex(0x55d3432269e0, FUTEX_WAKE_PRIVATE, 1) = 0
> [pid  6317] sendmsg(17, {msg_name(0)=NULL, 
> msg_iov(1)=[{"\7\25\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\2\0\177\0\1\0\0\0\0\0\0\0\0\0\0"...,
>  75}], msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL 
> [pid  6324] futex(0x55d3430b7784, 
> FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 33, {1535692208, 169222622}, 
>  
> [pid  6317] <... sendmsg resumed> ) = 75
> [pid  6317] epoll_wait(10,
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Can luminous ceph rgw only run with civetweb?

2018-08-31 Thread linghucongsong
In Jewel I used the config below and RGW worked well with nginx. But with 
Luminous, nginx does not seem to work with RGW.



10.11.3.57, request: "GET / HTTP/1.1", upstream: 
"fastcgi://unix:/var/run/ceph/ceph-client.rgw.ceph-11.asok:", host: 
"10.11.3.57:7480"
2018/08/31 16:38:25 [error] 143730#143730: *110 recv() failed (104: Connection 
reset by peer) while reading response header from upstream, client: 10.11.3.57, 
server: 10.11.3.57, request: "GET / HTTP/1.1", upstream: 
"fastcgi://unix:/var/run/ceph/ceph-client.rgw.ceph-11.asok:", host: 
"10.11.3.57:7480"


[client.rgw.ceph-11]
public addr = 10.11.3.57
rgw_thread_pool_size = 512
rgw_frontends = fastcgi
rgw_socket_path = /var/run/ceph/ceph-client.rgw.ceph-11.asok
rgw_print_continue = false
rgw_content_length_compat = true
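
For comparison, a minimal sketch of the Luminous-style frontend selection
(civetweb is the default frontend in Luminous; the port below is only
chosen to match the listener in this setup):

[client.rgw.ceph-11]
rgw_frontends = civetweb port=7480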


root@ceph-11:/etc/ceph# ll  /var/run/ceph/ceph-client.rgw.ceph-11.asok
srwxrwxrwx 1 ceph ceph 0 Aug 31 16:28 
/var/run/ceph/ceph-client.rgw.ceph-11.asok=


http {

##
# Basic Settings
##
fastcgi_connect_timeout 300s;
fastcgi_send_timeout 300s;
fastcgi_read_timeout 300s;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
# server_tokens off;

# server_names_hash_bucket_size 64;
# server_name_in_redirect off;

include /etc/nginx/mime.types;
default_type application/octet-stream;

##
# SSL Settings
##

ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
ssl_prefer_server_ciphers on;

##
# Logging Settings
##

access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;

##
# Gzip Settings
##

gzip on;
gzip_disable "msie6";

# gzip_vary on;
# gzip_proxied any;
# gzip_comp_level 6;
# gzip_buffers 16 8k;
# gzip_http_version 1.1;
# gzip_types text/plain text/css application/json 
application/javascript text/xml application/xml application/xml+rss 
text/javascript;

##
# Virtual Host Configs
##

include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
server {
listen   7480 default;
server_name 10.11.3.57;
client_max_body_size 10240m;
#client_max_body_size 20m;
location / {
fastcgi_pass_header Authorization;
fastcgi_pass_request_headers on;
fastcgi_param QUERY_STRING  $query_string;
fastcgi_param REQUEST_METHOD $request_method;
fastcgi_param CONTENT_LENGTH $content_length;

if ($request_method = PUT) {
rewrite ^ /PUT$request_uri;
}

include fastcgi_params;
fastcgi_pass unix:/var/run/ceph/ceph-client.rgw.ceph-11.asok;
}

location /PUT/ {
internal;
fastcgi_pass_header Authorization;
fastcgi_pass_request_headers on;

include fastcgi_params;
fastcgi_param QUERY_STRING  $query_string;
fastcgi_param REQUEST_METHOD $request_method;
fastcgi_param CONTENT_LENGTH $content_length;
fastcgi_param  CONTENT_TYPE $content_type;
fastcgi_pass unix:/var/run/ceph/ceph-client.rgw.ceph-11.asok;
}
}



Thank you!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] mount cephfs without tiering

2018-08-31 Thread Fyodor Ustinov
Hi!

I have cephfs with tiering.
Does anyone know if it's possible to mount the file system so that tiering is 
not used?

I.e. I want to mount cephfs on the backup server without tiering and on the samba 
server with tiering.

Is it possible?

WBR,
Fyodor.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] (no subject)

2018-08-31 Thread Stas
Hello there,
I'm trying to reduce recovery impact on client operations and am using mclock
for this purpose. I've tested different weights for the queues but didn't see
any impact on real performance.

ceph version 12.2.8  luminous (stable)

Last tested config:
"osd_op_queue": "mclock_opclass",
"osd_op_queue_cut_off": "high",
"osd_op_queue_mclock_client_op_lim": "0.00",
"osd_op_queue_mclock_client_op_res": "1.00",
"osd_op_queue_mclock_client_op_wgt": "1000.00",
"osd_op_queue_mclock_osd_subop_lim": "0.00",
"osd_op_queue_mclock_osd_subop_res": "1.00",
"osd_op_queue_mclock_osd_subop_wgt": "1000.00",
"osd_op_queue_mclock_recov_lim": "0.00",
"osd_op_queue_mclock_recov_res": "1.00",
"osd_op_queue_mclock_recov_wgt": "1.00",

Is this feature really working? Am I doing something wrong?
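
One quick sanity check (a sketch, assuming the admin socket for osd.0 is
reachable on the local host) is to confirm the values actually took effect
at runtime:

ceph daemon osd.0 config show | grep mclock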
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph Object Gateway Server - Hardware Recommendations

2018-08-31 Thread Unni Sathyarajan
Hi ceph users,

I am setting up a cluster for S3-like storage. To decide on the server
specifications, where can I find the minimum and production-ready
hardware recommendations?

The following URL does not mention it:
http://docs.ceph.com/docs/hammer/start/hardware-recommendations/#minimum-hardware-recommendations

It has hardware recommendations for

   1. Ceph Object Storage Daemon Server
   2. Ceph Mon
   3. Ceph Metadata server


Can anyone please recommend the hardware specification I should look at
for a ceph storage cluster of 1 PB of storage?


Thanks in advance.
Unni
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Object Gateway Server - Hardware Recommendations

2018-08-31 Thread Marc Roos
 
Ok, from what I have learned so far from my own test environment (keep in 
mind I have only been running a test setup for a year): the S3 RGW is not 
that latency-sensitive, so you should be able to do fine with an HDD-only 
cluster. I guess my setup should be sufficient for what you need; test it 
and then just multiply it for your 1 PB.

3 nodes:

supermicro CSE-826A-R1200LPB
supermicro X9DRi-LN4F+
2x lsi logic sas 9207-8i 
2x xeon > 2.2GHz. (12 cores at least)
1x 10GB pcie
20GB+ internal memory
nx WD RED (pro) 4TB/6TB
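
As rough back-of-the-envelope math for the "multiply it" step (assuming 3x
replication, 6TB drives, and ignoring fill-level headroom):

1 PB usable x 3 replicas / 6 TB per drive = ~500 OSDs
~500 OSDs / 12 bays per 2U chassis = ~42 nodes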





-Original Message-
From: Unni Sathyarajan [mailto:unnisathy...@gmail.com] 
Sent: vrijdag 31 augustus 2018 14:08
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Ceph Object Gateway Server - Hardware 
Recommendations

Hi ceph users,

I am setting up a cluster for S3-like storage. To decide on the server 
specifications, where can I find the minimum and production-ready 
hardware recommendations?

The following URL does not mention it:
http://docs.ceph.com/docs/hammer/start/hardware-recommendations/#minimum-hardware-recommendations


It has hardware recommendations for 

1.  Ceph Object Storage Daemon Server

2.  Ceph Mon 

3.  Ceph Metadata server



Can anyone please recommend the hardware specification I should look at 
for a ceph storage cluster of 1 PB of storage.


Thanks in advance.
Unni




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] (no subject)

2018-08-31 Thread puyingdong

help

end

puyingdong
puyingd...@gmail.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Luminous RGW errors at start

2018-08-31 Thread Robert Stanford
 I installed a new Luminous cluster.  Everything is fine so far.  Then I
tried to start RGW and got this error:

2018-08-31 15:15:41.998048 7fc350271e80  0 rgw_init_ioctx ERROR:
librados::Rados::pool_create returned (34) Numerical result out of range
(this can be due to a pool or placement group misconfiguration, e.g. pg_num
< pgp_num or mon_max_pg_per_osd exceeded)
2018-08-31 15:15:42.005732 7fc350271e80 -1 Couldn't init storage provider
(RADOS)

 I notice that the only pools that exist are the data and index RGW pools
(no user or log pools like on Jewel).  What is causing this?
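
A sketch of checks matching the hints in the error message itself (the mon
id below is a placeholder for one of your monitors):

ceph osd pool ls detail | grep -E 'pg_num|pgp_num'
ceph daemon mon.<id> config get mon_max_pg_per_osd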

 Thank you
 R
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph-Deploy error on 15/71 stage

2018-08-31 Thread Jones de Andrade
Hi Eugen.

Entirely my misunderstanding: I thought there would be something at boot
time (which would certainly not make any sense at all). Sorry.

Before stage 3, I ran the commands you suggested on the nodes, and only one
gave me the output below:

###
# grep -C5 sda4 /var/log/messages
2018-08-28T08:26:50.635077-03:00 polar kernel: [3.029809] ata2.00:
ATAPI: PLDS DVD+/-RW DU-8A5LH, 6D1M, max UDMA/133
2018-08-28T08:26:50.635080-03:00 polar kernel: [3.030616] ata2.00:
configured for UDMA/133
2018-08-28T08:26:50.635082-03:00 polar kernel: [3.038249] scsi 1:0:0:0:
CD-ROMPLDS DVD+-RW DU-8A5LH 6D1M PQ: 0 ANSI: 5
2018-08-28T08:26:50.635085-03:00 polar kernel: [3.048102] usb 1-6: new
low-speed USB device number 2 using xhci_hcd
2018-08-28T08:26:50.635095-03:00 polar kernel: [3.051408] scsi 1:0:0:0:
Attached scsi generic sg1 type 5
2018-08-28T08:26:50.635098-03:00 polar kernel: [3.079763]  sda: sda1
sda2 sda3 sda4
2018-08-28T08:26:50.635101-03:00 polar kernel: [3.080548] sd 0:0:0:0:
[sda] Attached SCSI disk
2018-08-28T08:26:50.635104-03:00 polar kernel: [3.109021] sr 1:0:0:0:
[sr0] scsi3-mmc drive: 24x/24x writer cd/rw xa/form2 cdda tray
2018-08-28T08:26:50.635106-03:00 polar kernel: [3.109025] cdrom:
Uniform CD-ROM driver Revision: 3.20
2018-08-28T08:26:50.635109-03:00 polar kernel: [3.109246] sr 1:0:0:0:
Attached scsi CD-ROM sr0
2018-08-28T08:26:50.635112-03:00 polar kernel: [3.206490] usb 1-6: New
USB device found, idVendor=413c, idProduct=2113
--
2018-08-28T10:11:10.512604-03:00 polar os-prober: debug: running
/usr/lib/os-probes/mounted/83haiku on mounted /dev/sda1
2018-08-28T10:11:10.516374-03:00 polar 83haiku: debug: /dev/sda1 is not a
BeFS partition: exiting
2018-08-28T10:11:10.517805-03:00 polar os-prober: debug: running
/usr/lib/os-probes/mounted/90linux-distro on mounted /dev/sda1
2018-08-28T10:11:10.523382-03:00 polar os-prober: debug: running
/usr/lib/os-probes/mounted/90solaris on mounted /dev/sda1
2018-08-28T10:11:10.529317-03:00 polar os-prober: debug: /dev/sda2: is
active swap
2018-08-28T10:11:10.539818-03:00 polar os-prober: debug: running
/usr/lib/os-probes/50mounted-tests on /dev/sda4
2018-08-28T10:11:10.669852-03:00 polar systemd-udevd[456]: Network
interface NamePolicy= disabled by default.
2018-08-28T10:11:10.705602-03:00 polar systemd-udevd[456]: Specified group
'plugdev' unknown
2018-08-28T10:11:10.812270-03:00 polar 50mounted-tests: debug: mounted
using GRUB xfs filesystem driver
2018-08-28T10:11:10.817141-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/05efi
2018-08-28T10:11:10.832257-03:00 polar 05efi: debug: /dev/sda4 is xfs
partition: exiting
2018-08-28T10:11:10.837353-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/10freedos
2018-08-28T10:11:10.851042-03:00 polar 10freedos: debug: /dev/sda4 is not a
FAT partition: exiting
2018-08-28T10:11:10.854580-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/10qnx
2018-08-28T10:11:10.863539-03:00 polar 10qnx: debug: /dev/sda4 is not a
QNX4 partition: exiting
2018-08-28T10:11:10.865876-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/20macosx
2018-08-28T10:11:10.871781-03:00 polar macosx-prober: debug: /dev/sda4 is
not an HFS+ partition: exiting
2018-08-28T10:11:10.873708-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/20microsoft
2018-08-28T10:11:10.879146-03:00 polar 20microsoft: debug: Skipping legacy
bootloaders on UEFI system
2018-08-28T10:11:10.880798-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/30utility
2018-08-28T10:11:10.885707-03:00 polar 30utility: debug: /dev/sda4 is not a
FAT partition: exiting
2018-08-28T10:11:10.887422-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/40lsb
2018-08-28T10:11:10.892547-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/70hurd
2018-08-28T10:11:10.897110-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/80minix
2018-08-28T10:11:10.901133-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/83haiku
2018-08-28T10:11:10.904998-03:00 polar 83haiku: debug: /dev/sda4 is not a
BeFS partition: exiting
2018-08-28T10:11:10.906289-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/90linux-distro
2018-08-28T10:11:10.912016-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/90solaris
2018-08-28T10:11:10.915838-03:00 polar 50mounted-tests: debug: running
subtest /usr/lib/os-probes/mounted/efi
2018-08-28T10:11:11.757030-03:00 polar [RPM][4789]: erase
kernel-default-4.12.14-lp150.12.16.1.x86_64: success
2018-08-28T10:11:11.757912-03:00 polar [RPM][4789]: Transaction ID 5b8549e8
finished: 0
--
2018-08-28T10:13:08.815753-03:00 polar kernel: [2.885213] ata2.00:
configured for UD

Re: [ceph-users] safe to remove leftover bucket index objects

2018-08-31 Thread Dan van der Ster
So it sounds like you tried what I was going to do, and it broke
things. Good to know... thanks.

In our case, what triggered the extra index objects was a user running
PUT /bucketname/ around 20 million times -- this apparently recreates
the index objects.

-- dan
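
For reference, a rough sketch of the marker-comparison approach described
below (the pool name is the default one; jq and the exact stats field are
assumptions -- and given David's experience, verify before deleting
anything):

rados -p default.rgw.buckets.index ls | sort > index_objects.txt
radosgw-admin bucket stats | jq -r '.[].marker' | sort -u > live_markers.txt
# index objects whose embedded marker matches no live bucket
grep -vFf live_markers.txt index_objects.txt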

On Thu, Aug 30, 2018 at 7:20 PM David Turner  wrote:
>
> I'm glad you asked this, because it was on my to-do list. I know that a marker 
> not existing in the bucket stats does not mean the object is safe to delete.  
> I have an index pool with 22k objects in it. 70 objects match existing bucket 
> markers. I was having a problem on the cluster and started deleting the 
> objects in the index pool; after going through 200 objects I stopped it 
> and tested, and had lost access to 3 buckets. Luckily for me they were all buckets 
> I've been working on deleting, so no need for recovery.
>
> I then compared bucket IDs to the objects in that pool, but still only found 
> a couple hundred more matching objects. I have no idea what the other 22k 
> objects are in the index pool that don't match bucket markers or bucket 
> IDs. I did confirm there was no resharding happening, both in the reshard 
> list and in all bucket reshard statuses.
>
> Does anyone know how to parse the names of these objects and how to tell what 
> can be deleted?  This is of particular interest as I have another cluster with 
> 1M objects in the index pool.
>
> On Thu, Aug 30, 2018, 7:29 AM Dan van der Ster  wrote:
>>
>> Replying to self...
>>
>> On Wed, Aug 1, 2018 at 11:56 AM Dan van der Ster  wrote:
>> >
>> > Dear rgw friends,
>> >
>> > Somehow we have more than 20 million objects in our
>> > default.rgw.buckets.index pool.
>> > They are probably leftover from this issue we had last year:
>> > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-June/018565.html
>> > and we want to clean the leftover / unused index objects
>> >
>> > To do this, I would rados ls the pool, get a list of all existing
>> > buckets and their current marker, then delete any objects with an
>> > unused marker.
>> > Does that sound correct?
>>
>> More precisely, for example, there is an object
>> .dir.61c59385-085d-4caa-9070-63a3868dccb6.2978181.59.8 in the index
>> pool.
>> I run `radosgw-admin bucket stats` to get the marker for all current
>> existing buckets.
>> The marker 61c59385-085d-4caa-9070-63a3868dccb6.2978181.59 is not
>> mentioned in the bucket stats output.
>> Is it safe to rados rm 
>> .dir.61c59385-085d-4caa-9070-63a3868dccb6.2978181.59.8 ??
>>
>> Thanks in advance!
>>
>> -- dan
>>
>>
>>
>>
>>
>>
>>
>> > Can someone suggest a better way?
>> >
>> > Cheers, Dan
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs speed

2018-08-31 Thread Peter Eisch
[replying to myself]

I set aside cephfs and created an rbd volume.  I get the same splotchy 
throughput with rbd as I was getting with cephfs.   (image attached)

So, I'm withdrawing this question, as it is not a cephfs issue.

#backingout

peter



Peter Eisch
virginpulse.com

On 8/30/18, 12:25 PM, "Peter Eisch"  wrote:

Thanks for the thought.  It’s mounted with this entry in fstab (one line, 
if email wraps it):

cephmon-s01,cephmon-s02,cephmon-s03:/ /loam ceph noauto,name=clientname,secretfile=/etc/ceph/secret,noatime,_netdev 0 2

Pretty plain, but I'm open to tweaking!

peter

From: Gregory Farnum 
Date: Thursday, August 30, 2018 at 11:47 AM
To: Peter Eisch 
Cc: "ceph-users@lists.ceph.com" 
Subject: Re: [ceph-users] cephfs speed

How are you mounting CephFS? It may be that the cache settings are just set 
very badly for a 10G pipe. Plus rados bench is a very parallel large-IO 
benchmark and many benchmarks you might dump into a filesystem are definitely 
not.
-Greg

On Thu, Aug 30, 2018 at 7:54 AM Peter Eisch 
 wrote:
Hi,

I have a cluster serving cephfs and it works. It’s just slow. Client is 
using the kernel driver. I can ‘rados bench’ writes to the cephfs_data pool at 
wire speeds (9580Mb/s on a 10G link) but when I copy data into cephfs it is 
rare to get above 100Mb/s. Large file writes may start fast (2Gb/s) but within 
a minute slows. In the dashboard at the OSDs I get lots of triangles (it 
doesn't stream) which seems to be lots of starts and stops. By contrast the 
graphs show constant flow when using 'rados bench.'

I feel like I'm missing something obvious. What can I do to help diagnose 
this better or resolve the issue?
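
One way to narrow it down (a sketch: /loam is the mount point from the
fstab above, cephfs_data the data pool, sizes arbitrary) is to compare a
single-stream file write against a single-threaded rados bench, since
rados bench defaults to 16 concurrent ops:

dd if=/dev/zero of=/loam/testfile bs=4M count=1024 conv=fdatasync
rados -p cephfs_data bench 60 write -t 1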

Errata:
Version: 12.2.7 (on everything)
mon: 3 daemons, quorum cephmon-s01,cephmon-s03,cephmon-s02
mgr: cephmon-s02(active), standbys: cephmon-s01, cephmon-s03
mds: cephfs1-1/1/1 up {0=cephmon-s02=up:active}, 2 up:standby
osd: 70 osds: 70 up, 70 in
rgw: 3 daemons active

rados bench summary:
Total time run: 600.043733
Total writes made: 167725
Write size: 4194304
Object size: 4194304
Bandwidth (MB/sec): 1118.09
Stddev Bandwidth: 7.23868
Max bandwidth (MB/sec): 1140
Min bandwidth (MB/sec): 1084
Average IOPS: 279
Stddev IOPS: 1
Max IOPS: 285
Min IOPS: 271
Average Latency(s): 0.057239
Stddev Latency(s): 0.0354817
Max latency(s): 0.367037
Min latency(s): 0.0120791

peter



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount cephfs without tiering

2018-08-31 Thread Gregory Farnum
You mean you set up CephFS with a cache tier but want to ignore it?
No, that's generally not possible. How would the backup server get
consistent data if it's ignoring the cache? (Answer: It can't.)
-Greg

On Fri, Aug 31, 2018 at 2:35 AM Fyodor Ustinov  wrote:

> Hi!
>
> I have cephfs with tiering.
> Does anyone know if it's possible to mount the file system so that
> tiering is not used?
>
> I.e. I want to mount cephfs on the backup server without tiering and on
> the samba server with tiering.
>
> Is it possible?
>
> WBR,
> Fyodor.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] filestore split settings

2018-08-31 Thread David Turner
More important than being able to push those settings even further is
probably the ability to actually split your subfolders. I've been using
variants of this [1] script I created a while back to take care of that.

To answer your question, we do run with much larger settings than you're
using: 128/-16. The negative value prevents subfolder merging while still
allowing the value to be used when calculating the split threshold.
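
As a sanity check on the numbers (assuming the usual filestore formula,
split threshold = 16 * filestore_split_multiple * abs(filestore_merge_threshold)):
the 70/20 settings below give 16 * 20 * 70 = 22400, matching the
per-directory count Rafael reports. A 128/-16 setup would then be:

[osd]
filestore split multiple = 128
filestore merge threshold = -16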

Take a look at the script. It stops the osds, aggressively sets the
subfolder settings, splits the subfolders to that setting, puts your
settings back, and starts your osds. I do this about once a month for our
use case of growing data.


[1] https://gist.github.com/drakonstein/cb76c7696e65522ab0e699b7ea1ab1c4

On Wed, Aug 22, 2018, 7:37 AM Rafael Lopez  wrote:

> Hi all,
>
> For those still using filestore and running clusters with a large number
> of objects, I am seeking some thoughts on increasing the filestore split
> settings. Currently we have:
>
> filestore merge threshold = 70
> filestore split multiple = 20
>
> Has anyone gone higher than this?
>
> We are hitting the threshold of 22400 files per dir on osds for a
> particular pool and experiencing slow reqs, and other osd badness as a
> result. I am wondering if we can simply increase these values to push the
> files per dir threshold and delay splitting without major consequences, eg.
> to 80/30. This would probably buy us enough time to move to bluestore.
>
> --
> *Rafael Lopez*
> Research Devops Engineer
> Monash University eResearch Centre
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs speed

2018-08-31 Thread Joe Comeau
Are you using bluestore OSDs ?

If so, my thought process on this is that what we are having an issue with 
is caching and bluestore.

See the thread on bluestore caching:
"Re: [ceph-users] Best practices for allocating memory to bluestore cache"

Before, when we were on Jewel and filestore, we could get a much better 
sustained write. Now on bluestore we are not getting more than a sustained 
2GB file write before it drastically slows down. Then it fluctuates between 
0kB/s and 100MB/s, back and forth, as it is writing.
Thanks Joe

>>> Peter Eisch  8/31/2018 10:31 AM >>>

[replying to myself]

I set aside cephfs and created an rbd volume. I get the same splotchy 
throughput with rbd as I was getting with cephfs. (image attached)

So, I'm withdrawing this question, as it is not a cephfs issue.

#backingout

peter



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] cephfs speed

2018-08-31 Thread David Byte
Are these single threaded writes that you are referring to?  It certainly 
appears so from the thread, but I thought it would be good to confirm that 
before digging in further.


David Byte
Sr. Technology Strategist
SCE Enterprise Linux
SCE Enterprise Storage
Alliances and SUSE Embedded
db...@suse.com
918.528.4422

From: ceph-users  on behalf of Joe Comeau 

Date: Friday, August 31, 2018 at 1:07 PM
To: "ceph-users@lists.ceph.com" , Peter Eisch 

Subject: Re: [ceph-users] cephfs speed

Are you using bluestore OSDs ?

If so, my thought process on this is that what we are having an issue with 
is caching and bluestore.

See the thread on bluestore caching:
"Re: [ceph-users] Best practices for allocating memory to bluestore cache"

Before, when we were on Jewel and filestore, we could get a much better 
sustained write. Now on bluestore we are not getting more than a sustained 
2GB file write before it drastically slows down. Then it fluctuates between 
0kB/s and 100MB/s, back and forth, as it is writing.

Thanks Joe

>>> Peter Eisch  8/31/2018 10:31 AM >>>
[replying to myself]

I set aside cephfs and created an rbd volume. I get the same splotchy 
throughput with rbd as I was getting with cephfs. (image attached)

So, I'm withdrawing this question, as it is not a cephfs issue.

#backingout

peter

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com