[ceph-users] removed_snaps in ceph osd dump?

2015-06-15 Thread Jan Schermer
Hi,
I have ~1800 removed_snaps listed in the output of “ceph osd dump”.

Is that all right? Is there any way to get rid of them? What's the significance?

Thanks

Jan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS client issue

2015-06-15 Thread John Spray



On 14/06/15 20:00, Matteo Dacrema wrote:


Hi Lincoln,


I'm using the kernel client.

Kernel version is: 3.13.0-53-generic



That's old by CephFS standards.  It's likely that the issue you're 
seeing is one of the known bugs (which were actually the motivation for 
adding the warning message you're seeing).


John
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ec pool history objects

2015-06-15 Thread 池信泽
hi, all:

when I use an ec pool, I see there are some history objects for object
xx.

[root@node3 2.1d6s2_head]# ll -R | grep xx
-rw-r--r--. 1 root root  65536 Jun 15 17:41 xx__head_610951D6__2_fe1_2
-rw-r--r--. 1 root root  65536 Jun 15 17:41 xx__head_610951D6__2_fe2_2
-rw-r--r--. 1 root root  65536 Jun 15 17:45 xx__head_610951D6__2__2

   I think these objects are used for rollback when not all shards have
written the object to the filestore. Is that right?

   If so, why not delete these history objects once all shards have written
the object to the filestore?

-- 
Regards,
xinze
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ec pool history objects

2015-06-15 Thread 池信泽
hi, all:

when I use an ec pool, I see there are some history objects for object
xx.

Such as: xx__head_610951D6__2_fe1_2, xx__head_610951D6__2_fe2_2,
xx__head_610951D6__2__2

I think these objects are used for rollback when not all shards have
written the object to the filestore. Is that right?

If so, why not delete these history objects once all shards have
written the object to the filestore?

-- 
Regards,
xinze
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS client issue

2015-06-15 Thread Matteo Dacrema
Ok, I'll update kernel to 3.16.3 version and let you know.


Thanks,

Matteo


From: John Spray 
Sent: Monday, 15 June 2015 10:51
To: Matteo Dacrema; Lincoln Bryant; ceph-users
Subject: Re: [ceph-users] CephFS client issue



On 14/06/15 20:00, Matteo Dacrema wrote:

Hi Lincoln,


I'm using the kernel client.

Kernel version is: 3.13.0-53-generic

That's old by CephFS standards.  It's likely that the issue you're seeing is 
one of the known bugs (which were actually the motivation for adding the 
warning message you're seeing).

John

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Rebalancing two nodes simultaneously

2015-06-15 Thread Lindsay Mathieson
If I have two nodes with identical drive/osd setups
Drive 1 = 3TB
Drive 2 = 1TB
Drive 3 = 1TB

All with equal weights of (1)

I now decide to reweight Drive 1 to (3).

Would it be best to do one node at a time, or do both nodes simultaneously?

I would presume that all the data shuffling would be internal to the nodes,
so they would not affect each other.


NB. Drive 1 is rather slower than drives 2 & 3 which is why I did not
reweight at the start, but we are running out of space (up to 84% on the
1TB drives)
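
(For reference, the reweight itself would presumably be a CRUSH reweight per 3TB
drive, something along these lines; osd.0 and osd.3 are placeholder IDs here:

ceph osd crush reweight osd.0 3.0
ceph osd crush reweight osd.3 3.0
)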
-- 
Lindsay
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cephfs unmounts itself from time to time

2015-06-15 Thread Roland Giesler
I have a small cluster of 4 machines and quite a few drives.  After about 2
- 3 weeks cephfs fails.  It's no longer properly mounted in /mnt/cephfs,
which of course causes the running VMs to fail too.

In /var/log/syslog I have "/mnt/cephfs: File exists at
/usr/share/perl5/PVE/Storage/DirPlugin.pm line 52" repeatedly.

There doesn't seem to be anything wrong with ceph at the time.

# ceph -s
cluster 40f26838-4760-4b10-a65c-b9c1cd671f2f
 health HEALTH_WARN clock skew detected on mon.s1
 monmap e2: 2 mons at {h1=192.168.121.30:6789/0,s1=192.168.121.33:6789/0},
election epoch 312, quorum 0,1 h1,s1
 mdsmap e401: 1/1/1 up {0=s3=up:active}, 1 up:standby
 osdmap e5577: 19 osds: 19 up, 19 in
  pgmap v11191838: 384 pgs, 3 pools, 774 GB data, 455 kobjects
1636 GB used, 9713 GB / 11358 GB avail
 384 active+clean
  client io 12240 kB/s rd, 1524 B/s wr, 24 op/s

# ceph osd tree
# id  weight  type name       up/down  reweight
-1    11.13   root default
-2    8.14        host h1
 1    0.9             osd.1   up       1
 3    0.9             osd.3   up       1
 4    0.9             osd.4   up       1
 5    0.68            osd.5   up       1
 6    0.68            osd.6   up       1
 7    0.68            osd.7   up       1
 8    0.68            osd.8   up       1
 9    0.68            osd.9   up       1
10    0.68            osd.10  up       1
11    0.68            osd.11  up       1
12    0.68            osd.12  up       1
-3    0.45        host s3
 2    0.45            osd.2   up       1
-4    0.9         host s2
13    0.9             osd.13  up       1
-5    1.64        host s1
14    0.29            osd.14  up       1
 0    0.27            osd.0   up       1
15    0.27            osd.15  up       1
16    0.27            osd.16  up       1
17    0.27            osd.17  up       1
18    0.27            osd.18  up       1

When I "umount -l /mnt/cephfs" and then "mount -a" after that, the ceph
volume is loaded again.  I can restart the VMs and all seems well.

I can't find errors pertaining to cephfs in the other logs either.

System information:

Linux s1 2.6.32-34-pve #1 SMP Fri Dec 19 07:42:04 CET 2014 x86_64 GNU/Linux

I can't upgrade to kernel v3.13 since I'm using containers.

Of course, I want to prevent this from happening!  How do I troubleshoot
that?  What is causing this?

regards


*Roland Giesler*
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs unmounts itself from time to time

2015-06-15 Thread Gregory Farnum
On Mon, Jun 15, 2015 at 4:03 AM, Roland Giesler  wrote:
> I have a small cluster of 4 machines and quite a few drives.  After about 2
> - 3 weeks cephfs fails.  It's not properly mounted anymore in /mnt/cephfs,
> which of course causes the VM's running to fail too.
>
> In /var/log/syslog I have "/mnt/cephfs: File exists at
> /usr/share/perl5/PVE/Storage/DirPlugin.pm line 52" repeatedly.
>
> There doesn't seem to be anything wrong with ceph at the time.
>
> # ceph -s
> cluster 40f26838-4760-4b10-a65c-b9c1cd671f2f
>  health HEALTH_WARN clock skew detected on mon.s1
>  monmap e2: 2 mons at
> {h1=192.168.121.30:6789/0,s1=192.168.121.33:6789/0}, election epoch 312,
> quorum 0,1 h1,s1
>  mdsmap e401: 1/1/1 up {0=s3=up:active}, 1 up:standby
>  osdmap e5577: 19 osds: 19 up, 19 in
>   pgmap v11191838: 384 pgs, 3 pools, 774 GB data, 455 kobjects
> 1636 GB used, 9713 GB / 11358 GB avail
>  384 active+clean
>   client io 12240 kB/s rd, 1524 B/s wr, 24 op/s
> # ceph osd tree
> # id  weight  type name       up/down  reweight
> -1    11.13   root default
> -2    8.14        host h1
>  1    0.9             osd.1   up       1
>  3    0.9             osd.3   up       1
>  4    0.9             osd.4   up       1
>  5    0.68            osd.5   up       1
>  6    0.68            osd.6   up       1
>  7    0.68            osd.7   up       1
>  8    0.68            osd.8   up       1
>  9    0.68            osd.9   up       1
> 10    0.68            osd.10  up       1
> 11    0.68            osd.11  up       1
> 12    0.68            osd.12  up       1
> -3    0.45        host s3
>  2    0.45            osd.2   up       1
> -4    0.9         host s2
> 13    0.9             osd.13  up       1
> -5    1.64        host s1
> 14    0.29            osd.14  up       1
>  0    0.27            osd.0   up       1
> 15    0.27            osd.15  up       1
> 16    0.27            osd.16  up       1
> 17    0.27            osd.17  up       1
> 18    0.27            osd.18  up       1
>
> When I "umount -l /mnt/cephfs" and then "mount -a" after that, the the ceph
> volume is loaded again.  I can restart the VM's and all seems well.
>
> I can't find errors pertaining to cephfs in the the other logs either.
>
> System information:
>
> Linux s1 2.6.32-34-pve #1 SMP Fri Dec 19 07:42:04 CET 2014 x86_64 GNU/Linux

I'm not sure what version of Linux this really is (I assume it's a
vendor kernel of some kind!), but it's definitely an old one! CephFS
sees pretty continuous improvements to stability, so you could be hitting
any number of bugs that have since been resolved.

If you can't upgrade the kernel, you might try out the ceph-fuse
client instead as you can run a much newer and more up-to-date version
of it, even on the old kernel. Other than that, can you include more
information about exactly what you mean when saying CephFS unmounts
itself?
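
(For reference, a ceph-fuse mount of the same filesystem would look roughly like
this, assuming the ceph.conf and admin keyring are in their default locations and
using the monitor address from the ceph -s output above:

ceph-fuse -m 192.168.121.30:6789 /mnt/cephfs

The PVE storage definition would then need to point at the FUSE mount instead of
the kernel mount.)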
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Rebalancing two nodes simultaneously

2015-06-15 Thread Lindsay Mathieson
If I have two nodes with identical drive/osd setups
Drive 1 = 3TB
Drive 2 = 1TB
Drive 3 = 1TB

All with equal weights of (1)

I now decide to reweight Drive 1 to (3).

Would it be best to do one node at a time, or do both nodes simultaneously?

I would presume that all the data shuffling would be internal to the nodes,
so they would not affect each other.


NB. Drive 1 is rather slower than drives 2 & 3 which is why I did not
reweight at the start, but we are running out of space (up to 84% on the
1TB drives)


P.S. Is there a way to speed up the rebalance? The cluster is unused
overnight, so I can thrash the IO.

-- 
Lindsay
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph OSD with OCFS2

2015-06-15 Thread gjprabu
Hi  

The size difference issue is solved. It was related to the ocfs2 format 
options: the cluster size (-C) should be 4K.
(mkfs.ocfs2 /dev/mapper/mpatha -N 64 -b 4K -C 256K -T mail 
--fs-features=extended-slotmap --fs-feature-level=max-features -L ) 

needs to change to something like:
(mkfs.ocfs2 /dev/mapper/mpatha -b 4K -C 4K -L label -T mail -N 2 /dev/sdX)

 << Also please let us know the reason ( Extra 2-3 mins is taken for hg / 
git repository operation like clone , pull , checkout and update.)
 << Could you please explain a bit what you are trying to do here ?

    In the ceph shared directory, we clone the source repository and then access 
it from the ceph client.


   
Regards
Prabu


 On Fri, 12 Jun 2015 21:12:03 +0530 Somnath Roy 
 wrote  

  Sorry, it was a typo, I meant to say 1GB only.
 I would say break the problem down like the following.
  
 1. Run some fio workload, say 1GB, on RBD and then run a ceph command like 'ceph df' 
to see how much data was written. I am sure you will see the same amount. 
Remember that by default the ceph rados object size is 4MB, so it should write 
1GB/4MB = 256 objects.
  
 2. Also, you can use the 'rados' utility to directly put/get, say, a 1GB file to the 
cluster and check in a similar way.
  
 As I said, if your journal is on the same device and you measure the space 
consumed by the entire OSD mount point, it will be more because of the write 
amplification (WA) induced by Ceph. But the size of the individual files you 
transferred should not differ.
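  
 (A rough sketch of both checks, assuming fio was built with rbd support and an 
RBD image named 'test' exists in pool 'rbd'; the image, pool and object names 
here are placeholders:
  
 fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=test \
 --name=seqwrite --rw=write --bs=4M --size=1G --iodepth=16
 ceph df
  
 dd if=/dev/zero of=/tmp/testfile bs=1M count=1024
 rados -p rbd put testfile /tmp/testfile
 rados -p rbd stat testfile
 ceph df
 )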
  
 << Also please let us know the reason ( Extra 2-3 mins is taken for hg / 
git repository operation like clone , pull , checkout and update.)
 Could you please explain a bit what you are trying to do here ?
  
 Thanks & Regards
 Somnath
  
   From: gjprabu [mailto:gjpr...@zohocorp.com] 
 Sent: Friday, June 12, 2015 12:34 AM
 To: Somnath Roy
 Cc: ceph-users@lists.ceph.com; Kamala Subramani; Siva Sokkumuthu
 Subject: Re: RE: RE: [ceph-users] Ceph OSD with OCFS2
 
 
  
  Hi,
 
   I measured only the data that I transferred from the client. For example, 
after a 500MB file transfer completes, if I measure the same file its size is 
1GB, not 10GB. 
 
Our Configuration is :-
 
=
 ceph -w
 cluster f428f5d6-7323-4254-9f66-56a21b099c1a
 health HEALTH_OK
 monmap e1: 3 mons at 
{cephadmin=172.20.19.235:6789/0,cephnode1=172.20.7.168:6789/0,cephnode2=172.20.9.41:6789/0},
 election epoch 114, quorum 0,1,2 cephnode1,cephnode2,cephadmin
 osdmap e9: 2 osds: 2 up, 2 in
 pgmap v1022: 64 pgs, 1 pools, 7507 MB data, 1952 objects
 26139 MB used, 277 GB / 302 GB avail
 64 active+clean
 
===
 ceph.conf
 [global]
 osd pool default size = 2
 auth_service_required = cephx
 filestore_xattr_use_omap = true
 auth_client_required = cephx
 auth_cluster_required = cephx
 mon_host = 172.20.7.168,172.20.9.41,172.20.19.235
 mon_initial_members = zoho-cephnode1, zoho-cephnode2, zoho-cephadmin
 fsid = f428f5d6-7323-4254-9f66-56a21b099c1a
 

 
 What is the replication policy you are using ?
  
We are using the default of 2 replicas; we are not using a custom CRUSH map, 
PG num, erasure coding, etc. 
 
 What interface you used to store the data ?
 
We are using RBD to store data and it has been mounted with OCFS2 on the 
client side.
 
 How are you removing data ? Are you removing a rbd image ?
  
We are not removing the rbd image, only removing existing data using the rm 
command from the client. We didn't set up any async way to transfer or remove 
data.
 
 
 Also please let us know the reason ( Extra 2-3 mins is taken for hg / git 
repository operation like clone , pull , checkout and update.)
 
 
 
   Regards
 
 Prabu GJ
   
 
 
  
   
  On Fri, 12 Jun 2015 00:21:24 +0530 Somnath Roy 
 wrote  
 
Hi,
  
 Ceph's journal works in a different way. It's a write-ahead journal: all the data 
is persisted first in the journal and then written to its actual place. 
Journal data is encoded. The journal is a fixed-size partition/file and is written 
sequentially. So, if you are placing the journal on HDDs, it will be overwritten; 
for the SSD case, it will be garbage-collected later. So, if you are measuring the 
amount of data written to the device, it will be double. But if you are saying you 
have written a 500MB file to the cluster and are seeing an actual file size of 10G, 
that should not be the case. How are you seeing this size, BTW?
  
 Could you please tell us more about your configuration ?
 What is the replication policy you are using ?
 What interface you used to store the data ?
  
 Regarding your other query..
  
 << If i transfer 1GB data, what will be server size(OSD), Is this will 
write compressed format
  
 No, actual data is not compressed. You don’t want to fill up 

Re: [ceph-users] Rebalancing two nodes simultaneously

2015-06-15 Thread Lindsay Mathieson
On 15 June 2015 at 21:16, Lindsay Mathieson 
wrote:

> p,s Is there a way to speed up the rebalance? the cluster is unused
> overnight, so I can thrash the IO



I bumped max_backfills to 20 and recovery max active to 30 using injectargs.
Nothing seems to be breaking yet :) I/O delay seems to be around 15%.
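
(For reference, that injectargs invocation is roughly:

ceph tell osd.* injectargs '--osd-max-backfills 20 --osd-recovery-max-active 30'
)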

-- 
Lindsay
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] What is link and unlink options used for in radosgw-admin

2015-06-15 Thread WCMinor
Wenjun Huang  writes:

> 
> 
> Hello, everyone
> I am now confused with the options of link & unlink in radosgw-admin utility.
> 
> In my opinion, if I link ownerA's bucketA to ownerB through the command
below:
> 
> radosgw-admin bucket link —uid=ownerB —bucket=bucketA
> 
> then, I think the owner of bucketA is ownerB. 
> 
> But, in my test, there is nothing changed after I run the command except
that the displayed “owner: “ has changed in the result of the command:
>  radosgw-admin bucket stats —bucket=bucketA
> 
> I can still do nothing to bucketA through user ownerB.
> 
> Have I misunderstood the usage of link & unlink? What are they really for?
> 
> Thanks
> Wenjun
> 
> 
> ___
> ceph-users mailing list
> ceph-users@...
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


Hi, I just had the same problem. I solved it by directly changing the attrs in
the bucket metadata. Do:

# radosgw-admin metadata get bucket:"nameofyourbucket"

then you'll get the bucket_id and other stuff as response, do:
# radosgw-admin metadata get bucket.instance:"nameofyourbucket":"bucket_id" > bucket.json

Then edit it, and change the "val" field after the "key" field called
"user.rgw.acl"

You have to put there string corresponding to the uid of your new owner. In
order to get it, you can create a test bucket with the new owner and repeat
the operations above.

Last step, save your json file and insert the new metadata into the bucket:

# radosgw-admin metadata put bucket.instance:"nameofyourbucket":"bucket_id" < bucket.json

Best,

WCMinor

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] anyone using CephFS for HPC?

2015-06-15 Thread Barclay Jameson
I am currently implementing Ceph into our HPC environment to handle
SAS temp workspace.
I am starting out with 3 OSD nodes with 1 MON/MDS node.
16 4TB HDDs per OSD node with 4 120GB SSD.
Each node has 40Gb Mellanox interconnect between each other to a
Mellanox switch.
Each client node has 10Gb to switch.

I have not done comparisons to Lustre but I have done comparisons to
PanFS which we currently use in production.
I have found that for most workflows Ceph is comparable to PanFS if not
better; however, PanFS still does better with small IO due to how it
caches small files.
If you want I can give you some hard numbers.

almightybeeij

On Fri, Jun 12, 2015 at 12:31 AM, Nigel Williams
 wrote:
> Wondering if anyone has done comparisons between CephFS and other
> parallel filesystems like Lustre typically used in HPC deployments
> either for scratch storage or persistent storage to support HPC
> workflows?
>
> thanks.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] NFS interaction with RBD

2015-06-15 Thread Simon Leinen
Trent Lloyd writes:
> Jens-Christian Fischer  writes:
>> 
>> I think we (i.e. Christian) found the problem:
>> We created a test VM with 9 mounted RBD volumes (no NFS server). As soon as 
> he hit all disks, we started to experience these 120 second timeouts. We 
> realized that the QEMU process on the hypervisor is opening a TCP connection 
> to every OSD for every mounted volume - exceeding the 1024 FD limit.
>> 
>> So no deep scrubbing etc, but simply too many connections…

> Have seen mention of similar from CERN in their presentations, found this 
> post on a quick google.. might help?

> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-December/026187.html

Yes, that's exactly the problem that we had.  We solved it by setting
max_files to 8191 in /etc/libvirt/qemu.conf on all compute hosts.
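
(i.e. in /etc/libvirt/qemu.conf:

max_files = 8191

followed, typically, by a restart of libvirtd on each compute host so that newly
started or migrated guests pick up the higher limit.)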

Once that was applied, we were able to live-migrate running instances
for them to enjoy the increased limit.
-- 
Simon.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] NFS interaction with RBD

2015-06-15 Thread Simon Leinen
Christian Schnidrig writes:
> Well that’s strange. I wonder why our systems behave so differently.

One point about our cluster (I work with Christian, who's still on
vacation, and Jens-Christian) is that it has 124 OSDs and 2048 PGs (I
think) in the pool used for these RBD volumes.  As a result, each
connected RBD volume can result in 124 (or slightly less) connections
from the RBD client inside Qemu/KVM to each OSD that stores data from
that RBD volume.

I don't know how librbd's connection management works.  I assume that
these librbd-to-OSD connections are only created once the client
actually tries to access data on that OSD.  But when you have a lot of
data on the RBD volumes that the VM actually accesses (which we have),
then these many connections will actually be created.  And apparently
librbd doesn't handle the situation very gracefully when its process
runs into the limit of open file descriptors.

George only has 20 OSDs, so I guess that's an upper bound on the number
of TCP connections that librbd will open per RBD volume.  He should be
safe up to about 50 volumes per VM, assuming the default nfiles limit of
1024.
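
(Back-of-the-envelope: 50 volumes x 20 OSDs = 1000 descriptors, which just
squeezes under the default 1024 once you leave a couple of dozen for QEMU's own
disks, sockets and devices.)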

The nasty thing is when everything has been running fine for ages, and
then you add a bunch of OSDs, run a few benchmarks, see that everything
should run much BETTER (as promised :-), but then suddenly some VMs with
lots of mounted volumes mysteriously start hanging.

> Maybe the number of placement groups plays a major role as
> well. Jens-Christian may be able to give you the specifics of our ceph
> cluster.

Me too, see above.

> I’m about to leave on vacation and don’t have time to look that up
> anymore.

Enjoy your well-earned vacation!!
-- 
Simon.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] need help

2015-06-15 Thread Ranjan, Jyoti

Hi,

I have been trying to deploy the ceph rados gateway on a single node but have 
been failing for a while. My ceph cluster has three OSDs and looks fine. I could 
create the gateway user, but the user is not able to create a bucket. I am 
getting the error below ...


Traceback (most recent call last):
  File "s3test.py", line 13, in 
bucket = conn.create_bucket('my-new-bucket')
  File "/usr/lib/python2.7/dist-packages/boto/s3/connection.py", line 504, in 
create_bucket
response.status, response.reason, body)
boto.exception.S3ResponseError: S3ResponseError: 404 Not Found
None

My code to create the bucket is ...

import boto
import boto.s3.connection
access_key = '1TFIE3C1MUO7W9FGS5T6'
secret_key = 'EAskQAhQCzs1Y6o6RJZHS5tc0yCu6jkbAd9WvdHe'
conn = boto.connect_s3(
aws_access_key_id = access_key,
aws_secret_access_key = secret_key,
host = 'aviator.objectstore.com',
is_secure=False,
calling_format = boto.s3.connection.OrdinaryCallingFormat(),
)

bucket = conn.create_bucket('my-new-bucket')

Regards,
Jyoti Ranjan



   cluster 373abfe1-c6d4-45b3-9ea7-f91c240d9532
 health HEALTH_OK
 monmap e1: 1 mons at {aviator=10.1.195.51:6789/0}
election epoch 2, quorum 0 aviator
 osdmap e27: 3 osds: 3 up, 3 in
  pgmap v208: 240 pgs, 8 pools, 1508 bytes data, 47 objects
108 MB used, 33650 MB / 33758 MB avail
 240 active+clean



ubuntu@aviator:~/ceph-cluster$ radosgw-admin user info --uid=testuser
{
"user_id": "testuser",
"display_name": "First User",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"auid": 0,
"subusers": [
{
"id": "testuser:swift",
"permissions": ""
}
],
"keys": [
{
"user": "testuser",
"access_key": "1TFIE3C1MUO7W9FGS5T6",
"secret_key": "EAskQAhQCzs1Y6o6RJZHS5tc0yCu6jkbAd9WvdHe"
},
{
"user": "testuser:swift",
"access_key": "85SJ2SSAB74IQGLJRGQ8",
"secret_key": ""
}
],
"swift_keys": [
{
"user": "testuser:swift",
"secret_key": "5oGD6P4Y90jVMDVEeuBvpLxXYWikQWrAwLCB8isr"
}
],
"caps": [],
"op_mask": "read, write, delete",
"default_placement": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"max_size_kb": -1,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"max_size_kb": -1,
"max_objects": -1
},
"temp_url_keys": []
}
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] unfound object(s)

2015-06-15 Thread GuangYang
Hello Cephers,
On one of our production clusters, there is one *unfound* object reported which 
makes the PG stuck in recovering. While trying to recover the object, I failed 
to find a way to tell which object is unfound.

I tried:
  1> PG query
  2> Grep from monitor log

Did I miss anything?  

Thanks,
Guang 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] unfound object(s)

2015-06-15 Thread GuangYang
Thanks to Sam, we can use:
  ceph pg <pgid> list_missing
to get the list of unfound objects.

Thanks,
Guang



> From: yguan...@outlook.com
> To: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com
> Date: Mon, 15 Jun 2015 16:46:53 +
> Subject: [ceph-users] unfound object(s)
>
> Hello Cephers,
> On one of our production clusters, there is one *unfound* object reported 
> which make the PG stuck at recovering. While trying to recover the object, I 
> failed to find a way to tell which object is unfound.
>
> I tried:
> 1> PG query
> 2> Grep from monitor log
>
> Did I miss anything?
>
> Thanks,
> Guang
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
  
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] firefly to giant upgrade broke ceph-gw

2015-06-15 Thread Michael Kuriger
Hi all,
I recently upgraded my 2 ceph clusters from firefly to giant.  After the
update, the ceph gateway has some issues.  I've even gone so far as to
completely remove all gateway-related pools and recreate them from scratch.

I can write data into the gateway, and that seems to work (most of the
time) but deleting is not working unless I specify an exact file to
delete.  Also, my radosgw-agent is not syncing buckets any longer.  I'm
using s3cmd to test reads/writes to the gateway.

Has anyone else had problems in giant?
 
Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] CephFS: 'ls -alR' performance terrible unless Linux cache flushed

2015-06-15 Thread negillen negillen
Hello everyone,

something very strange is driving me crazy with CephFS (kernel driver).
I copy a large directory on the CephFS from one node. If I try to perform a
'time ls -alR' on that directory it gets executed in less than one second.
If I try to do the same 'time ls -alR' from another node it takes several
minutes. No matter how many times I repeat the command, the speed is always
abysmal. The ls works fine on the node where the initial copy was executed
from. This happens with any directory I have tried, no matter what kind of
data is inside.

After lots of experimenting I have found that in order to have fast ls
speed for that dir from every node I need to flush the Linux cache on the
original node:
echo 3 > /proc/sys/vm/drop_caches
Unmounting and remounting the CephFS on that node does the trick too.

Anyone has a clue about what's happening here? Could this be a problem with
the writeback fscache for the CephFS?

Any help appreciated! Thanks and regards. :)


# uname -r
3.10.80-1.el6.elrepo.x86_64
# ceph -v
ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
# ceph -s
cluster f9ffbbd7-186b-483a-96ea-90cdadb81f2a
 health HEALTH_OK
 monmap e1: 3 mons at {[omissis]}
election epoch 60, quorum 0,1,2 [omissis]
 mdsmap e59: 1/1/1 up {0=[omissis]=up:active}, 2 up:standby
 osdmap e146: 2 osds: 2 up, 2 in
  pgmap v122287: 256 pgs, 2 pools, 30709 MB data, 75239 objects
62432 MB used, 860 GB / 921 GB avail
 256 active+clean
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RADOS Bench

2015-06-15 Thread Garg, Pankaj
Hi,
I have a few machines in my Ceph Cluster. I have another machine that I use to 
run RADOS Bench to get the performance.
I am now seeing numbers around 1100 MB/Sec, which is quite close to saturation 
point of the 10Gbps link.

I'd like to understand what the total bandwidth number represents after I 
run the RADOS bench test. Is this the cumulative bandwidth of the Ceph cluster, 
or does it represent the bandwidth to the client machine?

I'd like to understand if I'm now being limited by my network.

Thanks
Pankaj
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RADOS Bench

2015-06-15 Thread Somnath Roy
Pankaj,
It is the cumulative BW of the ceph cluster, but you will always be limited by 
your single client's BW.
To verify whether you are limited by the single client's 10Gb network, add 
another client and see if it scales.

Thanks & Regards
Somnath

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Monday, June 15, 2015 12:55 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] RADOS Bench

Hi,
I have a few machines in my Ceph Cluster. I have another machine that I use to 
run RADOS Bench to get the performance.
I am now seeing numbers around 1100 MB/Sec, which is quite close to saturation 
point of the 10Gbps link.

I'd like to understand what does the total bandwidth number represent after I 
run the Rados bench test? Is this cumulative bandwidth of the Ceph Cluster or 
does it represent the
Bandwidth to the client machine?

I'd like to understand if I'm now being limited by my network.

Thanks
Pankaj




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RADOS Bench

2015-06-15 Thread Garg, Pankaj
Thanks Somnath. Do you mean that I should run RADOS bench in parallel on 2 
different clients?
Is there a way to run RADOS bench from 2 clients so that they run in parallel, 
other than launching them manually at the same time?

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Monday, June 15, 2015 1:01 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: RE: RADOS Bench

Pankaj,
It is the cumulative BW of ceph cluster but you will be limited by your single 
client BW always.
To verify if you are single client 10Gb network limited or not, put another 
client and see if it is scaling or not.

Thanks & Regards
Somnath

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Monday, June 15, 2015 12:55 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] RADOS Bench

Hi,
I have a few machines in my Ceph Cluster. I have another machine that I use to 
run RADOS Bench to get the performance.
I am now seeing numbers around 1100 MB/Sec, which is quite close to saturation 
point of the 10Gbps link.

I'd like to understand what does the total bandwidth number represent after I 
run the Rados bench test? Is this cumulative bandwidth of the Ceph Cluster or 
does it represent the
Bandwidth to the client machine?

I'd like to understand if I'm now being limited by my network.

Thanks
Pankaj



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fwd: Too many PGs

2015-06-15 Thread Marek Dohojda
I hate to bug, but I truly hope someone has an answer to below.

Thank you kindly!

-- Forwarded message --
From: Marek Dohojda 
Date: Wed, Jun 10, 2015 at 7:49 AM
Subject: Too many PGs
To: ceph-users-requ...@lists.ceph.com


Hello

I am running “Hammer” Ceph and I am getting the following:
health HEALTH_WARN
too many PGs per OSD (438 > max 300)

Now I realize that this is because I have too few OSDs for the number of
pools I have.  Currently I have 14 OSDs, split into 7 each for SSD and
LVM.  I created another pool for CephFS, which triggered this error.

I think that the reason I have this error is because I created the last
CephFS pool with 512 PGs, which in retrospect was a mistake.  I am
using this one strictly as a means to easily back up my libvirt XML files,
and hence do not need much in terms of redundancy.  Even if it were to
fail, I am not really worried about it.

I will be adding more OSD soonish, so this error will go away in the long
term.

For now

Is there any way to either suppress this error, or adjust down the PGs on
the CephFS Pool that I created? I tried to do:
ceph osd pool set pg_num 64

However that didn’t actually do it, perhaps it needs a restart?

I wouldn’t be opposed to deleting this pool and recreating, but when I try
that it gives me an error that there must be at least one pool in MDS.


Thank you!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RADOS Bench

2015-06-15 Thread Somnath Roy
No, you need to launch them manually... Here is my thought.

1. Say, running 4 instances of rados bench from 4 different consoles on one 
client, you are getting 1100 MB/s as you said.

2. Now, say running 4 more instances from another client with 10Gb, you are 
able to scale further.

That would mean you are limited by the single client's BW and the cluster is 
capable of giving more.
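
(A rough sketch of such a parallel run, assuming a pool named 'testpool'; the
pool and run names are placeholders:

# on client 1, e.g. four backgrounded instances
for i in 1 2 3 4; do
  rados -p testpool bench 60 write -t 16 --run-name c1-$i --no-cleanup &
done

# repeat on client 2 with run names c2-$i, then add up the reported bandwidths
)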

Thanks & Regards
Somnath

From: Garg, Pankaj [mailto:pankaj.g...@caviumnetworks.com]
Sent: Monday, June 15, 2015 1:03 PM
To: Somnath Roy; ceph-users@lists.ceph.com
Subject: RE: RADOS Bench

Thanks Somnath. Do you mean that I should run Rados Bench in parallel on 2 
different clients?
Is there a way to run Rados Bench from 2 clients, so that they run in parallel, 
except launching them together manually?

From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Monday, June 15, 2015 1:01 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: RE: RADOS Bench

Pankaj,
It is the cumulative BW of ceph cluster but you will be limited by your single 
client BW always.
To verify if you are single client 10Gb network limited or not, put another 
client and see if it is scaling or not.

Thanks & Regards
Somnath

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, 
Pankaj
Sent: Monday, June 15, 2015 12:55 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] RADOS Bench

Hi,
I have a few machines in my Ceph Cluster. I have another machine that I use to 
run RADOS Bench to get the performance.
I am now seeing numbers around 1100 MB/Sec, which is quite close to saturation 
point of the 10Gbps link.

I'd like to understand what does the total bandwidth number represent after I 
run the Rados bench test? Is this cumulative bandwidth of the Ceph Cluster or 
does it represent the
Bandwidth to the client machine?

I'd like to understand if I'm now being limited by my network.

Thanks
Pankaj



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: Too many PGs

2015-06-15 Thread Somnath Roy
If you want to suppress the warning, do this in the conf file:

mon_pg_warn_max_per_osd = 0

  or

mon_pg_warn_max_per_osd = 
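
To apply it without restarting the monitors, injecting it at runtime should also
work, e.g.:

ceph tell mon.* injectargs '--mon_pg_warn_max_per_osd 0'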


Thanks & Regards
Somnath

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Marek 
Dohojda
Sent: Monday, June 15, 2015 1:05 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Fwd: Too many PGs

I hate to bug, but I truly hope someone has an answer to below.

Thank you kindly!

-- Forwarded message --
From: Marek Dohojda <mdoho...@altitudedigital.com>
Date: Wed, Jun 10, 2015 at 7:49 AM
Subject: Too many PGs
To: ceph-users-requ...@lists.ceph.com


Hello

I am running “Hammer” Ceph and I am getting following:
health HEALTH_WARN
too many PGs per OSD (438 > max 300)

Now I realize that this is because I have too few OSD for the amount of pools I 
have.   Currently I have 14 OSD, split into 7 each for SSD and LVM.  I created 
another one for CephFS which started this error.

I think that the reason I have this error is because I created the last CephFS 
pool with 512 PGs, which in retrospect was a mistake.  I am utilizing this one 
strictly as a means to easy backup my libvirt XML files, and hence do not need 
much in terms of redundancy.  Even if this was to fail, I am not really worried 
about it.

I will be adding more OSD soonish, so this error will go away in the long term.

For now

Is there any way to either suppress this error, or adjust down the PGs on the 
CephFS Pool that I created? I tried to do:
ceph osd pool set pg_num 64

However that didn’t actually do it, perhaps it needs a restart?

I wouldn’t be opposed to deleting this pool and recreating, but when I try that 
it gives me an error that there must be at least one pool in MDS.


Thank you!





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] help to new user

2015-06-15 Thread vida ahmadi
Dear all,
I am a new ceph user and I would like to install ceph with minimum
requirements. I read in some documents that the ceph components (OSDs, MONs,
MDSs) should be put in different virtual machines. Is that a good idea for
starting out?
Please let me know about your suggestion and experience. Thank you in
advance.
-- 
Best regards,
Vida
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] help to new user

2015-06-15 Thread Shane Gibson

Vida - installing Ceph in hosted VMs is a great way to get hands-on experience 
with a Ceph cluster.  It is NOT a good way to run Ceph for any real work 
load.  NOTE that it's critical you structure your virtual disks and 
virtual network(s) to match how you'd like to run your Ceph workloads on real 
hardware.  If you don't - when you move from your "playground" Ceph cluster - 
all of your installation tooling/scripts will break badly, and you'll have a 
very hard time getting your ceph cluster up and running without significant 
debugging and reworking of the installation process...

In summary - yes - an excellent way to get familiar with Ceph.  ONE MAJOR 
CAVEAT - it's critical that your VMs have accurate / sync'd time (clock time).  
If they don't, you'll have no end of problems with your cluster not being 
"clean".  Ensure that your VMs are all successfully running NTP or peering with 
each other to keep in sync.  NOTE that a lot of VM implementations will suffer 
significant clock drift (even within just a few hours of running) ... this can 
be a pain in the behind to deal with...
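
(A quick sanity check on each VM, assuming ntpd is the time daemon in use:

ntpq -p              # peer offsets/jitter should be small
ceph health detail   # any clock-skew warning names the affected mon
)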

~~shane




On 6/15/15, 1:42 PM, "vida ahmadi" <vm.ahmad...@gmail.com> wrote:

Dear all,
I am new user in ceph and I would like to install ceph with minimum 
requirement.I read in some documents to put ceph components(OSDs,MON,MDSs) in 
different virtual machine. Is it good idea for starting?
Please let me know about your suggestion and experience. Thank you in advance.
--
Best regards,
Vida
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS client issue

2015-06-15 Thread Matteo Dacrema
With the 3.16.3 kernel it seems to be stable, but I've discovered one new issue.

If I take down one of the two osd nodes, all the clients stop responding.


Here is the output of ceph -s

ceph -s
cluster 2de7b17f-0a3e-4109-b878-c035dd2f7735
 health HEALTH_WARN
256 pgs degraded
127 pgs stuck inactive
127 pgs stuck unclean
256 pgs undersized
recovery 1457662/2915324 objects degraded (50.000%)
4/8 in osds are down
clock skew detected on mon.cephmds01, mon.ceph-mon1
 monmap e5: 3 mons at 
{ceph-mon1=10.29.81.184:6789/0,cephmds01=10.29.81.161:6789/0,cephmds02=10.29.81.160:6789/0}
election epoch 64, quorum 0,1,2 cephmds02,cephmds01,ceph-mon1
 mdsmap e176: 1/1/1 up {0=cephmds01=up:active}, 1 up:standby
 osdmap e712: 8 osds: 4 up, 8 in
  pgmap v420651: 256 pgs, 2 pools, 133 GB data, 1423 kobjects
289 GB used, 341 GB / 631 GB avail
1457662/2915324 objects degraded (50.000%)
 256 undersized+degraded+peered
  client io 86991 B/s wr, 0 op/s


When I bring the node back up, all clients resume working.

Thanks,
Matteo




From: ceph-users  on behalf of Matteo Dacrema 

Sent: Monday, 15 June 2015 12:37
To: John Spray; Lincoln Bryant; ceph-users
Subject: Re: [ceph-users] CephFS client issue


Ok, I'll update kernel to 3.16.3 version and let you know.


Thanks,

Matteo


From: John Spray 
Sent: Monday, 15 June 2015 10:51
To: Matteo Dacrema; Lincoln Bryant; ceph-users
Subject: Re: [ceph-users] CephFS client issue



On 14/06/15 20:00, Matteo Dacrema wrote:

Hi Lincoln,


I'm using the kernel client.

Kernel version is: 3.13.0-53-generic

That's old by CephFS standards.  It's likely that the issue you're seeing is 
one of the known bugs (which were actually the motivation for adding the 
warning message you're seeing).

John

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CephFS client issue

2015-06-15 Thread Christian Balzer

Hello,

On Mon, 15 Jun 2015 23:11:07 + Matteo Dacrema wrote:

> With 3.16.3 kernel it seems to be stable but I've discovered one new
> issue.
> 
> If I take down one of the two osd node all the client stop to respond.
> 
How did you take the node down?

What is your "osd_pool_default_min_size"?

Penultimately, you wouldn't deploy a cluster with just 2 storage nodes in
production anyway.

Christian
> 
> Here the output of ceph -s
> 
> ceph -s
> cluster 2de7b17f-0a3e-4109-b878-c035dd2f7735
>  health HEALTH_WARN
> 256 pgs degraded
> 127 pgs stuck inactive
> 127 pgs stuck unclean
> 256 pgs undersized
> recovery 1457662/2915324 objects degraded (50.000%)
> 4/8 in osds are down
> clock skew detected on mon.cephmds01, mon.ceph-mon1
>  monmap e5: 3 mons at
> {ceph-mon1=10.29.81.184:6789/0,cephmds01=10.29.81.161:6789/0,cephmds02=10.29.81.160:6789/0}
> election epoch 64, quorum 0,1,2 cephmds02,cephmds01,ceph-mon1 mdsmap
> e176: 1/1/1 up {0=cephmds01=up:active}, 1 up:standby osdmap e712: 8
> osds: 4 up, 8 in pgmap v420651: 256 pgs, 2 pools, 133 GB data, 1423
> kobjects 289 GB used, 341 GB / 631 GB avail
> 1457662/2915324 objects degraded (50.000%)
>  256 undersized+degraded+peered
>   client io 86991 B/s wr, 0 op/s
> 
> 
> When I take UP the node all clients resume to work.
> 
> Thanks,
> Matteo
> 
> 
> 
> 
> From: ceph-users  on behalf of Matteo
> Dacrema  Sent: Monday, 15 June 2015 12:37
> To: John Spray; Lincoln Bryant; ceph-users
> Subject: Re: [ceph-users] CephFS client issue
> 
> 
> Ok, I'll update kernel to 3.16.3 version and let you know.
> 
> 
> Thanks,
> 
> Matteo
> 
> 
> From: John Spray 
> Sent: Monday, 15 June 2015 10:51
> To: Matteo Dacrema; Lincoln Bryant; ceph-users
> Subject: Re: [ceph-users] CephFS client issue
> 
> 
> 
> On 14/06/15 20:00, Matteo Dacrema wrote:
> 
> Hi Lincoln,
> 
> 
> I'm using the kernel client.
> 
> Kernel version is: 3.13.0-53-generic
> 
> That's old by CephFS standards.  It's likely that the issue you're
> seeing is one of the known bugs (which were actually the motivation for
> adding the warning message you're seeing).
> 
> John
> 


-- 
Christian BalzerNetwork/Systems Engineer
ch...@gol.com   Global OnLine Japan/Fusion Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] anyone using CephFS for HPC?

2015-06-15 Thread Shinobu Kinjo
Thanks for your info.
I would like to know how large the I/O you mentioned is, and what kind of app
you used to do the benchmarking?

Sincerely,
Kinjo

On Tue, Jun 16, 2015 at 12:04 AM, Barclay Jameson 
wrote:

> I am currently implementing Ceph into our HPC environment to handle
> SAS temp workspace.
> I am starting out with 3 OSD nodes with 1 MON/MDS node.
> 16 4TB HDDs per OSD node with 4 120GB SSD.
> Each node has 40Gb Mellanox interconnect between each other to a
> Mellanox switch.
> Each client node has 10Gb to switch.
>
> I have not done comparisons to Lustre but I have done comparisons to
> PanFS which we currently use in production.
> I have found that most workflows Ceph is comparibale to PanFS if not
> better; however, PanFS still does better with small IO due to how it
> caches small files.
> If you want I can give you some hard numbers.
>
> almightybeeij
>
> On Fri, Jun 12, 2015 at 12:31 AM, Nigel Williams
>  wrote:
> > Wondering if anyone has done comparisons between CephFS and other
> > parallel filesystems like Lustre typically used in HPC deployments
> > either for scratch storage or persistent storage to support HPC
> > workflows?
> >
> > thanks.
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



-- 
Life w/ Linux 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph OSD with OCFS2

2015-06-15 Thread gjprabu
Hi Somnath,
 Is there any fine tuning for the below issues?

<< Also please let us know the reason ( Extra 2-3 mins is taken for hg / 
git repository operation like clone , pull , checkout and update.)
 << Could you please explain a bit what you are trying to do here ?

    In the ceph shared directory, we clone the source repository and then access 
it from the ceph client.


Regards
Prabu

 On Mon, 15 Jun 2015 17:16:11 +0530 gjprabu  
wrote  

Hi  

The size differ issue is solved, This is related to ocfs2 format option and 
-C count should be 4K.
(mkfs.ocfs2 /dev/mapper/mpatha -N 64 -b 4K -C 256K -T mail 
--fs-features=extended-slotmap --fs-feature-level=max-features -L ) 

Need to change like below.
(mkfs.ocfs2 /dev/mapper/mpatha -b4K -C 4K -L label -T mail -N 2 /dev/sdX

 << Also please let us know the reason ( Extra 2-3 mins is taken for hg / 
git repository operation like clone , pull , checkout and update.)
 << Could you please explain a bit what you are trying to do here ?

In ceph shared directory , we will clone source repository then will access 
the same from ceph client .


   
Regards
Prabu


 On Fri, 12 Jun 2015 21:12:03 +0530 Somnath Roy 
 wrote  

  Sorry, it was a typo , I meant to say 1GB only.
 I would say break the problem like the following.
  
 1. Run some fio workload say (1G) on RBD and run ceph command like ‘ceph df’ 
to see how much data it written. I am sure you will be seeing same data. 
Remember by default ceph rados object size is 4MB, so, it should write 1GB/4MB 
number of objects.
  
 2. Also, you can use ‘rados’ utility to directly put/get say 1GB file to the 
cluster and check similar way.
  
 As I said, if your journal in the same device and if you measure the space 
consumed by entire OSD mount point , it will be more because of WA induced by 
Ceph. But, individual file size you transferred should not differ.
  
 << Also please let us know the reason ( Extra 2-3 mins is taken for hg / 
git repository operation like clone , pull , checkout and update.)
 Could you please explain a bit what you are trying to do here ?
  
 Thanks & Regards
 Somnath
  
   From: gjprabu [mailto:gjpr...@zohocorp.com] 
 Sent: Friday, June 12, 2015 12:34 AM
 To: Somnath Roy
 Cc: ceph-users@lists.ceph.com; Kamala Subramani; Siva Sokkumuthu
 Subject: Re: RE: RE: [ceph-users] Ceph OSD with OCFS2
 
 
  
  Hi,
 
   I measured the data only what i transfered from client. Example 500MB 
file transfered after complete if i measured the same file size will be 1GB not 
10GB. 
 
Our Configuration is :-
 
=
 ceph -w
 cluster f428f5d6-7323-4254-9f66-56a21b099c1a
 health HEALTH_OK
 monmap e1: 3 mons at 
{cephadmin=172.20.19.235:6789/0,cephnode1=172.20.7.168:6789/0,cephnode2=172.20.9.41:6789/0},
 election epoch 114, quorum 0,1,2 cephnode1,cephnode2,cephadmin
 osdmap e9: 2 osds: 2 up, 2 in
 pgmap v1022: 64 pgs, 1 pools, 7507 MB data, 1952 objects
 26139 MB used, 277 GB / 302 GB avail
 64 active+clean
 
===
 ceph.conf
 [global]
 osd pool default size = 2
 auth_service_required = cephx
 filestore_xattr_use_omap = true
 auth_client_required = cephx
 auth_cluster_required = cephx
 mon_host = 172.20.7.168,172.20.9.41,172.20.19.235
 mon_initial_members = zoho-cephnode1, zoho-cephnode2, zoho-cephadmin
 fsid = f428f5d6-7323-4254-9f66-56a21b099c1a
 

 
 What is the replication policy you are using ?
  
We are using default OSD with 2 replica not using CRUSH Map, PG num and 
Erasure etc., 
 
 What interface you used to store the data ?
 
We are using RBD to store data and its has been mounted with OCFS2 in 
client side.
 
 How are you removing data ? Are you removing a rbd image ?
  
We are not removing rbd image, only removing data which is already 
having and removing using rm command from client. We didn't set async way to 
transfer or remove data
 
 
 Also please let us know the reason ( Extra 2-3 mins is taken for hg / git 
repository operation like clone , pull , checkout and update.)
 
 
 
   Regards
 
 Prabu GJ
   
 
 
  
   
  On Fri, 12 Jun 2015 00:21:24 +0530 Somnath Roy 
 wrote  
 
Hi,
  
 Ceph journal works in different way.  It’s a write ahead journal, all the data 
will be persisted first in journal and then will be written to actual place. 
Journal data is encoded. Journal is a fixed size partition/file and written 
sequentially. So, if you are placing journal in HDDs, it will be overwritten, 
for SSD case , it will be GC later. So, if you are measuring amount of data 
written to the device it will be double. But, if you are saying you have 

Re: [ceph-users] Ceph OSD with OCFS2

2015-06-15 Thread Somnath Roy
Prabu,
I am still not clear:
you are cloning a git source repository on top of RBD + OCFS2, and that is 
taking extra time?

Thanks & Regards
Somnath

From: gjprabu [mailto:gjpr...@zohocorp.com]
Sent: Monday, June 15, 2015 9:39 PM
To: gjprabu
Cc: Somnath Roy; Kamala Subramani; ceph-users@lists.ceph.com; Siva 
Sokkumuthu
Subject: Re: Re: [ceph-users] Ceph OSD with OCFS2


Hi Somnath,

 Is there any fine tune for the blow issues.

<< Also please let us know the reason ( Extra 2-3 mins is taken for hg / git 
repository operation like clone , pull , checkout and update.)

<< Could you please explain a bit what you are trying to do here ?

In ceph shared directory , we will clone source repository then will access 
the same from ceph client .

Regards
Prabu

 On Mon, 15 Jun 2015 17:16:11 +0530 gjprabu <gjpr...@zohocorp.com> wrote 
Hi

The size differ issue is solved, This is related to ocfs2 format option and 
-C count should be 4K.
(mkfs.ocfs2 /dev/mapper/mpatha -N 64 -b 4K -C 256K -T mail 
--fs-features=extended-slotmap --fs-feature-level=max-features -L )

Need to change like below.
(mkfs.ocfs2 /dev/mapper/mpatha -b4K -C 4K -L label -T mail -N 2 /dev/sdX

<< Also please let us know the reason ( Extra 2-3 mins is taken for hg / git 
repository operation like clone , pull , checkout and update.)

<< Could you please explain a bit what you are trying to do here ?

In ceph shared directory , we will clone source repository then will access 
the same from ceph client .



Regards
Prabu

 On Fri, 12 Jun 2015 21:12:03 +0530 Somnath Roy <somnath@sandisk.com> wrote 

Sorry, it was a typo , I meant to say 1GB only.

I would say break the problem like the following.



1. Run some fio workload say (1G) on RBD and run ceph command like ‘ceph df’ to 
see how much data it written. I am sure you will be seeing same data. Remember 
by default ceph rados object size is 4MB, so, it should write 1GB/4MB number of 
objects.



2. Also, you can use ‘rados’ utility to directly put/get say 1GB file to the 
cluster and check similar way.



As I said, if your journal in the same device and if you measure the space 
consumed by entire OSD mount point , it will be more because of WA induced by 
Ceph. But, individual file size you transferred should not differ.



<< Also please let us know the reason ( Extra 2-3 mins is taken for hg / git 
repository operation like clone , pull , checkout and update.)

Could you please explain a bit what you are trying to do here ?



Thanks & Regards

Somnath



From: gjprabu [mailto:gjpr...@zohocorp.com]
Sent: Friday, June 12, 2015 12:34 AM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com; Kamala 
Subramani; Siva Sokkumuthu
Subject: Re: RE: RE: [ceph-users] Ceph OSD with OCFS2



Hi,

  I measured the data only what i transfered from client. Example 500MB 
file transfered after complete if i measured the same file size will be 1GB not 
10GB.

   Our Configuration is :-
=
ceph -w
cluster f428f5d6-7323-4254-9f66-56a21b099c1a
health HEALTH_OK
monmap e1: 3 mons at 
{cephadmin=172.20.19.235:6789/0,cephnode1=172.20.7.168:6789/0,cephnode2=172.20.9.41:6789/0},
 election epoch 114, quorum 0,1,2 cephnode1,cephnode2,cephadmin
osdmap e9: 2 osds: 2 up, 2 in
pgmap v1022: 64 pgs, 1 pools, 7507 MB data, 1952 objects
26139 MB used, 277 GB / 302 GB avail
64 active+clean
===
ceph.conf
[global]
osd pool default size = 2
auth_service_required = cephx
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 172.20.7.168,172.20.9.41,172.20.19.235
mon_initial_members = zoho-cephnode1, zoho-cephnode2, zoho-cephadmin
fsid = f428f5d6-7323-4254-9f66-56a21b099c1a


What is the replication policy you are using ?

   We are using default OSD with 2 replica not using CRUSH Map, PG num and 
Erasure etc.,

What interface you used to store the data ?

   We are using RBD to store data and its has been mounted with OCFS2 in 
client side.

How are you removing data ? Are you removing a rbd image ?

   We are not removing rbd image, only removing data which is already 
having and removing using rm command from client. We didn't set async way to 
transfer or remove data


Also please let us know the reason ( Extra 2-3 mins is taken for hg / git 
repository operation like clone , pull , checkout and update.)


Regards

Prabu GJ





 On Fri, 12 Jun 2015 00:21:24 +0530 Somnath Roy <somnath@sandisk.com> wrote 

Hi,



Ceph journal works in different way.  It’s a write ahead journal, all the data 
will be persisted first in journal and then will