Hi Sage,
Can anyone confirm whether there is still a "bug" in the RPMs that does an
automatic Ceph service restart after updating packages?
We are instructed to first update/restart the MONs, and only after that the OSDs - but
that is impossible if we have MON+OSDs on the same host... since Ceph is
automatically restar
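A rough way to check whether the packaging itself restarts the daemons, and to at least avoid rebalancing while a mixed MON+OSD host is being upgraded (the grep pattern is only illustrative), would be:
# rpm -q --scripts ceph | grep -i -B2 -A2 restart
# ceph osd set noout
(update the packages, check the daemon versions, then)
# ceph osd unset noout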
The result is the same.
# ceph-fuse --debug-ms 1 --debug-client 10 -m 192.168.122.106:6789 /mnt
ceph-fuse[3296] : starting ceph client
And the log file is
# cat /var/log/ceph/ceph-client.admin.log
2014-07-16 17:08:13.146032 7f9a212f87c0 0 ceph version 0.80.1
(a38fe1169b6d2ac98b427334c12d7cf81f809b
Can you offer some comments on what the impact is likely to be to the data in
an affected cluster? Should all data now be treated with suspicion and restored
back to before the firefly upgrade?
James
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On B
Hi!
I'm trying to install Ceph on Debian wheezy (from "deb http://ceph.com/debian/
wheezy main") and I'm getting the following error:
# apt-get update && apt-get dist-upgrade -y && apt-get install -y ceph
...
The following packages have unmet dependencies:
ceph : Depends: ceph-common (>= 0.78-500) but
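One way to narrow this down (assuming the ceph.com wheezy repository quoted above) is to check which candidate versions apt sees and to install ceph and ceph-common in the same transaction:
# apt-cache policy ceph ceph-common
# apt-get install -y ceph ceph-common
If ceph-common resolves to the older Debian archive version, pinning the ceph.com repository higher may help.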
On Wed, Jul 16, 2014 at 10:50 AM, James Harper wrote:
> Can you offer some comments on what the impact is likely to be to the data in
> an affected cluster? Should all data now be treated with suspicion and
> restored back to before the firefly upgrade?
Yes, I'd definitely like to know that too
> Hi!
>
> I'm trying to install ceph on Debian wheezy (from deb
> http://ceph.com/debian/ wheezy main) and getting following error:
>
> # apt-get update && apt-get dist-upgrade -y && apt-get install -y ceph
>
> ...
>
> The following packages have unmet dependencies:
> ceph : Depends: ceph-comm
Hi,
Is it possible to set the owner of a bucket or an object to someone else?
I've got a user who was created with the --system flag and is able to
create buckets and objects.
I've created a bucket using boto and I have FULL_CONTROL over it:
http://acs.amazonaws.com/groups/global/AllUsers = READ, M.
Tes
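If the goal is simply to hand an existing bucket over to another radosgw user, one hedged approach (the bucket name and uids below are placeholders, and exact flags vary a little between releases) is to relink the bucket:
# radosgw-admin bucket unlink --bucket=mybucket --uid=olduser
# radosgw-admin bucket link --bucket=mybucket --uid=newuser
This only changes which user the bucket is linked to; ACLs on existing objects may still need to be adjusted separately.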
Your MDS isn't running or isn't active.
-Greg
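A quick way to confirm that before retrying the mount is to check the MDS map state:
# ceph mds stat
# ceph -s
ceph-fuse will typically sit at "starting ceph client" until an MDS becomes active.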
On Wednesday, July 16, 2014, Jaemyoun Lee wrote:
>
> The result is same.
>
> # ceph-fuse --debug-ms 1 --debug-client 10 -m 192.168.122.106:6789 /mnt
> ceph-fuse[3296] : starting ceph client
>
> And the log file is
>
> # cat /var/log/ceph/ceph-client
Hi Shubhendu,
ceph-deploy is not part of Calamari, the ceph-users list is a better
place to get help with that. I have CC'd the list here.
It will help if you can specify the series of ceph-deploy commands you
ran before the failing one as well.
Thanks,
John
On Wed, Jul 16, 2014 at 9:29 AM, Sh
Hi Sage, Andrija & List,
I have seen the tunables issue on our cluster when I upgraded to firefly.
I ended up going back to the legacy settings after about an hour, as my cluster
consists of 55 3TB OSDs over 5 nodes and it decided it needed to move around 32% of our
data, which after an hour all of our v
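For reference, switching between the profiles is a single command in firefly (going back to optimal later will re-trigger the data movement):
# ceph osd crush tunables legacy
# ceph osd crush tunables optimal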
Quenten,
We've got two monitors sitting on the osd servers and one on a different
server.
Andrei
--
Andrei Mikhailovsky
Director
Arhont Information Security
Web: http://www.arhont.com
http://www.wi-foo.com
Tel: +44 (0)870 4431337
Fax: +44 (0)208 429 3111
PGP: Key ID - 0x2B3438DE
PG
For me, 3 nodes, 1 MON + 2x2TB OSDs on each node... no MDS used...
I went through the pain of waiting for data rebalancing and now I'm on
"optimal" tunables...
Cheers
On 16 July 2014 14:29, Andrei Mikhailovsky wrote:
> Quenten,
>
> We've got two monitors sitting on the osd servers and one on a differ
With 34 x 4TB OSDs over 4 hosts, I had 30% of objects moved - the cluster is about
half full and it took around 12 hours. Except now I can't use the kclient any more -
I wish I'd read that first.
On 16 July 2014 13:36, Andrija Panic wrote:
> For me, 3 nodes, 1MON+ 2x2TB OSDs on each node... no mds used...
> I went t
On Friday I managed to run a command I probably shouldn't have and knocked half our
OSDs offline. By setting the noout and nodown flags and bringing up the OSDs on
the boxes that don't also have mons running on them, I got most of the cluster
back up by today (it took me a while to discover the nodown f
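For anyone following along, those flags are set and cleared like this; nodown should stay set only as long as strictly needed, since it also hides genuinely dead OSDs:
# ceph osd set noout
# ceph osd set nodown
# ceph osd unset nodown
# ceph osd unset noout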
Hi,
After the repair process, I have:
1926 active+clean
   2 active+clean+inconsistent
These two PGs seem to be on the same OSD (#34):
# ceph pg dump | grep inconsistent
dumped all in format plain
0.2e4 0 0 0 8388660 4 4
active+clean+inconsistent 2014-0
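A hedged way to dig further into those two PGs (the pg id is the one from the dump above) is to list them via health detail and, once the OSD logs show which copy is bad, trigger a repair:
# ceph health detail | grep inconsistent
# ceph pg repair 0.2e4
Keep in mind that repair generally trusts the primary copy, so it is worth confirming which replica is actually corrupt first.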
Resending my earlier message.
On Tuesday, July 15, 2014 10:58 PM, lakshmi k s wrote:
Hello Ceph Users -
My Ceph setup consists of 1 admin
node, 3 OSDs, 1 radosgw and 1 client. One of the OSD nodes also hosts a monitor.
Ceph health is OK and I have verified the radosgw runtime. I have create
You may try to debug your issue by using curl requests.
If you use your Swift credentials, a request of this format should give you
a 20X return code (probably 204):
curl -v -i http://<gateway-host>/auth -X GET -H "X-Auth-User:
testuser:swiftuser" -H "X-Auth-Key:
ksYDp8dul80Ta1PeDkFFyLem1FlrtvnyzYiaqvh8"
If t
Hi Andrija,
I'm running a cluster with both CentOS and Ubuntu machines in it. I just
did some upgrades to 0.80.4, and I can confirm that doing "yum update ceph"
on the CentOS machine did result in having all OSDs on that machine
restarted automatically. I actually did not know that would happen,
Thanks for the response. Curl yields the following -
ceph-gateway@ceph-gateway:~$ curl -v -i http://ceph-gateway/auth -X GET -H
"X-Auth-User:ganapati:swift" -H
"X-Auth-Key:GIn60fmdvnEh5tSiRziixcO5wVxZjg9eoYmtX3hJ"
Hostname was NOT found in DNS cache
Trying 127.0.1.1...
Connected to ceph-gatewa
This now appears to have partially fixed itself. I am now able to run commands
on the cluster, though one of the monitors is down. I still have no idea what
was going on.
George
From: george.ry...@stfc.ac.uk [mailto:george.ry...@stfc.ac.uk]
Sent: 16 July 2014 13:59
To: ceph-users@lists.ceph.co
Hello,
I am new to Ceph; the group I'm working in is currently evaluating it
for our new large-scale storage.
How scalable is the S3-like object storage server RADOSgw? What kind
of architecture should one use to ensure we can scale RADOSgw out if
the need arises? (We plan to host a few PBs on
Hello,
I am new to Ceph; the group I'm working in is currently evaluating it
for our new large-scale storage.
Is there any recommendation for the OSD journals? E.g., does it make
sense to keep them on SSDs? Would it make sense to host the journal
on a RAID-1 array for added safety? (IOW: what h
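For reference, pointing a filestore OSD journal at a separate device is just a ceph.conf setting; a minimal sketch, where the OSD id and partition label are placeholders:
[osd.12]
osd journal = /dev/disk/by-partlabel/journal-12
Losing a dedicated journal device generally means losing the OSD, which is the usual argument for either RAID-1 journals or simply accepting the loss and re-provisioning the OSD.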
> On 16 Jul 2014, at 16:54, "Riccardo Murri" wrote the
> following:
>
> Hello,
>
> I am new to Ceph; the group I'm working in is currently evaluating it
> for our new large-scale storage.
>
> How scalable is the S3-like object storage server RADOSgw? What kind
> of architecture s
Maybe some of the user data is not correct...
If you try
radosgw-admin user info --uid=ganapati
is the subuser there?
The key that you must use should be under "swift_keys".
Otherwise, be sure that the user is created with
radosgw-admin key create --subuser=username:subusername --key-type=
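For completeness, a typical Swift subuser setup looks roughly like this (the uid and subuser names follow the ones in this thread; flags may differ slightly by version):
# radosgw-admin subuser create --uid=ganapati --subuser=ganapati:swift --access=full
# radosgw-admin key create --subuser=ganapati:swift --key-type=swift --gen-secret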
> On 16 Jul 2014, at 16:58, "Riccardo Murri" wrote the
> following:
>
> Hello,
>
> I am new to Ceph; the group I'm working in is currently evaluating it
> for our new large-scale storage.
>
> Is there any recommendation for the OSD journals? E.g., does it make
> sense to keep the
Below is the output of radosgw-admin user info. Am I missing something here?
I appreciate your help.
ceph-gateway@ceph-gateway:~$ radosgw-admin user info --uid=ganapati
{ "user_id": "ganapati",
"display_name": "I",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"auid": 0,
"subusers":
On 07/16/2014 09:58 AM, Riccardo Murri wrote:
Hello,
I am new to Ceph; the group I'm working in is currently evaluating it
for our new large-scale storage.
Is there any recommendation for the OSD journals? E.g., does it make
sense to keep them on SSDs? Would it make sense to host the journal
There is a log kept in ceph.log of every ceph-deploy command.
On Jul 16, 2014 5:21 AM, "John Spray" wrote:
> Hi Shubhendu,
>
> ceph-deploy is not part of Calamari, the ceph-users list is a better
> place to get help with that. I have CC'd the list here.
>
> It will help if you can specify the se
I wanted to update ceph-fuse to a new version and I would like it to be
seamless.
I thought I could do a remount to pick up the new version, but it failed.
Here is the error I got.
# mount /mnt/ceph/ -o remount
2014-07-16 09:08:57.690464 7f669be1a760 -1 asok(0x1285eb0)
AdminSocketConfigOb
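Since a new ceph-fuse binary is only picked up by a fresh client process, the non-seamless fallback is a full unmount and remount, roughly:
# fusermount -u /mnt/ceph
# ceph-fuse /mnt/ceph
(ceph-fuse reads the monitor addresses from /etc/ceph/ceph.conf when -m is not given.)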
On Wed, 16 Jul 2014, Travis Rhoden wrote:
> Hi Andrija,
>
> I'm running a cluster with both CentOS and Ubuntu machines in it. I just
> did some upgrades to 0.80.4, and I can confirm that doing "yum update ceph"
> on the CentOS machine did result in having all OSDs on that machine
> restarted auto
On Wed, Jul 16, 2014 at 1:50 AM, James Harper wrote:
> Can you offer some comments on what the impact is likely to be to the data in
> an affected cluster? Should all data now be treated with suspicion and
> restored back to before the firefly upgrade?
I am under the impression that it's not ac
On Wed, Jul 16, 2014 at 9:20 AM, Scottix wrote:
> I wanted to update ceph-fuse to a new version and I would like to have
> it seamless.
> I thought I could do a remount to update the running version but came to a
> fail.
> Here is the error I got.
>
> # mount /mnt/ceph/ -o remount
> 2014-07-16 09
On Wed, Jul 16, 2014 at 6:21 AM, Pierre BLONDEAU
wrote:
> Hi,
>
> After the repair process, i have :
> 1926 active+clean
>2 active+clean+inconsistent
>
> This two PGs seem to be on the same osd ( #34 ):
> # ceph pg dump | grep inconsistent
> dumped all in format plain
> 0.2e4 0
I've got 2 Dell systems with PERC H710 RAID cards. Those are very good high-end
cards, but they do not support JBOD.
They support RAID 0, 1, 5, 6, 10, 50, and 60.
lspci shows them as: LSI Logic / Symbios Logic MegaRAID SAS 2208
[Thunderbolt] (rev 05)
The firmware Dell uses on the card does not support JBOD.
In my test cluster in systems with similar RAID cards, I create single-disk
RAID-0 volumes.
That does the trick.
Paul
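As a sketch of what that looks like with the LSI tools (the enclosure:slot pair and cache flags below are placeholders; list the physical drives first to get the real IDs):
# /opt/MegaRAID/MegaCli/MegaCli64 -PDList -a0 | grep -E 'Enclosure Device ID|Slot Number'
# /opt/MegaRAID/MegaCli/MegaCli64 -CfgLdAdd -r0 [32:4] WB RA Direct -a0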
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Robert
Fantini
Sent: Wednesday, July 16, 2014 1:55 PM
To: ceph-users@lists.ceph.com
Subject: [cep
Robert,
We use those cards here in our Dell R-720 servers.
We just ended up creating a bunch of single disk RAID-0 units, since
there was no jbod option available.
Shain
On 07/16/2014 04:55 PM, Robert Fantini wrote:
I've 2 dell systems with PERC H710 raid cards. Those are very good end
card
On Thu, Jul 17, 2014 at 12:55 AM, Robert Fantini
wrote:
> I've 2 dell systems with PERC H710 raid cards. Those are very good end cards
> , but do not support jbod .
>
> They support raid 0, 1, 5, 6, 10, 50, 60 .
>
> lspci shows them as: LSI Logic / Symbios Logic MegaRAID SAS 2208
> [Thunderbolt]
Thank you very much for the responses!
On Wed, Jul 16, 2014 at 5:02 PM, Paul Santos wrote:
> In my test cluster in systems with similar RAID cards, I create
> single-disk RAID-0 volumes.
>
> That does the trick.
>
>
>
> Paul
>
>
>
> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com]
I've got a bit of good news and bad news about the state of landing
the rbd-ephemeral-clone patch series for Nova in Juno.
The good news is that the first patch in the series
(https://review.openstack.org/91722, fixing a data-loss-inducing bug
with live migrations of instances with RBD-backed ephem
On 16/07/2014 22:40, Gregory Farnum wrote:
On Wed, Jul 16, 2014 at 6:21 AM, Pierre BLONDEAU
wrote:
Hi,
After the repair process, i have :
1926 active+clean
2 active+clean+inconsistent
This two PGs seem to be on the same osd ( #34 ):
# ceph pg dump | grep inconsistent
dumped all in form
The good SSDs will report how much of their estimated life has been used.
It's not in the SMART spec though, so different manufacturers do it
differently (or not at all).
I've got Intel DC S3700s, and the SMART attributes include:
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age
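A hedged way to keep an eye on this is to poll the attribute with smartctl; the attribute name differs per vendor (Media_Wearout_Indicator on Intel, Wear_Leveling_Count or similar elsewhere):
# smartctl -A /dev/sda | grep -i -E 'wearout|wear_leveling|percent_lifetime'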
One of the things I've learned is that many small changes to the cluster
are better than one large change. Adding 20% more OSDs? Don't add them
all at once, trickle them in over time. Increasing pg_num & pgp_num from
128 to 1024? Go in steps, not one leap.
I try to avoid operations that will t
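As a concrete sketch of the stepped approach (the pool name and numbers are only examples), bump pg_num and pgp_num in stages and let the cluster settle in between:
# ceph osd pool set rbd pg_num 256
# ceph osd pool set rbd pgp_num 256
(wait for HEALTH_OK, then repeat with 512, then 1024)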
On Wed, Jul 16, 2014 at 4:45 PM, Craig Lewis wrote:
> One of the things I've learned is that many small changes to the cluster are
> better than one large change. Adding 20% more OSDs? Don't add them all at
> once, trickle them in over time. Increasing pg_num & pgp_num from 128 to
> 1024? Go i
On Wed, 16 Jul 2014, Gregory Farnum wrote:
> On Wed, Jul 16, 2014 at 4:45 PM, Craig Lewis
> wrote:
> > One of the things I've learned is that many small changes to the cluster are
> > better than one large change. Adding 20% more OSDs? Don't add them all at
> > once, trickle them in over time.
Thanks, that's worth a try. Half as bad might make all the difference.
I have the luxury of a federated setup, and I can test on the secondary
cluster fairly safely. If the change doesn't cause replication timeouts,
it's probably ok to deploy on the primary.
I'll go to CRUSH_TUNABLES2 manually
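Doing it manually means editing the tunable lines in a decompiled CRUSH map and injecting it back, roughly:
# ceph osd getcrushmap -o crush.bin
# crushtool -d crush.bin -o crush.txt
(edit the "tunable ..." lines at the top of crush.txt)
# crushtool -c crush.txt -o crush.new
# ceph osd setcrushmap -i crush.new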
Hi all,
Does this issue hit the deep-scrub error 100% of the time for RBD users?
I mean, is it possible that it only occurs sometimes?
I use firefly v0.80.1 with kernel 3.8.0-37 and have not seen any deep-scrub
errors.
Do I still need to upgrade to 0.80.4, or does not seeing it mean I am not affected?
Thanks!!
2014-07-17 1:19 GMT+08:00
Hi Sage & List,
I understand this is probably a hard question to answer.
I mentioned previously that our cluster has the MONs co-located on the OSD servers,
which are R515s w/ 1 x AMD 6-core processor & 11 3TB OSDs w/ dual 10GbE.
When our cluster is doing these busy operations and IO has stopped, as in my
Hi Sage,
I am facing the same problem.
ls -l /var/log/ceph/
total 54280
-rw-r--r-- 1 root root        0 Jul 17 06:39 ceph-osd.0.log
-rw-r--r-- 1 root root 19603037 Jul 16 19:01 ceph-osd.0.log.1.gz
-rw-r--r-- 1 root root        0 Jul 17 06:39 ceph-osd.1.log
-rw-r--r-- 1 root root 18008247 Jul 16 19
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Hi,
What do you recommend in case of a disk failure in this kind of
configuration? Do you bring down the host when you replace the
disk and re-create the RAID-0 for the replaced disk? I reckon that
Linux doesn't automatically get the disk replacem
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Hi Dmitry,
I've been using Ubuntu 14.04 LTS + Icehouse w/ Ceph as a storage
backend for glance, cinder and nova (kvm/libvirt). I *really* would
love to see this patch series land in Juno. It's been a real performance
issue because of the unnecessary re-copy
- Message from george.ry...@stfc.ac.uk -
Date: Wed, 16 Jul 2014 14:45:35 +
From: george.ry...@stfc.ac.uk
Subject: Re: [ceph-users] Failed monitors
To: ceph-users@lists.ceph.com
This now appears to have partially fixed itself. I am now able to
run commands on the clu
> The good SSDs will report how much of their estimated life has been used.
> It's not in the SMART spec though, so different manufacturers do it
> differently (or not at all).
> I'm planning to monitor those value, and replace the SSD when "gets old".
> I don't know exactly what that means yet,
Hi Dmitry,
Will you please share with us how things went on the meeting?
Many thanks,
Benjamin
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Dmitry Borodaenko
> Sent: Wednesday, July 16, 2014 11:18 PM
> To: ceph-users@lists.ceph.com
>
On Thu, 17 Jul 2014 06:38:42 + Robert van Leeuwen wrote:
>
> > The good SSDs will report how much of their estimated life has been
> > used. It's not in the SMART spec though, so different manufacturers do
> > it differently (or not at all).
>
> > I'm planning to monitor those value, and rep