Hello,
Can you explain how you do this procedure? I have the same problem with the
large images and snapshots.
This is what I do:
# qemu-img convert -f qcow2 -O raw image.qcow2 image.img
# openstack image create --disk-format raw --container-format bare --file image.img image
But the image.img is too large.
Thanks,
Fran.
2016-07-13 8:29 GMT+02:00
Hi Fran,
Fortunately, qemu-img(1) is able to directly utilise RBD (supporting
sparse block devices)!
Please refer to http://docs.ceph.com/docs/hammer/rbd/qemu-rbd/ for examples.
Cheers,
Kees
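For reference, a minimal sketch of what the linked documentation describes: converting the QCOW2 image straight into an RBD image, so no huge raw file is ever written locally. The pool name "images" and image name "my-image" are placeholders:

qemu-img convert -f qcow2 -O raw image.qcow2 rbd:images/my-image

qemu-img only writes allocated data when the target is RBD, so the image stays sparse in the Ceph pool.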
On 13-07-16 09:18, Fran Barrera wrote:
> Can you explain how you do this procedure? I have the same prob
Yes, but it is the same problem, isn't it? The image will be too large because the
format is raw.
Thanks.
2016-07-13 9:24 GMT+02:00 Kees Meijs :
> Hi Fran,
>
> Fortunately, qemu-img(1) is able to directly utilise RBD (supporting
> sparse block devices)!
>
> Please refer to http://docs.ceph.com/docs/ham
Hi,
If qemu-img is able to handle RBD in a clever way (and I assume it does), it is
able to write the image sparsely to the Ceph pool.
But, it is an assumption! Maybe someone else could shed some light on this?
Or even better: read the source, the RBD handler specifically.
And last but not l
Hello,
I have a Ceph cluster with many 2 TB disks and, by mistake, only one 1 TB disk.
I want to replace the 1 TB disk with a 2 TB one...
What is the correct procedure?
Is it the same as for replacing a broken disk?
many thanks
--
Fabio
yes
From: Fabio - NS3 srl
Date: 2016-07-13 16:11
To: ceph-us...@ceph.com
Subject: [ceph-users] Change with disk from 1TB to 2TB
Hello,
I have a Ceph cluster with many 2 TB disks and, by mistake, only one 1 TB disk.
I want to replace the 1 TB disk with a 2 TB one...
What is the correct procedure?
Is it the same as for replacing a broken
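For what it's worth, a sketch of the usual replace-a-disk flow, which applies here as well. The OSD ID 12 and the device /dev/sdX are placeholders, and the new 2 TB disk will get a larger CRUSH weight, so expect some rebalancing:

ceph osd out 12                  # drain data off the old 1 TB OSD
# wait for rebalancing to finish and the cluster to return to HEALTH_OK
stop ceph-osd id=12              # upstart; with systemd: systemctl stop ceph-osd@12
ceph osd crush remove osd.12
ceph auth del osd.12
ceph osd rm 12
# physically swap the 1 TB disk for the 2 TB disk, then create a new OSD on it
ceph-disk prepare /dev/sdX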
Hello,
Looking at using 2 x 960GB SSDs (SM863).
The reason for the larger size is that I was thinking we would be better off with
them in RAID 1, so there is enough space for the OS and all journals.
Or am I better off using 2 x 200GB S3700s instead, with 5 disks per SSD?
Thanks,
Ashley
-Original Message-
From:
> Op 13 juli 2016 om 11:34 schreef Ashley Merrick :
>
>
> Hello,
>
> Looking at using 2 x 960GB SSD's (SM863)
>
> Reason for larger is I was thinking would be better off with them in Raid 1
> so enough space for OS and all Journals.
>
> Instead am I better off using 2 x 200GB S3700's instead
> Op 13 juli 2016 om 8:19 schreef Götz Reinicke - IT Koordinator
> :
>
>
> Hi,
>
> can anybody give some realworld feedback on what hardware
> (CPU/Cores/NIC) you use for a 40Gb (file)server (smb and nfs)? The Ceph
> Cluster will be mostly rbd images. S3 in the future, CephFS we will see :)
>
Okie, perfect.
This may sound like a random question, but what size would you recommend for the
SATA-DOM? Obviously I know the standard OS space requirements, but will Ceph
require much on the root OS of an OSD-only node apart from standard logs?
Ashley
-Original Message-
From: Wido den Hollander [m
> Op 13 juli 2016 om 11:51 schreef Ashley Merrick :
>
>
> Okie perfect.
>
> May sound a random question, but what size would you recommend for the
> SATA-DOM, obviously I know standard OS space requirements, but will CEPH
> required much on the root OS of a OSD only node apart from standard l
Hi,
This is an OSD box running Hammer on Ubuntu 14.04 LTS with additional
systems administration tools:
> $ df -h | grep -v /var/lib/ceph/osd
> Filesystem Size Used Avail Use% Mounted on
> udev5,9G 4,0K 5,9G 1% /dev
> tmpfs 1,2G 892K 1,2G 1% /run
> /dev/dm-1
Am 13.07.16 um 11:47 schrieb Wido den Hollander:
>> Op 13 juli 2016 om 8:19 schreef Götz Reinicke - IT Koordinator
>> :
>>
>>
>> Hi,
>>
>> can anybody give some realworld feedback on what hardware
>> (CPU/Cores/NIC) you use for a 40Gb (file)server (smb and nfs)? The Ceph
>> Cluster will be mostly
Hello,
Is it safe to rename a pool that has a cache tier? I want to standardize the
pool names, for example pools 'prod01' and 'cache-prod01'.
Should I remove the cache tier before renaming?
Regards,
--
Mateusz Skała
mateusz.sk...@budikom.net
budikom.net
ul. Trzy Lipy 3, GPNT, bud.
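For reference, the rename itself is one command per pool; whether a cache tier tolerates it is exactly the open question here, so treat this only as a sketch (the new names are examples):

ceph osd pool rename prod01 rbd-prod01
ceph osd pool rename cache-prod01 rbd-cache-prod01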
> Op 13 juli 2016 om 12:00 schreef Götz Reinicke - IT Koordinator
> :
>
>
> Am 13.07.16 um 11:47 schrieb Wido den Hollander:
> >> Op 13 juli 2016 om 8:19 schreef Götz Reinicke - IT Koordinator
> >> :
> >>
> >>
> >> Hi,
> >>
> >> can anybody give some realworld feedback on what hardware
> >> (C
Hi Cephers,
There's some physical maintenance I need to perform on an OSD node.
Very likely the maintenance is going to take a while since it involves
replacing components, so I would like to be well prepared.
Unfortunately it is not an option to add another OSD node or rebalance at
this time, so I
> Op 13 juli 2016 om 14:31 schreef Kees Meijs :
>
>
> Hi Cephers,
>
> There's some physical maintainance I need to perform on an OSD node.
> Very likely the maintainance is going to take a while since it involves
> replacing components, so I would like to be well prepared.
>
> Unfortunately it
If you stop the OSDs cleanly then that should cause no disruption to clients.
Starting the OSDs back up is another story: expect slow requests for a while
there, and unless you have lots of very fast CPUs on the OSD node, start them
one by one and not all at once.
Jan
> On 13 Jul 2016, at 14:37,
40Gbps can be used as 4*10Gbps
I guess the requested feedback should not be limited to "usage of 40Gbps
ports", but extended to "usage of more than a single 10Gbps port, e.g.
20Gbps, too".
Are there people here who are using more than 10G on a Ceph server?
On 13/07/2016 14:27, Wido den Hollander wr
Thanks!
So to sum up, I'd best (command sketch below):
* set the noout flag
* stop the OSDs one by one
* shut down the physical node
* yank the OSD drives to prevent ceph-disk(8) from automatically
activating them at boot time
* do my maintenance
* start the physical node
* reseat and activate the OSD drives
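In command form, a sketch of that plan (the OSD IDs are placeholders; the node runs Ubuntu 14.04 per earlier in the thread, so the upstart form is shown with the systemd equivalent as a comment):

ceph osd set noout               # keep CRUSH from marking the stopped OSDs out
stop ceph-osd id=3               # upstart; systemd: systemctl stop ceph-osd@3
stop ceph-osd id=4               # ...repeat for each OSD on the node
# power down, do the maintenance, power the node back up, reseat the drives
start ceph-osd id=3              # or let ceph-disk activate them again
start ceph-osd id=4
ceph osd unset noout             # once all OSDs are back up and in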
Am 13.07.16 um 14:27 schrieb Wido den Hollander:
>> Op 13 juli 2016 om 12:00 schreef Götz Reinicke - IT Koordinator
>> :
>>
>>
>> Am 13.07.16 um 11:47 schrieb Wido den Hollander:
Op 13 juli 2016 om 8:19 schreef Götz Reinicke - IT Koordinator
:
Hi,
can anybody gi
On 07/13/2016 08:41 AM, c...@jack.fr.eu.org wrote:
40Gbps can be used as 4*10Gbps
I guess welcome feedbacks should not be stuck by "usage of a 40Gbps
ports", but extented to "usage of more than a single 10Gbps port, eg
20Gbps etc too"
Is there people here that are using more than 10G on an ce
My OSDs have dual 40G NICs. I typically don't use more than 1Gbps on
either network. During heavy recovery activity (like if I lose a whole
server), I've seen up to 12Gbps on the cluster network.
For reference my cluster is 9 OSD nodes with 9x 7200RPM 2TB OSDs. They all
have RAID cards with 4GB o
> Op 13 juli 2016 om 14:47 schreef Kees Meijs :
>
>
> Thanks!
>
> So to sum up, I'd best:
>
> * set the noout flag
> * stop the OSDs one by one
> * shut down the physical node
> * jank the OSD drives to prevent ceph-disk(8) from automaticly
> activating at boot time
> * do my mai
Am 13.07.16 um 14:59 schrieb Joe Landman:
>
>
> On 07/13/2016 08:41 AM, c...@jack.fr.eu.org wrote:
>> 40Gbps can be used as 4*10Gbps
>>
>> I guess welcome feedbacks should not be stuck by "usage of a 40Gbps
>> ports", but extented to "usage of more than a single 10Gbps port, eg
>> 20Gbps etc too"
>
Looks good.
You can start several OSDs at a time as long as you have enough CPU and you're
not saturating your drives or controllers.
Jan
> On 13 Jul 2016, at 15:09, Wido den Hollander wrote:
>
>
>> Op 13 juli 2016 om 14:47 schreef Kees Meijs :
>>
>>
>> Thanks!
>>
>> So to sum up, I'd best
On Wed, Jul 13, 2016 at 12:14 AM, Di Zhang wrote:
> Hi,
>
> Is there any way to change the metadata pool for a cephfs without losing
> any existing data? I know how to clone the metadata pool using rados cppool.
> But the filesystem still links to the original metadata pool no matter what
> yo
The RAW file will appear to be the exact image size but the filesystem
will know about the holes in the image and it will be sparsely
allocated on disk. For example:
# dd if=/dev/zero of=sparse-file bs=1 count=1 seek=2GiB
# ll sparse-file
-rw-rw-r--. 1 jdillaman jdillaman 2147483649 Jul 13 09:20
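A quick way to see the difference between apparent and allocated size (generic coreutils commands, not tied to the exact listing above):

ls -ls sparse-file                  # first column: blocks actually allocated
du -h --apparent-size sparse-file   # logical size (~2.1G)
du -h sparse-file                   # physical usage (close to zero)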
Hello,
The last 3 days I worked at a customer with an 1800-OSD cluster which had to be
upgraded from Hammer 0.94.5 to Jewel 10.2.2.
The cluster in this case is 99% RGW, but also some RBD.
I wanted to share some of the things we encountered during this upgrade.
All 180 nodes are running CentOS 7.
I've run the Mellanox 40 gig card, ConnectX-3 Pro, but that's old now.
Back when I ran it, the drivers were kind of a pain to deal with in
Ubuntu, primarily during PXE. It should be better now though.
If you have the network to support it, 25GbE is quite a bit cheaper per
port, and won't be so ha
I am using these for other stuff:
http://www.supermicro.com/products/accessories/addon/AOC-STG-b4S.cfm
If you want a 40G NIC, also think of the "network side": SFP+ switches are very
common, 40G is less common, 25G is really new (= really few products)
On 13/07/2016 16:50, Warren Wang - ISD wrote:
> I
As you can see, you have an 'unknown' partition type. It should be 'ceph
journal' and 'ceph data'.
Stop ceph-osd, unmount the partitions and change the typecodes for the
partitions properly:
/sbin/sgdisk --typecode=PART:4fbd7e29-9d25-41b8-afd0-062c0ceff05d --
/dev/DISK
PART - number of partition with data (1
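As a concrete, hypothetical example, with the data on partition 1 and the journal on partition 2 of /dev/sdb (the second GUID is the standard 'ceph journal' partition type):

sgdisk --typecode=1:4fbd7e29-9d25-41b8-afd0-062c0ceff05d -- /dev/sdb   # ceph data
sgdisk --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 -- /dev/sdb   # ceph journal
partprobe /dev/sdb    # re-read the partition table so udev/ceph-disk see the change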
Thanks for sharing, Wido.
From your information you only talk about MON and OSD. What about the
RGW nodes? You stated in the beginning that 99% is rgw...
On Wed, Jul 13, 2016 at 3:56 PM, Wido den Hollander wrote:
> Hello,
>
> The last 3 days I worked at a customer with a 1800 OSD cluster which h
Hello.
On 07/13/2016 03:31 AM, Christian Balzer wrote:
Hello,
did you actually read my full reply last week, the in-line parts,
not just the top bit?
http://www.spinics.net/lists/ceph-users/msg29266.html
On Tue, 12 Jul 2016 16:16:09 +0300 George Shuklin wrote:
Yes, linear io speed was conce
We use all Cisco UCS servers (C240 M3 and M4s) with the PCIE VIC 1385 40G
NIC. The drivers were included in Ubuntu 14.04. I've had no issues with
the NICs or my network whatsoever.
We have two Cisco Nexus 5624Q that the OSD servers connect to. The
switches are just switching two VLANs (ceph c
Aside from the 10GbE vs 40GbE question, if you're planning to export an RBD
image over smb/nfs I think you are going to struggle to reach anywhere near
1GB/s in a single threaded read. This is because even with readahead
cranked right up you're still only going to be hitting a handful of disks at a
ti
Hello,
Sorry for the misunderstanding about IOPS. Here are some summary stats
of my benchmark (does 20-30 IOPS seem normal to you?):
ceph osd pool create test 512 512
rados bench -p test 10 write --no-cleanup
Total time run: 10.480383
Total writes made: 288
Write size:
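For comparison, the same benchmark at a 4K block size can be run like this (-b sets the write size in bytes; the pool name follows the example above):

rados bench -p test 10 write -b 4096 --no-cleanup
rados -p test cleanup        # remove the benchmark objects afterwards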
I also tried a 4K write bench. The IOPS is ~420. I used to have better
bandwidth when I used the same network for both the cluster and the clients. Now
the bandwidth must be limited by the 1G Ethernet. What would you suggest
I do?
Thanks,
On Wed, Jul 13, 2016 at 11:37 AM, Di Zhang wrote:
> Hell
I am trying to configure radosgw integration with keystone in the
following environment:
1) Use user/pass authentication with Keystone instead of admin token.
2) Use keystone v3 API
3) Keystone internal and admin URLs are non-SSL
4) Keystone is configured to use fernet tokens
My RGW configuration
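For context, a sketch of the kind of ceph.conf settings this involves on Jewel; every name and value below is illustrative and is not the poster's actual configuration:

[client.rgw.gateway]                              # hypothetical RGW instance section
rgw keystone api version = 3
rgw keystone url = http://keystone.example:5000   # non-SSL internal endpoint (example)
rgw keystone admin user = rgw                     # service user instead of admin token
rgw keystone admin password = secret
rgw keystone admin domain = Default
rgw keystone admin project = service
rgw keystone accepted roles = admin, _member_
rgw s3 auth use keystone = true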
Hi All,
Have a question on the performance of sequential write @ 4K block sizes.
Here is my configuration:
Ceph Cluster: 6 Nodes. Each node with :-
20x HDDs (OSDs) - 10K RPM 1.2 TB SAS disks
SSDs - 4x - Intel S3710, 400GB; for OSD journals shared across 20 HDDs (i.e.,
SSD journal ratio 1:5)
Ne
Hello,
On Wed, 13 Jul 2016 12:01:14 -0500 Di Zhang wrote:
> I also tried 4K write bench. The IOPS is ~420.
That's what people usually mean (4KB blocks) when talking about IOPS.
This number is pretty low; my guess would be network latency on your 1Gbps
network for the most part.
You should run
Hi,
I just installed Jewel on a small cluster of 3 machines with 4 SSDs each. I
created 8 RBD images, and use a single client, with 8 threads, to do random
writes (using fio with the RBD engine) on the images (1 thread per image).
The cluster has 3X replication and 10G cluster and client networks.
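For reference, a minimal fio job file of the kind described (pool, image and client names are placeholders; one such section per image gives the 8 parallel jobs):

[rbd-randwrite-1]
ioengine=rbd
clientname=admin        # cephx user, without the "client." prefix
pool=rbd
rbdname=image01
rw=randwrite
bs=4k
iodepth=32
direct=1
time_based=1
runtime=60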
Hi,
Just wondering why you want each OSD inside a separate LXC container? Just to
pin them to specific CPUs?
On Tue, Jul 12, 2016 at 6:33 AM, Guillaume Comte <
guillaume.co...@blade-group.com> wrote:
> Hi,
>
> I am currently defining a storage architecture based on ceph, and i wish
> to know if i don
Pankaj,
Could be related to the new throttle parameters introduced in Jewel. By default
these throttles are off; you need to tweak them according to your setup.
What are your journal size and fio block size?
If it is the default 5GB, with the rate (assuming 4K RW) you mentioned and
considering 3X
Also increase the following:
filestore_op_threads
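In ceph.conf terms the suggestion amounts to something like this; the value is only an example (the default is 2) and the right number depends on the hardware:

[osd]
filestore op threads = 8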
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Somnath Roy
Sent: Wednesday, July 13, 2016 5:47 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Terrible RBD performance with Jewel
Pankaj,
Could
Thanks Somnath. I will try all these, but I think there is something else going
on too.
Firstly, my test sometimes reaches 0 IOPS within 10 seconds.
Secondly, when I'm at 0 IOPS, I see NO disk activity in iostat and no CPU
activity either. This part is strange.
Thanks
Pankaj
From: Somnath Roy [m
Hi,
I am seeing an issue. I created 5 images, testvol11-15, and mapped them to
/dev/rbd0-4. When I execute the command 'rbd showmapped', it correctly shows
the images and the mappings, as shown below:
[root@ep-compute-2-16 run1]# rbd showmapped
id pool image snap device
0 testpool test
In fact, I was wrong; I missed that you are running with 12 OSDs (considering one
OSD per SSD). In that case, it will take ~250 seconds to fill up the journal.
Have you preconditioned the entire image with a bigger block size, say 1M, before
doing any real test?
From: Garg, Pankaj [mailto:pankaj.g...@cavium.
No I have not.
From: Somnath Roy [mailto:somnath@sandisk.com]
Sent: Wednesday, July 13, 2016 6:00 PM
To: Garg, Pankaj; ceph-users@lists.ceph.com
Subject: RE: Terrible RBD performance with Jewel
In fact, I was wrong , I missed you are running with 12 OSDs (considering one
OSD per SSD). In tha
You should do that first to get stable performance out of filestore.
A 1M sequential write over the entire image should be sufficient to precondition it.
From: Garg, Pankaj [mailto:pankaj.g...@cavium.com]
Sent: Wednesday, July 13, 2016 6:04 PM
To: Somnath Roy; ceph-users@lists.ceph.com
Subject: RE: Terr
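A sketch of such a preconditioning pass using fio's RBD engine (names are placeholders; a sequential 1M write over the whole image, run once per image before benchmarking):

[precondition]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=image01
rw=write          # sequential write over the full image size
bs=1M
iodepth=16
direct=1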
Hi,
You need to specify pool name.
rbd -p testpool info testvol11
On Thu, Jul 14, 2016 at 8:55 AM, EP Komarla
wrote:
> Hi,
>
>
>
> I am seeing an issue. I created 5 images testvol11-15 and I mapped them
> to /dev/rbd0-4. When I execute the command ‘rbd showmapped’, it shows
> correctly t
Thanks. It works.
From: c.y. lee [mailto:c...@inwinstack.com]
Sent: Wednesday, July 13, 2016 6:17 PM
To: EP Komarla
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] rbd command anomaly
Hi,
You need to specify pool name.
rbd -p testpool info testvol11
On Thu, Jul 14, 2016 at 8:55 A
I agree, but I'm dealing with something else here with this setup.
I just ran a test, and within 3 seconds my IOPS went to 0 and stayed there for
90 seconds... then it started again and within seconds went back to 0.
This doesn't seem normal at all. Here is my ceph.conf:
[global]
fsid =
Hello CEPH-users,
I am looking to hire Ceph developers for my team in Bangalore. If anyone is
keen to explore, do unicast me at jhusthi...@walmartlabs.com.
Sorry folks for using this forum, thought of dropping a note.
Thanks,
Janardhan
I am not sure whether you need to set the following. What's the point of
reducing the inline xattr settings? I forgot the calculation, but lower values
could redirect your xattrs to omap. Better to comment those out.
filestore_max_inline_xattr_size = 254
filestore_max_inline_xattrs = 6
We could do some i
As Somnath mentioned, you've got a lot of tunables set there. Are you
sure those are all doing what you think they are doing?
FWIW, the xfs -n size=64k option is probably not a good idea.
Unfortunately it can't be changed without making a new filesystem.
See:
http://lists.ceph.com/pipermail
Hello,
On Wed, 13 Jul 2016 09:34:35 + Ashley Merrick wrote:
> Hello,
>
> Looking at using 2 x 960GB SSD's (SM863)
>
Massive overkill.
> Reason for larger is I was thinking would be better off with them in Raid 1
> so enough space for OS and all Journals.
>
As I pointed out several times
Hi,
I changed to only use the InfiniBand network. For the 4KB write, the
IOPS doesn't improve much. I also logged into the OSD nodes, and atop showed the
disks are not always at 100% busy. Please check a snapshot of one node below:
DSK | sdc | busy 72% | read 20/s | wri
Hello,
On Wed, 13 Jul 2016 18:15:10 + EP Komarla wrote:
> Hi All,
>
> Have a question on the performance of sequential write @ 4K block sizes.
>
Which version of Ceph?
Any significant ceph.conf modifications?
> Here is my configuration:
>
> Ceph Cluster: 6 Nodes. Each node with :-
> 20x
Hello,
On Wed, 13 Jul 2016 22:47:05 -0500 Di Zhang wrote:
> Hi,
> I changed to only use the infiniband network. For the 4KB write, the
> IOPS doesn’t improve much.
That's mostly going to be bound by latencies (as I just wrote in the other
thread), both network and internal Ceph ones.
T
Hello,
I have a ceph cluster where one OSD is failing to start. I have been
upgrading Ceph to see if the error disappeared. Now I'm running Jewel but I
still get the error message.
-31> 2016-07-13 17:03:30.474321 7fda18a8b700 2 -- 10.0.6.21:6800/1876
>> 10.0.5.71:6789/0 pipe(0x7fdb5712a
On Thu, Jul 14, 2016 at 06:06:58AM +0200, Martin Wilderoth wrote:
> Hello,
>
> I have a ceph cluster where the one osd is failng to start. I have been
> upgrading ceph to see if the error dissappered. Now I'm running jewel but I
> still get the error message.
>
> -1> 2016-07-13 17:04:22.061
To add, the RGWs upgraded just fine as well.
No regions in use here (yet!), so that upgraded as it should.
Wido
> Op 13 juli 2016 om 16:56 schreef Wido den Hollander :
>
>
> Hello,
>
> The last 3 days I worked at a customer with a 1800 OSD cluster which had to
> be upgraded from Hammer 0.94.