Thanks Greg. I thought it was impossible when I reported 34MB for 52
million files.
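
If anyone else wants to sanity-check their numbers along the lines Greg
suggests below, here's a rough, untested sketch that compares the
cluster-wide raw usage from "ceph df --format json" against the sum of the
per-pool figures; the gap is only a coarse upper bound on the omap/RocksDB
data, since it also lumps in journals and other overhead. The JSON field
names (stats.total_used_bytes, pools[].stats.bytes_used) are what I'd
expect on Jewel/Luminous-era releases and may differ on yours:

#!/usr/bin/env python
# Rough estimate of cluster space not attributed to any pool. This
# includes omap/RocksDB data, OSD journals and other overhead, so it is
# only an upper bound, not the CephFS metadata size itself.
import json
import subprocess

df = json.loads(subprocess.check_output(["ceph", "df", "--format", "json"]))

raw_used = df["stats"]["total_used_bytes"]

# Per-pool bytes_used is not replication-adjusted on older releases, so
# multiply by each pool's replica count for a fairer comparison if that
# applies to your cluster.
pool_used = sum(p["stats"]["bytes_used"] for p in df["pools"])

print("raw used:        %d bytes" % raw_used)
print("sum of pools:    %d bytes" % pool_used)
print("unaccounted for: %d bytes (omap + journals + overhead)"
      % (raw_used - pool_used))

It's crude, but it at least makes Greg's point visible: the per-pool stats
simply don't see the omap side of the metadata.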

On Jul 19, 2017 1:17 PM, "Gregory Farnum" <gfar...@redhat.com> wrote:

>
>
> On Wed, Jul 19, 2017 at 10:25 AM David <dclistsli...@gmail.com> wrote:
>
>> On Tue, Jul 18, 2017 at 6:54 AM, Blair Bethwaite <
>> blair.bethwa...@gmail.com> wrote:
>>
>>> We are a data-intensive university, with an increasingly large fleet
>>> of scientific instruments capturing various types of data (mostly
>>> imaging of one kind or another). That data typically needs to be
>>> stored, protected, managed, shared, connected/moved to specialised
>>> compute for analysis. Given the large variety of use-cases we are
>>> being somewhat more circumspect in our CephFS adoption and really only
>>> dipping toes in the water, ultimately hoping it will become a
>>> long-term default NAS choice from Luminous onwards.
>>>
>>> On 18 July 2017 at 15:21, Brady Deetz <bde...@gmail.com> wrote:
>>> > All of that said, you could also consider using rbd and zfs or
>>> whatever filesystem you like. That would allow you to gain the benefits of
>>> scale-out while still getting a feature-rich fs. But there are some
>>> downsides to that architecture too.
>>>
>>> We do this today (KVMs with a couple of large RBDs attached via
>>> librbd+QEMU/KVM), but the throughput achievable this way is nothing like
>>> native CephFS - adding more RBDs doesn't seem to increase overall
>>> throughput. Also, if you have NFS clients you will absolutely need an SSD
>>> ZIL. And of course you then have a single point of failure and downtime
>>> for regular updates, etc.
>>>
>>> In terms of small-file performance, I'm interested to hear about
>>> experiences with in-line file storage on the MDS.
>>>
>>> Also, while we're talking about CephFS - what size metadata pools are
>>> people seeing on their production systems with 10s-100s millions of
>>> files?
>>>
>>
>> On a system with 10.1 million files, the metadata pool is 60MB
>>
>>
> Unfortunately that's not really an accurate assessment, for good but
> terrible reasons:
> 1) CephFS metadata is principally stored via the omap interface (which is
> designed for handling things like the directory storage CephFS needs)
> 2) omap is implemented via LevelDB/RocksDB
> 3) there is no good way to determine which pool is responsible for
> which portion of RocksDB's data
> 4) So the pool stats do not incorporate omap data usage at all in their
> reports (it's part of the overall space used, and is one of the things that
> can make that larger than the sum of the per-pool spaces)
>
> You could try to estimate it by looking at how much "lost" space there is
> (and subtracting out journal sizes and things, depending on setup). But I
> promise there's more than 60MB of CephFS metadata for 10.1 million files!
> -Greg
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
