Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-13 Thread Les Mikesell
JohnS wrote: > On Mon, 2009-07-13 at 05:49 +, o wrote: > >>> It is 1024 chars long. Which still won't help. >> I'm using MyISAM and according to: >> http://dev.mysql.com/doc/refman/5.1/en/myisam-storage-engine.html >> "The maximum key length is 1000 bytes. This can also

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-13 Thread JohnS
On Mon, 2009-07-13 at 05:49 +, o wrote: > >It is 1024 chars long. Which still won't help. > I'm using MyISAM and according to: > http://dev.mysql.com/doc/refman/5.1/en/myisam-storage-engine.html > "The maximum key length is 1000 bytes. This can also be changed by changi

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-12 Thread oooooooooooo ooooooooooooo
>How many files per directory do you have? I have 4 directory levels, 65536 leaf directories and around 200 files per dir (15M in total). >Something is wrong. Got to figure this out. Where did this RAM go? Thanks. I reduced the memory usage of mysql and my app and I got around a 15% pe
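The figures in this reply can be sanity-checked with a little arithmetic. The sketch below assumes each directory level is named with one hexadecimal character (16 choices), as in the /c/2/a/4/ layout mentioned elsewhere in the thread, and treats "15M" as exactly 15,000,000; both are assumptions made only for illustration.

```python
# Assumption: one hexadecimal character (16 choices) per directory level,
# matching the /c/2/a/4/ layout discussed in this thread.
LEVELS = 4
FANOUT = 16
TOTAL_FILES = 15_000_000  # "15M in total", taken as a round number

leaf_dirs = FANOUT ** LEVELS            # number of leaf directories
files_per_leaf = TOTAL_FILES / leaf_dirs  # average load per leaf directory

print(leaf_dirs)              # 65536
print(round(files_per_leaf))  # 229
```

An average of roughly 229 files per leaf directory is in line with the "around 200 files per dir" reported above.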

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread JohnS
On Sat, 2009-07-11 at 11:48 -0400, JohnS wrote: > On Sat, 2009-07-11 at 00:01 +, o wrote: > > > You mentioned that the data can be retrieved from somewhere else. Is > > > some part of this filename a unique key? > > > > The real key is up to 1023 characters long and i

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread JohnS
On Sat, 2009-07-11 at 00:01 +, o wrote: > > You mentioned that the data can be retrieved from somewhere else. Is > > some part of this filename a unique key? > > The real key is up to 1023 characters long and it's unique, but I have to trim > to 256 characters, by thi

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread Alexander Georgiev
> > Thanks, using directories as file names is a great idea, anyway I'm not sure > if that would solve my performance issue, as the bottleneck is the disk and > not mysql. The situation you described initially suffers from only one issue - too many files in one single directory. You are not the

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread oooooooooooo ooooooooooooo
Thanks, using directories as file names is a great idea; anyway, I'm not sure if that would solve my performance issue, as the bottleneck is the disk and not mysql. I just implemented the directory names based on the hash of the file and the performance is a bit slower than before. This is the

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-11 Thread Alexander Georgiev
2009/7/11 o : > >> You mentioned that the data can be retrieved from somewhere else. Is >> some part of this filename a unique key? > > The real key is up to 1023 characters long and it's unique, but I have to trim > to 256 characters; this way it is not unique unless I add

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo
> You mentioned that the data can be retrieved from somewhere else. Is > some part of this filename a unique key? The real key is up to 1023 characters long and it's unique, but I have to trim to 256 characters; this way it is not unique unless I add the hash. >Do you have to track this > relati

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Les Mikesell
o wrote: >> I don't think you've explained the constraint that would make you use >> mysql or not. > > My original idea was using just the hash as filename; this way I could > have direct access. But the customer rejected this and requested to have > part of the

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo
According to my tests the average size per file is around 15KB (although there are files from 1Kb to 150KB).

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Alexander Georgiev
2009/7/10, Filipe Brandenburger : > On Fri, Jul 10, 2009 at 16:21, Alexander > Georgiev wrote: >> I would use either only a database, or only the file system. To me - >> using them both is a violation of KISS. > > I disagree with your general statement. > > Storing content that is appropriate for f

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Filipe Brandenburger
On Fri, Jul 10, 2009 at 16:21, Alexander Georgiev wrote: > I would use either only a database, or only the file system. To me - > using them both is a violation of KISS. I disagree with your general statement. Storing content that is appropriate for files (e.g., pictures) as BLOBs in an SQL datab

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Alexander Georgiev
2009/7/10, o : > > Ok, I could use mysql, but consider that we have around 15M entries and I would have > to add to each a file from 1KB to 150KB; in total the file size can be > around 200GB. How would this perform in mysql? > in the worst case - 150kb for a 150

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo
Ok, I could use mysql, but consider that we have around 15M entries and I would have to add to each a file from 1KB to 150KB; in total the file size can be around 200GB. How would this perform in mysql?

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Alexander Georgiev
> > My original idea was using just the hash as filename; this way I > could have direct access. But the customer rejected this and requested to > have part of the long file name (from 11 to 1023 characters). As Linux only > allows 256 characters in the path and I could get duplicates with

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo
>I don't think you've explained the constraint that would make you use > mysql or not. My original idea was using just the hash as filename; this way I could have direct access. But the customer rejected this and requested to have part of the long file name (from 11 to 1023 characters)

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread Les Mikesell
o wrote: > Hi, After talking with the customer, I finally managed to convince him to > use the first characters of the hash as directory names. > > Now I'm in doubt about the following options: > > a) Using 4 directory levels /c/2/a/4/ (200 files per directory) and mys

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-10 Thread oooooooooooo ooooooooooooo
Hi, After talking with the customer, I finally managed to convince him to use the first characters of the hash as directory names. Now I'm in doubt about the following options: a) Using 4 directory levels /c/2/a/4/ (200 files per directory) and mysql with a hash->filename table, so I can get

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread James A. Peltier
On a side note, perhaps this is something that Hadoop would be good with. -- James A. Peltier Systems Analyst (FASNet), VIVARIUM Technical Director HPC Coordinator Simon Fraser University - Burnaby Campus Phone : 778-782-6573 Fax : 778-782-3045 E-Mail : jpelt...@sfu.ca Website : http://www

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread JohnS
On Thu, 2009-07-09 at 10:09 -0700, James A. Peltier wrote: > On Thu, 9 Jul 2009, o wrote: > > > > > It's possible that I will be able to name the directory tree based on the > > hash of the file, so I would get the structure described in one of my > > previous posts (4 di

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread James A. Peltier
On Thu, 9 Jul 2009, o wrote: > > It's possible that I will be able to name the directory tree based on the > hash of the file, so I would get the structure described in one of my previous > posts (4 directory levels, each directory name would be a single character > from

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-09 Thread JohnS
On Wed, 2009-07-08 at 16:14 -0600, Frank Cox wrote: > On Wed, 08 Jul 2009 18:09:28 -0400 > Filipe Brandenburger wrote: > > > You can hash it and still keep the original filename, and you don't > > even need a MySQL database to do lookups. > > Now that is slick as all get-out. I'm really impress

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo
>There's C code to do this in squid, and backuppc does it in perl (for a pool directory where all identical files are hardlinked). Unfortunately I have to write the file with some predefined format, so these would not provide the flexibility I need. >Rethink how you're writing files or you'll

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Alexander Georgiev
2009/7/9, o : > > After a quick calculation, that could put around 3200 files per directory (I > have around 15 million files). I think that above 1000 files the > performance will start to degrade significantly; anyway, it would be a matter > of doing some benchmarks. de

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread nate
James A. Peltier wrote: > There isn't a good file system for this type of thing. filesystems with > many very small files are always slow. Ext3, XFS, JFS are all terrible > for this type of thing. I can think of one...though you'll pay out the ass for it, the Silicon file system from BlueArc (N

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread James A. Peltier
On Wed, 8 Jul 2009, o wrote: > > Hi, > > I have a program that writes lots of files to a directory tree (around 15 > million files), and a node can have up to 40 files (and I don't have > any way to split this amount into smaller ones). As the number of files grows

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread James A. Peltier
On Wed, 8 Jul 2009, o wrote: > > Hi, > > I have a program that writes lots of files to a directory tree (around 15 > million files), and a node can have up to 40 files (and I don't have > any way to split this amount into smaller ones). As the number of files grows

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Les Mikesell
o wrote: >> You can hash it and still keep the original filename, and you don't >> even need a MySQL database to do lookups. > > There is an issue I forgot to mention: the original file name can be up to > 1023 characters long. As Linux only allows 256 characters in the

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo
> You can hash it and still keep the original filename, and you don't > even need a MySQL database to do lookups. There is an issue I forgot to mention: the original file name can be up to 1023 characters long. As Linux only allows 256 characters in the file path, I could have a (very small)
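One way to keep part of a long name and still avoid collisions, as discussed here, is to truncate the name and append a hash of the full name. The following is a minimal sketch, not the poster's code: `safe_filename` is a hypothetical helper, SHA-1 is an arbitrary choice (the thread does not say which hash is used), and the limit is set to 255 bytes, which is the usual per-component limit on common Linux filesystems such as ext3 rather than the 256 quoted above. Uniqueness relies on hash collisions being negligibly unlikely, and names containing "/" would need extra handling.

```python
import hashlib

# Assumption: 255 bytes per path component (NAME_MAX on common Linux filesystems).
NAME_MAX = 255


def safe_filename(original_name: str, limit: int = NAME_MAX) -> str:
    """Return a readable prefix of the name plus a hash of the full name."""
    raw = original_name.encode("utf-8")
    digest = hashlib.sha1(raw).hexdigest()  # 40 hex characters
    room = limit - len(digest) - 1          # bytes left for the prefix and "-"
    # Cut on bytes, then drop any multi-byte character split by the cut.
    prefix = raw[:room].decode("utf-8", errors="ignore")
    return f"{prefix}-{digest}"


# Two 1023-character names that differ only in their last character.
name_a = "x" * 1022 + "a"
name_b = "x" * 1022 + "b"
file_a = safe_filename(name_a)
file_b = safe_filename(name_b)
print(len(file_a.encode("utf-8")), file_a != file_b)
```

Because the hash is computed from the full original name, the stored name can be recomputed from the key alone, without a lookup table.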

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Frank Cox
On Wed, 08 Jul 2009 18:09:28 -0400 Filipe Brandenburger wrote: > You can hash it and still keep the original filename, and you don't > even need a MySQL database to do lookups. Now that is slick as all get-out. I'm really impressed by your scheme, though I don't actually have any use for it right a

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Filipe Brandenburger
Hi, On Wed, Jul 8, 2009 at 17:59, o wrote: > My original idea was storing the file with a hash of its name, and then storing > a hash->real filename mapping in mysql. This way I have direct access to the file > and I can make a directory hierarchy with the first characters of te
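The idea quoted here (store each file under a hash of its name, with directories taken from the first characters of that hash) can be sketched as follows. This is only an illustration under stated assumptions: `hashed_path` is a hypothetical helper, SHA-1 stands in for whatever hash the poster used, and the four single-character levels follow the /c/2/a/4/ layout mentioned elsewhere in the thread. It is not necessarily the scheme proposed in the reply.

```python
import hashlib
from pathlib import Path


def hashed_path(root: str, original_name: str, levels: int = 4) -> Path:
    """Map a name to root/<h0>/<h1>/.../<full hash>, one hex character per level."""
    digest = hashlib.sha1(original_name.encode("utf-8")).hexdigest()
    # The first `levels` characters of the digest become nested directory names.
    return Path(root).joinpath(*digest[:levels], digest)


# The path is a pure function of the name, so locating a file needs no table.
example = hashed_path("store", "some very long original file name")
print(example)
```

A hash->filename table, as in the quoted message, is only needed for the reverse direction: recovering the original long name from a stored file.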

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo
:27:40 +0000 > Subject: [CentOS] Question about optimal filesystem with many small files. > > > Hi, > > I have a program that writes lots of files to a directory tree (around 15 > million files), and a node can have up to 40 files (and I don't have > any way to split

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread oooooooooooo ooooooooooooo
lso write your address header ? :) Thanks for the help. > From: hhh...@hotmail.com > To: centos@centos.org > Date: Wed, 8 Jul 2009 06:27:40 +0000 > Subject: [CentOS] Question about optimal filesystem with many small files. > > > Hi,

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Gary Greene
On 7/8/09 8:56 AM, "Les Mikesell" wrote: > o wrote: >> Hi, >> >> I have a program that writes lots of files to a directory tree (around 15 >> million files), and a node can have up to 40 files (and I don't have >> any way to split this amount into smaller ones). As

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Kwan Lowe
On Wed, Jul 8, 2009 at 2:27 AM, o < hhh...@hotmail.com> wrote: > > Hi, > > I have a program that writes lots of files to a directory tree (around 15 > million files), and a node can have up to 40 files (and I don't have > any way to split this amount into smaller one

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-08 Thread Les Mikesell
o wrote: > Hi, > > I have a program that writes lots of files to a directory tree (around 15 > million files), and a node can have up to 40 files (and I don't have > any way to split this amount into smaller ones). As the number of files grows, > my application ge

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-07 Thread Per Qvindesland
Perhaps think about running tune2fs; maybe also consider adding noatime. Regards Per E-mail: p...@norhex.com [1] http://www.linkedin.com/in/perqvindesland [2] --- Original message follows --- SUBJECT: Re: [CentOS] Question about optimal filesystem with many small files. FROM:  Niki Kovacs TO

Re: [CentOS] Question about optimal filesystem with many small files.

2009-07-07 Thread Niki Kovacs
o wrote: > Hi, > > I have a program that writes lots of files to a directory tree Did that program also write your address header? :o)

[CentOS] Question about optimal filesystem with many small files.

2009-07-07 Thread oooooooooooo ooooooooooooo
Hi, I have a program that writes lots of files to a directory tree (around 15 million files), and a node can have up to 40 files (and I don't have any way to split this amount into smaller ones). As the number of files grows, my application gets slower and slower (the app works someth