JohnS wrote:
> On Mon, 2009-07-13 at 05:49 +0000, o wrote:
>
>>> It is 1024 chars long. Which still won't help.
>> I'm using MyISAM, and according to:
>> http://dev.mysql.com/doc/refman/5.1/en/myisam-storage-engine.html
>> "The maximum key length is 1000 bytes. This can also
On Mon, 2009-07-13 at 05:49 +0000, o wrote:
> >It is 1024 chars long. Which still won't help.
> I'm using MyISAM, and according to:
> http://dev.mysql.com/doc/refman/5.1/en/myisam-storage-engine.html
> "The maximum key length is 1000 bytes. This can also be changed by changing
> How many files per directory do you have?
I have 4 directory levels, 65536 leaf directories, and around 200 files per
dir (15M files in total).
> Something is wrong. Got to figure this out. Where did this RAM go?
Thanks, I reduced the memory usage of MySQL and of my app, and got around a 15%
performance improvement.
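A minimal sketch of the workaround the thread converges on. The thread never names a language, so Python and SHA-1 are assumptions here; any fixed-length digest would do. Indexing the digest instead of the raw name keeps the key well under MyISAM's 1000-byte limit:

  import hashlib

  def key_digest(name: str) -> str:
      # A name of any length maps to a fixed 40-char hex string,
      # which fits comfortably in a MyISAM index.
      return hashlib.sha1(name.encode("utf-8")).hexdigest()

  long_key = "x" * 1024               # too long to index directly
  print(len(key_digest(long_key)))    # always 40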
On Sat, 2009-07-11 at 11:48 -0400, JohnS wrote:
> On Sat, 2009-07-11 at 00:01 +0000, o wrote:
> > > You mentioned that the data can be retrieved from somewhere else. Is
> > > some part of this filename a unique key?
> >
> > The real key is up to 1023 characters long and it's unique
On Sat, 2009-07-11 at 00:01 +0000, o wrote:
> > You mentioned that the data can be retrieved from somewhere else. Is
> > some part of this filename a unique key?
>
> The real key is up to 1023 characters long and it's unique, but I have to trim
> it to 256 characters, and that way it is not unique unless I add the hash.
>
> Thanks, using directories as file names is a great idea, although I'm not sure
> if that would solve my performance issue, as the bottleneck is the disk and
> not MySQL.
The situation you described initially suffers from only one issue -
too many files in a single directory. You are not the
Thanks, using directories as file names is a great idea, although I'm not sure
if that would solve my performance issue, as the bottleneck is the disk and not
MySQL. I just implemented the directory names based on the hash of the file,
and the performance is a bit slower than before. This is the
2009/7/11 o :
>
>> You mentioned that the data can be retrieved from somewhere else. Is
>> some part of this filename a unique key?
>
> The real key is up to 1023 characters long and it's unique, but I have to trim
> it to 256 characters, and that way it is not unique unless I add the hash.
> You mentioned that the data can be retrieved from somewhere else. Is
> some part of this filename a unique key?
The real key is up to 1023 characters long and it's unique, but I have to trim
it to 256 characters, and that way it is not unique unless I add the hash.
>Do you have to track this
> relati
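A sketch of the trim-plus-hash construction described above. Python and SHA-1 are assumptions; the 255-byte limit is ext3's per-component NAME_MAX, and mostly ASCII names are assumed. Two names that agree on their whole truncated prefix still get distinct stored names:

  import hashlib

  MAX_NAME = 255  # ext3 allows at most 255 bytes per path component

  def safe_name(real_key: str) -> str:
      # Truncate the long name, then append its digest so that names
      # sharing a prefix remain distinct.
      digest = hashlib.sha1(real_key.encode("utf-8")).hexdigest()  # 40 chars
      return real_key[:MAX_NAME - len(digest) - 1] + "_" + digest

  a = safe_name("x" * 1000 + "a")
  b = safe_name("x" * 1000 + "b")
  assert a != b and len(a) <= MAX_NAME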
o wrote:
>> I don't think you've explained the constraint that would make you use
>> mysql or not.
>
> My original idea was using just the hash as the filename; that way I could
> have direct access. But the customer rejected this and requested to have
> part of the
According to my tests, the average size per file is around 15KB (although there
are files from 1KB to 150KB).
2009/7/10, Filipe Brandenburger :
> On Fri, Jul 10, 2009 at 16:21, Alexander
> Georgiev wrote:
>> I would use either only a database, or only the file system. To me -
>> using them both is a violation of KISS.
>
> I disagree with your general statement.
>
> Storing content that is appropriate for files
On Fri, Jul 10, 2009 at 16:21, Alexander
Georgiev wrote:
> I would use either only a database, or only the file system. To me -
> using them both is a violation of KISS.
I disagree with your general statement.
Storing content that is appropriate for files (e.g., pictures) as
BLOBs in an SQL database
2009/7/10, o :
>
> Ok, I could use MySQL, but consider that we have around 15M entries and I would
> have to add to each a file from 1KB to 150KB; in total the file size can be
> around 200GB. How will this perform in MySQL?
>
in the worst case - 150KB for a 150
Ok, I could use MySQL, but consider that we have around 15M entries and I would
have to add to each a file from 1KB to 150KB; in total the file size can be
around 200GB. How will this perform in MySQL?
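The numbers here check out; a quick back-of-the-envelope check in Python, using the 15KB average the poster measures later in the thread:

  files = 15_000_000
  avg_kb = 15                        # measured average; range is 1KB..150KB
  total_gb = files * avg_kb / 1024 / 1024
  print(round(total_gb))             # ~215 GB, consistent with "around 200GB"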
>
> My original idea was using just the hash as the filename; that way I
> could have direct access. But the customer rejected this and requested to
> have part of the long file name (from 11 to 1023 characters). As Linux only
> allows 256 characters in the path, I could get duplicates with
> I don't think you've explained the constraint that would make you use
> mysql or not.
My original idea was using just the hash as the filename; that way I could
have direct access. But the customer rejected this and requested to have part
of the long file name (from 11 to 1023 characters)
o wrote:
> Hi, After talking with the customer, I finally managed to convince him to use
> the first characters of the hash as directory names.
>
> Now I'm in doubt about the following options:
>
> a) Using 4 directory levels /c/2/a/4/ (200 files per directory) and mys
Hi, After talking with the customer, I finally managed to convince him to use
the first characters of the hash as directory names.
Now I'm in doubt about the following options:
a) Using 4 directory levels /c/2/a/4/ (200 files per directory) and MySQL with
a hash->filename table, so I can get
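A sketch of how option a) might derive the leaf directory. Python and SHA-1 are assumptions; the thread only specifies one hex character per level, four levels deep:

  import hashlib, os

  LEVELS = 4  # 16**4 = 65536 leaf directories; 15M files -> ~229 per leaf

  def leaf_dir(root: str, name: str) -> str:
      h = hashlib.sha1(name.encode("utf-8")).hexdigest()
      # One hex character per level, e.g. /data/c/2/a/4
      return os.path.join(root, *h[:LEVELS])

  print(leaf_dir("/data", "some-very-long-original-name"))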
On a side note, perhaps this is something that Hadoop would be good at.
--
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
HPC Coordinator
Simon Fraser University - Burnaby Campus
Phone : 778-782-6573
Fax : 778-782-3045
E-Mail : jpelt...@sfu.ca
Website : http://www
On Thu, 2009-07-09 at 10:09 -0700, James A. Peltier wrote:
> On Thu, 9 Jul 2009, o wrote:
>
> >
> > It's possible that I will be able to name the directory tree based on the
> > hash of the file, so I would get the structure described in one of my
> > previous posts (4 di
On Thu, 9 Jul 2009, o wrote:
>
> It's possible that I will be able to name the directory tree based on the
> hash of the file, so I would get the structure described in one of my previous
> posts (4 directory levels, each directory name would be a single character
> from
On Wed, 2009-07-08 at 16:14 -0600, Frank Cox wrote:
> On Wed, 08 Jul 2009 18:09:28 -0400
> Filipe Brandenburger wrote:
>
> > You can hash it and still keep the original filename, and you don't
> > even need a MySQL database to do lookups.
>
> Now that is slick as all get-out. I'm really impressed
> There's C code to do this in squid, and backuppc does it in perl (for a
> pool directory where all identical files are hardlinked).
Unfortunately I have to write the file in a predefined format, so these
would not provide the flexibility I need.
> Rethink how you're writing files or you'll
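For reference, the hardlinked-pool idea quoted above works roughly like this. This is a sketch in the spirit of backuppc's pool, not its actual code; the names and Python are illustrative:

  import hashlib, os

  def store(pool: str, dest: str, data: bytes) -> None:
      # Content-addressed pool: identical data maps to one pool file,
      # and every occurrence elsewhere is just a hardlink to it.
      h = hashlib.sha1(data).hexdigest()
      p = os.path.join(pool, h[:2], h[2:4], h)
      os.makedirs(os.path.dirname(p), exist_ok=True)
      if not os.path.exists(p):
          with open(p, "wb") as f:
              f.write(data)
      os.link(p, dest)  # fails if dest already exists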
2009/7/9, o :
>
> After a quick calculation, that could put around 3200 files per directory (I
> have around 15 million files); I think that above 1000 files the
> performance will start to degrade significantly, though it would be a matter
> of doing some benchmarks.
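The quick calculation above is easy to reproduce. For 15 million files, files per directory at each hex-prefix depth:

  files = 15_000_000
  for levels in range(1, 5):
      dirs = 16 ** levels
      print(levels, dirs, files // dirs)
  # 3 levels -> 4096 dirs (~3662/dir); 4 levels -> 65536 dirs (~228/dir)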
James A. Peltier wrote:
> There isn't a good file system for this type of thing. Filesystems with
> many very small files are always slow. Ext3, XFS, and JFS are all terrible
> for this type of thing.
I can think of one...though you'll pay out the ass for it, the
Silicon file system from BlueArc (N
On Wed, 8 Jul 2009, o wrote:
>
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15
> million files), and a node can have up to 40 files (and I don't have
> any way to split this amount in smaller ones). As the number of files grows
o wrote:
>> You can hash it and still keep the original filename, and you don't
>> even need a MySQL database to do lookups.
>
> There is an issue I forgot to mention: the original file name can be up to
> 1023 characters long. As Linux only allows 256 characters in the
> You can hash it and still keep the original filename, and you don't
> even need a MySQL database to do lookups.
There is an issue I forgot to mention: the original file name can be up to
1023 characters long. As Linux only allows 256 characters in the file path, I
could have a (very small)
On Wed, 08 Jul 2009 18:09:28 -0400
Filipe Brandenburger wrote:
> You can hash it and still keep the original filename, and you don't
> even need a MySQL database to do lookups.
Now that is slick as all get-out. I'm really impressed by your scheme, though I
don't actually have any use for it right a
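The scheme being praised is only partly quoted above, so this is a plausible reading sketched as an assumption: the storage path is a pure function of the original name, so lookups need no database at all.

  import hashlib, os

  def path_for(root: str, original_name: str) -> str:
      # Same derivation at write time and read time: re-hash the name
      # to find the file, so no hash->filename table is required.
      h = hashlib.sha1(original_name.encode("utf-8")).hexdigest()
      return os.path.join(root, *h[:4], h)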
Hi,
On Wed, Jul 8, 2009 at 17:59,
o wrote:
> My original idea was storing the file under a hash of its name, and then storing
> a hash->real filename mapping in MySQL. That way I have direct access to the file
> and I can make a directory hierarchy with the first characters of the
> Date: Wed, 8 Jul 2009 06:27:40 +0000
> Subject: [CentOS] Question about optimal filesystem with many small files.
>
>
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15
> million files), and a node can have up to 40 files (and I don't have
> any way to split
> Did that program also write your address header ?
:)
Thanks for the help.
> From: hhh...@hotmail.com
> To: centos@centos.org
> Date: Wed, 8 Jul 2009 06:27:40 +0000
> Subject: [CentOS] Question about optimal filesystem with many small files.
>
>
> Hi,
On 7/8/09 8:56 AM, "Les Mikesell" wrote:
> o wrote:
>> Hi,
>>
>> I have a program that writes lots of files to a directory tree (around 15
>> million files), and a node can have up to 40 files (and I don't have
>> any way to split this amount in smaller ones). As
On Wed, Jul 8, 2009 at 2:27 AM, o <
hhh...@hotmail.com> wrote:
>
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15
> million files), and a node can have up to 40 files (and I don't have
> any way to split this amount in smaller ones
o wrote:
> Hi,
>
> I have a program that writes lots of files to a directory tree (around 15
> million files), and a node can have up to 40 files (and I don't have
> any way to split this amount in smaller ones). As the number of files grows,
> my application gets slower and slower
Perhaps think about running tune2fs, and maybe also consider
adding noatime.
Regards
Per
E-mail: p...@norhex.com [1]
http://www.linkedin.com/in/perqvindesland [2]
--- Original message follows ---
SUBJECT: Re: [CentOS] Question about optimal filesystem with many
small files.
FROM: Niki Kovacs
TO
o wrote:
> Hi,
>
> I have a program that writes lots of files to a directory tree
Did that program also write your address header ?
:o)
Hi,
I have a program that writes lots of files to a directory tree (around 15
million files), and a node can have up to 40 files (and I don't have any
way to split this amount in smaller ones). As the number of files grows, my
application gets slower and slower (the app works someth