Do not underestimate the performance tuning of database engines, and do not
overestimate the performance tuning of filesystems. In an image retrieval
system I worked on, database BLOB storage was faster. The filesystem has its
own overhead in resolving a file name and retrieving the blocks, whereas the
database already has its file open and only needs to seek to an offset that
the index hands it directly. A SELECT through proper keys and indexes is
very, very fast. Opening a file, by contrast, means slogging through several
directories' worth of metadata just to reach the file's own metadata, and
only then reading the file. What's more, where database indexes are designed
to be kept in memory, filesystem metadata tables are merely MRU-cached and
share lookup/cache space with everything else on the VFS. The database also
buys you atomicity, which is important during long lookups. In the end, disk
reads should be substantially higher with the filesystem.
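
For a concrete picture, here is a minimal sketch of the database side, using
the SQLite C API purely as an example (the engine, the "images" table, and
the column names are my assumptions, not anything Wt-specific). The point is
that the fetch is one prepared, indexed SELECT against an already-open
connection:

    #include <sqlite3.h>
    #include <stdexcept>
    #include <vector>

    // Fetch one record's BLOB by primary key: a single index seek plus a
    // read, against a database file that is already open.
    std::vector<unsigned char> loadImage(sqlite3 *db, sqlite3_int64 id)
    {
        sqlite3_stmt *stmt = nullptr;
        if (sqlite3_prepare_v2(db, "SELECT data FROM images WHERE id = ?",
                               -1, &stmt, nullptr) != SQLITE_OK)
            throw std::runtime_error(sqlite3_errmsg(db));

        sqlite3_bind_int64(stmt, 1, id);

        std::vector<unsigned char> blob;
        if (sqlite3_step(stmt) == SQLITE_ROW) {
            const unsigned char *p = static_cast<const unsigned char *>(
                sqlite3_column_blob(stmt, 0));
            int n = sqlite3_column_bytes(stmt, 0);
            if (p && n > 0)
                blob.assign(p, p + n);
        }
        sqlite3_finalize(stmt);
        return blob;
    }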

Then consider directory entry limits, which vary by filesystem. Old NT, for
example, had a 64k-files-per-directory limit (removed in later versions). To
keep lookups balanced, you have to resort to GUID-style names and use groups
of their characters as subdirectory names, so the file count per directory
stays low (e.g. "4e4e-8f9e7dce-xyz" would be stored as
"\4e4e\8f9e\7dce\4e4e-8f9e7dce-xyz").
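
A sketch of that fan-out scheme (the function name and the three-level,
four-character split are illustrative choices, not a standard):

    #include <string>

    // Build a fan-out path from a GUID-like name so that no single
    // directory accumulates too many entries, e.g.
    //   "4e4e-8f9e7dce-xyz" -> root/4e4e/8f9e/7dce/4e4e-8f9e7dce-xyz
    // Assumes the name contains at least 12 non-dash characters.
    std::string fanOutPath(const std::string &root, const std::string &guid)
    {
        std::string flat;                 // the GUID with separators stripped
        for (char c : guid)
            if (c != '-')
                flat += c;

        return root + "/" + flat.substr(0, 4)
                    + "/" + flat.substr(4, 4)
                    + "/" + flat.substr(8, 4)
                    + "/" + guid;
    }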

If you find that a filesystem is faster, I suggest you examine your database
data definition. You can't get any faster than INDEX-SEEK-READ, which is what
the database does.
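
Concretely, the data definition I mean is a table whose lookup column is the
primary key, so every fetch is a single B-tree descent. Again a sketch
against SQLite, with placeholder names:

    #include <sqlite3.h>

    // With INTEGER PRIMARY KEY the table's B-tree is keyed on id itself,
    // so "WHERE id = ?" is exactly INDEX-SEEK-READ: one descent, one row.
    void createSchema(sqlite3 *db)
    {
        sqlite3_exec(db,
                     "CREATE TABLE IF NOT EXISTS images ("
                     "  id   INTEGER PRIMARY KEY,"
                     "  data BLOB NOT NULL"
                     ")",
                     nullptr, nullptr, nullptr);
    }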

________________________________
 From: Mohammed Rashad <mohammedrasha...@gmail.com>
To: witty-interest@lists.sourceforge.net 
Sent: Sunday, October 7, 2012 9:44 AM
Subject: [Wt-interest] Wt File IO vs Database IO
 

All,

Due to the large amount of data used in a crowd-sourced mapping project, I
have decided to completely eliminate the use of a database and use flat files
for storage, retrieval, and query.

I thought of storing each record in the db as an individual file, so this
will help retrieval speed, and no search is needed over the entire db or a
file.

But if a table has tens of thousands of records, and users access
(same/different) records from different places, the result will be N separate
file I/O operations.

Will this be a bottleneck in the application? Consider each file to be of
size <= 15 KB.

The main reason to eliminate the db is a performance bottleneck in database
I/O.

So will moving to the new model help in any way, given that the number of
users and the amount of data will be much greater than expected?


-- 

Regards,
   Rashad
