Index-organized tables would do this, and it would be a generic capability.

- Luke
Msg is shrt cuz m on ma treo

-----Original Message-----
From: Georgi Chulkov [mailto:[EMAIL PROTECTED]
Sent: Monday, September 17, 2007 11:50 PM Eastern Standard Time
To: Tom Lane
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Raw device I/O for large objects

Hi,

> We've heard this idea proposed before, and it's been shot down as a poor
> use of development effort every time. Check the archives for previous
> threads, but the basic argument goes like this: when Oracle et al did
> that twenty years ago, it was a good idea because (1) operating systems
> tended to have sucky filesystems, (2) performance and reliability
> properties of same were not very consistent across platforms, and (3)
> being large commercial software vendors they could afford to throw lots
> of warm bodies at anything that seemed like a bottleneck. None of those
> arguments holds up well for us today however. If you think you want to
> reimplement a filesystem you need to have some pretty concrete reasons
> why you can outsmart all the smart folks who have worked on
> your-favorite-OS's filesystems for lo these many years. There's also
> the fact that on any reasonably modern disk hardware, "raw I/O" is
> anything but.

Thanks, I agree with all your arguments.

Here's the reason why I'm looking at raw device storage for large objects
only (as opposed to all tables): with raw device I/O I can control, to an
extent, spatial locality. So, if I have an application that wants to store
N large objects (totaling several gigabytes) and read them back in some
order that is well-known in advance, I could store my large objects in that
order on the raw device.* Sequentially reading them back would then be very
efficient. With a file system underneath, I don't have that freedom. (Such
a scenario occurs with raster databases, for example.)

* assuming I have a way to communicate these requirements; that's a whole
new problem

Please allow me to ask then:

1. In your opinion, would the above scenario indeed benefit from a
raw-device interface for large objects?

2. How feasible is it to decouple general table storage from large object
storage?

Thank you for your time,
Georgi
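The layout idea in the message above (store N large objects in the order they will later be read, then read them back in one sequential pass) can be sketched roughly as follows. This is only an illustration under stated assumptions: an ordinary seekable file object stands in for the raw device, and the names `pack_in_read_order`, `read_sequentially`, and `is_contiguous` are hypothetical helpers, not PostgreSQL APIs or anything proposed in the thread.

```python
import io
from typing import BinaryIO, Dict, Iterator, List, Tuple

# (byte offset, length) of one object on the device
Extent = Tuple[int, int]


def pack_in_read_order(dev: BinaryIO, objects: Dict[str, bytes],
                       read_order: List[str]) -> Dict[str, Extent]:
    """Write objects back to back, in the order they will be read later.

    `dev` is any seekable binary file object; here it is an assumed
    stand-in for a raw device. Returns an index of name -> (offset, length).
    """
    index: Dict[str, Extent] = {}
    offset = 0
    dev.seek(0)
    for name in read_order:
        data = objects[name]
        dev.write(data)
        index[name] = (offset, len(data))
        offset += len(data)
    return index


def read_sequentially(dev: BinaryIO, index: Dict[str, Extent],
                      read_order: List[str]) -> Iterator[Tuple[str, bytes]]:
    """Yield objects in `read_order`, seeking to each recorded extent."""
    for name in read_order:
        offset, length = index[name]
        dev.seek(offset)
        yield name, dev.read(length)


def is_contiguous(index: Dict[str, Extent], read_order: List[str]) -> bool:
    """True if reading in `read_order` never has to skip or jump backwards."""
    expected = 0
    for name in read_order:
        offset, length = index[name]
        if offset != expected:
            return False
        expected = offset + length
    return True


if __name__ == "__main__":
    device = io.BytesIO()  # in-memory stand-in for the device
    blobs = {"tile_a": b"AAAA", "tile_b": b"BB", "tile_c": b"CCCCCC"}
    planned = ["tile_c", "tile_a", "tile_b"]  # read order known in advance
    layout = pack_in_read_order(device, blobs, planned)
    print(layout, is_contiguous(layout, planned))
```

When the read order matches the write order, every seek lands exactly where the previous read ended, which is the sequential access pattern described in the message. A file system sitting underneath gives no such guarantee about physical placement, and as the quoted reply notes, modern disk hardware may remap blocks as well.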