[PERFORM] tuning
Hi - I would like to see if someone could recommend something to make my query run faster.

System specs:
    PostgreSQL 7.4.2 on RedHat 9
    dual AMD Athlon 2GHz processors
    1 gig memory
    mirrored 7200 RPM IDE disks

Values in postgresql.conf:
    shared_buffers = 1000
    sort_mem is commented out
    effective_cache_size is commented out
    random_page_cost is commented out

Relevant tables:

    product
    -------
    id            serial
    productlistid integer
    vendorid      integer
    item          varchar(32)
    descrip       varchar(256)
    price         double precision

    vendor
    ------
    id            serial
    vendorname    varchar(64)

A view I made in order to easily retrieve the vendor name:

    create view productvendorview as
        select p.id, p.productlistid, v.vendorname, p.item, p.descrip, p.price
        from product p, vendor v
        where p.vendorid = v.id;

Here are some indices I have created:

    create index product_plid on product (productlistid);
    create index product_plidloweritem on product
        (productlistid, lower(item) varchar_pattern_ops);
    create index product_plidlowerdescrip on product
        (productlistid, lower(descrip) varchar_pattern_ops);

Here is the query in question:

    select * from productvendorview
    where (productlistid=3 or productlistid=5 or productlistid=4)
      and (lower(item) like '9229%' or lower(descrip) like 'toner%')
    order by vendorname, item
    limit 100;

This query scans 412,457 records.
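Since sort_mem, effective_cache_size, and random_page_cost are all left at their very conservative 7.4 defaults, raising them is usually the first tuning step on a machine with 1 GB of RAM. A hedged sketch - these values are illustrative starting points for this hardware, not measured recommendations:

```ini
# postgresql.conf - illustrative starting points for a 1 GB RAM box
# (assumed values, not tuned recommendations)
shared_buffers = 10000          # ~80 MB in 8 kB buffers; the default 1000 is only ~8 MB
sort_mem = 8192                 # 8 MB per sort; helps the ORDER BY vendorname, item
effective_cache_size = 65536    # ~512 MB in 8 kB pages; tells the planner the OS cache is large
random_page_cost = 3            # below the default 4; favors index scans when data is cached
```

After changing these, a restart (for shared_buffers) and a fresh EXPLAIN ANALYZE would show whether the plan changes.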
Here is the EXPLAIN ANALYZE for the query:

 Limit  (cost=45718.83..45719.08 rows=100 width=108) (actual time=39093.636..39093.708 rows=100 loops=1)
   ->  Sort  (cost=45718.83..45727.48 rows=3458 width=108) (actual time=39093.629..39093.655 rows=100 loops=1)
         Sort Key: v.vendorname, p.item
         ->  Hash Join  (cost=22.50..45515.57 rows=3458 width=108) (actual time=95.490..39062.927 rows=2440 loops=1)
               Hash Cond: ("outer".vendorid = "inner".id)
               ->  Seq Scan on test p  (cost=0.00..45432.57 rows=3457 width=62) (actual time=89.066..39041.654 rows=2444 loops=1)
                     Filter: (((productlistid = 3) OR (productlistid = 5) OR (productlistid = 4)) AND ((lower((item)::text) ~~ '9229%'::text) OR (lower((descrip)::text) ~~ 'toner%'::text)))
               ->  Hash  (cost=20.00..20.00 rows=1000 width=54) (actual time=6.289..6.289 rows=0 loops=1)
                     ->  Seq Scan on vendor v  (cost=0.00..20.00 rows=1000 width=54) (actual time=0.060..3.653 rows=2797 loops=1)
 Total runtime: 39094.713 ms
(10 rows)

Thanks!
-Clark

---(end of broadcast)---
TIP 6: Have you searched our list archives?
       http://archives.postgresql.org
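Worth noting: the OR across lower(item) and lower(descrip) forces 7.4 into the sequential scan seen above, because the planner cannot combine two different expression indexes for a single OR'd filter (bitmap OR scans only arrived in 8.1). The classic workaround from that era is to rewrite the query as a UNION so each arm can use its own index - a sketch against the schema above, not a tested plan:

```sql
-- Each arm matches one of the two expression indexes, so the planner
-- can index-scan both and merge the results (UNION also deduplicates,
-- matching the OR semantics).
select * from productvendorview
 where productlistid in (3, 4, 5)
   and lower(item) like '9229%'
union
select * from productvendorview
 where productlistid in (3, 4, 5)
   and lower(descrip) like 'toner%'
 order by vendorname, item
 limit 100;
```

Separately, the vendor scan estimates 1000 rows but finds 2797, which suggests vendor has never been ANALYZEd; running ANALYZE (or enabling autovacuum-era cron vacuums) would give the planner better numbers.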
Re: [PERFORM] data=writeback
> When I give "mount" at the command line, everything looks just fine:
>
> /dev/sda2 on / type ext3 (rw,noatime,data=ordered)
> none on /proc type proc (rw)
> usbdevfs on /proc/bus/usb type usbdevfs (rw)
> /dev/sda1 on /boot type ext3 (rw,noatime,data=ordered)
> none on /dev/pts type devpts (rw,gid=5,mode=620)
> none on /dev/shm type tmpfs (rw)
> /dev/sdb1 on /usr/local/pgsql type ext3 (rw,noatime,data=writeback)
> /dev/sda3 on /usr/local/pgsql/wal type ext3 (rw,noatime,data=ordered)
>
> It looks like the labels are not really used, just the mount-points. Or
> could this cause other problems I am not aware of? Everything seems to
> be working just fine, for several months now...

Probably /dev/sdb1 and /dev/sda3 have the same labels, and mount simply mounts them in a consistent way according to some logic we're not aware of. I'd say: if it works, don't touch it ;)

What remains unresolved is the question whether data=writeback is ok or not. We'll see if somebody has more information on that one...

Bye, Chris.
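For reference, the two usual ways to get data=writeback onto an ext3 data partition - a sketch, assuming /dev/sdb1 is the PostgreSQL data volume as in the mount output above:

```shell
# Option 1: request writeback journaling at mount time (what the
# /etc/fstab entry behind the mount output above is doing):
mount -o remount,noatime,data=writeback /dev/sdb1 /usr/local/pgsql

# Option 2: store the default journal mode in the superblock, so a
# plain "mount" picks it up without any data= option:
tune2fs -o journal_data_writeback /dev/sdb1
```

The usual argument for this setup is visible in the mount list itself: the WAL partition (/dev/sda3) stays data=ordered, so PostgreSQL's own write-ahead log provides the ordering guarantees for the data files.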
[PERFORM] freebsd or linux
I'm running a server that does lots of counts and calculations. Currently it's an Ubuntu server. Is FreeBSD faster, or are Linux and BSD about the same? This is an i386 machine, by the way. This is not meant to start an argument - I'm just looking for current benchmarks to compare. Thanks.
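For an apples-to-apples comparison of the two operating systems on the same hardware, pgbench (shipped in PostgreSQL's contrib) is the usual starting point - a sketch, assuming a scratch database named "bench" has been created on each install:

```shell
# Initialize the test tables at scale factor 10 (~1 million account rows)
pgbench -i -s 10 bench

# Run 10 concurrent clients for 1000 transactions each, then compare
# the reported tps figure between the FreeBSD and Linux installs
pgbench -c 10 -t 1000 bench
```

A synthetic OLTP benchmark won't match a count/calculation workload exactly, so running a few of your own representative queries under \timing on each OS is also worthwhile.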
Re: [PERFORM] Storing Digital Video
Rodrigo Madera wrote:
> I am concerned with performance issues involving the storage of DV
> on a database.
>
> I thought of some options; which would be the most advisable for speed?
>
> 1) Pack N frames inside a "container" and store the container in the db.
> 2) Store each frame in a separate record in the table "frames".
> 3) (type something here)
>
> Thanks for the help,
> Rodrigo

My experience has been that this is a very bad idea. Many people want to store all sorts of data in a database, such as email messages, pictures, etc... The idea of a relational database is to perform queries against data. If you just need to store data, then store it on a disk and use the database as the indexer of the data. Keep in mind that the larger the database, the slower some operations become.

Unless you are operating on the frame data itself (which you would either store as blobs or hex-encoded data), I'd recommend you store the data on a hard drive and let the database store metadata about the video, such as path information, run time, author, etc...

We do this in an application storing close to a million images, and the performance is impressive:

1. We don't have to do any sort of data manipulation storing the data in, or retrieving the data out of, the database.
2. Our database is compact and extremely fast - it is using the database for what it was designed for: relational queries.

My $0.02
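A minimal sketch of the metadata-only approach described above - the table and column names are illustrative, not from the original post:

```sql
-- Store the frames (or whole clips) as files on disk; keep only the
-- queryable metadata in the database.
create table video (
    id        serial primary key,
    path      varchar(1024) not null,  -- filesystem location of the file
    title     varchar(256),
    author    varchar(128),
    run_time  interval,                -- queryable, unlike an opaque blob
    added_at  timestamp default now()
);

create index video_author on video (author);
```

Queries then stay small and fast (e.g. find all clips by an author longer than ten minutes), and the application opens the file at `path` only when it actually needs the video data.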
Re: [PERFORM] Loading the entire DB into RAM
If memory serves me correctly, I have seen several posts about this in the past. I'll try to recall the highlights:

1. Create a md (ramdisk) in Linux sufficiently large to hold the data set you want to store.
2. Create a HD-based copy somewhere as your permanent storage mechanism.
3. Start up your PostgreSQL instance with the md as the data store.
4. Load your data into the md instance.
5. Figure out how you will change indexes _and_ ensure that your disk storage stays consistent with your md instance.

I haven't done so, but it would be interesting to have a secondary database somewhere that is your primary storage. It needn't be especially powerful, or even always available. It serves as the place to generate your indexing data. You could then use Slony to propagate the data to the md production system. Of course, if you are updating the system that resides in RAM, you should be thinking the other way around: have Slony replicate changes to the other, permanent-storage system.

Either way you do it, I can't think of an out-of-the-box method for doing it. Somehow one has to transfer data from permanent storage to the md instance and, likewise, back to permanent storage.

Out of curiosity, what are you using as the search engine?

Charles A. Landemaine wrote:
> I have a web server with PostgreSQL and RHEL. It hosts a search engine,
> and each time someone makes a query, it uses the HDD RAID array. The DB
> is not very big - it is less than a GB. I plan to add more RAM anyway.
> What I'd like to do is find out how to keep the whole DB in RAM so that
> each time someone does a query, it doesn't use the HDD. Is it possible,
> and if so, how?
> Thanks,
> Charles.
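The steps above can be sketched as shell commands - hedged heavily: this uses tmpfs rather than a literal md device for simplicity, the device names, sizes, and paths are illustrative, and everything in the RAM filesystem is lost on reboot, so the sync back to disk is the critical step:

```shell
# 1. Create and mount a RAM-backed filesystem large enough for the data set
mount -t tmpfs -o size=1500M tmpfs /mnt/pgram

# 2. Seed it from the permanent on-disk copy
rsync -a /usr/local/pgsql/data/ /mnt/pgram/data/

# 3./4. Start PostgreSQL against the RAM copy and load/serve from there
pg_ctl -D /mnt/pgram/data start

# 5. At shutdown (and periodically), sync changes back to permanent
# storage - until this runs, any changes exist only in RAM
pg_ctl -D /mnt/pgram/data stop
rsync -a /mnt/pgram/data/ /usr/local/pgsql/data/
```

For a read-mostly search engine under a gigabyte, simply adding RAM and letting the OS page cache hold the whole database often achieves the same effect with none of the durability risk.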