Re: [PERFORM] Lock pileup causes server to stall

2014-11-12 Thread Jesper Krogh
>>> >>> Current FK checking makes you wait if the referenced tuple is modified >>> on any indexed column, not just those that are actually used in >>> foreign keys. Maybe this case would be sped up if we optimized that. >> >> Even if it is a gin index that is being modified? Seems like

Re: [PERFORM] Lock pileup causes server to stall

2014-11-11 Thread Jesper Krogh
> On 10/11/2014, at 22.40, Alvaro Herrera wrote: > > Josh Berkus wrote: >> All, >> >> pg version: 9.3.5 >> RHEL 6.5 >> 128GB/32 cores >> Configured with shared_buffers=16GB >> Java/Tomcat/JDBC application >> >> Server has an issue that whenever we get lock waits (transaction lock >> waits, usu

Re: [PERFORM] Planner performance extremely affected by an hanging transaction (20-30 times)?

2013-09-20 Thread Jesper Krogh
On 21/09/2013, at 00.01, Jeff Janes wrote: > See "In progress INSERT wrecks plans on table" and "Performance bug in > prepared statement binding in 9.2" also on this list This feels like the same http://postgresql.1045698.n5.nabble.com/Slow-query-plan-generation-fast-query-PG-9-2-td5769363.html

Re: [PERFORM] Slow query-plan generation (fast query) PG 9.2

2013-09-03 Thread Jesper Krogh
On 03/09/13 09:47, Craig Ringer wrote: On 09/03/2013 03:46 PM, jes...@krogh.cc wrote: Hi. I have a strange situation where generating the query plan takes 6s+ and executing it takes very little time. How do you determine that it's planning time at fault here? Not that I'm sure, but the timing

Re: [PERFORM] Best practice when reindexing in production

2013-05-29 Thread Jesper Krogh
On 29/05/13 14:24, Niels Kristian Schjødt wrote: Hi, I have a database with quite some data (millions of rows), that is heavily updated all the time. Once a day I would like to reindex my database (and maybe re cluster it - don't know if that's w

Re: [PERFORM] hardware upgrade, performance degrade?

2013-03-01 Thread Jesper Krogh
On 01/03/2013, at 10.52, Steven Crandell wrote: > Recently I moved my ~600G / ~15K TPS database from a > 48 core@2.0GHz server with 512GB RAM on 15K RPM disk > to a newer server with > 64 core@2.2Ghz server with 1T of RAM on 15K RPM disks > > The move was from v9.1.4 to v9.1.8 (eventually als

Re: [PERFORM] FTS performance issue probably due to wrong planner estimate of detoasting

2013-02-07 Thread Jesper Krogh
On 08/02/13 01:52, Stefan Keller wrote: Hi, I have problems with the performance of FTS in a query like this: SELECT * FROM FullTextSearch WHERE content_tsv_gin @@ plainto_tsquery('english', 'good'); It's slow (> 30 sec.) for some GB (27886 html files, originally 73 MB zipped). The planner

Re: [PERFORM] Planner selects different execution plans depending on limit

2012-09-13 Thread Jesper Krogh
On 13/09/12 16:42, Bill Martin wrote: Yes, I've run the ANALYZE command. Regards, Bill Martin The main problem in your case is actually that you don't store the tsvector in the table. If you store to_tsvector('simple',content.content) in a column in the database and search against that instead
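A minimal sketch of what this advice amounts to: materialize the tsvector in its own column, index that, and keep it current with the built-in trigger. Table and column names below are taken from the thread; the trigger and index names are assumptions.

```sql
-- Materialize the tsvector instead of computing to_tsvector() per query:
ALTER TABLE core_content ADD COLUMN content_tsv tsvector;
UPDATE core_content SET content_tsv = to_tsvector('simple', content);
CREATE INDEX core_content_tsv_idx ON core_content USING gin (content_tsv);

-- Keep the column current on writes (built-in helper trigger):
CREATE TRIGGER core_content_tsv_trg
  BEFORE INSERT OR UPDATE ON core_content
  FOR EACH ROW
  EXECUTE PROCEDURE tsvector_update_trigger(content_tsv, 'pg_catalog.simple', content);

-- Searches then hit the stored, indexed column:
SELECT id FROM core_content WHERE content_tsv @@ to_tsquery('simple', 'someword');
```

This also gives ANALYZE a real column to collect statistics on, which is what lets the planner estimate the search selectivity.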

Re: [PERFORM] Planner selects different execution plans depending on limit

2012-09-10 Thread Jesper Krogh
On 10/09/12 16:24, bill_mar...@freenet.de wrote: Hi All I've ft_simple_core_content_content_idx ON core_content USING gin (to_tsvector('simple'::regconfig, content) ); If I'm searching for a word which is NOT in the column content the query plan and the execution time differs with the g

Re: [PERFORM] Performance of a large array access by position (tested version 9.1.3)

2012-06-25 Thread Jesper Krogh
On 22/06/12 09:02, Maxim Boguk wrote: Hi all, May be I completely wrong but I always assumed that the access speed to the array element in PostgreSQL should be close to constant time. But in tests I found that access speed degrade as O(N) of array size. Test case (performed on large not busy

Re: [PERFORM] Gin index insert performance issue

2012-03-12 Thread Jesper Krogh
On 13/03/12 06:43, Rural Hunter wrote: I tried to increase work_mem but the inserts hang more time each time with less frequency. So it makes almost no difference for the total hanging time. Frequent vacuum is not a choice since the hang happens every 3-5 mins. is there any improvement I can m

Re: [PERFORM] Very long deletion time on a 200 GB database

2012-02-23 Thread Jesper Krogh
On 23/02/12 09:39, Reuven M. Lerner wrote: Hi, everyone. I'm maintaining an application that exists as a "black box" in manufacturing plants. The system is based on Windows, .NET, and PostgreSQL 8.3. I'm a Unix kind of guy myself, but the application layer and system administration are being

Re: [PERFORM] Query planner doesn't use index scan on tsvector GIN index if LIMIT is specifiedQuery planner doesn't use index scan on tsvector GIN index if LIMIT is specified

2012-01-10 Thread Jesper Krogh
On 2012-01-10 18:04, Tom Lane wrote: darklow writes: But the performance problems starts when i do the same query specifying LIMIT. *SELECT id FROM cache WHERE tsv @@ to_tsquery('test:*') limit 20;* By some reason index is not used. It apparently thinks there are enough matches that it might a

Re: [PERFORM] Problems with FTS

2011-11-30 Thread Jesper Krogh
On 2011-11-30 21:58, Robert Haas wrote: The row-count estimates look reasonably accurate, so there's some other problem here. What do you have random_page_cost, seq_page_cost, and effective_cache_size set to? You might try "SET random_page_cost=2" or even "SET random_page_cost=0.5; SET seq_page

Re: [PERFORM] Summaries on SSD usage?

2011-09-02 Thread Jesper Krogh
On 2011-09-03 00:04, Stefan Keller wrote: 2011/9/2 Scott Marlowe: On Tue, Aug 30, 2011 at 11:23 AM, Stefan Keller wrote: How big is your DB? What kind of reads are most common, random access or sequential? How big of a dataset do you pull out at once with a query. SSDs are usually not a big wi

Re: [PERFORM] Summaries on SSD usage?

2011-09-01 Thread Jesper Krogh
On 2011-09-01 23:28, Jim Nasby wrote: On Aug 30, 2011, at 12:23 PM, Stefan Keller wrote: I'm looking for summaries (or best practices) on SSD usage with PostgreSQL. My use case is mainly a "read-only" database. Are there any around? I'm not sure, but for read-only why not just put more memory i

Re: [PERFORM] sequential scan unduly favored over text search gin index

2011-06-20 Thread Jesper Krogh
On 2011-06-20 17:38, Sushant Sinha wrote: I have a tsvector column docvector and a gin index on it docmeta1_docvector_idx I have a simple query "select * from docmeta1 where docvector @@ plainto_tsquery('english', 'free');" I find that the planner chooses a sequential scan of the table even whe

Re: [PERFORM] Performance advice for a new low(er)-power server

2011-06-16 Thread Jesper Krogh
On 2011-06-16 17:09, Haestan wrote: I am evaluating hardware for a new PostgreSQL server. For reasons concerning power consumption and available space it should not have more than 4 disks (in a 1U case), if possible. Now, I am not sure what disks to use and how to layout them to get the best perf

Re: [PERFORM] seq scan in the case of max() on the primary key column

2011-06-16 Thread Jesper Krogh
On 2011-06-16 15:55, Svetlin Manavski wrote: Hi everybody, I am running PostgreSQL 9.0 which performs well in most of the cases. I would skip all the parameters if these are not necessary. I need to frequently (every min) get the max value of the primary key column on some tables, like this cas
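The standard workaround in this situation, when the aggregate does not collapse into an index probe on its own, is the ORDER BY / LIMIT form. A sketch with an assumed table name:

```sql
-- Aggregate form; the planner usually rewrites this into a backwards
-- index scan on the primary key:
SELECT max(id) FROM some_table;

-- Equivalent hand-written form that forces the index-friendly plan
-- when the automatic rewrite does not kick in:
SELECT id FROM some_table ORDER BY id DESC LIMIT 1;
```

Both should return in near-constant time against the primary-key btree; a seq scan for either indicates a planner or statistics problem worth digging into.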

Re: [PERFORM] reducing random_page_cost from 4 to 2 to force index scan

2011-05-15 Thread Jesper Krogh
On 2011-05-16 06:41, Jesper Krogh wrote: On 2011-05-16 03:18, Greg Smith wrote: You can't do it in real-time. You don't necessarily want that to even if it were possible; too many possibilities for nasty feedback loops where you always favor using some marginal index that happens

Re: [PERFORM] reducing random_page_cost from 4 to 2 to force index scan

2011-05-15 Thread Jesper Krogh
On 2011-05-16 03:18, Greg Smith wrote: You can't do it in real-time. You don't necessarily want that to even if it were possible; too many possibilities for nasty feedback loops where you always favor using some marginal index that happens to be in memory, and therefore never page in things t

Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move

2011-05-03 Thread Jesper Krogh
On 2011-05-04 07:25, Willy-Bas Loos wrote: are you saying that, generally speaking, moving the data would be better unless the SAN performs worse than the disks? It was more, "given all the uncertainties, that seems like the least risky". The SAN might actually be less well performing than what

Re: [PERFORM] [PERFORMANCE] expanding to SAN: which portion best to move

2011-05-03 Thread Jesper Krogh
On 2011-05-03 17:52, Willy-Bas Loos wrote: Our database has gotten rather large and we are running out of disk space. our disks are 15K rpm SAS disks in RAID 10. We are going to rent some space on a FibreChannel SAN. That gives us the opportunity to separate the data and the indexes. Now i thoug

Re: [PERFORM] REINDEX takes half a day (and still not complete!)

2011-04-17 Thread Jesper Krogh
> > > How do DB folks do this with small maintenance windows? This is for a > very high traffic website so it's beginning to get embarrassing. Normally there is no need to issue reindex. What's your reason for the need? Jesper >
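When a rebuild genuinely is needed, the usual way to avoid REINDEX's long exclusive lock is to build a replacement index concurrently and swap it in. A sketch, assuming a plain index (not one backing a constraint) and assumed names:

```sql
-- Build the replacement without blocking writes (runs slower, locks less):
CREATE INDEX CONCURRENTLY some_index_new ON some_table (some_col);

-- Swap: only a brief exclusive lock is needed for the drop and rename.
DROP INDEX some_index;
ALTER INDEX some_index_new RENAME TO some_index;
```

Indexes backing primary keys or unique constraints need extra care, since the constraint references the old index by name.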

Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-11 Thread Jesper Krogh
On 2011-04-11 22:39, James Cloos wrote: "GA" == Glyn Astill writes: GA> I was hoping someone had seen this sort of behaviour before, GA> and could offer some sort of explanation or advice. Jesper's reply is probably most on point as to the reason. I know that recent Opterons use some of the

Re: [PERFORM] Linux: more cores = less concurrency.

2011-04-11 Thread Jesper Krogh
On 2011-04-11 21:42, Glyn Astill wrote: I'll have to try with the synthetic benchmarks next then, but something's definitely going off here. I'm seeing no disk activity at all as they're selects and all pages are in ram. Well, if you don't have enough computations to be bottlenecked on the cpu

Re: [PERFORM] Intel SSDs that may not suck

2011-04-06 Thread Jesper Krogh
On 2011-03-28 22:21, Greg Smith wrote: Some may still find these two cheap for enterprise use, given the use of MLC limits how much activity these drives can handle. But it's great to have a new option for lower-budget systems that can tolerate some risk there. Drifting off the topic slightly

Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Jesper Krogh
On 2011-03-29 18:50, Jeff wrote: we have some new drives that we are going to use initially, but eventually it'll be a secure-erase'd one we replace it with (which should perform identical to a new one) What enclosure & controller are you using on the 24 disk beast? LSI ELP and a HP D2

Re: [PERFORM] Intel SSDs that may not suck

2011-03-29 Thread Jesper Krogh
On 2011-03-29 16:16, Jeff wrote: halt for 0.5-2 seconds, then resume. The fix we're going to do is replace each drive in order with the rebuild occurring between each. Then we do a security erase to reset the drive back to completely empty (including the "spare" blocks kept around for writes).

Re: [PERFORM] Intel SSDs that may not suck

2011-03-28 Thread Jesper Krogh
On 2011-03-29 06:13, Merlin Moncure wrote: My own experience with MLC drives is that write cycle expectations are more or less as advertised. They do go down (hard), and have to be monitored. If you are writing a lot of data this can get pretty expensive although the cost dynamics are getting bet

Re: [PERFORM] Request for feedback on hardware for a new database server

2011-03-17 Thread Jesper Krogh
On 2011-03-18 01:51, Oliver Charles wrote: Hello, At MusicBrainz we're looking to get a new database server, and are hoping to buy this in the next couple of days. I'm mostly a software guy, but I'm posting this on behalf of Rob, who's actually going to be buying the hardware. Here's a quote of

Re: [PERFORM] Indexes with condition using immutable functions applied to column not used

2011-02-07 Thread Jesper Krogh
On 2011-02-08 01:14, Sylvain Rabot wrote: CREATE INDEX directory_id_user_mod_cons_hash_0_btree_idx ON mike.directory USING btree (id_user) WHERE __mod_cons_hash(id_user, 4) = 0; CREATE INDEX directory_id_user_mod_cons_hash_1_btree_idx ON mike.directory USING btree (id_user) WHERE __mod_cons_has
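For partial indexes like these to be chosen at all, the query's WHERE clause has to imply the index predicate more or less literally; the planner will not derive the hash condition from `id_user` on its own. A sketch with an assumed constant:

```sql
-- Can use directory_id_user_mod_cons_hash_0_btree_idx: the query repeats
-- the index predicate, so the planner sees the implication.
SELECT * FROM mike.directory
WHERE id_user = 1234
  AND __mod_cons_hash(id_user, 4) = 0;

-- Cannot use any of the partial indexes: the predicate is absent.
SELECT * FROM mike.directory WHERE id_user = 1234;
```

This is why the thread's indexes go unused: the application would have to compute `__mod_cons_hash(1234, 4)` client-side and add the matching clause.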

Re: [PERFORM] Postgres insert performance and storage requirement compared to Oracle

2010-10-28 Thread Jesper Krogh
On 2010-10-28 15:13, Merlin Moncure wrote: On Wed, Oct 27, 2010 at 3:47 PM, Jesper Krogh wrote: On 2010-10-27 20:51, Merlin Moncure wrote: Yes, I am quite aware of how the o/s page cache works. All else being equal, a more compact database obviously would be preferred. However

Re: [PERFORM] Postgres insert performance and storage requirement compared to Oracle

2010-10-27 Thread Jesper Krogh
On 2010-10-27 20:51, Merlin Moncure wrote: Yes, I am quite aware of how the o/s page cache works. All else being equal, a more compact database obviously would be preferred. However 'all else' is not necessarily equal. I can mount my database on bzip volume, that must make it faster, right?

Re: [PERFORM] Postgres insert performance and storage requirement compared to Oracle

2010-10-27 Thread Jesper Krogh
On 2010-10-27 20:28, Merlin Moncure wrote: Postgres indexes are pretty compact, and oracle (internals I am not familiar with) also has to do MVCC type management, so I am suspecting your measurement is off (aka, operator error) or oracle is cheating somehow by optimizing away storage requirements

Re: [PERFORM] BBU Cache vs. spindles

2010-10-22 Thread Jesper Krogh
On 2010-10-22 17:37, Greg Smith wrote: I think that most people who have thought they were safe to turn off full_page_writes in the past did so because they believed they were in category (1) here. I've never advised anyone to do that, because it's so difficult to validate the truth of. Jus

Re: [PERFORM] Slow count(*) again...

2010-10-21 Thread Jesper Krogh
On 2010-10-21 06:47, Scott Carey wrote: On a wimpy disk, I/O bound for sure. But my disks go 1000MB/sec. No query can go fast enough for them. The best I've gotten is 800MB/sec, on a wide row (average 800 bytes). Most tables go 300MB/sec or so. And with 72GB of RAM, many scans are in-mem

Re: [PERFORM] Slow count(*) again...

2010-10-14 Thread Jesper Krogh
On 2010-10-14 21:56, Robert Haas wrote: On Thu, Oct 14, 2010 at 12:22 AM, mark wrote: Could this be an interesting test use of https://www.fossexperts.com/ ? 'Community' driven proposal - multiple people / orgs agree to pay various portions? Maybe with multiple funders a reasonable target

Re: [PERFORM] Slow count(*) again...

2010-10-14 Thread Jesper Krogh
On 2010-10-14 06:22, mark wrote: Could this be an interesting test use of https://www.fossexperts.com/ ? 'Community' driven proposal - multiple people / orgs agree to pay various portions? Maybe with multiple funders a reasonable target fund amount could be reached. I might convince my boss

Re: [PERFORM] Slow count(*) again...

2010-10-13 Thread Jesper Krogh
On 2010-10-13 15:28, Robert Haas wrote: On Wed, Oct 13, 2010 at 6:16 AM, Neil Whelchel wrote: I might go as far as to rattle the cage of the developers to see if it makes any sense to add some column oriented storage capability to Postgres. That would be the hot ticket to be able to specify

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Jesper Krogh
On 2010-10-12 19:07, Tom Lane wrote: Anyway, if anyone is hot to make COUNT(*) faster, that's where to look. I don't think any of the previous discussion in this thread is on-point at all, except for the parts where people suggested avoiding it. I would have to say that although it is nice

Re: [PERFORM] Slow count(*) again...

2010-10-12 Thread Jesper Krogh
On 2010-10-12 18:02, Scott Carey wrote: However, for large reporting queries and sequential scans, XFS will win in the long run if you use the online defragmenter. Otherwise, your sequential scans won't be all that sequential on any file system over time if your tables aren't written once, f

Re: [PERFORM] gist indexes for distance calculations

2010-09-30 Thread Jesper Krogh
On 2010-09-30 20:33, Marcelo Zabani wrote: If you can also pinpoint me to where I can find this sort of information (index utilization and planning, performance tuning), I'd be very grateful. Thank you already, Isn't this what the knngist patches are for? https://commitfest.postgresql.org/a

Re: [PERFORM] Query much faster with enable_seqscan=0

2010-09-21 Thread Jesper Krogh
On 2010-09-21 20:21, Ogden wrote: I assume you mean random_page_cost? It is currently set to 4.0 - is it better to increase or decrease this value? Should be lowered to a bit over seq_page_cost.. and more importantly.. you should make sure that you have updated your statistics .. run "A

Re: [PERFORM] PostgreSQL PITR - more doubts

2010-07-12 Thread Jesper Krogh
On 2010-07-12 13:23, Jayadevan M wrote: Hello all, One doubt about how PostgreSQL PITR works. Let us say I have all the archived WALs for the past week with archive_command = 'cp -i %p /home/postgres/archive/%f Yes, It starts out from "where it needs to". Assuming you did a pg_start_backup()

Re: [PERFORM] Need help in performance tuning.

2010-07-09 Thread Jesper Krogh
On 2010-07-10 00:59, Greg Smith wrote: Matthew Wakeling wrote: > If you have an external pool solution, you can put it somewhere > else - maybe on multiple somewhere elses. This is the key point to observe: if you're at the point where you have so many connections that you need a pool, the l

Re: [PERFORM] Write performance

2010-06-24 Thread Jesper Krogh
On 2010-06-24 15:45, Janning Vygen wrote: On Thursday 24 June 2010 15:16:05 Janning wrote: On Thursday 24 June 2010 14:53:57 Matthew Wakeling wrote: On Thu, 24 Jun 2010, Janning wrote: We have a 12 GB RAM machine with intel i7-975 and using 3 disks "Seagate Barracuda 7200.11,

[PERFORM] Aggressive autovacuuming ?

2010-06-20 Thread Jesper Krogh
The largest consequence I can see at the moment is that when I get a full vacuum (for preventing transaction-id wraparound) it would be run with the same aggressive settings, thus giving a real performance hit in that situation. Has anyone tried to do similar? What is your experience? Is the idea totally

Re: [PERFORM] Need to increase performance of a query

2010-06-10 Thread Jesper Krogh
On 2010-06-10 19:50, Anne Rosset wrote: Any advice on how to make it run faster? What timing do you get if you run it with \t (timing on) and without explain analyze ? I would be surprised if you can get it much faster than what it is.. It may be that a significant portion is "planning cost"

Re: [PERFORM] planner costs in "warm cache" tests

2010-05-31 Thread Jesper Krogh
On 2010-05-30 20:34, Tom Lane wrote: Jesper Krogh writes: testdb=# set seq_page_cost = 0.1; SET testdb=# set random_page_cost = 0.1; SET Well, hmm, I really doubt that that represents reality either. A page access is by no means "free" even when the page is alread

[PERFORM] planner costs in "warm cache" tests

2010-05-30 Thread Jesper Krogh
Hi. I'm trying to get the planner to do sort of the correct thing when choosing between index-scans on btree indices and bitmap heap scans. There has been several things going on in parallel. One is that the statistics data is off: http://thread.gmane.org/gmane.comp.db.postgresql.devel.general/1

Re: [PERFORM] which hardware setup

2010-05-24 Thread Jesper Krogh
Option 2: App Server and Postgres: Dual Xeon 5520 quad core with 12GB ram and 2x 146GB 15k RPM SAS (RAID1) disks you didn't mention your dataset size, but I think the second option would be preferable in most situations since it gives more of the os memory for disc caching. 12 GB vs 4 GB for

Re: [PERFORM] Ugh - bad plan with LIMIT in a complex SELECT, any way to fix this?

2010-05-08 Thread Jesper Krogh
Overall comment: try reading through these old threads; most of your problem is the same issue: http://article.gmane.org/gmane.comp.db.postgresql.performance/22395/match=gin http://thread.gmane.org/gmane.comp.db.postgresql.performance/22331/focus=22434 Tabl

Re: [PERFORM] 3ware vs. MegaRAID

2010-04-09 Thread Jesper Krogh
On 2010-04-09 20:22, Greg Smith wrote: Jesper Krogh wrote: I've spent quite some hours googling today. Am I totally wrong if the: HP MSA-20/30/70 and Sun Oracle J4200's: https://shop.sun.com/store/product/53a01251-2fce-11dc-9482-080020a9ed93 are of the same type just from "major&

Re: [PERFORM] 3ware vs. MegaRAID

2010-04-09 Thread Jesper Krogh
On 2010-04-09 17:27, Greg Smith wrote: Jesper Krogh wrote: Can someone shed "simple" light on an extremely simple question. How do you physicallly get 48 drives attached to an LSI that claims to only have 2 internal and 2 external ports? (the controller claims to support up to

Re: [PERFORM] 3ware vs. MegaRAID

2010-04-08 Thread Jesper Krogh
On 2010-04-08 05:44, Dave Crooke wrote: For a card level RAID controller, I am a big fan of the LSI , which is available in a PCIe riser form factor for blade / 1U servers, and comes with 0.5GB of battery backed cache. Full Linux support including mainline kernel drivers and command line conf

Re: [PERFORM] moving pg_xlog -- yeah, it's worth it!

2010-02-09 Thread Jesper Krogh
> > Frankly, I was quite surprised by this, since some of the benchmarks > people have published on the effects of using a separate RAID for > the WAL files have only shown a one or two percent difference when > using a hardware RAID controller with BBU cache configured for > write-back. Hi Kevin.

Re: [PERFORM] Digesting explain analyze

2010-01-06 Thread Jesper Krogh
Ron Mayer wrote: >> ...The inner sets are on average 3.000 for >> both id1 and id2 and a typical limit would be 100, so if I could convince >> postgresql to not fetch all of them then I would reduce the set retrieved >> by around 60. The dataset is quite large so the random query is not very >> lik

[PERFORM] Digesting explain analyze

2010-01-06 Thread Jesper Krogh
Hi. I have a table that consists of somewhere in the magnitude of 100.000.000 rows, where each row is a tuple of the form (id1,id2,evalue); Then I'd like to speed up a query like this: explain analyze select id from table where id1 = 2067 or id2 = 2067 order by evalue asc limit 100;
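A standard rewrite for this kind of OR query is to split it into two arms that can each walk their own index in evalue order and stop early, then merge. This is a sketch only, assuming indexes on (id1, evalue) and (id2, evalue) and shortening the table name to t:

```sql
SELECT id
FROM (
    (SELECT id, evalue FROM t WHERE id1 = 2067 ORDER BY evalue ASC LIMIT 100)
    UNION ALL
    (SELECT id, evalue FROM t WHERE id2 = 2067 ORDER BY evalue ASC LIMIT 100)
) AS u
ORDER BY evalue ASC
LIMIT 100;
```

Each arm fetches at most 100 rows via its own index instead of the planner having to collect every id1/id2 match before sorting, which is what makes the OR form expensive.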

Re: [PERFORM] Message queue table - strange performance drop with changing limit size.

2010-01-01 Thread Jesper Krogh
Greg Williamson wrote: > Jesper -- > > I apologize for top-quoting -- a challenged reader. > > This doesn't directly address your question, but I can't help but > notice that the estimates for rows is _wildly_ off the actual number > in each and every query. How often / recently have you run ANAL

[PERFORM] Message queue table - strange performance drop with changing limit size.

2010-01-01 Thread Jesper Krogh
Hi. I have a "message queue" table, that contains in the order of 1-10m "messages". It is implemented using TheSchwartz: http://search.cpan.org/~bradfitz/TheSchwartz-1.07/lib/TheSchwartz.pm So when a "worker" picks the next job it goes into the "job" table and selects the top X highest priority mes
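The pick-next-job query typically has this shape (column names assumed for illustration; TheSchwartz uses its own schema and grabs jobs by updating a grabbed_until column rather than holding row locks):

```sql
BEGIN;
-- Take the single highest-priority runnable job; FOR UPDATE serializes
-- concurrent workers that land on the same row.
SELECT jobid
FROM   job
WHERE  run_after <= now()
ORDER  BY priority DESC
LIMIT  1
FOR UPDATE;
-- ... mark the row as taken, then COMMIT so other workers move on.
COMMIT;
```

An index matching the filter and sort, e.g. on (priority DESC, run_after), is what keeps this a cheap index probe instead of a scan over millions of queued rows.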

Re: [PERFORM] Queryplan within FTS/GIN index -search.

2009-11-03 Thread Jesper Krogh
Tom Lane wrote: > It may well be that Jesper's identified a place where the GIN code could > be improved --- it seems like having the top-level search logic be more > aware of the AND/OR structure of queries would be useful. But the > particular example shown here doesn't make a very good case for

Re: [PERFORM] Compression in PG

2009-11-01 Thread Jesper Krogh
Shaul Dar wrote: > Hi, > > I have several long text fields in my DB that I would love to compress > (descriptions, URLs etc). Can you tell me what options exist in PG > (+pointers please), typical effect on space and run time? Variable-length text fields, e.g. TEXT, will automatically be stored
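Concretely, TOAST handles this without any schema work: values beyond roughly 2 kB are compressed and/or moved out of line automatically. The per-column strategy can be adjusted; table and column names here are placeholders:

```sql
-- Default: try compression first, then move out of line if still too big.
ALTER TABLE docs ALTER COLUMN description SET STORAGE EXTENDED;

-- Store out of line but uncompressed (trades disk for CPU on access):
ALTER TABLE docs ALTER COLUMN description SET STORAGE EXTERNAL;

-- Prefer compression, avoid out-of-line storage where possible:
ALTER TABLE docs ALTER COLUMN description SET STORAGE MAIN;
```

For already-compressed payloads (zipped files, images), EXTERNAL avoids wasting CPU on compression attempts that gain nothing.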

Re: [PERFORM] Queryplan within FTS/GIN index -search.

2009-10-30 Thread Jesper Krogh
Tom Lane wrote: > But having said that, this particular test case is far from compelling. > Any sane text search application is going to try to filter out > common words as stopwords; it's only the failure to do that that's > making this run slow. Below is tests-runs not with a "commonterm" but an

Re: [PERFORM] Queryplan within FTS/GIN index -search.

2009-10-30 Thread Jesper Krogh
Hi. I've now got a test-set that can reproduce the problem where the two fully equivalent queries ( body_fts @@ to_tsquery("commonterm & nonexistingterm") and body_fts @@ to_tsquery("commonterm") AND body_fts @@ to_tsquery("nonexistingterm") give a difference of x300 in execution time. (grows wi

Re: [PERFORM] bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search).

2009-10-26 Thread Jesper Krogh
Craig Ringer wrote: > On 8.4 on a different system Pg uses the seq scan by preference, with a > runtime of 1148ms. It doesn't seem to want to do a bitmap heap scan when > searching for `commonterm' even when enable_seqscan is set to `off'. A > search for `commonterm80' also uses a seq scan (1067ms)

Re: [PERFORM] bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search).

2009-10-26 Thread Jesper Krogh
Craig Ringer wrote: > On Tue, 2009-10-27 at 06:08 +0100, Jesper Krogh wrote: > >>> You should probably re-generate your random value for each call rather >>> than store it. Currently, every document with commonterm20 is guaranteed >>> to also have commonterm40,

Re: [PERFORM] bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search).

2009-10-26 Thread Jesper Krogh
Craig Ringer wrote: > On Mon, 2009-10-26 at 21:02 +0100, Jesper Krogh wrote: > >> Test system.. average desktop, 1 SATA drive and 1.5GB memory with pg 8.4.1. >> >> The dataset consists of words randomized, but .. all records contains >> "commonterm", ar

[PERFORM] bitmap heap scan way cheaper than seq scan on the same amount of tuples (fts-search).

2009-10-26 Thread Jesper Krogh
Hi. I'm currently trying to figure out why the tsearch performance seems to vary a lot between different queryplans. I have created a sample dataset that sort of resembles the data I have to work on. The script that builds the dataset is at: http://krogh.cc/~jesper/build-test.pl and http://krogh.

Re: [PERFORM] Full text search - query plan? PG 8.4.1

2009-10-26 Thread Jesper Krogh
Tom Lane wrote: > Jesper Krogh writes: >> Is is possible to manually set the cost for the @@ operator? > > You want to set the cost for the underlying function. alter function ts_match_vq(tsvector,tsquery) cost 500 seems to change my test-queries in a very positive way (e.g. re

Re: [PERFORM] Full text search - query plan? PG 8.4.1

2009-10-23 Thread Jesper Krogh
Scott Marlowe wrote: > On Fri, Oct 23, 2009 at 2:32 PM, Jesper Krogh wrote: >> Tom Lane wrote: >>> Jesper Krogh writes: >>>> Tom Lane wrote: >>>>> ... There's something strange about your tsvector index. Maybe >>>>> it's re

Re: [PERFORM] Full text search - query plan? PG 8.4.1

2009-10-23 Thread Jesper Krogh
Tom Lane wrote: > Jesper Krogh writes: >> Tom Lane wrote: >>> ... There's something strange about your tsvector index. Maybe >>> it's really huge because the documents are huge? > >> huge is a relative term, but length(ts_vector(body)) is about 200

Re: [PERFORM] Calculating selectivity for the query-planner on ts_vector colums.

2009-10-23 Thread Jesper Krogh
Tom Lane wrote: > Jesper Krogh writes: >> It seems to me that the row estimates on a ts_vector search is a bit on >> the low side for terms that are not in the MCV-list in pg_stats: > > tsvector has its own selectivity estimator that's not like plain scalar > equ

[PERFORM] Calculating selectivity for the query-planner on ts_vector colums.

2009-10-23 Thread Jesper Krogh
Hi It seems to me that the row estimates on a ts_vector search is a bit on the low side for terms that are not in the MCV-list in pg_stats: ftstest=# explain select id from ftstest where ftstest_body_fts @@ to_tsquery('nonexistingterm') order by id limit 10;

Re: [PERFORM] Queryplan within FTS/GIN index -search.

2009-10-22 Thread Jesper Krogh
Jeff Davis wrote: > On Thu, 2009-10-22 at 18:28 +0200, Jesper Krogh wrote: >> I somehow would expect the index-search to take advantage of the MCV's >> informations in the statistics that sort of translate it into a search >> and post-filtering (as PG's queryplanne

[PERFORM] Queryplan within FTS/GIN index -search.

2009-10-22 Thread Jesper Krogh
Hi My indexing base is now up to 7.5m documents, I have raised the statistics target to 1000 for the tsvector column in order to make the query-planner choose more correctly. That works excellently. Table structure is still: ftstest=# \d ftsbody Table "public.ftsbody"
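The statistics-target change described above amounts to the following, using the table from the thread (the exact column name is an assumption):

```sql
-- Collect far more MCV entries for the tsvector column, then refresh
-- the statistics so the planner can use them:
ALTER TABLE ftsbody ALTER COLUMN body_fts SET STATISTICS 1000;
ANALYZE ftsbody;
```

With tsvector columns the per-lexeme most-common-values list is what drives selectivity estimates, so a larger target directly improves row estimates for moderately common terms.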

Re: [PERFORM] Random penalties on GIN index updates?

2009-10-21 Thread Jesper Krogh
Robert Haas wrote: > On Wed, Oct 21, 2009 at 2:35 PM, Tom Lane wrote: >> Jesper Krogh writes: >>> What I seems to miss a way to make sure som "background" application is >>> the one getting the penalty, so a random user doing a single insert >>>

Re: [PERFORM] Random penalties on GIN index updates?

2009-10-21 Thread Jesper Krogh
Tom Lane wrote: > jes...@krogh.cc writes: >> If i understand the technicalities correct then INSERT/UPDATES to the >> index will be accumulated in the "maintenance_work_mem" and the "user" >> being unlucky to fill it up will pay the penalty of merging all the >> changes into the index? > > You ca
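Two knobs bear on who pays the GIN pending-list merge cost (8.4+); the index name below is an assumption. This is a sketch of the options, not the thread's resolution:

```sql
-- Option 1: make every insert update the index directly, so no single
-- foreground statement gets stuck with a large deferred merge:
ALTER INDEX ftstest_body_fts_idx SET (fastupdate = off);

-- Option 2: keep fastupdate on, but have a background job flush the
-- pending list regularly; VACUUM does this as a side effect:
VACUUM ftstest;
```

With fastupdate off, every insert is a bit slower but latency is predictable; with it on, throughput is higher but some unlucky statement absorbs the merge unless a background VACUUM gets there first.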

Re: [PERFORM] Full text search - query plan? PG 8.4.1

2009-10-18 Thread Jesper Krogh
Tom Lane wrote: > Jesper Krogh writes: >> "commonterm" matches 37K of the 50K documents (majority), but the query >> plan is "odd" in my eyes. > >> * Why does it mis-guess the cost of a Seq Scan on textbody so much? > > The cost looks about

[PERFORM] Full text search - query plan? PG 8.4.1

2009-10-18 Thread Jesper Krogh
Hi. I'm currently testing out PostgreSQL's Full Text Search capabillities. We're currenly using Xapian, it has some nice features and some drawbacks (sorting), so it is especially this area I'm investigating. I've loaded the database with 50K documents, and the table definition is: ftstest=# \d

Re: [PERFORM] Using IOZone to simulate DB access patterns

2009-04-04 Thread Jesper Krogh
henk de wit wrote: I've been using Bonnie++ for ages to do filesystem testing of new DB servers. But Josh Drake recently turned me on to IOZone. Perhaps a little off-topic here, but I'm assuming you are using Linux to test your DB server (since you mention Bonnie++). But it seems to me that

Re: [PERFORM] Backup strategies

2008-10-15 Thread Jesper Krogh
Ivan Voras wrote: Warning: DO NOT do on-the-fly binary backups without snapshots. Archiving the database directory with tar on a regular file system, while the server is running, will result in an archive that most likely won't work when restored. Even if you do a "pg_start_backup/pg_stop_backu

Re: [PERFORM] Message queue table..

2008-04-18 Thread Jesper Krogh
Craig Ringer wrote: Jesper Krogh wrote: Hi. I have this "message queue" table.. currently with 8m+ records. Picking the top priority messages seem to take quite long.. it is just a matter of searching the index.. (just as explain analyze tells me it does). Can anyone dige

[PERFORM] Message queue table..

2008-04-18 Thread Jesper Krogh
Hi. I have this "message queue" table.. currently with 8m+ records. Picking the top priority messages seem to take quite long.. it is just a matter of searching the index.. (just as explain analyze tells me it does). Can anyone digest further optimizations out of this output? (All records h

[PERFORM] "Slow" query or just "Bad hardware"?

2008-03-27 Thread Jesper Krogh
8,1056859,2386006,2386015,2386023,4265832,4231262,4265743,5302612,1121056,1121090,1121074,688659,688650}'::integer[])) -> Index Scan using ecn_ref_idx on numbers (cost=0.00..2.74 rows=1 width=108) (actual time=1.794..1.795 rows=0 loops=389) Index Cond: (numbers.reference_id = me.id) Total runtime: 2287.701 ms (10 rows) .. subsequent run: 32.367ms On a X4600 server with 32GB of ram and Equalogic iSCSI SAN attached. Jesper -- Jesper Krogh

Re: [PERFORM] Benchmark: Dell/Perc 6, 8 disk RAID 10

2008-03-13 Thread Jesper Krogh
Scott Marlowe wrote: On Thu, Mar 13, 2008 at 3:09 PM, justin <[EMAIL PROTECTED]> wrote: I chose to use ext3 on these partition You should really consider another file system. ext3 has two flaws that mean I can't really use it properly. A 2TB file system size limit (at least on the servers

Re: [PERFORM] Benchmark: Dell/Perc 6, 8 disk RAID 10

2008-03-13 Thread Jesper Krogh
Scott Marlowe wrote: On Fri, Mar 14, 2008 at 12:17 AM, Jesper Krogh <[EMAIL PROTECTED]> wrote: Scott Marlowe wrote: > On Thu, Mar 13, 2008 at 3:09 PM, justin <[EMAIL PROTECTED]> wrote: > >> I chose to use ext3 on these partition > > You should really consi

Re: [PERFORM] Restore performance?

2006-04-11 Thread Jesper Krogh
you can speed up reloads by increasing your checkpoint segments to a big number like 256 and the checkpoint timeout to something like 10 minutes. All other normal tuning parameters should be what you plan to use for your normal operations, too. Thanks. Jesper -- Jesper Krogh -
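As a sketch, the advice quoted above corresponds to a postgresql.conf fragment like this (parameter names from the 8.x/9.x era discussed in the thread; `checkpoint_segments` was later replaced by `max_wal_size` in PostgreSQL 9.5):

```
# Bulk-restore tuning suggested above (pre-9.5 parameter names)
checkpoint_segments = 256      # widely spaced checkpoints during the reload
checkpoint_timeout = 10min     # checkpoint at most every 10 minutes
```

Spacing checkpoints out this far reduces repeated full-page writes during a large reload; the values are only for the restore window, not for normal operation.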

Re: [PERFORM] Restore performance?

2006-04-10 Thread Jesper Krogh
supposed to be wiser. Check. > 3. make sure you are trapping the restore errors properly > psql newdb 2>&1 | cat | tee err works for me. That's noted. -- Jesper Krogh, [EMAIL PROTECTED] ---(end of broadcast)--- TIP 6: explain analyze is your friend
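The error-trapping pipeline quoted above can be sketched generically. Since `psql newdb` assumes a live database, a stand-in command that writes to both stdout and stderr is used here for illustration:

```shell
# Stand-in for "psql newdb < dump.sql": a command emitting normal output
# on stdout and an error on stderr. "2>&1" merges stderr into stdout so
# tee captures both streams in the err file while still echoing them.
{ echo "restoring..."; echo "ERROR: relation missing" >&2; } 2>&1 | tee err

# The err file can now be inspected for failures after the restore.
grep -c ERROR err
```

Without the `2>&1`, errors would scroll past on the terminal and be lost, which is exactly the trap the poster was warned about.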

Re: [PERFORM] Restore performance?

2006-04-10 Thread Jesper Krogh
> If they both took the same amount of time, then you are almost certainly > bottlenecked on gzip. > > Try a faster CPU or use "gzip --fast". gzip does not seem to be the bottleneck; on restore, psql is the no. 1 consumer of CPU time. Jesper Sorry for the double
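The suggestion above can be demonstrated in isolation. This is a generic sketch using synthetic data rather than an actual pg_dump stream; the point is only the CPU/ratio trade-off between `--fast` (alias for `-1`) and maximum compression:

```shell
# --fast trades compression ratio for CPU time, which helps when a dump
# pipeline is CPU-bound on gzip rather than on disk or on psql itself.
seq 1 200000 > sample.dat

gzip --fast -c sample.dat > fast.gz   # minimal CPU, larger archive
gzip -9 -c sample.dat > best.gz       # maximal CPU, smaller archive

# Both archives decompress to identical data; only cost/size differ.
gunzip -t fast.gz && gunzip -t best.gz && echo "both archives valid"
```

In a real pipeline this would look like `pg_dump db | gzip --fast > db.dump.gz`; if psql rather than gzip dominates CPU on restore, as the poster found, the compressor choice is not the bottleneck.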

[PERFORM] Dump restore performance 7.3 -> 8.1

2006-04-10 Thread Jesper Krogh
re any tricks I can use to speed this dump+restore process up? Neither disk I/O (viewed using vmstat 1) nor CPU (viewed using top) seems to be the bottleneck. The database contains quite a lot of BLOBs, thus the size. Jesper -- Jesper Krogh --

[PERFORM] Restore performance?

2006-04-10 Thread Jesper Krogh
Are there any tricks I can use to speed this dump+restore process up? The database contains quite a lot of BLOBs, thus the size. Jesper -- ./Jesper Krogh, [EMAIL PROTECTED], Jabber ID: [EMAIL PROTECTED]

Re: [PERFORM] Storing binary data.

2004-08-11 Thread Jesper Krogh
In gmane.comp.db.postgresql.performance, Shridhar Daithankar wrote: > On Wednesday 11 Aug 2004 7:59 pm, Jesper Krogh wrote: > > The "common" solution, I guess, would be to store them in the filesystem > > instead, but I like to have them just in the database it is ni

[PERFORM] Storing binary data.

2004-08-11 Thread Jesper Krogh
he" them then it should be excactly as fast i assume. The binary data is not a part of most queries in the database only a few explicitly written to fetch them and they are not accessed very often. What do people normally do? Thanks, Jesper -- ./Jesper Krogh, [EMAI

Re: [PERFORM] pg_dump performance?

2004-08-05 Thread Jesper Krogh
In gmane.comp.db.postgresql.performance, Christopher Kings-Lynne wrote: > Is it the dump or the restore that's really slow? Primarily the dump; it seems to be CPU-bound on the postmaster process. No signs of an I/O bottleneck when I try to monitor with iostat or vmstat -- ./Jesper

[PERFORM] pg_dump performance?

2004-08-05 Thread Jesper Krogh
ily backup too.. (running pg_dump overnight) Jesper -- ./Jesper Krogh, [EMAIL PROTECTED] Jabber ID: [EMAIL PROTECTED]