[PERFORM] Help with count(*)

2003-11-14 Thread Rajesh Kumar Mallah


Hi,

My database seems to be taking too long for a SELECT count(*);
I think there are a lot of dead rows. If I do a VACUUM FULL it improves,
but the performance drops again after a short while.
Can anyone please tell me if anything is wrong with my FSM settings?
Current fsm=55099264 (not sure how I calculated it).
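For reference, the free-space-map knobs live in postgresql.conf; a minimal
sketch with purely illustrative values (not a recommendation for this system):

# postgresql.conf -- example values only
max_fsm_pages = 200000      # max number of disk pages whose free space is tracked
max_fsm_relations = 1000    # max number of relations tracked in the FSM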

Regds
Mallah

tradein_clients=# SELECT count(*) from data_bank.profiles ;

+--------+
| count  |
+--------+
| 123065 |
+--------+
(1 row)

Time: 49756.969 ms
tradein_clients=#
tradein_clients=#
tradein_clients=# VACUUM full verbose analyze  data_bank.profiles ;
INFO:  vacuuming "data_bank.profiles"

INFO:  "profiles": found 0 removable, 369195 nonremovable row versions in 43423 pages
DETAIL:  246130 dead row versions cannot be removed yet.
Nonremovable row versions range from 136 to 2036 bytes long.
There were 427579 unused item pointers.
Total free space (including removable row versions) is 178536020 bytes.
15934 pages are or will become empty, including 0 at the end of the table.
38112 pages containing 178196624 free bytes are potential move destinations.
CPU 1.51s/0.63u sec elapsed 23.52 sec.
INFO:  index "profiles_pincode" now contains 369195 row versions in 3353 pages
DETAIL:  0 index row versions were removed.
379 index pages have been deleted, 379 are currently reusable.
CPU 0.20s/0.24u sec elapsed 22.73 sec.
INFO:  index "profiles_city" now contains 369195 row versions in 3411 pages
DETAIL:  0 index row versions were removed.
1030 index pages have been deleted, 1030 are currently reusable.
CPU 0.17s/0.21u sec elapsed 20.67 sec.
INFO:  index "profiles_branch" now contains 369195 row versions in 2209 pages
DETAIL:  0 index row versions were removed.
783 index pages have been deleted, 783 are currently reusable.
CPU 0.07s/0.14u sec elapsed 6.38 sec.
INFO:  index "profiles_area_code" now contains 369195 row versions in 2606 pages
DETAIL:  0 index row versions were removed.
856 index pages have been deleted, 856 are currently reusable.
CPU 0.11s/0.17u sec elapsed 19.62 sec.
INFO:  index "profiles_source" now contains 369195 row versions in 3137 pages
DETAIL:  0 index row versions were removed.
1199 index pages have been deleted, 1199 are currently reusable.
CPU 0.14s/0.12u sec elapsed 9.95 sec.
INFO:  index "co_name_index_idx" now contains 368742 row versions in 3945 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.19s/0.69u sec elapsed 11.56 sec.
INFO:  index "address_index_idx" now contains 368898 row versions in 4828 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.16s/0.61u sec elapsed 9.17 sec.
INFO:  index "profiles_exp_cat" now contains 153954 row versions in 2168 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.07s/0.25u sec elapsed 3.14 sec.
INFO:  index "profiles_imp_cat" now contains 73476 row versions in 1030 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.05s/0.11u sec elapsed 8.73 sec.
INFO:  index "profiles_manu_cat" now contains 86534 row versions in 1193 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.03s/0.13u sec elapsed 1.44 sec.
INFO:  index "profiles_serv_cat" now contains 19256 row versions in 267 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.01s/0.01u sec elapsed 0.25 sec.
INFO:  index "profiles_pid" now contains 369195 row versions in 812 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.03s/0.12u sec elapsed 0.41 sec.
INFO:  index "profiles_pending_branch_id" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "profiles": moved 0 row versions, truncated 43423 to 43423 pages
DETAIL:  CPU 1.76s/3.01u sec elapsed 60.39 sec.
INFO:  vacuuming "pg_toast.pg_toast_39873340"
INFO:  "pg_toast_39873340": found 0 removable, 65 nonremovable row versions in 15 pages
DETAIL:  0 dead row versions cannot be removed yet.
Nonremovable row versions range from 47 to 2034 bytes long.
There were 0 unused item pointers.
Total free space (including removable row versions) is 17672 bytes.
0 pages are or will become empty, including 0 at the end of the table.
14 pages containing 17636 free bytes are potential move destinations.
CPU 0.00s/0.00u sec elapsed 0.77 sec.
INFO:  index "pg_toast_39873340_index" now contains 65 row versions in 2 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.46 sec.
INFO:  "pg_toast_39873340": moved 0 row versions, truncated 15 to 15 pages
DETAIL:  CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "data_bank.profiles"
INFO:  "pr

Re: [PERFORM] Help with count(*)

2003-11-14 Thread Shridhar Daithankar
On Friday 14 November 2003 12:51, Rajesh Kumar Mallah wrote:
> Hi ,
>
> my database seems to be taking too long for a select count(*)
> i think there are lot of dead rows. I do a vacuum full it improves
> bu again the performance drops in a short while ,
> can anyone please tell me if anything worng with my fsm settings
> current fsm=55099264 (not sure how i calculated it)

If you don't need an exact count, you can use the planner statistics. Just analyze
frequently and the statistics will stay reasonably current.
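A minimal sketch of that approach, using the schema-qualified table from the
original post: the planner's row estimate lives in pg_class.reltuples and is
refreshed by ANALYZE, so an approximate count is just

SELECT c.reltuples::bigint AS approx_count
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'data_bank' AND c.relname = 'profiles';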

And I didn't exactly understand this in the text:

INFO:  "profiles": found 0 removable, 369195 nonremovable row versions in 
43423 pages
DETAIL:  246130 dead row versions cannot be removed yet.

Is there a transaction holding up a large amount of stuff?

 Shridhar




[PERFORM] INSERT extremely slow with large data sets (fwd)

2003-11-14 Thread Slavisa Garic
Hi Everyone,
 
I am using PostgreSQL 7.3.2 and have used earlier versions (7.1.x onwards),
and with all of them I noticed the same problem with INSERTs when there is a
large data set. Just so you guys can compare, here is the time it takes to insert
one row into a table when there are only a few rows present and when there
are thousands:

Rows Present    Start Time        Finish Time
------------    ---------------   ---------------
100             1068790804.12     1068790804.12
1000            1068790807.87     1068790807.87
5000            1068790839.26     1068790839.27
10000           1068790909.24     1068790909.26
20000           1068791172.82     1068791172.85
30000           1068791664.06     1068791664.09
40000           1068792369.94     1068792370.0
50000           1068793317.53     1068793317.6
60000           1068794369.38     1068794369.47

As you can see, it takes an awfully long time just to have those
values inserted. Now, to make the picture a bit clearer for you, this table
has lots of information in it, about 25 columns. Also, there are a few
indexes that I created so that the process of selecting values from it
is faster, which by the way works fine. Selecting anything takes under 5
seconds.

Any help would be greatly appreciated, even just pointing me in the right
direction to ask this question. By the way, I designed the database
this way because my application uses PGSQL a lot during execution, so
there was a huge need for fast SELECTs. Our experiments are getting larger
and larger every day, so fast inserts would be good as well.

Just to note, the times above are for the INSERTs only; nothing else was done
that would be included in those times. The machine was also free, this was the
only process running the whole time, and the machine was an Intel(R) Pentium(R)
4 CPU 2.40GHz.

Regards,
Slavisa







[PERFORM] IDE Hardware RAID Controller

2003-11-14 Thread Nick Barr
Heya,

FYI just spotted this and thought I would pass it on, for all those who are
looking at new boxes.

http://www.theinquirer.net/?article=12665
http://www.promise.com/product/product_detail_eng.asp?productId=112&familyId=2

Looks like a four-channel hot-swap IDE (SATA) hardware RAID controller with
up to 256Mb onboard RAM.


Nick







Re: [PERFORM] INSERT extremely slow with large data sets (fwd)

2003-11-14 Thread Jeff
On Fri, 14 Nov 2003 20:38:33 +1100 (EST)
Slavisa Garic <[EMAIL PROTECTED]> wrote:

> Any help would be greatly appreciated even pointing me in the right
> direction where to ask this question. By the way I designed the
> database this way as my application that uses PGSQL a lot during the
> execution so there was a huge need for fast SELECTs. Our experiments
> are getting larger and larger every day so fast inserts would be good
> as well.
> 

First, you need to upgrade to 7.3.4; 7.4 is preferable if a dump/reload is
not too bad.

Standard set of questions:

1. Any foreign keys?
2. Are these inserts batched into transactions? (see the sketch below)
3. CPU usage?
4. OS?
5. PG config? [shared_buffers, effective_cache_size, etc.]
6. IO saturation?

Also, try searching the archives. Lots of juicy info there too.
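On point 2, a minimal sketch of what batching looks like (table and column
names are invented for illustration): wrapping many INSERTs in one transaction
avoids a commit per row, and COPY is usually faster still for bulk loads.

BEGIN;
INSERT INTO experiment_rows (run_id, payload) VALUES (1, 'row 1');
INSERT INTO experiment_rows (run_id, payload) VALUES (1, 'row 2');
-- ... the rest of the batch ...
COMMIT;

COPY experiment_rows (run_id, payload) FROM '/tmp/experiment_rows.dat';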

-- 
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/



Re: [PERFORM] Help with count(*)

2003-11-14 Thread Christopher Browne
Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Rajesh Kumar Mallah) 
wrote:
> INFO:  "profiles": found 0 removable, 369195 nonremovable row versions in 43423 pages
> DETAIL:  246130 dead row versions cannot be removed yet.
> Nonremovable row versions range from 136 to 2036 bytes long.

It seems as though you have a transaction open that is holding onto a
whole lot of old rows.

I have seen this happen somewhat-invisibly when a JDBC connection
manager opens transactions for each connection, and then no processing
happens to use those connections for a long time.  The open
transactions prevent vacuums from doing any good...
-- 
If this was helpful,  rate me
http://cbbrowne.com/info/multiplexor.html
"Waving away a cloud of smoke, I  look up, and am blinded by a bright,
white light.  It's God. No,  not Richard Stallman, or  Linus Torvalds,
but God. In a booming voice, He  says: "THIS IS A SIGN. USE LINUX, THE
FREE Unix SYSTEM FOR THE 386." -- Matt Welsh



[PERFORM] constant vs function param differs in performance

2003-11-14 Thread SZŰCS Gábor
Dear Gurus,

I have two SQL functions that produce different times and I can't understand
why. Here is the basic difference between them:

CREATE FUNCTION test_const_1234 () RETURNS int4 AS '
  SELECT ... 1234 ... 1234  1234 ...
' LANGUAGE 'SQL';

CREATE FUNCTION test_param (int4) RETURNS int4 AS '
  SELECT ... $1  $1  $1 ...
' LANGUAGE 'SQL';

Some sample times for different data:

test_const_1234()     450 msec
test_param(1234)      2700-4000 msec (probably disk cache)
test_const_5678()     13500 msec
test_param(5678)      14500 msec

Is there a sane explanation? a solution?
I can send more info if you wish.

TIA,
G.
--- cut here ---




Re: [PERFORM] Seeking help with a query that takes too long

2003-11-14 Thread Nick Fankhauser

> Does actor_case_assignment contain more columns than just the two ids?
> If yes, do these additional fields account for ca. 70 bytes per tuple?
> If not, try
>   VACUUM FULL ANALYSE actor_case_assignment;

actor_case_assignment has its own primary key and a "role" field in addition
to the ids you've seen, so 70 bytes sounds reasonable. (The PK is there to allow
a remote mirroring application to update these records; otherwise it would be
unnecessary.)



> 7ms per
> tuple returned looks like a lot of disk seeks are involved.  Is
> clustering actor on actor_full_name_uppercase an option or would this
> slow down other queries?

Good question... I've never used clustering in PostgreSQL before, so I'm
unsure. I presume this is like clustering in Oracle where the table is
ordered to match the index? If so, I think you may be onto something because
the only other field we regularly query on is the actor_id. Actor_id has a
unique index with no clustering currently, so I don't think I'd lose a thing
by clustering on actor_full_name_uppercase.

I'll give this a try & let you know how it changes.

BTW, you are correct that caching has a big effect on the actual time
figures in this case; I'm working on my development DB, so cached info
doesn't get trampled as quickly by other users. Is there a way to flush out
the cache in a testing situation like this in order to start from a
consistent base?


Thanks!
-Nick





Re: [PERFORM] Query question

2003-11-14 Thread Neil Conway
Josh Berkus <[EMAIL PROTECTED]> writes:
> The only thing you're adding to the query is a second SORT step, so it 
> shouldn't require any more time/memory than the query's first SORT
> did.

Interesting -- I wonder if it would be possible for the optimizer to
detect this and avoid the redundant inner sort ... (/me muses to
himself)

-Neil




Re: [PERFORM] Seeking help with a query that takes too long

2003-11-14 Thread Manfred Koizar
On Fri, 14 Nov 2003 11:00:38 -0500, "Nick Fankhauser"
<[EMAIL PROTECTED]> wrote:
>Good question... I've never used clustering in PostgreSQL before, so I'm
>unsure. I presume this is like clustering in Oracle where the table is
>ordered to match the index?

Yes, something like that.  With the exception that Postgres loses the
clustered status while you INSERT and UPDATE tuples.  So you have to
re-CLUSTER from time to time.  Look at pg_stats.correlation to see if
it is necessary.
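A minimal sketch of that, assuming an index named actor_full_name_uppercase_idx
exists on that column (the index name is invented here):

CLUSTER actor_full_name_uppercase_idx ON actor;
ANALYZE actor;
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'actor' AND attname = 'actor_full_name_uppercase';

A correlation close to 1.0 (or -1.0) means the table is still well ordered on
that column; as it drifts toward 0, it may be time to re-CLUSTER.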

> Is there a way to flush out
>the cache in a testing situation like this in order to start from a
>consistent base?

To flush Postgres shared buffers:
SELECT count(*) FROM another_large_table;

To flush your database pages from the OS cache:
tar cf /dev/null /some/large/directory

And run each of your tests at least twice to get a feeling how caching
affects your specific queries.

Servus
 Manfred



Re: [PERFORM] Query question

2003-11-14 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> Interesting -- I wonder if it would be possible for the optimizer to
> detect this and avoid the redundant inner sort ... (/me muses to
> himself)

I think the ability to generate two sort steps is a feature, not a bug.
This has been often requested in connection with user-defined
aggregates, where it's handy to be able to control the order of arrival
of rows at the aggregation function.  If the optimizer suppressed the
inner sort then we'd lose that ability.

regards, tom lane



Re: [PERFORM] Help with count(*)

2003-11-14 Thread Hannu Krosing
Christopher Browne kirjutas R, 14.11.2003 kell 16:13:
> Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Rajesh Kumar Mallah) 
> wrote:
> > INFO:  "profiles": found 0 removable, 369195 nonremovable row versions in 43423 
> > pages
> > DETAIL:  246130 dead row versions cannot be removed yet.
> > Nonremovable row versions range from 136 to 2036 bytes long.
> 
> It seems as though you have a transaction open that is holding onto a
> whole lot of old rows.
> 
> I have seen this happen somewhat-invisibly when a JDBC connection
> manager opens transactions for each connection, and then no processing
> happens to use those connections for a long time.  The open
> transactions prevent vacuums from doing any good...

Can't the backend be made to delay the "real" start of transaction until
the first query gets executed ?


Hannu




Re: [PERFORM] Help with count(*)

2003-11-14 Thread Tom Lane
Hannu Krosing <[EMAIL PROTECTED]> writes:
> Christopher Browne kirjutas R, 14.11.2003 kell 16:13:
>> I have seen this happen somewhat-invisibly when a JDBC connection
>> manager opens transactions for each connection, and then no processing
>> happens to use those connections for a long time.  The open
>> transactions prevent vacuums from doing any good...

> Can't the backend be made to delay the "real" start of transaction until
> the first query gets executed ?

That is on the TODO list.  I looked at it briefly towards the end of the
7.4 development cycle, and decided that it was nontrivial and I didn't
have time to make it happen before beta started.  I don't recall why it
didn't seem trivial.

regards, tom lane



Re: [PERFORM] Help with count(*)

2003-11-14 Thread Will LaShell
Hannu Krosing wrote:

> Christopher Browne kirjutas R, 14.11.2003 kell 16:13:
>
>> Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Rajesh Kumar Mallah) wrote:
>>
>>> INFO:  "profiles": found 0 removable, 369195 nonremovable row versions in 43423 pages
>>> DETAIL:  246130 dead row versions cannot be removed yet.
>>> Nonremovable row versions range from 136 to 2036 bytes long.
>>
>> It seems as though you have a transaction open that is holding onto a
>> whole lot of old rows.
>> I have seen this happen somewhat-invisibly when a JDBC connection
>> manager opens transactions for each connection, and then no processing
>> happens to use those connections for a long time.  The open
>> transactions prevent vacuums from doing any good...
>
> Can't the backend be made to delay the "real" start of transaction until
> the first query gets executed ?

That seems counterintuitive, doesn't it?  Why write more code in the
server when the client is the thing that has the problem?

Will



Re: [PERFORM] Help with count(*)

2003-11-14 Thread Christopher Browne
After a long battle with technology, [EMAIL PROTECTED] (Hannu Krosing), an earthling, 
wrote:
> Christopher Browne kirjutas R, 14.11.2003 kell 16:13:
>> Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Rajesh Kumar Mallah) 
>> wrote:
>> > INFO:  "profiles": found 0 removable, 369195 nonremovable row versions in 43423 
>> > pages
>> > DETAIL:  246130 dead row versions cannot be removed yet.
>> > Nonremovable row versions range from 136 to 2036 bytes long.
>> 
>> It seems as though you have a transaction open that is holding onto a
>> whole lot of old rows.
>> 
>> I have seen this happen somewhat-invisibly when a JDBC connection
>> manager opens transactions for each connection, and then no processing
>> happens to use those connections for a long time.  The open
>> transactions prevent vacuums from doing any good...
>
> Can't the backend be made to delay the "real" start of transaction until
> the first query gets executed ?

One would hope so.  Some time when I have the Round Tuits, I ought to
take a browse of the connection pool code to notice if there's
anything to notice.  

The thing that I keep imagining would be a slick idea is to have a thread
periodically go through the pool, for however many connections the pool
permits, and fire a short transaction through every otherwise-unoccupied
connection, in effect doing a sort of "vacuum" of the connections.  I don't
get very favorable reactions when I suggest that, though...
-- 
(reverse (concatenate 'string "ac.notelrac.teneerf" "@" "454aa"))
http://cbbrowne.com/info/sgml.html
Rules  of  the  Evil Overlord  #80.  "If  my  weakest troops  fail  to
eliminate a  hero, I will send  out my best troops  instead of wasting
time with progressively stronger ones  as he gets closer and closer to
my fortress." 



Re: [PERFORM] Help with count(*)

2003-11-14 Thread Tom Lane
Will LaShell <[EMAIL PROTECTED]> writes:
> Hannu Krosing wrote:
>> Can't the backend be made to delay the "real" start of transaction until
>> the first query gets executed ?

> That seems counter intuitive doesn't it?  Why write more code in the 
> server when the client is the thing that has the problem?

Because there are a lot of clients with the same problem :-(

A more principled argument is that we already postpone the setting of
the transaction snapshot until the first query arrives within the
transaction.  In a very real sense, the setting of the snapshot *is*
the start of the transaction.  So it would make sense if incidental
stuff like VACUUM also thought that the transaction hadn't started
until the first query arrives.  (I believe the previous discussion
also agreed that we wanted to postpone the freezing of now(), which
currently also happens at BEGIN rather than the first command after
BEGIN.)

regards, tom lane



Re: [PERFORM] constant vs function param differs in performance

2003-11-14 Thread Tom Lane
"=?iso-8859-2?B?U1rbQ1MgR+Fib3I=?=" <[EMAIL PROTECTED]> writes:
> I have two SQL function that produce different times and I can't understand
> why.

The planner often produces different plans when there are constants in
WHERE clauses than when there are variables, because it can get more
accurate ideas of how many rows will be retrieved.
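One possible workaround, sketched here with invented table and column names and
assuming the plpgsql language is installed, is to build the query with EXECUTE
inside a plpgsql function, so the planner sees the actual value as a constant
on every call:

CREATE FUNCTION test_param_dyn (int4) RETURNS int4 AS '
DECLARE
  r record;
BEGIN
  -- EXECUTE plans the statement at call time with the literal value in place
  FOR r IN EXECUTE ''SELECT count(*) AS n FROM orders WHERE customer_id = '' || $1 LOOP
    RETURN r.n;
  END LOOP;
  RETURN NULL;
END;
' LANGUAGE 'plpgsql';

The price is that the query is re-planned on each call, so this only pays off
when the constant-specific plan is much better than the generic one.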

regards, tom lane



[PERFORM] n_distinct way off, but following a pattern.

2003-11-14 Thread Nick Fankhauser
Hi-

I'm seeing estimates for n_distinct that are way off for a large table
(8,700,000 rows). They get better by setting the stats target higher, but
are still off by a factor of 10 with the stats set to 1000. I've noticed and
reported a similar pattern before on another table. Because this follows the
same very consistent pattern, I figured it was worth reporting again. This
looks more like a bug than randomness. If the poor result was simply due to
having a small sample to work from, the estimates should be all over the
map, but these are consistently low, low by a factor that varies in almost
exact inverse proportion to the stats target:

                                          run 1:   run 2:   run 3:
n_distinct estimate, statistics = 10:     3168     3187     3212
n_distinct estimate, statistics = 100:    23828    24059    23615
n_distinct estimate, statistics = 1000:   194690   194516   194081
Actual distinct values:                   3340724

Or to put it another way, if you were to take the estimate from analyze,
divide by the stats target and multiply by 10000, the result would be pretty
close to exact. (Within a factor of 2, which ought to be plenty close for
planning purposes.)

I'm running version 7.3.2

Any thoughts from folks familiar with this part of the source code?

Regards,
 -Nick

PS-
Here's a log of the session that I got this from.

alpha=# select count(distinct actor_id) from actor_case_assignment;
-[ RECORD 1 ]--
count | 3340724
alpha=# analyze;
ANALYZE
alpha=# SELECT * FROM pg_stats
alpha-#  WHERE tablename='actor_case_assignment' AND attname='actor_id';
-[ RECORD 1 ]-----+--------------------------------------------------------------
schemaname        | public
tablename         | actor_case_assignment
attname           | actor_id
null_frac         | 0
avg_width         | 16
n_distinct        | 3168
most_common_vals  | {18105XS,18115XS,18106XS,18113JD02,18115JD02,18106J27,18113XS,18113A10656,18115LST,18108XS}
most_common_freqs | {0.0206667,0.0206667,0.0196667,0.019,0.0176667,0.017,0.016,0.015,0.014,0.0136667}
histogram_bounds  | {18067A000-07P,18067PD397SC1574-1,18105LBPD,18106A2119-49,18106PD399IF845-1,18108A03068-20,18108LECS207,18108PTW03737278-2,18111A19788-77,18115A50420,18115XC}
correlation       | 0.876795
alpha=#
alpha=# alter table actor_case_assignment alter column actor_id set
statistics 100;
ALTER TABLE
alpha=# analyze actor_case_assignment;
ANALYZE
alpha=# SELECT * FROM pg_stats
alpha-#  WHERE tablename='actor_case_assignment' AND attname='actor_id';
-[ RECORD 1 ]-----+------------------------------
schemaname        | public
tablename         | actor_case_assignment
attname           | actor_id
null_frac         | 0
avg_width         | 17
n_distinct        | 23828
most_common_vals  | {18115XS,18113JD02,18106XS,1

alpha=# alter table actor_case_assignment alter column actor_id set
statistics 1000;
ALTER TABLE
alpha=# analyze actor_case_assignment;
ANALYZE
alpha=# SELECT * FROM pg_stats
alpha-#  WHERE tablename='actor_case_assignment' AND attname='actor_id';
-[ RECORD 1 ]-----+------------------------------
schemaname        | public
tablename         | actor_case_assignment
attname           | actor_id
null_frac         | 0
avg_width         | 16
n_distinct        | 194690
most_common_vals  | {18106XS,18115XS,18115...


alpha=# \x
Expanded display is off.
alpha=# alter table actor_case_assignment alter column actor_id set
statistics 10;
ALTER TABLE
alpha=# analyze actor_case_assignment;
ANALYZE
alpha=# select n_distinct from pg_stats where
tablename='actor_case_assignment and attname='actor_id';
alpha'# ';
ERROR:  parser: parse error at or near "actor_id" at character 85
alpha=# select n_distinct from pg_stats where
tablename='actor_case_assignment' and attname='actor_id';
 n_distinct
------------
       3187
(1 row)

alpha=# alter table actor_case_assignment alter column actor_id set
statistics 10;
ALTER TABLE
alpha=# analyze actor_case_assignment;
ANALYZE
alpha=# select n_distinct from pg_stats where
tablename='actor_case_assignment' and attname='actor_id';
 n_distinct
------------
       3212
(1 row)

alpha=# analyze actor_case_assignment;
ANALYZE
alpha=# alter table actor_case_assignment alter column actor_id set
statistics 100;
ALTER TABLE
alpha=# analyze actor_case_assignment;
ANALYZE
alpha=# select n_distinct from pg_stats where
tablename='actor_case_assignment' and attname='actor_id';
 n_distinct
------------
      24059
(1 row)

alpha=# analyze actor_case_assignment;
ANALYZE
alpha=# alter table actor_case_assignment alter column actor_id set
statistics 100;
ALTER TABLE
alpha=# select n_distinct from pg_stats where
tablename='actor_case_assignment' and attname='actor_id';
 n_distinct
------------
      23615
(1 row)

alpha=# alter table actor_case_assignment alter column actor_id set
statistics 1000;
ALTER TABLE
alpha=# analyze actor_case_assignment;
ANALYZE
alpha=# select n_distinct from pg_stats where
tablename='actor_cas

Re: [PERFORM] Help with count(*)

2003-11-14 Thread Andrew Sullivan
On Fri, Nov 14, 2003 at 02:16:56PM -0500, Christopher Browne wrote:
> otherwise-unoccupied connection in the pool, in effect, doing a sort
> of "vacuum" of the connections.  I don't get very favorable reactions
> when I suggest that, though...

Because it's a kludge on top of another kludge, perhaps?  ;-)  This
needs to be fixed properly, not through an ungraceful series of
workarounds.

A

-- 

Andrew Sullivan                         204-4141 Yonge Street
Afilias Canada                          Toronto, Ontario Canada
<[EMAIL PROTECTED]>                     M2P 2A8
                                        +1 416 646 3304 x110




Re: [PERFORM] INSERT extremely slow with large data sets (fwd)

2003-11-14 Thread George Essig
Slavisa Garic wrote:

> Hi Everyone,
 
> I am using PostgreSQL 7.3.2 and have used earlier versions (7.1.x 
> onwards) 
> and with all of them I noticed same problem with INSERTs when there is 
> a
> large data set. Just to so you guys can compare time it takes to insert
> one row into a table when there are only few rows present and when 
> there
> are thousands:

Try running VACUUM ANALYZE periodically during inserts.  I found this to help.
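For example (table name invented for illustration); plain VACUUM ANALYZE, unlike
VACUUM FULL, does not take an exclusive lock, so it can run while the inserts
continue:

VACUUM ANALYZE experiment_rows;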

George Essig




Re: [PERFORM] Error in transaction processing

2003-11-14 Thread radha.manohar
When I execute a transaction using embedded SQL statements in a C program,
I get the error:

Error in transaction processing. I could see from the documentation that
it means, "Postgres signalled to us that we cannot start, commit or
rollback the transaction"

I don't find any mistakes in the transaction statements.

What can I do to correct this error?

Your response would be very much appreciated.

Thanks and Regards,

Radha







Re: [PERFORM] n_distinct way off, but following a pattern.

2003-11-14 Thread Tom Lane
"Nick Fankhauser" <[EMAIL PROTECTED]> writes:
> I'm seeing estimates for n_distinct that are way off for a large table

Estimating n_distinct from a small sample is inherently a hard problem.
I'm not surprised that the estimates would get better as the sample size
increases.  But maybe we can do better.  The method we are currently
using is this:

/*--
 * Estimate the number of distinct values using the estimator
 * proposed by Haas and Stokes in IBM Research Report RJ 10025:
 *n*d / (n - f1 + f1*n/N)
 * where f1 is the number of distinct values that occurred
 * exactly once in our sample of n rows (from a total of N),
 * and d is the total number of distinct values in the sample.
 * This is their Duj1 estimator; the other estimators they
 * recommend are considerably more complex, and are numerically
 * very unstable when n is much smaller than N.
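
Purely for illustration (the sample numbers below are invented; only N comes
from the earlier post): with N = 8,700,000 total rows, a sample of n = 3,000
rows containing d = 2,900 distinct values of which f1 = 2,800 appeared exactly
once, the estimator gives

   n*d / (n - f1 + f1*n/N)
 = 3000*2900 / (3000 - 2800 + 2800*3000/8700000)
 = 8700000 / (200 + 0.97)
~= 43000

so even a modest number of repeated values in the sample drags the estimate
far below the true distinct count.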

It would be interesting to see exactly what inputs are going into this
equation.  Do you feel like adding some debug printouts into this code?
Or just looking at the variables with a debugger?  In 7.3 it's about
line 1060 in src/backend/commands/analyze.c.

BTW, this is already our second try at this problem, the original 7.2
equation didn't last long at all ...

regards, tom lane



Re: [PERFORM] INSERT extremely slow with large data sets (fwd)

2003-11-14 Thread Slavisa Garic
Does VACUUM ANALYZE help only with the analysis, or does it also speed up the
process? I know I could try that before asking, but the experiment is running
now and I am too curious to wait :).

Anyway thanks for the hint,
Slavisa

On Fri, 14 Nov 2003, George Essig wrote:

> Slavisa Garic wrote:
> 
> > Hi Everyone,
>  
> > I am using PostgreSQL 7.3.2 and have used earlier versions (7.1.x 
> > onwards) 
> > and with all of them I noticed same problem with INSERTs when there is 
> > a
> > large data set. Just to so you guys can compare time it takes to insert
> > one row into a table when there are only few rows present and when 
> > there
> > are thousands:
> 
> Try running VACUUM ANALYZE periodically during inserts.  I found this to help.
> 
> George Essig
> 
> 




Re: [PERFORM] Query question

2003-11-14 Thread Christopher Kings-Lynne
>> The only thing you're adding to the query is a second SORT step, so it
>> shouldn't require any more time/memory than the query's first SORT
>> did.
>
> Interesting -- I wonder if it would be possible for the optimizer to
> detect this and avoid the redundant inner sort ... (/me muses to
> himself)

That's something I've wondered myself as well.  Also - I wonder if the
optimiser could be made smart enough to push down the outer LIMIT and
OFFSET clauses into the subquery.
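
A purely hypothetical illustration of the shape being discussed (table and
column names invented): the outer query repeats the inner ORDER BY, and pushing
the LIMIT and OFFSET down into the subquery would let the outer sort handle
only the few surviving rows instead of the whole result:

SELECT *
FROM (SELECT * FROM orders ORDER BY order_date) AS sub
ORDER BY order_date
LIMIT 20 OFFSET 100;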

Chris


