Re: [GENERAL] Weird indices

2001-02-21 Thread Stephan Szabo

On Wed, 21 Feb 2001, Martijn van Oosterhout wrote:

> On Tue, Feb 20, 2001 at 05:02:22PM -0800, Stephan Szabo wrote:
> > 
> > IIRC, there's something which is effectively:
> > estimated rows = <fraction> * <most common frequency>
> > I think fraction defaults to (is always?) 1/10 for the standard
> > index type.  That's where the 50 comes from. And the frequency is
> > probably from the last vacuum analyze.
> 
> Is there a way to change this fraction?
> 
> We have a table with over 1 million rows, and the statistics Postgres
> gathers are not particularly useful.  No single (non-null) value occurs
> significantly more often than the others, but the distribution looks a
> lot like a 1/x curve, I guess.  The most common value occurs 5249 times
> but the average is only 95, so Postgres almost always chooses a seq
> scan.  We now actually set enable_seqscan=off in many areas of our code
> to speed it up to a useful rate.  (This table also happens to have an
> accidental clustering on this column.)
> 
> What is the reasoning behind estimating like that?  Why not just use the
> average, or the average + 1 SD?
> 
> Another idea: is there a use for making a "cohesiveness" index?  I.e., if
> you're looking up X via the index, on average how many other matching
> tuples will be in the next 8k (or whatever size)?  Since these are likely
> to be in the cache, the cost of retrieval would be much lower.  This would
> mean that an index on a clustered column would have a much lower estimated
> cost than an index on other columns.  This would make clustering more
> useful.

Well, there's been talk about keeping better statistics in the future (see
the hackers archives; I can't remember the thread, though, it was a while
ago).  Keeping the most common frequency and some kind of frequency graph or
standard deviation would probably be useful.  As for cohesiveness, that
gets kind of difficult to keep track of as changes are made, but it could
probably be of some use to the estimator.

As far as I know, the only way to change the fraction is through
recompiling, but Tom would probably know better about that.  Unfortunately,
that's a really big stick to hit the problem with.
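
For reference, the enable_seqscan workaround mentioned above looks like
this (a minimal sketch; the table and column names are made up):

    SET enable_seqscan = off;  -- tell the planner to avoid sequential scans
    SELECT * FROM big_table WHERE skewed_col = 42;  -- now favors the index
    SET enable_seqscan = on;   -- restore the default for this session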





[GENERAL] How to release SET() in PgSQL?

2001-02-21 Thread Andrey Y. Mosienko


Hello All!


I used MySQL for a long time.  It has a SET() concept.
I can define SET('one','two','three') and use that type:

CREATE TABLE "test" (
"a" SET('one','two','three')
);

And then:

 SELECT * FROM test WHERE a = 'one';
or 
 SELECT * FROM test WHERE a like 'one,three';

How can I do it in PgSQL? I need it so much!
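
PostgreSQL has no direct SET() equivalent, but one way to emulate it (a
sketch only; the helper table and all names are made up) is a child table
with one row per set member:

    CREATE TABLE test (
        id integer PRIMARY KEY
    );

    -- one row per member; the CHECK restricts the allowed values
    CREATE TABLE test_a (
        id integer REFERENCES test (id),
        a  text CHECK (a IN ('one', 'two', 'three')),
        PRIMARY KEY (id, a)
    );

    -- "WHERE a = 'one'" becomes a join:
    SELECT t.* FROM test t, test_a s WHERE t.id = s.id AND s.a = 'one';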


-- 
With respect, Andrey Feofilactovich.
e-mail: [EMAIL PROTECTED], [EMAIL PROTECTED]
ICQ: 28073807



Re: [GENERAL] postgres load

2001-02-21 Thread Denis Perchine

On Wednesday 21 February 2001 01:19, Joseph wrote:
> I am reposting my question because I received no answer, and this is a
> serious problem for me... I've used explain, vacuum, indexes... and so
> on, with few nested queries...

Do not use ReiserFS on production machines.  There is still an enormous
number of bugs, including data corruption, etc.  See recent linux-kernel
discussions for details.  That may be why what you are experiencing is
possible.

> 
> I am running PGSql 7.0.3 over Linux 2/ELF with a ReiserFS filesystem,
> a dual P3 800, and 2 GB of RAM.
>
> My server's load jumps from 8 to 32 without any real reason
> and without too many requests.
>
> I already vacuum, even on the fly, if ever that can decrease the
> load, but nothing...
>
> I've created many indexes also...
>
> Can someone help me ?
>
> Emmanuel

-- 
Sincerely Yours,
Denis Perchine

--
E-Mail: [EMAIL PROTECTED]
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
--



Re: [GENERAL] vacuum analyze again...

2001-02-21 Thread Pete Forman

Bruce Momjian writes:
 > > Bruce Momjian <[EMAIL PROTECTED]> writes:
 > > 
 > > > No, we have no ability to randomly pick rows to use for
 > > > estimating statistics.  Should we have this ability?
 > > 
 > > That would be really slick, especially given the fact that VACUUM
 > > runs much faster than VACUUM ANALYZE for a lot of PG users.  I
 > > could change my daily maintenance scripts to do a VACUUM of
 > > everything, followed by a VACUUM ANALYZE of the small tables,
 > > followed by a VACUUM ANALYZE ESTIMATE (or whatever) of the large
 > > tables.
 > > 
 > > Even cooler would be the ability to set a table size threshold,
 > > so that VACUUM ANALYZE would automatically choose the appropriate
 > > method based on the table size.
 > 
 > Added to TODO:
 > 
 > * Allow ANALYZE to process a certain random percentage of
 >   rows

Does this reduced analysis need to be random?  Why not allow the DBA
to specify which rows or blocks to process in some way?
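
For reference, the split maintenance pass described above would look
something like this (a sketch; the table name is made up, and VACUUM
ANALYZE ESTIMATE is only a proposal, not an existing command):

    VACUUM;                      -- reclaim space everywhere, no statistics
    VACUUM ANALYZE small_table;  -- gather full statistics on small tables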
-- 
Pete Forman -./\.- Disclaimer: This post is originated
WesternGeco   -./\.-  by myself and does not represent
[EMAIL PROTECTED] -./\.-  opinion of Schlumberger, Baker
http://www.crosswinds.net/~petef  -./\.-  Hughes or their divisions.



[GENERAL] sequence and stored procedure

2001-02-21 Thread Renaud Thonnart

How can I use sequences in a stored procedure written in pl/pgsql?

Thanks in advance!
Renaud THONNART
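
A minimal sketch of one way to do it (the sequence and function names
are made up):

    CREATE SEQUENCE my_seq;

    CREATE FUNCTION get_next_id() RETURNS integer AS '
    BEGIN
        -- nextval() works inside pl/pgsql just as in plain SQL
        RETURN nextval(''my_seq'');
    END;
    ' LANGUAGE 'plpgsql';

    SELECT get_next_id();  -- returns 1, then 2, and so on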



Re: [GENERAL] DBD::Pg is suddenly acting up!

2001-02-21 Thread newsreader

Thank you.  Look what I get..
--
$ pg_ctl start -o "-F -S 2048"
postmaster successfully started up.
$ usage: /home/newsreader/pgsql/bin/postmaster [options]
-B nbufs     set number of shared buffers
-D datadir   set data directory
-S           silent mode (disassociate from tty)
-a system    use this authentication system
-b backend   use a specific backend server executable
-d [1-5]     set debugging level
-i           listen on TCP/IP sockets as well as Unix domain socket
-N nprocs    set max number of backends (1..1024, default 32)
-n           don't reinitialize shared memory after abnormal exit
-o option    pass 'option' to each backend servers
-p port      specify port for postmaster to listen on
-s           send SIGSTOP to all backend servers if one dies
-


I've found that 
pg_ctl -o "-o -F -S 2048" start
works as well as 
pg_ctl start -o "-o -F -S 2048"

--
If you read the man page of pg_ctl, you will see that
it tells you the wrong thing.





On Wed, Feb 21, 2001 at 11:29:30PM -0500, Tom Lane wrote:
> [EMAIL PROTECTED] writes:
> > pg_ctl is completely not working for me. I do
> > $ pg_ctl -o "-F -S 2048" start
> > and it keeps telling me I'm not doing it right.
> 
> Indeed, you are not.  Try
>   pg_ctl start -o "-F -S 2048"
> 
>   regards, tom lane



Re: [GENERAL] C function woes

2001-02-21 Thread Tom Lane

Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Is it good practice to scribble around in data type internals?  Why not

> /* with the 7.0-era headers; textin() is declared in utils/builtins.h */
> #include "postgres.h"
> #include "utils/builtins.h"
>
> text *
> hello()
> {
>   return textin("hello world");
> }

Well, he could do that if he wanted, but that doesn't teach him anything
about how to prepare his own datatype, which I suppose is the real point
of the exercise.

(Actually, the above code will be *less* portable to 7.1 than the other
version, because the calling convention of textin() changed ...)

regards, tom lane



Re: [GENERAL] DBD::Pg is suddenly acting up!

2001-02-21 Thread newsreader

BTW, what is shown below was done on
a different machine where postgres is
installed in a user directory.
Just to rule out the confusion I had earlier,
I deleted bin, lib, and include in the user
home directory and reinstalled under
/usr/local/pgsql, and the problem still
remains: I have to supply an extra -o
to make it work.


On Wed, Feb 21, 2001 at 11:43:46PM -0500, [EMAIL PROTECTED] wrote:
> Thank you.  Look what I get..
> --
> $ pg_ctl start -o "-F -S 2048"
> postmaster successfully started up.
> $ usage: /home/newsreader/pgsql/bin/postmaster [options]
> -B nbufs     set number of shared buffers
> -D datadir   set data directory
> -S           silent mode (disassociate from tty)
> -a system    use this authentication system
> -b backend   use a specific backend server executable
> -d [1-5]     set debugging level
> -i           listen on TCP/IP sockets as well as Unix domain socket
> -N nprocs    set max number of backends (1..1024, default 32)
> -n           don't reinitialize shared memory after abnormal exit
> -o option    pass 'option' to each backend servers
> -p port      specify port for postmaster to listen on
> -s           send SIGSTOP to all backend servers if one dies
> -
> 
> 
> I've found that 
>   pg_ctl -o "-o -F -S 2048" start
> works as well as 
>   pg_ctl start -o "-o -F -S 2048"
> 
> --
> If you read the man page of pg_ctl, you will see that
> it tells you the wrong thing.
> 
> 
> 
> 
> 
> On Wed, Feb 21, 2001 at 11:29:30PM -0500, Tom Lane wrote:
> > [EMAIL PROTECTED] writes:
> > > pg_ctl is completely not working for me. I do
> > >   $ pg_ctl -o "-F -S 2048" start
> > > and it keeps telling me I'm not doing it right.
> > 
> > Indeed, you are not.  Try
> > pg_ctl start -o "-F -S 2048"
> > 
> > regards, tom lane



[GENERAL] www-sql question

2001-02-21 Thread lonnie

Hello All,

I have installed the latest version of PostgreSQL 7.0.3 and all seems to
be fine with it.

My problem is that I have used www-sql in the past with MySQL, where it
works fine.  Since I have just migrated to the newest version of
PostgreSQL, I recompiled with --with-database-pgsql, which also went
fine, but the application does not seem to work.

Has anyone had any luck running www-sql with the newest version of
PostgreSQL?

Also, I am running a Mandrake Linux box, and my PostgreSQL seems to be
working fine, as I can compile and run the examples without any problems.

Really confused about this.
Lonnie Cumberland




[GENERAL] Re: AllocSetAlloc()

2001-02-21 Thread Karel Zak


On Wed, 21 Feb 2001, Tom Lane wrote:

> Antonis Antoniou <[EMAIL PROTECTED]> writes:
> > I would like  to know what is causing this error. The machine has 1GB of
> > memory and I use  PostgreSQL 7.1beta4.
> > The same function on  PostgreSQL 7.0.2  works ok.  Could someone explain
> > me  what this error means and where to look in order to solve it?
> > select  function_name;
> > ERROR:  Memory exhausted in AllocSetAlloc()
> 
> If it worked in 7.0 and fails now, then that probably means we have a
> bug to fix.  May we see a complete, reproducible example?
> 

 This error is returned by aset.c only if malloc() returns NULL.  Couldn't
it be a problem in the OS/libs?

Karel
 




Re: [GENERAL] pg_shadow.passwd versus pg_hba.conf password passwd

2001-02-21 Thread Peter Eisentraut

Richard Lynch writes:

> If I'm reading "man pg_passwd" correctly, I can create a standard
> Un*x passwd file and use that with "password" in pg_hba.conf

Correct.

> However, the current installation seems to be using "crypt", with no
> passwd file, and with unencrypted passwords in the pg_shadow.passwd
> field

I don't know what your current installation is, but that is definitely a
possible scenario.

> -- Or, at least, as far as I can tell, since /etc/.meta.id has
> the same text as the admin's pg_shadow.passwd field.

The file /etc/.meta.id is not used by PostgreSQL as distributed.

> So, my question is, what is the "passwd" field in pg_shadow for?...

If you don't use the extra argument after "password" in pg_hba.conf then
that's where the password comes from.

> Is that where an unencrypted password would be stored if I used
> "password" rather than "crypt"?...

"password" vs "crypt" is only related to what goes over the wire, not
where the password comes from.

> That seems the exact opposite of the reality on this box.  Or can I
> get pg_hba.conf to just use that field somehow with "crypt"?

Crypt with password file is not possible, I'm afraid.

> If I *cannot* use pg_shadow.passwd for the encrypted password,

You can.  You *are*, AFAICT.

> and I use standard Un*x passwd file, does create_user know enough with
> -P to fill that in properly, or am I on my own?...
>
> How is Cobalt getting this to work with "localhost all crypt" in
> pg_hba.conf, but the password does not seem to be encrypted:
> /etc/.meta.id is plaintext of pg_shadow.passwd, and there is no
> obvious passwd file, so where's the crypt?

On the wire.
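
For concreteness, the two variants discussed above would look roughly like
this in pg_hba.conf (a sketch; the addresses and file name are made up):

    # password checked against pg_shadow.passwd:
    host  all  127.0.0.1  255.255.255.255  password
    # password checked against a separate file maintained with pg_passwd:
    host  all  127.0.0.1  255.255.255.255  password  passwords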

-- 
Peter Eisentraut  [EMAIL PROTECTED]   http://yi.org/peter-e/




Re: [GENERAL] Weird indices

2001-02-21 Thread Bruce Momjian

> > Also, more work would be required for every update.  Right now an
> > update requires a B-tree insert for each index.  With this change,
> > every update would require an additional B-tree lookup and write for
> > each index.  That would require on average a bit less than one
> > additional block write per index.  That's a lot.
> > 
> > In exchange, certain queries would become faster.  Specifically, any
> > query which only needed the information found in an index would become
> > faster.  Each such query would save on average a bit less than one
> > additional block read per value found in the index.  But since the
> > indices would be less efficient, some of the advantage would be lost
> > due to extra block reads of the index.
> > 
> > What you are suggesting seems possible, but it does not seem to be
> > obviously better.
> 
> It may not be as obvious as it first seemed to me, but I bet there are
> certain databases out there that have just the right pattern of data
> that would benefit from this.  I suppose this is something that
> compilers have tried to balance all along.  Maybe there could be a
> different type of index that could be manually added by admins who
> wanted to fiddle around with their database.

If you want performance options, MySQL is the champ.  We usually require
a clear reason to give users more options, because too many options can
be quite confusing.  It is more a matter of design philosophy.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 853-3000
  +  If your life is a hard drive, |  830 Blythe Avenue
  +  Christ can be your backup.|  Drexel Hill, Pennsylvania 19026



Re: [GENERAL] Weird indices

2001-02-21 Thread Joseph Shraibman

Ian Lance Taylor wrote:
> 


> You're right.  The mechanism used to preserve multiple versions of
> heap tuples could be extended to index tuples as well.
> 
> Based on the heap tuple implementation, this would require adding two
> transaction ID's and a few flags to each index tuple.  That's not
> insignificant.  In a B-tree, right now, I think there are 8 bytes plus
> the key for each item in the tree.  This would require adding another
> 10 bytes or so.  That's a lot.
> 
OK, now this is starting to make sense to me.  I think.  I guess I'll
really have to sift through the code again to figure out the rest.

> Also, more work would be required for every update.  Right now an
> update requires a B-tree insert for each index.  With this change,
> every update would require an additional B-tree lookup and write for
> each index.  That would require on average a bit less than one
> additional block write per index.  That's a lot.
> 
> In exchange, certain queries would become faster.  Specifically, any
> query which only needed the information found in an index would become
> faster.  Each such query would save on average a bit less than one
> additional block read per value found in the index.  But since the
> indices would be less efficient, some of the advantage would be lost
> due to extra block reads of the index.
> 
> What you are suggesting seems possible, but it does not seem to be
> obviously better.

It may not be as obvious as it first seemed to me, but I bet there are
certain databases out there that have just the right pattern of data
that would benefit from this.  I suppose this is something that
compilers have tried to balance all along.  Maybe there could be a
different type of index that could be manually added by admins who
wanted to fiddle around with their database.
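
For reference, the kind of query Ian describes (answerable from the index
alone) would look like this (a sketch; the table and column names are
made up):

    CREATE INDEX orders_customer_idx ON orders (customer_id);

    -- only customer_id is needed, and it is stored in the index; with
    -- visibility information in index tuples this could skip the heap:
    SELECT count(*) FROM orders WHERE customer_id = 42;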
> 
> If you feel strongly about this, the most reasonable thing would be
> for you to implement it, and test the results.  Since as far as I can
> see what you are suggesting is not clearly better, it's unlikely that
> anybody else is going to go to the considerable effort of implementing it
> on your behalf.
> 


> Some things could, sure.  It's not obvious to me that many things
> could.  The planner can't spend a lot of time looking at an index to
> decide whether or not to use it.  If it's going to do that, it's
> better off to just decide to use the index in the first place.  Index
> examination is not free.  It requires disk reads just like everything
> else.
> 
Not free, but possibly worth it if it saves a seq scan.

> > > I don't think there is any way to do that today.  It would be possible
> > > to implement something along the lines I suggest above.  I have no
> > > idea if the Postgres maintainers have any plans along these lines.
> > >
> > At the end of a transaction, when it sets the bit that this tuple isn't
> > valid, couldn't it at the same time also remove it if it was no longer
> > visible to any transaction?  It wouldn't remove the need for vacuum
> > because there may be another transaction that prevents it from being
> > removed right then and there.
> 
> Yes, this could be done.  It wouldn't speed things up, though.  In
> fact, it would slow them down.  The only advantage would be that
> VACUUM would be required less often--an advantage which is not
> insignificant.
> 
> I would guess that in the average multi-user database less than half
> of the tuples could be deleted at that point.  It would be easy to
> instrument Postgres to test this--why don't you try that?
> 

I just might.  I've been thinking of hacking Postgres, but for adding
XML support.  That seems to be mostly a matter of parsing.



-- 
Joseph Shraibman
[EMAIL PROTECTED]
Increase signal to noise ratio.  http://www.targabot.com