[HACKERS] Sequential Scan Read-Ahead
At 12:41 PM 4/23/02 -0400, Bruce Momjian wrote:

>This is an interesting point, that an index scan may fit in the cache
>while a sequential scan may not.

If so, I would expect that the number of pages read is significantly
smaller than it was with a sequential scan. If that's the case, doesn't
that mean that the optimizer made the wrong choice anyway?

BTW, I just did a quick walk down this chain of code to see what happens
during a sequential scan:

    access/heap/heapam.c
    storage/buffer/bufmgr.c
    storage/smgr/smgr.c
    storage/smgr/md.c

and it looks to me like individual reads are being done in BLKSIZE
chunks, whether we're scanning or not.

During a sequential scan, I've heard that it's more efficient to read in
multiples of your blocksize, say, 64K chunks rather than 8K chunks, for
each read operation you pass to the OS. Does anybody have any experience
to know if this is indeed the case? Has anybody ever added this to
postgresql and benchmarked it?

Certainly if there's a transaction-based limit on disk I/O, as well as a
throughput limit, it would be better to read in larger chunks.

cjs
--
Curt Sampson <[EMAIL PROTECTED]>  +81 90 7737 2974  http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE
Curt Sampson wrote:
>
> On Tue, 23 Apr 2002, mlw wrote:
>
> > > On a system that has neither read-ahead nor sorting of I/O requests,
> > > yes. Which systems are you using that provide neither of these
> > > facilities?
> >
> > This only happens if the OS can organize the I/O requests in such a
> > manner. It is a non-trivial function.
>
> Well, if you call less than 200 lines of code (including lots of
> comments), "non-trivial," yes. Have a look at NetBSD's
> src/sys/kern/subr_disk.c for one example implementation.
>
> But trivial or not, if all operating systems on which Postgres runs
> are doing this, your point is, well, pointless. So, once again, which
> systems are you using that do *not* do this?

I am not arguing about whether or not they do it; I am saying it is not
always possible. I/O requests do not remain in queue waiting for
reordering indefinitely.
[HACKERS] Returning text from stored procedures??
Hi!

I seem to have trouble returning large strings of text from stored
procedures (coded in C). It works fine for smaller data, and elog prints
out the result nicely; but when the result is "returned" it makes the
server disconnect.

I use the standard Datum methods (actually I've tried every possible way
of returning the damn text :-( ).

Any ideas why? Are there restrictions on the size of text a stored
procedure can return?

Any help is appreciated :-)

/Steffen Nielsen
Re: [HACKERS] "make report"
> If you want to collect information about what features are portable you
> can check other software packages, product manuals, ports trees, etc.
> Most issues are documented someplace.

Oh goodness. Thanks for offering me a new hobby ;)

> Btw., yes, 8 byte integers are missing on some platforms.

Right. The two areas which come to mind are integer availability and the
time zone support (as you might know, we support *three* different time
zone models). At the moment, none of the developers know the features
supported on the platforms we claim to support. Which platforms do not
have int8 support still? Which do not have time zone interfaces fitting
into the two "zonefull" styles? I'd like to know, but istm that the
people *with* the platforms could do this much more easily than those
without. What am I missing here??

- Thomas
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote:
> > I'd be willing to consider making the behavior variable-specific
> > if anyone can identify particular variables that need to behave
> > differently. But overall I think it's better that the behavior
> > be consistent --- so you'll need a good argument to convince me
> > that anything should behave differently ;-).
> >
> > There is a variant case that should also have been illustrated:
> > what if there is no error, but the user does ROLLBACK instead of
> > COMMIT? The particular case that is causing difficulty for me is
> >
> >	begin;
> >	create schema foo;
> >	set search_path = foo;
> >	rollback;
> >
> > There is *no* alternative here but to roll back the search_path
> > setting.
>
>	begin;
>	;
>	ERROR: parser: parse error at or near ""
>
> There's *no* alternative here but to call *rollback* (commit).
> However PostgreSQL doesn't call *rollback* automatically and
> it's the user's responsibility to call *rollback* on errors.
> IMHO what to do with errors is users' responsibility basically.
> The behavior of the *search_path* variable is a *had better*
> or *convenient* kind of thing, not a *no alternative* kind
> of thing.

I understand from an ODBC perspective that it is the app's
responsibility, but we need some defined behavior for a psql script that
is fed into the database. Assuming the SET commands continue to come
after it is aborted but before the COMMIT/ROLLBACK, we need to define
how to handle it.

--
Bruce Momjian                   | http://candle.pha.pa.us
[EMAIL PROTECTED]               | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup.    | Drexel Hill, Pennsylvania 19026
Re: [HACKERS] namedatalen part 2 (cont'd)
Tom Lane wrote:
> Neil Conway <[EMAIL PROTECTED]> writes:
> > ...Based on that data, I'd vote against making any changes to NAMEDATALEN.
>
> It looked to me like the cost for going to NAMEDATALEN = 64 would be
> reasonable. Based on these numbers I'd have a problem with 128 or more.
>
> But as you observe, pgbench numbers are not very repeatable. It'd be
> nice to have some similar experiments with another benchmark before
> making a decision.

Yes, 64 looked like the appropriate value too. Actually, I was
surprised to see as much of a slowdown as we did.
Re: [HACKERS] Inefficient handling of LO-restore + Patch
OK, I have applied the following patch to fix these warnings. However,
I need Mario to confirm these are the right changes. Thanks.

---

Peter Eisentraut wrote:
> This patch does not compile correctly:
>
> pg_backup_archiver.c: In function `ahwrite':
> pg_backup_archiver.c:1252: warning: pointer of type `void *' used in arithmetic
> pg_backup_archiver.c:1259: warning: pointer of type `void *' used in arithmetic
> pg_backup_archiver.c:1263: warning: pointer of type `void *' used in arithmetic
> make: *** [pg_backup_archiver.o] Error 1
>
> Bruce Momjian writes:
> >
> > Patch applied. Thanks.
> >
> > ---
> >
> > Mario Weilguni wrote:
> > > On Thursday, April 11, 2002 17:44, Tom Lane wrote:
> > > > "Mario Weilguni" <[EMAIL PROTECTED]> writes:
> > > > > And I did not find out how I can detect the large object
> > > > > chunksize, either from getting it from the headers (include
> > > > > "storage/large_object.h" did not work)
> > > > >
> > > > > You did not answer if it's ok to post the patch, hope it's ok:
>
> --
> Peter Eisentraut   [EMAIL PROTECTED]

Index: src/bin/pg_dump/pg_backup_archiver.c
===
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_archiver.c,v
retrieving revision 1.43
diff -c -r1.43 pg_backup_archiver.c
*** src/bin/pg_dump/pg_backup_archiver.c	24 Apr 2002 02:21:04 -	1.43
--- src/bin/pg_dump/pg_backup_archiver.c	24 Apr 2002 14:01:15 -
***************
*** 1249,1266 ****
  		int			remaining = AH->lo_buf_size - AH->lo_buf_used;
  		int			slack = nmemb * size - remaining;
  
! 		memcpy(AH->lo_buf + AH->lo_buf_used, ptr, remaining);
  		res = lo_write(AH->connection, AH->loFd, AH->lo_buf, AH->lo_buf_size);
  		ahlog(AH, 5, "wrote %d bytes of large object data (result = %d)\n",
  			  AH->lo_buf_size, res);
  		if (res != AH->lo_buf_size)
  			die_horribly(AH, modulename,
  						 "could not write to large object (result: %d, expected: %d)\n",
  						 res, AH->lo_buf_size);
! 		memcpy(AH->lo_buf, ptr + remaining, slack);
  		AH->lo_buf_used = slack;
  	}
  	else
  	{
  		/* LO Buffer is still large enough, buffer it */
! 		memcpy(AH->lo_buf + AH->lo_buf_used, ptr, size * nmemb);
  		AH->lo_buf_used += size * nmemb;
  	}
--- 1249,1266 ----
  		int			remaining = AH->lo_buf_size - AH->lo_buf_used;
  		int			slack = nmemb * size - remaining;
  
! 		memcpy((char *)AH->lo_buf + AH->lo_buf_used, ptr, remaining);
  		res = lo_write(AH->connection, AH->loFd, AH->lo_buf, AH->lo_buf_size);
  		ahlog(AH, 5, "wrote %d bytes of large object data (result = %d)\n",
  			  AH->lo_buf_size, res);
  		if (res != AH->lo_buf_size)
  			die_horribly(AH, modulename,
  						 "could not write to large object (result: %d, expected: %d)\n",
  						 res, AH->lo_buf_size);
! 		memcpy(AH->lo_buf, (char *)ptr + remaining, slack);
  		AH->lo_buf_used = slack;
  	}
  	else
  	{
  		/* LO Buffer is still large enough, buffer it */
! 		memcpy((char *)AH->lo_buf + AH->lo_buf_used, ptr, size * nmemb);
  		AH->lo_buf_used += size * nmemb;
  	}
Re: [HACKERS] Sequential Scan Read-Ahead
Curt Sampson wrote:
> At 12:41 PM 4/23/02 -0400, Bruce Momjian wrote:
>
> >This is an interesting point, that an index scan may fit in the cache
> >while a sequential scan may not.
>
> If so, I would expect that the number of pages read is significantly
> smaller than it was with a sequential scan. If that's the case,
> doesn't that mean that the optimizer made the wrong choice anyway?
>
> BTW, I just did a quick walk down this chain of code to see what happens
> during a sequential scan:
>
>	access/heap/heapam.c
>	storage/buffer/bufmgr.c
>	storage/smgr/smgr.c
>	storage/smgr/md.c
>
> and it looks to me like individual reads are being done in BLKSIZE
> chunks, whether we're scanning or not.
>
> During a sequential scan, I've heard that it's more efficient to
> read in multiples of your blocksize, say, 64K chunks rather than
> 8K chunks, for each read operation you pass to the OS. Does anybody
> have any experience to know if this is indeed the case? Has anybody
> ever added this to postgresql and benchmarked it?
>
> Certainly if there's a transaction based limit on disk I/O, as well
> as a throughput limit, it would be better to read in larger chunks.

We expect the file system to do read-aheads during a sequential scan.
This will not happen if someone else is also reading buffers from that
table in another place.
Re: [HACKERS] Inefficient handling of LO-restore + Patch
I wanted to correct the patch this evening after work, and will check
your changes. Thanks!

-----Original Message-----
From: Bruce Momjian [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 24, 2002 16:03
To: Peter Eisentraut
Cc: Mario Weilguni; [EMAIL PROTECTED]
Subject: Re: [HACKERS] Inefficient handling of LO-restore + Patch

OK, I have applied the following patch to fix these warnings. However,
I need Mario to confirm these are the right changes. Thanks.
Re: [HACKERS] Inefficient handling of LO-restore + Patch
Philip Warner <[EMAIL PROTECTED]> writes:
> At 16:46 15/04/02 +0200, Mario Weilguni wrote:
>> And how about getting database internals via SQL-functions - e.g.
>> getting BLCSIZE, LOBBLCSIZE?

> ISTM that there would be some merit in making a selection of
> compile-time options available via SQL. Is this worth considering?

This could usefully be combined with the nearby thread about recording
configuration options (started by Thomas). I'd be inclined to make it a
low-footprint affair where you do something like

	select compilation_options();

and you get back a long textual list of var=value settings, say

	VERSION=7.3devel
	PLATFORM=hppa-hp-hpux10.20, compiled by GCC 2.95.3
	BLCKSZ=8192
	MULTIBYTE=yes
	etc etc etc etc

This would solve the diagnostic need as-is, and it doesn't seem
unreasonable to me to expect applications to look through the output for
a particular line if they need to get the state of a specific
configuration option. It's also trivial to extend/modify as the set of
options changes over time.

			regards, tom lane
Re: [HACKERS] namedatalen part 2 (cont'd)
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Yes, 64 looked like the appropriate value too. Actually, I was
> surprised to see as much of a slowdown as we did.

I was too. pgbench runs the same backend(s) throughout the test, so it
shouldn't be paying anything meaningful in disk I/O for the larger
catalog size. After the first set of queries, all the relevant catalog
rows will be cached in syscache. So where's the performance hit coming
from?

It'd be interesting to redo these runs with profiling turned on and
compare the profiles at, say, 32 and 512 to see where the time is going
for larger NAMEDATALEN. Might be something that's easy to fix once we
identify it.

			regards, tom lane
Re: [HACKERS] Vote on SET in aborted transaction
> OK, would people please vote on how to handle SET in an aborted
> transaction?
> at the end, should 'x' equal:
>	1 - All SETs are rolled back in aborted transaction
>	2 - SETs are ignored after transaction abort
>	3 - All SETs are honored in aborted transaction
>	? - Have SETs vary in behavior depending on variable

I'll vote for "?", if for no other reason than that you are proposing
taking away a huge chunk of "language space" by a priori disallowing out
of band behaviors for anything starting with "SET". I think that is
likely Hiroshi's concern also.

If we can fit all current "SET" behaviors into a transaction model, then
I'm not against that (though we should review the list of attributes
which *are* currently affected before settling on this). afaik we have
not reviewed current behaviors and have not thought through the "what
if's" that some sort of premature policy decision might constrain in the
future.

Let me give you some examples. We might someday have nested
transactions, or transactions which can be resumed from the point of
failure. We *might* want to be able to affect recovery behaviors, and we
*might* want to do so with something like

	begin;
	update foo...
	update bar...
	
	set blah to blah
	update baz...
	update bar...
	
	end;

Now we currently *don't* support this behavior, but istm that we
shouldn't preclude it in the language by forcing some blanket "all SET
statements will be transaction aware". What language elements would you
propose to cover the out of band cases if you *do* disallow "SET" in
that context? If you don't have a candidate, I'd be even more reluctant
to go along with the results of some arbitrary vote which is done in a
narrow context.

And btw, if we *are* going to put transaction semantics on all of our
global variables (which is the context for starting this "SET"
discussion, right? Is that really the context we are still in, even
though you have phrased a much more general statement above?) then let's
have the discussion on *HOW* we are going to accomplish that *BEFORE*
deciding to make a semantic constraint on our language support.

Hmm, if we are going to use transaction semantics, then we should
consider using our existing transaction mechanisms, and if we use our
existing transaction mechanisms we should consider pushing these global
variables into tables, or in-memory tables a la "temp tables". We get
the transaction semantics for free, with the cost of value lookup at the
beginning of a transaction or statement (not sure what we can get away
with here). If we are *not* going to use those existing mechanisms, then
what mechanism *are* we going to use? Some sort of "abort hook"
mechanism to allow SET to register things to be rolled back?

If we end up making changes and increasing constraints, then we should
also expect some increased functionality as part of the scheme,
specifically "SET extensibility". We should allow (future) packages to
define their parameters and allow SET to help.

Just some thoughts...

- Thomas
[HACKERS] Table checking/dumping program
[Please CC any replies, I'm subscribed nomail]

As promised I've given it a bit of polish and it's actually almost
useful. You can have a look at it:

	http://svana.org/kleptog/pgsql/pgfsck.html

Just unpack the files into a directory. It's just a perl script with two
modules, so no compiling necessary. You can download the package
directly at:

	http://svana.org/kleptog/pgsql/pgfsck-0.01.tar.gz

I've tested it on versions 6.5, 7.0 and 7.2 and it works. It shouldn't
crash, no matter how bad a file you feed it. It can also output insert
statements to help reconstruction.

Here is an example of the program being run over a suitably hexedited
file:

	# ./pgfsck -r 16559 kleptog website
	-- Detected database format 7.2
	-- Table pg_class(1259):Page 1:Tuple 0: Unknown type _aclitem (1034)
	-- Table pg_class(1259):Page 1:Tuple 49: Unknown type _aclitem (1034)
	-- Table website(16559):Page 0:Tuple 7: Tuple incorrect length (parsed data=57,length=1638)
	-- Table website(16559):Page 0:Tuple 44: Decoding tuple runs off end: 627338916 > 69
	-- Table website(16559):Page 0:Tuple 70: Bad tuple offset. Should be: 3784 <= 11592 < 8192

Currently the following features are not supported:

- Toasted / compressed tuples
- Checking indexes doesn't work (should it?)
- Views just produce empty output (because they are)
- Arrays don't work
- Since each type output has to be written, many types are not
  correctly output
- Split tables (1GB) are not supported past the first part.
- Some system tables in some versions have a strange layout. You may get
  many harmless warnings about the formats of pg_class, pg_attribute
  and/or pg_type. Most of these are basically because I don't know how
  they work, but with a bit of work some of these should be fixable.

Have a nice day,
--
Martijn van Oosterhout <[EMAIL PROTECTED]>  http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America
Re: [HACKERS] Vote on SET in aborted transaction
Thomas Lockhart <[EMAIL PROTECTED]> writes:
> Let me give you some examples. We might someday have nested
> transactions, or transactions which can be resumed from the point of
> failure. We *might* want to be able to affect recovery behaviors, and we
> *might* want to do so with something like
>	begin;
>	update foo...
>	update bar...
>
>	set blah to blah
>	update baz...
>	update bar...
>
>	end;

Sure, once we have savepoints or nested transactions I would expect SET
to work like that. The "alternative 1" should better be phrased as
"SETs should work the same way as regular SQL commands do".

I agree with your comment that it would be useful to look closely at the
list of settable variables to see whether any of them need different
semantics. Here's the list of everything that can be SET after backend
start (some of these require superuser privilege to set, but that seems
irrelevant):

	datestyle		timezone		XactIsoLevel
	client_encoding		server_encoding		seed
	session_authorization	enable_seqscan		enable_indexscan
	enable_tidscan		enable_sort		enable_nestloop
	enable_mergejoin	enable_hashjoin		ksqo
	geqo			debug_assertions	debug_print_query
	debug_print_parse	debug_print_rewritten	debug_print_plan
	debug_pretty_print	show_parser_stats	show_planner_stats
	show_executor_stats	show_query_stats	show_btree_build_stats
	explain_pretty_print	stats_command_string	stats_row_level
	stats_block_level	trace_notify		trace_locks
	trace_userlocks		trace_lwlocks		debug_deadlocks
	sql_inheritance		australian_timezones	password_encryption
	transform_null_equals	geqo_threshold		geqo_pool_size
	geqo_effort		geqo_generations	geqo_random_seed
	sort_mem		vacuum_mem		trace_lock_oidmin
	trace_lock_table	max_expr_depth		wal_debug
	commit_delay		commit_siblings		effective_cache_size
	random_page_cost	cpu_tuple_cost		cpu_index_tuple_cost
	cpu_operator_cost	geqo_selection_bias	client_min_messages
	default_transaction_isolation			dynamic_library_path
	search_path		server_min_messages

Right offhand, I am not seeing anything here for which there's a
compelling case not to roll it back on error. In fact, I have yet to
hear *any* plausible example of a variable that we would really
seriously want not to roll back on error.

> And btw, if we *are* going to put transaction semantics on all of our
> global variables (which is the context for starting this "SET"
> discussion, right? Is that really the context we are still in, even
> though you have phrased a much more general statement above?) then let's
> have the discussion on *HOW* we are going to accomplish that *BEFORE*
> deciding to make a semantic constraint on our language support.

Hardly necessary: we'll just make guc.c keep track of the
start-of-transaction values of all variables that have changed in the
current transaction, and restore them to that value upon transaction
abort. Doesn't seem like a big deal to me. We've got tons of other code
that does exactly the same sort of thing.

> Hmm, if we are going to use transaction semantics, then we should
> consider using our existing transaction mechanisms, and if we use our
> existing transaction mechanisms we should consider pushing these global
> variables into tables or in memory tables a la "temp tables".

Quite a few of the GUC settings are values that need to be set and used
during startup, before we have table access up and running. I do not
think that it's very practical to expect them to be accessed through
table access mechanisms.

> If we end up making changes and increasing constraints, then we should
> also expect some increased functionality as part of the scheme,
> specifically "SET extensibility".

It might well be a good idea to allow variables to be added to guc.c's
lists on-the-fly by the initialization routines of loadable modules.
But that's orthogonal to this discussion, IMHO.

			regards, tom lane
Re: [HACKERS] Returning text from stored procedures??
Steffen Nielsen <[EMAIL PROTECTED]> writes:
> Any Ideas why? Are there restriction on the size of text a stored
> procedure can return?

One gig ...

It's hard to guess what you're doing wrong when you haven't shown us
your code.

			regards, tom lane
Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE
I was thinking of something independent from the executor: simply a
variable that recommends (or not) the use of a particular index. It
could be obtained from the user, and so it could be improved (a factor
lower than 1) in the planner. How about something like this?

- Original Message -
From: "Bruce Momjian" <[EMAIL PROTECTED]>
To: "Luis Alberto Amigo Navarro" <[EMAIL PROTECTED]>
Cc: "Lincoln Yeoh" <[EMAIL PROTECTED]>; "Tom Lane" <[EMAIL PROTECTED]>;
    "mlw" <[EMAIL PROTECTED]>; "Andrew Sullivan" <[EMAIL PROTECTED]>;
    "PostgreSQL-development" <[EMAIL PROTECTED]>
Sent: Tuesday, April 23, 2002 6:42 PM
Subject: Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE

> Luis Alberto Amigo Navarro wrote:
> > Hi All.
> > I've been reading all the thread and I want to add a few points:
> >
> > You can set enable_seqscan=off in small or easy queries, but in large
> > queries an index can speed up parts of the query and slow down others,
> > so I think it is necessary, if you want Postgres to become a
> > widely-used DBMS, that the planner be able to decide accurately. In
> > the thread there is a point that might be useful: it would be very
> > interesting if the planner could learn from previous executions. There
> > could even be a warm-up policy to let the planner learn how the DB is
> > working. This info could be stored with the DB data, and could
> > statistically show how use of an index or seqscan works on every
> > column of the DB.
>
> Yes, I have always felt it would be good to feed back information from
> the executor to the optimizer to help with later estimates. Of course,
> I never figured out how to do it. :-)
[HACKERS] Parser translations and schemas
Here's today's tidbit for those who like to argue about nitty little
details of behavior ...

Presently, the parser (in particular gram.y) has quite a few special
transformations for certain type and function names. For example,

	You write		You get

	char(N)			bpchar
	trim(BOTH foo)		btrim(foo)

The question for the day is: should these transformations be applied to
schema-qualified names? And should the parser force the transformed
names to be looked up in the system schema (pg_catalog), or should it
allow them to be searched for using the regular namespace search path?

I want to make the following proposal:

1. Transformations are applied only to unqualified names. If you write
a qualified name then it is treated as a plain-vanilla identifier and
looked up in the catalogs without transformation, even if the name
component happens to match a name that would be transformed standing
alone.

2. If a transformation is applied then the resulting name will always
be forced to be looked up in the system schema; ie, the output will
effectively be "pg_catalog.something" not just "something".

Some examples:

	You write			You get

	char(N)				pg_catalog.bpchar
	pg_catalog.char			pg_catalog.char (not bpchar)
	real				pg_catalog.float4
	myschema.real			myschema.real (not float4)
	trim(BOTH foo)			pg_catalog.btrim(foo)
	pg_catalog.trim(BOTH foo)	an error (since the special
					production allowing BOTH won't
					be used)

I have a number of reasons for thinking that this is a reasonable way
to go.

Point one: transforming qualified names seems to violate the "principle
of least surprise". If I write myschema.real I would not expect that to
be converted to myschema.float4, especially if I weren't aware that
Postgres internally calls REAL "float4".

Point two: I don't believe that we need to do it to meet the letter of
the SQL spec. AFAICT the spec treats all the names of built-in types
and functions as keywords, not as names belonging to a system schema.
So special behavior is required for TRIM(foo) but not for
DEFINITION_SCHEMA.TRIM(foo).

Point three: if we do transform a name, then we are expecting a
particular system type or function to be selected, and we ought to
ensure that that happens; thus explicitly qualifying the output name
seems proper. Again, this seems less surprising than other
alternatives. If I have a datatype myschema.float4 that I've put into
the search path in front of pg_catalog, I think I'd be surprised to
have it get picked when I write REAL.

Another reason for doing it this way is that I think it's necessary for
reversibility. For example, consider what format_type should put out
when it's trying to write a special-cased type name. If it needs to
emit numeric(10,4) then it *cannot* stick "pg_catalog." on the front of
that --- the result wouldn't parse. (At least not unless we uglify the
grammar a whole lot more to allow pg_catalog.FOO everywhere that just
FOO currently has a special production.) So we need to make parsing
rules that guarantee that numeric(10,4) will be interpreted as
pg_catalog.numeric and not something else, regardless of the active
search path. On the other hand, a plain user datatype that happens to
be named "real" should be accessible as myschema.real without
interference from the real->float4 transformation.

A corner case that maybe requires more discussion is: what about type
and function names that are reserved per spec, but which we do not need
any special transformation for? For example, the spec thinks that
OCTET_LENGTH() is a keyword, but our implementation treats it as an
ordinary function name. I feel that the grammar should not prefix
"pg_catalog." to any name that it hasn't transformed or treated
specially in any way, even if that name is reserved per spec. Note that
this will not actually lead to any non-spec-compliant behavior as long
as one allows the system to search pg_catalog before any user-provided
schemas --- which is in fact the default behavior, as it's currently
set up.

Another point is that I believe that REAL should be transformed to
pg_catalog.float4, but the quoted identifier "real" should not be.
This would require a bit of surgery --- presently xlateSqlType is
applied to pretty much everything whether it's a quoted identifier or
not. But if we allow xlateSqlType to continue to work that way, then
user-schema names that happen to match one of its target names are
going to behave strangely. I think we will need to get rid of
xlateSqlType/xlateSqlFunc and instead implement all the name
transformations we want as special productions, so that the target
names are shown as keywords in keywords.c. Without this, ruleutils.c
will not have a clue that the user type name "real" needs to be quoted
to keep it from being transformed.

BTW: as the code s
Re: [HACKERS] Inefficient handling of LO-restore + Patch
On Wednesday, 24 April 2002 at 16:03, Bruce Momjian wrote:
> OK, I have applied the following patch to fix these warnings. However,
> I need Mario to confirm these are the right changes. Thanks.

I've checked it and it works fine, but the memcpy() prototype says it should take void pointers. Will this give errors with non-gcc compilers?

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote on SET in aborted transaction
Vote number 1 -- ROLL BACK

Bruce Momjian wrote:
>OK, would people please vote on how to handle SET in an aborted
>transaction?  This vote will allow us to resolve the issue and move
>forward if needed.
>
>In the case of:
>
>	SET x=1;
>	BEGIN;
>	SET x=2;
>	query_that_aborts_transaction;
>	SET x=3;
>	COMMIT;
>
>at the end, should 'x' equal:
>
>	1 - All SETs are rolled back in aborted transaction
>	2 - SETs are ignored after transaction abort
>	3 - All SETs are honored in aborted transaction
>	? - Have SETs vary in behavior depending on variable
>
>Our current behavior is 2.
>
>Please vote and I will tally the results.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] "make report"
Thomas Lockhart writes: > Right. The two areas which come to mind are integer availability and the > timezone support (as you might know we support *three* different time > zone models). At the moment, none of the developers know the features > supported on the platforms we claim to support. Which platforms do not > have int8 support still? "Still" is the wrong word. There used to be platforms with certain areas of trouble, and those platforms don't go away. But since you asked: QNX 4 and SCO OpenServer are known to lack 8 byte integers. > Which do not have time zone interfaces fitting > into the two "zonefull" styles? I'd like to know, but istm that the > people *with* the platforms could do this much more easily than those > without. What am I missing here?? I don't think polling users this way will yield reliable results. If you really want to find out, break something and see if someone complains. -- Peter Eisentraut [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] "make report"
It depends. QNX4 may be used with GCC, in which case it does have long long. I am not sure if that combination will play along with Postgres, but it should not be assumed impossible. - Original Message - From: "Peter Eisentraut" <[EMAIL PROTECTED]> To: "Thomas Lockhart" <[EMAIL PROTECTED]> Cc: "PostgreSQL Hackers" <[EMAIL PROTECTED]> Sent: Wednesday, April 24, 2002 1:14 PM Subject: Re: [HACKERS] "make report" > Thomas Lockhart writes: > > > Right. The two areas which come to mind are integer availability and the > > timezone support (as you might know we support *three* different time > > zone models). At the moment, none of the developers know the features > > supported on the platforms we claim to support. Which platforms do not > > have int8 support still? > > "Still" is the wrong word. There used to be platforms with certain areas > of trouble, and those platforms don't go away. > > But since you asked: QNX 4 and SCO OpenServer are known to lack 8 byte > integers. > > > Which do not have time zone interfaces fitting > > into the two "zonefull" styles? I'd like to know, but istm that the > > people *with* the platforms could do this much more easily than those > > without. What am I missing here?? > > I don't think polling users this way will yield reliable results. If you > really want to find out, break something and see if someone complains. > > -- > Peter Eisentraut [EMAIL PROTECTED] > > > ---(end of broadcast)--- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to [EMAIL PROTECTED] so that your > message can get through to the mailing list cleanly > ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Inefficient handling of LO-restore + Patch
Tom Lane writes:
> This could usefully be combined with the nearby thread about recording
> configuration options (started by Thomas).  I'd be inclined to make it
> a low-footprint affair where you do something like
>
> 	select compilation_options();
>
> and you get back a long textual list of var=value settings, say
>
> 	VERSION=7.3devel
> 	PLATFORM=hppa-hp-hpux10.20, compiled by GCC 2.95.3
> 	BLCKSZ=8192
> 	MULTIBYTE=yes
> 	etc etc etc etc

This assumes that compilation options only matter in the server and that only SQL users would be interested in them. In fact, compilation options affect client and utility programs as well, and it's not unusual to have a wild mix (if only unintentional).

IMHO, the best place to put this information is in the version output, as in:

	$ ./psql --version
	psql (PostgreSQL) 7.3devel
	contains support for: readline

An SQL interface in addition to that would be OK, too. But let's not dump everything into one SHOW command.

--
Peter Eisentraut   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Vote on SET in aborted transaction
On Wed, 24 Apr 2002, Michael Loftis wrote:

> Vote number 1 -- ROLL BACK

I agree.. Number 1 - ROLL BACK

> Bruce Momjian wrote:
>
> >OK, would people please vote on how to handle SET in an aborted
> >transaction?  This vote will allow us to resolve the issue and move
> >forward if needed.
> >
> >In the case of:
> >
> >	SET x=1;
> >	BEGIN;
> >	SET x=2;
> >	query_that_aborts_transaction;
> >	SET x=3;
> >	COMMIT;
> >
> >at the end, should 'x' equal:
> >
> >	1 - All SETs are rolled back in aborted transaction
> >	2 - SETs are ignored after transaction abort
> >	3 - All SETs are honored in aborted transaction
> >	? - Have SETs vary in behavior depending on variable
> >
> >Our current behavior is 2.
> >
> >Please vote and I will tally the results.
>
> ---(end of broadcast)---
> TIP 4: Don't 'kill -9' the postmaster

Vince.
--
==
Vince Vielhaber -- KA8CSH    email: [EMAIL PROTECTED]    http://www.pop4.net
56K Nationwide Dialup from $16.00/mo at Pop4 Networking
Online Campground Directory    http://www.camping-usa.com
Online Giftshop Superstore     http://www.cloudninegifts.com
==

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Parser translations and schemas
Tom Lane writes:

> You write                  You get
>
> char(N)                    pg_catalog.bpchar
> pg_catalog.char            pg_catalog.char (not bpchar)
> real                       pg_catalog.float4
> myschema.real              myschema.real (not float4)
> trim(BOTH foo)             pg_catalog.btrim(foo)
> pg_catalog.trim(BOTH foo)  an error (since the special production
>                            allowing BOTH won't be used)

Exactly my thoughts.

> A corner case that maybe requires more discussion is what about type and
> function names that are reserved per spec, but which we do not need any
> special transformation for?  For example, the spec thinks that
> OCTET_LENGTH() is a keyword, but our implementation treats it as an
> ordinary function name.  I feel that the grammar should not prefix
> "pg_catalog." to any name that it hasn't transformed or treated
> specially in any way, even if that name is reserved per spec.

I agree.

--
Peter Eisentraut   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] "make report"
Igor Kovalenko writes: > It depends. QNX4 may be used with GCC, in which case it does have long long. > I am not sure if that combination will play along with Postgres, but it > should not be assumed impossible. The point is, it should not be assumed possible. -- Peter Eisentraut [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Implement a .NET Data
Thanks Dave. I will create the project at gborg.postgresql.org on Friday :)

> You might want to look at
> http://gborg.postgresql.org.
>
> Regards, Dave.

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Inefficient handling of LO-restore + Patch
Peter Eisentraut <[EMAIL PROTECTED]> writes: > Tom Lane writes: >> select compilation_options(); > This assumes that compilation options only matter in the server and that > only SQL users would be interested in them. In fact, compilation options > affect client and utility programs as well, and it's not unusual to have a > wild mix (if only unintentional). Good point. It'd be worthwhile to have some way of extracting such information from the clients as well. > IMHO, the best place to put this information is in the version output, as > in: > $ ./psql --version > psql (PostgreSQL) 7.3devel > contains support for: readline Is that sufficient? The clients probably are not affected by quite as many config options as the server, but they still have a nontrivial list. (Multibyte, SSL, Kerberos come to mind at once.) I'd not like to see us assume that a one-line output format will do the job. A way to interrogate the libpq being used by psql might be good too. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] PostgreSQL index usage discussion.
We have had several threads about index usage, specifically when PostgreSQL has the choice of using one or not. There seem to be a few points of view:

(1) The planner and statistics need to improve, so that erroneously using an index (or not) happens less frequently or not at all.

(2) Use programmatic hints which allow coders to specify which indexes are used during a query. (ala Oracle)

(3) It is pretty much OK as-is, just use enable_seqscan=false in the query.

My point of view on this subject comes from personal experience. I had a database on which PostgreSQL would always (erroneously) choose not to use an index. Are my experiences typical? Probably not, but are experiences like it very common? I don't know, but we see enough "Why won't PostgreSQL use my index?" messages to at least conclude that it happens every now and then. In my experience, when it happens, it is very frustrating.

I think statement (1) is a good idea, but I think it is optimistic to expect that a statistical analysis of a table will contain enough information for all possible cases. Statement (2) would allow the flexibility needed, but as was pointed out, the hints may become wrong over time as the characteristics of the various tables change. Statement (3) is not good enough because disabling sequential scans affects the whole query: sub-queries which would correctly not use an index are forced to use one anyway.

My personal preference is that some more specific mechanism than enable_seqscan be provided for the DBA to assure an index is used. Working on the statistics and the planner is fine, but I suspect there will always be a strong argument for manual override in the exceptional cases where it will be needed.

What do you all think? What would be a good plan of attack?

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Vote on SET in aborted transaction
Bruce Momjian wrote:
>
> OK, would people please vote on how to handle SET in an aborted
> transaction?  This vote will allow us to resolve the issue and move
> forward if needed.
>
> In the case of:
>
> 	SET x=1;
> 	BEGIN;
> 	SET x=2;
> 	query_that_aborts_transaction;
> 	SET x=3;
> 	COMMIT;
>
> at the end, should 'x' equal:
>
> 	1 - All SETs are rolled back in aborted transaction
> 	2 - SETs are ignored after transaction abort
> 	3 - All SETs are honored in aborted transaction
> 	? - Have SETs vary in behavior depending on variable
>
> Our current behavior is 2.
>
> Please vote and I will tally the results.

Is it a vote in the first place? I will vote for the current behavior (2 + 3 + ?).

regards,
Hiroshi Inoue
	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote on SET in aborted transaction
Tom Lane wrote:
>
> Right offhand, I am not seeing anything here for which there's a
> compelling case not to roll it back on error.
>
> In fact, I have yet to hear *any* plausible example of a variable
> that we would really seriously want not to roll back on error.

Honestly, I don't understand what kind of example you expect. How about the following?

[The current schema is schema1]

	begin;
	create schema foo;
	set search_path = foo;
	create table t1 ();
	.
	[error occurs]
	rollback;
	insert into t1 select * from schema1.t1;

Should the search_path be put back in this case? As I mentioned already many times, it doesn't seem like a *should be* kind of thing.

regards,
Hiroshi Inoue
	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Vote totals for SET in aborted transaction
OK, the votes are in:

#1
	Lamar Owen
	Jan Wieck
	Tom Lane
	Bruce Momjian
	Joe Conway
	Curt Sampson
	Michael Loftis
	Vince Vielhaber
	Sander Steffann

#2
	Bradley McLean

#3

#?
	Thomas Lockhart
	Hiroshi Inoue

Looks like #1 is the clear winner.

---

Bruce Momjian wrote:
> OK, would people please vote on how to handle SET in an aborted
> transaction?  This vote will allow us to resolve the issue and move
> forward if needed.
>
> In the case of:
>
> 	SET x=1;
> 	BEGIN;
> 	SET x=2;
> 	query_that_aborts_transaction;
> 	SET x=3;
> 	COMMIT;
>
> at the end, should 'x' equal:
>
> 	1 - All SETs are rolled back in aborted transaction
> 	2 - SETs are ignored after transaction abort
> 	3 - All SETs are honored in aborted transaction
> 	? - Have SETs vary in behavior depending on variable
>
> Our current behavior is 2.
>
> Please vote and I will tally the results.
>
> --
> Bruce Momjian                        |  http://candle.pha.pa.us
> [EMAIL PROTECTED]                    |  (610) 853-3000
> + If your life is a hard drive,      |  830 Blythe Avenue
> + Christ can be your backup.         |  Drexel Hill, Pennsylvania 19026
>
> ---(end of broadcast)---
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

--
Bruce Momjian                        |  http://candle.pha.pa.us
[EMAIL PROTECTED]                    |  (610) 853-3000
+ If your life is a hard drive,      |  830 Blythe Avenue
+ Christ can be your backup.         |  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote:
> Tom Lane wrote:
> >
> > Right offhand, I am not seeing anything here for which there's a
> > compelling case not to roll it back on error.
> >
> > In fact, I have yet to hear *any* plausible example of a variable
> > that we would really seriously want not to roll back on error.
>
> Honestly I don't understand what kind of example you
> expect. How about the following?
>
> [The current schema is schema1]
>
> begin;
> create schema foo;
> set search_path = foo;
> create table t1 ();
> .
> [error occurs]
> rollback;
> insert into t1 select * from schema1.t1;
>
> Should the search_path be put back in this case?
> As I mentioned already many times, it doesn't seem
> like a *should be* kind of thing.

Sure it should! You gave an example of the need to roll back, because otherwise you would end up with an invalid search path "foo".

I still believe that rolling back is the only right thing to do. What if your application doesn't even know that some changes happened? Have a trigger that sets seqscan off, does some stuff and intends to reset it later again. Now it elog's out before that, so your application will have to live with this mis-setting on this pooled DB connection until the end? I don't think so!

Jan

--
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#                                                    [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote on SET in aborted transaction
Jan Wieck wrote:
>
> Hiroshi Inoue wrote:
> > Tom Lane wrote:
> > >
> > > Right offhand, I am not seeing anything here for which there's a
> > > compelling case not to roll it back on error.
> > >
> > > In fact, I have yet to hear *any* plausible example of a variable
> > > that we would really seriously want not to roll back on error.
> >
> > Honestly I don't understand what kind of example you
> > expect. How about the following?
> >
> > [The current schema is schema1]
> >
> > begin;
> > create schema foo;
> > set search_path = foo;
> > create table t1 ();
> > .
> > [error occurs]
> > rollback;
> > insert into t1 select * from schema1.t1;
> >
> > Should the search_path be put back in this case?
> > As I mentioned already many times, it doesn't seem
> > like a *should be* kind of thing.
>
> Sure it should! You gave an example of the need to roll
> back, because otherwise you would end up with an invalid
> search path "foo".

What's wrong with it? The insert command after *rollback* would fail. It seems the right thing to me. Otherwise the insert command would try to append the data of the table t1 to itself. The insert command is for copying schema1.t1 to foo.t1 in case the previous create schema command succeeded.

regards,
Hiroshi Inoue
	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue <[EMAIL PROTECTED]> writes:
> Honestly I don't understand what kind of example you
> expect. How about the following?

> [The current schema is schema1]
> begin;
> create schema foo;
> set search_path = foo;
> create table t1 ();
> .
> [error occurs]
> rollback;
> insert into t1 select * from schema1.t1;

> Should the search_path be put back in this case?

Sure it should be. Otherwise it's pointing at a nonexistent schema.

			regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote: > > OK, the votes are in: > > #1 > Lamar Owen > Jan Wieck > Tom Lane > Bruce Momjian > Joe Conway > Curt Sampson > Michael Loftis > Vince Vielhaber > Sander Steffann > > #2 > Bradley McLean > > > > #3 > > #? > Thomas Lockhart > Hiroshi Inoue > > Looks like #1 is the clear winner. I voted not only ? but also 2 and 3. And haven't I asked twice or so if it's a vote ? Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Vote totals for SET in aborted transaction
Hiroshi Inoue wrote: > Bruce Momjian wrote: > > > > OK, the votes are in: > > > > #1 > > Lamar Owen > > Jan Wieck > > Tom Lane > > Bruce Momjian > > Joe Conway > > Curt Sampson > > Michael Loftis > > Vince Vielhaber > > Sander Steffann > > > > #2 > > Bradley McLean > > > > > > > > #3 > > > > #? > > Thomas Lockhart > > Hiroshi Inoue > > > > Looks like #1 is the clear winner. > > I voted not only ? but also 2 and 3. > And haven't I asked twice or so if it's a vote ? Yes, it is a vote, and now that we see how everyone feels, we can decide what to do. Hiroshi, you can't vote for 2, 3, and ?. Please pick one. I picked '?' for you because it seemed the closest to your intent. I can put you down for 1/3 of a vote for all three if you wish. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote: > > Hiroshi Inoue wrote: > > Bruce Momjian wrote: > > > > > > OK, the votes are in: > > > > > > #1 > > > Lamar Owen > > > Jan Wieck > > > Tom Lane > > > Bruce Momjian > > > Joe Conway > > > Curt Sampson > > > Michael Loftis > > > Vince Vielhaber > > > Sander Steffann > > > > > > #2 > > > Bradley McLean > > > > > > > > > > > > #3 > > > > > > #? > > > Thomas Lockhart > > > Hiroshi Inoue > > > > > > Looks like #1 is the clear winner. > > > > I voted not only ? but also 2 and 3. > > And haven't I asked twice or so if it's a vote ? > > Yes, it is a vote, and now that we see how everyone feels, we can > decide what to do. > > Hiroshi, you can't vote for 2, 3, and ?. Why ? I don't think the items are exclusive. regards, Hiroshi Inoue ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote totals for SET in aborted transaction
Hiroshi Inoue wrote: > Bruce Momjian wrote: > > > > Hiroshi Inoue wrote: > > > Bruce Momjian wrote: > > > > > > > > OK, the votes are in: > > > > > > > > #1 > > > > Lamar Owen > > > > Jan Wieck > > > > Tom Lane > > > > Bruce Momjian > > > > Joe Conway > > > > Curt Sampson > > > > Michael Loftis > > > > Vince Vielhaber > > > > Sander Steffann > > > > > > > > #2 > > > > Bradley McLean > > > > > > > > > > > > > > > > #3 > > > > > > > > #? > > > > Thomas Lockhart > > > > Hiroshi Inoue > > > > > > > > Looks like #1 is the clear winner. > > > > > > I voted not only ? but also 2 and 3. > > > And haven't I asked twice or so if it's a vote ? > > > > Yes, it is a vote, and now that we see how everyone feels, we can > > decide what to do. > > > > Hiroshi, you can't vote for 2, 3, and ?. > > Why ? > I don't think the items are exclusive. Well, 2 says roll back only after transaction aborts, 3 says honor all SET's, and ? says choose the behavior depending on the variable. How can you have 2, 3, and ?. Seems ? is the catch-all vote because it doesn't predefine the same behavior for all variables. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote: > > > > > > > > > I voted not only ? but also 2 and 3. > > > > And haven't I asked twice or so if it's a vote ? > > > > > > Yes, it is a vote, and now that we see how everyone feels, we can > > > decide what to do. > > > > > > Hiroshi, you can't vote for 2, 3, and ?. > > > > Why ? > > I don't think the items are exclusive. > > Well, 2 says roll back only after transaction aborts, Sorry for my poor understanding. Isn't it 1 ? regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote totals for SET in aborted transaction
Hiroshi Inoue wrote:
> Bruce Momjian wrote:
> >
> > > > > I voted not only ? but also 2 and 3.
> > > > > And haven't I asked twice or so if it's a vote ?
> > > >
> > > > Yes, it is a vote, and now that we see how everyone feels, we can
> > > > decide what to do.
> > > >
> > > > Hiroshi, you can't vote for 2, 3, and ?.
> > >
> > > Why ?
> > > I don't think the items are exclusive.
> >
> > Well, 2 says roll back only after transaction aborts,
>
> Sorry for my poor understanding.
> Isn't it 1 ?

OK, original email attached.

	1 rolls back all SETs in an aborted transaction.
	2 ignores SETs after transaction aborts, but SETs before the transaction aborted are honored.
	3 honors all SETs.

---

In the case of:

	SET x=1;
	BEGIN;
	SET x=2;
	query_that_aborts_transaction;
	SET x=3;
	COMMIT;

at the end, should 'x' equal:

	1 - All SETs are rolled back in aborted transaction
	2 - SETs are ignored after transaction abort
	3 - All SETs are honored in aborted transaction
	? - Have SETs vary in behavior depending on variable

Our current behavior is 2.

--
Bruce Momjian                        |  http://candle.pha.pa.us
[EMAIL PROTECTED]                    |  (610) 853-3000
+ If your life is a hard drive,      |  830 Blythe Avenue
+ Christ can be your backup.         |  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote:

>What's wrong with it? The insert command after *rollback*
>would fail. It seems the right thing to me. Otherwise
>the insert command would try to append the data of the
>table t1 to itself. The insert command is for copying
>schema1.t1 to foo.t1 in case the previous create schema
>command succeeded.

Exactly: this example shows exactly why SETs should be part of the transaction and roll back. Heck, the insert may not even fail after all, and would insert into the wrong schema. If the insert depends on the schema create succeeding, it should be in the same transaction. (IE it would get rolled back or not happen at all)

>regards,
>Hiroshi Inoue
>	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] PostgreSQL index usage discussion.
* mlw ([EMAIL PROTECTED]) [020424 18:51]: > > (2) Use programmatic hints which allow coders specify which indexes are used > during a query. (ala Oracle) We would certainly use this if it were available. Hopefully not to shoot ourselves in the foot, but for the rather common case of having a small set of important predefined queries that run over data sets that neither grow significantly nor change in characteristics (for example, a table of airline routes and fares, with a few million rows). We want to squeeze every last bit of performance out of certain queries, and we're willing to spend the time to verify that the manual tuning beats the planner. > What do you all think? What would be a good plan of attack? I dunno. If someone comes up with one that I can reasonably contribute to, I will. -Brad ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote:
>
> Hiroshi Inoue wrote:
> > Bruce Momjian wrote:
> > >
> > > > > > I voted not only ? but also 2 and 3.
> > > > > > And haven't I asked twice or so if it's a vote ?
> > > > >
> > > > > Yes, it is a vote, and now that we see how everyone feels, we can
> > > > > decide what to do.
> > > > >
> > > > > Hiroshi, you can't vote for 2, 3, and ?.
> > > >
> > > > Why ?
> > > > I don't think the items are exclusive.
> > >
> > > Well, 2 says roll back only after transaction aborts,
> >
> > Sorry for my poor understanding.
> > Isn't it 1 ?
>
> OK, original email attached.  1 rolls back all SETs in an aborted
> transaction.
>
> 2 ignores SETs after transaction aborts, but SETs before
> the transaction aborted are honored.

Must I understand this from your previous posting ("2 says roll back only after transaction aborts") or from the original posting? What I understood was that 2 only says that SET fails between a failure and the subsequent ROLLBACK call.

regards,
Hiroshi Inoue
	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Sequential Scan Read-Ahead
On Wed, 24 Apr 2002, Bruce Momjian wrote:

> We expect the file system to do read-aheads during a sequential scan.
> This will not happen if someone else is also reading buffers from that
> table in another place.

Right. The essential difficulties are, as I see it:

1. Not all systems do readahead.

2. Even systems that do do it cannot always reliably detect that they need to.

3. Even when the read-ahead does occur, you're still doing more syscalls, and thus more expensive kernel/userland transitions, than you have to.

Has anybody considered writing a storage manager that uses raw partitions and deals with its own buffer caching? This has the potential to be a lot more efficient, since the database server knows much more about its workload than the operating system can guess.

cjs
--
Curt Sampson <[EMAIL PROTECTED]>   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote totals for SET in aborted transaction
Hiroshi Inoue wrote: > Bruce Momjian wrote: > > > > Hiroshi Inoue wrote: > > > Bruce Momjian wrote: > > > > > > > > > > > > > > > > > > I voted not only ? but also 2 and 3. > > > > > > > And haven't I asked twice or so if it's a vote ? > > > > > > > > > > > > Yes, it is a vote, and now that we see how everyone feels, we can > > > > > > decide what to do. > > > > > > > > > > > > Hiroshi, you can't vote for 2, 3, and ?. > > > > > > > > > > Why ? > > > > > I don't think the items are exclusive. > > > > > > > > Well, 2 says roll back only after transaction aborts, > > > > > > Sorry for my poor understanding. > > > Isn't it 1 ? > > > > OK, original email attached. 1 rolls back all SETs in an aborted > > transaction. > > > 2 ignores SETs after transaction aborts, but SETs before > > the transaction aborted are honored. > > Must I understand this from your previous posting > (2 says roll back only after transaction aborts,) > or original posting ? What I understood was 2 only > says that SET fails between a failure and the > subsequenct ROLLBACK call. Yes, 2 says that SET fails between failure query and COMMIT/ROLLBACK call, which is current behavior. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Vote on SET in aborted transaction
Michael Loftis wrote:
>
> Hiroshi Inoue wrote:
>
> >What's wrong with it? The insert command after *rollback*
> >would fail. It seems the right thing to me. Otherwise
> >the insert command would try to append the data of the
> >table t1 to itself. The insert command is for copying
> >schema1.t1 to foo.t1 in case the previous create schema
> >command succeeded.
>
> Exactly: this example shows exactly why SETs should be part of the
> transaction and roll back. Heck, the insert may not even fail after all,
> and would insert into the wrong schema. If the insert depends on the
> schema create succeeding, it should be in the same transaction. (IE it
> would get rolled back or not happen at all)

Where's the restriction that all objects in a schema must be created in a transaction? Each user has his own reasons and would need various kinds of command call sequences. What I've mainly insisted is that what to do with errors is the user's responsibility, but I've never seen agreement on it. So my current understanding is that you all are thinking what to do with errors is the system's responsibility. Then no matter how users call commands, the DBMS must behave appropriately, mustn't it?

regards,
Hiroshi Inoue
	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Sequential Scan Read-Ahead
Curt Sampson wrote: > On Wed, 24 Apr 2002, Bruce Momjian wrote: > > > We expect the file system to do read-aheads during a sequential scan. > > This will not happen if someone else is also reading buffers from that > > table in another place. > > Right. The essential difficulties are, as I see it: > > 1. Not all systems do readahead. If they don't, that isn't our problem. We expect it to be there, and if it isn't, the vendor/kernel is at fault. > 2. Even systems that do do it cannot always reliably detect that > they need to. Yes, a seek() in the file will turn off read-ahead. Grabbing bigger chunks would help here, but if you have two people already reading from the same file, grabbing bigger chunks of the file may not be optimal. > 3. Even when the read-ahead does occur, you're still doing more > syscalls, and thus more expensive kernel/userland transitions, than > you have to. I would guess the performance impact is minimal. > Has anybody considered writing a storage manager that uses raw > partitions and deals with its own buffer caching? This has the potential > to be a lot more efficient, since the database server knows much more > about its workload than the operating system can guess. We have talked about it, but rejected it. Look in TODO.detail in optimizer and performance for 'raw'. Also interesting info there about optimizer cost estimates we have been talking about. Specifically see: http://candle.pha.pa.us/mhonarc/todo.detail/performance/msg9.html Also see: http://candle.pha.pa.us/mhonarc/todo.detail/optimizer/msg00011.html -- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote: > Michael Loftis wrote: > > > > Hiroshi Inoue wrote: > > > > >What's wrong with it ? The insert command after *rollback* > > >would fail. It seems the right thing to me. Otherwise > > >the insert command would try to append the data of the > > >table t1 to itself. The insert command is for copying > > >schema1.t1 to foo.t1 in case the previous create schema > > >command suceeded. > > > > > Exactly, in this example shows exactly why SETs should be part of the > > transaction and roll back. Heck the insert may not even fail after all > > anyway and insert into the wrong schema. If the insert depends on the > > schema create succeeding it should be in the same transaction. (IE it > > would get rolled back or not happen at all) > > Where's the restriction that all objects in a schema > must be created in an transaction ? Each user has his > reason and would need various kind of command call sequence. > What I've mainly insisted is what to do with errors is > users' responsibilty but I've never seen the agreement > for it. So my current understanding is you all > are thinking what to do with errors is system's > responsibilty. Then no matter how users call commands > the dbms must behave appropriately, mustn't it ? Hiroshi, we need a psql solution too. People are feeding query files into psql all the time and we should have an appropriate behavior for them. I now understand your point that from a ODBC perspective, you may not want SETs rolled back and you would rather ODBC handle what to do with SETs. Not sure I like pushing that job off to the application programmer, but I think I see your point. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote: > Hiroshi Inoue wrote: > > Bruce Momjian wrote: > > > > > > Hiroshi Inoue wrote: > > > > Bruce Momjian wrote: > > > > > > > > > > > > > > > > > > > > > I voted not only ? but also 2 and 3. > > > > > > > > And haven't I asked twice or so if it's a vote ? > > > > > > > > > > > > > > Yes, it is a vote, and now that we see how everyone feels, we can > > > > > > > decide what to do. > > > > > > > > > > > > > > Hiroshi, you can't vote for 2, 3, and ?. > > > > > > > > > > > > Why ? > > > > > > I don't think the items are exclusive. > > > > > > > > > > Well, 2 says roll back only after transaction aborts, > > > > > > > > Sorry for my poor understanding. > > > > Isn't it 1 ? > > > > > > OK, original email attached. 1 rolls back all SETs in an aborted > > > transaction. > > > > > 2 ignores SETs after transaction aborts, but SETs before > > > the transaction aborted are honored. > > > > Must I understand this from your previous posting > > (2 says roll back only after transaction aborts,) > > or original posting ? What I understood was 2 only > > says that SET fails between a failure and the > > subsequenct ROLLBACK call. > > Yes, 2 says that SET fails between failure query and COMMIT/ROLLBACK > call, which is current behavior. What about a SET variable that controls the behaviour of SET variables :-) Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote: >Michael Loftis wrote: > >>Hiroshi Inoue wrote: >> >>>What's wrong with it ? The insert command after *rollback* >>>would fail. It seems the right thing to me. Otherwise >>>the insert command would try to append the data of the >>>table t1 to itself. The insert command is for copying >>>schema1.t1 to foo.t1 in case the previous create schema >>>command succeeded. >>> >>Exactly, this example shows exactly why SETs should be part of the >>transaction and roll back. Heck, the insert may not even fail after all >>anyway and insert into the wrong schema. If the insert depends on the >>schema create succeeding it should be in the same transaction. (IE it >>would get rolled back or not happen at all) >> > >Where's the restriction that all objects in a schema >must be created in a transaction ? Each user has his >reasons and would need various kinds of command call sequences. >What I've mainly insisted is that what to do with errors is >the users' responsibility, but I've never seen agreement >on it. So my current understanding is you all >are thinking what to do with errors is the system's >responsibility. Then no matter how users call commands >the dbms must behave appropriately, mustn't it ? > IMHO as a user and developer it's more important to behave consistently. A rollback should cause everything inside of a transaction block to roll back. If you need to keep something then it should either be done in its own transaction, or outside of an explicit transaction entirely. There is no restriction. The system is handling an error in the way instructed by the user, either ROLLBACK or COMMIT. If you COMMIT with errors, it's your problem. But if you ask the system to ROLLBACK, it's the user's expectation that the DBMS will ROLLBACK. Not ROLLBACK this and that, but leave another thing alone. You say BEGIN ... COMMIT you expect a COMMIT, you say BEGIN ... ROLLBACK you expect a ROLLBACK. You say BEGIN ...
END the DBMS should 'do the right thing' (IE COMMIT if successful, ROLLBACK if not). That's the behaviour I'd expect from ANY transactional system. The user will (and rightfully so) expect a ROLLBACK to do just that for everything. Yes this will break the way things work currently, but on the whole, and going forward, it makes the system consistent. Right now we roll back SELECTs, CREATEs, UPDATEs, etc., but not SETs (or at least from what I can tell that's what we do.) I understand what you're saying Hiroshi-san, but really, it's a very weak reason. If you (as a programmer/developer) do something like in your earlier example (perform an insert after ROLLBACK) then you know an error occurred, and it's your own fault for inserting into the wrong table outside of the transaction. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote: > > Hiroshi Inoue wrote: > > > > Must I understand this from your previous posting > > (2 says roll back only after transaction aborts,) > > or the original posting ? What I understood was 2 only > > says that SET fails between a failure and the > > subsequent ROLLBACK call. > > Yes, 2 says that SET fails between the failing query and the COMMIT/ROLLBACK > call, which is current behavior. Oh I see. It was my mistake to have participated in this vote. I'm not qualified in the first place because I wasn't able to understand your vote list. regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote: > > Sure should it! You gave an example for the need to roll > > back, because > > otherwise you would end up with an invalid > > search path "foo". > > What's wrong with it ? The insert command after *rollback* > would fail. It seems the right thing to me. Otherwise > the insert command would try to append the data of the > table t1 to itself. The insert command is for copying > schema1.t1 to foo.t1 in case the previous create schema > command suceeded. Wrong about your entire example is that the rollback is sheer wrong placed to make up your case ;-p There is absolutely no need to put the insert outside of the transaction that is intended to copy schema1.t1 to foo.t1. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] referential integrity problem
It seems we can create a foreign key using the REFERENCES privilege but cannot drop the table if its owner is not the same as that of the referenced table. Is this a feature or a bug? -- create a table as user foo \c - foo create table t1(i int primary key); -- grant references privilege to user bar grant references on t1 to bar; -- create a table as user bar \c - bar create table t2(i int references t1); -- cannot drop t2 as user bar? drop table t2; NOTICE: DROP TABLE implicitly drops referential integrity trigger from table "t1" ERROR: t1: Must be table owner. -- Tatsuo Ishii ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote on SET in aborted transaction
Bruce Momjian wrote: >Hiroshi, we need a psql solution too. People are feeding query files >into psql all the time and we should have an appropriate behavior for >them. > >I now understand your point that from an ODBC perspective, you may not >want SETs rolled back and you would rather ODBC handle what to do with >SETs. Not sure I like pushing that job off to the application >programmer, but I think I see your point. > Ahhh, Hiroshi is talking from the aspect of ODBC? Well, that's an ODBC issue, and should be handled by the ODBC driver. Compliance with the ODBC spec (or non-compliance) is not the issue of PostgreSQL proper. That's the issue of the ODBC driver and its maintainers (sorry if I'm sounding like a bastard but heh). If we start catering to all the different driver layers then we'll end up with a huge mess. What we're 'catering' to is the SQLxx specs, and the expectations of a user when running and developing programs, am I right? ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote on SET in aborted transaction
Michael Loftis wrote: > > Hiroshi Inoue wrote: > > >Where's the restriction that all objects in a schema > >must be created in an transaction ? Each user has his > >reason and would need various kind of command call sequence. > >What I've mainly insisted is what to do with errors is > >users' responsibilty but I've never seen the agreement > >for it. So my current understanding is you all > >are thinking what to do with errors is system's > >responsibilty. Then no matter how users call commands > >the dbms must behave appropriately, mustn't it ? > > > IMHO as a user and developer it's more important to behave consistently. > A rollback should cause everything inside of a transaciton block to > rollback. Where does the *should* come from ? The standard says that changes to the database should be put back but doesn't say everything should be put back. regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Vote on SET in aborted transaction
Jan Wieck wrote: > > Hiroshi Inoue wrote: > > > Sure should it! You gave an example for the need to roll > > > back, because > > > otherwise you would end up with an invalid > > > search path "foo". > > > > What's wrong with it ? The insert command after *rollback* > > would fail. It seems the right thing to me. Otherwise > > the insert command would try to append the data of the > > table t1 to itself. The insert command is for copying > > schema1.t1 to foo.t1 in case the previous create schema > > command succeeded. > > Wrong about your entire example is that the rollback is sheer > wrong placed to make up your case ;-p Is this issue about the wrong (not preferable?) sequence of calls ? Please don't miss the point. regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Vote on SET in aborted transaction
Bruce Momjian wrote: > > Hiroshi Inoue wrote: > Hiroshi, we need a psql solution too. People are feeding query files > into psql all the time and we should have an appropriate behavior for > them. What are you expecting from psql for, e.g., the following wrong(?) example ? [The current schema is schema1] begin; create schema foo; set search_path = foo; create table t1 (); [error occurs] commit; insert into t1 select * from schema1.t1; regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Vote on SET in aborted transaction
Michael Loftis wrote: > > Bruce Momjian wrote: > > >Hiroshi, we need a psql solution too. People are feeding query files > >into psql all the time and we should have an appropriate behavior for > >them. > > > >I now understand your point that from a ODBC perspective, you may not > >want SETs rolled back and you would rather ODBC handle what to do with > >SETs. Not sure I like pushing that job off to the application > >programmer, but I think I see your point. > > > > Ahhh Hiroshi is talkign formt he aspect of ODBC? Well, thats an ODBC > issue, should be handled by the ODBC driver. No. regards, Hiroshi Inoue ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote: > Bruce Momjian wrote: > > > > Hiroshi Inoue wrote: > > Hiroshi, we need a psql solution too. People are feeding query files > > into psql all the time and we should have an appropriate behavior for > > them. > > What are you expecting for psql e.g. the following > wrong(?) example ? > > [The curren schema is schema1] > begin; > create schema foo; > set search_path = foo; > create table t1 (); [error occurs] > commit; > insert into t1 select * from schema1.t1; I am expecting the INSERT will use the search_path value that existed before the error transaction began. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote on SET in aborted transaction
Bruce Momjian wrote: > > Hiroshi Inoue wrote: > > > > What are you expecting for psql e.g. the following > > wrong(?) example ? > > > > [The curren schema is schema1] > > begin; > > create schema foo; > > set search_path = foo; > > create table t1 (); [error occurs] > > commit; > > insert into t1 select * from schema1.t1; > > I am expecting the INSERT will use the search_path value that existed > before the error transaction began. > So you see foo.t1 which is a copy of schema1.t1 if all were successful and you may be able to see the doubled schema1.t1 in case of errors. regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Inefficient handling of LO-restore + Patch
Mario Weilguni wrote: > On Wednesday, 24 April 2002 16:03, Bruce Momjian wrote: > > OK, I have applied the following patch to fix these warnings. However, > > I need Mario to confirm these are the right changes. Thanks. > > I've checked it and it works fine, but the memcpy() prototype says it should be > void pointers. Will this give errors with non-gcc compilers? No, it is fine. Anything can be cast _to_ a void pointer. You just can't do arithmetic on them. Are you sure you want to use 'void *' in your code? Looking at the backend large object code, I see char *: extern int inv_read(LargeObjectDesc *obj_desc, char *buf, int nbytes); extern int inv_write(LargeObjectDesc *obj_desc, char *buf, int nbytes); I guess my point is that these are just streams of bytes, _not_ really streams of items of unknown length. We know the length, and the unit is char. This may simplify the code. -- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote: > Bruce Momjian wrote: > > > > Hiroshi Inoue wrote: > > > > > > What are you expecting from psql for, e.g., the following > > > wrong(?) example ? > > > > > > [The current schema is schema1] > > > begin; > > > create schema foo; > > > set search_path = foo; > > > create table t1 (); [error occurs] > > > commit; > > > insert into t1 select * from schema1.t1; > > > > I am expecting the INSERT will use the search_path value that existed > > before the error transaction began. > > > > So you see foo.t1 which is a copy of schema1.t1 > if all were successful and you may be able to see > the doubled schema1.t1 in case of errors. Yes, I think that is how it would behave. If you don't roll back 'set search_path', you are pointing to a non-existent schema. Probably the proper thing here would be to have the INSERT in the transaction too. -- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Vote on SET in aborted transaction
Bruce Momjian wrote: > > > > > > > > > What are you expecting for psql e.g. the following > > > > wrong(?) example ? > > > > > > > > [The curren schema is schema1] > > > > begin; > > > > create schema foo; > > > > set search_path = foo; > > > > create table t1 (); [error occurs] > > > > commit; > > > > insert into t1 select * from schema1.t1; > > > > > > I am expecting the INSERT will use the search_path value that existed > > > before the error transaction began. > > > > > > > So you see foo.t1 which is a copy of schema1.t1 > > if all were successful and you may be able to see > > the doubled schema1.t1 in case of errors. > > Yes, I think that is how it would behave. If you don't roll back 'set > search_path', you are pointing to a non-existant schema. OK I see your standpoint. If Tom agrees with Bruce I don't object any more. regards, Hiroshi Inoue ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Sequential Scan Read-Ahead
On Wed, 24 Apr 2002, Bruce Momjian wrote: > > 1. Not all systems do readahead. > > If they don't, that isn't our problem. We expect it to be there, and if > it isn't, the vendor/kernel is at fault. It is your problem when another database kicks Postgres' ass performance-wise. And at that point, *you're* at fault. You're the one who's knowingly decided to do things inefficiently. Sorry if this sounds harsh, but this, "Oh, someone else is to blame" attitude gets me steamed. It's one thing to say, "We don't support this." That's fine; there are often good reasons for that. It's a completely different thing to say, "It's an unrelated entity's fault we don't support this." At any rate, relying on the kernel to guess how to optimise for the workload will never work as well as the software that knows the workload doing the optimization. The lack of support thing is no joke. Sure, lots of systems nowadays support unified buffer cache and read-ahead. But how many, besides Solaris, support free-behind, which is also very important to avoid blowing out your buffer cache when doing sequential reads? And who at all supports read-ahead for reverse scans? (Or does Postgres not do those, anyway? I can see the support is there.) And even when the facilities are there, you create problems by using them. Look at the OS buffer cache, for example. Not only do we lose efficiency by using two layers of caching, but (as people have pointed out recently on the lists), the optimizer can't even know how much or what is being cached, and thus can't make decisions based on that. > Yes, a seek() in the file will turn off read-ahead. Grabbing bigger chunks > would help here, but if you have two people already reading from the > same file, grabbing bigger chunks of the file may not be optimal. Grabbing bigger chunks is always optimal, AFAICT, if they're not *too* big and you use the data. A single 64K read takes very little longer than a single 8K read. > > 3. 
Even when the read-ahead does occur, you're still doing more > > syscalls, and thus more expensive kernel/userland transitions, than > > you have to. > > I would guess the performance impact is minimal. If it were minimal, people wouldn't work so hard to build multi-level thread systems, where multiple userland threads are scheduled on top of kernel threads. However, it does depend on how much CPU your particular application is using. You may have it to spare. > http://candle.pha.pa.us/mhonarc/todo.detail/performance/msg9.html Well, this message has some points in it that I feel are just incorrect. 1. It is *not* true that you have no idea where data is when using a storage array or other similar system. While you certainly ought not worry about things such as head positions and so on, it's been a given for a long, long time that two blocks that have close index numbers are going to be close together in physical storage. 2. Raw devices are quite standard across Unix systems (except in the unfortunate case of Linux, which I think has been remedied, hasn't it?). They're very portable, and have just as well--if not better--defined write semantics as a filesystem. 3. My observations of OS performance tuning over the past six or eight years contradict the statement, "There's a considerable cost in complexity and code in using "raw" storage too, and it's not a one off cost: as the technologies change, the "fast" way to do things will change and the code will have to be updated to match." While optimizations have been removed over the years the basic optimizations (order reads by block number, do larger reads rather than smaller, cache the data) have remained unchanged for a long, long time. 4. "Better to leave this to the OS vendor where possible, and take advantage of the tuning they do." Well, sorry guys, but have a look at the tuning they do. 
It hasn't changed in years, except to remove now-unnecessary complexity related to really, really old and slow disk devices, and to add a few things that guess at the workload but still do a worse job than if the workload generator just did its own optimisations in the first place. > http://candle.pha.pa.us/mhonarc/todo.detail/optimizer/msg00011.html Well, this one, with statements like "Postgres does have control over its buffer cache," I don't know what to say. You can interpret the statement however you like, but in the end Postgres has very little control at all over how data is moved between memory and disk. BTW, please don't take me as saying that all control over physical IO should be done by Postgres. I just think that Postgres could do a better job of managing data transfer between disk and memory than the OS can. The rest of the things (using raw partitions, read-ahead, free-behind, etc.) just drop out of that one idea. cjs -- Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://
Re: [HACKERS] Sequential Scan Read-Ahead
Curt Sampson <[EMAIL PROTECTED]> writes: > Grabbing bigger chunks is always optimal, AFAICT, if they're not > *too* big and you use the data. A single 64K read takes very little > longer than a single 8K read. Proof? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Sequential Scan Read-Ahead
> Curt Sampson <[EMAIL PROTECTED]> writes: > > Grabbing bigger chunks is always optimal, AFAICT, if they're not > > *too* big and you use the data. A single 64K read takes very little > > longer than a single 8K read. > > Proof? A long time ago I tested with a 32k block size and got a 1.5-2x speedup compared with the ordinary 8k block size in the sequential scan case. FYI, in case it helps. -- Tatsuo Ishii ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Sequential Scan Read-Ahead
Well, this is a very interesting email. Let me comment on some points. --- Curt Sampson wrote: > On Wed, 24 Apr 2002, Bruce Momjian wrote: > > > > 1. Not all systems do readahead. > > > > If they don't, that isn't our problem. We expect it to be there, and if > > it isn't, the vendor/kernel is at fault. > > It is your problem when another database kicks Postgres' ass > performance-wise. > > And at that point, *you're* at fault. You're the one who's knowingly > decided to do things inefficiently. It is just hard to imagine an OS not doing read-ahead, at least in simple cases. > Sorry if this sounds harsh, but this, "Oh, someone else is to blame" > attitude gets me steamed. It's one thing to say, "We don't support > this." That's fine; there are often good reasons for that. It's a > completely different thing to say, "It's an unrelated entity's fault we > don't support this." Well, we are guilty of trying to push as much as possible on to other software. We do this for portability reasons, and because we think our time is best spent dealing with db issues, not issues that can be dealt with by other existing software, as long as the software is decent. > At any rate, relying on the kernel to guess how to optimise for > the workload will never work as well as the software that > knows the workload doing the optimization. Sure, that is certainly true. However, it is hard to know what the future will hold even if we had perfect knowledge of what was happening in the kernel. We don't know who else is going to start doing I/O once our I/O starts. We may have a better idea with kernel knowledge, but we still don't know 100% what will be cached. > The lack of support thing is no joke. Sure, lots of systems nowadays > support unified buffer cache and read-ahead. But how many, besides > Solaris, support free-behind, which is also very important to avoid We have free-behind on our list. 
I think LRU-K will do this quite well and be a nice general solution for more than just sequential scans. > blowing out your buffer cache when doing sequential reads? And who > at all supports read-ahead for reverse scans? (Or does Postgres > not do those, anyway? I can see the support is there.) > > And even when the facilities are there, you create problems by > using them. Look at the OS buffer cache, for example. Not only do > we lose efficiency by using two layers of caching, but (as people > have pointed out recently on the lists), the optimizer can't even > know how much or what is being cached, and thus can't make decisions > based on that. Again, are you going to know 100% anyway? > > > Yes, seek() in file will turn off read-ahead. Grabbing bigger chunks > > would help here, but if you have two people already reading from the > > same file, grabbing bigger chunks of the file may not be optimal. > > Grabbing bigger chunks is always optimal, AFICT, if they're not > *too* big and you use the data. A single 64K read takes very little > longer than a single 8K read. There may be validity in this. It is easy to do (I think) and could be a win. > > > 3. Even when the read-ahead does occur, you're still doing more > > > syscalls, and thus more expensive kernel/userland transitions, than > > > you have to. > > > > I would guess the performance impact is minimal. > > If it were minimal, people wouldn't work so hard to build multi-level > thread systems, where multiple userland threads are scheduled on > top of kernel threads. > > However, it does depend on how much CPU your particular application > is using. You may have it to spare. I assume those apps are doing tons of kernel calls. I don't think we really do that many. > > http://candle.pha.pa.us/mhonarc/todo.detail/performance/msg9.html > > Well, this message has some points in it that I feel are just incorrect. > > 1. 
It is *not* true that you have no idea where data is when > using a storage array or other similar system. While you > certainly ought not worry about things such as head positions > and so on, it's been a given for a long, long time that two > blocks that have close index numbers are going to be close > together in physical storage. SCSI drivers, for example, are pretty smart. Not sure we can take advantage of that from user-land I/O. > 2. Raw devices are quite standard across Unix systems (except > in the unfortunate case of Linux, which I think has been > remedied, hasn't it?). They're very portable, and have just as > well--if not better--defined write semantics as a filesystem. Yes, but we are seeing some db's moving away from raw I/O. Our performance numbers beat most of the big db's already, so we must be doing something right. In fact, our big failing is more is missing features and limitations of our db, rather than performance. > 3. My observations of OS performance tu
Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE
On Wed, 24 Apr 2002, mlw wrote: > I am not arguing about whether or not they do it, I am saying it is > not always possible. I/O requests do not remain in the queue waiting for > reordering indefinitely. It doesn't matter. When they go out to the disk they go out in order. On every Unix-based OS I know of, and Novell Netware, if you submit a single read request for consecutive blocks, those blocks *will* be read sequentially, no matter what the system load. So to get back to the original argument: > > >The supposed advantage of a sequential read over a random read, in > > >an active multitasking system, is a myth. If you are executing one > > >query and the system is doing only that query, you may be right. No, it's very real, because your sequential read will not be broken up. If you think it will, let me know which operating systems this happens on, and how exactly it happens. cjs -- Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] md5 passwords and pg_shadow
Hi all,

Why does the password_encryption GUC variable default to false?

AFAICT there shouldn't be any issues with client compatibility -- in fact, I'd be inclined to rip out all support for storing cleartext passwords...

Cheers,

Neil
--
Neil Conway <[EMAIL PROTECTED]> PGP Key ID: DB3C29FC
Re: [HACKERS] Sequential Scan Read-Ahead
Tom Lane wrote:

> Curt Sampson <[EMAIL PROTECTED]> writes:
>
>> Grabbing bigger chunks is always optimal, AFAICT, if they're not
>> *too* big and you use the data. A single 64K read takes very little
>> longer than a single 8K read.
>
> Proof?

I contest this statement. It's optimal only up to a point. I know that my system settles into its best read speeds at 32K or 64K chunks; 8K chunks are far below optimal for my system. Most systems I work on do far better at 16K than at 8K, and most don't see any degradation when going to 32K chunks. (This is across numerous OSes and configs -- the results are interpretations from bonnie disk I/O benchmarks.)

Depending on what you're doing, it is more efficient to read bigger blocks, up to a point. If you're multi-threaded or reading in non-blocking mode, take as big a chunk as you can handle or are ready to process in quick order. If you're picking up a bunch of little chunks here and there, and know you're not using them again, then choose a size that will hopefully cause some of the reads to overlap; failing that, pick the smallest usable read size. The OS can never do that stuff for you.

---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
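The chunk-size effect described above can be measured with a rough sketch like the following. Everything here is illustrative: the scratch file is tiny, so on a real system the OS page cache will dominate; a meaningful bonnie-style comparison needs a file larger than RAM.

```python
# Rough sketch: time sequential reads of the same file at several
# chunk sizes, as in the 8K-vs-64K comparison above. The scratch file
# and sizes are hypothetical stand-ins, not a real benchmark.
import os
import tempfile
import time

def timed_read(path, chunk_size):
    """Read the whole file in chunk_size pieces; return (bytes, seconds)."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    start = time.monotonic()
    try:
        while True:
            buf = os.read(fd, chunk_size)
            if not buf:
                break
            total += len(buf)
    finally:
        os.close(fd)
    return total, time.monotonic() - start

# 4 MB scratch file standing in for a table file
scratch = tempfile.NamedTemporaryFile(delete=False)
scratch.write(b"\0" * (4 * 1024 * 1024))
scratch.close()

results = {}
for size in (8 * 1024, 16 * 1024, 32 * 1024, 64 * 1024):
    nbytes, secs = timed_read(scratch.name, size)
    results[size] = (nbytes, secs)
    print(f"{size // 1024:3d}K chunks: {nbytes} bytes in {secs:.4f}s")

os.unlink(scratch.name)
```

With a cold cache and a large enough file, the larger chunk sizes issue far fewer syscalls and read requests for the same data, which is the effect the thread is debating.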
Re: [HACKERS] md5 passwords and pg_shadow
Neil Conway wrote:

> Hi all,
>
> Why does the password_encryption GUC variable default to false?
>
> AFAICT there shouldn't be any issues with client compatibility -- in
> fact, I'd be inclined to rip out all support for storing cleartext
> passwords...

It is false so passwords can be handled by pre-7.2 clients. Once you encrypt them, you can't use passwords on pre-7.2 clients because they don't understand the double-md5 hash required. We will set it to true, but when are most pre-7.2 clients gone?

--
Bruce Momjian | http://candle.pha.pa.us
[EMAIL PROTECTED] | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
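For readers unfamiliar with the "double-md5 hash" the reply refers to, here is a sketch of the scheme as I understand PostgreSQL's md5 authentication to work (the usernames, password, and salt are made up for illustration): pg_shadow stores a hash of password and username, and at connect time the client hashes that stored value again with a per-connection random salt.

```python
# Sketch of the double-md5 scheme referenced above (illustrative
# values; treat the exact wire format as an assumption, not a spec).
import hashlib

def stored_password(password: str, username: str) -> str:
    """What pg_shadow stores when the password is md5-encrypted."""
    return "md5" + hashlib.md5((password + username).encode()).hexdigest()

def challenge_response(stored: str, salt: bytes) -> str:
    """Second hash: proves knowledge of the stored value for this salt."""
    inner = stored[3:]  # strip the "md5" prefix
    return "md5" + hashlib.md5(inner.encode() + salt).hexdigest()

stored = stored_password("secret", "neilc")
resp = challenge_response(stored, b"\x01\x02\x03\x04")
print(stored)
print(resp)
# A pre-7.2 client that only knows how to send a cleartext password
# cannot compute `resp` -- which is the compatibility concern above.
```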
Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE
A block-sized read will not be broken up. But if you're reading in a size bigger than the underlying system's block sizes, then it can get broken up. So yes, a sequential read will get broken up.

A single read request for a block may or may not get broken up. If you're fread()ing with set block sizes you'll see the set sizes of blocks come through, but what the underlying OS does is undefined; the same goes for writing. If the underlying block size is 8KB and you dump 4MB down on it, the OS may (and in many cases does) decide to write part of it, do a read on a nearby sector, then write the rest. This happens when doing long writes that end up spanning block groups, because the inodes must be allocated. So the write WILL get broken up. Reads are under the same gun. It all depends on how big. The application may or may not see this (probably not, unless you're set non-blocking, because the kernel will just sleep you until your data is ready).

Further, large read requests can of course be re-ordered by hardware: Tagged Command Queueing on SCSI drives and RAIDs. The ICP Vortex cards I use in a number of systems have 64MB of on-board cache. They quite happily, and often, re-order reads and writes when queueing them to keep things moving as fast as possible. (Intel didn't buy them for their cards -- they use the i960 as it is; Intel swiped them for their IP rights.) The OS also tags commands it fires to the ICP, which can be re-ordered in block-sized chunks.

Curt Sampson wrote:

> On Wed, 24 Apr 2002, mlw wrote:
>
>> I am not arguing about whether or not they do it, I am saying it is
>> not always possible. I/O requests do not remain in queue waiting for
>> reordering indefinitely.
>
> It doesn't matter. When they go out to the disk they go out in
> order. On every Unix-based OS I know of, and Novell Netware, if
> you submit a single read request for consecutive blocks, those
> blocks *will* be read sequentially, no matter what the system load.
> So to get back to the original argument:
>
>>> The supposed advantage of a sequential read over a random read, in
>>> an active multitasking system, is a myth. If you are executing one
>>> query and the system is doing only that query, you may be right.
>
> No, it's very real, because your sequential read will not be broken up.
>
> If you think it will, let me know which operating systems this
> happens on, and how exactly it happens.
>
> cjs

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
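The request re-ordering debated in this exchange is the classic elevator sort, like the disksort() in NetBSD's src/sys/kern/subr_disk.c mentioned earlier in the thread. A much-simplified sketch of the idea (not the kernel code): given the current head position, service all requests at or past the head in ascending order, then sweep back to the lowest outstanding block.

```python
# Simplified one-way-sweep (C-LOOK style) ordering of queued block
# requests; block numbers below are made up for illustration.
def elevator_order(head, requests):
    """Return requests in one-way sweep order from the head position."""
    ahead = sorted(b for b in requests if b >= head)   # serviced this sweep
    behind = sorted(b for b in requests if b < head)   # next sweep
    return ahead + behind

order = elevator_order(50, [10, 75, 52, 99, 3, 49])
print(order)  # [52, 75, 99, 3, 10, 49]
```

The point both sides agree on is visible here: consecutive blocks submitted together stay consecutive in the service order; only unrelated requests get interleaved around them.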
Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE
On Wed, 24 Apr 2002, Michael Loftis wrote:

> A block-sized read will not be broken up. But if you're reading in a
> size bigger than the underlying system's block sizes then it can get
> broken up.

In which operating systems, and under what circumstances? I'll agree that some OSs may not coalesce adjacent reads into a single read, but even so, submitting a bunch of single reads for consecutive blocks is going to be much, much faster than if other, random I/O occurred between those reads.

> If the underlying block size is 8KB and you dump 4MB down on it, the
> OS may (and in many cases does) decide to write part of it, do a read
> on a nearby sector, then write the rest. This happens when doing long
> writes that end up spanning block groups because the inodes must be
> allocated.

Um...we're talking about 64K vs. 8K reads here, not 4MB reads. I am certainly not suggesting Postgres ever submit 4MB read requests to the OS. I agree that any single-chunk reads or writes that cause non-adjacent disk blocks to be accessed may be broken up. But in that sense they're "broken up" anyway, in that you have no choice but to take a performance hit.

> Further large read requests can of course be re-ordered by hardware.
> ...The OS also tags commands it fires to the ICP, which can be
> re-ordered in block-sized chunks.

Right. All the more reason to read in larger chunks when we know what we need in advance, because that will give the OS, controllers, etc. more advance information, and let them do the reads more efficiently.

cjs
--
Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC
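Giving the OS advance notice of a sequential access pattern, as argued above, is what posix_fadvise() later standardized. A hedged sketch (Python exposes it as os.posix_fadvise on platforms that have it, hence the hasattr guard; the file here is a made-up stand-in for a heap file):

```python
# Sketch: advise the kernel that a file will be read sequentially so
# it can read ahead aggressively, then scan it in 64K chunks.
import os
import tempfile

CHUNK = 64 * 1024

# hypothetical stand-in for a table file being sequentially scanned
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"x" * (256 * 1024))
tmp.close()

fd = os.open(tmp.name, os.O_RDONLY)
try:
    if hasattr(os, "posix_fadvise"):
        # offset 0, length 0 means "the whole file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
    total = 0
    while True:
        buf = os.read(fd, CHUNK)
        if not buf:
            break
        total += len(buf)
finally:
    os.close(fd)
os.unlink(tmp.name)
print(f"read {total} bytes in {CHUNK // 1024}K chunks")
```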
Re: [HACKERS] PostgreSQL index usage discussion.
On Thu, 2002-04-25 at 00:46, mlw wrote:

> We have had several threads about index usage, specifically when
> PostgreSQL has the choice of using one or not.
>
> There seem to be a few points of view:
>
> (1) The planner and statistics need to improve, so that erroneously
> using an index (or not) happens less frequently or not at all.
>
> (2) Use programmatic hints which allow coders to specify which indexes
> are used during a query (a la Oracle).
>
> (3) It is pretty much OK as-is; just use enable_seqscan=false in the
> query.
>
> My point of view on this subject is one from personal experience. I
> had a database on which PostgreSQL would always (erroneously) choose
> not to use an index. Are my experiences typical? Probably not, but are
> experiences like it very common?

I currently have 2 databases that run with enable_seqscan=false to avoid choosing plans that take forever.

> I don't know, but we see enough "Why won't PostgreSQL use my index"
> messages to at least conclude that it happens every now and then. In
> my experience, when it happens, it is very frustrating.
>
> I think statement (1) is a good idea, but I think it is optimistic to
> expect that a statistical analysis of a table will contain enough
> information for all possible cases.

Perhaps we can come up with some special rules to avoid grossly pessimal plans.

Hannu

---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] PostgreSQL index usage discussion.
> (2) Use programmatic hints which allow coders to specify which indexes
> are used during a query (a la Oracle).

As I said before, it would be useful to have a way to improve (not force) the use of indexes on particular queries, i.e. lowering the cost of using a given index on a given query.

Regards

---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] PostgreSQL index usage discussion.
On Thu, 2002-04-25 at 08:42, Luis Alberto Amigo Navarro wrote:

> > (2) Use programmatic hints which allow coders to specify which
> > indexes are used during a query (a la Oracle).
>
> As I said before, it would be useful to have a way to improve (not
> force) the use of indexes on particular queries, i.e. lowering the
> cost of using this index on this query.
> Regards

I was told that DB2 has per-table (or rather, per-tablespace) knowledge of disk speeds, so keeping separate random and seqscan costs for each table and index could be a good way here (to force use of a particular index, make its use cheap).

Hannu

---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] PostgreSQL index usage discussion.
> I was told that DB2 has per-table (or rather, per-tablespace) knowledge
> of disk speeds, so keeping separate random and seqscan costs for each
> table and index could be a good way here (to force use of a particular
> index, make its use cheap)

I was thinking of something even easier: keeping one cost per index and one cost per seqscan, but being allowed to scale the cost of each index on each query (recommended, null or unrecommended).

Regards