[HACKERS] Sequential Scan Read-Ahead
At 12:41 PM 4/23/02 -0400, Bruce Momjian wrote:

>This is an interesting point, that an index scan may fit in the cache
>while a sequential scan may not.

If so, I would expect that the number of pages read is significantly
smaller than it was with a sequential scan. If that's the case, doesn't
that mean that the optimizer made the wrong choice anyway?

BTW, I just did a quick walk down this chain of code to see what happens
during a sequential scan:

    access/heap/heapam.c
    storage/buffer/bufmgr.c
    storage/smgr/smgr.c
    storage/smgr/md.c

and it looks to me like individual reads are being done in BLKSIZE
chunks, whether we're scanning or not.

During a sequential scan, I've heard that it's more efficient to read in
multiples of your blocksize, say, 64K chunks rather than 8K chunks, for
each read operation you pass to the OS. Does anybody have any experience
to know if this is indeed the case? Has anybody ever added this to
postgresql and benchmarked it?

Certainly if there's a transaction-based limit on disk I/O, as well as a
throughput limit, it would be better to read in larger chunks.

cjs
--
Curt Sampson <[EMAIL PROTECTED]>  +81 90 7737 2974  http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly
Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE
Curt Sampson wrote:
>
> On Tue, 23 Apr 2002, mlw wrote:
>
> > > On a system that has neither read-ahead nor sorting of I/O requests,
> > > yes. Which systems are you using that provide neither of these
> > > facilities?
> >
> > This only happens if the OS can organize the I/O requests in such a
> > manner. It is a non-trivial function.
>
> Well, if you call less than 200 lines of code (including lots of
> comments), "non-trivial," yes. Have a look at NetBSD's
> src/sys/kern/subr_disk.c for one example implementation.
>
> But trivial or not, if all operating systems on which Postgres runs
> are doing this, your point is, well, pointless. So, once again, which
> systems are you using that do *not* do this?

I am not arguing about whether or not they do it; I am saying it is not
always possible. I/O requests do not remain in queue waiting for
reordering indefinitely.
[HACKERS] Returning text from stored procedures??
Hi!

I seem to have trouble returning large strings of text from stored
procedures (coded in C). It works fine for smaller data, and elog prints
out the result nicely; but when the result is "returned" it makes the
server disconnect.

I use the standard Datum methods (actually I've tried every possible way
of returning the damn text :-( ).

Any ideas why? Are there restrictions on the size of text a stored
procedure can return?

Any help is appreciated :-)

/Steffen Nielsen
Re: [HACKERS] "make report"
> If you want to collect information about what features are portable you
> can check other software packages, product manuals, ports trees, etc.
> Most issues are documented someplace.

Oh goodness. Thanks for offering me a new hobby ;)

> Btw., yes, 8 byte integers are missing on some platforms.

Right. The two areas which come to mind are integer availability and the
time zone support (as you might know, we support *three* different time
zone models). At the moment, none of the developers know the features
supported on the platforms we claim to support. Which platforms do not
have int8 support still? Which do not have time zone interfaces fitting
into the two "zonefull" styles? I'd like to know, but istm that the
people *with* the platforms could do this much more easily than those
without. What am I missing here??

- Thomas
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote:
> > I'd be willing to consider making the behavior variable-specific
> > if anyone can identify particular variables that need to behave
> > differently. But overall I think it's better that the behavior
> > be consistent --- so you'll need a good argument to convince me
> > that anything should behave differently ;-).
> >
> > There is a variant case that should also have been illustrated:
> > what if there is no error, but the user does ROLLBACK instead of
> > COMMIT? The particular case that is causing difficulty for me is
> >
> >	begin;
> >	create schema foo;
> >	set search_path = foo;
> >	rollback;
> >
> > There is *no* alternative here but to roll back the search_path
> > setting.
>
>	begin;
>	;
>	ERROR: parser: parse error at or near ""
>
> There's *no* alternative here but to call *rollback* (commit).
> However PostgreSQL doesn't call *rollback* automatically and
> it's the user's responsibility to call *rollback* on errors.
> IMHO what to do with errors is users' responsibility basically.
> The behavior of the *search_path* variable is a *had better*
> or *convenient* kind of thing, not a *no alternative* kind
> of thing.

I understand from an ODBC perspective that it is the app's
responsibility, but we need some defined behavior for a psql script that
is fed into the database. Assuming the SET commands continue to come
after it is aborted but before the COMMIT/ROLLBACK, we need to define
how to handle it.

--
Bruce Momjian                   | http://candle.pha.pa.us
[EMAIL PROTECTED]               | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup.    | Drexel Hill, Pennsylvania 19026
Re: [HACKERS] namedatalen part 2 (cont'd)
Tom Lane wrote:
> Neil Conway <[EMAIL PROTECTED]> writes:
> > ...Based on that data, I'd vote against making any changes to NAMEDATALEN.
>
> It looked to me like the cost for going to NAMEDATALEN = 64 would be
> reasonable. Based on these numbers I'd have a problem with 128 or more.
>
> But as you observe, pgbench numbers are not very repeatable. It'd be
> nice to have some similar experiments with another benchmark before
> making a decision.

Yes, 64 looked like the appropriate value too. Actually, I was
surprised to see as much of a slowdown as we did.
Re: [HACKERS] Inefficient handling of LO-restore + Patch
OK, I have applied the following patch to fix these warnings. However,
I need Mario to confirm these are the right changes. Thanks.

---

Peter Eisentraut wrote:
> This patch does not compile correctly:
>
> pg_backup_archiver.c: In function `ahwrite':
> pg_backup_archiver.c:1252: warning: pointer of type `void *' used in arithmetic
> pg_backup_archiver.c:1259: warning: pointer of type `void *' used in arithmetic
> pg_backup_archiver.c:1263: warning: pointer of type `void *' used in arithmetic
> make: *** [pg_backup_archiver.o] Error 1
>
> Bruce Momjian writes:
> >
> > Patch applied. Thanks.
> >
> > ---
> >
> > Mario Weilguni wrote:
> > > On Thursday, April 11, 2002 17:44, Tom Lane wrote:
> > > > "Mario Weilguni" <[EMAIL PROTECTED]> writes:
> > > > > And I did not find out how I can detect the large object
> > > > > chunksize, either from getting it from the headers (include
> > > > > "storage/large_object.h" did not work)
> > > > >
> > > > > You did not answer if it's ok to post the patch, hope it's ok:
>
> --
> Peter Eisentraut   [EMAIL PROTECTED]

Index: src/bin/pg_dump/pg_backup_archiver.c
===
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_backup_archiver.c,v
retrieving revision 1.43
diff -c -r1.43 pg_backup_archiver.c
*** src/bin/pg_dump/pg_backup_archiver.c	24 Apr 2002 02:21:04 -	1.43
--- src/bin/pg_dump/pg_backup_archiver.c	24 Apr 2002 14:01:15 -
***************
*** 1249,1266 ****
  		int			remaining = AH->lo_buf_size - AH->lo_buf_used;
  		int			slack = nmemb * size - remaining;
  
! 		memcpy(AH->lo_buf + AH->lo_buf_used, ptr, remaining);
  		res = lo_write(AH->connection, AH->loFd, AH->lo_buf, AH->lo_buf_size);
  		ahlog(AH, 5, "wrote %d bytes of large object data (result = %d)\n",
  			  AH->lo_buf_size, res);
  		if (res != AH->lo_buf_size)
  			die_horribly(AH, modulename,
  						 "could not write to large object (result: %d, expected: %d)\n",
  						 res, AH->lo_buf_size);
! 		memcpy(AH->lo_buf, ptr + remaining, slack);
  		AH->lo_buf_used = slack;
  	}
  	else
  	{
  		/* LO Buffer is still large enough, buffer it */
! 		memcpy(AH->lo_buf + AH->lo_buf_used, ptr, size * nmemb);
  		AH->lo_buf_used += size * nmemb;
  	}
--- 1249,1266 ----
  		int			remaining = AH->lo_buf_size - AH->lo_buf_used;
  		int			slack = nmemb * size - remaining;
  
! 		memcpy((char *)AH->lo_buf + AH->lo_buf_used, ptr, remaining);
  		res = lo_write(AH->connection, AH->loFd, AH->lo_buf, AH->lo_buf_size);
  		ahlog(AH, 5, "wrote %d bytes of large object data (result = %d)\n",
  			  AH->lo_buf_size, res);
  		if (res != AH->lo_buf_size)
  			die_horribly(AH, modulename,
  						 "could not write to large object (result: %d, expected: %d)\n",
  						 res, AH->lo_buf_size);
! 		memcpy(AH->lo_buf, (char *)ptr + remaining, slack);
  		AH->lo_buf_used = slack;
  	}
  	else
  	{
  		/* LO Buffer is still large enough, buffer it */
! 		memcpy((char *)AH->lo_buf + AH->lo_buf_used, ptr, size * nmemb);
  		AH->lo_buf_used += size * nmemb;
  	}
Re: [HACKERS] Sequential Scan Read-Ahead
Curt Sampson wrote:
> At 12:41 PM 4/23/02 -0400, Bruce Momjian wrote:
>
> >This is an interesting point, that an index scan may fit in the cache
> >while a sequential scan may not.
>
> If so, I would expect that the number of pages read is significantly
> smaller than it was with a sequential scan. If that's the case,
> doesn't that mean that the optimizer made the wrong choice anyway?
>
> BTW, I just did a quick walk down this chain of code to see what happens
> during a sequential scan:
>
>	access/heap/heapam.c
>	storage/buffer/bufmgr.c
>	storage/smgr/smgr.c
>	storage/smgr/md.c
>
> and it looks to me like individual reads are being done in BLKSIZE
> chunks, whether we're scanning or not.
>
> During a sequential scan, I've heard that it's more efficient to
> read in multiples of your blocksize, say, 64K chunks rather than
> 8K chunks, for each read operation you pass to the OS. Does anybody
> have any experience to know if this is indeed the case? Has anybody
> ever added this to postgresql and benchmarked it?
>
> Certainly if there's a transaction based limit on disk I/O, as well
> as a throughput limit, it would be better to read in larger chunks.

We expect the file system to do read-aheads during a sequential scan.
This will not happen if someone else is also reading buffers from that
table in another place.
Re: [HACKERS] Inefficient handling of LO-restore + Patch
I wanted to correct the patch this evening after work, and will check
your changes. Thanks!

-----Original Message-----
From: Bruce Momjian [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 24, 2002 16:03
To: Peter Eisentraut
Cc: Mario Weilguni; [EMAIL PROTECTED]
Subject: Re: [HACKERS] Inefficient handling of LO-restore + Patch

OK, I have applied the following patch to fix these warnings. However,
I need Mario to confirm these are the right changes. Thanks.
Re: [HACKERS] Inefficient handling of LO-restore + Patch
Philip Warner <[EMAIL PROTECTED]> writes:
> At 16:46 15/04/02 +0200, Mario Weilguni wrote:
>> And how about getting database internals via SQL-functions - e.g.
>> getting BLCSIZE, LOBBLCSIZE?

> ISTM that there would be some merit in making a selection of
> compile-time options available via SQL. Is this worth considering?

This could usefully be combined with the nearby thread about recording
configuration options (started by Thomas). I'd be inclined to make it a
low-footprint affair where you do something like

	select compilation_options();

and you get back a long textual list of var=value settings, say

	VERSION=7.3devel
	PLATFORM=hppa-hp-hpux10.20, compiled by GCC 2.95.3
	BLCKSZ=8192
	MULTIBYTE=yes
	etc etc etc etc

This would solve the diagnostic need as-is, and it doesn't seem
unreasonable to me to expect applications to look through the output for
a particular line if they need to get the state of a specific
configuration option. It's also trivial to extend/modify as the set of
options changes over time.

			regards, tom lane
Re: [HACKERS] namedatalen part 2 (cont'd)
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Yes, 64 looked like the appropriate value too. Actually, I was
> surprised to see as much of a slowdown as we did.

I was too. pgbench runs the same backend(s) throughout the test, so it
shouldn't be paying anything meaningful in disk I/O for the larger
catalog size. After the first set of queries, all the relevant catalog
rows will be cached in syscache. So where's the performance hit coming
from?

It'd be interesting to redo these runs with profiling turned on and
compare the profiles at, say, 32 and 512 to see where the time is going
for larger NAMEDATALEN. Might be something that's easy to fix once we
identify it.

			regards, tom lane
Re: [HACKERS] Vote on SET in aborted transaction
> OK, would people please vote on how to handle SET in an aborted
> transaction?
> at the end, should 'x' equal:
>	1 - All SETs are rolled back in aborted transaction
>	2 - SETs are ignored after transaction abort
>	3 - All SETs are honored in aborted transaction
>	? - Have SETs vary in behavior depending on variable

I'll vote for "?", if for no other reason than that you are proposing
taking away a huge chunk of "language space" by a priori disallowing out
of band behaviors for anything starting with "SET". I think that is
likely Hiroshi's concern also.

If we can fit all current "SET" behaviors into a transaction model, then
I'm not against that (though we should review the list of attributes
which *are* currently affected before settling on this). afaik we have
not reviewed current behaviors and have not thought through the "what
if's" that some sort of premature policy decision might constrain in the
future.

Let me give you some examples. We might someday have nested
transactions, or transactions which can be resumed from the point of
failure. We *might* want to be able to affect recovery behaviors, and we
*might* want to do so with something like

	begin;
	update foo...
	update bar...
	
	set blah to blah
	update baz...
	update bar...
	
	end;

Now we currently *don't* support this behavior, but istm that we
shouldn't preclude it in the language by forcing some blanket "all SET
statements will be transaction aware". What language elements would you
propose to cover the out of band cases if you *do* disallow "SET" in
that context? If you don't have a candidate, I'd be even more reluctant
to go along with the results of some arbitrary vote which is done in a
narrow context.

And btw, if we *are* going to put transaction semantics on all of our
global variables (which is the context for starting this "SET"
discussion, right? Is that really the context we are still in, even
though you have phrased a much more general statement above?) then let's
have the discussion on *HOW* we are going to accomplish that *BEFORE*
deciding to make a semantic constraint on our language support.

Hmm, if we are going to use transaction semantics, then we should
consider using our existing transaction mechanisms, and if we use our
existing transaction mechanisms we should consider pushing these global
variables into tables, or in-memory tables a la "temp tables". We get
the transaction semantics for free, with the cost of value lookup at the
beginning of a transaction or statement (not sure what we can get away
with here). If we are *not* going to use those existing mechanisms, then
what mechanism *are* we going to use? Some sort of "abort hook"
mechanism to allow SET to register things to be rolled back?

If we end up making changes and increasing constraints, then we should
also expect some increased functionality as part of the scheme,
specifically "SET extensibility". We should allow (future) packages to
define their parameters and allow SET to help.

Just some thoughts...

- Thomas
[HACKERS] Table checking/dumping program
[Please CC any replies, I'm subscribed nomail]

As promised I've given it a bit of polish and it's actually almost
useful. You can have a look at it:

	http://svana.org/kleptog/pgsql/pgfsck.html

Just unpack the files into a directory. It's just a perl script with two
modules, so no compiling necessary. You can download the package
directly at:

	http://svana.org/kleptog/pgsql/pgfsck-0.01.tar.gz

I've tested it on versions 6.5, 7.0 and 7.2 and it works. It shouldn't
crash, no matter how bad a file you feed it. It can also output insert
statements to help reconstruction.

Here is an example of the program being run over a suitably hexedited
file:

	# ./pgfsck -r 16559 kleptog website
	-- Detected database format 7.2
	-- Table pg_class(1259):Page 1:Tuple 0: Unknown type _aclitem (1034)
	-- Table pg_class(1259):Page 1:Tuple 49: Unknown type _aclitem (1034)
	-- Table website(16559):Page 0:Tuple 7: Tuple incorrect length (parsed data=57,length=1638)
	-- Table website(16559):Page 0:Tuple 44: Decoding tuple runs off end: 627338916 > 69
	-- Table website(16559):Page 0:Tuple 70: Bad tuple offset. Should be: 3784 <= 11592 < 8192

Currently the following features are not supported:

- Toasted / compressed tuples
- Checking indexes doesn't work (should it?)
- Views just produce empty output (because they are)
- Arrays don't work
- Since each type output has to be written, many types are not
  correctly output
- Split tables (1GB) are not supported past the first part.
- Some system tables in some versions have a strange layout. You may get
  many harmless warnings about the formats of pg_class, pg_attribute
  and/or pg_type. Most of these are basically because I don't know how
  they work, but with a bit of work some of these should be fixable.

Have a nice day,
--
Martijn van Oosterhout <[EMAIL PROTECTED]>  http://svana.org/kleptog/
> Canada, Mexico, and Australia form the Axis of Nations That
> Are Actually Quite Nice But Secretly Have Nasty Thoughts About America
Re: [HACKERS] Vote on SET in aborted transaction
Thomas Lockhart <[EMAIL PROTECTED]> writes:
> Let me give you some examples. We might someday have nested
> transactions, or transactions which can be resumed from the point of
> failure. We *might* want to be able to affect recovery behaviors, and we
> *might* want to do so with something like
>	begin;
>	update foo...
>	update bar...
>
>	set blah to blah
>	update baz...
>	update bar...
>
>	end;

Sure, once we have savepoints or nested transactions I would expect SET
to work like that. The "alternative 1" should better be phrased as
"SETs should work the same way as regular SQL commands do".

I agree with your comment that it would be useful to look closely at the
list of settable variables to see whether any of them need different
semantics. Here's the list of everything that can be SET after backend
start (some of these require superuser privilege to set, but that seems
irrelevant):

	datestyle		timezone		XactIsoLevel
	client_encoding		server_encoding		seed
	session_authorization	enable_seqscan		enable_indexscan
	enable_tidscan		enable_sort		enable_nestloop
	enable_mergejoin	enable_hashjoin		ksqo
	geqo			debug_assertions	debug_print_query
	debug_print_parse	debug_print_rewritten	debug_print_plan
	debug_pretty_print	show_parser_stats	show_planner_stats
	show_executor_stats	show_query_stats	show_btree_build_stats
	explain_pretty_print	stats_command_string	stats_row_level
	stats_block_level	trace_notify		trace_locks
	trace_userlocks		trace_lwlocks		debug_deadlocks
	sql_inheritance		australian_timezones	password_encryption
	transform_null_equals	geqo_threshold		geqo_pool_size
	geqo_effort		geqo_generations	geqo_random_seed
	sort_mem		vacuum_mem		trace_lock_oidmin
	trace_lock_table	max_expr_depth		wal_debug
	commit_delay		commit_siblings		effective_cache_size
	random_page_cost	cpu_tuple_cost		cpu_index_tuple_cost
	cpu_operator_cost	geqo_selection_bias	client_min_messages
	default_transaction_isolation			dynamic_library_path
	search_path		server_min_messages

Right offhand, I am not seeing anything here for which there's a
compelling case not to roll it back on error. In fact, I have yet to
hear *any* plausible example of a variable that we would really
seriously want not to roll back on error.

> And btw, if we *are* going to put transaction semantics on all of our
> global variables (which is the context for starting this "SET"
> discussion, right? Is that really the context we are still in, even
> though you have phrased a much more general statement above?) then let's
> have the discussion on *HOW* we are going to accomplish that *BEFORE*
> deciding to make a semantic constraint on our language support.

Hardly necessary: we'll just make guc.c keep track of the
start-of-transaction values of all variables that have changed in the
current transaction, and restore them to that value upon transaction
abort. Doesn't seem like a big deal to me. We've got tons of other code
that does exactly the same sort of thing.

> Hmm, if we are going to use transaction semantics, then we should
> consider using our existing transaction mechanisms, and if we use our
> existing transaction mechanisms we should consider pushing these global
> variables into tables or in memory tables a la "temp tables".

Quite a few of the GUC settings are values that need to be set and used
during startup, before we have table access up and running. I do not
think that it's very practical to expect them to be accessed through
table access mechanisms.

> If we end up making changes and increasing constraints, then we should
> also expect some increased functionality as part of the scheme,
> specifically "SET extensibility".

It might well be a good idea to allow variables to be added to guc.c's
lists on-the-fly by the initialization routines of loadable modules.
But that's orthogonal to this discussion, IMHO.

			regards, tom lane
Re: [HACKERS] Returning text from stored procedures??
Steffen Nielsen <[EMAIL PROTECTED]> writes:
> Any Ideas why? Are there restriction on the size of text a stored
> procedure can return?

One gig ...

It's hard to guess what you're doing wrong when you haven't shown us
your code.

			regards, tom lane
Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE
I was thinking of something independent from the executor: simply a
variable that recommends (or not) the use of a particular index. It
could be obtained from the user, and so it could be improved (a factor
lower than 1) in the planner. How about something like this?

- Original Message -
From: "Bruce Momjian" <[EMAIL PROTECTED]>
To: "Luis Alberto Amigo Navarro" <[EMAIL PROTECTED]>
Cc: "Lincoln Yeoh" <[EMAIL PROTECTED]>; "Tom Lane" <[EMAIL PROTECTED]>;
    "mlw" <[EMAIL PROTECTED]>; "Andrew Sullivan" <[EMAIL PROTECTED]>;
    "PostgreSQL-development" <[EMAIL PROTECTED]>
Sent: Tuesday, April 23, 2002 6:42 PM
Subject: Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE

> Luis Alberto Amigo Navarro wrote:
> > Hi All.
> > I've been reading all the thread and I want to add a few points:
> >
> > You can set enable_seqscan=off in small or easy queries, but in large
> > queries an index can speed up parts of the query and slow down others,
> > so I think it is necessary, if you want Postgres to become a
> > widely-used DBMS, that the planner be able to decide accurately. In
> > the thread there is a point that might be useful: it would be very
> > interesting if the planner could learn from previous executions. There
> > could even be a warm-up policy to let the planner learn how the DB is
> > working. This info could be stored with the DB data, and could
> > statistically show how use of an index or seqscan works on every
> > column of the DB.
>
> Yes, I have always felt it would be good to feed back information from
> the executor to the optimizer to help with later estimates. Of course,
> I never figured out how to do it. :-)
[HACKERS] Parser translations and schemas
Here's today's tidbit for those who like to argue about nitty little
details of behavior ...

Presently, the parser (in particular gram.y) has quite a few special
transformations for certain type and function names. For example,

	You write		You get

	char(N)			bpchar
	trim(BOTH foo)		btrim(foo)

The question for the day is: should these transformations be applied to
schema-qualified names? And should the parser force the transformed
names to be looked up in the system schema (pg_catalog), or should it
allow them to be searched for using the regular namespace search path?

I want to make the following proposal:

1. Transformations are applied only to unqualified names. If you write
a qualified name then it is treated as a plain-vanilla identifier and
looked up in the catalogs without transformation, even if the name
component happens to match a name that would be transformed standing
alone.

2. If a transformation is applied then the resulting name will always
be forced to be looked up in the system schema; ie, the output will
effectively be "pg_catalog.something" not just "something".

Some examples:

	You write			You get

	char(N)				pg_catalog.bpchar
	pg_catalog.char			pg_catalog.char (not bpchar)
	real				pg_catalog.float4
	myschema.real			myschema.real (not float4)
	trim(BOTH foo)			pg_catalog.btrim(foo)
	pg_catalog.trim(BOTH foo)	an error (since the special
					production allowing BOTH won't
					be used)

I have a number of reasons for thinking that this is a reasonable way
to go.

Point one: transforming qualified names seems to violate the "principle
of least surprise". If I write myschema.real I would not expect that to
be converted to myschema.float4, especially if I weren't aware that
Postgres internally calls REAL "float4".

Point two: I don't believe that we need to do it to meet the letter of
the SQL spec. AFAICT the spec treats all the names of built-in types
and functions as keywords, not as names belonging to a system schema.
So special behavior is required for TRIM(foo) but not for
DEFINITION_SCHEMA.TRIM(foo).

Point three: if we do transform a name, then we are expecting a
particular system type or function to be selected, and we ought to
ensure that that happens; thus explicitly qualifying the output name
seems proper. Again, this seems less surprising than other
alternatives. If I have a datatype myschema.float4 that I've put into
the search path in front of pg_catalog, I think I'd be surprised to
have it get picked when I write REAL.

Another reason for doing it this way is that I think it's necessary for
reversibility. For example, consider what format_type should put out
when it's trying to write a special-cased type name. If it needs to
emit numeric(10,4) then it *cannot* stick "pg_catalog." on the front of
that --- the result wouldn't parse. (At least not unless we uglify the
grammar a whole lot more to allow pg_catalog.FOO everywhere that just
FOO currently has a special production.) So we need to make parsing
rules that guarantee that numeric(10,4) will be interpreted as
pg_catalog.numeric and not something else, regardless of the active
search path. On the other hand, a plain user datatype that happens to
be named "real" should be accessible as myschema.real without
interference from the real->float4 transformation.

A corner case that maybe requires more discussion is: what about type
and function names that are reserved per spec, but which we do not need
any special transformation for? For example, the spec thinks that
OCTET_LENGTH() is a keyword, but our implementation treats it as an
ordinary function name. I feel that the grammar should not prefix
"pg_catalog." to any name that it hasn't transformed or treated
specially in any way, even if that name is reserved per spec. Note that
this will not actually lead to any non-spec-compliant behavior as long
as one allows the system to search pg_catalog before any user-provided
schemas --- which is in fact the default behavior, as it's currently
set up.

Another point is that I believe that REAL should be transformed to
pg_catalog.float4, but the quoted identifier "real" should not be.
This would require a bit of surgery --- presently xlateSqlType is
applied to pretty much everything whether it's a quoted identifier or
not. But if we allow xlateSqlType to continue to work that way, then
user-schema names that happen to match one of its target names are
going to behave strangely. I think we will need to get rid of
xlateSqlType/xlateSqlFunc and instead implement all the name
transformations we want as special productions, so that the target
names are shown as keywords in keywords.c. Without this, ruleutils.c
will not have a clue that the user type name "real" needs to be quoted
to keep it from being transformed.

BTW: as the code s
Re: [HACKERS] Inefficient handling of LO-restore + Patch
On Wednesday, 24 April 2002 at 16:03, Bruce Momjian wrote:
> OK, I have applied the following patch to fix these warnings. However,
> I need Mario to confirm these are the right changes. Thanks.

I've checked it and it works fine, but the memcpy() prototype says it should take void pointers. Will this give errors with non-gcc compilers?

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote on SET in aborted transaction
Vote number 1 -- ROLL BACK

Bruce Momjian wrote:
>OK, would people please vote on how to handle SET in an aborted
>transaction?  This vote will allow us to resolve the issue and move
>forward if needed.
>
>In the case of:
>
>	SET x=1;
>	BEGIN;
>	SET x=2;
>	query_that_aborts_transaction;
>	SET x=3;
>	COMMIT;
>
>at the end, should 'x' equal:
>
>	1 - All SETs are rolled back in aborted transaction
>	2 - SETs are ignored after transaction abort
>	3 - All SETs are honored in aborted transaction
>	? - Have SETs vary in behavior depending on variable
>
>Our current behavior is 2.
>
>Please vote and I will tally the results.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] "make report"
Thomas Lockhart writes: > Right. The two areas which come to mind are integer availability and the > timezone support (as you might know we support *three* different time > zone models). At the moment, none of the developers know the features > supported on the platforms we claim to support. Which platforms do not > have int8 support still? "Still" is the wrong word. There used to be platforms with certain areas of trouble, and those platforms don't go away. But since you asked: QNX 4 and SCO OpenServer are known to lack 8 byte integers. > Which do not have time zone interfaces fitting > into the two "zonefull" styles? I'd like to know, but istm that the > people *with* the platforms could do this much more easily than those > without. What am I missing here?? I don't think polling users this way will yield reliable results. If you really want to find out, break something and see if someone complains. -- Peter Eisentraut [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] "make report"
It depends. QNX4 may be used with GCC, in which case it does have long long. I am not sure if that combination will play along with Postgres, but it should not be assumed impossible. - Original Message - From: "Peter Eisentraut" <[EMAIL PROTECTED]> To: "Thomas Lockhart" <[EMAIL PROTECTED]> Cc: "PostgreSQL Hackers" <[EMAIL PROTECTED]> Sent: Wednesday, April 24, 2002 1:14 PM Subject: Re: [HACKERS] "make report" > Thomas Lockhart writes: > > > Right. The two areas which come to mind are integer availability and the > > timezone support (as you might know we support *three* different time > > zone models). At the moment, none of the developers know the features > > supported on the platforms we claim to support. Which platforms do not > > have int8 support still? > > "Still" is the wrong word. There used to be platforms with certain areas > of trouble, and those platforms don't go away. > > But since you asked: QNX 4 and SCO OpenServer are known to lack 8 byte > integers. > > > Which do not have time zone interfaces fitting > > into the two "zonefull" styles? I'd like to know, but istm that the > > people *with* the platforms could do this much more easily than those > > without. What am I missing here?? > > I don't think polling users this way will yield reliable results. If you > really want to find out, break something and see if someone complains. > > -- > Peter Eisentraut [EMAIL PROTECTED] > > > ---(end of broadcast)--- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to [EMAIL PROTECTED] so that your > message can get through to the mailing list cleanly > ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Inefficient handling of LO-restore + Patch
Tom Lane writes:
> This could usefully be combined with the nearby thread about recording
> configuration options (started by Thomas).  I'd be inclined to make it
> a low-footprint affair where you do something like
>
> 	select compilation_options();
>
> and you get back a long textual list of var=value settings, say
>
> 	VERSION=7.3devel
> 	PLATFORM=hppa-hp-hpux10.20, compiled by GCC 2.95.3
> 	BLCKSZ=8192
> 	MULTIBYTE=yes
> 	etc etc etc etc

This assumes that compilation options only matter in the server and that only SQL users would be interested in them. In fact, compilation options affect client and utility programs as well, and it's not unusual to have a wild mix (if only unintentional).

IMHO, the best place to put this information is in the version output, as in:

	$ ./psql --version
	psql (PostgreSQL) 7.3devel
	contains support for: readline

An SQL interface in addition to that would be OK, too. But let's not dump everything into one SHOW command.

--
Peter Eisentraut   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Vote on SET in aborted transaction
On Wed, 24 Apr 2002, Michael Loftis wrote:

> Vote number 1 -- ROLL BACK

I agree.. Number 1 - ROLL BACK

> Bruce Momjian wrote:
>
> >OK, would people please vote on how to handle SET in an aborted
> >transaction?  This vote will allow us to resolve the issue and move
> >forward if needed.
> >
> >In the case of:
> >
> >	SET x=1;
> >	BEGIN;
> >	SET x=2;
> >	query_that_aborts_transaction;
> >	SET x=3;
> >	COMMIT;
> >
> >at the end, should 'x' equal:
> >
> >	1 - All SETs are rolled back in aborted transaction
> >	2 - SETs are ignored after transaction abort
> >	3 - All SETs are honored in aborted transaction
> >	? - Have SETs vary in behavior depending on variable
> >
> >Our current behavior is 2.
> >
> >Please vote and I will tally the results.
>
> ---(end of broadcast)---
> TIP 4: Don't 'kill -9' the postmaster

Vince.
--
==
Vince Vielhaber -- KA8CSH    email: [EMAIL PROTECTED]    http://www.pop4.net
56K Nationwide Dialup from $16.00/mo at Pop4 Networking
Online Campground Directory    http://www.camping-usa.com
Online Giftshop Superstore     http://www.cloudninegifts.com
==

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Parser translations and schemas
Tom Lane writes:

> You write                  You get
>
> char(N)                    pg_catalog.bpchar
> pg_catalog.char            pg_catalog.char (not bpchar)
> real                       pg_catalog.float4
> myschema.real              myschema.real (not float4)
> trim(BOTH foo)             pg_catalog.btrim(foo)
> pg_catalog.trim(BOTH foo)  an error (since the special production
>                            allowing BOTH won't be used)

Exactly my thoughts.

> A corner case that maybe requires more discussion is what about type and
> function names that are reserved per spec, but which we do not need any
> special transformation for?  For example, the spec thinks that
> OCTET_LENGTH() is a keyword, but our implementation treats it as an
> ordinary function name.  I feel that the grammar should not prefix
> "pg_catalog." to any name that it hasn't transformed or treated
> specially in any way, even if that name is reserved per spec.

I agree.

--
Peter Eisentraut   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] "make report"
Igor Kovalenko writes: > It depends. QNX4 may be used with GCC, in which case it does have long long. > I am not sure if that combination will play along with Postgres, but it > should not be assumed impossible. The point is, it should not be assumed possible. -- Peter Eisentraut [EMAIL PROTECTED] ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Implement a .NET Data
Thanks Dave. I will create the project at gborg.postgresql.org on Friday :)

> You might want to look at
> http://gborg.postgresql.org.
>
> Regards, Dave.

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Inefficient handling of LO-restore + Patch
Peter Eisentraut <[EMAIL PROTECTED]> writes: > Tom Lane writes: >> select compilation_options(); > This assumes that compilation options only matter in the server and that > only SQL users would be interested in them. In fact, compilation options > affect client and utility programs as well, and it's not unusual to have a > wild mix (if only unintentional). Good point. It'd be worthwhile to have some way of extracting such information from the clients as well. > IMHO, the best place to put this information is in the version output, as > in: > $ ./psql --version > psql (PostgreSQL) 7.3devel > contains support for: readline Is that sufficient? The clients probably are not affected by quite as many config options as the server, but they still have a nontrivial list. (Multibyte, SSL, Kerberos come to mind at once.) I'd not like to see us assume that a one-line output format will do the job. A way to interrogate the libpq being used by psql might be good too. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] PostgreSQL index usage discussion.
We have had several threads about index usage, specifically when PostgreSQL has the choice of using one or not. There seem to be a few points of view:

(1) The planner and statistics need to improve, so that erroneously using an index (or not) happens less frequently or not at all.

(2) Use programmatic hints which allow coders to specify which indexes are used during a query. (ala Oracle)

(3) It is pretty much OK as-is, just use enable_seqscan=false in the query.

My point of view on this subject comes from personal experience. I had a database on which PostgreSQL would always (erroneously) choose not to use an index. Are my experiences typical? Probably not, but are experiences like it very common? I don't know, but we see enough "Why won't PostgreSQL use my index?" messages to at least conclude that it happens every now and then. In my experience, when it happens, it is very frustrating.

I think statement (1) is a good idea, but I think it is optimistic to expect that a statistical analysis of a table will contain enough information for all possible cases. Statement (2) would allow the flexibility needed, but as was pointed out, the hints may become wrong over time as the characteristics of the various tables change. Statement (3) is not good enough because disabling sequential scans affects the whole query: sub-queries which would correctly not use an index are forced to use one anyway.

My personal preference is that some more specific mechanism than enable_seqscan be provided for the DBA to assure an index is used. Working on the statistics and the planner is fine, but I suspect there will always be a strong argument for manual override in the exceptional cases where it will be needed.

What do you all think? What would be a good plan of attack?

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Vote on SET in aborted transaction
Bruce Momjian wrote:
>
> OK, would people please vote on how to handle SET in an aborted
> transaction?  This vote will allow us to resolve the issue and move
> forward if needed.
>
> In the case of:
>
> 	SET x=1;
> 	BEGIN;
> 	SET x=2;
> 	query_that_aborts_transaction;
> 	SET x=3;
> 	COMMIT;
>
> at the end, should 'x' equal:
>
> 	1 - All SETs are rolled back in aborted transaction
> 	2 - SETs are ignored after transaction abort
> 	3 - All SETs are honored in aborted transaction
> 	? - Have SETs vary in behavior depending on variable
>
> Our current behavior is 2.
>
> Please vote and I will tally the results.

Is it a vote in the first place? I will vote for the current behavior (2 + 3 + ?).

regards,
Hiroshi Inoue
	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote on SET in aborted transaction
Tom Lane wrote:
>
> Right offhand, I am not seeing anything here for which there's a
> compelling case not to roll it back on error.
>
> In fact, I have yet to hear *any* plausible example of a variable
> that we would really seriously want not to roll back on error.

Honestly, I don't understand what kind of example you expect. How about the following?

[The current schema is schema1]

	begin;
	create schema foo;
	set search_path = foo;
	create table t1 ();
	.
	[error occurs]
	rollback;
	insert into t1 select * from schema1.t1;

Should the search_path be put back in this case? As I mentioned already many times, it doesn't seem like a *should be* kind of thing.

regards,
Hiroshi Inoue
	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
[HACKERS] Vote totals for SET in aborted transaction
OK, the votes are in:

#1
	Lamar Owen
	Jan Wieck
	Tom Lane
	Bruce Momjian
	Joe Conway
	Curt Sampson
	Michael Loftis
	Vince Vielhaber
	Sander Steffann

#2
	Bradley McLean

#3

#?
	Thomas Lockhart
	Hiroshi Inoue

Looks like #1 is the clear winner.

---

Bruce Momjian wrote:
> OK, would people please vote on how to handle SET in an aborted
> transaction?  This vote will allow us to resolve the issue and move
> forward if needed.
>
> In the case of:
>
> 	SET x=1;
> 	BEGIN;
> 	SET x=2;
> 	query_that_aborts_transaction;
> 	SET x=3;
> 	COMMIT;
>
> at the end, should 'x' equal:
>
> 	1 - All SETs are rolled back in aborted transaction
> 	2 - SETs are ignored after transaction abort
> 	3 - All SETs are honored in aborted transaction
> 	? - Have SETs vary in behavior depending on variable
>
> Our current behavior is 2.
>
> Please vote and I will tally the results.
>
> --
> Bruce Momjian                        |  http://candle.pha.pa.us
> [EMAIL PROTECTED]                    |  (610) 853-3000
> + If your life is a hard drive,      |  830 Blythe Avenue
> + Christ can be your backup.         |  Drexel Hill, Pennsylvania 19026
>
> ---(end of broadcast)---
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

--
Bruce Momjian                        |  http://candle.pha.pa.us
[EMAIL PROTECTED]                    |  (610) 853-3000
+ If your life is a hard drive,      |  830 Blythe Avenue
+ Christ can be your backup.         |  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote:
> Tom Lane wrote:
> >
> > Right offhand, I am not seeing anything here for which there's a
> > compelling case not to roll it back on error.
> >
> > In fact, I have yet to hear *any* plausible example of a variable
> > that we would really seriously want not to roll back on error.
>
> Honestly I don't understand what kind of example you
> expect. How about the following?
>
> [The current schema is schema1]
>
> begin;
> create schema foo;
> set search_path = foo;
> create table t1 ();
> .
> [error occurs]
> rollback;
> insert into t1 select * from schema1.t1;
>
> Should the search_path be put back in this case?
> As I mentioned already many times, it doesn't seem
> like a *should be* kind of thing.

Sure it should! You gave an example of the need to roll back, because otherwise you would end up with an invalid search path "foo".

I still believe that rolling back is the only right thing to do. What if your application doesn't even know that some changes happened? Have a trigger that sets seqscan off, does some stuff and intends to reset it later again. Now it elog's out before that, so your application will have to live with this mis-setting on this pooled DB connection until the end? I don't think so!

Jan

--
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#                                                    [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote on SET in aborted transaction
Jan Wieck wrote:
>
> Hiroshi Inoue wrote:
> > Tom Lane wrote:
> > >
> > > Right offhand, I am not seeing anything here for which there's a
> > > compelling case not to roll it back on error.
> > >
> > > In fact, I have yet to hear *any* plausible example of a variable
> > > that we would really seriously want not to roll back on error.
> >
> > Honestly I don't understand what kind of example you
> > expect. How about the following?
> >
> > [The current schema is schema1]
> >
> > begin;
> > create schema foo;
> > set search_path = foo;
> > create table t1 ();
> > .
> > [error occurs]
> > rollback;
> > insert into t1 select * from schema1.t1;
> >
> > Should the search_path be put back in this case?
> > As I mentioned already many times, it doesn't seem
> > like a *should be* kind of thing.
>
> Sure it should! You gave an example of the need to roll
> back, because otherwise you would end up with an invalid
> search path "foo".

What's wrong with it? The insert command after *rollback* would fail. It seems the right thing to me. Otherwise the insert command would try to append the data of the table t1 to itself. The insert command is for copying schema1.t1 to foo.t1 in case the previous create schema command succeeded.

regards,
Hiroshi Inoue
	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue <[EMAIL PROTECTED]> writes:
> Honestly I don't understand what kind of example you
> expect. How about the following?

> [The current schema is schema1]
> begin;
> create schema foo;
> set search_path = foo;
> create table t1 ();
> .
> [error occurs]
> rollback;
> insert into t1 select * from schema1.t1;

> Should the search_path be put back in this case?

Sure it should be. Otherwise it's pointing at a nonexistent schema.

			regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote: > > OK, the votes are in: > > #1 > Lamar Owen > Jan Wieck > Tom Lane > Bruce Momjian > Joe Conway > Curt Sampson > Michael Loftis > Vince Vielhaber > Sander Steffann > > #2 > Bradley McLean > > > > #3 > > #? > Thomas Lockhart > Hiroshi Inoue > > Looks like #1 is the clear winner. I voted not only ? but also 2 and 3. And haven't I asked twice or so if it's a vote ? Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Vote totals for SET in aborted transaction
Hiroshi Inoue wrote: > Bruce Momjian wrote: > > > > OK, the votes are in: > > > > #1 > > Lamar Owen > > Jan Wieck > > Tom Lane > > Bruce Momjian > > Joe Conway > > Curt Sampson > > Michael Loftis > > Vince Vielhaber > > Sander Steffann > > > > #2 > > Bradley McLean > > > > > > > > #3 > > > > #? > > Thomas Lockhart > > Hiroshi Inoue > > > > Looks like #1 is the clear winner. > > I voted not only ? but also 2 and 3. > And haven't I asked twice or so if it's a vote ? Yes, it is a vote, and now that we see how everyone feels, we can decide what to do. Hiroshi, you can't vote for 2, 3, and ?. Please pick one. I picked '?' for you because it seemed the closest to your intent. I can put you down for 1/3 of a vote for all three if you wish. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote: > > Hiroshi Inoue wrote: > > Bruce Momjian wrote: > > > > > > OK, the votes are in: > > > > > > #1 > > > Lamar Owen > > > Jan Wieck > > > Tom Lane > > > Bruce Momjian > > > Joe Conway > > > Curt Sampson > > > Michael Loftis > > > Vince Vielhaber > > > Sander Steffann > > > > > > #2 > > > Bradley McLean > > > > > > > > > > > > #3 > > > > > > #? > > > Thomas Lockhart > > > Hiroshi Inoue > > > > > > Looks like #1 is the clear winner. > > > > I voted not only ? but also 2 and 3. > > And haven't I asked twice or so if it's a vote ? > > Yes, it is a vote, and now that we see how everyone feels, we can > decide what to do. > > Hiroshi, you can't vote for 2, 3, and ?. Why ? I don't think the items are exclusive. regards, Hiroshi Inoue ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote totals for SET in aborted transaction
Hiroshi Inoue wrote: > Bruce Momjian wrote: > > > > Hiroshi Inoue wrote: > > > Bruce Momjian wrote: > > > > > > > > OK, the votes are in: > > > > > > > > #1 > > > > Lamar Owen > > > > Jan Wieck > > > > Tom Lane > > > > Bruce Momjian > > > > Joe Conway > > > > Curt Sampson > > > > Michael Loftis > > > > Vince Vielhaber > > > > Sander Steffann > > > > > > > > #2 > > > > Bradley McLean > > > > > > > > > > > > > > > > #3 > > > > > > > > #? > > > > Thomas Lockhart > > > > Hiroshi Inoue > > > > > > > > Looks like #1 is the clear winner. > > > > > > I voted not only ? but also 2 and 3. > > > And haven't I asked twice or so if it's a vote ? > > > > Yes, it is a vote, and now that we see how everyone feels, we can > > decide what to do. > > > > Hiroshi, you can't vote for 2, 3, and ?. > > Why ? > I don't think the items are exclusive. Well, 2 says roll back only after transaction aborts, 3 says honor all SET's, and ? says choose the behavior depending on the variable. How can you have 2, 3, and ?. Seems ? is the catch-all vote because it doesn't predefine the same behavior for all variables. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote: > > > > > > > > > I voted not only ? but also 2 and 3. > > > > And haven't I asked twice or so if it's a vote ? > > > > > > Yes, it is a vote, and now that we see how everyone feels, we can > > > decide what to do. > > > > > > Hiroshi, you can't vote for 2, 3, and ?. > > > > Why ? > > I don't think the items are exclusive. > > Well, 2 says roll back only after transaction aborts, Sorry for my poor understanding. Isn't it 1 ? regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote totals for SET in aborted transaction
Hiroshi Inoue wrote:
> Bruce Momjian wrote:
> >
> > > > > I voted not only ? but also 2 and 3.
> > > > > And haven't I asked twice or so if it's a vote ?
> > > >
> > > > Yes, it is a vote, and now that we see how everyone feels, we can
> > > > decide what to do.
> > > >
> > > > Hiroshi, you can't vote for 2, 3, and ?.
> > >
> > > Why ?
> > > I don't think the items are exclusive.
> >
> > Well, 2 says roll back only after transaction aborts,
>
> Sorry for my poor understanding.
> Isn't it 1 ?

OK, original email attached.

	1 rolls back all SETs in an aborted transaction.
	2 ignores SETs after transaction aborts, but SETs before the transaction aborted are honored.
	3 honors all SETs.

---

In the case of:

	SET x=1;
	BEGIN;
	SET x=2;
	query_that_aborts_transaction;
	SET x=3;
	COMMIT;

at the end, should 'x' equal:

	1 - All SETs are rolled back in aborted transaction
	2 - SETs are ignored after transaction abort
	3 - All SETs are honored in aborted transaction
	? - Have SETs vary in behavior depending on variable

Our current behavior is 2.

--
Bruce Momjian                        |  http://candle.pha.pa.us
[EMAIL PROTECTED]                    |  (610) 853-3000
+ If your life is a hard drive,      |  830 Blythe Avenue
+ Christ can be your backup.         |  Drexel Hill, Pennsylvania 19026

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote:

>What's wrong with it? The insert command after *rollback*
>would fail. It seems the right thing to me. Otherwise
>the insert command would try to append the data of the
>table t1 to itself. The insert command is for copying
>schema1.t1 to foo.t1 in case the previous create schema
>command succeeded.

Exactly: this example shows exactly why SETs should be part of the transaction and roll back. Heck, the insert may not even fail after all, and would insert into the wrong schema. If the insert depends on the schema create succeeding, it should be in the same transaction. (IE it would get rolled back or not happen at all)

>regards,
>Hiroshi Inoue
>	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] PostgreSQL index usage discussion.
* mlw ([EMAIL PROTECTED]) [020424 18:51]: > > (2) Use programmatic hints which allow coders specify which indexes are used > during a query. (ala Oracle) We would certainly use this if it were available. Hopefully not to shoot ourselves in the foot, but for the rather common case of having a small set of important predefined queries that run over data sets that neither grow significantly nor change in characteristics (for example, a table of airline routes and fares, with a few million rows). We want to squeeze every last bit of performance out of certain queries, and we're willing to spend the time to verify that the manual tuning beats the planner. > What do you all think? What would be a good plan of attack? I dunno. If someone comes up with one that I can reasonably contribute to, I will. -Brad ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote:
>
> Hiroshi Inoue wrote:
> > Bruce Momjian wrote:
> > >
> > > > > > I voted not only ? but also 2 and 3.
> > > > > > And haven't I asked twice or so if it's a vote ?
> > > > >
> > > > > Yes, it is a vote, and now that we see how everyone feels, we can
> > > > > decide what to do.
> > > > >
> > > > > Hiroshi, you can't vote for 2, 3, and ?.
> > > >
> > > > Why ?
> > > > I don't think the items are exclusive.
> > >
> > > Well, 2 says roll back only after transaction aborts,
> >
> > Sorry for my poor understanding.
> > Isn't it 1 ?
>
> OK, original email attached.  1 rolls back all SETs in an aborted
> transaction.
>
> 2 ignores SETs after transaction aborts, but SETs before
> the transaction aborted are honored.

Must I understand this from your previous posting ("2 says roll back only after transaction aborts") or from the original posting? What I understood was that 2 only says that SET fails between a failure and the subsequent ROLLBACK call.

regards,
Hiroshi Inoue
	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Sequential Scan Read-Ahead
On Wed, 24 Apr 2002, Bruce Momjian wrote:

> We expect the file system to do read-aheads during a sequential scan.
> This will not happen if someone else is also reading buffers from that
> table in another place.

Right. The essential difficulties are, as I see it:

1. Not all systems do readahead.

2. Even systems that do do it cannot always reliably detect that they need to.

3. Even when the read-ahead does occur, you're still doing more syscalls, and thus more expensive kernel/userland transitions, than you have to.

Has anybody considered writing a storage manager that uses raw partitions and deals with its own buffer caching? This has the potential to be a lot more efficient, since the database server knows much more about its workload than the operating system can guess.

cjs
--
Curt Sampson <[EMAIL PROTECTED]>   +81 90 7737 2974   http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light.  --XTC

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote totals for SET in aborted transaction
Hiroshi Inoue wrote: > Bruce Momjian wrote: > > > > Hiroshi Inoue wrote: > > > Bruce Momjian wrote: > > > > > > > > > > > > > > > > > > I voted not only ? but also 2 and 3. > > > > > > > And haven't I asked twice or so if it's a vote ? > > > > > > > > > > > > Yes, it is a vote, and now that we see how everyone feels, we can > > > > > > decide what to do. > > > > > > > > > > > > Hiroshi, you can't vote for 2, 3, and ?. > > > > > > > > > > Why ? > > > > > I don't think the items are exclusive. > > > > > > > > Well, 2 says roll back only after transaction aborts, > > > > > > Sorry for my poor understanding. > > > Isn't it 1 ? > > > > OK, original email attached. 1 rolls back all SETs in an aborted > > transaction. > > > 2 ignores SETs after transaction aborts, but SETs before > > the transaction aborted are honored. > > Must I understand this from your previous posting > (2 says roll back only after transaction aborts,) > or original posting ? What I understood was 2 only > says that SET fails between a failure and the > subsequenct ROLLBACK call. Yes, 2 says that SET fails between failure query and COMMIT/ROLLBACK call, which is current behavior. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Vote on SET in aborted transaction
Michael Loftis wrote:
>
> Hiroshi Inoue wrote:
>
> >What's wrong with it? The insert command after *rollback*
> >would fail. It seems the right thing to me. Otherwise
> >the insert command would try to append the data of the
> >table t1 to itself. The insert command is for copying
> >schema1.t1 to foo.t1 in case the previous create schema
> >command succeeded.
>
> Exactly: this example shows exactly why SETs should be part of the
> transaction and roll back. Heck, the insert may not even fail after all,
> and would insert into the wrong schema. If the insert depends on the
> schema create succeeding, it should be in the same transaction. (IE it
> would get rolled back or not happen at all)

Where's the restriction that all objects in a schema must be created in a transaction? Each user has his own reasons and would need various kinds of command call sequences. What I've mainly insisted is that what to do with errors is the user's responsibility, but I've never seen agreement on it. So my current understanding is that you all are thinking what to do with errors is the system's responsibility. Then no matter how users call commands, the DBMS must behave appropriately, mustn't it?

regards,
Hiroshi Inoue
	http://w2422.nsk.ne.jp/~inoue/

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Sequential Scan Read-Ahead
Curt Sampson wrote: > On Wed, 24 Apr 2002, Bruce Momjian wrote: > > > We expect the file system to do read-aheads during a sequential scan. > > This will not happen if someone else is also reading buffers from that > > table in another place. > > Right. The essential difficulties are, as I see it: > > 1. Not all systems do readahead. If they don't, that isn't our problem. We expect it to be there, and if it isn't, the vendor/kernel is at fault. > 2. Even systems that do do it cannot always reliably detect that > they need to. Yes, a seek() in the file will turn off read-ahead. Grabbing bigger chunks would help here, but if you have two people already reading from the same file, grabbing bigger chunks of the file may not be optimal. > 3. Even when the read-ahead does occur, you're still doing more > syscalls, and thus more expensive kernel/userland transitions, than > you have to. I would guess the performance impact is minimal. > Has anybody considered writing a storage manager that uses raw > partitions and deals with its own buffer caching? This has the potential > to be a lot more efficient, since the database server knows much more > about its workload than the operating system can guess. We have talked about it, but rejected it. Look in TODO.detail in optimizer and performance for 'raw'. Also interesting info there about optimizer cost estimates we have been talking about. Specifically see: http://candle.pha.pa.us/mhonarc/todo.detail/performance/msg9.html Also see: http://candle.pha.pa.us/mhonarc/todo.detail/optimizer/msg00011.html -- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote: > Michael Loftis wrote: > > > > Hiroshi Inoue wrote: > > > > >What's wrong with it ? The insert command after *rollback* > > >would fail. It seems the right thing to me. Otherwise > > >the insert command would try to append the data of the > > >table t1 to itself. The insert command is for copying > > >schema1.t1 to foo.t1 in case the previous create schema > > >command suceeded. > > > > > Exactly, in this example shows exactly why SETs should be part of the > > transaction and roll back. Heck the insert may not even fail after all > > anyway and insert into the wrong schema. If the insert depends on the > > schema create succeeding it should be in the same transaction. (IE it > > would get rolled back or not happen at all) > > Where's the restriction that all objects in a schema > must be created in an transaction ? Each user has his > reason and would need various kind of command call sequence. > What I've mainly insisted is what to do with errors is > users' responsibilty but I've never seen the agreement > for it. So my current understanding is you all > are thinking what to do with errors is system's > responsibilty. Then no matter how users call commands > the dbms must behave appropriately, mustn't it ? Hiroshi, we need a psql solution too. People are feeding query files into psql all the time and we should have an appropriate behavior for them. I now understand your point that from a ODBC perspective, you may not want SETs rolled back and you would rather ODBC handle what to do with SETs. Not sure I like pushing that job off to the application programmer, but I think I see your point. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote: > Hiroshi Inoue wrote: > > Bruce Momjian wrote: > > > > > > Hiroshi Inoue wrote: > > > > Bruce Momjian wrote: > > > > > > > > > > > > > > > > > > > > > I voted not only ? but also 2 and 3. > > > > > > > > And haven't I asked twice or so if it's a vote ? > > > > > > > > > > > > > > Yes, it is a vote, and now that we see how everyone feels, we can > > > > > > > decide what to do. > > > > > > > > > > > > > > Hiroshi, you can't vote for 2, 3, and ?. > > > > > > > > > > > > Why ? > > > > > > I don't think the items are exclusive. > > > > > > > > > > Well, 2 says roll back only after transaction aborts, > > > > > > > > Sorry for my poor understanding. > > > > Isn't it 1 ? > > > > > > OK, original email attached. 1 rolls back all SETs in an aborted > > > transaction. > > > > > 2 ignores SETs after transaction aborts, but SETs before > > > the transaction aborted are honored. > > > > Must I understand this from your previous posting > > (2 says roll back only after transaction aborts,) > > or original posting ? What I understood was 2 only > > says that SET fails between a failure and the > > subsequenct ROLLBACK call. > > Yes, 2 says that SET fails between failure query and COMMIT/ROLLBACK > call, which is current behavior. What about a SET variable that controls the behaviour of SET variables :-) Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote: >Michael Loftis wrote: > >>Hiroshi Inoue wrote: >> >>>What's wrong with it ? The insert command after *rollback* >>>would fail. It seems the right thing to me. Otherwise >>>the insert command would try to append the data of the >>>table t1 to itself. The insert command is for copying >>>schema1.t1 to foo.t1 in case the previous create schema >>>command succeeded. >>> >>Exactly, this example shows exactly why SETs should be part of the >>transaction and roll back. Heck, the insert may not even fail after all >>anyway and insert into the wrong schema. If the insert depends on the >>schema create succeeding it should be in the same transaction. (IE it >>would get rolled back or not happen at all) >> > >Where's the restriction that all objects in a schema >must be created in a transaction ? Each user has his >reasons and would need various kinds of command call sequences. >What I've mainly insisted is that what to do with errors is >the users' responsibility, but I've never seen agreement >on it. So my current understanding is you all >are thinking what to do with errors is the system's >responsibility. Then no matter how users call commands >the dbms must behave appropriately, mustn't it ? > IMHO as a user and developer it's more important to behave consistently. A rollback should cause everything inside of a transaction block to roll back. If you need to keep something then it should either be done in its own transaction, or outside of an explicit transaction entirely. There is no restriction. The system is handling an error in the way instructed by the user, either ROLLBACK or COMMIT. If you COMMIT with errors, it's your problem. But if you ask the system to ROLLBACK, it's the user's expectation that the DBMS will ROLLBACK. Not ROLLBACK this and that, but leave another thing alone. You say BEGIN ... COMMIT you expect a COMMIT, you say BEGIN ... ROLLBACK you expect a ROLLBACK. You say BEGIN ...
END the DBMS should 'do the right thing' (IE COMMIT if successful, ROLLBACK if not). That's the behaviour I'd expect from ANY transactional system. The user will (and rightfully so) expect a ROLLBACK to do just that for everything. Yes this will break the way things work currently, but on the whole, and going forward, it makes the system consistent. Right now we roll back SELECTs, CREATEs, UPDATEs, etc., but not SETs (or at least from what I can tell that's what we do.) I understand what you're saying Hiroshi-san, but really, it's a very weak reason. If you (as a programmer/developer) do something like in your earlier example (perform an insert after ROLLBACK) then you know an error occurred, and it's your own fault for inserting into the wrong table outside of the transaction. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Vote totals for SET in aborted transaction
Bruce Momjian wrote: > > Hiroshi Inoue wrote: > > > > Must I understand this from your previous posting > > (2 says roll back only after transaction aborts,) > > or the original posting ? What I understood was 2 only > > says that SET fails between a failure and the > > subsequent ROLLBACK call. > > Yes, 2 says that SET fails between the failing query and the COMMIT/ROLLBACK > call, which is current behavior. Oh I see. It was my mistake to have participated in this vote. I'm not qualified in the first place because I wasn't able to understand your vote list. regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote: > > Sure should it! You gave an example for the need to roll > > back, because > > otherwise you would end up with an invalid > > search path "foo". > > What's wrong with it ? The insert command after *rollback* > would fail. It seems the right thing to me. Otherwise > the insert command would try to append the data of the > table t1 to itself. The insert command is for copying > schema1.t1 to foo.t1 in case the previous create schema > command suceeded. Wrong about your entire example is that the rollback is sheer wrong placed to make up your case ;-p There is absolutely no need to put the insert outside of the transaction that is intended to copy schema1.t1 to foo.t1. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] referential integrity problem
It seems we can create a foreign key using the REFERENCES privilege but cannot drop the table if its owner is not the same as that of the referenced table. Is this a feature or a bug? -- create a table as user foo \c - foo create table t1(i int primary key); -- grant references privilege to user bar grant references on t1 to bar; -- create a table as user bar \c - bar create table t2(i int references t1); -- cannot drop t2 as user bar? drop table t2; NOTICE: DROP TABLE implicitly drops referential integrity trigger from table "t1" ERROR: t1: Must be table owner. -- Tatsuo Ishii ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote on SET in aborted transaction
Bruce Momjian wrote: >Hiroshi, we need a psql solution too. People are feeding query files >into psql all the time and we should have an appropriate behavior for >them. > >I now understand your point that from an ODBC perspective, you may not >want SETs rolled back and you would rather ODBC handle what to do with >SETs. Not sure I like pushing that job off to the application >programmer, but I think I see your point. > Ahhh, Hiroshi is talking from the aspect of ODBC? Well, that's an ODBC issue, and should be handled by the ODBC driver. Compliance with the ODBC spec (or non-compliance) is not the issue of PostgreSQL proper. That's the issue of the ODBC driver and its maintainers (sorry if I'm sounding like a bastard but heh). If we start catering to all the different driver layers then we'll end up with a huge mess. What we're 'catering' to is the SQLxx specs, and the expectations of a user when running and developing programs, am I right? ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote on SET in aborted transaction
Michael Loftis wrote: > > Hiroshi Inoue wrote: > > >Where's the restriction that all objects in a schema > >must be created in an transaction ? Each user has his > >reason and would need various kind of command call sequence. > >What I've mainly insisted is what to do with errors is > >users' responsibilty but I've never seen the agreement > >for it. So my current understanding is you all > >are thinking what to do with errors is system's > >responsibilty. Then no matter how users call commands > >the dbms must behave appropriately, mustn't it ? > > > IMHO as a user and developer it's more important to behave consistently. > A rollback should cause everything inside of a transaciton block to > rollback. Where does the *should* come from ? The standard says that changes to the database should be put back but doesn't say everything should be put back. regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Vote on SET in aborted transaction
Jan Wieck wrote: > > Hiroshi Inoue wrote: > > > Sure should it! You gave an example for the need to roll > > > back, because > > > otherwise you would end up with an invalid > > > search path "foo". > > > > What's wrong with it ? The insert command after *rollback* > > would fail. It seems the right thing to me. Otherwise > > the insert command would try to append the data of the > > table t1 to itself. The insert command is for copying > > schema1.t1 to foo.t1 in case the previous create schema > > command succeeded. > > Wrong about your entire example is that the rollback is sheer > wrong placed to make up your case ;-p Is this issue about the wrong (not preferable?) sequence of calls ? Please don't miss the point. regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Vote on SET in aborted transaction
Bruce Momjian wrote: > > Hiroshi Inoue wrote: > Hiroshi, we need a psql solution too. People are feeding query files > into psql all the time and we should have an appropriate behavior for > them. What are you expecting from psql for, e.g., the following wrong(?) example ? [The current schema is schema1] begin; create schema foo; set search_path = foo; create table t1 (); [error occurs] commit; insert into t1 select * from schema1.t1; regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Vote on SET in aborted transaction
Michael Loftis wrote: > > Bruce Momjian wrote: > > >Hiroshi, we need a psql solution too. People are feeding query files > >into psql all the time and we should have an appropriate behavior for > >them. > > > >I now understand your point that from a ODBC perspective, you may not > >want SETs rolled back and you would rather ODBC handle what to do with > >SETs. Not sure I like pushing that job off to the application > >programmer, but I think I see your point. > > > > Ahhh Hiroshi is talkign formt he aspect of ODBC? Well, thats an ODBC > issue, should be handled by the ODBC driver. No. regards, Hiroshi Inoue ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote: > Bruce Momjian wrote: > > > > Hiroshi Inoue wrote: > > Hiroshi, we need a psql solution too. People are feeding query files > > into psql all the time and we should have an appropriate behavior for > > them. > > What are you expecting for psql e.g. the following > wrong(?) example ? > > [The curren schema is schema1] > begin; > create schema foo; > set search_path = foo; > create table t1 (); [error occurs] > commit; > insert into t1 select * from schema1.t1; I am expecting the INSERT will use the search_path value that existed before the error transaction began. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup.| Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Vote on SET in aborted transaction
Bruce Momjian wrote: > > Hiroshi Inoue wrote: > > > > What are you expecting for psql e.g. the following > > wrong(?) example ? > > > > [The curren schema is schema1] > > begin; > > create schema foo; > > set search_path = foo; > > create table t1 (); [error occurs] > > commit; > > insert into t1 select * from schema1.t1; > > I am expecting the INSERT will use the search_path value that existed > before the error transaction began. > So you see foo.t1 which is a copy of schema1.t1 if all were successful and you may be able to see the doubled schema1.t1 in case of errors. regards, Hiroshi Inoue http://w2422.nsk.ne.jp/~inoue/ ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Inefficient handling of LO-restore + Patch
Mario Weilguni wrote: > On Wednesday, 24 April 2002 16:03, Bruce Momjian wrote: > > OK, I have applied the following patch to fix these warnings. However, > > I need Mario to confirm these are the right changes. Thanks. > > I've checked it and it works fine, but the memcpy() prototype says it should be > void pointers. Will this give errors with non-gcc compilers? No, it is fine. Anything can be cast _to_ a void pointer. You just can't do arithmetic on them. Are you sure you want to use 'void *' in your code? Looking at the backend large object code, I see char *: extern int inv_read(LargeObjectDesc *obj_desc, char *buf, int nbytes); extern int inv_write(LargeObjectDesc *obj_desc, char *buf, int nbytes); I guess my point is that these are just streams of bytes, _not_ really streams of items of unknown length. We know the length, and the unit is char. This may simplify the code. -- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Vote on SET in aborted transaction
Hiroshi Inoue wrote: > Bruce Momjian wrote: > > > > Hiroshi Inoue wrote: > > > > > > What are you expecting from psql for, e.g., the following > > > wrong(?) example ? > > > > > > [The current schema is schema1] > > > begin; > > > create schema foo; > > > set search_path = foo; > > > create table t1 (); [error occurs] > > > commit; > > > insert into t1 select * from schema1.t1; > > > > I am expecting the INSERT will use the search_path value that existed > > before the error transaction began. > > > > So you see foo.t1 which is a copy of schema1.t1 > if all were successful and you may be able to see > the doubled schema1.t1 in case of errors. Yes, I think that is how it would behave. If you don't roll back 'set search_path', you are pointing to a non-existent schema. Probably the proper thing here would be to have the INSERT in the transaction too. -- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Vote on SET in aborted transaction
Bruce Momjian wrote: > > > > > > > > > What are you expecting for psql e.g. the following > > > > wrong(?) example ? > > > > > > > > [The curren schema is schema1] > > > > begin; > > > > create schema foo; > > > > set search_path = foo; > > > > create table t1 (); [error occurs] > > > > commit; > > > > insert into t1 select * from schema1.t1; > > > > > > I am expecting the INSERT will use the search_path value that existed > > > before the error transaction began. > > > > > > > So you see foo.t1 which is a copy of schema1.t1 > > if all were successful and you may be able to see > > the doubled schema1.t1 in case of errors. > > Yes, I think that is how it would behave. If you don't roll back 'set > search_path', you are pointing to a non-existant schema. OK I see your standpoint. If Tom agrees with Bruce I don't object any more. regards, Hiroshi Inoue ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Sequential Scan Read-Ahead
On Wed, 24 Apr 2002, Bruce Momjian wrote: > > 1. Not all systems do readahead. > > If they don't, that isn't our problem. We expect it to be there, and if > it isn't, the vendor/kernel is at fault. It is your problem when another database kicks Postgres' ass performance-wise. And at that point, *you're* at fault. You're the one who's knowingly decided to do things inefficiently. Sorry if this sounds harsh, but this, "Oh, someone else is to blame" attitude gets me steamed. It's one thing to say, "We don't support this." That's fine; there are often good reasons for that. It's a completely different thing to say, "It's an unrelated entity's fault we don't support this." At any rate, relying on the kernel to guess how to optimise for the workload will never work as well as the software that knows the workload doing the optimization. The lack of support thing is no joke. Sure, lots of systems nowadays support unified buffer cache and read-ahead. But how many, besides Solaris, support free-behind, which is also very important to avoid blowing out your buffer cache when doing sequential reads? And who at all supports read-ahead for reverse scans? (Or does Postgres not do those, anyway? I can see the support is there.) And even when the facilities are there, you create problems by using them. Look at the OS buffer cache, for example. Not only do we lose efficiency by using two layers of caching, but (as people have pointed out recently on the lists), the optimizer can't even know how much or what is being cached, and thus can't make decisions based on that. > Yes, a seek() in the file will turn off read-ahead. Grabbing bigger chunks > would help here, but if you have two people already reading from the > same file, grabbing bigger chunks of the file may not be optimal. Grabbing bigger chunks is always optimal, AFAICT, if they're not *too* big and you use the data. A single 64K read takes very little longer than a single 8K read. > > 3. 
Even when the read-ahead does occur, you're still doing more > > syscalls, and thus more expensive kernel/userland transitions, than > > you have to. > > I would guess the performance impact is minimal. If it were minimal, people wouldn't work so hard to build multi-level thread systems, where multiple userland threads are scheduled on top of kernel threads. However, it does depend on how much CPU your particular application is using. You may have it to spare. > http://candle.pha.pa.us/mhonarc/todo.detail/performance/msg9.html Well, this message has some points in it that I feel are just incorrect. 1. It is *not* true that you have no idea where data is when using a storage array or other similar system. While you certainly ought not worry about things such as head positions and so on, it's been a given for a long, long time that two blocks that have close index numbers are going to be close together in physical storage. 2. Raw devices are quite standard across Unix systems (except in the unfortunate case of Linux, which I think has been remedied, hasn't it?). They're very portable, and have just as well--if not better--defined write semantics as a filesystem. 3. My observations of OS performance tuning over the past six or eight years contradict the statement, "There's a considerable cost in complexity and code in using "raw" storage too, and it's not a one off cost: as the technologies change, the "fast" way to do things will change and the code will have to be updated to match." While optimizations have been removed over the years the basic optimizations (order reads by block number, do larger reads rather than smaller, cache the data) have remained unchanged for a long, long time. 4. "Better to leave this to the OS vendor where possible, and take advantage of the tuning they do." Well, sorry guys, but have a look at the tuning they do. 
It hasn't changed in years, except to remove now-unnecessary complexity related to really, really old and slow disk devices, and to add a few things that guess at the workload but still do a worse job than if the workload generator just did its own optimisations in the first place. > http://candle.pha.pa.us/mhonarc/todo.detail/optimizer/msg00011.html Well, this one, with statements like "Postgres does have control over its buffer cache," I don't know what to say. You can interpret the statement however you like, but in the end Postgres has very little control at all over how data is moved between memory and disk. BTW, please don't take me as saying that all control over physical IO should be done by Postgres. I just think that Postgres could do a better job of managing data transfer between disk and memory than the OS can. The rest of the things (using raw partitions, read-ahead, free-behind, etc.) just drop out of that one idea. cjs -- Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://
Re: [HACKERS] Sequential Scan Read-Ahead
Curt Sampson <[EMAIL PROTECTED]> writes: > Grabbing bigger chunks is always optimal, AFAICT, if they're not > *too* big and you use the data. A single 64K read takes very little > longer than a single 8K read. Proof? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Sequential Scan Read-Ahead
> Curt Sampson <[EMAIL PROTECTED]> writes: > > Grabbing bigger chunks is always optimal, AFAICT, if they're not > > *too* big and you use the data. A single 64K read takes very little > > longer than a single 8K read. > > Proof? A long time ago I tested with a 32k block size and got a 1.5-2x speedup compared with the ordinary 8k block size in the sequential scan case. FYI, in case it helps. -- Tatsuo Ishii ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] Sequential Scan Read-Ahead
Well, this is a very interesting email. Let me comment on some points. --- Curt Sampson wrote: > On Wed, 24 Apr 2002, Bruce Momjian wrote: > > > > 1. Not all systems do readahead. > > > > If they don't, that isn't our problem. We expect it to be there, and if > > it isn't, the vendor/kernel is at fault. > > It is your problem when another database kicks Postgres' ass > performance-wise. > > And at that point, *you're* at fault. You're the one who's knowingly > decided to do things inefficiently. It is just hard to imagine an OS not doing read-ahead, at least in simple cases. > Sorry if this sounds harsh, but this, "Oh, someone else is to blame" > attitude gets me steamed. It's one thing to say, "We don't support > this." That's fine; there are often good reasons for that. It's a > completely different thing to say, "It's an unrelated entity's fault we > don't support this." Well, we are guilty of trying to push as much as possible on to other software. We do this for portability reasons, and because we think our time is best spent dealing with db issues, not issues that can be dealt with by other existing software, as long as the software is decent. > At any rate, relying on the kernel to guess how to optimise for > the workload will never work as well as the software that > knows the workload doing the optimization. Sure, that is certainly true. However, it is hard to know what the future will hold even if we had perfect knowledge of what was happening in the kernel. We don't know who else is going to start doing I/O once our I/O starts. We may have a better idea with kernel knowledge, but we still don't know 100% what will be cached. > The lack of support thing is no joke. Sure, lots of systems nowadays > support unified buffer cache and read-ahead. But how many, besides > Solaris, support free-behind, which is also very important to avoid We have free-behind on our list. 
I think LRU-K will do this quite well and be a nice general solution for more than just sequential scans. > blowing out your buffer cache when doing sequential reads? And who > at all supports read-ahead for reverse scans? (Or does Postgres > not do those, anyway? I can see the support is there.) > > And even when the facilities are there, you create problems by > using them. Look at the OS buffer cache, for example. Not only do > we lose efficiency by using two layers of caching, but (as people > have pointed out recently on the lists), the optimizer can't even > know how much or what is being cached, and thus can't make decisions > based on that. Again, are you going to know 100% anyway? > > > Yes, seek() in file will turn off read-ahead. Grabbing bigger chunks > > would help here, but if you have two people already reading from the > > same file, grabbing bigger chunks of the file may not be optimal. > > Grabbing bigger chunks is always optimal, AFICT, if they're not > *too* big and you use the data. A single 64K read takes very little > longer than a single 8K read. There may be validity in this. It is easy to do (I think) and could be a win. > > > 3. Even when the read-ahead does occur, you're still doing more > > > syscalls, and thus more expensive kernel/userland transitions, than > > > you have to. > > > > I would guess the performance impact is minimal. > > If it were minimal, people wouldn't work so hard to build multi-level > thread systems, where multiple userland threads are scheduled on > top of kernel threads. > > However, it does depend on how much CPU your particular application > is using. You may have it to spare. I assume those apps are doing tons of kernel calls. I don't think we really do that many. > > http://candle.pha.pa.us/mhonarc/todo.detail/performance/msg9.html > > Well, this message has some points in it that I feel are just incorrect. > > 1. 
It is *not* true that you have no idea where data is when > using a storage array or other similar system. While you > certainly ought not worry about things such as head positions > and so on, it's been a given for a long, long time that two > blocks that have close index numbers are going to be close > together in physical storage. SCSI drivers, for example, are pretty smart. Not sure we can take advantage of that from user-land I/O. > 2. Raw devices are quite standard across Unix systems (except > in the unfortunate case of Linux, which I think has been > remedied, hasn't it?). They're very portable, and have just as > well--if not better--defined write semantics as a filesystem. Yes, but we are seeing some db's moving away from raw I/O. Our performance numbers beat most of the big db's already, so we must be doing something right. In fact, our big failing is more is missing features and limitations of our db, rather than performance. > 3. My observations of OS performance tu
Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE
On Wed, 24 Apr 2002, mlw wrote: > I am not arguing about whether or not they do it, I am saying it is > not always possible. I/O requests do not remain in the queue waiting for > reordering indefinitely. It doesn't matter. When they go out to the disk they go out in order. On every Unix-based OS I know of, and Novell Netware, if you submit a single read request for consecutive blocks, those blocks *will* be read sequentially, no matter what the system load. So to get back to the original argument: > > >The supposed advantage of a sequential read over a random read, in > > >an active multitasking system, is a myth. If you are executing one > > >query and the system is doing only that query, you may be right. No, it's very real, because your sequential read will not be broken up. If you think it will, let me know which operating systems this happens on, and how exactly it happens. cjs -- Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
[HACKERS] md5 passwords and pg_shadow
Hi all,

Why does the password_encryption GUC variable default to false?

AFAICT there shouldn't be any issues with client compatibility -- in fact, I'd be inclined to rip out all support for storing cleartext passwords...

Cheers,

Neil
--
Neil Conway <[EMAIL PROTECTED]> PGP Key ID: DB3C29FC
Re: [HACKERS] Sequential Scan Read-Ahead
Tom Lane wrote:

> Curt Sampson <[EMAIL PROTECTED]> writes:
>
>> Grabbing bigger chunks is always optimal, AFAICT, if they're not
>> *too* big and you use the data. A single 64K read takes very little
>> longer than a single 8K read.
>
> Proof?

I contest this statement. It's optimal only up to a point. I know that my system settles into its best read speeds at 32K or 64K chunks; 8K chunks are far below optimal for my system. Most systems I work on do far better at 16K than at 8K, and most don't see any degradation when going to 32K chunks. (This is across numerous OSes and configs -- the results are interpretations from bonnie disk I/O benchmarks.)

Depending on what you're doing, it is more efficient to read bigger blocks, up to a point. If you're multi-threaded or reading in non-blocking mode, take as big a chunk as you can handle or are ready to process in quick order. If you're picking up a bunch of little chunks here and there, and know you're not using them again, then choose a size that will hopefully cause some of the reads to overlap; failing that, pick the smallest usable read size. The OS can never do that stuff for you.

---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
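The chunk-size effect described above can be measured with a rough sketch like the following. Everything here is illustrative: the scratch file is tiny, so on a real system the OS page cache will dominate; a meaningful bonnie-style comparison needs a file larger than RAM.

```python
# Rough sketch: time sequential reads of the same file at several
# chunk sizes, as in the 8K-vs-64K comparison above. The scratch file
# and sizes are hypothetical stand-ins, not a real benchmark.
import os
import tempfile
import time

def timed_read(path, chunk_size):
    """Read the whole file in chunk_size pieces; return (bytes, seconds)."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    start = time.monotonic()
    try:
        while True:
            buf = os.read(fd, chunk_size)
            if not buf:
                break
            total += len(buf)
    finally:
        os.close(fd)
    return total, time.monotonic() - start

# 4 MB scratch file standing in for a table file
scratch = tempfile.NamedTemporaryFile(delete=False)
scratch.write(b"\0" * (4 * 1024 * 1024))
scratch.close()

results = {}
for size in (8 * 1024, 16 * 1024, 32 * 1024, 64 * 1024):
    nbytes, secs = timed_read(scratch.name, size)
    results[size] = (nbytes, secs)
    print(f"{size // 1024:3d}K chunks: {nbytes} bytes in {secs:.4f}s")

os.unlink(scratch.name)
```

With a cold cache and a large enough file, the larger chunk sizes issue far fewer syscalls and read requests for the same data, which is the effect the thread is debating.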
Re: [HACKERS] md5 passwords and pg_shadow
Neil Conway wrote:

> Hi all,
>
> Why does the password_encryption GUC variable default to false?
>
> AFAICT there shouldn't be any issues with client compatibility -- in
> fact, I'd be inclined to rip out all support for storing cleartext
> passwords...

It is false so passwords can be handled by pre-7.2 clients. Once you encrypt them, you can't use passwords on pre-7.2 clients because they don't understand the double-md5 hash required. We will set it to true, but when are most pre-7.2 clients gone?

--
Bruce Momjian | http://candle.pha.pa.us
[EMAIL PROTECTED] | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
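For readers unfamiliar with the "double-md5 hash" the reply refers to, here is a sketch of the scheme as I understand PostgreSQL's md5 authentication to work (the usernames, password, and salt are made up for illustration): pg_shadow stores a hash of password and username, and at connect time the client hashes that stored value again with a per-connection random salt.

```python
# Sketch of the double-md5 scheme referenced above (illustrative
# values; treat the exact wire format as an assumption, not a spec).
import hashlib

def stored_password(password: str, username: str) -> str:
    """What pg_shadow stores when the password is md5-encrypted."""
    return "md5" + hashlib.md5((password + username).encode()).hexdigest()

def challenge_response(stored: str, salt: bytes) -> str:
    """Second hash: proves knowledge of the stored value for this salt."""
    inner = stored[3:]  # strip the "md5" prefix
    return "md5" + hashlib.md5(inner.encode() + salt).hexdigest()

stored = stored_password("secret", "neilc")
resp = challenge_response(stored, b"\x01\x02\x03\x04")
print(stored)
print(resp)
# A pre-7.2 client that only knows how to send a cleartext password
# cannot compute `resp` -- which is the compatibility concern above.
```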
Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE
A block-sized read will not be broken up. But if you're reading in a size bigger than the underlying system's block sizes, then it can get broken up. So yes, a sequential read will get broken up.

A single read request for a block may or may not get broken up. If you're fread()ing with set block sizes you'll see the set sizes of blocks come through, but what the underlying OS does is undefined; the same goes for writing. If the underlying block size is 8KB and you dump 4MB down on it, the OS may (and in many cases does) decide to write part of it, do a read on a nearby sector, then write the rest. This happens when doing long writes that end up spanning block groups, because the inodes must be allocated. So the write WILL get broken up. Reads are under the same gun. It all depends on how big. The application may or may not see this (probably not, unless you're set non-blocking, because the kernel will just sleep you until your data is ready).

Further, large read requests can of course be re-ordered by hardware: Tagged Command Queueing on SCSI drives and RAIDs. The ICP Vortex cards I use in a number of systems have 64MB of on-board cache. They quite happily, and often, re-order reads and writes when queueing them to keep things moving as fast as possible. (Intel didn't buy them for their cards -- they use the i960 as it is; Intel swiped them for their IP rights.) The OS also tags commands it fires to the ICP, which can be re-ordered in block-sized chunks.

Curt Sampson wrote:

> On Wed, 24 Apr 2002, mlw wrote:
>
>> I am not arguing about whether or not they do it, I am saying it is
>> not always possible. I/O requests do not remain in queue waiting for
>> reordering indefinitely.
>
> It doesn't matter. When they go out to the disk they go out in
> order. On every Unix-based OS I know of, and Novell Netware, if
> you submit a single read request for consecutive blocks, those
> blocks *will* be read sequentially, no matter what the system load.
> So to get back to the original argument:
>
>>> The supposed advantage of a sequential read over a random read, in
>>> an active multitasking system, is a myth. If you are executing one
>>> query and the system is doing only that query, you may be right.
>
> No, it's very real, because your sequential read will not be broken up.
>
> If you think it will, let me know which operating systems this
> happens on, and how exactly it happens.
>
> cjs

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
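The request re-ordering debated in this exchange is the classic elevator sort, like the disksort() in NetBSD's src/sys/kern/subr_disk.c mentioned earlier in the thread. A much-simplified sketch of the idea (not the kernel code): given the current head position, service all requests at or past the head in ascending order, then sweep back to the lowest outstanding block.

```python
# Simplified one-way-sweep (C-LOOK style) ordering of queued block
# requests; block numbers below are made up for illustration.
def elevator_order(head, requests):
    """Return requests in one-way sweep order from the head position."""
    ahead = sorted(b for b in requests if b >= head)   # serviced this sweep
    behind = sorted(b for b in requests if b < head)   # next sweep
    return ahead + behind

order = elevator_order(50, [10, 75, 52, 99, 3, 49])
print(order)  # [52, 75, 99, 3, 10, 49]
```

The point both sides agree on is visible here: consecutive blocks submitted together stay consecutive in the service order; only unrelated requests get interleaved around them.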
Re: [HACKERS] Index Scans become Seq Scans after VACUUM ANALYSE
On Wed, 24 Apr 2002, Michael Loftis wrote:

> A block-sized read will not be broken up. But if you're reading in a
> size bigger than the underlying system's block sizes then it can get
> broken up.

In which operating systems, and under what circumstances? I'll agree that some OSs may not coalesce adjacent reads into a single read, but even so, submitting a bunch of single reads for consecutive blocks is going to be much, much faster than if other, random I/O occurred between those reads.

> If the underlying block size is 8KB and you dump 4MB down on it, the
> OS may (and in many cases does) decide to write part of it, do a read
> on a nearby sector, then write the rest. This happens when doing long
> writes that end up spanning block groups because the inodes must be
> allocated.

Um...we're talking about 64K vs. 8K reads here, not 4MB reads. I am certainly not suggesting Postgres ever submit 4MB read requests to the OS. I agree that any single-chunk reads or writes that cause non-adjacent disk blocks to be accessed may be broken up. But in that sense they're "broken up" anyway, in that you have no choice but to take a performance hit.

> Further large read requests can of course be re-ordered by hardware.
> ...The OS also tags commands it fires to the ICP, which can be
> re-ordered in block-sized chunks.

Right. All the more reason to read in larger chunks when we know what we need in advance, because that will give the OS, controllers, etc. more advance information, and let them do the reads more efficiently.

cjs
--
Curt Sampson <[EMAIL PROTECTED]> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're all light. --XTC
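Giving the OS advance notice of a sequential access pattern, as argued above, is what posix_fadvise() later standardized. A hedged sketch (Python exposes it as os.posix_fadvise on platforms that have it, hence the hasattr guard; the file here is a made-up stand-in for a heap file):

```python
# Sketch: advise the kernel that a file will be read sequentially so
# it can read ahead aggressively, then scan it in 64K chunks.
import os
import tempfile

CHUNK = 64 * 1024

# hypothetical stand-in for a table file being sequentially scanned
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"x" * (256 * 1024))
tmp.close()

fd = os.open(tmp.name, os.O_RDONLY)
try:
    if hasattr(os, "posix_fadvise"):
        # offset 0, length 0 means "the whole file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
    total = 0
    while True:
        buf = os.read(fd, CHUNK)
        if not buf:
            break
        total += len(buf)
finally:
    os.close(fd)
os.unlink(tmp.name)
print(f"read {total} bytes in {CHUNK // 1024}K chunks")
```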
Re: [HACKERS] PostgreSQL index usage discussion.
On Thu, 2002-04-25 at 00:46, mlw wrote:

> We have had several threads about index usage, specifically when
> PostgreSQL has the choice of using one or not.
>
> There seem to be a few points of view:
>
> (1) The planner and statistics need to improve, so that erroneously
> using an index (or not) happens less frequently or not at all.
>
> (2) Use programmatic hints which allow coders to specify which indexes
> are used during a query (a la Oracle).
>
> (3) It is pretty much OK as-is; just use enable_seqscan=false in the
> query.
>
> My point of view on this subject is one from personal experience. I
> had a database on which PostgreSQL would always (erroneously) choose
> not to use an index. Are my experiences typical? Probably not, but are
> experiences like it very common?

I currently have 2 databases that run with enable_seqscan=false to avoid choosing plans that take forever.

> I don't know, but we see enough "Why won't PostgreSQL use my index"
> messages to at least conclude that it happens every now and then. In
> my experience, when it happens, it is very frustrating.
>
> I think statement (1) is a good idea, but I think it is optimistic to
> expect that a statistical analysis of a table will contain enough
> information for all possible cases.

Perhaps we can come up with some special rules to avoid grossly pessimal plans.

Hannu

---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
Re: [HACKERS] PostgreSQL index usage discussion.
> (2) Use programmatic hints which allow coders to specify which indexes
> are used during a query (a la Oracle).

As I said before, it would be useful to have a way to improve (not force) the use of indexes on particular queries, i.e. lowering the cost of using a given index on a given query.

Regards

---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] PostgreSQL index usage discussion.
On Thu, 2002-04-25 at 08:42, Luis Alberto Amigo Navarro wrote:

> > (2) Use programmatic hints which allow coders to specify which
> > indexes are used during a query (a la Oracle).
>
> As I said before, it would be useful to have a way to improve (not
> force) the use of indexes on particular queries, i.e. lowering the
> cost of using this index on this query.
> Regards

I was told that DB2 has per-table (or rather, per-tablespace) knowledge of disk speeds, so keeping separate random and seqscan costs for each table and index could be a good way here (to force use of a particular index, make its use cheap).

Hannu

---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] PostgreSQL index usage discussion.
> I was told that DB2 has per-table (or rather, per-tablespace) knowledge
> of disk speeds, so keeping separate random and seqscan costs for each
> table and index could be a good way here (to force use of a particular
> index, make its use cheap)

I was thinking of something even easier: keeping one cost per index and one cost per seqscan, but being allowed to scale the cost of each index on each query (recommended, null or unrecommended).

Regards