Re: [HACKERS] New trigger option of pg_standby
Hi Simon, Thanks for the comments! On Thu, Apr 16, 2009 at 2:56 AM, Simon Riggs wrote: > > On Wed, 2009-04-15 at 17:02 +0900, Fujii Masao wrote: > >> On Tue, Apr 14, 2009 at 2:41 PM, Fujii Masao wrote: >> > I'd like to propose another simple idea; pg_standby deletes the >> > trigger file *whenever* the nextWALfile is a timeline history file. >> > A timeline history file is restored at the end of recovery, so it's >> > guaranteed that the trigger file is deleted whether nextWALfile >> > exists or not. >> > >> > A timeline history file is restored also at the beginning of >> > recovery, so the accidentally remaining trigger file is deleted >> > in early warm-standby as a side-effect of this idea. >> >> Here is the revised patch as above. >> >> If you notice something, please feel free to comment. > > Deleting the trigger file when we request a history file works in most > cases, but not in all. We also request a history file when we switch > timelines, so code comments need slight modification. > > If take a base backup, switchover and then try to regen the primary from > the base backup we would need to switch timelines, which could be > problematic. That is unlikely, so we should at least very clearly > document the actual behaviour, as we do in the code comments. "switch timelines" means that a new timeline ID is assigned at the end of archive recovery? If so, even in this case, there is no problem with deleting the trigger file, I think. Or, am I misunderstanding? > I think your wording that smart mode guarantees no data will be lost is > a little strong. I'd say "on successful completion all WAL records will > be replayed resulting in zero data loss". Sounds good. I'll change the wording. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Postgres SQL specification (tests)
Hi, i have built an SQL interface using rule system which supports row versioning and i would like to test it against Postgres SQL specification. Is there something like test cases for postgres SQL interface? Or do you have any ideas how to build a group of all possible cases of table structure and generate appropriate queries, which can be tested against postgres? thanks -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Wed, Apr 15, 2009 at 11:19:42PM +0300, Marko Kreen wrote: > On 4/15/09, Tom Lane wrote: > > Given Martijn's complaint about more-than-16-bit code points, I think > > the \u proposal is not mature enough to go into 8.4. We can think > > about some version of that later, if there's enough interest. > > I think it would be good idea. Basically we should pick one from > couple of pre-existing sane schemes. Here is quick summary > of Python, Perl and Java: > > Python [1]: > > \u - 16-bit codepoint > \U - 32-bit codepoint > \N{char-name} - Characted by name Microsoft have also gone this way in C#, named code points are not supported however. > Perl [2]: > > \x{..} - {} contains hexadecimal codepoint > \N{char-name} - Unicode char name Looks OK, but the 'x' seems somewhat redundant. Why not just: \{} This would be following the BitC[2] project, especially if it was more like: \{U+} e.g. \{U+03BB} would be the lowercase lambda character. Added appeal is in the fact that this (i.e. U+03BB) is how the Unicode consortium spells code points. > Java [3]: > > \u - 16-bit codepoint AFAIK, Java isn't the best reference to choose; it assumed from an early point in its design that Unicode characters were at most 16bits and hence had to switch its internal representation to UTF-16. I don't program much Java these days to know how it's all worked out, but it would be interesting to hear from people who regularly have to deal with characters outside the BMP (i.e. code points greater than 65535). -- Sam http://samason.me.uk/ [1] http://msdn.microsoft.com/en-us/library/aa664669(VS.71).aspx [2] http://www.bitc-lang.org/docs/bitc/spec.html#stringlit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Performance of full outer join in 8.3
2009/4/16 Simon Riggs : > On Wed, 2009-04-15 at 20:58 -0400, Robert Haas wrote: >> On Wed, Apr 15, 2009 at 7:39 PM, Tom Lane wrote: >> > The output of EXPLAIN is nowhere near stable enough to use within the >> > current exact-match regression test framework. I'm not sure it would >> > be stable even if we suppressed the rowcount and cost figures. Those >> > figures vary across platforms (because of alignment effects and probably >> > other things) and are also sensitive to the timing of autovacuums. It >> > is known that a nontrivial fraction of the existing regression test >> > cases do suffer from uninteresting plan changes across platforms or >> > as a result of various phase-of-the-moon effects; that's why we keep >> > having to add "ORDER BY" clauses now and then. >> >> Interesting. I suppose you could insulate yourself from this somewhat >> by populating pg_statistic with a particular set of values rather than >> relying on ANALYZE to gather them, but this would have the substantial >> downside of being way more work to maintain, especially if anyone ever >> changed pg_statistic. >> >> On a more practical level, I do think we need to give real >> consideration to some kind of options syntax for EXPLAIN, maybe >> something as simple as: >> >> EXPLAIN (option_name, ...) query >> >> Or maybe: >> >> EXPLAIN (option_name = value, ...) query >> >> It may or may not be the case that generating a useful regression test >> suite for the planner is too much work for anyone to bother, but they >> certainly won't if the tools aren't available. It seems we get at >> least one request a month for some kind of explain-output option: >> suppress row counts, suppress costs, gather I/O statistics, show >> outputs, show # of batches for a hash join, and on and on and on. I >> think we should implement a very basic version that maybe does nothing >> more than let you optionally suppress some of the existing output, but >> which provides an extensible syntax for others to build on. > > I think the way to do this is to introduce plan output in XML (that > matches the node structure of the plan). We can then filter away any > junk we don't want to see for regression tests, or better still augment > the exact-match framework with a fuzzy-match spec that allows us to > specify a range of values. I think XML explain output is a good idea, but I don't think it's a substitute for better options to control the human-readable form. But the nice thing is that with an extensible syntax, this is not an either/or proposition. > The skill would be in constructing a set of tests that was not sensitive > to minor changes. The OP's join for example had a huge cost range > difference that would have clearly shown up in a regression test. > > This will only move forward if it adds value directly for Tom, so if > it's worth doing then he needs to specify it and ask for someone to do > it. There will be someone available if the task is well defined. I'm not sure if by this you mean the EXPLAIN changes or the regression tests, but either way I think you're half right: it's probably not necessary for Tom to provide the spec, but it would sure be nice if he could at least indicate his lack of objection to accepting a well-designed patch in one of these areas - because no one is going to want to go to the trouble of doing either of these things and then have Tom say "well, I never liked that idea anyway". ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On 4/16/09, Sam Mason wrote: > On Wed, Apr 15, 2009 at 11:19:42PM +0300, Marko Kreen wrote: > > On 4/15/09, Tom Lane wrote: > > > > Given Martijn's complaint about more-than-16-bit code points, I think > > > the \u proposal is not mature enough to go into 8.4. We can think > > > about some version of that later, if there's enough interest. > > > > I think it would be good idea. Basically we should pick one from > > couple of pre-existing sane schemes. Here is quick summary > > of Python, Perl and Java: > > > > Python [1]: > > > > \u - 16-bit codepoint > > \U - 32-bit codepoint > > \N{char-name} - Characted by name > > > Microsoft have also gone this way in C#, named code points are not > supported however. And it handles also non-BMP codepoints with \u escape similarly: http://en.csharp-online.net/ECMA-334:_9.4.1_Unicode_escape_sequences This makes it even more standard. > > Perl [2]: > > > > \x{..} - {} contains hexadecimal codepoint > > \N{char-name} - Unicode char name > > > Looks OK, but the 'x' seems somewhat redundant. Why not just: > > \{} > > This would be following the BitC[2] project, especially if it was more > like: > > \{U+} > > e.g. > > \{U+03BB} > > would be the lowercase lambda character. Added appeal is in the fact > that this (i.e. U+03BB) is how the Unicode consortium spells code > points. We already got yet-another-unique-way-of-escaping-unicode with U&. Now let's try to support some actual standard also. > > Java [3]: > > > > \u - 16-bit codepoint > > > AFAIK, Java isn't the best reference to choose; it assumed from an early > point in its design that Unicode characters were at most 16bits and > hence had to switch its internal representation to UTF-16. I don't > program much Java these days to know how it's all worked out, but it > would be interesting to hear from people who regularly have to deal with > characters outside the BMP (i.e. code points greater than 65535). You did not read my mail carefully enough - the Java and also Python/C# already support non-BMP chars with '\u' and exactly the same (utf16) way. -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
Tatsuo Ishii wrote: I could live with either. Wikipedia says: "The characters outside the first plane usually have very specialized or rare use." For years we rejected all characters beyond the first plane, and while that's fixed now, the volume of complaints wasn't huge. I you mean "first plane" as BMP (i.e. 16bit range), above is not true for PostgreSQL 7.3 or later at least. Oops. I meant 8.2 or later. Umm, that's what I said. We used to do it like that, but we don't any more. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres SQL specification (tests)
On Apr 16, 3:14 am, mito wrote: > Hi, > > i have built an SQL interface using rule system which supports row > versioning and i would like to test it against Postgres SQL specification. > > Is there something like test cases for postgres SQL interface? > > Or do you have any ideas how to build a group of all possible cases of > table structure and generate appropriate queries, which can be tested > against postgres? > > thanks For unit testing, I use pgTap, it's pretty handy for scripting your tests. What do you mean by all possible cases of table structure? That would be infinite... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] NOTIFY / LISTEN silently parses and discards schema-ish portion of notification name ...
On Apr 16, 2009, at 6:51 AM, pgsql-hackers-ow...@postgresql.org wrote: Considered unexpected behavior, or at least in its undocumented form. If value given to NOTIFY seems schema-qualified, the schema qualification is eroded by the time it is presented to the listener -- See: http://archives.postgresql.org//pgsql-patches/2007-02/msg00437.php Pretty sure 8.4 will not use "relation". Thanks for pointer on a proposed patch, Greg. That patch looks like it hasn't been applied to the 8.2 maintenance stream. Looks like in 8.3 the grammar changed the name argument to ColId production, which goes unparsed. Shame on me for using such an old version [ 8.2 ], but hey, it works. James Robinson Socialserve.com
Re: [HACKERS] Yet another regression issue with Fedora-10 + PG 8.4 beta1
On Tue, 2009-04-14 at 18:32 -0400, Tom Lane wrote: > Any ideas what might have caused it? > > QEMU screwing up the handling of floating-point traps, perhaps? Probably. Let me dig it. Thanks. -- Devrim GÜNDÜZ, RHCE devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org signature.asc Description: This is a digitally signed message part
Re: [HACKERS] Unicode string literals versus the world
On Thu, Apr 16, 2009 at 02:47:20PM +0300, Marko Kreen wrote: > On 4/16/09, Sam Mason wrote: > > Microsoft have also gone this way in C#, named code points are not > > supported however. > > And it handles also non-BMP codepoints with \u escape similarly: > > http://en.csharp-online.net/ECMA-334:_9.4.1_Unicode_escape_sequences > > This makes it even more standard. I fail to see what you're pointing out here; as far as I understand it, \u is for BMP code points and \U extends the range out to 32bit code points. I can't see anything about non-BMP and \u in the above link, you appear free to write your own surrogate pairs but that seems like an independent issue. I'd not realised before that C# is specified to use UTF-16 as its internal encoding. > > This would be following the BitC[2] project, especially if it was more > > like: > > > > \{U+} > > We already got yet-another-unique-way-of-escaping-unicode with U&. > > Now let's try to support some actual standard also. That comes across *very* negatively; I hope it's just a language issue. I read your parent post as soliciting opinions on possible ways to encode Unicode characters in PG's literals. The U&'lit' was criticised, you posted some suggestions, I followed up with what I hoped to be a useful addition. It seems useful here to separate "de jure" from "de facto" standards; implementing U&'lit' would be following the de jure standard, anything else would be de facto. A survey of existing SQL implementations would seem to be more appropriate as well: Oracle: UNISTR(string-literal) and \ It looks as though Oracle originally used UCS-2 internally (i.e. BMP only) but more recently Unicode support has been improved to allow other planes. MS-SQL Server: can't find anything remotely useful; best seems to be to use NCHAR(integer-expression) which looks somewhat unmaintainable. DB2: U&string-literal and \xx i.e. it follows the SQL-2003 spec FireBird: can't find much either; support looks somewhat low on the ground MySQL: same again, seems to assume query is encoded in UTF-8 Summary seems to be that either I'm bad at searching or support for Unicode doesn't seem very complete in the database world and people work around it somehow. > You did not read my mail carefully enough - the Java and also Python/C# > already support non-BMP chars with '\u' and exactly the same (utf16) way. Again, I think this may be a language issue; if not then more verbose explanations help, maybe something like "sorry, I obviously didn't explain that very well". You will of course felt you explained it perfectly well, but everybody enters a discussion with different intuitions and biases, email has a nasty habit of accentuating these differences and compounding them with language problems. I'd never heard of UTF-16 surrogate pairs before this discussion and hence didn't realise that it's valid to have a surrogate pair in place of a single code point. The docs say that corresponds to U+10302, Python would appear to follow my intuitions in that: ord(u'\uD800\uDF02') results in an error instead of giving back 66306, as I'd expect. Is this a bug in Python, my understanding, or something else? -- Sam http://samason.me.uk/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
Sam Mason writes: > I'd never heard of UTF-16 surrogate pairs before this discussion and > hence didn't realise that it's valid to have a surrogate pair in place > of a single code point. The docs say that corresponds to > U+10302, Python would appear to follow my intuitions in that: > ord(u'\uD800\uDF02') > results in an error instead of giving back 66306, as I'd expect. Is > this a bug in Python, my understanding, or something else? I might be wrong, but I think surrogate pairs are expressly forbidden in all representations other than UTF16/UCS2. We definitely forbid them when validating UTF-8 strings --- that's per an RFC recommendation. It sounds like Python is doing the same. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Performance of full outer join in 8.3
On Thu, Apr 16, 2009 at 06:12:10AM +0100, Simon Riggs wrote: > > > > EXPLAIN (option_name, ...) query > > > > Or maybe: > > > > EXPLAIN (option_name = value, ...) query > > > > It may or may not be the case that generating a useful regression > > test suite for the planner is too much work for anyone to bother, > > but they certainly won't if the tools aren't available. It seems > > we get at least one request a month for some kind of > > explain-output option: suppress row counts, suppress costs, gather > > I/O statistics, show outputs, show # of batches for a hash join, > > and on and on and on. I think we should implement a very basic > > version that maybe does nothing more than let you optionally > > suppress some of the existing output, but which provides an > > extensible syntax for others to build on. > > I think the way to do this is to introduce plan output in XML If we're going with a serialization, which I think would be an excellent idea, how about one that's light-weight and human-readable like JSON? Cheers, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Thu, Apr 16, 2009 at 10:54:16AM -0400, Tom Lane wrote: > Sam Mason writes: > > I'd never heard of UTF-16 surrogate pairs before this discussion and > > hence didn't realise that it's valid to have a surrogate pair in place > > of a single code point. The docs say that corresponds to > > U+10302, Python would appear to follow my intuitions in that: > > > ord(u'\uD800\uDF02') > > > results in an error instead of giving back 66306, as I'd expect. Is > > this a bug in Python, my understanding, or something else? > > I might be wrong, but I think surrogate pairs are expressly forbidden in > all representations other than UTF16/UCS2. We definitely forbid them > when validating UTF-8 strings --- that's per an RFC recommendation. > It sounds like Python is doing the same. OK, that's good. I thought I was missing something. A minor point is that in UCS2 each 16bit value is exactly one character and characters outside the BMP aren't supported, hence the need for UTF-16. I've failed to keep up with the discussion so I'm not sure where this conversation has got to! Is the consensus for 8.4 to enable SQL2003 style U&lit escaped literals if and only if standard_conforming_strings is set? This seems easiest for client code as it can use this exclusively for knowing what to do with backslashes. -- Sam http://samason.me.uk/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On 4/16/09, Sam Mason wrote: > On Thu, Apr 16, 2009 at 02:47:20PM +0300, Marko Kreen wrote: > > On 4/16/09, Sam Mason wrote: > > > Microsoft have also gone this way in C#, named code points are not > > > supported however. > > > > And it handles also non-BMP codepoints with \u escape similarly: > > > > http://en.csharp-online.net/ECMA-334:_9.4.1_Unicode_escape_sequences > > > > This makes it even more standard. > > > I fail to see what you're pointing out here; as far as I understand it, > \u is for BMP code points and \U extends the range out to 32bit code > points. I can't see anything about non-BMP and \u in the above link, > you appear free to write your own surrogate pairs but that seems like an > independent issue. Ok, maybe I glanced too quickly over that page. I can't find definite deference only hint on several pages: \U \U Unicode escape sequence for surrogate pairs. Which hints that you can aswell enter the pairs directly: \uxx\uxx. If I'd be language designer, I would not see any reason to disallow it. And anyway, at least mono seems to support it: using System; public class HelloWorld { public static void Main() { Console.WriteLine("<\uD800\uDF02>\n"); } } It will output single UTF8 character. I think this should settle it. > I'd not realised before that C# is specified to use UTF-16 as its > internal encoding. > > > > This would be following the BitC[2] project, especially if it was more > > > like: > > > > > > \{U+} > > > > > We already got yet-another-unique-way-of-escaping-unicode with U&. > > > > Now let's try to support some actual standard also. > > > That comes across *very* negatively; I hope it's just a language issue. > > I read your parent post as soliciting opinions on possible ways to > encode Unicode characters in PG's literals. The U&'lit' was criticised, > you posted some suggestions, I followed up with what I hoped to be a > useful addition. It seems useful here to separate "de jure" from "de > facto" standards; implementing U&'lit' would be following the de jure > standard, anything else would be de facto. > > A survey of existing SQL implementations would seem to be more appropriate > as well: > > Oracle: UNISTR(string-literal) and \ > > It looks as though Oracle originally used UCS-2 internally (i.e. BMP > only) but more recently Unicode support has been improved to allow > other planes. > > MS-SQL Server: > > can't find anything remotely useful; best seems to be to use > NCHAR(integer-expression) which looks somewhat unmaintainable. > > DB2: U&string-literal and \xx > > i.e. it follows the SQL-2003 spec > > FireBird: > > can't find much either; support looks somewhat low on the ground > > MySQL: > > same again, seems to assume query is encoded in UTF-8 > > Summary seems to be that either I'm bad at searching or support for > Unicode doesn't seem very complete in the database world and people work > around it somehow. The de-facto about Postgres is stdstr=off. Even if not, E'' strings are still better for various things, so it would be good if they also aquired unicode-capabilities. > > You did not read my mail carefully enough - the Java and also Python/C# > > already support non-BMP chars with '\u' and exactly the same (utf16) way. > > > Again, I think this may be a language issue; if not then more verbose > explanations help, maybe something like "sorry, I obviously didn't > explain that very well". You will of course felt you explained it > perfectly well, but everybody enters a discussion with different > intuitions and biases, email has a nasty habit of accentuating these > differences and compounding them with language problems. > > I'd never heard of UTF-16 surrogate pairs before this discussion and > hence didn't realise that it's valid to have a surrogate pair in place > of a single code point. The docs say that corresponds to > U+10302, Python would appear to follow my intuitions in that: > > ord(u'\uD800\uDF02') > > results in an error instead of giving back 66306, as I'd expect. Is > this a bug in Python, my understanding, or something else? Python's internal representation is *not* UTF-16, but plain UCS2/UCS4, that is - plain 16 or 32-bit values. Seems your python is compiled with UCS2, not UCS4. As I understand, in UCS2 mode it simply takes surrogate values as-is. From ord() docs: If a unicode argument is given and Python was built with UCS2 Unicode, then the character’s code point must be in the range [0..65535] inclusive; otherwise the string length is two, and a TypeError will be raised. So only in UCS4 mode it detects surrogates and converts them to internal representation. (Which in Postgres case would be UTF8.) Or perhaps it is partially UTF16 aware - eg. I/O routines do unterstand UTF16 but low-level string routines do not: print "<%s>" % u'\uD800\uDF02' seems to handle it properly. -- marko -- Se
Re: [HACKERS] Unicode string literals versus the world
Tom Lane wrote: Sam Mason writes: I'd never heard of UTF-16 surrogate pairs before this discussion and hence didn't realise that it's valid to have a surrogate pair in place of a single code point. The docs say that corresponds to U+10302, Python would appear to follow my intuitions in that: ord(u'\uD800\uDF02') results in an error instead of giving back 66306, as I'd expect. Is this a bug in Python, my understanding, or something else? I might be wrong, but I think surrogate pairs are expressly forbidden in all representations other than UTF16/UCS2. We definitely forbid them when validating UTF-8 strings --- that's per an RFC recommendation. It sounds like Python is doing the same. You mustn't encode the surrogate, but it's up to us how we allow people to designate a given code point. Frankly, I think we shouldn't provide for using surrogates at all. I would prefer something like \u for BMP items and \U as the straight 32bit designation of a higher codepoint. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Performance of full outer join in 8.3
On Thu, Apr 16, 2009 at 11:21 AM, David Fetter wrote: > If we're going with a serialization, which I think would be an > excellent idea, how about one that's light-weight and human-readable > like JSON? Wow, that's a great idea for another option to EXPLAIN. Wouldn't it be nice if EXPLAIN supported an options syntax?!!! :-) ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgres SQL specification (tests)
By table structure i mean table definition options. Kevin Field wrote: On Apr 16, 3:14 am, mito wrote: Hi, i have built an SQL interface using rule system which supports row versioning and i would like to test it against Postgres SQL specification. Is there something like test cases for postgres SQL interface? Or do you have any ideas how to build a group of all possible cases of table structure and generate appropriate queries, which can be tested against postgres? thanks For unit testing, I use pgTap, it's pretty handy for scripting your tests. What do you mean by all possible cases of table structure? That would be infinite... -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On 4/16/09, Tom Lane wrote: > Sam Mason writes: > > I'd never heard of UTF-16 surrogate pairs before this discussion and > > hence didn't realise that it's valid to have a surrogate pair in place > > of a single code point. The docs say that corresponds to > > U+10302, Python would appear to follow my intuitions in that: > > > ord(u'\uD800\uDF02') > > > results in an error instead of giving back 66306, as I'd expect. Is > > this a bug in Python, my understanding, or something else? > > > I might be wrong, but I think surrogate pairs are expressly forbidden in > all representations other than UTF16/UCS2. We definitely forbid them > when validating UTF-8 strings --- that's per an RFC recommendation. > It sounds like Python is doing the same. The point here is that Python/Java/C# allow them for escaping non-BMP unicode values, irrespective of their interal encoding. -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Performance of full outer join in 8.3
Robert Haas writes: > I think XML explain output is a good idea, but I don't think it's a > substitute for better options to control the human-readable form. Yeah. I think a well-designed XML output format for EXPLAIN is a fine thing to work on, but I don't believe it would make the "create a planner test suite" problem noticeably easier. I see the purpose of an XML format as being to allow tools like Red Hat's old Visual Explain (now maintained by EDB IIRC) to parse EXPLAIN's output with somewhat better odds of not breaking from one release to the next. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unicode string literals versus the world
On Thu, Apr 16, 2009 at 06:34:06PM +0300, Marko Kreen wrote: > Which hints that you can aswell enter the pairs directly: \uxx\uxx. > If I'd be language designer, I would not see any reason to disallow it. > > And anyway, at least mono seems to support it: > > using System; > public class HelloWorld { > public static void Main() { > Console.WriteLine("<\uD800\uDF02>\n"); > } > } > > It will output single UTF8 character. I think this should settle it. I don't have any .net stuff installed so can't test; but C# is defined to use UTF-16 as its internal representation so it would make sense if the above gets treated as a single character internally. However, if it used any other encoding the above should be treated as an error. > The de-facto about Postgres is stdstr=off. Even if not, E'' strings > are still better for various things, so it would be good if they also > aquired unicode-capabilities. OK, this seems independent of the U&'lit' discussion that started the thread. Note that PG already supports UTF8; if you want the character I've been using in my examples up-thread, you can do: SELECT E'\xF0\x90\x8C\x82'; I have a feeling that this is predicated on the server_encoding being set to "utf8" and this can only be done at database creation time. Another alternative would be to use the convert_from function, i.e: SELECT convert_from(E'\xF0\x90\x8C\x82', 'UTF8'); Never had to do this though, so there may be better options available. > Python's internal representation is *not* UTF-16, but plain UCS2/UCS4, > that is - plain 16 or 32-bit values. Seems your python is compiled with > UCS2, not UCS4. Cool, I didn't know that. I believe mine is UCS4 as I can do: ord(u'\U00010302') and I get 66306 back rather than an error. > As I understand, in UCS2 mode it simply takes surrogate > values as-is. UCS2 doesn't have surrogate pairs, or at least I believe it's considered a bug if you don't get an error when you present it with one. > From ord() docs: > > If a unicode argument is given and Python was built with UCS2 Unicode, > then the character’s code point must be in the range [0..65535] > inclusive; otherwise the string length is two, and a TypeError will > be raised. > > So only in UCS4 mode it detects surrogates and converts them to internal > representation. (Which in Postgres case would be UTF8.) I think you mean UTF-16 instead of UCS4; but otherwise, yes. > Or perhaps it is partially UTF16 aware - eg. I/O routines do unterstand > UTF16 but low-level string routines do not: > > print "<%s>" % u'\uD800\uDF02' > > seems to handle it properly. Yes, I get this as well. It's all a bit weird, which is why I was asking if "this a bug in Python, my understanding, or something else". When I do: python
Re: [HACKERS] Unicode string literals versus the world
Sam Mason writes: > I've failed to keep up with the discussion so I'm not sure where this > conversation has got to! Is the consensus for 8.4 to enable SQL2003 > style U&lit escaped literals if and only if standard_conforming_strings > is set? That was Peter's proposal, and no one's shot a hole in it yet ... I think the discussion about whether/how to add a Unicode extension to E''-style literals is 8.5 material. We are in beta so now is not the time to add new features, especially ones that weren't even on the TODO list before. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [rfc] unicode escapes for extended strings
Seems I'm bad at communicating in english, so here is C variant of my proposal to bring \u escaping into extended strings. Reasons: - More people are familiar with \u escaping, as it's standard in Java/C#/Python, probably more.. - U& strings will not work when stdstr=off. Syntax: \u - 16-bit value \U - 32-bit value Additionally, both \u and \U can be used to specify UTF-16 surrogate pairs to encode characters with value > 0x. This is exact behaviour used by Java/C#/Python. (except that Java does not have \U) I'm ok with this patch left to 8.5. -- marko diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml index a559d75..fdb0cc5 100644 --- a/doc/src/sgml/syntax.sgml +++ b/doc/src/sgml/syntax.sgml @@ -394,6 +394,14 @@ SELECT 'foo' 'bar'; hexadecimal byte value + + + \u, + \U + (x = 0 - 9, A - F) + +16 or 32-bit hexadecimal Unicode character value. + @@ -407,6 +415,14 @@ SELECT 'foo' 'bar'; + The Unicode escape syntax works fully only when the server encoding is UTF8. + When other server encodings are used, only code points in the ASCII range + (up to \u007F) can be specified. Both \u and \U + can also be used to specify UTF-16 surrogate pair to escape characters + with value larger than \u. + + + It is your responsibility that the byte sequences you create are valid characters in the server character set encoding. When the server encoding is UTF-8, then the alternative Unicode escape diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l index a070e85..c0695f1 100644 --- a/src/backend/parser/scan.l +++ b/src/backend/parser/scan.l @@ -98,6 +98,11 @@ static char *scanbuf; static unsigned char unescape_single_char(unsigned char c); +/* first part of unicode surrogate */ +static unsigned long xeu_surrogate1; + +static void addunicode(pg_wchar c); + %} %option 8bit @@ -128,6 +133,7 @@ static unsigned char unescape_single_char(unsigned char c); * $foo$ quoted strings * quoted identifier with Unicode escapes * quoted string with Unicode escapes + * Unicode surrogate escape in extended string */ %x xb @@ -139,6 +145,7 @@ static unsigned char unescape_single_char(unsigned char c); %x xdolq %x xui %x xus +%x xeu /* * In order to make the world safe for Windows and Mac clients as well as @@ -217,6 +224,7 @@ xeinside [^\\']+ xeescape [\\][^0-7] xeoctesc [\\][0-7]{1,3} xehexesc [\\]x[0-9A-Fa-f]{1,2} +xeunicode [\\](u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}) /* Extended quote * xqdouble implements embedded quote, @@ -506,6 +514,37 @@ other . {xeinside} { addlit(yytext, yyleng); } +{xeunicode} { + pg_wchar c = strtoul(yytext+2, NULL, 16); + + check_escape_warning(); + + if (c >= 0xD800 && c < 0xDC00) + { + xeu_surrogate1 = c; + BEGIN(xeu); + } + else if (c >= 0xDC00 && c < 0xE000) + yyerror("invalid Unicode escape value"); + + addunicode(c); +} +{xeunicode} { + pg_wchar c = strtoul(yytext+2, NULL, 16); + + if (c < 0xDC00 || c >= 0xE000) + yyerror("invalid Unicode surrogate pair"); + + c = ((xeu_surrogate1 & 0x3FF) << 10) | (c & 0x3FF); + + addunicode(c + 0x1); + + BEGIN(xe); +} +. { + yyerror("invalid Unicode surrogate pair"); +} + {xeescape} { if (yytext[1] == '\'') { @@ -1153,3 +1192,18 @@ check_escape_warning(void) lexer_errposition())); warn_on_first_escape = false; /* warn only once per string */ } + +static void +addunicode(pg_wchar c) +{ + char buf[8]; + + if (c == 0 || c > 0x10) + yyerror("invalid Unicode escape value"); + if (c > 0x7F && GetDatabaseEncoding() != PG_UTF8) + yyerror("Unicode escape values cannot be used for code point values above 007F when the server encoding is not UTF8"); + + unicode_to_utf8(c, (unsigned char *)buf); + addlit(buf, pg_mblen(buf)); +} + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Performance of full outer join in 8.3
On 16 Apr 2009, at 16:21, David Fetter wrote: On Thu, Apr 16, 2009 at 06:12:10AM +0100, Simon Riggs wrote: I think the way to do this is to introduce plan output in XML If we're going with a serialization, which I think would be an excellent idea, how about one that's light-weight and human-readable like JSON? +1 xml/json is machine readable. I don't think, personaly that explain (analyze) is not easy to read by human, quite contrary. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Performance of full outer join in 8.3
On Thu, Apr 16, 2009 at 2:04 PM, Grzegorz Jaskiewicz wrote: > > On 16 Apr 2009, at 16:21, David Fetter wrote: > >> On Thu, Apr 16, 2009 at 06:12:10AM +0100, Simon Riggs wrote: >>> >>> I think the way to do this is to introduce plan output in XML >> >> If we're going with a serialization, which I think would be an >> excellent idea, how about one that's light-weight and human-readable >> like JSON? > > +1 > > xml/json is machine readable. > I don't think, personaly that explain (analyze) is not easy to read by > human, quite contrary. Is that because of how the output is formatted though, or because the concepts are difficult to express? (I agree though, json is better especially for structures that are possibly highly nested). merlni -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [rfc] unicode escapes for extended strings
On Thu, Apr 16, 2009 at 08:48:58PM +0300, Marko Kreen wrote: > Seems I'm bad at communicating in english, I hope you're not saying this because of my misunderstandings! > so here is C variant of > my proposal to bring \u escaping into extended strings. Reasons: > > - More people are familiar with \u escaping, as it's standard > in Java/C#/Python, probably more.. > - U& strings will not work when stdstr=off. > > Syntax: > > \u - 16-bit value > \U - 32-bit value > > Additionally, both \u and \U can be used to specify UTF-16 surrogate > pairs to encode characters with value > 0x. This is exact behaviour > used by Java/C#/Python. (except that Java does not have \U) Are you sure that this handling of surrogates is correct? The best answer I've managed to find on the Unicode consortium's site is: http://unicode.org/faq/utf_bom.html#utf16-7 it says: They are invalid in interchange, but may be freely used internal to an implementation. I think this means they consider the handling of them you noted above, in other languages, to be an error. -- Sam http://samason.me.uk/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Performance of full outer join in 8.3
On 16 Apr 2009, at 19:41, Merlin Moncure wrote: Is that because of how the output is formatted though, or because the concepts are difficult to express? (I agree though, json is better especially for structures that are possibly highly nested). What I mean is that what postgresql displays currently as explain(analyze[verbose]) is clear and understandable. Also, it is getting better and better from version to version. So I don't personally agree, that it is unreadable - and I am up for (and I am sure many users like me are) JSON, or XML output. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [rfc] unicode escapes for extended strings
Sam Mason wrote: Are you sure that this handling of surrogates is correct? The best answer I've managed to find on the Unicode consortium's site is: http://unicode.org/faq/utf_bom.html#utf16-7 it says: They are invalid in interchange, but may be freely used internal to an implementation. It says that about non-characters, not about the use of surrogate pairs, unless I am misreading it. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [rfc] unicode escapes for extended strings
On Thu, Apr 16, 2009 at 03:04:37PM -0400, Andrew Dunstan wrote: > Sam Mason wrote: > >Are you sure that this handling of surrogates is correct? The best > >answer I've managed to find on the Unicode consortium's site is: > > > > http://unicode.org/faq/utf_bom.html#utf16-7 > > > >it says: > > > > They are invalid in interchange, but may be freely used internal to an > > implementation. > > It says that about non-characters, not about the use of surrogate pairs, > unless I am misreading it. No, I think you're probably right and I was misreading it. I went back and forth several times to explicitly check I was interpreting this correctly and still failed to get it right. Not sure what I was thinking and sorry for the hassle Marko! I've already asked on the Unicode list about this (no response yet), but I have a feeling I'm getting worked up over nothing. -- Sam http://samason.me.uk/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [rfc] unicode escapes for extended strings
On 4/16/09, Sam Mason wrote: > On Thu, Apr 16, 2009 at 08:48:58PM +0300, Marko Kreen wrote: > > Seems I'm bad at communicating in english, > > > I hope you're not saying this because of my misunderstandings! > > > > so here is C variant of > > my proposal to bring \u escaping into extended strings. Reasons: > > > > - More people are familiar with \u escaping, as it's standard > > in Java/C#/Python, probably more.. > > - U& strings will not work when stdstr=off. > > > > Syntax: > > > > \u - 16-bit value > > \U - 32-bit value > > > > Additionally, both \u and \U can be used to specify UTF-16 surrogate > > pairs to encode characters with value > 0x. This is exact behaviour > > used by Java/C#/Python. (except that Java does not have \U) > > > Are you sure that this handling of surrogates is correct? The best > answer I've managed to find on the Unicode consortium's site is: > > http://unicode.org/faq/utf_bom.html#utf16-7 > > it says: > > They are invalid in interchange, but may be freely used internal to an > implementation. > > I think this means they consider the handling of them you noted above, > in other languages, to be an error. It's up to UTF8 validator whether to consider non-characters as error. -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] postgres 8.4 beta1
FYI, compiles on OpenSuSE 11.1 client works against 8.3.7 more information as soon as i will migrate date. -- kofeman
Re: [HACKERS] HashJoin w/option to unique-ify inner rel
> Upon further review, it appears that a big part of this problem is > that cost_hashjoin() doesn't understand that it needs cost semi-joins > differently from inner or left joins. The bogus logic looks to be > right here: > startup_cost += hash_qual_cost.startup; > run_cost += hash_qual_cost.per_tuple * > outer_path_rows * clamp_row_est(inner_path_rows * > innerbucketsize) * 0.5; > > Of course, when the join type is JOIN_SEMI, we're going to stop > looking after we find the first match, so this estimate is really far > off. The 8.3 version of cost_hashjoin() had a line like this: joininfactor = join_in_selectivity(&path->jpath, root); and a cost function like this: run_cost += hash_qual_cost.per_tuple * outer_path_rows * clamp_row_est(inner_path_rows * innerbucketsize) * joininfactor * 0.5; This compensated for IN joins being able to stop scanning a bucket once a match is found. You may consider something similar for a semi-join. Having experimented with a lot of this code recently, there is some potential for improvement on the bucket sizes, etc., but it is a non-trivial problem. I tested a similar query on TPCH 100M1Z on version 8.3: select * from customer where c_custkey in (select o_custkey from orders) and found that hash aggregate was marginally faster. If you turn off aggregation, it selects an IN hash join which is about 5% slower and the planner is not too far off. So, it would be definitely possible to modify the cost function appropriately. > >>> It's tempting to have Hash cheat and just peek at the node beneath it > >>> to see if it's a HashAggregate, in which case it could call a special > >>> method to request the whole hash. But it would have to know that it's > >>> just a plain uniquify and not implementing a GROUP BY. If HashAggregate is faster, then the question is can you make it better by avoiding building the hash structure twice. I haven't considered all the possibilities, but the situation you have used as an example, an IN query, seems workable. Instead of translating to a hash aggregate/hash/hash join query plan, it may be possible to create a special hash join node that does uniquefy. The benefit is that the planner knows about it (instead of changing the execution plan), you can be more accurate on costs for the hash join, and you can optimize by using only one hash table construction. A challenge that must be dealt with is handling the multi-batch case. It appears that hash aggregate does not currently handle this, but I may be mistaken. -- Ramon Lawrence -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HashJoin w/option to unique-ify inner rel
Robert Haas writes: > Upon further review, it appears that a big part of this problem is > that cost_hashjoin() doesn't understand that it needs cost semi-joins > differently from inner or left joins. Yeah, I have a note to look into that before 8.4 final. The same is true for nestloops: stopping after hitting one match to the current outer can make a big difference, and that's not reflected in costsize.c yet. I'm not sure whether cost_mergejoin needs to care. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HashJoin w/option to unique-ify inner rel
On Thu, Apr 16, 2009 at 7:26 PM, Tom Lane wrote: > Robert Haas writes: >> Upon further review, it appears that a big part of this problem is >> that cost_hashjoin() doesn't understand that it needs cost semi-joins >> differently from inner or left joins. > > Yeah, I have a note to look into that before 8.4 final. The same is > true for nestloops: stopping after hitting one match to the current > outer can make a big difference, and that's not reflected in costsize.c > yet. I'm not sure whether cost_mergejoin needs to care. Cool. It's worth noting that this is also a problem for anti-joins, which can also stop after hitting one match. For merge join, it looks like you might save something if there are duplicates in both the inner and outer lists. If the outer side contains 1,1,1,1,1,2 and the inner side contains 1,1,1,2,2, then an INNER JOIN or LEFT JOIN will rescan the first three tuples of the inner side three times, whereas a SEMI or ANTI JOIN will scan the first tuple three times and then the next two only once. But I'm not sure if we can estimate this accurately enough to matter. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] HashJoin w/option to unique-ify inner rel
> If HashAggregate is faster, then the question is can you make it better > by avoiding building the hash structure twice. I haven't considered all > the possibilities, but the situation you have used as an example, an IN > query, seems workable. Instead of translating to a hash > aggregate/hash/hash join query plan, it may be possible to create a > special hash join node that does uniquefy. Yeah, that's what I was looking at. The problem is that unique-ify is not free either - we have to invoke the appropriate comparison operators for every tuple in the bucket for which the hash values match exactly. So, for example if the input has K copies each of N items, I'll need to do (K - 1) * N comparisons, assuming no hash collisions. In return, the number of tuples in each bucket will be reduced by a factor of K, but that doesn't actually save very much, because I can reject all of those with an integer comparison anyway, again assuming no hash collisions, so it's pretty cheap. If the hash join was on track to go multi-batch, then unique-ifying it on the fly makes a lot of sense... otherwise, I'm not sure it's really going to be a win. Anyhow, further analysis needed... ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Lifetime of FmgrInfo
I was browsing PL/pgSQL source, and saw this line (pl_comp.c:151): function = (PLpgSQL_function *) fcinfo->flinfo->fn_extra It then does some work to determine whether the result in "function" is valid or not. So I got to wondering, what's the lifetime of the FunctionCallInfoinfo object passed to the call handler function? Or in other words, what memory context is it in? And is there some way I could find that out more easily than digging through the source? - Josh / eggyknap signature.asc Description: Digital signature
[HACKERS] [PATCH] unalias of ACL_SELECT_FOR_UPDATE
Currently, the ACL_SELECT_FOR_UPDATE privilege is defined as an alias of ACL_UPDATE as follows: at src/include/nodes/parsenodes.h: : /* Currently, SELECT ... FOR UPDATE/FOR SHARE requires UPDATE privileges */ #define ACL_SELECT_FOR_UPDATE ACL_UPDATE : It is unconfortable for us because SE-PostgreSQL have two individual permissions for updates (db_table:{update}) and explicit table locks (db_table:{lock}), but it unables to discriminate whether the given relation is actually used for UPDATE or SELECT FOR UPDATE. So, I would like to define the bit as (1<<12) which is not in use. I think we have two options to solve the matter. The one is to check ACL_SELECT_FOR_UPDATE by UPDATE privilege, as the attached patch doing. It implicitly expands the users privilege when he has ACL_UPDATE, so it enables GRANT UPDATE to cover both of ACL_UPDATE and ACL_SELECT_FOR_UPDATE as the current implementation doing. The other is to add a new privilege for explicit table locks, such as something like LOCK privilege. It is a straightforward approach, but we need to have a user visible changes. Which is more preferable design? Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei *** base/src/include/nodes/parsenodes.h 2009-04-09 00:13:21.0 +0900 --- base_ex/src/include/nodes/parsenodes.h 2009-04-09 00:41:07.0 +0900 *** typedef uint32 AclMode; /* a bitmask o *** 71,80 #define ACL_CREATE (1<<9) /* for namespaces and databases */ #define ACL_CREATE_TEMP (1<<10) /* for databases */ #define ACL_CONNECT (1<<11) /* for databases */ - #define N_ACL_RIGHTS 12 /* 1 plus the last 1ai_grantee == roleid) { ! result |= aidata->ai_privs & mask; if ((how == ACLMASK_ALL) ? (result == mask) : (result != 0)) return result; } --- 1141,1156 for (i = 0; i < num; i++) { AclItem*aidata = &aidat[i]; + AclMode aiprivs = aidata->ai_privs; + + /* fixup ACL_SELECT_FOR_UPDATE */ + if (aiprivs & ACL_UPDATE) + aiprivs |= ACL_SELECT_FOR_UPDATE; if (aidata->ai_grantee == ACL_ID_PUBLIC || aidata->ai_grantee == roleid) { ! result |= aiprivs & mask; if ((how == ACLMASK_ALL) ? (result == mask) : (result != 0)) return result; } *** aclmask(const Acl *acl, Oid roleid, Oid *** 1162,1167 --- 1167,1177 for (i = 0; i < num; i++) { AclItem*aidata = &aidat[i]; + AclMode aiprivs = aidata->ai_privs; + + /* fixup ACL_SELECT_FOR_UPDATE */ + if (aiprivs & ACL_UPDATE) + aiprivs |= ACL_SELECT_FOR_UPDATE; if (aidata->ai_grantee == ACL_ID_PUBLIC || aidata->ai_grantee == roleid) *** aclmask(const Acl *acl, Oid roleid, Oid *** 1170,1176 if ((aidata->ai_privs & remaining) && has_privs_of_role(roleid, aidata->ai_grantee)) { ! result |= aidata->ai_privs & mask; if ((how == ACLMASK_ALL) ? (result == mask) : (result != 0)) return result; remaining = mask & ~result; --- 1180,1186 if ((aidata->ai_privs & remaining) && has_privs_of_role(roleid, aidata->ai_grantee)) { ! result |= aiprivs & mask; if ((how == ACLMASK_ALL) ? (result == mask) : (result != 0)) return result; remaining = mask & ~result; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Lifetime of FmgrInfo
Joshua Tolley writes: > It then does some work to determine whether the result in "function" is > valid or not. So I got to wondering, what's the lifetime of the > FunctionCallInfoinfo object passed to the call handler function? Query lifespan, usually. There are counterexamples on both sides, but as a rule it's expected that it's worthwhile for a function to cache anything it can in the fcinfo->flinfo->fn_extra structure. If it's not worthwhile, that's the caller's fault not the function's. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] unalias of ACL_SELECT_FOR_UPDATE
KaiGai Kohei wrote: > Currently, the ACL_SELECT_FOR_UPDATE privilege is defined as an alias > of ACL_UPDATE as follows: > > at src/include/nodes/parsenodes.h: >: > /* Currently, SELECT ... FOR UPDATE/FOR SHARE requires UPDATE privileges */ > #define ACL_SELECT_FOR_UPDATE ACL_UPDATE >: > > It is unconfortable for us because SE-PostgreSQL have two individual > permissions for updates (db_table:{update}) and explicit table locks > (db_table:{lock}), but it unables to discriminate whether the given > relation is actually used for UPDATE or SELECT FOR UPDATE. What's the point of doing SELECT FOR UPDATE if you're not actually going to UPDATE the row? Having separate permissions for SELECT FOR UPDATE and UPDATE seems useless. A separate permission for SELECT FOR SHARE makes more sense, though. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] unalias of ACL_SELECT_FOR_UPDATE
Heikki Linnakangas wrote: > KaiGai Kohei wrote: >> Currently, the ACL_SELECT_FOR_UPDATE privilege is defined as an alias >> of ACL_UPDATE as follows: >> >> at src/include/nodes/parsenodes.h: >>: >> /* Currently, SELECT ... FOR UPDATE/FOR SHARE requires UPDATE privileges */ >> #define ACL_SELECT_FOR_UPDATE ACL_UPDATE >>: >> >> It is unconfortable for us because SE-PostgreSQL have two individual >> permissions for updates (db_table:{update}) and explicit table locks >> (db_table:{lock}), but it unables to discriminate whether the given >> relation is actually used for UPDATE or SELECT FOR UPDATE. > > What's the point of doing SELECT FOR UPDATE if you're not actually going > to UPDATE the row? Having separate permissions for SELECT FOR UPDATE and > UPDATE seems useless. I wonder why SELECT FOR UPDATE need ACL_UPDATE, although the statement itself does not modify any of the given relation. Indeed, it normally leads UPDATE statements, but I think ACL_UPDATE should be checked on the later phase. > A separate permission for SELECT FOR SHARE makes more sense, though. It is my major concern rather than exclusive locks. The SELECT FOR SHARE statement also requires ACL_SELECT_FOR_UPDATE, although it is a read only operation. It makes us hard to set up a table with foreign-key which refers a primary-key on read-only table, for example. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] SE-PostgreSQL for v8.5 development (r1819)
The following list of patches are the latest SE-PostgreSQL (r1819). http://sepgsql.googlecode.com/files/sepgsql-01-sysatt-8.4beta1-r1819.patch http://sepgsql.googlecode.com/files/sepgsql-02-core-8.4beta1-r1819.patch http://sepgsql.googlecode.com/files/sepgsql-03-writable-8.4beta1-r1819.patch http://sepgsql.googlecode.com/files/sepgsql-04-rowlevel-8.4beta1-r1819.patch http://sepgsql.googlecode.com/files/sepgsql-05-perms-8.4beta1-r1819.patch http://sepgsql.googlecode.com/files/sepgsql-06-utils-8.4beta1-r1819.patch http://sepgsql.googlecode.com/files/sepgsql-07-tests-8.4beta1-r1819.patch http://sepgsql.googlecode.com/files/sepgsql-08-docs-8.4beta1-r1819.patch List of updates: * The base version was updated to the latest CVS HEAD. * The code to receice notifications from the kernelspace via netlink socket was simplified using the new avc_netlink_xxx() APIs. * It enables to handle permissive domain on the upcoming linux-2.6.31. * It enables to handle undefined permissions in the policy correctly. The purpose of every patches are not changed. Thanks, KaiGai Kohei wrote: > The following list of patches are the initial revision of SE-PostgreSQL > on the v8.5 development cycle. > These are separated into several functional components to help review > and commit in earlier phase. Every patches (except for the core) have > abour 1KL scales. It is far smaller than them in a year ago. :-) > > http://sepgsql.googlecode.com/files/sepgsql-01-sysatt-8.4devel-r1769.patch > http://sepgsql.googlecode.com/files/sepgsql-02-core-8.4devel-r1769.patch > http://sepgsql.googlecode.com/files/sepgsql-03-writable-8.4devel-r1769.patch > http://sepgsql.googlecode.com/files/sepgsql-04-rowlevel-8.4devel-r1769.patch > http://sepgsql.googlecode.com/files/sepgsql-05-perms-8.4devel-r1769.patch > http://sepgsql.googlecode.com/files/sepgsql-06-utils-8.4devel-r1769.patch > http://sepgsql.googlecode.com/files/sepgsql-07-tests-8.4devel-r1769.patch > http://sepgsql.googlecode.com/files/sepgsql-08-docs-8.4devel-r1769.patch > > Needless to say, it is now designed on 8.4devel tree, so anyone who want > to build/install SE-PostgreSQL can apply these patches by hand. > I'll also update and fix them with the progress of v8.4 development. > Before you apply them, please confirm whether they are the latest, or not. > > Bruice, > | KaiGai-san, the only option I can offer is perhaps to list a URL for > | your SE-PostgreSQL patch to be applied by people who want to use SE-PG. > > Does it mean I need to submit a patch to add an introduction under doc/ ? > If so, I'll submit it as soon as possible. > > Thanks, > > > 01) Security system attribute support > scale: 38 files changed, 853 insertions(+), 1 deletion(-), 113 > modifications(!) > This patch adds a new system catalog "pg_security" and enables to store > security identifier associated to a text representation within padding > area of HeapTupleHeader, as object identifier doing. > It is a foundation of any other facilities. > > 02) Core facilities of SE-PostgreSQL > scale: 55 files changed, 3588 insertions(+), 10 deletions(-), 736 > modifications(!) > This patch adds a mandatory access control feature collaborating with > SELinux in table, column, procedure level granurality. Most of this > patch is same as I proposed in the v8.4 development cycle, except for > it is designed on the basis of security system attribute support. > > 03) Writable system column support > scale: 7 files changed, 298 insertions(+), 199 modifications(!) > This patch enables users to update/insert on system columns > ("security_label" > and "security_acl") with explicit values. This feature is necessary to > provide > a user interface for row-level access controls. > > 04) Row-level access controls support > scale: 31 files changed, 1101 insertions(+), 231 modifications(!) > This patch enables to apply mandatory/discretionary access control in > row-level > granularity also. > > 05) Advanced permission checks support > scale: 18 files changed, 858 insertions(+), 3 deletions(-), 43 > modifications(!) > This patch add some of advanced permission checks: >- file:{read write} on server side filesystem accesses >- db_procedure:{install} on user defined functions as system internal ones >- db_database:{load_module install_module} on binary shared library files > In the v8.4 development, these are suggested to separate from the core. > > 06) Security options in utilities > scale: 4 files changed, 95 insertions(+), 116 modifications(!) > This patch adds options on utilities >- "--enable-selinux" option for initdb >- "--security-label" option for pg_dump and pg_dumpall > > 07) Testcases of SE-PostgreSQL > scale: 18 files changed, 1819 insertions(+), 2 modifications(!) > This patch adds testcases for SE-PostgreSQL. > > 08) Documentation of SE-PostgreSQL > scale: 16 files changed, 1595 insertions(+), 42 modifications