Re: Discarding DISCARD ALL
Simon>It seems strange to me that we put this work onto the pooler, forcing
Simon>poolers to repeatedly issue the same command

What if poolers learn to manage connections and prepared statements better? Then poolers won't have to reset the session every time, and everyone wins.

Simon>This has an additional side benefit: if we know we will clean up at
Simon>the end of the transaction, then all temp tables become effectively ON
Simon>COMMIT DROP

The ability to use temporary tables sounds cool; however, server-prepared statements can improve performance significantly (both at frontend and backend), so I would not treat "server_reset_query=discard all" as a best practice. That is why transaction_cleanup=on (discard everything at transaction finish) seems to be a workaround rather than a best practice for the future.

Just to clarify: the patch seems to be small, so it looks harmless; however, I would not be that enthusiastic about tying new features to transaction_cleanup=on.

---

What do you mean by "session cleanup"? Does that always have to be the same as "discard all"? What if the user wants "deallocate all" behavior? Should transaction_cleanup be an enum rather than on/off?

Vladimir
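The "enum rather than on/off" question can be sketched as follows. This is an illustrative sketch only: the mode names and the pooler-side dispatch are hypothetical, not actual PostgreSQL GUC values; the reset commands themselves (DISCARD ALL, DEALLOCATE ALL) are real SQL.

```python
# Hypothetical sketch of "transaction_cleanup as an enum": a pooler picks
# a cleanup command per mode instead of always issuing DISCARD ALL.
# Mode names are invented for illustration.

RESET_QUERIES = {
    "off": None,                     # keep session state across clients
    "deallocate": "DEALLOCATE ALL",  # drop server-prepared statements only
    "discard": "DISCARD ALL",        # drop statements, temp tables, GUCs, ...
}

def reset_query_for(mode: str):
    """Return the cleanup command a pooler would issue, or None."""
    if mode not in RESET_QUERIES:
        raise ValueError(f"unknown cleanup mode: {mode}")
    return RESET_QUERIES[mode]
```

The point is that "deallocate" would keep temp tables while still preventing prepared-statement leaks across pooled clients.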
Re: is ErrorResponse possible on Sync?
>Or, does the backend swallow the error, and return only the ReadyForQuery (I hope not).

What is your backend version?

Here's a well-known case when the backend did swallow the error: "Error on failed COMMIT" https://www.postgresql.org/message-id/b9fb50dc-0f6e-15fb-6555-8ddb86f4aa71%40postgresfriends.org

I don't remember if the behavior has been fixed or not. The expected behavior was that "commit" returned a "rollback" status without raising any error.

Vladimir
Re: Adding CI to our tree
>Just add those files to the global gitignore on your machine

While global gitignore is a nice feature, it won't protect users who do not know they need to create a global ignore file. Adding explicit excludes for well-known temporary files to the PostgreSQL sources makes it easier for everybody to work with the sources. Less manual configuration is better for having a productive environment.

On top of that, there is even a use case for having the .idea folder in Git: the .idea/icon.png and .idea/icon_dark.png files are displayed in JetBrains IDEs, so the list of projects becomes easier to distinguish. AFAIK, a standard icon configuration does not yet exist: https://github.com/editorconfig/editorconfig/issues/425

Here are samples: https://github.com/junit-team/junit5/blob/4ddc786728bc3fbc68d6a35d2eeeb63eb3e85609/.idea/icon.png , https://github.com/gradle/gradle/tree/1be71a9cd8882b08a9f8728d44eac8f65a33fbda/.idea

Vladimir
max_memory_per_backend GUC to limit backend's memory usage
Hi,

I've got a problem with PostgreSQL 9.6.5: a backend gets killed by the OOM killer, and that shuts the DB down. Of course, the OOM case itself is to be investigated (MemoryContextStatsDetail, etc.); however, I wonder if the DB could be more robust. The sad thing is that a single backend crash results in a DB shutdown, so it interrupts lots of transactions.

I wonder if a GUC could be implemented so that only a single backend fails, by limiting its memory use. For instance: max_memory_per_backend=100MiB.

The idea is to increase stability by limiting each process. Of course, it would result in "out of memory" in case a single query requires 100500MiB (e.g. it misestimates the hash join). As far as I understand, it should be safer to terminate just one bad backend rather than kill all the processes.

I did some research, and I have not found a discussion of this idea.

Vladimir Rusinov> FWIW, lack of per-connection and/or global memory limit for work_mem is major PITA
> https://www.postgresql.org/message-id/CAE1wr-ykMDUFMjucDGqU-s98ARk3oiCfhxrHkajnb3f%3DUp70JA%40mail.gmail.com

Vladimir
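The proposed behavior can be sketched in miniature (this is illustrative pseudocode in Python, not PostgreSQL source): account allocations per backend and fail just that backend's query with "out of memory" once the cap is exceeded, instead of letting the OS OOM killer take the whole cluster down.

```python
# Illustrative sketch of a max_memory_per_backend-style cap: the class and
# method names are invented for this example.

class OutOfMemory(Exception):
    pass

class BackendMemory:
    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.used = 0

    def alloc(self, nbytes):
        if self.used + nbytes > self.limit:
            # Only this backend's query fails; other backends keep running.
            raise OutOfMemory(
                f"allocation of {nbytes} would exceed limit {self.limit}")
        self.used += nbytes

    def free(self, nbytes):
        self.used = max(0, self.used - nbytes)
```

The key property: exceeding the limit raises a recoverable error inside one process rather than triggering a kernel-level kill.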
Backend memory dump analysis
Hi,

I am investigating an out-of-memory-related case for PostgreSQL 9.6.5, and it looks like MemoryContextStatsDetail + gdb are the only friends there.

MemoryContextStatsDetail does print some info; however, it is rarely possible to associate the used memory with business cases. For instance:

CachedPlanSource: 146224 total in 8 blocks; 59768 free (3 chunks); 86456 used
CachedPlanQuery: 130048 total in 7 blocks; 29952 free (2 chunks); 100096 used

It does look like 182KiB has been spent on some SQL; however, there's no clear way to tell which SQL is to blame.

Another case:

PL/pgSQL function context: 57344 total in 3 blocks; 17200 free (2 chunks); 40144 used

It is not clear what is inside there, which "cached plans" are referenced by that plpgsql context (if any), etc.

It would be great if there was an ability to dump the memory in a machine-readable format (e.g. Java's HPROF). Eclipse Memory Analyzer (https://www.eclipse.org/mat/) can visualize Java memory dumps quite well, and I think the HPROF format is trivial to generate (the generation is easy; the hard part is to parse memory contents). That is, we could get an analysis UI for free if PostgreSQL produced the dump.

Is it something welcome or non-welcome? Is it something worth including in-core?

Vladimir
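As a stopgap short of a real machine-readable dump, the existing MemoryContextStatsDetail lines can at least be parsed into structured records. A minimal sketch, matching the line format of the examples quoted above:

```python
import re

# Parse MemoryContextStatsDetail lines such as
# "CachedPlanSource: 146224 total in 8 blocks; 59768 free (3 chunks); 86456 used"
# into a dict; returns None for lines in a different format.

STATS_RE = re.compile(
    r"(?P<name>.+?): (?P<total>\d+) total in (?P<blocks>\d+) blocks?; "
    r"(?P<free>\d+) free \((?P<chunks>\d+) chunks?\); (?P<used>\d+) used"
)

def parse_context_line(line):
    m = STATS_RE.match(line.strip())
    if not m:
        return None
    d = m.groupdict()
    return {k: (v if k == "name" else int(v)) for k, v in d.items()}
```

This gives the numbers, but not the association with SQL or business logic, which is exactly the gap a real dump format would close.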
Re: Backend memory dump analysis
Andres>The overhead required for it (in cycles, in higher memory usage due to additional bookkeeping

Does that mean the memory contexts are unparseable? (there's not enough information to enumerate contents)

What if the memory dump is produced by walking the C structures? For instance, I assume the statement cache is stored in some sort of a hash table, so there should be a way to enumerate it in a programmatic way. Of course it would take time; however, I do not think it creates CPU/memory overheads. The overhead is to maintain the "walker" code.

Vladimir
Re: Backend memory dump analysis
It looks much better.

>While I didn't do anything about it here, I think it'd likely be a
>good idea for MemoryContextStats printout to truncate the context ID
>strings at 100 characters or so

It would be great if there was an option to show the full SQL.

For instance, the current statistics are not sorted (should they be?), so the code just prints the first 100 items no matter the size of the entries. Luckily there's a function that accepts "number of nodes to print" as an argument. It would be better if it printed the 100 top consumers (including descendants) though.

I think it makes sense to move all the numerics to the front, so the numbers are located in more or less the same columns. The current output is hard to read, as you basically have to search for the "free" and "used" markers.

What do you think of (numbers of bytes printed with 6 characters, and numbers of chunks with 2 characters)

 16384 total in 2 blocks;  12448 used;   3936 free ( 5 chunks): PL/pgSQL function tg_pslot_biu()

instead of

PL/pgSQL function tg_pslot_biu(): 16384 total in 2 blocks; 3936 free (5 chunks); 12448 used

?

I think "used memory" is more important than "free". As far as I understand, the main use case is "analyze memory consumption", so one cares about "what is consuming the memory" more than "which contexts have enough free space".

PS. "TopMemoryContext: 2143904 total" and "Grand total: 13115912 bytes" are confusing. It basically says "TopMemoryContext is 2MiB, and the grand total is somehow 12MiB". It does not clarify whether the totals are "self memory" or "including descendant contexts". In your case the difference is 10MiB, and it is not that obvious why.

Vladimir
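The proposed "numbers first, fixed width" layout is easy to express as a format string; a sketch (the function name and widths are just this example's choices, matching the 6-character/2-character suggestion above):

```python
# Format a memory-context line with right-aligned numbers in front and the
# context name last, so "used" values line up in a scannable column.

def format_context(name, total, blocks, used, free, chunks):
    return (f"{total:6d} total in {blocks} blocks; "
            f"{used:6d} used; {free:6d} free ({chunks:2d} chunks): {name}")
```

Any context larger than 999999 bytes would simply widen its column, which still beats hunting for the "used" marker at varying offsets.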
Re: Backend memory dump analysis
Tom>Well, as I said, you can do anything you want now in an extension.

That is true. However, it basically means "everybody who cares to troubleshoot the memory use of a production system should install an extension". Should https://wiki.postgresql.org/wiki/Developer_FAQ#Examining_backend_memory_use provide a link to the extension then?

Tom>Actually the key number is the one that already is printed
Tom>first, ie the total space consumed by the context

The space used is more important than the context name itself. What do you think of

8192 (2 blocks) CachedPlan: 1504 free (0 chunks); 6688 used: SELECT (SELECT COUNT(*)FROM (SELECT * FROM new_test UNION ALL SELECT * FROM new_test) ss)

?

PS. "1504 free (0 chunks)" reads odd.

Tom>Very occasionally, you might be interested in spotting contexts that have
Tom>a disproportionate amount of free space, but IME that's seldom the main
Tom>issue.

Fully agree. That is why I suggest the "total, used, free" order, so it matches the likelihood of usage.

Vladimir
Re: Backend memory dump analysis
Tom>One concrete objection to the above is it'd obscure hierarchical relationships in the context tree,

What is the problem with relationships? Context names are aligned as well, provided 8192 is justified to 6-7-8-9 (you pick) characters.

Tom>But given the lack of previous complaints

1) Here it is: https://www.postgresql.org/message-id/547cee32.9000...@fuzzy.cz

The log lists "TopMemoryContext: 136614192 total", so everybody follows the "ah, there's 130MiB used" route. Nobody in the thread mentions the 300MiB taken by "ExecutorState: 318758960 total". It takes just a single pass to compute "total" (and it takes no memory), so it would be much better if "TopMemoryContext: ..." was replaced with "Total memory used by all contexts is XXX bytes". The current TopMemoryContext row is extremely misleading.

2) Here is a complaint about the "100 contexts" limit: https://www.postgresql.org/message-id/55D7F9CE.3040904%402ndquadrant.com

Note: 100 was invented "at random" in response to "let's not print everything by default". I do agree with having a limit by default; however, it would be so much better if it selected the rows to print, even at a cost of increased CPU cycles in the print procedure. For instance: pgjdbc limits itself to 256 server-prepared statements by default (per backend). That is, the current "...Stats" would just ignore at least half of the prepared statements.

3) If you care so much about the number of passes (frankly speaking, I think one can easily wait 5-10 seconds for debugging/troubleshooting stuff), then an aggregate summary can still be computed and printed with no additional passes (and very limited memory) if the tree is printed in "child-parent" order. That is, print the "parent context" information after the children iteration is done.

PS. SQL text might contain sensitive information (e.g. logins, passwords, personal IDs), so there might be security issues with printing SQL by default.

Vladimir
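Points (2) and (3) can be sketched together: walk the context tree children-first so each parent's inclusive total (self plus descendants) is known by the time the parent is emitted, then select the top-N consumers by inclusive size instead of the first N in tree order. A toy model (the tree representation is invented for this example):

```python
# tree maps name -> (self_bytes, [child names]); a child-first walk computes
# inclusive totals in a single pass, then top-N selection sorts the result.

def inclusive_totals(tree, root="TopMemoryContext"):
    result = {}

    def walk(node):
        self_bytes, children = tree[node]
        total = self_bytes + sum(walk(c) for c in children)
        result[node] = total  # parent emitted after its children
        return total

    walk(root)
    return result

def top_consumers(tree, n):
    totals = inclusive_totals(tree)
    return sorted(totals.items(), key=lambda kv: -kv[1])[:n]
```

With inclusive totals, an "ExecutorState" holding 300MiB can no longer hide behind a small self-size of its parent.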
Re: Proposal: http2 wire format
Hi,

>If anyone finds the idea of Postgres speaking http2 appealing

HTTP/2 sounds interesting. What do you think of https://grpc.io/ ? Have you evaluated it?

It does sound like a ready RPC on top of HTTP/2 with support for lots of languages. The idea of reimplementing the protocol for multiple languages from scratch does not sound too appealing.

Vladimir
Re: Proposal: http2 wire format
Damir> * What are the criteria for getting this into the core?

Craig>Mine would be:

+1

There's a relevant list as well: https://github.com/pgjdbc/pgjdbc/blob/master/backend_protocol_v4_wanted_features.md

Vladimir
Re: Proposal: http2 wire format
Damir>Postgres doesn’t know about grpc

I'm afraid you are missing the point. I would say PostgreSQL doesn't know about HTTP/2. It is the same as "PostgreSQL doesn't know about grpc".

Here's a quote from your pg_h2 repo:

>What we need is to really build a request object and correctly extract
>the full payload and parameters from the request. For example,
>maybe we want to implement a QUERY method, similar to POST or PUT,
>and pass the query text as the body of the request, with parameters
>in the query string or in the headers

It basically suggests implementing your own framing on top of HTTP/2.

When I say gRPC, I mean "implement a PostgreSQL-specific protocol via gRPC messages". Let's take the current message formats: https://www.postgresql.org/docs/current/static/protocol-message-formats.html

If one defines those message formats via gRPC, then gRPC would autogenerate parsers and serializers for lots of languages "for free".

For instance

Query (F)
  Byte1('Q')
    Identifies the message as a simple query.
  Int32
    Length of message contents in bytes, including self.
  String
    The query string itself.

can be defined via gRPC as

message Query {
  string queryText = 1;
}

This is trivial to read, trivial to write, trivial to maintain, and it automatically generates parsers/generators for lots of languages. Parsing of the current v3 protocol has to be reimplemented for each and every language, and it would be a pain to implement parsing for v4. Are you going to create "http/2" clients for Java, C#, Ruby, Swift, Dart, etc., etc.?

I am not saying that a mere redefinition of v3 messages as gRPC would do the trick. I am saying that you'd better consider frameworks that would enable transparent implementation of client libraries.

Damir>and will talk to any HTTP2 conforming client

I do not see where you are heading. Is "curl as PostgreSQL client" one of the key objectives for you?
True clients (the ones that are used by the majority of applications) should support things like "prepared statements", "data types", "cursors" (resultset streaming), etc. I can hardly imagine a case when one would use curl and operate with prepared statements.

I think psql is a pretty good client, so I see no point in implementing HTTP/2 for the mere reason of using curl to fetch data from the DB.

Vladimir
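To make the "autogenerated for free" point concrete, here is the hand-rolled framing every client must implement today for the v3 Query message quoted above: Byte1('Q'), Int32 length including itself, null-terminated query string. A minimal Python sketch:

```python
import struct

# v3 "Query" (simple query) message: 'Q', Int32 length (including the length
# field itself), then the SQL text as a null-terminated string.

def encode_query(sql: str) -> bytes:
    payload = sql.encode("utf-8") + b"\x00"
    return b"Q" + struct.pack("!I", 4 + len(payload)) + payload

def decode_query(buf: bytes) -> str:
    assert buf[0:1] == b"Q"
    (length,) = struct.unpack("!I", buf[1:5])
    payload = buf[5:1 + length]          # length counts the 4 length bytes
    return payload.rstrip(b"\x00").decode("utf-8")
```

This is exactly the per-language boilerplate that a shared IDL definition (the `message Query` above) would generate automatically.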
Re: Proposal: http2 wire format
Tom>But starting from the assumption that HTTP2 solves our problems seems to me to be "Here's a hammer.

Agree.

Just a side note: if v4 is ever invented, I wish client language support were considered. It takes resources to implement message framing and data parsing (e.g. int, timestamp, struct, array, ...) for each language independently.

Vladimir
Re: Proposal: http2 wire format
It could make sense to arrange a Google Hangouts conversation (or alike). Hangouts allows recording sessions with up to 10 speakers and unlimited listeners. The recording can be shared via YouTube.

Damir>Funny you agree with that—for someone having the experience of writing a driver and having a long list of things that you find wrong and frustrating

I've been contributing to pgjdbc since 2014, so I do know how "great" v3 is. I wish there was a machine-readable protocol definition (e.g. Apache Thrift and/or gRPC and/or ProtoBuf, etc.). However, I fully agree with Tom that a mere usage of HTTP/2 would just render havoc with little outcome.

Just to make it clear: HTTP/2 does not solve the "hard to parse", "hard to load-balance", "hard to proxy" v3 problems, and it creates a huge problem of "let's reimplement an HTTP/2 client for all the clients/languages".

Of course HTTP/2 stream multiplexing could be extremely helpful, for QueryCancel commands and/or for heartbeat messages. However, those things alone do not justify the use of low-level HTTP/2.

Damir>Why parse some obscure Postgres-specific binary data types when you can have the database send you the data in the serialization format of your client language?

Suppose you want to use a prepared statement. Then you need to do it in sequence:

1) Parse the SQL text and convert it to a handle
2) Bind parameter values to the handle
3) Execute the statement
4) Receive rows
5) Receive the error, if any

Currently this is implemented via different v3 messages (parse, bind, execute, row description, data row, etc., etc.). The claim is that *any* implementation "on top of HTTP/2" would basically require implementing those "parse, bind, execute, row data, etc." *messages*. Say you pick a "/parse" URL with the SQL text in the body instead of a "parse message". That does not make the whole thing "just HTTP/2". It just means you've created "your own protocol on top of HTTP/2".
Clients would have to know the sequence of valid messages; clients would have to know whether the SQL should be present in the body, in the URL, or in form post data, etc., etc.

By the way: you cannot just "load balance" "parse/bind/exec" to different backends, so the load balancer must be aware of the meaning of those "parse/bind/exec" messages. I believe that is one of the requirements Craig meant by "Is practical to implement in connection pooler proxies". I believe Andres means exactly the same thing when he says

Andres>You *still* need a full blown protocol ontop of it. So no, this doesn't change that

Damir> Did you know that Go has HTTP2 support in the standard library? And so does java, too?

Java has a TCP implementation in the standard library. Does it help implementing the v3 protocol? HTTP/2 "as a library" helps implementing v4 in exactly the same way: it does not. The developer would still have to somehow code the coding rules (e.g. header names, body formats). HTTP/2 is just too low-level.

Damir>Why parse some obscure Postgres-specific binary data types when you can have the database send you the data in the serialization format of your client language?

From my own experience, automatic use of server-prepared statements (see https://github.com/pgjdbc/pgjdbc/pull/319 ) cut the end-user response times of our business application in half. That is, clients would have to know the way to use prepared statements in order to get decent performance.

If you agree with that, then "v3 parse message", "v3 bind message", "v3 execute message" are not that different from "HTTP/2 POST to /parse", "HTTP/2 POST to /bind", "HTTP/2 POST to /execute". It is still "obscure PostgreSQL-specific HTTP/2 calls".

Even if you disagree (really?), you would still have to know the PostgreSQL-specific way to encode the SQL text, the "number of rows returned", and the "wire formats for the columns", even for a single "HTTP POST /just/execute/sql" kind of API. Even that is "a full blown protocol ontop of HTTP2" (c) Andres.

Vladimir
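The "clients would have to know the sequence of valid messages" claim boils down to a state machine that exists regardless of transport. A toy sketch (the state names are illustrative, loosely following the extended-query flow, not the exact v3 state set):

```python
# Whatever the transport -- v3 framing or hypothetical HTTP/2 POSTs to
# /parse, /bind, /execute -- a client must drive the same prepared-statement
# state machine. The transport does not remove the protocol semantics.

VALID_NEXT = {
    "start":   {"parse"},
    "parse":   {"bind"},
    "bind":    {"execute"},
    "execute": {"rows"},
    "rows":    {"bind", "parse", "done"},  # re-bind new params, or finish
}

def run_sequence(steps):
    """Return True if the client's message sequence is legal."""
    state = "start"
    for step in steps:
        if step not in VALID_NEXT[state]:
            return False
        state = step
    return True
```

A load balancer that wants to route "parse/bind/exec" must track exactly this kind of state per client, which is the pooler-proxy requirement mentioned above.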
Re: Proposal: http2 wire format
Damir>Wouldn’t that be protocol semantics? Framing is already taken care of by the wire protocol.

Apparently I'm using the wrong word. I do mean protocol semantics.

Damir>But can you pull off grpc.. Would it be the only protocol supported?

Of course there will be lots of "older clients". For instance, pgjdbc supported both v2 and v3 for quite a while. Now it supports just v3.

Damir>Can you imagine the reaction and discussion if I came up with this?

I would +1 for that :)

Damir>how can I do something about the status quo of FEBE protocol that would be defensible in front of the Postgres community?” What would be your answer?

I would definitely check if frameworks like gRPC, Apache Thrift, and similar are suitable.

Damir>What if I wanted JSON or CSV returned

That is not up to the protocol. It is not up to the backend. It looks like a task for the application, since CSV could mean lots of things (e.g. commas vs tabs vs quotes, etc.). You don't want to integrate all that stuff in core.

Damir>Wouldn’t you need to first solve the problem of content negotiation?

Currently content negotiation is done at the application level (e.g. at the bind message). I don't think it makes sense to negotiate the content for each and every protocol message. It sounds like over-engineering.

Damir>Wouldn’t HTTP2 framing still allow prepared statements and cursors?

It would require implementing the parse/bind/execute "protocol messages" again and again.

Damir>I’m just proposing a layer under it that gets rid of a lot of pain.

The thing is, it is not clear whose pain you are treating. Application developers just use drivers, so they don't care if there's HTTP/2 under the covers. Driver developers have more-or-less tested v3 implementations, so they don't have that pain of "implementing a v3 parser". PostgreSQL core developers ..., well, you know.

Vladimir
Re: [HACKERS] Insert values() per-statement overhead
Andres>I think the biggest overhead here is that the executor startup includes
Andres>too many indirect (linked lists) datastructures, that allocated each
Andres>round

The case is very common: batch inserts are popular in Java, and ORMs use the batch API automatically. However, there's high per-backend-message overhead, and that overhead is very noticeable. What is the approach to handle this?

Folding multiple DML statements into one with the help of a CTE does not work either (see https://github.com/pgjdbc/pgjdbc/issues/1165 ):

CTE doc>Trying to update the same row twice in a single statement is not supported. Only one of the modifications takes place, but it is not easy (and sometimes not possible) to reliably predict which one

Vladimir
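One client-side mitigation for INSERT specifically is the rewrite that pgjdbc's reWriteBatchedInserts option performs: fold a batch of single-row INSERTs into one multi-values statement, so the per-statement executor startup is paid once per batch rather than once per row. A sketch of the rewriting (simplified; the real driver also handles bind-parameter limits and non-power-of-two batch sizes):

```python
# Fold n single-row INSERTs into one multi-values INSERT with '?'
# placeholders, as a JDBC-style driver would before sending it.

def rewrite_batched_insert(table, columns, nrows):
    row = "(" + ", ".join(["?"] * len(columns)) + ")"
    return (f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
            + ", ".join([row] * nrows))
```

This works only for INSERT; the CTE limitation quoted above is why the same folding cannot be applied safely to batched UPDATEs.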
Re: Built-in connection pooling
Christopher>One of the things that they find likable is that by having the connection pool live
Christopher>in the framework alongside the application is that this makes it easy to attach
Christopher>hooks so that the pool can do intelligent things based on application-aware logic.

I'm afraid I do not follow you. Can you please provide an example?

TL;DR:
1) I think in-application pooling would be required for performance reasons in any case.
2) Out-of-application pooling (in-backend or in-the-middle) is likely needed as well.

JDBC clients use client-side connection pooling for performance reasons:

1) Connection setup does have overhead:
1.1) A TCP connection takes time to init/close
1.2) Startup queries involve a couple of roundtrips: the "startup packet", then "SET extra_float_digits = 3", then "SET application_name = '...' "
2) Binary formats on the wire are tied to oids. Clients have to cache the oids somehow, and "cache per connection" is the current approach.
3) Application threads tend to augment "application_name", "search_path", etc. for their own purposes, and it would slow the application down significantly if the JDBC driver reverted application_name/search_path/etc. for each and every "connection borrow".
4) I believe there's non-zero overhead for backend process startup.

As Konstantin lists in the initial email, the problem is that the backend itself does not scale well with lots of backend processes.

In other words: it is fine if PostgreSQL is accessed by a single Java application, since the number of connections would be reasonable (limited by the Java connection pool). That, however, is not the case when the DB is accessed by lots of applications (==lots of idle connections) and/or in case the application is using short-lived connections (==the in-app pool is missing, which forces backend processes to come and go).

Vladimir
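Point (3) above can be illustrated with a toy model (class and field names are invented for this sketch, not pgjdbc internals): the pool tracks per-connection session state and restores defaults lazily, only when the previous borrower actually changed something, instead of issuing reset queries on every borrow.

```python
# Lazy session-state restore: a SET is issued only for parameters that
# actually diverged from the defaults, so a clean hand-back costs nothing.

DEFAULTS = {"application_name": "app", "search_path": "public"}

class PooledConnection:
    def __init__(self):
        self.session = dict(DEFAULTS)
        self.reset_queries = 0  # counts simulated "SET ..." roundtrips

    def set_param(self, name, value):
        self.session[name] = value

    def restore_defaults(self):
        for name, value in DEFAULTS.items():
            if self.session[name] != value:
                self.session[name] = value  # a real driver would SET here
                self.reset_queries += 1
```

An in-the-middle pooler that blindly issues DISCARD ALL pays the reset roundtrip on every checkout; the lazy variant pays only when state actually changed.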
Re: Built-in connection pooling
>Development in built-in connection pooling will be continued in https://github.com/postgrespro/postgresql.builtin_pool.git

The branch (as of 0020c44195992c6dce26baec354a5e54ff30b33f) passes pgjdbc tests: https://travis-ci.org/vlsi/pgjdbc/builds/368997672

The current tests are mostly single-threaded, so the tests are unlikely to trigger lots of "concurrent connection" uses. The next step might be to create multiple schemas and execute multiple tests in parallel.

Vladimir
Re: Protect syscache from bloating with negative cache entries
Robert> This email thread is really short on clear demonstrations that X or Y
Robert> is useful.

It is useful when the whole database does **not** crash, isn't it?

Case A (==current PostgreSQL mode): the syscache grows, then the OOM killer chimes in, kills the database process, and that leads to a complete cluster failure (all other PG processes terminate themselves).

Case B (==limit the syscache to 10MiB or whatever, as Tsunakawa, Takayuki asks): a single ill-behaved process works a bit slower and/or consumes more CPU than the other ones. The whole DB is still alive.

I'm quite sure "case B" is much better for the end users and for the database administrators. So, +1 to Tsunakawa, Takayuki; it would be so great if there was a way to limit the memory consumption of a single process (e.g. syscache, work_mem, etc., etc.).

Robert> However, memory usage is quite unpredictable. It depends on how many
Robert> backends are active

The number of backends can be limited by ensuring proper limits at the application connection pool level and/or pgbouncer and/or things like that.

Robert>how many copies of work_mem and/or
Robert> maintenance_work_mem are in use

There might be other patches to cap the total use of work_mem/maintenance_work_mem.

Robert>I don't think we
Robert> can say that just imposing a limit on the size of the system caches is
Robert> going to be enough to reliably prevent an out of memory condition

The fewer possibilities there are for OOM, the better. Quite often it is much better to fail a single SQL statement rather than kill all the DB processes.

Vladimir
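"Case B" is essentially a bounded cache with some eviction policy. A minimal sketch (LRU eviction here stands in for whatever policy the server would actually use; the class is illustrative, not PostgreSQL code):

```python
from collections import OrderedDict

# A size-capped cache: once full, the least recently used entry is evicted,
# so the cache gets slower on misses instead of growing without bound.

class BoundedCache:
    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.data = OrderedDict()

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)   # mark as recently used
            return self.data[key]
        return None                      # miss: caller rebuilds the entry

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.max_entries:
            self.data.popitem(last=False)  # evict least recently used
```

The trade-off is exactly the one described above: a cache miss costs CPU for one backend, while an unbounded cache eventually costs the whole cluster.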
Re: portal pinning
> committed

I'm afraid it causes regressions for pgjdbc. Here's the CI log: https://travis-ci.org/pgjdbc/pgjdbc/jobs/327327402

The errors are:

testMetaData[typeName = REF_CURSOR, cursorType = 2,012](org.postgresql.test.jdbc2.RefCursorTest)  Time elapsed: 0.032 sec  <<< ERROR!
org.postgresql.util.PSQLException: ERROR: cannot drop pinned portal ""

It looks like "refcursor" expressions are somehow broken. The test code executes the testspg__getRefcursor procedure, which is defined as follows:

CREATE OR REPLACE FUNCTION testspg__getRefcursor () RETURNS refcursor AS '
declare v_resset refcursor;
begin
  open v_resset for select id from testrs order by id;
  return v_resset;
end;' LANGUAGE plpgsql;

Would you please check that?

Vladimir
Re: jdbc targetServerType=slave with readonly connection
>the slave replication server does not seem to be recognized correctly

Could you elaborate on what you mean by "does not seem to be recognized correctly"?

You might be hitting https://github.com/pgjdbc/pgjdbc/pull/844 , so would you please check if pgjdbc 42.2.0-SNAPSHOT (e.g. https://oss.sonatype.org/content/repositories/snapshots/org/postgresql/postgresql/42.2.0-SNAPSHOT/ ) resolves your issue?

>jdbc driver thinks it is connected to a slave

pgjdbc uses the "show transaction_read_only" query to tell the master from a secondary. The "readOnly=true" connection option is used/activated _after_ the connection is made, so it should not impact master/secondary selection.

PS. There's a 10-second-long (hostRecheckSeconds) cache of the host status. That is, pgjdbc caches the master/secondary state of the host for 10 seconds by default, so it might "fail" to identify a recently activated secondary.

Vladimir
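The hostRecheckSeconds behavior described above can be sketched as a TTL cache (this is an illustrative model, not pgjdbc's actual implementation): the probe (i.e. "show transaction_read_only") runs only when the cached answer is older than the recheck interval, so a freshly promoted or activated host may be misclassified for up to that long.

```python
import time

# Per-host status cache with a recheck TTL (default 10 s, mirroring
# pgjdbc's hostRecheckSeconds). `probe` stands in for the check query.

class HostStatusCache:
    def __init__(self, recheck_seconds=10, clock=time.monotonic):
        self.recheck_seconds = recheck_seconds
        self.clock = clock
        self.cache = {}  # host -> (status, checked_at)

    def status(self, host, probe):
        now = self.clock()
        entry = self.cache.get(host)
        if entry and now - entry[1] < self.recheck_seconds:
            return entry[0]              # possibly stale cached answer
        fresh = probe()                  # e.g. run "show transaction_read_only"
        self.cache[host] = (fresh, now)
        return fresh
```

The injectable `clock` makes the staleness window easy to demonstrate without real sleeps.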
Re: Built-in connection pooling
Bruce>Yes, it would impact applications and you are right most applications could not handle that cleanly.

I would disagree here. We are discussing applications that produce "lots of idle" connections, aren't we? That typically comes from an application-level connection pool. Most connection pools have a setting that "validates" a connection in case it was not used for a certain period of time. That plays nicely in case the server drops an "idle, not in a transaction" connection.

Of course, there are cases when the application just grabs a connection from a pool and uses it in a non-transacted way (e.g. does some action once an hour and commits immediately). However, that kind of application would already face firewalls, etc. I mean the application should already be prepared to handle "network issues".

Bruce> It is probably better to look into
Bruce>freeing resources for idle connections instead and keep the socket open.

The application might expect the session-specific data to be present, so it might be even worse if the database deallocates everything but the TCP connection.

For instance: the application might expect the server-prepared statements to be there. Would you deallocate server-prepared statements for those "idle" connections? The app would just break. There's no way (currently) for the application to know that a statement expired unexpectedly.

Vladimir
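The "validate on borrow after an idle period" behavior described above can be sketched as follows (an illustrative model of what typical pools such as HikariCP or DBCP do; names are invented for this example):

```python
import time

# A pool that trusts recently used connections, validates connections that
# were idle past a threshold, and silently replaces dead ones -- which is
# why server-side drops of idle connections are survivable.

class ValidatingPool:
    def __init__(self, connect, validate, idle_validate_seconds=30,
                 clock=time.monotonic):
        self.connect = connect            # opens a new connection
        self.validate = validate          # e.g. runs "SELECT 1" / isValid()
        self.idle_validate_seconds = idle_validate_seconds
        self.clock = clock
        self.idle = []                    # list of (conn, released_at)

    def borrow(self):
        while self.idle:
            conn, released_at = self.idle.pop()
            if self.clock() - released_at < self.idle_validate_seconds:
                return conn               # recently used; trust it
            if self.validate(conn):
                return conn               # idle but still alive
            # dead: drop it and try the next idle connection
        return self.connect()

    def release(self, conn):
        self.idle.append((conn, self.clock()))
```

This handles a server that closes the socket, but not a server that silently discards session state (prepared statements) while keeping the socket open, which is the objection raised above.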
Re: Built-in connection pooling
Bruce>Well, we could have the connection pooler disconnect those, right?

I agree. Do you think we could rely on all the applications being configured in a sane way? A fallback configuration at the DB level could still be useful to ensure the DB keeps running in case multiple applications access it. It might be non-trivial to ensure proper configurations across all the apps.

What I do like is that the behaviour of dropping connections should already be considered by existing applications, so it should fit the existing apps naturally.

An alternative approach might be to dump the relevant resources for inactive sessions to disk, so the session could be recreated in case the connection is requested again after a long pause (e.g. re-prepare all the statements); however, that sounds scary.

Vladimir
Re: Built-in connection pooling
Konstantin>I have obtained more results with YCSB benchmark and built-in connection pooling

Could you provide more information on the benchmark setup you have used? For instance: benchmark library versions, PostgreSQL client version, additional/default benchmark parameters.

Konstantin>Postgres shows significant slow down with increasing number of connections in case of conflicting updates.
Konstantin>Built-in connection pooling can somehow eliminate this problem

Can you please clarify how connection pooling eliminates the slowdown? Is the case as follows?

1) The application updates multiple rows in a single transaction
2) There are multiple concurrent threads
3) The threads update the same rows at the same time

If that is the case, then the actual workload is different each time you vary the connection pool size. For instance, if you use 1 thread, then the writes become uncontended.

Of course, you might use it just as a "black box" workload; however, I wonder if that kind of workload ever appears in real-life applications. I would expect applications to update the same row multiple times; however, I would expect the app to be doing subsequent updates, not concurrent ones.

On the other hand, as you vary the pool size, the workload varies as well (the resulting database contents are different), so it looks like comparing apples to oranges.

Vladimir
Re: Built-in connection pooling
Konstantin>I have not built YCSB myself, use existed installation.

Which pgjdbc version was in use?

Konstantin>One of the main problems of Postgres is significant degrade of performance in case of concurrent write access by multiple transactions to the same rows.

I would consider that a workload "problem" rather than a PostgreSQL problem. That is, if an application (e.g. YCSB) is trying to update the same rows in multiple concurrent transactions, then the outcome of such updates is likely to be unpredictable. Does it make sense?

At least, I do not see why Mongo would degrade in a different way there. Oleg's charts suggest that Mongo does not degrade there, so I wonder if we are comparing apples to apples in the first place.

Vladimir
Re: Built-in connection pooling
> config/pgjsonb-local.dat

Do you use the standard "workload" configuration values? (e.g. recordcount=1000, maxscanlength=100)

Could you share the ycsb output (e.g. for workload a)? I mean lines like

[TOTAL_GC_TIME], Time(ms), xxx
[TOTAL_GC_TIME_%], Time(%), xxx

>postgresql-9.4.1212.jar

OK, you have the relevant performance fixes then.

Konstantin>Just simple example: consider that you have something like AppStore and there is some popular application which is bought by a lot of users.
Konstantin>From DBMS point of view a lot of clients perform concurrent update of the same record.

I thought YCSB updated *multiple rows* per transaction. It turns out all the default YCSB workloads update just one row per transaction. There is no batching, etc. Batch-related parameters are used at "DB initial load" time only.

Konstantin>Postgres locks tuples in very inefficient way in case of high contention

Thank you for the explanation.

Vladimir
Re: Built-in connection pooling
Konstantin>I do not have explanation of performance degradation in case of this particular workload.

A) The Mongo Java Client uses a connection pool of 100 connections by default. That is, it does not follow "connection per client" (in YCSB terms), but is capped at 100 connections. I think it can be adjusted by adding ?maxPoolSize=100500 or ?maxpoolsize=100500 to the Mongo URL. I wonder if you could try to vary that parameter and see if it changes the Mongo results.

B) There's a bug in the JDBC client of YCSB (it might affect the PostgreSQL results; however, I'm not sure if the impact would be noticeable). The default configuration is readallfields=true; however, the JDBC client just discards the results instead of accessing the columns. I've filed https://github.com/brianfrankcooper/YCSB/issues/1087 for that.

C) I might be missing something; however, my local (MacBook) benchmarks show that PostgreSQL 9.6 somehow uses Limit->Sort->BitmapScan kinds of plans. I have picked a "bad" userid value via auto_explain. The JDBC client uses prepared statements, so a single bind might spoil the whole thing, causing bad plans for all the values afterwards. Does it make sense to disable bitmap scan somehow?
For instance:

explain (analyze, buffers) select * From usertable where YCSB_KEY>='user884845140610037639' order by YCSB_KEY limit 100;

                                    QUERY PLAN
---
 Limit  (cost=320.99..321.24 rows=100 width=1033) (actual time=1.408..1.429 rows=100 loops=1)
   Buffers: shared hit=140
   ->  Sort  (cost=320.99..321.33 rows=135 width=1033) (actual time=1.407..1.419 rows=100 loops=1)
         Sort Key: ycsb_key
         Sort Method: quicksort  Memory: 361kB
         Buffers: shared hit=140
         ->  Bitmap Heap Scan on usertable  (cost=9.33..316.22 rows=135 width=1033) (actual time=0.186..0.285 rows=167 loops=1)
               Recheck Cond: ((ycsb_key)::text >= 'user884845140610037639'::text)
               Heap Blocks: exact=137
               Buffers: shared hit=140
               ->  Bitmap Index Scan on usertable_pkey  (cost=0.00..9.29 rows=135 width=0) (actual time=0.172..0.172 rows=167 loops=1)
                     Index Cond: ((ycsb_key)::text >= 'user884845140610037639'::text)
                     Buffers: shared hit=3
 Planning time: 0.099 ms
 Execution time: 1.460 ms

vs

explain (analyze, buffers) select * From usertable where YCSB_KEY>='user184845140610037639' order by YCSB_KEY limit 100;

                                    QUERY PLAN
---
 Limit  (cost=0.28..89.12 rows=100 width=1033) (actual time=0.174..0.257 rows=100 loops=1)
   Buffers: shared hit=102
   ->  Index Scan using usertable_pkey on usertable  (cost=0.28..2154.59 rows=2425 width=1033) (actual time=0.173..0.246 rows=100 loops=1)
         Index Cond: ((ycsb_key)::text >= 'user184845140610037639'::text)
         Buffers: shared hit=102
 Planning time: 0.105 ms
 Execution time: 0.277 ms

Vladimir
Re: [GSoC Idea Discussion] "Thrift datatype support" Project
Technically speaking, Thrift is a language-independent serialization/deserialization format with a clean approach to backward compatibility.

I think Thrift (or something like it) can be useful, as it can generate serializers/deserializers for lots of languages. PostgreSQL's "binary" format is tied to PostgreSQL and has to be reimplemented for each and every client language. The "text" format is non-trivial as well (e.g. it is hard to get quoting right for structs, and text is server-locale-dependent).

Vladimir
Re: IDE setup and development features?
Mori>Was wondering if anyone has had luck getting these three set up for any IDE or editor configuration?

Just a data point: CLion + CMake work just great.

Step by step (just checked on macOS):
1) "Check out from Version Control" -> Git -> https://github.com/stalkerg/postgres_cmake.git -> clone
2) CLion asks if it should open CMakeLists.txt -> yes
3) Wait a bit
4) That's it

It results in quite good IDE support for PostgreSQL code: jump to definition, find references, great autocomplete, parameter hints (including the ones for macros), refactoring (e.g. function rename with update of the references). It imports run/debug configurations from CMake as well, so one can run and debug tests/binaries right after project import.

Vladimir
Re: Libpq support to connect to standby server as priority
Tom>Yes, we need either session open or reconnect it approach to find out
Tom>the whether server is read-write or read-only.

Just in case: pgjdbc has had that feature for quite a while, and the behavior there is to keep the connection until it fails or the application decides to close it.

pgjdbc uses three parameters (since 2014):
1) targetServerType=(any | master | secondary | preferSecondary). Default is "any". When set to "master" it will look for a "read-write" server. If set to "preferSecondary" it searches for a "read-only" server first, then falls back to master, and so on.
2) loadBalanceHosts=(true | false). pgjdbc can load-balance across the servers provided in the connection URL. When set to "false", pgjdbc tries connections in order; otherwise it shuffles the connections.
3) hostRecheckSeconds=int. pgjdbc caches the "read/write" status of a host:port combination, so it doesn't re-check the status if multiple connections are created within the hostRecheckSeconds timeframe.

It is sad that essentially the same feature is re-implemented in core with different names/semantics. Does it make sense to align the parameter names/semantics?

Vladimir
Re: When extended query protocol ends?
>Is it possible for the JDBC
>driver to issue a Sync message before sending SAVEPOINT in simple
>query protocol?

Apparently, sending an extra message would increase the overhead of the protocol, thus reducing the efficiency of the application. What is the benefit of sending an extra Sync?

https://www.postgresql.org/docs/current/protocol-overview.html#PROTOCOL-MESSAGE-CONCEPTS suggests that it is fine to mix both simple and extended messages depending on the needs of the application.

https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-PIPELINING reads that clients can omit sending Sync to handle errors the way they like. I am not that sure we must omit Sync there, however, it might be the case.

https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY reads "simple Query message also destroys the unnamed statement" and "simple Query message also destroys the unnamed portal".

>Or you can send SAVEPOINT using the extended query protocol.

I am afraid we can't. The purpose of savepoints at the driver's level is to enable migrating applications from other databases to PostgreSQL. In PostgreSQL any SQL exception fails the transaction, including errors like "prepared statement \"...\" does not exist", and so on. It might be unexpected for users to suddenly get "prepared statement is no longer valid" errors in case somebody adds a column to a table.

We can't send complete parse-bind-execute commands for every "savepoint" call, as it would hurt performance. We can't cache the parsed statement, as it could be discarded by a random "deallocate all". So the only way out I see is to use simple query mode for savepoint queries.

Vladimir
Re: When extended query protocol ends?
>Would performance suffer that much?

I have not benchmarked it much, however, the driver sends "autosave" queries once (savepoint) or twice (savepoint+release) for every user-provided query. If we use extended queries (parse+bind+exec) for every savepoint, that would result in 3 or 6 messages of overhead for every user query.

From many measurements we know that

insert into table(id, name) values(?,?),(?,?),(?,?)

is much more efficient than sending individual bind-exec-bind-exec-bind-exec-sync messages for "insert into table(id, name) values(?,?)".

For instance, here are some measurements: https://www.baeldung.com/spring-jdbc-batch-inserts#performance-comparisons

Based on those measurements, I assume there's a non-trivial per-message overhead.

Vladimir
Re: When extended query protocol ends?
>When splitting a multi insert statement you're going to duplicate some work

I do not understand why I would duplicate some work. I assume the database does its best to perform all the needed preparation when processing the "parse" message, and it should perform only the minimum required work when processing bind+exec messages. Unfortunately, that is not completely the case, so using bind+exec+bind+exec is suboptimal even for trivial insert statements.

Please take into account the following sequence:

One-time-only: parse S1 as insert into table(id, name) values(?,?)

Some time later:
bind S1 ... exec S1
bind S1 ... exec S1
bind S1 ... exec S1
bind S1 ... exec S1
sync

I do not know how this could be made more efficient: I execute parse only once, and then I send bind+exec+bind+exec without intermediate sync messages, so the data should flow nicely in TCP packets.

As I said above, the same flow for a multi-value insert beats the bind+exec+bind+exec sequence at the cost of poorer reporting. For instance, for a multi-value insert we can't tell how many rows are generated for each statement.
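The same contrast in sketch form (message names follow the protocol docs; S1/S2 and the bound values are made up for illustration):

```text
-- one row per execute: parse once, then a bind/exec pair per row
parse S1: insert into table(id, name) values(?,?)
bind S1 (1,'a')  exec S1
bind S1 (2,'b')  exec S1
bind S1 (3,'c')  exec S1
sync

-- multi-value: parse once, then a single bind/exec for all rows
parse S2: insert into table(id, name) values(?,?),(?,?),(?,?)
bind S2 (1,'a',2,'b',3,'c')  exec S2
sync
```

Both pipelines need only one sync, yet the multi-value form pays the per-message processing cost once per batch instead of once per row; the price is the coarser per-statement reporting described above.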
---

Here are some measurements regarding savepoints for simple vs extended protocol. Sure, they are not very scientific, however, they show an improvement for the simple protocol.

$ cat svpnt
BEGIN;
SAVEPOINT PGJDBC_AUTOSAVE;
RELEASE SAVEPOINT PGJDBC_AUTOSAVE;
COMMIT;

$ pgbench -f svpnt --protocol=extended --time=10 --progress=1 -r
progress: 1.0 s, 4213.8 tps, lat 0.237 ms stddev 0.034, 0 failed
progress: 2.0 s, 4367.9 tps, lat 0.229 ms stddev 0.024, 0 failed
progress: 3.0 s, 4296.2 tps, lat 0.233 ms stddev 0.038, 0 failed
progress: 4.0 s, 4382.0 tps, lat 0.228 ms stddev 0.026, 0 failed
progress: 5.0 s, 4374.1 tps, lat 0.228 ms stddev 0.026, 0 failed
progress: 6.0 s, 4305.7 tps, lat 0.232 ms stddev 0.035, 0 failed
progress: 7.0 s, 4111.1 tps, lat 0.243 ms stddev 0.182, 0 failed
progress: 8.0 s, 4245.0 tps, lat 0.235 ms stddev 0.042, 0 failed
progress: 9.0 s, 4219.9 tps, lat 0.237 ms stddev 0.036, 0 failed
progress: 10.0 s, 4231.1 tps, lat 0.236 ms stddev 0.031, 0 failed
transaction type: svpnt
scaling factor: 1
query mode: extended
number of clients: 1
number of threads: 1
maximum number of tries: 1
duration: 10 s
number of transactions actually processed: 42748
number of failed transactions: 0 (0.000%)
latency average = 0.234 ms
latency stddev = 0.065 ms
initial connection time = 2.178 ms
tps = 4275.562760 (without initial connection time)
statement latencies in milliseconds and failures:
         0.058           0  BEGIN;
         0.058           0  SAVEPOINT PGJDBC_AUTOSAVE;
         0.058           0  RELEASE SAVEPOINT PGJDBC_AUTOSAVE;
         0.060           0  COMMIT;

$ pgbench -f svpnt --protocol=simple --time=10 --progress=1 -r
progress: 1.0 s, 4417.7 tps, lat 0.225 ms stddev 0.033, 0 failed
progress: 2.0 s, 4446.0 tps, lat 0.225 ms stddev 0.079, 0 failed
progress: 3.0 s, 4377.1 tps, lat 0.228 ms stddev 0.048, 0 failed
progress: 4.0 s, 4485.0 tps, lat 0.223 ms stddev 0.024, 0 failed
progress: 5.0 s, 4355.9 tps, lat 0.229 ms stddev 0.353, 0 failed
progress: 6.0 s, .3 tps, lat 0.225 ms stddev 0.035, 0 failed
progress: 7.0 s, 4530.7 tps, lat 0.220 ms stddev 0.020, 0 failed
progress: 8.0 s, 4431.1 tps, lat 0.225 ms stddev 0.022, 0 failed
progress: 9.0 s, 4497.1 tps, lat 0.222 ms stddev 0.027, 0 failed
progress: 10.0 s, 4507.0 tps, lat 0.222 ms stddev 0.024, 0 failed
transaction type: svpnt
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
duration: 10 s
number of transactions actually processed: 44493
number of failed transactions: 0 (0.000%)
latency average = 0.224 ms
latency stddev = 0.116 ms
initial connection time = 2.690 ms
tps = 4450.372095 (without initial connection time)
statement latencies in milliseconds and failures:
         0.056           0  BEGIN;
         0.056           0  SAVEPOINT PGJDBC_AUTOSAVE;
         0.056           0  RELEASE SAVEPOINT PGJDBC_AUTOSAVE;
         0.057           0  COMMIT;

Vladimir
Re: When extended query protocol ends?
Jelte> If the Execute causes an error, is
Jelte> the Query still executed or not? And what about if the Query fails, is
Jelte> the Execute before committed or rolled back?

Frankly, if there's a requirement from the backend, I would like it to produce a corresponding error message. What do you think of making the backend respond with an error message if the requests come out of sequence?

For instance, if execute comes without parse, the backend responds with "ERROR: prepared statement "..." does not exist". Can we have a similar diagnostic to catch the unspoken rule that Sync should come in-between extended messages and simple/function-call messages and the copy subprotocol?

If "extended query" is supposed to be a subprotocol which allows no extra messages, then it would help if the documentation enumerated the subprotocols and their messages. For instance, for copy, the documentation is clear: it goes with a subprotocol, so only specific messages are allowed.

---

Jelte>And only extended protocol
Jelte>messages are allowed to be part of a pipeline: "Use of the extended
Jelte>query protocol allows pipelining"

Frankly speaking, I think there's a misunderstanding of "pipelining". The full quote is

pgdoc>"Use of the extended query protocol allows pipelining,
pgdoc>which means sending a series of queries without waiting for earlier ones to complete"

The key is "without waiting for earlier ones to complete". However, in the Simple Query section, the documentation reads that the frontend does not need to wait for the response:

pgdoc>"It is not actually necessary for the frontend to wait for ReadyForQuery before issuing another command"

In other words, Simple Query supports pipelining just as well, and there's nothing special about "extended query" in that regard.

Vladimir
Re: Error on failed COMMIT
Bruce, thanks for taking the time to summarize.

Bruce>Fourth, it is not clear how many applications would break if COMMIT
Bruce>started issuing an error rather than return success

None.

Bruce>applications that issue COMMIT and expect success after a transaction
Bruce>block has failed

An application must expect an exception from a COMMIT statement like from any other SQL. The wire protocol specification explicitly says implementations must expect error messages at any time.

---

Bruce>Do we know how other database systems handle this?

Oracle DB produces an error from COMMIT if the transaction can't be committed (e.g. a failure while processing an "on commit refresh" materialized view).

---

The bug is that "deferred constraint violation" and "non-deferred constraint violation" end up with **different** behavior for COMMIT. A deferred violation produces an error, while a non-deferred violation produces a "silent rollback".

In other words, there are already cases in PostgreSQL when commit produces an error. That is nothing new. The new part is that PostgreSQL must not produce "silent rollbacks".

Bruce>First, Vik reported that we don't follow the SQL spec

+1

Bruce>Second, someone suggested that if COMMIT throws an error, that future
Bruce>statements would be considered to be in the same transaction

No. Please disregard that. That is ill. COMMIT (and/or ROLLBACK) must terminate the transaction in any case. The transaction must not exist after COMMIT finishes (successfully or not). The same goes for PREPARE TRANSACTION: if it fails, the transaction must be cleared.

A litmus test is "deferred constraint violation". It works OK in the current PostgreSQL. If the database can't commit, it should respond with a clear error that describes the reason for the failure.

Bruce>Third, the idea that individual interfaces, e.g. JDBC, should throw

Individual interfaces should not deviate from server behavior much. They should convert server-provided errors to the language-native format. They should not invent their own rules to convert server messages to errors. That would provide a uniform PostgreSQL experience for the end users.

Note: there are even multiple JDBC implementations for PostgreSQL, so slight differences in transaction handling are the very last "feature" people want from the PostgreSQL database.

Vladimir
Re: Experiments with Postgres and SSL
It would be great if PostgreSQL supported "start with TLS", however, how could clients activate the feature? I would like to avoid making users configure the handshake mode, and I would like to avoid degrading performance when a new client talks to an old database.

What if a server that supports "fast TLS" added an extra notification in case a client connects with classic TLS? Then a capable client could remember the host:port and try the newer TLS approach the next time it connects. It would be transparent to the clients, and the users won't need to configure "prefer classic or fast TLS". The old clients could simply discard the notification.

Vladimir
Re: Experiments with Postgres and SSL
>I don't think it's worth implementing a code path in
> the server like this as it would then become cruft that would be hard
> to ever get rid of.

Do you think the server can de-support the old code path soon?

> I think you can do the same thing, more or less, in the client. Like
> if the driver tries to connect via SSL and gets an error it remembers
> that host/port and connects using negotiation in the future.

Well, I doubt everybody would instantaneously upgrade to a database that supports fast TLS, so there will be a timeframe with a lot of old databases and new clients. In that case, going with "try fast, ignore exception" would degrade performance for old databases. I see you suggest caching; however, "degrading one of the cases" might be more painful than "not improving one of the cases". I would like to refrain from implementing "connect both ways in parallel and check which is faster" in PG clients (e.g. https://en.wikipedia.org/wiki/Happy_Eyeballs ).

Just wondering: do you consider back-porting the feature to all supported DB versions?

> In practice though, by the time drivers support this it'll probably be
> far enough in the future

I think drivers release more often than the database, and we can get driver support even before the database releases. I'm from the pgjdbc Java driver team, and I think it is unfair to suggest that "driver support is only far enough in the future".

Vladimir
Re: Experiments with Postgres and SSL
>You could just hard code that servers newer than a
> specific version would have this support

Suppose PostgreSQL 21 implements "fast TLS".
Suppose pgjdbc 43 supports "fast TLS".
Suppose PgBouncer 1.17.0 does not support "fast TLS" yet.

If pgjdbc connects to the DB via the balancer, then the server would respond with "server_version=21". The balancer would forward "server_version", so the driver would assume "fast TLS is supported". In practice, fast TLS can't be used in that configuration since the connection will fail when the driver attempts fast TLS against PgBouncer.

> Or it could be done with a "protocol option"

Would you please clarify what you mean by "protocol option"?

>I guess a lot depends on the way the driver works and the way the
> application is structured

There are cases when applications pre-create connections on startup, so the faster the connections are created, the better. The same happens when an admin issues "reset connection pool": it discards old connections and creates new ones.

People rarely know all the knobs, so I would like a "fast by default" design (e.g. the server sending a notification "you may use fast mode the next time") rather than "keep the old behaviour and require everybody to add fast=true to their configuration" (e.g. users having to configure "try_fast_tls_first=true").

Vladimir
Re: run pgindent on a regular basis / scripted manner
Andres> Of course, but I somehow feel a change of formatting should be reviewable to
Andres> at least some degree

One way of reviewing formatting changes is to compare the compiled binaries. If the binaries before and after formatting are the same, then there's a high chance the behaviour is the same.

Vladimir
PostgreSQL Limits: maximum number of columns in SELECT result
Hi,

Today I hit "ERROR: target lists can have at most 1664 entries", and I was surprised the limit was not documented.

I suggest that the limit of "1664 columns per tuple" (or whatever the right term is) should be added to the list at https://www.postgresql.org/docs/current/limits.html, e.g. after "columns per table".

Could someone please commit that 1-2 line doc improvement, or do you need a patch for it?

Vladimir
Re: PostgreSQL Limits: maximum number of columns in SELECT result
>Most readers are going to spend more time
>wondering what the difference is between "columns per table" and "columns
>per tuple"

"tuple" is already mentioned 10 times on the "limits" page, so adding "columns per tuple" is not really obscure. The comment could be like "for instance, the max number of expressions in each SELECT clause".

I know I have visited current/limits.html many times (mostly for things like "max field length"). However, I was really surprised there's an easy-to-hit limit on the number of expressions in SELECT. I don't ask to lift the limit, however, I am sure documenting the limit would make it clear to application developers that the limit exists and they should plan for it in advance.

I bumped into "target lists can have at most 1664 entries" when I was trying to execute a statement with 65535 parameters. I know the wire format uses an unsigned int2 for the number of parameters, so I wanted to test if the driver supports that.

a) My first test was like

select ? c1, ? c2, ? c3, ..., ? c65535

It failed with "ERROR: target lists can have at most 1664 entries". I do not think "columns per table" is applicable to a select like that.

b) Then I tried

select ?||?||?||...||?

I wanted to verify that the driver sent all the values properly, so I didn't want to just ignore them, and I concatenated the values. Unfortunately, it failed with "stack depth limit exceeded. Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depth limit is adequate".

Finally, I settled on

select ARRAY[?, ?, ... ?]

which worked up to 65535 parameters just fine.

Please do not suggest avoiding 65535 parameters. What I wanted was just to test that the driver was able to handle 65535 parameters.

Vladimir
Re: speed up verifying UTF-8
>Attached is v23 incorporating the 32-bit transition table, with the necessary comment adjustments

The 32-bit table is nice.

Would you please replace the https://github.com/BobSteagall/utf_utils/blob/master/src/utf_utils.cpp URL with https://github.com/BobSteagall/utf_utils/blob/6b7a465265de2f5fa6133d653df0c9bdd73bbcf8/src/utf_utils.cpp in the header of src/port/pg_utf8_fallback.c? Pinning the commit would make the URL more stable in case the file gets renamed.

Vladimir
Re: speed up verifying UTF-8
Have you considered a shift-based DFA for the portable implementation: https://gist.github.com/pervognsen/218ea17743e1442e59bb60d29b1aa725 ?

Vladimir
Re: speed up verifying UTF-8
Thank you.

It looks like it is important to have shrx on x86, which appears only when -march=x86-64-v3 is used (see https://github.com/golang/go/issues/47120#issuecomment-877629712 ). Just in case: I know x86 would not use the fallback implementation, however, the sole purpose of a shift-based DFA is to fold all the data-dependent ops into a single instruction.

An alternative idea: should we optimize for validation of **valid** inputs rather than optimizing the worst case? In other words, what if the implementation processes all characters always and uses a slower method in case of validation failure? I would guess it is more important to be faster at accepting valid input rather than "faster to reject invalid input".

In the shift-DFA approach, it would mean the validation loop would be simpler with fewer branches (see https://godbolt.org/z/hhMxhT6cf ):

static inline int
pg_is_valid_utf8(const unsigned char *s, const unsigned char *end)
{
    uint64 class;
    uint64 state = BGN;

    while (s < end)
    {
        /* clang unrolls this loop */
        class = ByteCategory[*s++];
        state = class >> (state & DFA_MASK); /* <-- note that the AND is fused into the shift operation */
    }
    return (state & DFA_MASK) != ERR;
}

Note: GCC does not seem to unroll "while (s < end)" kinds of loops on its own, so a manually unrolled variant might be needed: a first loop that consumes 8 bytes per iteration (the two statements above repeated 8 times), followed by

    while (s < end)
    {
        class = ByteCategory[*s++];
        state = class >> (state & DFA_MASK);
    }
    return (state & DFA_MASK) != ERR;

static int
pg_utf8_verifystr2(const unsigned char *s, int len)
{
    if (pg_is_valid_utf8(s, s + len))
    {
        /* fast path: if the string is valid, just accept it */
        return len;
    }
    /* slow path: the string is not valid, perform a slower analysis */
    ...
}

Vladimir
Re: speed up verifying UTF-8
>I'm pretty confident this improvement is architecture-independent.

Thanks for testing it with different architectures.

It looks like the same utf8_advance function is good both for the fast path and for the slow path. Then pg_utf8_verifychar could be removed altogether, along with the corresponding IS_*_BYTE_LEAD macros.

Vladimir
Re: speed up verifying UTF-8
Just wondering, do you have the code in a GitHub/GitLab branch?

>+ utf8_advance(s, state, len);
>+
>+ /*
>+  * If we saw an error during the loop, let the caller handle it. We treat
>+  * all other states as success.
>+  */
>+ if (state == ERR)
>+     return 0;

Did you mean

state = utf8_advance(s, state, len);

there? (reassign the state variable)

>I wanted to try different strides for the DFA

Does that (and the "len >= 32" condition) mean the patch does not improve validation of shorter strings (the ones less than 32 bytes)? It would probably be nice to cover them as well (e.g. with 4 or 8-byte strides).

Vladimir
Re: Direct SSL connection and ALPN loose ends
I reviewed the documentation for "direct ALPN connections", and it looks like it could be improved. Here's the link: https://www.postgresql.org/docs/17/protocol-flow.html#PROTOCOL-FLOW-SSL

The currently suggested values for "sslnegotiation" are "direct" and "postgres". The project name is PostgreSQL and the ALPN name is postgresql. Is there a reason why the property value uses "postgres"? Can the value be renamed to postgresql for consistency?

"SSL": technically, the proper term is TLS, and even the document refers to "IANA TLS ALPN Protocol IDs" (TLS, not SSL). I would not die on that hill, however, going for tlsnegotiation would look better than sslnegotiation.

Vladimir
SIGSEGV from START_REPLICATION 0/XXXXXXX in XLogSendPhysical () at walsender.c:2762
Hi,

The pgjdbc test suite identified a SIGSEGV in recent HEAD builds of PostgreSQL on Ubuntu 14.04.5 LTS.

Here's a call stack: https://travis-ci.org/github/pgjdbc/pgjdbc/jobs/691794110#L7484

The crash is consistent, and it reproduces in 100% of the cases so far. The CI history shows that HEAD was good at 11 May 13:27 UTC, and it became bad by 19 May 14:00 UTC, so the regression was introduced somewhere in-between. Does that ring any bells?

In case you wonder:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  XLogSendPhysical () at walsender.c:2762
2762        if (!WALRead(xlogreader,
(gdb) #0  XLogSendPhysical () at walsender.c:2762
        SendRqstPtr = 133473640
        startptr = 133473240
        endptr = 133473640
        nbytes = 400
        segno = 1
        errinfo = {wre_errno = 988942240, wre_off = 2, wre_req = -1, wre_read = -1, wre_seg = {ws_file = 4714224, ws_segno = 140729887364688, ws_tli = 0}}
        __func__ = "XLogSendPhysical"
#1  0x0087fa8a in WalSndLoop (send_data=0x88009d ) at walsender.c:2300
        __func__ = "WalSndLoop"
#2  0x0087d65a in StartReplication (cmd=0x299bda8) at walsender.c:750
        buf = {data = 0x0, len = 3, maxlen = 1024, cursor = 87}
        FlushPtr = 133473640
        __func__ = "StartReplication"
#3  0x0087eddc in exec_replication_command (cmd_string=0x2916e68 "START_REPLICATION 0/7F4A3D8") at walsender.c:1643
        cmd = 0x299bda8
        parse_rc = 0
        cmd_node = 0x299bda8
        cmd_context = 0x299bc60
        old_context = 0x2916d50
        qc = {commandTag = 988942640, nprocessed = 140729887363520}
        __func__ = "exec_replication_command"

Vladimir
Re: SIGSEGV from START_REPLICATION 0/XXXXXXX in XLogSendPhysical () at walsender.c:2762
Kyotaro>It seems to me that that crash means Pgjdbc is initiating a logical
Kyotaro>replication connection to start physical replication.

Well, it used to work previously, so this might be a breaking change from the client/application point of view.

Vladimir
Re: [Patch][WiP] Tweaked LRU for shared buffers
Benjamin> A related and helpful patch would be to capture the access log and
Benjamin> provide anonymized traces.

The traces can be captured via DTrace scripts, so no patch is required here. For instance:
https://www.postgresql.org/message-id/CAB%3DJe-F_BhGfBu1sO1H7u_XMtvak%3DBQtuJFyv8cfjGBRp7Q_yA%40mail.gmail.com
or
https://www.postgresql.org/message-id/CAH2-WzmbUWKvCqjDycpCOSF%3D%3DPEswVf6WtVutgm9efohH0NfHA%40mail.gmail.com

The missing bit is a database with a more-or-less relevant workload.

Vladimir
Re: extended query protcol violation?
Tatsuo>responses of a simple query do not include CloseComplete

Tatsuo, where do you get the logs from? I guess you are just confused by the PRINTED order of the messages in the log.

Note: the wire order does not have to be exactly the same as the order in the log, since messages are buffered and then might be read in batches. In other words, an application might just batch (send all three) close(s2), close(s1), query(begin) messages, then read the responses. How does that break the protocol?

Vladimir
Re: extended query protcol violation?
>Curious what client is this that is violating the protocol.

I stand corrected: that is not a violation; however, the client might get an unexpected failure of query(begin) in the rare case that close(s1) or close(s2) fails.

Vladimir
Re: Error on failed COMMIT
Shay> Asking drivers to do this at the client have the exact same breakage impact as the server change, since the user-visible behavior changes in the same way

+1

Dave>While we can certainly code around this in the client drivers I don't believe they should be responsible for fixing the failings of the server.

Application developers expect the database to feel the same no matter which driver is used, so it would be better to avoid a situation where half of the drivers raise exceptions on a non-committing commit while the other half silently loses data.

Vladimir
Re: Error on failed COMMIT
Merlin>My biggest sense of alarm with the proposed change is that it could
Merlin>leave applications in a state where the transaction is hanging there

How come? The spec says commit ends the transaction. Can you please clarify where the proposed change leaves a hanging transaction?

Just in case, the proposed change is as follows:

postgres=# begin;
BEGIN
postgres=# aslkdfasdf;
ERROR:  syntax error at or near "aslkdfasdf"
LINE 1: aslkdfasdf;
        ^
postgres=# commit;
ROLLBACK   <-- this should be replaced with "ERROR: can't commit the transaction because ..."
postgres=# commit;
WARNING:  there is no transaction in progress   <-- this should stay as it is currently. Even if commit throws an error, the transaction should be terminated.
COMMIT

No one on the thread suggests the transaction must hang forever. Of course, commit must terminate the transaction one way or another. The proposed change is to surface the exception if the user tries to commit or prepare a transaction that can't be committed.

Note: the reason does not matter much. If a deferred constraint fails on commit, then commit itself throws an error. Making commit throw an error in the "current transaction is aborted" case makes perfect sense.

Note: the same thing goes for PREPARE TRANSACTION 'txname'. Apparently it silently responds with ROLLBACK, which is strange as well.

Vladimir
Re: Error on failed COMMIT
>do you think most common connection poolers would continue to
>work after making this change?

Of course they should. There are existing cases when commit responds with an error: deferrable constraints. There's nothing new here except that it is suggested to make the behavior of a commit/prepare failure (e.g. "can't commit the transaction because...") consistent with other commit failures (e.g. a deferred violation).

Vladimir
Re: Error on failed COMMIT
Robert>Now, of course, it's also true that if what the server does makes
Robert>users sad, maybe the server should do something different

The server makes users sad as it reports the same end result (== "commit failed") differently. Sometimes the server produces an ERROR, and sometimes the server produces "OK, the transaction was rolled back". Users do expect that commit might fail, and they don't really expect that sometimes commit can be silently converted to a rollback.

Robert>BEGIN;
Robert>-- do stuff
Robert>COMMIT;
Robert>BEGIN;
Robert>-- do more stuff
Robert>COMMIT;
Robert>...and they run these scripts by piping them into psql. Now, if the
Robert>COMMIT leaves the session in a transaction state,

No one suggested that "commit leaves the session in a transaction state". Of course, every commit should terminate the transaction. However, if a commit fails (for any reason), it should produce the relevant ERROR that explains what went wrong rather than silently doing a rollback.

Vladimir
Re: Error on failed COMMIT
Tom>I think we still end up concluding that altering this behavior has more
Tom>downside than upside.

What is the downside? Applications, drivers, and poolers already expect that commit might produce an error and terminate the transaction at the same time.

"The data is successfully committed to the database if and only if commit returns without error."

^^^ the above is far easier to reason about than "the user must check multiple unrelated outcomes to tell whether the changes are committed or not".

Vladimir
Re: Error on failed COMMIT
Just one more data point: drivers do allow users to execute queries in a free form. That is, the user might execute /*comment*/commit/*comment*/ as free-form SQL, and they would expect the resulting behaviour to be exactly the same as the .commit() API call (==silent rollback is converted to an exception).

Drivers can add extra logic into the .commit() API implementation; however, turning free-form SQL into exceptions is hard to do consistently from the driver side. It is not like "check the response from the .commit() result". It is more like "don't forget to parse user-provided SQL and verify if it is semantically equivalent to commit". Pushing a full SQL parser into the driver is not the best idea taking into account the extensibility the core has.

Vladimir
Re: Error on failed COMMIT
But if the SQL is /*commit*/rollback, then the driver should not raise an exception. The exception should be only for the case when the client asks to commit and the database can't do that. The resulting command tag alone is not enough. Vladimir
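To make the difficulty concrete, here is a minimal sketch (illustrative only, not pgjdbc code; the class and method names are made up) of what a driver-side "is this free-form SQL a plain commit?" check would have to start with. Note that even this naive version mishandles PostgreSQL's nested /* /* */ */ comments, which is exactly the point: doing this consistently needs a real parser, not a regex.

```java
// Illustrative sketch only (not pgjdbc code): a naive driver-side check
// for "is this free-form SQL a plain COMMIT?".
final class CommitDetector {
    public static boolean isPlainCommit(String sql) {
        String stripped = sql
            .replaceAll("(?s)/\\*.*?\\*/", " ") // block comments; nested comments are NOT handled
            .replaceAll("--[^\n]*", " ")        // line comments
            .trim();
        // a single COMMIT statement, optionally with a trailing semicolon
        return stripped.matches("(?i)commit\\s*;?\\s*");
    }

    public static void main(String[] args) {
        System.out.println(isPlainCommit("/*comment*/commit/*comment*/")); // true
        System.out.println(isPlainCommit("/*commit*/rollback"));           // false
    }
}
```

The second case shows why the command tag alone is not enough: the statement looks commit-like at a glance but must not raise an exception.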
Re: Questions about the CI process and proposal
Andy>1). The test cases may succeed locally but
Andy> may be failed
Andy> in CI for some reasons
Peter> This is not a problem

I would disagree. A patch might easily make the database incompatible with clients like JDBC. Do current PostgreSQL tests catch that? I don't think so. However, that can be configured in PR CI.

Peter>You can do this now by sticking in your own travis or appveyor files and
Peter>pushing to your own github account. I do this from time to time

Do you expect that everybody reinvents the wheel?

---

I've recently come across https://gitgitgadget.github.io/
It is a tool that converts GitHub PRs into mailing list messages (and back). What it does is enable contributors to send patches to the mailing list by creating PRs. Of course, it does not replace the mailing list; however, it cross-posts comments (e.g. it posts email responses as GitHub comments).

Sample PR: https://github.com/gitgitgadget/git/pull/525

It could significantly help contributors in the following ways:
1) One can create a PR without really sending a patch (e.g. to estimate the number of broken tests)
2) It would be much easier to test patches since the number of CI checks can easily exceed the number of tests in make check-*.

It would help reviewers as well:
1) GitHub shows colored diffs which help to understand the patch
2) There's a "suggest change" feature which helps for cases like "fixing typos"
3) A PR shows if the patch applies at all, and it shows which tests fail
4) It opens a way to trigger extra checks. For example, PR CI could trigger tests for **clients** like Java, C#, Ruby, etc.

WDYT on configuring gitgitgadget (or something like that) for PostgreSQL?

Vladimir
Re: Stored procedures and out parameters
Andrew>does it fail on the latest 11-stable

1) The current "build from Git/master PostgreSQL" produces the data row for both simple and extended queries.

2) Just a side note: `CALL my_proc()` is not suitable for functions. That looks weird. Is the client expected to look up system catalogs in order to tell if `my_proc` is a procedure or a function, and use either `call my_proc()` or `select * from my_proc()`?

Issuing `call my_function()` fails with

42883 ERROR: my_func(unknown) is not a procedure

Note: JDBC defines two options to call a stored procedure:

{?= call <procedure-name>[(<arg1>,<arg2>, ...)]}
{call <procedure-name>[(<arg1>,<arg2>, ...)]}

see https://docs.oracle.com/javase/9/docs/api/java/sql/CallableStatement.html

There's no notion of whether the called object is a procedure or a function.

Note: PostgreSQL can have a function that `returns void`, and it is hard to tell if {call test()} refers to a function or a procedure.

Can functions and procedures be unified at the backend level? For instance, support "CALL" for both of them. Or support "select * ..." for both of them.

Vladimir
Re: Stored procedures and out parameters
>to compensate for the decisions made by the JDBC driver PostgreSQL

It is not a decision by the PostgreSQL JDBC driver. It is a decision by the JDBC (Java Database Connectivity) specification itself. pgjdbc just follows the specification there.

Well, there's an initiative for a new JDBC specification (ADBA https://blogs.oracle.com/java/jdbc-next:-a-new-asynchronous-api-for-connecting-to-a-database ), and they seem to avoid "JDBC-specific syntax" in favour of native-for-the-database syntax. However, ADBA is in its early development, and there are lots of existing applications that use { call my_proc() } syntax for a good reason.

>IMO JDBC will need to version branch the textual transform of {call} to "CALL" in v11 and continue with the transform to SELECT in v10 and earlier.

Just one more time: it will break clients who use JDBC's {call ...} syntax to call functions in v11. In other words, the JDBC specification does not distinguish procedures and functions, so pgjdbc would have to use either "CALL procedure()" or "SELECT procedure()" kind of native syntax; however, pgjdbc has no clue which one to use. Current PostgreSQL 11 fails to execute functions via CALL, and it fails to execute procedures via SELECT.

Of course, an application developer can use the native syntax directly, so the application can choose between CALL and SELECT; however, that has portability issues since native syntax is tied to a single DB. JDBC {call my_proc()} automatically expands to select... for PostgreSQL and to begin my_proc(); end; in Oracle DB.

Vladimir
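To illustrate the dilemma, here is a hedged sketch (made-up names, not the actual pgjdbc implementation) of the version-branched {call} transform under discussion. The `isProcedure` flag is precisely the information the driver does not have without a catalog lookup:

```java
// Illustrative sketch (not pgjdbc code) of the version-branched {call}
// rewrite discussed above.
final class CallEscapeRewriter {
    // serverMajor: PostgreSQL major version; isProcedure: whether the target
    // is a procedure -- the driver cannot know this without a catalog lookup.
    public static String rewrite(String name, int argCount, int serverMajor, boolean isProcedure) {
        StringBuilder args = new StringBuilder();
        for (int i = 0; i < argCount; i++) {
            if (i > 0) args.append(", ");
            args.append('?');
        }
        if (serverMajor >= 11 && isProcedure) {
            return "CALL " + name + "(" + args + ")"; // v11 procedures
        }
        return "SELECT * FROM " + name + "(" + args + ")"; // functions, and pre-v11 servers
    }
}
```

If `isProcedure` is guessed wrong in either direction, the statement fails on the server, which is the breakage described above.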
BLOB / CLOB support in PostgreSQL
Hi,

According to Pgjdbc GitHub statistics, the most popular page is https://github.com/pgjdbc/pgjdbc/issues/1102 which is the "org.postgresql.jdbc.PgConnection.createClob() is not yet implemented" issue (1600 visits from 1400 unique visitors per fortnight).

There are workarounds to silence the error, however I'm sure CLOB (as in "streaming text datatype") is not yet supported in the PostgreSQL backend. I have browsed the pgsql-hackers mailing list re CLOB, and it looks like there's no relevant discussion, so I'm quite sure I've done my homework on "check prior mails regarding the subject".

**Issue**: there's no sensible way to map java.sql.Clob to the existing backend datatypes. `text` can't stream (it always loads the data fully), and it is 1GB limited.

Java distinguishes java.sql.Blob and java.sql.Clob. Blob is a "binary stream with streaming features". It can be mapped to existing "Large Objects", and the existing Large Object API somewhat suits the implementation. There are glitches (like "pgjdbc has to perform 4 API calls tell/seek/tell/seek just to get the LO length once"), however it is fine.

The Java Clob API is just a dozen methods (13 to be exact), however there are two major issues there:

1) A "Large Object" is just a binary object. There's no way to tell if the content is a UTF-8 string or a Windows-1251 string or a protobuf-encoded message or whatever. That is, if pgjdbc encodes java.sql.Clob (a large string) into some form of binary (e.g. UTF-8) and stores it as a PostgreSQL Large Object, then this LO automatically becomes a "pgjdbc-specific blob". There's no way to use the data in SQL or pl/pgsql or other applications. For instance, one can't perform " where clob_column like '%abcd%' "

2) "Characters". For instance, `long length()` should return the number of characters in the string. If pgjdbc implements java.sql.Clob as a UTF-8 encoded binary, then it would have to **process the whole blob** in order to measure string length.
The same thing goes for `String getSubString(long pos, int length)`. It would have to process all the bytes up to character `long pos` (how otherwise would it know the byte position of character `pos`?).

Currently pgjdbc encodes strings using client_encoding, stores them as LO, and it has been like that for ages. Apparently that might easily produce garbage in the DB if clients use various encodings; however, pgjdbc's default setting is to use UTF-8, so the problem should be not that visible.

I fully understand LO has issues with "removing obsolete entries"; however, mapping java.sql.Clob to `text` seems to make less sense. For instance: suppose pgjdbc chooses "Clob == text". Then a client hits the "1GB" limit. "Streaming TOAST data" looks more like a research project rather than a clear thing to implement.

What if there was a standard of storing strings in Large Objects? For instance: "CLOB is a UTF-8 encoded string stored as a single LO". When such an agreement exists, various applications could read and write the data. Of course, UTF-8 might not suit everybody, so a format might be "prefix that specifies encoding, then encoded string".

Of course, both variations above fail to support streaming (as in "need to process all the contents in order to get the last character"), so it might be better to use "prefix that specifies encoding + 'index block' (that specifies offsets for each 1M characters) + encoded string". I'm sure there are known algorithms to store strings in binary format that support subsequence / overwrite / length in reasonable time (O(1) or O(N) with a reasonable constant). There might be an option to use UTF-16 (so each "character" becomes 2 bytes always); however, it would come at a cost of space usage.

**Here goes the question**: do you think such an implementation ("large string stored in Large Objects") could be merged into the core eventually?

Q2: any ideas/existing libraries for random access read-write large strings stored as binary?

PS.
Relevant pgjdbc PR is https://github.com/pgjdbc/pgjdbc/pull/1272 -- Regards, Vladimir Sitnikov
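To make the "index block" idea above concrete, here is a toy sketch (assumptions: UTF-8 encoding, an in-memory byte array standing in for the Large Object, and a tiny chunk size instead of the proposed 1M characters; all names are made up). It shows how recording the byte offset of every CHUNK-th character lets a getSubString-style call seek to a character position without decoding the whole value:

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Toy sketch of the proposed "index block": byte offsets of characters
// 0, CHUNK, 2*CHUNK, ... over a UTF-8 encoded value.
final class Utf8CharIndex {
    static final int CHUNK = 4; // the proposal suggests 1M characters; tiny here for the demo

    final byte[] utf8;
    final List<Integer> index = new ArrayList<>();

    Utf8CharIndex(String s) {
        this.utf8 = s.getBytes(StandardCharsets.UTF_8);
        int byteOff = 0;
        int charOff = 0;
        for (int i = 0; i < s.length(); ) {
            if (charOff % CHUNK == 0) index.add(byteOff);
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            byteOff += utf8Len(cp);
            charOff++;
        }
    }

    // Byte offset of character position pos: jump via the index, then
    // walk at most CHUNK-1 characters instead of decoding the whole value.
    int byteOffset(int pos) {
        int byteOff = index.get(pos / CHUNK);
        for (int skipped = pos / CHUNK * CHUNK; skipped < pos; skipped++) {
            byteOff += utf8SeqLen(utf8[byteOff]);
        }
        return byteOff;
    }

    static int utf8Len(int cp) { // encoded length of a code point
        if (cp < 0x80) return 1;
        if (cp < 0x800) return 2;
        if (cp < 0x10000) return 3;
        return 4;
    }

    static int utf8SeqLen(byte lead) { // sequence length from the lead byte
        int b = lead & 0xFF;
        if (b < 0x80) return 1;
        if (b < 0xE0) return 2;
        if (b < 0xF0) return 3;
        return 4;
    }
}
```

This is just a data-structure sketch of why `length()` and `getSubString()` can become O(CHUNK) instead of O(total bytes) once such an index is stored alongside the encoded string.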
Re: Stored procedures and out parameters
Shay>Npgsql currently always sends a describe as part of statement execution (for server-prepared messages the describe is done only once, at preparation-time). Vladimir, are you doing things differently here? The same thing is for pgjdbc. It does use describe to identify result row format. However, "CALL my_proc()" works just fine with current git master for both simple and extended protocol. The missing part is "invoke functions via CALL statement". Vladimir
Re: BLOB / CLOB support in PostgreSQL
Let me please bump the thread.

Just in case, I'm a PgJDBC committer. PgJDBC receives requests to "support CLOB" from time to time; however, I believe it is impossible without support from the database. To the best of my knowledge, the database does not have APIs for "streaming large text data". The only "streaming large binary data" API I know is LargeObject, which seems to be old-fashioned. I believe Java is not the only client that wants streaming access for binary and text data.

Here's a recent pull request to PgJDBC https://github.com/pgjdbc/pgjdbc/pull/1892 where Andrew suggests adding BLOB/CLOB support via bytea/text, and apparently, Andrew is surprised that the database lacks BLOB/CLOB support.

Any ideas on the way to proceed here?

I don't think it is right to implement Clob via text, especially in case the database provides its own "large text with streaming" datatype in the future. The concerns to avoid "Clob maps to text" could be:

a) Once the behavior is implemented, it is hard to change. That is, applications would rely on it (and it becomes a de facto standard), and it would be hard to move to the proper "text with streaming API" datatype.

b) If we make «clob is text», then people might start using update/substring APIs (which are the primary motivation for Clob) without realizing there's a full-value update behind the scenes. Currently, they can use setString/getString for text, and it is crystal clear that the text is updated fully on every update.

Vladimir
Re: BLOB / CLOB support in PostgreSQL
Andrew>To the best of my knowledge there is no concrete proposal for the type
Andrew>of data type / interface you suggest.

To the best of my knowledge, one of the concrete proposals was in the very first message. Let me please copy it:

Vladimir> **Here goes the question**: do you think such an implementation
Vladimir> ("large string stored in Large Objects" could be merged into the core eventually)?

---

Andrew>In the first place, what
Andrew>I have proposed is an optional behaviour

Adding optional behaviors makes support complicated, so flags should be avoided when a standard solution is possible. Even if you name the behavior "optional", people would still rely on it.

For instance, you implement Clob.truncate(int) as an in-driver-memory truncate operation; however, in practice, that API should truncate the value in the database. Just in case: the current behavior for LO-based lobs is that Blob.truncate(int) immediately truncates the LO. So if an application works with the current LO mode, it would get a behavior change if it flips the switch. The same goes for setString and other modifications.

An escape hatch could be like "ok, let's throw unimplemented for clob modifications"; however, that contradicts the spec, which says:

JDBC Spec> * All methods on the Clob interface must be fully implemented if the
JDBC Spec> * JDBC driver supports the data type.

What I say here is that the behavior you suggest goes far beyond just "blob maps to bytea".

Andrew>Furthermore, it's
Andrew>consistent with how the MySQL driver behaves, AIUI,
Andrew>and possibly others as well

Oracle DB behaves differently. It has BLOB/CLOB column types, and the Clob/Blob interfaces map to CLOB/BLOB.

Andrew>That consistency is a major motivator for the work I've done.

PostgreSQL supports large binary via the LargeObject API, so the driver maps Blob to LO. If people want to use bytea, they can use `setBinaryStream(...)`.

What drives people to use Clob/Blob when the database lacks APIs for them?

Vladimir
Re: BLOB / CLOB support in PostgreSQL
>100% compatible with the MySQL It is hardly a justification for a feature or for a change request. Vladimir
Re: BLOB / CLOB support in PostgreSQL
Andrew>You and I clearly have a different idea from what constitutes a concrete Andrew>proposal. This is hardly the ghost of a proposal. Can you please clarify what is a proposal from your point of view? Is it documented? I think I have read the relevant TODO items: https://wiki.postgresql.org/wiki/Developer_FAQ#What_do_I_do_after_choosing_an_item_to_work_on.3F Wiki clearly suggests posting a mail to pgsql-hackers before starting work. Vladimir
Re: BLOB / CLOB support in PostgreSQL
Andrew>It needs to contain a substantial implementation plan Here's an implementation plan, again, quoted from the very same mail: Vladimir>Of course both variations above fail to support streaming Vladimir> (as in "need to process all the contents in order to get the last character"), so it might be better to use Vladimir>"prefix that specifies encoding + 'index block' (that specifies offsets for each 1M characters) + encoded string". It does describe the data structure. Andrew>what APIS I believe it does not matter much. However, it might be similar to the existing LO APIs, except the indices are in characters rather than bytes. Andrew>protocol changes None. Andrew>and so on Well, it looks like I had everything you mentioned in the very first email. Vladimir
Re: [HACKERS] Clock with Adaptive Replacement
>If somebody produced a trace showing the buffer lookups in order

To get things moving, I've created a DTrace script that captures buffer reads: https://github.com/vlsi/pgsqlstat/blob/pgsqlio/pgsqlio

Is it something that can be used to capture live traces?

Sample output can be seen here: https://github.com/vlsi/pgsqlstat/tree/pgsqlio#pgsqlio

For instance (time units are nanoseconds):

Timestamp      Elapsed  TableSp  DataBase  Relation  Cached
3271837060212   156301     1663     16385      1259       1
3271838881374  8820500     1663     16385     16604       0
3271840973321   436800     1663     16385     16604       1

If DTrace is acceptable, the trace format might be adjusted for easier consumption by https://github.com/ben-manes/caffeine/wiki/Simulator

Vladimir
Re: GSoC 2018: thrift encoding format
>Personally I think raw data bytes are OK if functions for getting all keys and values from this data are provided What is the purpose of using Thrift "encoding" if it turns out to be a simple wrapper for existing binary data? Do you mean the goal is to have "get/set" functions to fetch data out of bytea field? Frankly speaking, I can hardly imagine why one would want to store MAP Datum inside Thrift inside bytea. Vladimir
Re: GSoC 2018: thrift encoding format
>I mean Charles is free to choose the interface for the extension he believes is right I'm just trying to figure out what are the use cases for using that Thrift extension. For instance, it would be interesting if Thrift was an alternative way to transfer data between client and the database. I guess it could simplify building efficient clients. Vladimir
Re: [HACKERS] Clock with Adaptive Replacement
>I don't have time to check this out just now, but it seems like an excellent idea

There are a couple of sad things:

1) DTrace probes seem to be disabled by default. At least, I had to build PostgreSQL with --enable-dtrace on my macOS. Does that (the requirement to enable dtrace within PostgreSQL) prevent capturing trace files from real-life applications? There's an option to use system-level IO trace points (e.g. https://github.com/omniti-labs/pgtreats/blob/master/tools/pg_file_stress ), however it would be harder to get DB/relation kinds of ids.

2) (database, tablespace, relation) oids cannot be easily mapped from within a DTrace script. One can work around that by using a SQL connection to the database; however, it gets a bit complicated if there are multiple DB instances. What I mean is it is hard to tell which connection URL to use in order to resolve the oids in question.

Vladimir
Re: [HACKERS] Clock with Adaptive Replacement
> Oh, looks like I'm inventing another kind of bicycle :-( Do you think you could capture a trace or two from a more-or-less representative application/database? Discussion of algorithms makes little sense as we all lack traces to compare/validate. Vladimir
Re: [HACKERS] Clock with Adaptive Replacement
>I have work loads that I can repeat, so I can help with testing. That would be great. Do you think you could use DTrace to capture the trace? For instance, https://github.com/vlsi/pgsqlstat/blob/pgsqlio/pgsqlio Vladimir
Re: Having query cache in core
> If any of the tables were modified, cache entries using the table must be removed. > (these are already implemented in Pgpool-II's in memory query cache) How do you identify updates made from a pl/pgsql procedure? Vladimir
Re: Subplan result caching
>I like the idea. +1 >Also, there probably should be a GUC for this, defaulting to "off". I think the feature could be on by default provided it can properly identify "volatile" functions/tables hidden behind views. Vladimir
Re: [JDBC] [HACKERS] PGSERVICEFILE as a connection string parameter
David>I'm guessing that the Java port wouldn't be too complicated. It's already well defined. Is encoding defined somewhere for the "service file"? I don't like the idea of using "a default" very much. Vladimir
Re: Stored procedures and out parameters
Peter>I am familiar with the Java {call} escape. But I think it's pretty
Peter>useless. You're not going to get any compatibility with anything from
Peter>it, since every SQL implementation does something different with it

Would you please be more specific?

{ call ... } converts the SQL to the database-specific way of calling a stored procedure or whatever. For instance, in Oracle DB one can have both procedures and functions.
{ ? := call my_fun() } is converted to something like begin :b0 := my_fun(); end;
{ call my_proc(?) } is converted to begin my_proc(:b0); end;

Then comes PostgreSQL, and says: you can't invoke functions/procedures like {call my_proc()}. This is not fun at all.

Note: it is not speculation; I have participated in extending an OracleDB-kind-of-app to PostgreSQL, and {call ...} was used there for a good reason. {call ...} is used in lots of places exactly for the reason of calling procedures. You don't need to correct the application core; one just needs to install the procedure/function at the database side, and then it should just work. However, the current state of PostgreSQL 11 blocks the use of a database-independent API.

Peter>If you want to call
Peter>a procedure (created with CREATE PROCEDURE), just invoke CALL directly
Peter>without any escape syntax.

That would truly surprise end users, and it would make application portability a pain.

Vladimir
Re: Stored procedures and out parameters
Shay>On the other hand, releasing *without* this also has its consequence, namely setting in stone that the database-independent language API cannot be used for invoking the new stored procedures
Shay>So I hope you at least consider going through the thought process about allowing this.

+1

I wonder if "CALL ..." can be excluded from the PostgreSQL 11 release. Is it really an important feature for the release?

Vladimir
Re: Stored procedures and out parameters
Peter>AFAICT in no case does it involve allowing functions to be called as procedures or vice versa. Oracle DB uses the same way to execute both procedures and functions: pl/sql block. For instance: procedure) begin my_proc(); end; function) begin :result := my_fun(); end; Call like begin my_fun(); end; would fail. However there's no dedicated command to call procedures or a command to call functions. Vladimir
Re: Stored procedures and out parameters
David>JDBC driver or similar drivers to use the CALL command always from PG11 on, then the meaning of {call f1(a, b)} will have changed and a

Note: technically speaking, JDBC has two flavours of syntax (however, the standard does not clarify the distinction):
S1) {? := call my_proc(?,?) }
S2) { call my_proc(?, ?) }

Theoretically speaking, we could presume that S1 is for calling functions while S2 is for calling procedures. However, that does not look like a way out since
1) It does not provide a way to call void-returning functions. Users might have lots of void-returning functions since procedures were missing.
2) Other languages might happen to have a single syntax only. For instance, CommandType in .NET seems to be in (StoredProcedure, TableDirect, Text). There's no room for procedure vs function.

Peter>JDBC driver or similar drivers to use the CALL command always from
Peter>PG11 on, then the meaning of {call f1(a, b)} will have changed and a
Peter>lot of things will break in dangerous ways.

PG10 did not have procedures, so only "functions" could theoretically break. If CALL can be used to call functions, and the drivers could play their way to make the result-set look like before, then no breakage happens.

Note: the driver does not blindly rewrite {call f1(a,b)} to call f1(a,b). It does send prepare/describe/etc messages. At the end of the day, the purpose of having a DB driver is to have a consistent API that works across DB versions/DB products.

David>Not implementing this optimization in pg11 but supporting functions via call is something I could live with.

+1

Vladimir
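A sketch (with illustrative, made-up names) of what telling S1 from S2 would look like on the driver side; as noted above, even when the classification succeeds it still does not say whether the target object is a function or a procedure:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: classify a JDBC call escape as
// S1: {? = call f(...)}  (return-value form; also accepts the ":=" spelling)
// S2: {call f(...)}      (no return value)
final class JdbcCallEscape {
    private static final Pattern ESCAPE =
        Pattern.compile("(?i)\\{\\s*(\\?\\s*:?=\\s*)?call\\s+([\\w.]+)\\s*(\\(.*\\))?\\s*\\}");

    // Returns "S1", "S2", or null when the text is not a call escape.
    public static String classify(String sql) {
        Matcher m = ESCAPE.matcher(sql.trim());
        if (!m.matches()) return null;
        return m.group(1) != null ? "S1" : "S2";
    }
}
```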
Re: Stored procedures and out parameters
Merlin>The workaround is to simply not do that and you can get Merlin>precise control of behavior You are absolutely right. On top of that, the whole concept of DB-drivers and libpq is useless. Users should just simply exchange wire messages for precise control of behavior. Vladimir
pg_prepared_xacts returns transactions that are foreign to the caller
Hi,

"select * from pg_prepared_xacts" might produce transactions created by a different user, so the caller won't be able to issue "commit prepared". I think there should be a view that returns only the transactions that the caller can commit or rollback.

Is it something that can be implemented in the backend? Currently, the backend checks the owner and database when a user calls "commit prepared", so I believe a similar filter should be implemented in the pg_prepared_xacts view: https://github.com/postgres/postgres/blob/15afb7d61c142a9254a6612c6774aff4f358fb69/src/backend/access/transam/twophase.c#L583C32-L599

Vladimir
Add a property to automatically suspend portals as they produce given number of bytes
Hi,

Applications often face an "out of memory" condition as they try to fetch "N rows" from the database. If N is small, then the execution becomes inefficient due to many roundtrips. If N is high, there's a risk that many rows would overflow the client's memory. Note: the client can't stop reading the data at arbitrary times as it might want to execute another query while processing the resultset, so "fetch all rows at once and read them from the socket as needed" is not really an option.

The problem is that the client can't tell how much memory it would take to buffer the results of "fetch 100 rows". The row size might vary.

Currently, the only case where the database returns PortalSuspended seems to be when max_rows is reached when executing a select. I wonder if the database can suspend portals earlier in case it knows it produced a lot of data. In other words:
1) If the user calls execute(fetch all rows), then follow the current behavior.
2) If the user provides non-zero max_rows, then optionally suspend the portal once the result exceeds a pre-configured amount (GUC?).

Then clients could configure the connection like "ok, please suspend portals if a single fetch exceeds 50MiB".

To my understanding,
* it should be a backward-compatible change
* it would require no protocol changes
* it would make applications more robust when it comes to "out of memory while fetching many rows"
* it might improve performance overall as applications could increase their pessimistic "max_rows" fetch size

Is it ok to contribute a patch like that?

Vladimir
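A toy simulation of the proposed behavior (client-side Java standing in for backend logic; names are made up, not a patch): the portal suspends either at max_rows or once the bytes produced in the current fetch exceed the configured budget, whichever comes first. The row that crosses the budget is still returned, since it has already been produced:

```java
import java.util.ArrayList;
import java.util.List;

// Simulation only: a "portal" over a fixed list of row sizes that
// suspends at max_rows or when a byte budget is exceeded.
final class ByteBudgetPortal {
    private final int[] rowSizes; // byte size of each row, in order
    private int next = 0;

    public ByteBudgetPortal(int[] rowSizes) { this.rowSizes = rowSizes; }

    // Returns the row sizes of one fetch; the portal suspends when maxRows
    // is reached or byteBudget is exceeded (analogous to PortalSuspended).
    public List<Integer> fetch(int maxRows, long byteBudget) {
        List<Integer> out = new ArrayList<>();
        long bytes = 0;
        while (next < rowSizes.length && out.size() < maxRows) {
            out.add(rowSizes[next]);
            bytes += rowSizes[next];
            next++;
            if (bytes > byteBudget) break; // suspend early: the budget is spent
        }
        return out;
    }
}
```

With rows of sizes {100, 100, 1000000, 100} and a 1000-byte budget, the first fetch stops right after the huge row instead of buffering up to max_rows of them, which is the robustness win described above.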
Re: Add a property to automatically suspend portals as they produce given number of bytes
Kirill, cursor does not provide a way to limit the fetch size based on the memory consumption. Imagine a table like (id int8, value jsonb). If we use "fetch 1000", then it might require 1GiB on the client if every row contains 1MiB json. If the client plays defensively and goes for "fetch 10", it might take a lot of time if jsons are small. Neither cursor nor extended protocol solve the problem. Vladimir