Re: PostgreSQL extension for processing Graph queries (Apache AGE)
Hello Alvaro, With AGE running on PostgreSQL, you can store and query Graph data (in JSON document) using openCypher based graph queries, Graph data is a set of nodes (objects / entities / records) and edges (relationships between nodes). You can also run some Graph algorithms on these data such as Clustering, Partitioning, Search, Shortest Path and Page Rank. E.g. The Shortest Path algorithm is used to derive the fastest path connecting one node to another. It is used for social network analysis, logistic distribution path design, manufacturing process design, etc. AK On Wed, Nov 30, 2022 at 12:37 AM Alvaro Herrera wrote: > On 2022-Nov-29, Young Seung Andrew Ko wrote: > > > Hello PostgreSQL users, > > > > https://github.com/apache/age > > Apache AGE is an Apache 2-licensed open source PostgreSQL extension for > > storing Graph data. > > > > The current version of Apache AGE is to enable PostgreSQL users to use > > Neo4j's openCypher-based graph queries in unison with existing relational > > tables > > Can you show some examples of this feature in action? What sort of data > would I use it for, how would I query it? > > -- > Álvaro Herrera PostgreSQL Developer — > https://www.EnterpriseDB.com/ >
print in plpython not appearing in logs
I'm having problems wherein my print() statements inside my plpython stored proc are not appearing in postgresql log. I tried setting the file=sys.stderr To reproduce: CREATE OR REPLACE PROCEDURE p_ludwig_test() AS $$ import sys try: x = 1 / 0 except: plpy.log("hello") print("oink oink", file=sys.stderr) print("oink oink - v2") plpy.log("haha") $$ LANGUAGE plpython3u; call p_ludwig_test() Output: - 2022-12-02 11:46:11.324 UTC [19390] LOG: hello 2022-12-02 11:46:11.324 UTC [19390] CONTEXT: PL/Python procedure "p_ludwig_test" 2022-12-02 11:46:11.324 UTC [19390] STATEMENT: call p_ludwig_test(); 2022-12-02 11:46:11.324 UTC [19390] LOG: haha 2022-12-02 11:46:11.324 UTC [19390] CONTEXT: PL/Python procedure "p_ludwig_test" 2022-12-02 11:46:11.324 UTC [19390] STATEMENT: call p_ludwig_test(); Notice that the "oink oink" is not there. Relevant logging configuration: logging_collector = on log_directory = 'logs' log_min_messages = info log_min_error_statement = error PG version --- PostgreSQL 14.6 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-15), 64-bit The reason why I'm trying to use print() is because plpy logger is quite verbose. Each logging will output 3 line for example: plpy.log("hello") Will generate the following 3 lines in the log: 2022-12-02 11:46:11.324 UTC [19390] LOG: hello 2022-12-02 11:46:11.324 UTC [19390] CONTEXT: PL/Python procedure "p_ludwig_test" 2022-12-02 11:46:11.324 UTC [19390] STATEMENT: call p_ludwig_test(); Another thing is there is another stored procedure that I have wherein the print() to stderr actually works, so I'm not sure what I'm doing wrong. Thank you in advance, Ludwig
ERROR : cannot commit while a portal is pinned from plpython
Hello: I noticed the following lines in my PostgreSQL log file today: 2022-12-01 14:16:54.517 UTC [10073] ERROR: cannot commit while a portal is pinned 2022-12-01 14:16:54.517 UTC [10073] STATEMENT: call p_annotate_db_stats() The process ID can be traced to plpython procedure call. What is baffling is there's no commit (or rollback) inside the stored procedure call. The notable changes I made to the stored procedure is: adding this block of code try: --processing stuff except: exc_type, exc_value, tb = sys.exc_info() if tb is None: print("No exception / traceback information", file=sys.stderr) sys.exit(-1) prev = tb curr = tb.tb_next while curr is not None: prev = curr curr = curr.tb_next print(prev.tb_frame.f_locals, file=sys.stderr) traceback.print_exception(*sys.exc_info()) print("Exiting", file=sys.stderr) The stored procedure is executed via psql call using the following command in bash: nohup psql -c "call p_annotate_db_stats()" & Postgresql version : PostgreSQL 14.6 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-15), 64-bit Python version : Python 3.7.10 What could be the cause of the error? Thank you. Regards, Ludwig Lim
sort performance better with little memory than big memory
hello, I am running some sorting sql on my machine,test data is tpch100g, and sql is:explain analyze verbose select l_shipdate,l_orderkey from lineitem_0 order by l_shipdate,l_orderkey desc . I found that when I set work_mem to 65MB,sort method is external merge with disk,which cost 50s in my server. and when I set work_mem to 6GB,sort method is quicksort in memory, which cost 78s in same server. It is strange that more memory bring worse performance.I used perf and find that when work_mem is 6GB,L1-dcache-load-misses is much more than 64MB when qsort and tuplesort_gettuple_common. So,can we try to split memory to pieces and qsort every one,and merge than all in memory,I have tried this in my local code, and got about 12% improvement when memory is enough.
Re: Stored procedure code no longer stored in v14 and v15, changed behaviour
> On 2 Dec 2022, at 15:10, DAVID ROTH wrote: > > Is there a way to reverse engineer the original code (or its equivalent) from > what is saved in the database? I’m dumping an idea here… Treading back quite a bit with information from later in this thread. With the original procedure source code under version control and assuming the tokenization converts that source code consistently (since it’s done by a computer), you could store another level of diffs: From the source code you pushed, against the output of pg_get_functiondef. Assuming that changes to the tokenised code, when converted back to text, only involve renames of database objects, the result of a reversely applied diff could very well be comparable to the original source code. I suspect that would be sufficient for telling whether a developer is responsible for the changes, or that they were caused by renaming of database artefacts. You would need to wrap the function creation calls into some automation to generate and store those diffs, comparing it back, etc, but that may be doable. I would also generate new diffs right after major version updates of the database (a before and after of the output of pg_get_functiondef, applied to the stored diff?). I’m not so sure that would work for auditing, but that seems to have been tackled down-thread. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll find there is no forest.
Re: print in plpython not appearing in logs
Ludwig Isaac Lim writes: > I'm having problems wherein my print() statements inside my plpython stored > proc are not appearing in postgresql log. I tried setting the file=sys.stderr Hmm. I can tell you that with "logging_collector = on", I would only expect the logs to capture stderr output, not stdout. So it makes sense to me that plain "print" would disappear into the bit bucket. But if you specify stderr output, it ought to work. I don't know enough Python to know why it's not working, but it seems to me this is primarily a Python question not a Postgres question. Maybe you need an explicit fflush-equivalent step? Dunno. regards, tom lane
Re: Stored procedure code no longer stored in v14 and v15, changed behaviour
On 12/2/2022 9:05 AM, Peter J. Holzer wrote: I don't know how reproducable that tokenization process is. Can you just do it again and compere the results? Right. You can compare passwords, even though you don't store the original. You might be able to run your virgin copy through the tokenization and compare the results. -- Quote Signature I talk with clients, find out where their pain points are, and solve those. On-call IT Management for small companies and non-profits. SCP, Inc. bwh...@inebraska.com 402-601-7990 Quote of the Day There is a huge difference between fathering a child and being a father. One produces a child. The other produces an adult. -- John Eldredge
Re: sort performance better with little memory than big memory
On Sun, 4 Dec 2022 at 00:14, yang zhao wrote: > So,can we try to split memory to pieces and qsort every one,and merge than > all in memory,I have tried this in my local code, and got about 12% > improvement when memory is enough. We're not very good at CPU cache awareness. The hard part here would be to figure out how large to make those pieces. Maybe the value could be a GUC which can be tuned similar to work_mem. One problem with that might be that L3 is shared with other processes and it does not seem unreasonable that smaller-than-L3 sizes will be better when the system is busy. However, it also seems possible that sorting in L3-sized chunks might still be faster even when the system is busy with many other tasks. The best thing to do here is to post your proposal along with a patch and the performance tests you've done to the pgsql-hackers mailing list. This way people can test this for themselves and see if they get similar results. Patches such as this one need to be tested on lots of different hardware. pgsql-hackers is the place for discussions about that. It's full of people that might be willing to patch with your patch and test if the performance is also improved on their hardware. David
Aw: Re: Stored procedure code no longer stored in v14 and v15, changed behaviour
> You would need to wrap the function creation calls into some automation to > generate and store those diffs, comparing it back, etc, but that may be > doable. I would also generate new diffs right after major version updates of > the database (a before and after of the output of pg_get_functiondef, applied > to the stored diff?). I wonder whether that would tie the sanity check to a particular PG version. I mean, pg_get_functiondef output being a server runtime artifact it might well change between server versions, no ? Best, Karsten
How to repair my plan modify error?
jacktby(at)gmail(dot)com -- Hello, I'm trying to modify pg codes for my personal project. And after I finish modify planner, I get this. postgres=# create table tt(a int); CREATE TABLE postgres=# \d tt ERROR: btree index keys must be ordered by attribute here are the patches of my modifies. 0001-finish-planner-modify.patch Description: Binary data 0003-fix-limit-and-count-bugs.patch Description: Binary data