Re: PostgreSQL extension for processing Graph queries (Apache AGE)

2022-12-03 Thread Young Seung Andrew Ko
Hello Alvaro,

With AGE running on PostgreSQL, you can store and query Graph data (in JSON
document) using openCypher based graph queries, Graph data is a set of
nodes (objects / entities / records) and edges (relationships between
nodes).
You can also run some Graph algorithms on these data such as Clustering,
Partitioning, Search, Shortest Path and Page Rank.
E.g. The Shortest Path algorithm is used to derive the fastest path
connecting one node to another. It is used for social network analysis,
logistic distribution path design, manufacturing process design, etc.

AK



On Wed, Nov 30, 2022 at 12:37 AM Alvaro Herrera 
wrote:

> On 2022-Nov-29, Young Seung Andrew Ko wrote:
>
> > Hello PostgreSQL users,
> >
> > https://github.com/apache/age
> > Apache AGE is an Apache 2-licensed open source PostgreSQL extension for
> > storing Graph data.
> >
> > The current version of Apache AGE is to enable PostgreSQL users to use
> > Neo4j's openCypher-based graph queries in unison with existing relational
> > tables
>
> Can you show some examples of this feature in action?  What sort of data
> would I use it for, how would I query it?
>
> --
> Álvaro Herrera PostgreSQL Developer  —
> https://www.EnterpriseDB.com/
>


print in plpython not appearing in logs

2022-12-03 Thread Ludwig Isaac Lim
I'm having problems wherein my print() statements inside my plpython stored 
proc are not appearing in postgresql log. I tried setting the file=sys.stderr

To reproduce:

CREATE OR REPLACE PROCEDURE p_ludwig_test()
AS
$$
   import sys
   try:
  x = 1 / 0
   except:
  plpy.log("hello")
  print("oink oink", file=sys.stderr)
  print("oink oink - v2")
  plpy.log("haha")
$$
LANGUAGE plpython3u;

call p_ludwig_test()



Output:
-
2022-12-02 11:46:11.324 UTC [19390] LOG:  hello
2022-12-02 11:46:11.324 UTC [19390] CONTEXT:  PL/Python procedure 
"p_ludwig_test"
2022-12-02 11:46:11.324 UTC [19390] STATEMENT:  call p_ludwig_test();
2022-12-02 11:46:11.324 UTC [19390] LOG:  haha
2022-12-02 11:46:11.324 UTC [19390] CONTEXT:  PL/Python procedure 
"p_ludwig_test"
2022-12-02 11:46:11.324 UTC [19390] STATEMENT:  call p_ludwig_test();


Notice that the "oink oink"  is  not there.


Relevant logging configuration:
logging_collector = on    
log_directory = 'logs'  
log_min_messages = info 
log_min_error_statement = error 

PG version
---
 PostgreSQL 14.6 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1 
20180712 (Red Hat 7.3.1-15), 64-bit



The reason why I'm trying to use print() is because plpy logger is quite 
verbose. Each logging will output 3 line

for example:
   plpy.log("hello")

Will generate the following 3 lines in the log:
2022-12-02 11:46:11.324 UTC [19390] LOG:  hello
2022-12-02 11:46:11.324 UTC [19390] CONTEXT:  PL/Python procedure 
"p_ludwig_test"
2022-12-02 11:46:11.324 UTC [19390] STATEMENT:  call p_ludwig_test();

Another thing is there is another stored procedure that I have wherein the 
print() to stderr actually works, so I'm not sure what I'm doing wrong.


Thank you in advance,
Ludwig




ERROR : cannot commit while a portal is pinned from plpython

2022-12-03 Thread Ludwig Isaac Lim
Hello:

I noticed the following lines in my PostgreSQL log file today:

2022-12-01 14:16:54.517 UTC [10073] ERROR:  cannot commit while a portal is 
pinned
2022-12-01 14:16:54.517 UTC [10073] STATEMENT:  call p_annotate_db_stats()



The process ID can be traced to plpython procedure call.

What is baffling is there's no commit (or rollback) inside the stored procedure 
call.

The notable changes I made to the stored procedure is:

adding this block of code

try:
      --processing stuff
except:
   exc_type, exc_value, tb = sys.exc_info()
   if tb is None:
  print("No exception / traceback information", file=sys.stderr)
  sys.exit(-1)
   prev = tb
   curr = tb.tb_next
   while curr is not None:
  prev = curr
  curr = curr.tb_next
   print(prev.tb_frame.f_locals, file=sys.stderr)
   traceback.print_exception(*sys.exc_info())
   print("Exiting", file=sys.stderr)



The stored procedure is executed via psql call using the following command in 
bash:
  nohup psql -c "call p_annotate_db_stats()" &


Postgresql version :  PostgreSQL 14.6 on aarch64-unknown-linux-gnu, compiled by 
gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-15), 64-bit



Python version : Python 3.7.10

What could be the cause of the error? Thank you.



Regards,
Ludwig Lim




sort performance better with little memory than big memory

2022-12-03 Thread yang zhao
hello,
I am running some sorting sql on my machine,test data is tpch100g, and sql
is:explain analyze verbose  select l_shipdate,l_orderkey from lineitem_0
order by l_shipdate,l_orderkey desc .
I found that when I set work_mem to 65MB,sort method is external merge with
disk,which cost 50s in my server.
and when I set work_mem to 6GB,sort method is quicksort in memory, which
cost 78s in same server.
It is strange that more memory bring worse performance.I used perf and find
that when work_mem is 6GB,L1-dcache-load-misses is much more than 64MB when
qsort and tuplesort_gettuple_common.
So,can we try to split memory to pieces and qsort every one,and merge than
all in memory,I have tried this in my local code, and got about 12%
improvement when memory is enough.


Re: Stored procedure code no longer stored in v14 and v15, changed behaviour

2022-12-03 Thread Alban Hertroys


> On 2 Dec 2022, at 15:10, DAVID ROTH  wrote:
> 
> Is there a way to reverse engineer the original code (or its equivalent) from 
> what is saved in the database?

I’m dumping an idea here… Treading back quite a bit with information from later 
in this thread.

With the original procedure source code under version control and assuming the 
tokenization converts that source code consistently (since it’s done by a 
computer), you could store another level of diffs: From the source code you 
pushed, against the output of pg_get_functiondef.

Assuming that changes to the tokenised code, when converted back to text, only 
involve renames of database objects, the result of a reversely applied diff 
could very well be comparable to the original source code.

I suspect that would be sufficient for telling whether a developer is 
responsible for the changes, or that they were caused by renaming of database 
artefacts.

You would need to wrap the function creation calls into some automation to 
generate and store those diffs, comparing it back, etc, but that may be doable. 
I would also generate new diffs right after major version updates of the 
database (a before and after of the output of pg_get_functiondef, applied to 
the stored diff?).

I’m not so sure that would work for auditing, but that seems to have been 
tackled down-thread.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.





Re: print in plpython not appearing in logs

2022-12-03 Thread Tom Lane
Ludwig Isaac Lim  writes:
> I'm having problems wherein my print() statements inside my plpython stored 
> proc are not appearing in postgresql log. I tried setting the file=sys.stderr

Hmm.  I can tell you that with "logging_collector = on", I would only
expect the logs to capture stderr output, not stdout.  So it makes
sense to me that plain "print" would disappear into the bit bucket.
But if you specify stderr output, it ought to work.  I don't know
enough Python to know why it's not working, but it seems to me this is
primarily a Python question not a Postgres question.  Maybe you need
an explicit fflush-equivalent step?  Dunno.

regards, tom lane




Re: Stored procedure code no longer stored in v14 and v15, changed behaviour

2022-12-03 Thread Brad White

On 12/2/2022 9:05 AM, Peter J. Holzer wrote:

I don't know how reproducable that tokenization process is. Can you just
do it again and compere the results?

Right.
You can compare passwords, even though you don't store the original.
You might be able to run your virgin copy through the tokenization and 
compare the results.

--
Quote Signature I talk with clients, find out where their pain points 
are, and solve those.

On-call IT Management for small companies and non-profits.
SCP, Inc.
bwh...@inebraska.com
402-601-7990


Quote of the Day
   There is a huge difference between fathering a child and being a 
father.

   One produces a child. The other produces an adult.
    -- John Eldredge

Re: sort performance better with little memory than big memory

2022-12-03 Thread David Rowley
On Sun, 4 Dec 2022 at 00:14, yang zhao  wrote:
> So,can we try to split memory to pieces and qsort every one,and merge than 
> all in memory,I have tried this in my local code, and got about 12% 
> improvement when memory is enough.

We're not very good at CPU cache awareness.  The hard part here would
be to figure out how large to make those pieces. Maybe the value could
be a GUC which can be tuned similar to work_mem. One problem with that
might be that L3 is shared with other processes and it does not seem
unreasonable that smaller-than-L3 sizes will be better when the system
is busy.  However, it also seems possible that sorting in L3-sized
chunks might still be faster even when the system is busy with many
other tasks.

The best thing to do here is to post your proposal along with a patch
and the performance tests you've done to the pgsql-hackers mailing
list.  This way people can test this for themselves and see if they
get similar results. Patches such as this one need to be tested on
lots of different hardware. pgsql-hackers is the place for discussions
about that. It's full of people that might be willing to patch with
your patch and test if the performance is also improved on their
hardware.

David




Aw: Re: Stored procedure code no longer stored in v14 and v15, changed behaviour

2022-12-03 Thread Karsten Hilbert
> You would need to wrap the function creation calls into some automation to 
> generate and store those diffs, comparing it back, etc, but that may be 
> doable. I would also generate new diffs right after major version updates of 
> the database (a before and after of the output of pg_get_functiondef, applied 
> to the stored diff?).

I wonder whether that would tie the sanity check to a particular PG version.

I mean, pg_get_functiondef output being a server runtime artifact it might
well change between server versions, no ?

Best,
Karsten





How to repair my plan modify error?

2022-12-03 Thread jack...@gmail.com
jacktby(at)gmail(dot)com
--
Hello, I'm trying to modify pg codes for my personal project. And after I 
finish modify planner, I get this.
postgres=# create table tt(a int);
CREATE TABLE
postgres=# \d tt
ERROR:  btree index keys must be ordered by attribute

here are the patches of my modifies.



0001-finish-planner-modify.patch
Description: Binary data


0003-fix-limit-and-count-bugs.patch
Description: Binary data