On Tue, Mar 27, 2018 at 12:23 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Querying for other functions marked 'r' leaves me with some other related
> doubts:
>
> 1. Why are the various flavors of pg_get_viewdef() marked 'r'?  Surely
> reading the catalogs is a thing parallel children are allowed to do.
> If there is a good reason for pg_get_viewdef() to be 'r', why doesn't
> the same reason apply to all the other ruleutils functions?
>
> 2. Why are the various thingy-to-xml functions marked 'r'?  Again it
> can't be because they read catalogs or data.  I can imagine that the
> reason for restricting cursor_to_xml is that the cursor might execute
> parallel-unsafe operations, but then why isn't it marked 'u'?
>
> 3. Isn't pg_import_system_collations() unsafe, for the same reason
> as binary_upgrade_create_empty_extension()?

Yeah.  I hacked something up in Python to analyse the C call graph and
look for functions written in C that are not marked PARALLEL UNSAFE but
can reach AssignTransactionId.  Attached, for interest.  Not a great
approach, because current_schema, fetch_search_path, SPI_XXX and a
couple of others all lead there, creating many possible false positives
(though who knows).  If I filter those out I'm left with the ones
already mentioned (pg_import_system_collations,
binary_upgrade_create_empty_extension) plus two others:

1.  unique_key_recheck, not user callable anyway.
2.  brin_summarize_range is marked 's'.  Seems wrong.

-- 
Thomas Munro
http://www.enterprisedb.com
#!/usr/bin/env python
#
# Do any functions declared PARALLEL SAFE or PARALLEL RESTRICTED have a call
# graph that could apparently reach an unsafe function?
#
# Obtain a list of pg_proc functions that are declared 'r' or 's':
#
#   psql postgres -t -c "select prosrc from pg_proc where proparallel != 'u' and prolang = 12" > pg_functions.data
#
# Obtain a list of edges in the function call graph (macOS):
#
#   otool -tvV | \
#   awk '/^_[^:]*:/ { caller = substr($1,2); }
#        /\tcallq\t/ { callee = substr($3,2);
#                      printf("%s:%s\n", caller, callee); }' > call_graph.data

from networkx import DiGraph, has_path, shortest_path

unsafe_functions = ["AssignTransactionId"]

# prune certain subgraphs from the graph because they are false positives
ignore_functions = ["errfinish", "LockAcquire", "ExecEvalNextValueExpr", "ExecInitExpr", "SPI_execute_plan", "_SPI_execute_plan", "SPI_cursor_fetch"]

pg_functions = []
with open("pg_functions.data", "r") as data:
  for line in data:
    pg_functions.append(line.strip())
  
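call_graph = DiGraph()
with open("call_graph.data", "r") as data:
  for line in data:
    # each line of call_graph.data has the form "caller:callee"
    caller, callee = line.strip().split(":")
    # skip edges into ignored functions so that paths through them are pruned
    if callee not in ignore_functions:
      # add_edge creates both nodes if they are not already in the graph
      call_graph.add_edge(caller, callee)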

for pg_function in pg_functions:
  if call_graph.has_node(pg_function):
    for unsafe_function in unsafe_functions:
      # has_path raises NodeNotFound if the target is missing from the graph
      if call_graph.has_node(unsafe_function) and has_path(call_graph, pg_function, unsafe_function):
        path = shortest_path(call_graph, pg_function, unsafe_function)
        print("There is a path from %s to %s: %s" % (pg_function, unsafe_function, "->".join(path)))
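
For anyone who wants to try the idea without networkx or a real binary, here is a minimal sketch of the same reachability check on a toy graph, using only the standard library. Everything in it is an assumption made for illustration: the helper names build_graph and find_path are hypothetical, and the toy edges (including the one from LockAcquire to AssignTransactionId) are invented, not taken from the PostgreSQL call graph. It only shows how dropping edges into an ignored function removes the paths that run through it.

```python
from collections import deque


def build_graph(edges, ignore=()):
    """Build adjacency sets, skipping edges whose callee is in `ignore`."""
    graph = {}
    for caller, callee in edges:
        if callee in ignore:
            continue
        graph.setdefault(caller, set()).add(callee)
        graph.setdefault(callee, set())
    return graph


def find_path(graph, start, target):
    """Breadth-first search; return one shortest call path, or None."""
    if start not in graph or target not in graph:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for callee in sorted(graph[node]):
            if callee not in parents:
                parents[callee] = node
                queue.append(callee)
    return None


# Invented edges for illustration only; not real PostgreSQL call-graph data.
TOY_EDGES = [
    ("toy_sql_func", "toy_helper"),
    ("toy_helper", "AssignTransactionId"),
    ("toy_other_func", "LockAcquire"),
    ("LockAcquire", "AssignTransactionId"),
]

pruned = build_graph(TOY_EDGES, ignore={"LockAcquire"})
for func in ("toy_sql_func", "toy_other_func"):
    path = find_path(pruned, func, "AssignTransactionId")
    if path is not None:
        print("There is a path from %s: %s" % (func, "->".join(path)))
```

With LockAcquire ignored, only toy_sql_func is reported; building the graph without the ignore set would report toy_other_func as well, which is the kind of hit the real script treats as a likely false positive.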
  
