On Sun, Jun 06, 2021 at 03:54:48AM -0700, Omar Kilani wrote: > What I sort of don't get is... before we insert anything into these > tables, we always check to see if a value already exists. And Postgres > must be returning no results for some reason. So it goes to insert a > duplicate value which somehow succeeds despite the unique index, but > then a reindex says it's a duplicate. Pretty weird.
In addition to the other issues, this is racy. You 1) check if a key exists, and if not then 2) INSERT (or maybe you UPDATE if it did exist). https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use Maybe you'll say that "this process only runs once", but it's not hard to imagine that might be violated. For example, if you restart a multi-threaded process, does the parent make sure that the child processes die before itself dying? Do you create a pidfile, and do you make sure the children are dead before removing the pidfile ? The right way to do this since v9.6 is INSERT ON CONFLICT, which is also more efficient in a couple ways. -- Justin