On Thu, Jan 26, 2017 at 4:37 PM, Jonathan Vanasco <postg...@2xlp.com> wrote:

> There are over 20 million records in a self-referential database table,
> where one record may point to another record as a descendant.
>
> Because of a bug in application code, there was no limit on recursion.
> The max was supposed to be 4.  A few outlier records have between 5 and
> 5000 descendants (there could be more.  I manually found one chain of 5000.
>
> I need to find all the chains of 5+ and mark them for update/deletion.
> While the database is about 10GB, the recursive search is maxing out on
> diskspace and causing a failure (there was over over 100GB of workspace
> free)
>
> Is there any way to make a recursive query work, or will I have to use
> another means and just iterate over the entire dataset (either in postgres
> or an external service)
>

​Thinking aloud - why doesn't just finding every record with 5 descendants
not work?  Any chain longer than 5 would have at least 5 items.

​Even without recursion you could build out a five-way self-join and any
records that make it that far are guilty.  I suppose this assumes your
setup is non-cyclic.

David J.

Reply via email to