On Tue, Jun 14, 2016 at 12:51 AM, Edson Richter <edsonrich...@hotmail.com> wrote:
> Em 14/06/2016 01:33, David G. Johnston escreveu: > > On Monday, June 13, 2016, Edson Richter < <edsonrich...@hotmail.com> > edsonrich...@hotmail.com> wrote: > >> Em 13/06/2016 23:36, Edson Richter escreveu: >> >> Em 13/06/2016 23:18, rob stone escreveu: >> >> On Mon, 2016-06-13 at 22:41 -0300, Edson Richter wrote: >> >> Em 13/06/2016 22:33, Edson Richter escreveu: >> >> I've a table "A" with 4,000,000 records. >> >> I've decided to delete records from oldest to newest but I can't >> delete records that have references in tables "B", "C" or "D". >> >> >> so, I've >> >> >> with qry as ( >> >> select A.id >> >> from A >> >> where not exists (select 1 from B where B.a_id = A.id) >> >> and not exists (select 1 from C where C.a_id = A.id) >> >> and not exists (select 1 from D where D.a_id = A.id) >> >> and A.creation_date < (now()::date - interval '12 month') >> >> order by A.id DESC >> >> limit 2000 >> >> ) >> >> delete from A where id in (select id from qry); >> >> >> All three referenced tables have indexes (B.a_id; C.a_id; D.a_id) >> in >> order to make query faster. >> >> So for first 2 million rows it worked really well, taking about 1 >> minute to delete each group of 2000 records. >> >> Then, after a while I just started to get errors like: >> >> >> >> ERROR: update or delete in "A" violates foreign key >> >> "fk_C_A" in >> "C". >> >> DETAIL: Key (id)=(3240124) is still referenced by table "C". >> >> >> Seems to me that indexes got lost in the path - the query is >> really >> specific and no "C" referenced records can be in my deletion. >> >> Has anyone faced a behavior like this? >> >> Am I doing something wrong? >> >> >> >> Of course: >> Version string PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, >> compiled >> by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bit >> Oracle Linux 7 x64 with all updates. Running on EXT4 file system. >> Computer is Dell R420 with mirrored disks, 80GB of RAM (database has >> < >> 40GB in total). >> >> Sorry for not putting the info in the first e-mail. >> >> Edson >> >> >> >> What does:- >> >> SELECT COUNT(*) FROM C WHERE C.a_id = 3240124; >> >> return? >> >> Is it a many-to-one or a one-to-one relationship? >> >> >> >> >> SELECT COUNT(*) FROM C WHERE C.a_id = 3240124; >> >> count >> ------- >> 1 >> (1 registro) >> >> >> A.id is primary key of A table. Each table has its own primary key. >> >> Relationship to others table is 1-N, being N = {0,1} >> >> >> A.id -> B.a_id (being B.a_id unique but not enforced by unique key) >> >> A.id -> C.a_id (being C.a_id unique but not enforced by unique key) >> >> A.id -> D.a_id (being D.a_id unique but not enforced by unique key) >> >> >> Regards, >> >> Edson >> >> >> Just in case, I've run: >> >> - vacuum full analyze verbose; >> - reindex index ix_c_a_id; >> >> Result I get same error. So, I'm inclined to discard that this is a index >> error. >> >> >> Interesting: >> >> with qry as (select A.id >> from A >> where creatingdate < (now()::date - interval '12 month') >> and not exists (select 1 from B where B.a_id = A.id) >> and not exists (select 1 from C where C.a_id = A.id) >> and not exists (select 1 from D where D.a_id = A.id) >> order by A.id limit 2000) >> >> select * from qry where id = 3240124; >> > > Why do you assume 3240124 is within the first 2000 qualified records that > the CTE is limited to checking? > > > Because this is the key causing the error on delete. > Wasn't the point though I do suspect your expectations are acceptable in this instance. If you truly want to see if qry contains 3240124 you should lose the LIMIT 2000. > > > >> >> >> Total query runtime: 2.2 secs >> 0 rows retrieved. >> >> >> Why delete causes error, but querying don't? >> > > Given the error message this answer seems self-evident... > > > Not self-evident to me: delete says I'm trying to delete the record with > id = 3240124 and I can't. But the select says this record with id = 3240124 > is not there!!! > > This error: ERROR: update or delete in "A" violates foreign key "fk_C_A" in "C". is impossible to encounter when executing a pure select... > >> >> Would it be a bug when using delete ... where id in (subquery)? >> >> > > I'm unsure regarding the solution but I suspect the problem is that > between selecting the A row and deleting it another concurrent process > added a record to C that, if you were to re-run the select would cause the > row from A to be skipped. But the single query doesn't have that option so > it ends up failing. > > There is a FOR UPDATE clause you can add to the select but I don't think > that works here since table C is the one being altered and at the time of > the query there is nothing to lock. > > I'm doubting this is a bug, just poor concurrency understanding. Sorry I > cannot enlighten further at the moment. > > > There is not other process adding/updating records - and even there is, > the ID would be so high (because it is a sequence) that cannot be the > 3240124 (this record has been generated more than 2 years ago - that's why > my query has the "creationdate" filter - I don't touch records created > within last 12 months). > > Also, I've tried the same with "for update": same error! > > Have you confirmed that "fk_C_A" is referencing the columns you think it is? What's the history of this machine? Did you pass through 9.3 (especially early releases) on your way to 9.4? David J.