[BUGS] Primary key duplicates

Zeki Mokhtarzada Fri, 13 Aug 2004 10:51:34 -0700

I have a very strange bug with postgres 7.4.3.  I have a table with about 
15 million rows and recently, duplicate rows have started appearing.


For simplicity, let's say my table is as follows:

files
-------
fileid   integer   -- Primary key generated by a sequence
userid   integer   -- id of user that owns the file
filename character varying(255) -- name of file

Indexes:  "files_pkey" primary key, bree (fileid)
          "files_userid" hash (userid)


When I do: 

 select fileid, userid from files where userid = 1898598 order by fileid;

I get:

 fileid  | userid
---------+---------
 3787433 | 1898598
 3787433 | 1898598
 3787563 | 1898598
 9554275 | 1898598


Notice that 3787433 is duplicated.  How could this have happened if that 
column is flagged as the primary key.  Even more interesting:

select oid, fileid, userid from files where userid = 1898598 order by 
fileid;


   oid    | fileid  | userid  
----------+---------+---------
  1573737 | 3787433 | 1898598
  1573737 | 3787433 | 1898598 
  1573820 | 3787563 | 1898598 
 18612041 | 9554275 | 1898598

The rows have the same OID!  So my question is how do I delete the 
duplicate row.  If I execute 

 select fileid, userid from files where fileid = 1573737;

I get:

 fileid  | userid
---------+---------
 1573737 | 1008628


Similarly, if I try to delete both of the rows, only one of them gets 
deleted, then when I select by userid, I get the other remaining one 
listed.  But if I select by fileid I get no rows returned.

I suspect a corrupt index is at fault here.  If that's the case, a reindex
will take quite some time and will lock the table causing a long period of
downtime.  Is that my only option?  Any other ideas?

-Zeki 



---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match

[BUGS] Primary key duplicates

Reply via email to