I have a very strange bug with postgres 7.4.3. I have a table with about 15 million rows and recently, duplicate rows have started appearing.
For simplicity, let's say my table is as follows: files ------- fileid integer -- Primary key generated by a sequence userid integer -- id of user that owns the file filename character varying(255) -- name of file Indexes: "files_pkey" primary key, bree (fileid) "files_userid" hash (userid) When I do: select fileid, userid from files where userid = 1898598 order by fileid; I get: fileid | userid ---------+--------- 3787433 | 1898598 3787433 | 1898598 3787563 | 1898598 9554275 | 1898598 Notice that 3787433 is duplicated. How could this have happened if that column is flagged as the primary key. Even more interesting: select oid, fileid, userid from files where userid = 1898598 order by fileid; oid | fileid | userid ----------+---------+--------- 1573737 | 3787433 | 1898598 1573737 | 3787433 | 1898598 1573820 | 3787563 | 1898598 18612041 | 9554275 | 1898598 The rows have the same OID! So my question is how do I delete the duplicate row. If I execute select fileid, userid from files where fileid = 1573737; I get: fileid | userid ---------+--------- 1573737 | 1008628 Similarly, if I try to delete both of the rows, only one of them gets deleted, then when I select by userid, I get the other remaining one listed. But if I select by fileid I get no rows returned. I suspect a corrupt index is at fault here. If that's the case, a reindex will take quite some time and will lock the table causing a long period of downtime. Is that my only option? Any other ideas? -Zeki ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match