Vincent Hikida wrote:
There are several ways. I am making the simplifying assumption that name, type and dir cannot be NULL in either table. If they are the query is a little more complicated.

The following are a couple of many techniques.

SELECT a.a_name
            , a.a_type
            , a.a_dir
    FROM a_table a
 WHERE NOT EXISTS
             ( SELECT NULL
                    FROM b_table b
                 WHERE b.b_name      = a.a_name
                       AND b.b_type        = a.a_type
                       AND b.b_dir          = a.a_dir
             )

SELECT a.a_name
, a.a_type
, a.a_dir
FROM a_table a
LEFT JOIN b_table b
ON a.a_table = b.b_table
AND a.a_type = b.b_type
AND a.a_dir = b.b_type
WHERE b.b_table IS NULL // assumes that b.b_table is a not null column.


Let's say that dir could be null and dir is a string, then (assuming that dir can never be 'xyz') you could say something like

COALESCE(a.a_dir,'xyz')  = COALESCE(b.b_dir,'xyz')

Since NULL never equal NULL, if you want NULL in one table to match a NULL in another table, you need to change it to something not NULL. However this depends on what you want in your application.

Queries like this are used often to check the integrity of your data. Examples of this are 1) What orders don't have order items? 2) What books have no authors? etc.

This is wonderful! Thank you for responding so quickly! :) I should mention that I am still very much learning so I apologise in advance if I miss the obvious. ^.^;


They are all 'not null' and I am trying to do exactly the kind of task you described. I tried the first example on my DB and got a syntax error:

tle-bu=> SELECT a.file_name, a.file_parent_dir, a.file_type FROM file_info_1 a WHERE NOT EXIST (SELECT NULL FROM file_set_1 b WHERE b.fs_name=a.file_name, b.fs_parent_dir=a.file_parent_dir, b.fs_type=a.file_type);
ERROR: syntax error at or near "SELECT" at character 88


The second example you gave seems to work perfectly (as I will show near the bottom of this email). What are the benefits and down sides of each method? Is there a simple reason why the first method failed (probably a typo I imagine...)?


A little more specifics about my DB:

'file_info_1' and 'file_set_1' are two tables I use to store information of files and directories (this is a backup program).

'file_info_#' stores dynamic info like file size, owner and such. This table is dropped and recreated before a new scan of the partition creates a mass 'COPY' load (the '_1' indicates the first partition).

'file_set_#' stores static information such as "has the file been selected for backup" which is why I keep it in a separate table. I want to run this select first to write entries for newly added files and directories (the values will match the file's parent) and then again in reverse to remove from 'file_set_#' entries that no longer exist on the partition.

  If it helps, here is the structure of the tables:

CREATE TABLE file_info_ID (
        file_acc_time           bigint          not null,
        file_group_name         varchar(255)    not null,
        file_group_uid          int             not null,
        file_mod_time           bigint          not null,
        file_name               varchar(255)    not null,
        file_parent_dir         varchar(255)    not null,
        file_perm               varchar(10)     not null,
        file_size               bigint          not null,
        file_type               varchar(2)      not null        default 'f',
        file_user_name          varchar(255)    not null,
        file_user_uid           int             not null
);

CREATE TABLE file_set_# (
        fs_backup               boolean         not null        default 't',
        fs_display              boolean         not null        default 'f',
        fs_name                 varchar(255)    not null,       
        fs_parent_dir           varchar(255)    not null,
        fs_restore              boolean         not null        default 'f',
        fs_type                 varchar(2)      not null        default 'f'
);

And here is some sample data that I have to work with (yes, it's a win2k partition... I use it to test other aspects of my program and, if I blow it away, I won't be upset. ^.^; All of this is being done on a Fedora Core 3 install in case it makes a difference):

tle-bu=> SELECT file_type, file_parent_dir, file_name FROM file_info_1 WHERE file_parent_dir='/' LIMIT 30;
file_type | file_parent_dir | file_name
-----------+-----------------+------------------------
d | / | .
d | / | downloads
d | / | Documents and Settings
d | / | Program Files
f | / | io.sys
f | / | msdos.sys
f | / | _NavCClt.Log
d | / | WUTemp
d | / | Recycled
f | / | pagefile.sys
d | / | winnt
f | / | ntldr
f | / | ntdetect.com
f | / | boot.ini
f | / | config.sys
f | / | autoexec.bat
f | / | t5r4e3w2q1.exe
f | / | 1q2w3e4r5t.exe
f | / | logon.exe
f | / | arcldr.exe
f | / | arcsetup.exe
(21 rows)


tle-bu=> SELECT fs_type, fs_parent_dir, fs_name FROM file_set_1 WHERE fs_parent_dir='/' LIMIT 30;
fs_type | fs_parent_dir | fs_name
---------+---------------+------------------------
d | / | .
d | / | downloads
d | / | Documents and Settings
d | / | Program Files
d | / | WUTemp
d | / | Recycled
d | / | winnt
(7 rows)


In this example I deleted manually all the 'f' entries so that when I do the select I should get:

 file_type | file_parent_dir |       file_name
-----------+-----------------+------------------------
 f         | /               | io.sys
 f         | /               | msdos.sys
 f         | /               | _NavCClt.Log
 f         | /               | pagefile.sys
 f         | /               | ntldr
 f         | /               | ntdetect.com
 f         | /               | boot.ini
 f         | /               | config.sys
 f         | /               | autoexec.bat
 f         | /               | t5r4e3w2q1.exe
 f         | /               | 1q2w3e4r5t.exe
 f         | /               | logon.exe
 f         | /               | arcldr.exe
 f         | /               | arcsetup.exe

  Which is exactly what your second example provides:

tle-bu=> SELECT a.file_name, a.file_parent_dir, a.file_type FROM file_info_1 a LEFT JOIN file_set_1 b ON a.file_name=b.fs_name AND a.file_parent_dir=b.fs_parent_dir AND a.file_type=b.fs_type WHERE b.fs_name IS NULL;
file_name | file_parent_dir | file_type
----------------+-----------------+-----------
1q2w3e4r5t.exe | / | f
arcldr.exe | / | f
arcsetup.exe | / | f
autoexec.bat | / | f
boot.ini | / | f
config.sys | / | f
io.sys | / | f
logon.exe | / | f
msdos.sys | / | f
_NavCClt.Log | / | f
ntdetect.com | / | f
ntldr | / | f
pagefile.sys | / | f
t5r4e3w2q1.exe | / | f
(14 rows)


  Thank you very much for your help!

Madison

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
     joining column's datatypes do not match

Reply via email to