>Do you need to keep the masking values between runs? I.e., must the masked 
>value be the same for the same input on multiple runs? 

No - we will extract a full set of PROD data, mask it, and then burn a DVD for 
our vendor.  If they ask for a refresh, we will just repeat using current PROD 
data.

>If not, then the simplest way that I can think of is to use a 
>sequential number as the replacement value. Keep a hash table so that when you 
>look at the unmapped number, you can determine whether it has already been seen 
>and has a replacement value. If it does, then replace it. If it doesn't, 
>generate the next number in order and update your mapping data with the input 
>value and its replacement. This could be as simple as a very large sequential 
>array.
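
For what it's worth, the hash-table approach quoted above can be sketched in a few lines. This is only an illustration in Python, not something tested against real data; the class name SequentialMasker and the starting value of 1 are assumptions of mine.

```python
class SequentialMasker:
    """Map each distinct live value to the next sequential replacement."""

    def __init__(self, start=1):
        # 'start' is an assumed first replacement value.
        self._next = start
        self._seen = {}  # live value -> replacement value

    def mask(self, live):
        # Hand out a new sequential number only the first time a value is seen.
        if live not in self._seen:
            self._seen[live] = self._next
            self._next += 1
        return self._seen[live]


masker = SequentialMasker()
masked = [masker.mask(v) for v in ["123456789", "987654321", "123456789"]]
print(masked)  # prints [1, 2, 1]
```

The dictionary grows by one entry per distinct live value, so memory is the limiting factor for a very large extract.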

>If you have DB2, then you've got an easy way. Create a table with two columns. 
>The first column is defined as a serial number which is autogenerated by DB2. 
>The second column is the live number. Put an index on both columns. When you 
>get a live number, do a lookup in the table to retrieve the mapped value. If 
>the lookup fails, add the live number to the table, getting the serial number 
>assigned. If this is not random enough, then actually use a randomly generated 
>number instead of a serial number in the first column. This is a bit 
>more complicated since, if the live number is not yet in the table, you'll 
>need to generate the random number and try to insert a new row (a unique index 
>on each of the columns). If the new row inserts properly (which guarantees that 
>both the random number and live number are unique in the table), use the 
>random number. If the new row does not insert, then generate a new random 
>number and try to insert again. Repeat until the row inserts and use it.
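
The insert-and-retry loop described above could look roughly like this. It is only a sketch: Python with SQLite stands in for DB2, and the table name id_map, the column names, and the mask() helper are all made up for illustration. It also assumes a single writer, so a failed insert can only mean the random number was already taken.

```python
import random
import sqlite3


def open_map(path=":memory:"):
    # Two columns, each with its own unique constraint (and hence index).
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS id_map ("
        " masked INTEGER NOT NULL UNIQUE,"
        " live   INTEGER NOT NULL UNIQUE)"
    )
    return conn


def mask(conn, live, rng, limit=10**9):
    # Look the live number up first; reuse its mapping if present.
    row = conn.execute(
        "SELECT masked FROM id_map WHERE live = ?", (live,)
    ).fetchone()
    if row is not None:
        return row[0]
    # Otherwise keep drawing random numbers until the insert succeeds.
    while True:
        candidate = rng.randrange(limit)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO id_map (masked, live) VALUES (?, ?)",
                    (candidate, live),
                )
            return candidate
        except sqlite3.IntegrityError:
            continue  # random number already used; try another


conn = open_map()
rng = random.Random()
first = mask(conn, 123456789, rng)
print(first == mask(conn, 123456789, rng))
```

As the table fills up, collisions get more frequent and the retry loop slows down, which is one more cost on top of the table size itself.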

John, this is actually kinda like my plan B.  A real identifier of value N 
would be translated to be the Nth value in a sequence of pseudo-random numbers. 
The only problem is maintaining a billion-row table.  I have thought of asking 
our security officer if I can get away with only masking the last six digits of 
the identifier, leaving the first three as is.
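
To make the six-digit variant concrete, here is a rough sketch in Python (used only for illustration). It assumes nine-digit identifiers held as strings; the function names and the seed value are hypothetical. The table is a shuffled list of 0 through 999999, so suffix N maps to the Nth entry and the table needs a million entries rather than a billion.

```python
import random


def build_suffix_table(seed, size=10**6):
    # A shuffled permutation: every six-digit suffix maps to a distinct suffix.
    table = list(range(size))
    random.Random(seed).shuffle(table)
    return table


def mask_id(identifier, table):
    # Keep the first three digits as is; replace only the last six.
    prefix, suffix = identifier[:3], identifier[3:]
    return prefix + f"{table[int(suffix)]:06d}"


table = build_suffix_table(seed=2024)  # the seed is an arbitrary example value
print(mask_id("123456789", table))
```

Because the table is a permutation, two different live identifiers never collide after masking. Since the masked values do not have to be the same between runs, a fresh seed can be used for each extract and thrown away afterwards.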
 

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
