>Do you need to keep the masking values between runs? I.e. must the masked
>value be the same for the same input on multiple runs?

No - we will extract a full set of PROD data, mask it, and then burn a DVD for our vendor. If they ask for a refresh, we will just repeat using current PROD data.

>If not, then the simplest way that I can think of is to use a
>sequential number as the replacement value. Keep a hash table so that when you
>look at the unmapped number, you can determine whether it has already been seen
>and has a replacement value. If it does, then replace it. If it doesn't,
>generate the next number in order and update your mapping data with the input
>value and its replacement. This could be as simple as a very large sequential
>array.

>If you have DB2, then you've got an easy way. Create a table with two columns.
>The first column is defined as a serial number which is autogenerated by DB2.
>The second column is the live number. Put an index on both columns. When you
>get a live number, do a lookup in the table to retrieve the mapped value. If
>the lookup fails, add the live number to the table, getting the serial number
>assigned. If this is not random enough, then actually use a randomly generated
>number instead of a serial number in the first (serial) column. This is a bit
>more complicated since, if the live number is not yet in the table, you'll
>need to generate the random number and try to insert a new row (unique index
>on both of the columns). If the new row inserts properly (which guarantees that
>both the random number and live number are unique in the table), use the
>random number. If the new row does not insert, then generate a new random
>number and try to insert again. Repeat until the row inserts and use [...]

John, this is actually kinda like my plan B. A real identifier of value N would be translated to the Nth value in a sequence of pseudo-random numbers. The only problem is maintaining a billion-row table. I have thought of asking our security officer if I can get away with masking only the last six digits of the identifier, leaving the first three as is.
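
For what it's worth, here is a rough sketch of the lookup-or-assign scheme John describes, with random replacement values and a retry on collision. It is only an illustration under my own assumptions: it is in Python rather than anything we would run on the mainframe, a dict and a set stand in for the two unique indexes on the DB2 table, and the names (Masker, mask, digits, seed) are made up for the example.

```python
import random


class Masker:
    """Hypothetical helper: maps each live identifier to a unique random replacement."""

    def __init__(self, digits=9, seed=None):
        self.digits = digits
        self.limit = 10 ** digits          # number of possible replacement values
        self.rng = random.Random(seed)
        self.live_to_mask = {}             # stands in for the unique index on the live column
        self.used = set()                  # stands in for the unique index on the replacement column

    def mask(self, live):
        # Already seen: return the replacement assigned earlier in this run.
        if live in self.live_to_mask:
            return self.live_to_mask[live]
        if len(self.used) >= self.limit:
            raise RuntimeError("no unused replacement values left")
        # Not seen yet: draw random candidates until one is unused,
        # which mirrors "insert, and retry if the insert fails".
        while True:
            candidate = f"{self.rng.randrange(self.limit):0{self.digits}d}"
            if candidate not in self.used:
                break
        self.used.add(candidate)
        self.live_to_mask[live] = candidate
        return candidate


if __name__ == "__main__":
    masker = Masker(digits=9)
    for live in ["123456789", "987654321", "123456789"]:
        print(live, "->", masker.mask(live))
```

The first and third identifiers get the same replacement within a run, and nothing is kept between runs, which matches what we need for the DVD. It does not solve the size problem: the mapping still grows by one entry per distinct identifier, so a full extract would still mean something on the order of a billion entries.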
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN

