Is Python Suitable for Large Find & Replace Operations?

rbt Mon, 13 Jun 2005 11:10:32 -0700

Here's the scenario:

You have many hundreds of gigabytes of data... possibly even a terabyte or 
two. Within this data, you have private, sensitive information (US 
social security numbers) about your company's clients. Your company has 
generated its own unique ID numbers to replace the social security numbers.


Now, management would like the IT guys to go through the old data and 
replace as many SSNs with the new ID numbers as possible. You have a 
tab-delimited text file that maps the SSNs to the new ID numbers. There 
are 500,000 of these number pairs. What is the most efficient way to 
approach this? I have done small-scale find and replace programs before, 
but the scale of this is larger than what I'm accustomed to.
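
To make the question concrete, here is a minimal sketch of one possible 
approach, not a tested solution: load the 500,000 pairs into a dict, 
then stream each file line by line and let a single regular expression 
find SSN-shaped strings, looking each one up in the dict. Everything 
named here is hypothetical (load_mapping, replace_ssns, scrub_file), and 
it assumes the SSNs appear as ddd-dd-dddd or as nine bare digits, that 
the data is line-oriented text, and that the mapping fits in memory.

```python
import re

# Assumption: an SSN is ddd-dd-dddd or nine bare digits, and is not part
# of a longer run of digits.
SSN_RE = re.compile(r"(?<!\d)(\d{3})-?(\d{2})-?(\d{4})(?!\d)")


def load_mapping(path):
    """Read the tab-delimited SSN -> new ID file into a dict.

    Keys are normalized to nine digits with the hyphens removed.
    """
    mapping = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 2:
                continue  # skip blank or malformed lines
            ssn, new_id = fields
            mapping[ssn.replace("-", "").strip()] = new_id.strip()
    return mapping


def replace_ssns(text, mapping):
    """Replace every known SSN in text; leave unknown matches untouched."""
    def substitute(match):
        key = "".join(match.groups())
        return mapping.get(key, match.group(0))
    return SSN_RE.sub(substitute, text)


def scrub_file(src_path, dst_path, mapping):
    """Stream one file line by line so memory use stays flat."""
    with open(src_path, encoding="utf-8", errors="surrogateescape") as src, \
         open(dst_path, "w", encoding="utf-8", errors="surrogateescape") as dst:
        for line in src:
            dst.write(replace_ssns(line, mapping))
```

The point of the dict lookup is that the cost per line does not grow 
with the number of pairs, whereas running 500,000 separate 
str.replace() calls over every line would. A nine-digit number that is 
not an SSN is only touched if it happens to equal a mapped SSN.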

Any suggestions on how to approach this are much appreciated.
-- 
http://mail.python.org/mailman/listinfo/python-list
