The proper solution would be to use the /ActualText feature of the PDF specification.
I am very interested in this issue of searching PDFs. A Google search for "PDF Actual Text" turned up nothing. I then downloaded the PDF specification from the Adobe website, found the reference, and got the idea of what it is about. But how do people like me, who use commonly available tools such as XeLaTeX to make PDFs, implement this? Could you say a little more about this feature or provide pointers to some information?
Thanks - David
--------------------------------------------------
Subscriptions, Archive, and List information, etc.: http://tug.org/mailman/listinfo/xetex