On 17.07.2008 at 22:33, Hamish Allan wrote:

On Thu, Jul 17, 2008 at 8:49 PM, Philip Mötteli
<[EMAIL PROTECTED]> wrote:

I am trying to analyze objects that have been serialized using keyed encoding. As long as there are only simple values, I have no problem. But the members
of to-many IVars are usually keyed with something like
"IVarName[0-9]+". I have to filter those out and classify them as to-many. But I can't count on that pattern: neither on the name, nor on where the number is, nor on whether
there is a number at all. It could also be a letter.

Have you considered clustering the strings based on their Levenshtein distances?
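
A minimal sketch of what such clustering could look like, written in Python for brevity rather than Objective-C. The sample keys, the function names and the `max_distance=1` threshold are assumptions made for illustration; nothing here comes from the original posts, and a real key set would probably need a tuned threshold.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the rows short by iterating over the longer string
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # drop ca
                               current[j - 1] + 1,       # drop cb
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


def cluster_keys(keys, max_distance=1):
    """Greedy clustering: a key joins the first cluster that already holds
    a member within max_distance, otherwise it starts a new cluster."""
    clusters = []
    for key in keys:
        for cluster in clusters:
            if any(levenshtein(key, member) <= max_distance for member in cluster):
                cluster.append(key)
                break
        else:
            clusters.append([key])
    return clusters


# Hypothetical keys as they might appear in a keyed archive.
keys = ["title", "children0", "children1", "children2", "age", "children10"]
clusters = cluster_keys(keys)

# Clusters with more than one member are candidates for to-many IVars.
to_many_candidates = [c for c in clusters if len(c) > 1]
```

Short, unrelated key names can also lie within a small edit distance of each other, so the resulting clusters are only candidates and would still need a plausibility check.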

Very interesting! This will help me to find the to-manies. But…


What I'm asking is: if you can
identify everything that is not a to-many relationship, can you find them via
a process of elimination (if it's not something I can identify, then
it must be a to-many)?

But I can't run this analysis every time an object gets serialized; my program would be far too slow. I should only analyze, or reanalyze, when the last solution no longer works. So ideally, I thought, I would create a regex for the to-manies, and every time the key of an encoded IVar comes in, I just test it against the regex to see whether it belongs to the to-manies. If, at the end, an encoded IVar is left over or a type has changed, I know that my analysis was not correct. So I reload all the data and redo the analysis. I'm very confident that I will find the right solution this way after no more than 2 passes. Having to analyze only twice, compared to analyzing after every serialization, is much faster.
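
The idea of a cached regex with a fallback to reanalysis could be sketched roughly as follows, in Python for brevity. The class name `KeyClassifier`, its methods, and the heuristic inside `analyze` (keys that share a stem and differ only in a trailing run of digits) are assumptions made for illustration. As noted earlier in the thread, real keys may not follow that pattern, so the analysis step is a placeholder for whatever method is actually used.

```python
import re


class KeyClassifier:
    """Caches the result of an expensive key analysis as a compiled regex
    and only reanalyzes when a key cannot be classified."""

    def __init__(self):
        self.to_many_pattern = None  # compiled regex, None until needed
        self.simple_keys = set()     # keys known to hold simple values
        self.seen_keys = set()       # everything encountered so far
        self.analysis_count = 0      # how often the expensive step ran

    def analyze(self):
        """Expensive step (hypothetical heuristic): stems shared by several
        keys that differ only in trailing digits are treated as to-many."""
        self.analysis_count += 1
        stems = {}
        for key in sorted(self.seen_keys):
            stems.setdefault(re.sub(r"\d+$", "", key), []).append(key)
        to_many_stems = [s for s, members in stems.items() if len(members) > 1]
        if to_many_stems:
            alternatives = "|".join(re.escape(s) for s in to_many_stems)
            self.to_many_pattern = re.compile(r"(?:%s)\d+" % alternatives)
        else:
            self.to_many_pattern = None
        self.simple_keys = {k for k in self.seen_keys
                            if self.classify_cached(k) != "to-many"}

    def classify_cached(self, key):
        """Cheap check against the cached result; None means unknown."""
        if self.to_many_pattern and self.to_many_pattern.fullmatch(key):
            return "to-many"
        if key in self.simple_keys:
            return "simple"
        return None

    def process(self, keys):
        """Classify all keys of one object, reanalyzing only on a miss."""
        results = {k: self.classify_cached(k) for k in keys}
        if None in results.values():
            self.seen_keys.update(keys)
            self.analyze()
            results = {k: self.classify_cached(k) for k in keys}
        return results


classifier = KeyClassifier()
first = classifier.process(["title", "children0", "children1"])   # triggers analysis
second = classifier.process(["title", "children0", "children2"])  # regex only
third = classifier.process(["title", "age", "children0"])         # unknown key, reanalyze
```

In this sketch the second object is handled by the cached regex alone, and only the unknown key in the third object forces a second analysis over all keys seen so far.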

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
