I'm not attempting to solve this puzzle, but, regarding the output from print_r(), try this for nicely formatted output:
echo "<pre>"; print_r($myarray); echo "</pre>";

Looks much better, since the browser keeps print_r()'s line breaks and indentation inside <pre>.

Monty

> From: [EMAIL PROTECTED] (Peter Harkins)
> Newsgroups: php.general
> Date: Sat, 26 Oct 2002 02:37:51 -0700
> To: [EMAIL PROTECTED]
> Subject: parsing conundrum
>
> If you know what recursion is and like a challenge, here's a puzzle
> to keep you up nights. If not, you'll probably just want to mutter to
> yourself "what a poor, unlucky bastard" and pass on by.
>
> I'm parsing some data files into a PHP array and am stumped. I'm at
> a loss for how to do this without grinding through character by character.
> That would work, but my subconscious is nagging at me that there's got to be
> a more elegant way to do it that I'm just not seeing, so I'm going to
> describe the problem and ask for help before I start grinding.
>
> The app I'm getting this from has 4 data types: int, string, array
> and mapping (associative array).
>
> Ints and strings are pretty straightforward, but there's no way to
> tell 0 from null int or null string. This is an annoying limitation of the
> app that just has to be ignored and dealt with by whatever gets this data
> from us. This (among other reasons) makes me glad PHP is weakly-typed.
>
> Arrays are indexed from 0 and values can mix ints and strings
> freely.
>
> To start, mappings are arrays indexed by ints or strings. Mappings
> aren't just arrays, though; they have a "width" (which is really a nested
> array that I'm pretty certain is an ugly historical artifact). Width allows
> multiple values for one key and must be the same for all values in a
> mapping, though the values (both of keys and their values) don't have to be
> of the same type. Mappings can also mix ints and strings.
>
> The tough part is that arrays and mappings can nest inside of each
> other and the only characters quoted in strings are \, " and \n. This means
> recursion must be used, but I just can't figure out a way to find the
> boundaries of each element.
> Anyone with a clean way to do this (probably
> with some kind of crazy regexp) will receive my awe and gratitude.
>
> Here's an example file[1]:
>
> null_string 0
> some_string "Fourscore and seven years ago..."
> unset_int 0
> an_int 42
> negative_int -12
> null_array ({ })
> null_mapping ([ ])
> easy_array ({9,22,"test",})
> easy_mapping (["string":3,"foo":"bar",[12]:"I am not a crook!",])
> medium_array ({"a string, containing a comma and a \"",23,})
> medium_mapping (["str\"ing":3;5;7, 9:"Read my lips.";11;13;,])
> hard_array ({"comma, string",({3,4,5,}),({"'\"str'",4,({3,4,({ }),}),}),})
> hard_mapping ([17:"str";15,"foo":([ ]);17,"b'l\\a\nh":19;([21:23,]),"tour de force":({29,31});({([ ])}),])
>
> You may notice the last one is pathological[2]. Yes, PHP will really
> let you use " and \n in array keys. The real data do sometimes get about
> this complex; consider this a compressed version. As a fun fact, I've
> learned vim's % command doesn't work when there's an odd number of double
> quotes between your parens/braces.
>
> Calling print_r on the array this generates would return:
>
> Array
> (
>     [null_string] => 0
>     [some_string] => "Fourscore and seven years ago..."
>     [unset_int] => 0
>     [an_int] => 42
>     [negative_int] => -12
>     [null_array] => Array
>         (
>         )
>
>     [null_mapping] => Array
>         (
>         )
>
>     [easy_array] => Array
>         (
>             [0] => 9
>             [1] => 22
>             [2] => test
>         )
>
>     [easy_mapping] => Array
>         (
>             [string] => 3
>             [foo] => "bar"
>             [12] => "I am not a crook!"
>         )
>
>     [medium_array] => Array
>         (
>             [0] => a string, containing a comma and a "
>             [1] => 23
>         )
>
>     [medium_mapping] => Array
>         (
>             [string] => Array
>                 (
>                     [0] => 3
>                     [1] => 5
>                     [2] => 7
>                 )
>
>             [9] => Array
>                 (
>                     [0] => Read my lips.
>                     [1] => 11
>                     [2] => 13
>                 )
>
>         )
>
>     [hard_array] => Array
>         (
>             [0] => comma, string
>             [1] => Array
>                 (
>                     [0] => 3
>                     [1] => 4
>                     [2] => 5
>                 )
>
>             [2] => Array
>                 (
>                     [0] => '"str'
>                     [1] => 4
>                     [2] => Array
>                         (
>                             [0] => 3
>                             [1] => 4
>                             [2] => Array
>                                 (
>                                 )
>
>                         )
>
>                 )
>
>         )
>
>     [hard_mapping] => Array
>         (
>             [17] => Array
>                 (
>                     [0] => str
>                     [1] => 15
>                 )
>
>             [foo] => Array
>                 (
>                     [0] => Array
>                         (
>                         )
>
>                     [1] => 17
>                 )
>
>             [b'l\a\nh] => Array
>                 (
>                     [0] => 19
>                     [1] => Array
>                         (
>                             [21] => 23
>                         )
>
>                 )
>
>             [tour de force] => Array
>                 (
>                     [0] => Array
>                         (
>                             [29] => 31
>                         )
>
>                     [1] => Array
>                         (
>                             [0] => Array
>                                 (
>                                 )
>
>                         )
>
>                 )
>
>         )
>
> )
>
> The truly eagle-eyed will note that print_r doesn't really escape \n
> when outputting, which is ugly and should probably have a bug report filed
> on it. I've ignored this behavior of print_r to produce more readable
> output.
>
> Anyone who gets the 'tour de force' can contact me for the 'victory
> lap' puzzle if it didn't occur to them while solving this one. It's equally
> frustrating.
>
> [1] Brownie points to anyone who recognizes the app that outputs this data
> format. Yep, I'm doing web integration. Feel free to contact me off-list if
> you've any ideas/code to share or would like to see what I'm doing. Yes,
> technically, mappings can be indexed by an array or mapping. I've ignored
> this because 1. PHP can't do this and 2. Anyone doing this will be dragged
> out into the street and shot.
>
> [2] http://tuxedo.org/~esr/jargon/html/entry/pathological.html

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
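
For anyone who wants a starting point on the quoted puzzle, the format as described (ints, strings with \\, \" and \n escapes, ({ }) arrays, ([ ]) mappings with a width) can probably be handled by a small recursive-descent parser that carries an offset into the text. What follows is only a minimal sketch, assuming PHP 7 or later. The function names (skipWs, parseValue, parseArray, parseMapping, parseDump) are made up for illustration and are not from the original post. It assumes mapping keys are ints or strings, returns a width-1 mapping value as a plain scalar (as in the expected print_r output above), and tolerates the bracketed [12] key and the trailing ';' that appear in the example file. It has not been run against real output from the app.

```php
<?php
// Sketch only: all function names here are hypothetical.

// Advance $i past spaces, tabs and newlines.
function skipWs(string $s, int &$i): void
{
    while ($i < strlen($s) && strpos(" \t\r\n", $s[$i]) !== false) {
        $i++;
    }
}

// Parse one int, string, array or mapping starting at offset $i.
function parseValue(string $s, int &$i)
{
    skipWs($s, $i);
    $two = substr($s, $i, 2);
    if ($two === '({') { $i += 2; return parseArray($s, $i); }
    if ($two === '([') { $i += 2; return parseMapping($s, $i); }
    // Quoted string: runs of ordinary characters or backslash escapes.
    if (preg_match('/"((?:[^"\\\\]|\\\\.)*)"/As', $s, $m, 0, $i)) {
        $i += strlen($m[0]);
        return strtr($m[1], ['\\\\' => '\\', '\\"' => '"', '\\n' => "\n"]);
    }
    if (preg_match('/-?\d+/A', $s, $m, 0, $i)) {
        $i += strlen($m[0]);
        return (int) $m[0];
    }
    throw new RuntimeException("Unexpected input at offset $i");
}

// Called just after '({'; reads elements until the closing '})'.
function parseArray(string $s, int &$i): array
{
    $out = [];
    while (true) {
        skipWs($s, $i);
        if (substr($s, $i, 2) === '})') { $i += 2; return $out; }
        $out[] = parseValue($s, $i);
        skipWs($s, $i);
        if (substr($s, $i, 1) === ',') { $i++; }
    }
}

// Called just after '(['; reads key:value;value pairs until '])'.
function parseMapping(string $s, int &$i): array
{
    $out = [];
    while (true) {
        skipWs($s, $i);
        if (substr($s, $i, 2) === '])') { $i += 2; return $out; }
        $bracketed = substr($s, $i, 1) === '[';   // the "[12]" key form from the example
        if ($bracketed) { $i++; }
        $key = parseValue($s, $i);                // assumed to be an int or a string
        skipWs($s, $i);
        if ($bracketed) {
            if (substr($s, $i, 1) !== ']') {
                throw new RuntimeException("Expected ']' at offset $i");
            }
            $i++;
            skipWs($s, $i);
        }
        if (substr($s, $i, 1) !== ':') {
            throw new RuntimeException("Expected ':' at offset $i");
        }
        $i++;
        $values = [];
        do {
            $values[] = parseValue($s, $i);
            skipWs($s, $i);
            $more = substr($s, $i, 1) === ';';
            if ($more) {
                $i++;
                skipWs($s, $i);
                // Tolerate a trailing ';' right before ',' or '])'.
                $more = substr($s, $i, 1) !== ',' && substr($s, $i, 2) !== '])';
            }
        } while ($more);
        if (substr($s, $i, 1) === ',') { $i++; }
        $out[$key] = count($values) === 1 ? $values[0] : $values;
    }
}

// Parse a whole dump of "name value" entries into one PHP array.
function parseDump(string $text): array
{
    $out = [];
    $i = 0;
    while (preg_match('/\s*(\w+)\s+/A', $text, $m, 0, $i)) {
        $i += strlen($m[0]);
        $out[$m[1]] = parseValue($text, $i);
    }
    return $out;
}
```

Something like print_r(parseDump(file_get_contents('dump.txt'))) would then show the result, where dump.txt is a placeholder file name. Carrying the offset by reference is what finds the element boundaries: each parse function stops exactly where its own element ends, so the caller never has to scan ahead for a matching bracket.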