I find myself frobbing trees a lot these days: read in some XML,
wander around in tree-land for a while, then output either more XML
or somesuch.  And, quite frankly, it's a bit of a pain.

The issue, as I see it, is that Perl has no "power tools" for dealing
with trees.  I will admit that I don't know what these should look
like, but if Perl has them, it's news to me.  Here's an example:

Let's say that I've got a daemon which is running ps(1) on a regular
basis and logging the results.  A brute force approach would be to
save the raw ASCII output, but these days I'm trying to use XML.  So,
I write out the output as (informal) XML:

  <log>
    <ps time=123456789>
      <process>
        <pid>123</>
        <pcpu>4.6</>
        <stat>SN+</>
        ...
      </process>
    </ps>
    ...
  </log>

A bit bulky, but nicely tagged and serialized.  Now, I want to do
something with it.  OK, the first thing I do is read it in as a tree.
I use my own SAX handler, because I want a pure Perl way to load in
a tree, preserving order.  It loads in something like this:

  [ 'log', {},
    [ 'ps', { time => 123456789 },
      [ 'process', {},
        [ 'pid',  {}, '123' ],
        [ 'pcpu', {}, '4.6' ],
        [ 'stat', {}, 'SN+' ],
        ...
      ],
    ],
    ...
  ]
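
To make that layout concrete, here is a minimal sketch of a depth-first
walk over such a tree.  The walk() helper and the sample data are
hypothetical, made up for illustration; the only thing assumed from the
above is the [ name, {attrs}, children... ] shape, with text stored as
plain strings among the children.

```perl
use strict;
use warnings;

# Sample tree in the [ name, \%attrs, @children ] layout shown above.
my $tree =
  [ 'log', {},
    [ 'ps', { time => 123456789 },
      [ 'process', {},
        [ 'pid',  {}, '123' ],
        [ 'pcpu', {}, '4.6' ],
        [ 'stat', {}, 'SN+' ],
      ],
    ],
  ];

# walk() is a hypothetical helper: it calls $visit on every element node,
# depth first and in document order, passing the node and its path of names.
sub walk {
    my ($node, $visit, @path) = @_;
    my ($name, $attrs, @kids) = @$node;
    $visit->($node, @path, $name);
    for my $kid (@kids) {
        # Text children are plain strings; only descend into elements.
        walk($kid, $visit, @path, $name) if ref $kid eq 'ARRAY';
    }
}

my @seen;
walk($tree, sub {
    my ($node, @path) = @_;
    push @seen, join('/', @path);
});
print "$_\n" for @seen;
```

This prints one path per element (log, log/ps, log/ps/process, and so
on), which is about as far as a generic walker gets me; everything
interesting still has to happen inside the callback.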

The problem is that, although the data structure I've loaded in is a
tree, I generally want to use it as something else.  For example, let's
say that I want to "boil down" these log files a bit.  This means I
have to pick up the static values (e.g., pid), tally the distribution
of the flag values (e.g., stat), and average the numeric snapshots, as:

  foreach $time (sort(keys(%ps))) {
    $pid  =  $ps{$time}{pid} unless defined ($pid);
    $pcpu += $ps{$time}{pcpu};
    $stat{$ps{$time}{stat}}++;
    ...
  }

My approach to this, currently, is to walk the tree, creating the data
structure I'd _like_ to have, before I try to do the actual work.  This
isn't TOO painful, but it isn't the sort of DWIMitude I'd like to see.
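
A rough sketch of that two-step dance, under some simplifying
assumptions: one <process> per <ps> snapshot (which is what the loop
above implies), made-up sample values, and a hypothetical kids() helper
for picking out child elements.

```perl
use strict;
use warnings;

# Two snapshots in the [ name, \%attrs, @children ] layout; values are made up.
my $tree =
  [ 'log', {},
    [ 'ps', { time => 100 },
      [ 'process', {},
        [ 'pid', {}, '123' ], [ 'pcpu', {}, '4.0' ], [ 'stat', {}, 'SN+' ] ] ],
    [ 'ps', { time => 200 },
      [ 'process', {},
        [ 'pid', {}, '123' ], [ 'pcpu', {}, '6.0' ], [ 'stat', {}, 'SN+' ] ] ],
  ];

# kids() is a hypothetical helper: child elements of a node, skipping text.
sub kids {
    my ($node) = @_;
    return grep { ref $_ eq 'ARRAY' } @{$node}[ 2 .. $#$node ];
}

# Step 1: build $ps{$time}{$field}, assuming one <process> per snapshot.
my %ps;
for my $snap (grep { $_->[0] eq 'ps' } kids($tree)) {
    my $time = $snap->[1]{time};
    for my $proc (grep { $_->[0] eq 'process' } kids($snap)) {
        $ps{$time}{ $_->[0] } = $_->[2] for kids($proc);
    }
}

# Step 2: the "boil down" loop, run over the reshaped data.
my ($pid, $pcpu, %stat) = (undef, 0);
for my $time (sort { $a <=> $b } keys %ps) {
    $pid   = $ps{$time}{pid} unless defined $pid;
    $pcpu += $ps{$time}{pcpu};
    $stat{ $ps{$time}{stat} }++;
}
my $avg = $pcpu / scalar(keys %ps);
printf "pid=%s avg_pcpu=%.1f\n", $pid, $avg;
```

Note that step 1 is nearly as long as step 2, and it knows the element
names by heart; change the log format and it has to be rewritten.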

More to the point, let's say that I simply want to transform the data
into a different order.  In a multiply subscripted array, this is just
a matter of swapping subscripts on the output loop(s).  Turning the tree
above into something like:

  <process pid="123">
    <time>123456789,...</>
    <pcpu>4.6,...</>
    <stat>SN+,...</>
  </process>

is not something I want to try in XSLT.  I can do it in Perl, of course,
but I end up writing a lot of code.  Am I missing something?  And, to
bring the posting back on topic, will Perl6 bring anything new to the
campfire?
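
For the sake of discussion, here is a minimal sketch of the reordering
in plain Perl.  It assumes the tree has already been reshaped into a
hash of the form $ps{$time}{$pid}{$field} (a layout chosen for this
example), uses made-up values, and writes out full end tags instead of
the </> shorthand.

```perl
use strict;
use warnings;

# Snapshots already reshaped as $ps{$time}{$pid}{$field}; values are made up.
my %ps = (
    100 => { 123 => { pcpu => '4.0', stat => 'SN+' } },
    200 => { 123 => { pcpu => '6.0', stat => 'S'   } },
);

# Swap the subscripts: collect each field's values per pid, in time order.
my %by_pid;
for my $time (sort { $a <=> $b } keys %ps) {
    for my $pid (keys %{ $ps{$time} }) {
        push @{ $by_pid{$pid}{time} }, $time;
        push @{ $by_pid{$pid}{$_} }, $ps{$time}{$pid}{$_} for qw(pcpu stat);
    }
}

# Emit XML, one <process> per pid.  No escaping is done here, so this
# assumes the values contain no XML metacharacters.
my $out = '';
for my $pid (sort { $a <=> $b } keys %by_pid) {
    $out .= qq{<process pid="$pid">\n};
    $out .= sprintf "  <%s>%s</%s>\n", $_, join(',', @{ $by_pid{$pid}{$_} }), $_
        for qw(time pcpu stat);
    $out .= "</process>\n";
}
print $out;
```

The subscript swap itself is only a few lines; the bulk of the work is
getting from the tree to the hash and back out to XML again.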

-r
--
email: [EMAIL PROTECTED]; phone: +1 650-873-7841
http://www.cfcl.com/rdm    - my home page, resume, etc.
http://www.cfcl.com/Meta   - The FreeBSD Browser, Meta Project, etc.
http://www.ptf.com/dossier - Prime Time Freeware's DOSSIER series
http://www.ptf.com/tdc     - Prime Time Freeware's Darwin Collection
