Hi!
I am currently doing Search Engine Optimization (SEO) on the HTML
pages of my web site (on Win XP Home SP3). For that it is important to
know which 3 words are used most frequently on a page, so I wrote a
cross-referencer (in C) to find them. The 2nd step is to find the 3 most
frequently used word groups consisting of 2 words. The results of both
steps should then be combined.
Now I have several possibilities. It is easy to do this in C as well.
Alternatives are using flex, or the combination of flex and bison.
To have Flex identify a word is easy:
[-0-9A-Za-z]+
So is the identification of 2 words:
[-0-9A-Za-z]+[ \t][-0-9A-Za-z]+
The easiest way to implement this is to write 2 programs, and manually
combine the result.
Now my question is: can both be combined in 1 Flex program, or in 1 Flex
and Bison program? Flex always prefers the longest match, so the 2-word
pattern will win and the single-word rule will not fire for those words.
Does this imply that I should introduce some functionality like a
'Moving Average Filter', i.e. a window over the last 2 words? Are there
better solutions?
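To make the 'Moving Average Filter' idea concrete, here is a rough
sketch in plain C of what I have in mind: every word is counted once, and
the previous word is remembered so that each pair of neighbouring words
is counted in the same pass. All names (count_table, table_add,
table_get, count_words_and_pairs) and the size limits are made up for
this illustration. I assume that a run of blanks or tabs separates the 2
words of a pair, that any other character ends the pair, and that case is
not folded. The same bookkeeping could presumably sit in the action of a
single-word Flex rule, but I have not tried that.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_WORD 64               /* assumed limit on word length      */
#define MAX_KEY  (2 * MAX_WORD)   /* room for "word1 word2"            */
#define MAX_KEYS 512              /* assumed limit on distinct entries */

struct count_table {
    char key[MAX_KEYS][MAX_KEY];
    int  count[MAX_KEYS];
    int  used;
};

/* Add one occurrence of key. Linear search is enough for one page. */
static void table_add(struct count_table *t, const char *key)
{
    int i;
    for (i = 0; i < t->used; i++) {
        if (strcmp(t->key[i], key) == 0) {
            t->count[i]++;
            return;
        }
    }
    if (t->used < MAX_KEYS) {     /* silently drop entries when full */
        strncpy(t->key[t->used], key, MAX_KEY - 1);
        t->key[t->used][MAX_KEY - 1] = '\0';
        t->count[t->used] = 1;
        t->used++;
    }
}

/* Return how often key was seen, or 0 if it was never added. */
static int table_get(const struct count_table *t, const char *key)
{
    int i;
    for (i = 0; i < t->used; i++)
        if (strcmp(t->key[i], key) == 0)
            return t->count[i];
    return 0;
}

/* Same character class as the Flex pattern [-0-9A-Za-z]. */
static int is_word_char(int c)
{
    return c == '-' || (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* One pass over text: count single words and overlapping 2-word groups. */
static void count_words_and_pairs(const char *text,
                                  struct count_table *words,
                                  struct count_table *pairs)
{
    char prev[MAX_WORD] = "";     /* the window: the previous word */
    char cur[MAX_WORD];
    char pair[MAX_KEY];
    const char *p = text;

    assert(text != NULL);
    while (*p != '\0') {
        size_t n = 0;
        if (!is_word_char(*p)) {
            /* anything but blank or tab empties the window */
            if (*p != ' ' && *p != '\t')
                prev[0] = '\0';
            p++;
            continue;
        }
        while (is_word_char(*p)) {
            if (n < MAX_WORD - 1) /* overlong words are truncated */
                cur[n++] = *p;
            p++;
        }
        cur[n] = '\0';
        table_add(words, cur);
        if (prev[0] != '\0') {
            sprintf(pair, "%s %s", prev, cur);  /* fits in MAX_KEY */
            table_add(pairs, pair);
        }
        strcpy(prev, cur);
    }
}
```

Called on the text of a page with two zero-initialised tables, this
fills both in one pass; picking the 3 largest counts from each table is
then a separate, simple step. Because the window moves one word at a
time, overlapping pairs are counted as well, which a rule matching 2
words at once would not do.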
Any suggestion is much appreciated.
Kind regards,
Hans Lodder
_______________________________________________
help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison