Hi!

I am currently doing Search Engine Optimization (SEO) on the HTML pages of my website (on Win XP Home SP3). For that it is important to know which 3 words are used most frequently on a page, so I wrote a cross-referencer (in C) to find them. The second step is to find the 3 most frequently used word groups consisting of 2 words. The results of both steps should then be combined.

Now I have several options. It is easy to do the second step in C as well. The alternatives are Flex alone, or the combination of Flex and Bison.

To have Flex identify a word is easy:

[-0-9A-Za-z]+

So is the identification of 2 words:

[-0-9A-Za-z]+[ \t][-0-9A-Za-z]+

The easiest way to implement this is to write 2 programs and combine the results manually.

Now my question is: can both be combined in one Flex program, or in one Flex and Bison program? Flex always prefers the longest match, so wherever the two-word pattern matches it will not report the single words separately. Does this imply that I should introduce some functionality like a 'Moving Average Filter'? Are there better solutions?
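
To make the 'moving filter' idea concrete, here is a rough sketch in plain C of what I have in mind: every word is counted once on its own, and the previous word is remembered so that each word also closes a 2-word group. This is only an illustration under assumptions. The names (bump, count_of, feed_text, sort_by_count) and the table sizes are made up for this example, matching is case-sensitive like the patterns above, and a group is only counted when the two words are separated by spaces or tabs.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ENTRIES 1024  /* assumed upper bound on distinct keys per page */
#define MAX_WORD    63    /* assumed maximum word length; longer words are cut */
#define MAX_KEY     128   /* room for "word word" plus the terminating NUL */

struct entry { char key[MAX_KEY]; int count; };
struct table { struct entry e[MAX_ENTRIES]; int n; };

/* Add one occurrence of key (linear search; fine for a single page). */
static void bump(struct table *t, const char *key)
{
    int i;
    for (i = 0; i < t->n; i++)
        if (strcmp(t->e[i].key, key) == 0) { t->e[i].count++; return; }
    if (t->n < MAX_ENTRIES) {
        strncpy(t->e[t->n].key, key, MAX_KEY - 1);
        t->e[t->n].key[MAX_KEY - 1] = '\0';
        t->e[t->n].count = 1;
        t->n++;
    }
}

/* Return how often key was seen, or 0 if it is not in the table. */
static int count_of(const struct table *t, const char *key)
{
    int i;
    for (i = 0; i < t->n; i++)
        if (strcmp(t->e[i].key, key) == 0) return t->e[i].count;
    return 0;
}

/* Same character class as the Flex pattern [-0-9A-Za-z]. */
static int is_word_char(int c)
{
    return c == '-' || (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Count single words and 2-word groups in one pass over s. */
static void feed_text(const char *s, struct table *words, struct table *pairs)
{
    char prev[MAX_WORD + 1] = "";  /* previous word, "" if there is none */
    char cur[MAX_WORD + 1];
    char pair[MAX_KEY];
    size_t len = 0;

    for (;; s++) {
        if (is_word_char((unsigned char)*s)) {
            if (len < MAX_WORD) cur[len++] = *s;
            continue;
        }
        if (len > 0) {                     /* a word just ended */
            cur[len] = '\0';
            bump(words, cur);
            if (prev[0] != '\0') {         /* it also closes a 2-word group */
                snprintf(pair, sizeof pair, "%s %s", prev, cur);
                bump(pairs, pair);
            }
            strcpy(prev, cur);
            len = 0;
        }
        if (*s == '\0') break;
        if (*s != ' ' && *s != '\t')       /* anything but a blank breaks a group */
            prev[0] = '\0';
    }
}

/* qsort comparison: highest count first. */
static int by_count_desc(const void *a, const void *b)
{
    return ((const struct entry *)b)->count - ((const struct entry *)a)->count;
}

/* Sort a table so that e[0], e[1], e[2] are the three most frequent keys. */
static void sort_by_count(struct table *t)
{
    qsort(t->e, (size_t)t->n, sizeof t->e[0], by_count_desc);
}
```

The intended use would be to call feed_text once on the text of a page with two zero-initialised tables, then sort_by_count on each table and read the first three entries of both. Because each word is counted on its own before it is joined to its predecessor, the longest-match problem does not arise here.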

Any suggestions are much appreciated.

Kind regards,

Hans Lodder

_______________________________________________
help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison
