What should I use to extract events (dates and times) from an HTML page? I looked at Tika, since it's a parsing project. Am I on the right track, or is there something better suited to this? Apache UIMA also seems to do something along these lines, but I'm not sure. Since a lot of these projects are associated with Lucene, I thought someone might know.
David Lee