RE: best html parser for html documents generated by microsoft products

2005-12-03 Thread Mark Benussi
I use JTidy also, but not for Lucene parsing. There is no easy way of handling this; you simply have to remove all the crappy Microsoft inserts as they come. -----Original Message----- From: Gaston [mailto:[EMAIL PROTECTED] Sent: 03 December 2005 13:49 To: java-user@lucene.apache.org Subject: best ht

RE: Compass 0.5 Released

2005-08-04 Thread Mark Benussi
Shay, do you have a web site that we can visit to discover more about your technology? -----Original Message----- From: Shay Banon [mailto:[EMAIL PROTECTED] Sent: 04 August 2005 22:34 To: java-user@lucene.apache.org Subject: Compass 0.5 Released We are pleased to announce the 0.5 major feature r