On 10.07.09 16:48, Jonas Eckerman wrote:
> Rosenbaum, Larry M. wrote:
>
>> I have found the Xpdf package [...] has a pdftotext command line utility.
>> If you build it with the "--without-x" option,
>
> Ah. I didn't see that option. That's nice. I'm now using pdftotext
> instead of pdftohtml here as well. :-)

I've been thinking about it. pdftohtml could provide interesting
information, such as colour information, that could lead to better spam
detection. Does anyone have experience with this?

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Eagles may soar, but weasels don't get sucked into jet engines.