RE: Using Lucene for technical documentation

2020-12-03 Thread Trevor Nicholls
each search pattern to decide which field to query against. When I get stuck I will be back! Cheers T

Re: Using Lucene for technical documentation

2020-11-23 Thread Erick Erickson
You might be able to get something “good enough” with one of the pattern tokenizers, see: https://lucene.apache.org/solr/guide/8_6/tokenizers.html. Won’t be 100% of course. And Paul’s comments are well taken, especially since your input will be inconsistent I’d guess. How much you want to bet t
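
For illustration, here is a minimal sketch of what a pattern-driven tokenizer does, written in plain Java with java.util.regex so that it runs without Lucene on the classpath. The class name PatternTokenSketch and the token pattern are hypothetical and are not taken from the thread. Lucene ships its own PatternTokenizer in org.apache.lucene.analysis.pattern, which would be the natural thing to try in a real analyzer chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: it only illustrates regex-driven tokenization.
class PatternTokenSketch {
    // Assumed token shape: an identifier, optionally chained with "." or "::"
    // (for example foo.bar, my_var, std::vector), or a plain run of digits.
    private static final Pattern TOKEN = Pattern.compile(
        "[A-Za-z_][A-Za-z0-9_]*(?:(?:\\.|::)[A-Za-z_][A-Za-z0-9_]*)*|[0-9]+");

    // Returns every match of TOKEN in order; anything between matches is dropped.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group()); // case is preserved, since identifiers are case-sensitive
        }
        return tokens;
    }
}
```

Calling PatternTokenSketch.tokenize on a line of mixed prose and code keeps a qualified name such as std::vector.push_back as one token instead of breaking it at the punctuation. As the post says, this will not be 100%: any construct the pattern does not anticipate is split or dropped.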

Re: Using Lucene for technical documentation

2020-11-23 Thread Paul Libbrecht
Hello Trevor, I don’t know of an analyzer for mixes of code and text but I know of an analyzer for mixes of code and formulæ. Clearly, you could build a custom analyzer that would tokenize differently depending on whether you’re in code or in text. That’s not super hard. However, where thin
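
As a rough sketch of tokenizing differently depending on the region, the plain-Java example below switches rules between prose and code. It assumes, purely for illustration, that code spans are delimited by backticks; the thread does not say how code is marked in the documents. The class name ModeSwitchingTokenSketch is hypothetical, and this is not a Lucene Analyzer, only the splitting logic one might put inside one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene. Assumption: backticks delimit code spans.
class ModeSwitchingTokenSketch {
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        // After splitting on the backtick, even-indexed pieces are prose, odd-indexed are code.
        String[] pieces = input.split("`", -1);
        for (int i = 0; i < pieces.length; i++) {
            boolean inCode = (i % 2 == 1);
            // Prose: break on anything that is not a letter or digit.
            // Code: break on whitespace only, so operators and punctuation survive.
            String splitter = inCode ? "\\s+" : "[^\\p{L}\\p{N}]+";
            for (String raw : pieces[i].split(splitter)) {
                if (raw.isEmpty()) {
                    continue; // split() can yield an empty leading piece
                }
                // Prose is lowercased; code keeps its case.
                tokens.add(inCode ? raw : raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }
}
```

With this sketch, prose words are normalized while a code fragment such as ptr->next stays intact as a single token. An unmatched backtick makes everything after it count as code, so inconsistent input would need extra handling.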

Using Lucene for technical documentation

2020-11-22 Thread Trevor Nicholls
Hello, I'd better begin by identifying myself as a newbie. I am investigating using Lucene as a search tool for a library of technical documents, much of which consists of pieces of source code and discussion of the content. The standard analyzer does an adequate job with normal text but st