I'm building six different indexes in series, at the end of building an
index I call optimize() and then close() the writer, then move onto the
next one.
I build them in series because they are extracting the data from a
database and I don't want to overload the database.
However the optimizatio
You can call IW.waitForMerges().
Mike
On Fri, Jan 28, 2011 at 4:16 AM, Paul Taylor wrote:
> I'm building six different indexes in series, at the end of building an
> index I call optimize() and then close() the writer, then move onto the next
> one.
> I build them in series because they are extr
Hi,
I'm poking in the dark and hope someone has some light...
We have part numbers in technical documentation to retrieve. For now we
have a (long) regular expression to find those in a string. The part
numbers have letters, digits and (redundant) whitespace. Furthermore
authors often used a
Hi Wulf,
can I ask, if it is structured documentation (like XML or SGML) you're
dealing with? It's because I also work with technical documentation and we
do exactly what you're asking for, but with XML data.
On Fri, Jan 28, 2011 at 1:05 PM, Wulf Berschin wrote:
> Hi,
>
> I'm poking in the d
Hi Karolina,
yes (of course!). We have an XML element for the part numbers, but up to
now they are not all tagged, so we need regex matching as well...
On 28.01.2011 13:31, Karolina Bernat wrote:
Hi Wulf,
can I ask, if it is structured documentation (like XML or SGML) you're
dealing with? I
oh, okay.. well for the XML part we use Apache Digester and define rules to
enclose the correct elements. But I can't tell what's the best way to
proceed in your case, sorry. The steps you listed here sound reasonable to
me.
If you want to get search hits for a part number range and highlight
'A12
I wonder if you can define the problem away? It sounds like
you have essentially random input here. That is, the users
can put in whatever they want so whatever you do will be wrong
sometime. Could you sidestep the problem with auto-complete
and prefix queries (essentially adding * to the user's in
Hello,
since I moved on from my offset-info problem in HTML files, I've got a new one:
trying to bring the tokens' position information together with the token/term
offset information. Can someone tell me how I can get a token if I know
its position? It would be nice to get the token's position from the
: Subject: How to index part numbers
: References: <4d428976.6010...@fastmail.fm>
: In-Reply-To: <4d428976.6010...@fastmail.fm>
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing
(11/01/25 2:14), Paul Taylor wrote:
On 22/01/2011 15:43, Koji Sekiguchi wrote:
(11/01/20 22:19), Paul Taylor wrote:
Trying to extend MappingCharFilter so that it only changes a token if the
length of the token matches the length of singleMatch in NormalizeCharMap
(currently the singleMatch ju
10 matches