Any reason why you don't use ReusableAnalyzerBase? If you don't want
it to be reusable, just return false from its reset method. Your code
will be cleaner, and both methods are final.

simon
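
To make the suggestion above concrete, here is a minimal, self-contained sketch of the pattern it relies on. It does not use Lucene's classes: SketchAnalyzer, Components, ReusingAnalyzer and NonReusingAnalyzer are hypothetical stand-ins, and strings and lists replace Readers and TokenStreams. The only ideas taken from the thread are that both entry points are final and that returning false from reset opts out of reuse. The sketch is single-threaded and ignores per-thread caching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a reusable analyzer base class; not Lucene's API. */
abstract class SketchAnalyzer {
    /** Holds the state that may be reused between calls. */
    static class Components {
        private String text;

        Components(String text) { this.text = text; }

        /** Returns true if this instance accepted the new input; false declines reuse. */
        boolean reset(String newText) {
            this.text = newText;
            return true;
        }

        /** Splits the current text on whitespace. */
        List<String> tokens() {
            String trimmed = text.trim();
            if (trimmed.isEmpty()) {
                return new ArrayList<>();
            }
            return Arrays.asList(trimmed.split("\\s+"));
        }
    }

    private Components cached;
    int created = 0; // how many times components were built, for illustration only

    /** The single extension point that subclasses implement. */
    protected abstract Components createComponents(String text);

    /** Final: always builds fresh components. */
    public final List<String> tokenStream(String text) {
        created++;
        return createComponents(text).tokens();
    }

    /** Final: reuses the cached components unless reset() declines. */
    public final List<String> reusableTokenStream(String text) {
        if (cached == null || !cached.reset(text)) {
            created++;
            cached = createComponents(text);
        }
        return cached.tokens();
    }
}

/** Accepts reuse: the default reset() returns true. */
final class ReusingAnalyzer extends SketchAnalyzer {
    @Override
    protected Components createComponents(String text) {
        return new Components(text);
    }
}

/** Opts out of reuse by returning false from reset(). */
final class NonReusingAnalyzer extends SketchAnalyzer {
    @Override
    protected Components createComponents(String text) {
        return new Components(text) {
            @Override
            boolean reset(String newText) { return false; }
        };
    }
}

class ReuseSketchDemo {
    public static void main(String[] args) {
        SketchAnalyzer reusing = new ReusingAnalyzer();
        reusing.reusableTokenStream("a b");
        reusing.reusableTokenStream("c d e");
        System.out.println("reusing built components " + reusing.created + " time(s)");

        SketchAnalyzer fresh = new NonReusingAnalyzer();
        fresh.reusableTokenStream("a b");
        fresh.reusableTokenStream("c d e");
        System.out.println("non-reusing built components " + fresh.created + " time(s)");
    }
}
```

Because subclasses only implement createComponents, they cannot override one entry point and forget the other, which is the mismatch discussed further down the thread.
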

On Tue, Oct 19, 2010 at 7:05 PM, Shai Erera <ser...@gmail.com> wrote:
> Thanks Uwe - that's what I was aiming for. We let the external analyzers
> make sure they are safe by themselves, while ensuring Lucene/Solr ones are
> good. +1 from me to commit this :).
>
> Shai
>
> On Tue, Oct 19, 2010 at 7:03 PM, Uwe Schindler <u...@thetaphi.de> wrote:
>>
>> About the whole assertion (as it also affects TokenStreams): we want to
>> make sure that all Lucene/Solr TokenStreams and Analyzers are final or have
>> final implementations (even when we remove the reusable method).
>>
>>
>>
>> The idea is to hit this assert only for classes from the
>> org.apache.lucene package prefix! So we can test Lucene code, but all
>> other subclasses are simply ignored. The method assertFinal can do this for
>> us:
>>
>>
>>
>> Index: Analyzer.java
>> ===================================================================
>> --- Analyzer.java   (revision 1023877)
>> +++ Analyzer.java   (working copy)
>> @@ -48,6 +48,8 @@
>>    private boolean assertFinal() {
>>      try {
>>        final Class<?> clazz = getClass();
>> +      if (!clazz.getName().startsWith("org.apache.lucene."))
>> +        return true;
>>        assert clazz.isAnonymousClass() ||
>>          (clazz.getModifiers() & (Modifier.FINAL | Modifier.PRIVATE)) != 0 ||
>>          (
>>
>>
>>
>> Same for TokenStream. This is not a performance problem, as assertFinal is
>> only called when asserts are enabled (the trick is "assert assertFinal();"
>> in the ctor).
>>
>>
>>
>> Uwe
>>
>>
>>
>> -----
>>
>> Uwe Schindler
>>
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>
>> http://www.thetaphi.de
>>
>> eMail: u...@thetaphi.de
>>
>>
>>
>> From: Uwe Schindler [mailto:u...@thetaphi.de]
>> Sent: Tuesday, October 19, 2010 6:18 PM
>>
>> To: dev@lucene.apache.org
>> Subject: RE: Analyzer forcing tokenStream and reusableTokenStream to be
>> final
>>
>>
>>
>> By the way, the same tests are done for TokenStream subclasses (whose
>> impls must be final in all cases – it’s defined as a decorator pattern, so we
>> enforce it). And: you don’t need to make the class itself final; it’s enough
>> to make both methods final.
>>
>>
>>
>>
>> From: Shai Erera [mailto:ser...@gmail.com]
>> Sent: Tuesday, October 19, 2010 6:06 PM
>> To: dev@lucene.apache.org
>> Subject: Re: Analyzer forcing tokenStream and reusableTokenStream to be
>> final
>>
>>
>>
>> I guess you didn't read my email all the way through - I cannot disable
>> assertions for Lucene stuff because I use Lucene's assertions to assert that
>> my indexing code works :).
>>
>> Shai
>>
>> On Tue, Oct 19, 2010 at 5:59 PM, Uwe Schindler <u...@thetaphi.de> wrote:
>>
>> We simply added that to *test* the bundled analyzers for conformance. If
>> you don’t like that, you can simply disable assertions for the
>> org.apache.lucene package.
>>
>>
>>
>>
>> From: Shai Erera [mailto:ser...@gmail.com]
>> Sent: Tuesday, October 19, 2010 5:53 PM
>> To: dev@lucene.apache.org
>> Subject: Re: Analyzer forcing tokenStream and reusableTokenStream to be
>> final
>>
>>
>>
>> I still don't understand how not declaring my tokenStream and
>> reusableTokenStream final can break anything. The methods are there (in my
>> analyzers), and if I risk overriding them somewhere else it's my problem.
>>
>> What am I missing?
>>
>> To add to your email - I haven't yet encountered an analyzer that cannot
>> be reused, either.
>>
>> Shai
>>
>> On Tue, Oct 19, 2010 at 5:45 PM, Robert Muir <rcm...@gmail.com> wrote:
>>
>> On Tue, Oct 19, 2010 at 11:21 AM, Robert Muir <rcm...@gmail.com> wrote:
>> > If someone doesn't override both (e.g. they just override
>> > tokenStream), then it wouldn't actually use their subclass's code. So
>> > then the reflection hack from LUCENE-1678 would force the analyzer to
>> > never re-use, but instead call tokenStream: but this is very bad for
>> > indexing performance!
>> >
>>
>> Here's a JIRA issue with an example of how the
>> tokenStream/reusableTokenStream confusion makes this a real problem in
>> practice:
>>
>> https://issues.apache.org/jira/browse/LUCENE-2279
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>>
>>
>
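
For readers following the assertFinal discussion quoted above, here is a small, self-contained sketch of the "check inside an assert" trick combined with the package-prefix filter. It is an illustration under assumptions, not Lucene's code: FinalityCheck, CheckedBase, GoodImpl and SloppyImpl are hypothetical names, and the prefix is passed as a parameter so the sketch can run outside the org.apache.lucene package. It covers only the class-level modifiers; the method-level part of the original condition is cut off in the quoted patch and is not reproduced here.

```java
import java.lang.reflect.Modifier;

/** Hypothetical helper; mirrors only the class-level part of the quoted check. */
final class FinalityCheck {
    private FinalityCheck() {}

    /**
     * Returns true when clazz lies outside checkedPrefix (so it is ignored),
     * or when it is anonymous, final, or private.
     */
    static boolean conforms(Class<?> clazz, String checkedPrefix) {
        if (!clazz.getName().startsWith(checkedPrefix)) {
            return true; // not one of "our" classes: skip the check
        }
        return clazz.isAnonymousClass()
            || (clazz.getModifiers() & (Modifier.FINAL | Modifier.PRIVATE)) != 0;
    }
}

/** Hypothetical base class that runs the check from its constructor. */
abstract class CheckedBase {
    protected CheckedBase(String checkedPrefix) {
        // The expression is evaluated only when assertions are enabled (java -ea),
        // so the reflection call is skipped in a normal production run.
        assert FinalityCheck.conforms(getClass(), checkedPrefix)
            : getClass().getName() + " should be final";
    }
}

/** Final, so it passes the check. */
final class GoodImpl extends CheckedBase {
    GoodImpl() { super(""); } // empty prefix: every class name matches
}

/** Not final, so constructing it with -ea would trip the assert. */
class SloppyImpl extends CheckedBase {
    SloppyImpl() { super(""); }
}

class FinalityCheckDemo {
    public static void main(String[] args) {
        System.out.println(FinalityCheck.conforms(GoodImpl.class, ""));
        System.out.println(FinalityCheck.conforms(SloppyImpl.class, ""));
        System.out.println(FinalityCheck.conforms(SloppyImpl.class, "org.apache.lucene."));
    }
}
```

The last call shows the effect of the prefix filter: a non-final class outside the checked prefix is accepted, so external subclasses are left to police themselves while bundled ones are still tested.
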
