Re: AW: Boolean OR does not work as described

2012-01-03 Thread Michael-O

Anna Hunecke schrieb:

Hi Mike,

I think the problem is grouping. If you have a query A AND B OR C, it will be 
grouped as A AND (B OR C) and not as you expected as (A AND B) OR C.
Just put parentheses in your query and you get the result that you want.


Hi Anna,

This is it but the documentation is incomplete in that case or simply 
erroneous. I expect boolean clauses to be implemented correctly or 
marked as non-comparable. I will file a ticket about that.


In my case this is not trivial. I have a meta query language which suits 
both SQL and Lucene. Our users can search either the index or an Oracle 
database in a completely transparent way. I have to change the 
parser/transformer for that.


Thanks,

Mike

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: Boolean OR does not work as described

2012-01-03 Thread Michael-O

Chris Hostetter schrieb:
: if you want to mix and/or in one query, always use parenthesis. The 


or better yet, train yourself not to use AND, OR and NOT...

http://www.lucidimagination.com/blog/2011/12/28/why-not-and-or-and-not/


Thanks for the blog entry. I will read through that!

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: Boolean OR does not work as described

2012-01-03 Thread Michael-O

Hi Uwe,

Uwe Schindler schrieb:

Hi Mike,

if you want to mix and/or in one query, always use parenthesis. The
operator precedence is strange with the default query parser. In
contrib there is another one (called PrecedenceQueryParser) that can
handle this but is incompatible with existing queries. The parser in
contrib on the other hand also has other problems, which are all
caused by the way how Lucene handles BooleanQuery internally.


I haven't noticed the PrecendenceQP. I never expected the default parser 
being erroneous.



If you print the parsed query using Query.toString() you can see the
"normalized" query in standard Lucene syntax. Lucene does not know
AND/OR, it only knows query clauses that are SHOULD (=optional), MUST
or MUST_NOT. The query "vorgklasse:au AND schluessel:13? OR
schluessel:23?" will be parsed like:

MUST: vorgklasse:au MUST: schluessel:13? SHOULD:
schluessel:23?

Which would be in standard syntax: "+vorgklasse:au
+schluessel:13? schluessel:23?" (first two are MUST clauses,
third is should). If you read this query using MUST/SHOULD you will
understand what happens: This means the first 2 terms MUST be in the
result, the third MIGHT be in the result. So it's not an union, it's
something more complex :-)


Yes, this is what I have expected as transformed internal query. I 
really differs from what I have expected. I simply performs a first-wins 
strategy.


Danke,

schönen Gruß nach Bremen!


-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



How to correctly use SearcherManager#close?

2012-11-01 Thread Michael-O
Hi folks,

while I do understand the workflow with this class, I do not understand when to 
call close(). The JavaDoc is not crystal clear on that. Am I supposed to call 
this method after release() or when my webapp is destroyed?

Thanks,

Michael

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: How to correctly use SearcherManager#close?

2012-11-01 Thread Michael-O
Thanks for the quick response. Any chance this could be clearer in the JavaDoc 
of this class?

> Call it when you know you'll no longer need to call .acquire on it
> anymore (typically this would be when your webapp is being destroyed).
> 
> Really all this does is drop its internal reference to the current
> searcher.
> 
> Any in-flight searches still outstanding will run fine (and they
> should call .release, like normal, once they are done, even though
> you've already called .close on the SearcherManager)...
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> On Thu, Nov 1, 2012 at 5:27 PM, Michael-O <1983-01...@gmx.net> wrote:
> > Hi folks,
> >
> > while I do understand the workflow with this class, I do not understand
> when to call close(). The JavaDoc is not crystal clear on that. Am I
> supposed to call this method after release() or when my webapp is destroyed?
> >
> > Thanks,
> >
> > Michael
> >
> > -
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >
> 
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
> 

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: How to correctly use SearcherManager#close?

2012-11-02 Thread Michael-O
> That would be great!
> 
> Hmm which version are you looking at?  In 4.0 it currently says this:
> 
>   /**
>* Close this ReferenceManager to future {@link #acquire() acquiring}.
> Any
>* references that were previously {@link #acquire() acquired} won't be
>* affected, and they should still be {@link #release released} when
> they are
>* not needed anymore.
>*/
> 
> How would you improve it?

Hi Mike,

I am on 3.6.1. The first sentence is not really clear. Maybe to native speakers 
but not to me. It does not clearly say: Close the Mananger when your app shuts 
down or when you want to dispose that index.

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: How to correctly use SearcherManager#close?

2012-11-02 Thread Michael-O
> hey michael,
> 
> On Thu, Nov 1, 2012 at 11:30 PM, Michael-O <1983-01...@gmx.net> wrote:
> > Thanks for the quick response. Any chance this could be clearer in the
> JavaDoc of this class?
> 
> sure thing, do you wanna open an issues / create a patch I am happy to
> commit it.

Thanks Simon,

Issue created: https://issues.apache.org/jira/browse/LUCENE-4525

Grüße nach Kreuzberg!

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Excessive use of IOException without proper documentation

2012-11-02 Thread Michael-O
Hi,

why does virtually every method (exaggerating) throw an IOE? I know there might 
be a failure in the underlying IO (corrupt files, passing checked exc up, etc) 
but

1. Almost none of the has a JavaDoc on it
2. Throwing an IOE from most of the methods doesn't really help. You cannot 
create separate catch blocks. You have to rip your code apart in tens of try 
catch blocks. 

Here is a simple example: Both IndexSearcher#search and IndexSearcher#doc throw 
an IOE with any further documentation. I don't have the chance to detect where 
the exception has happened nor I can pass something reasonable back to the 
user. And both methods are keypoints in Lucene.

E.g., there are exceptions thrown in IndexSearcher without any message. Simply 
empty stubs.

Lucene 4.0 received a complete API overhaul. Why was no action taken to clean 
up exception management?

Mike

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: Excessive use of IOException without proper documentation

2012-11-04 Thread Michael-O

Hi Simon,

there are generally two very good resources how exceptions should be 
handled in Java. I will quote both:


1. Oracle's JavaDoc Style Guide: 
http://www.oracle.com/technetwork/java/javase/documentation/index-137868.html#throwstag
2. Joshua Bloch's book Effective Java, chapter 9: 
http://my.safaribooksonline.com/book/programming/java/9780137150021


Am 2012-11-02 22:27, schrieb Simon Willnauer:

Hey,

On Fri, Nov 2, 2012 at 2:20 PM, Michael-O <1983-01...@gmx.net>
wrote:

Hi,

why does virtually every method (exaggerating) throw an IOE? I know
there might be a failure in the underlying IO (corrupt files,
passing checked exc up, etc) but

1. Almost none of the has a JavaDoc on it


what should the javadoc say? I mean it would repeat itself all the
time no?


At least one would know why this one is thrown. See source one.


2. Throwing an IOE from most of the methods doesn't really help.
You cannot create separate catch blocks. You have to rip your code
apart in tens of try catch blocks.

Here is a simple example: Both IndexSearcher#search and
IndexSearcher#doc throw an IOE with any further documentation. I
don't have the chance to detect where the exception has happened
nor I can pass something reasonable back to the user. And both
methods are keypoints in Lucene.


can you elaborate what you would change to make this easier
digestible for you? I mean specialized exceptions class would be a
mess here really.


Not only for me but for everyone.

Generally, good exception handling should setup a simple exception 
hierarchy, not more than three levels. Two base classes for checked and 
unchecked exceptions. A main exception for a logical group like 
searching, retreival, query checking, etc.


LuceneException
|-- SearchException
|-- RetrievalException
|-- ...

One should really wrap IOE in something like 'new 
SearchException("Failed to read the index", ioe)'. I would at least know 
that this one came from the index searcher.


Really advisable are items 61, 62 and especially 63.


E.g., there are exceptions thrown in IndexSearcher without any
message. Simply empty stubs.


I agree we should fix them. I already committed some fixes to
IndexSearcher. Can you come up with a patch that fixes some more in
core?

running grep -R "new.*Exception()" * | wc -l

yields 82 in core so there is room for improvement.



Lucene 4.0 received a complete API overhaul. Why was no action
taken to clean up exception management?


I'd really like to hear what you have in mind. can you elaborate?


Continuing my answer from above. Have you ever worked with the Spring 
Framework? They apply a very nice exception translation pattern. All 
internal exceptions are turned to specialized unchecked exceptions like

AuthenticationExceptoin, DaoAccessException, etc.

Lucene has already good exceptions like CorruptIndex, IndexNotFound, 
LockObtainFailed, etc.


Of course, such a step would break backwards-compat but this is possible 
for 5.0 and this would require a general consensus of the devs improving 
this issue.


I think that such a great framework can do better to inform programmers 
and users what has really failed especially if you design a public API.
We have succeeded exceptional results with the Lucene framework. I am 
someone who likes giving help back to OSS projects.


Hopefully, this is enough input for this message. We could discuss this 
further is there is a real interest from the devs.


Mike

-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



Re: Excessive use of IOException without proper documentation

2012-11-04 Thread Michael-O

Shalom Shai,

Am 2012-11-04 20:25, schrieb Shai Erera:

Hey Mike,

I'm not sure that I like the idea of throwing LuceneException or
SearchException everywhere. I've been there (long time ago) and I always
hated it.


Not everywhere but in the top-level API would be sufficient. Some code 
is never used/should be directly by the client. What is the reason for 
why you dislike such an approach?



First, what's the difference between 'new SearchException("Failed to read
the index", ioe)' and 'new IOException("Failed to read the index",
anotherIOE/anyE)'? Both don't give you any real way of detecting the
problem in your code, maybe only in the logs.

Perhaps you would like us to throw better IOE, i.e. rather than re-throwing
low-level IO exceptions, wrap them w/ another IOE w/ a detailed message.
That I can support, but do you have a concrete case where the exception
that was thrown was not descriptive enough?


There is a tremendous difference. I have possibility at compile time to 
detect whether an IOE came from Lucene or some other IO related action.


Consider this code:

try {
File mike = new File(...);
Reader r = new InputStreamReader(new FIS(mike)):
searcher.search(...);
}
catch (IOE e) {
}

Now, how am I supposed to figure out at compile time where this IOE came 
from? I cannot neither react properly nor present a suitable user 
response unless I do create for every IOE piece a separate try-catch-block.
This is just bloat. I cannot even log correctly because I do not know 
what the source of the IOE was.


What am I excepted to do when every method throws just new Exception()?
Absolutely nothing they would all look alike at compile time.


For instance, there's that "read past EOF" exception which is annoying b/c
you can tell by the stacktrace where it happened, but not necessarily why
it happened. However, these are fairly low in our call stack, and I doubt
that even Lucene code itself can give meaningful messages to all low-level
IOEs. Yet, the exception does tell you that's usually an index corruption
case, or some other bug ... when Lucene detects the index is corrupt, it
throws the explicit CorruptIndexEx, otherwise, this must be inferred from
the stacktrace. Not perfect, but throwing IndexingException won't improve
it either.


It would cover at least the case described above.


So I'm +1 for making sure our exceptions are as informative as they can
get. I'm -1 for starting to throw LuceneException all over the place - to
me it's like IOE. Worse, if we start throwing LuceneException, people might
be tempted to wrap any Ex w/ LuceneEx, even ArrayIndexOutOfBound etc. I
don't want that.


That should not be the case. IndexOutOfBoundsException is an unchecked 
exception.
There is a huge difference to IOE which is a checked exception. Joshua 
Bloch exactly describes the case of the IOOBE in item 58. IIOOBE must 
remain as unchecked because it indicates a violation of the API 
contract. E.g., a corrupt index is recoverable and not a programming 
error. This can happen through various reasons you cannot control.



If you've hit exceptions which could use better messages, I prefer that we
concentrate on them, rather than complicating the exceptions throwing
mechanism.


I definitively will. As I always do with OSS.


On Sun, Nov 4, 2012 at 7:25 PM, Michael-O <1983-01...@gmx.net> wrote:


Hi Simon,

there are generally two very good resources how exceptions should be
handled in Java. I will quote both:

1. Oracle's JavaDoc Style Guide: http://www.oracle.com/**
technetwork/java/javase/**documentation/index-137868.**html#throwstag<http://www.oracle.com/technetwork/java/javase/documentation/index-137868.html#throwstag>
2. Joshua Bloch's book Effective Java, chapter 9:
http://my.safaribooksonline.**com/book/programming/java/**9780137150021<http://my.safaribooksonline.com/book/programming/java/9780137150021>

Am 2012-11-02 22:27, schrieb Simon Willnauer:

  Hey,


On Fri, Nov 2, 2012 at 2:20 PM, Michael-O <1983-01...@gmx.net>
wrote:


Hi,

why does virtually every method (exaggerating) throw an IOE? I know
there might be a failure in the underlying IO (corrupt files,
passing checked exc up, etc) but

1. Almost none of the has a JavaDoc on it



what should the javadoc say? I mean it would repeat itself all the
time no?



At least one would know why this one is thrown. See source one.


  2. Throwing an IOE from most of the methods doesn't really help.

You cannot create separate catch blocks. You have to rip your code
apart in tens of try catch blocks.

Here is a simple example: Both IndexSearcher#search and
IndexSearcher#doc throw an IOE with any further documentation. I
don't have the chance to detect where the exception has happened
nor I can pass something reasonable back to the user. And both
methods are keypoints in Lucene.



can you elaborate