Re: Advise for Mediabase with Lucene

2008-10-02 Thread Mathias P.W Nilsson
I don't know if this is going to work. Let's say I have a root folder that is the start point for a client. The only thing I have in the database is the startPoint. When traversing the child folders, I want to check whether a folder has changed since the last time. Can I store this in a Lucene index, a

Re: Advise for Mediabase with Lucene

2008-10-02 Thread Erick Erickson
I'm not a particular fan of Hibernate, but that may just reflect my unfamiliarity. The real question is should you have two separate systems that you tie together, Lucene as a search engine AND a database for "other stuff". My *strong* preference is to keep the number of moving parts to a minimum

Re: Calculation of fieldNorm causes irritating effect of sort order

2008-10-02 Thread Chris Hostetter
If I'm reading your message correctly, you (and everyone who has replied so far) have gotten caught in a red herring. While an "explain" on the results from your queryB will most likely show you that the fieldNorm is the main differentiator in score between document-153 and document-244 that'

Re: Advise for Mediabase with Lucene

2008-10-02 Thread Mathias P.W Nilsson
Oh, I forgot. Would you save the documents as an index on the file system or use Hibernate Search with Lucene? -- View this message in context: http://www.nabble.com/Advise-for-Mediabase-with-Lucene-tp19787867p19789551.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. -

Re: Advise for Mediabase with Lucene

2008-10-02 Thread Mathias P.W Nilsson
Thanks Erick! I've just bought the book Lucene in Action and I will see where that leads me. I'm aware that Lucene doesn't do the other magic ;) just what it is made for: indexing and searching.

Extracting Dates

2008-10-02 Thread David Lee
What should I use if I want to try to extract events (dates/times) out of an HTML page? I looked at Tika since it's a parsing project. Am I on the right track or is there something better to use? It also seems like Apache UIMA is kind of doing that, but I'm not sure. I thought since a lot of these

Re: Advise for Mediabase with Lucene

2008-10-02 Thread Erick Erickson
Well, that depends (tm). What do you expect Lucene to do for you? Lucene would be a fine tool for creating a keywords index, searching it, etc. Really, the rest of the stuff in step <2>. Conceptually, you'd store each file as a document, with several fields. Say filepath text user rights then mix-

Advise for Mediabase with Lucene

2008-10-02 Thread Mathias P.W Nilsson
Hi! I'm currently developing a mediabase for 20-100 customers. A customer can upload a file or folder via FTP, and a file grabber searches the file system and adds the new file to a MySQL database. It also creates thumbnails, adds search words etc. Now, this mediabase is pretty old and is developed

Re: Lucene vs. Database

2008-10-02 Thread Petite Abeille
On Oct 2, 2008, at 9:41 AM, agatone wrote: Now I have to go detailed into every one of them and write down stuff. Couple of handy guidelines: http://www.w3.org/DesignIssues/Principles.html E.g. "Principle of Least Power" :) Cheers, -- PA. http://alt.textdrive.com/nanoki/

Concurrent search

2008-10-02 Thread Carmelo Saffioti
Hi everybody, I'm trying to use the Lucene index in a web application with many concurrent users. How can I share the Lucene index among many concurrent users? Some people said that a shared index is accessible by only one user at a time... Thank you Carmelo

RE: Calculation of fieldNorm causes irritating effect of sort order

2008-10-02 Thread Jimi Hullegård
Karl wrote: > > On 2 Oct 2008, at 14:47, Jimi Hullegård wrote: > > > But apparently this setOmitNorms(true) also disables boosting > > as well. That is ok for now, but what if we want to use boosting in > > the future? Is there no way to disable the length normalization > > while still keeping the boost

Re: Calculation of fieldNorm causes irritating effect of sort order

2008-10-02 Thread Karl Wettin
On 2 Oct 2008, at 14:47, Jimi Hullegård wrote: But apparently this setOmitNorms(true) also disables boosting as well. That is ok for now, but what if we want to use boosting in the future? Is there no way to disable the length normalization while still keeping the boost calculation? You can m

RE: Calculation of fieldNorm causes irritating effect of sort order

2008-10-02 Thread Jimi Hullegård
Erick wrote: > > Another possibility (and I'm not sure it'll work, but what the heck) would be to create a Filter for active ideas. So rather than add a "category:14" clause, you create a Category14Filter that you send to the query along with your +type:idea +alltext:betyg clauses.

Re: Calculation of fieldNorm causes irritating effect of sort order

2008-10-02 Thread Erick Erickson
Another possibility (and I'm not sure it'll work, but what the heck) would be to create a Filter for active ideas. So rather than add a "category:14" clause, you create a Category14Filter that you send to the query along with your +type:idea +alltext:betyg clauses. Now, category won't be considered
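To illustrate the distinction Erick draws here, a minimal sketch in plain Java with no Lucene dependency. The class FilterSketch, its Doc type, and all scores are hypothetical and only model the idea: the filter decides which documents are eligible, while the ranking uses the score from the text clauses alone. It is not Lucene's Filter API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model only: shows that filtering changes membership, not scores.
final class FilterSketch {

    // Hypothetical document: an id, a category, and the score
    // produced by the text clauses of the query.
    static final class Doc {
        final int id;
        final int category;
        final float textScore;

        Doc(int id, int category, float textScore) {
            this.id = id;
            this.category = category;
            this.textScore = textScore;
        }
    }

    // Returns the ids of documents that pass the filter,
    // ordered by text score, highest first.
    static List<Integer> searchWithFilter(List<Doc> docs, Predicate<Doc> filter) {
        List<Doc> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (filter.test(d)) {
                hits.add(d);
            }
        }
        // The category never enters the comparison.
        hits.sort((a, b) -> Float.compare(b.textScore, a.textScore));

        List<Integer> ids = new ArrayList<>();
        for (Doc d : hits) {
            ids.add(d.id);
        }
        return ids;
    }
}
```

Calling searchWithFilter(docs, d -> d.category == 14) keeps only category-14 documents and ranks them purely by their text score, which is the effect the Category14Filter suggestion is after.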

RE: Calculation of fieldNorm causes irritating effect of sort order

2008-10-02 Thread Jimi Hullegård
Erik wrote: > > On Oct 2, 2008, at 7:39 AM, Jimi Hullegård wrote: > > Is it possible to disable the lengthNorm calculation for particular > > fields? > > Yes, use Field#setOmitNorms(true) when indexing. Ok, thanks. I will just have to look into how best to do this (since the CMS is handling

Re: Calculation of fieldNorm causes irritating effect of sort order

2008-10-02 Thread Erik Hatcher
On Oct 2, 2008, at 7:39 AM, Jimi Hullegård wrote: Is it possible to disable the lengthNorm calculation for particular fields? Yes, use Field#setOmitNorms(true) when indexing. Erik
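For readers wondering what omitting norms changes, here is a rough sketch in plain Java with no Lucene dependency. It assumes the classic default length normalization of 1/sqrt(numTerms) and a norm formed as boost times lengthNorm; the class and method names are hypothetical, and the lossy single-byte encoding Lucene applies to stored norms is left out.

```java
// Illustrative only: models the effect of length normalization on a field norm.
final class LengthNormSketch {

    // Assumed default formula: shorter fields get a larger norm.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // With norms omitted, the norm is treated as 1.0, so both the
    // length normalization and the index-time boost drop out.
    static float fieldNorm(float boost, int numTerms, boolean omitNorms) {
        return omitNorms ? 1.0f : boost * lengthNorm(numTerms);
    }
}
```

Under these assumptions a 4-term field gets a norm of 0.5 and a 16-term field gets 0.25, so the shorter field scores higher for the same match. With omitNorms set, both get 1.0 regardless of boost, which matches the observation elsewhere in this thread that setOmitNorms(true) also disables boosting.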

Calculation of fieldNorm causes irritating effect of sort order

2008-10-02 Thread Jimi Hullegård
Hi, Maybe I have misunderstood the general concept of how search results should be scored with regard to the fieldNorm, but the way I see it, it causes an irritating effect on the sort order for me. Here's the deal: I'm building a simple site with documents that represent ideas. Each idea can

Fwd: CFP open for ApacheCon Europe 2009

2008-10-02 Thread Erik Hatcher
Begin forwarded message: From: Noirin Shirley <[EMAIL PROTECTED]> Date: October 2, 2008 4:22:06 AM EDT To: [EMAIL PROTECTED] Subject: CFP open for ApacheCon Europe 2009 Reply-To: [EMAIL PROTECTED] PMCs: Please send this on to your users@ lists! If you only have th

Re: Lucene vs. Database

2008-10-02 Thread agatone
Thank you all for your replies. I didn't expect so many of them - all appreciated. Now I have to go detailed into every one of them and write down stuff. Thank you again.