London Open Source Search meetup - Mon 15th June

2009-05-28 Thread Richard Marr
Hi all, I'm organising another open source search social evening (OSSSE?) in London on Monday the 15th of June, this time with the able assistance of René Kriegler. The plan is to get together and chat about search technology, from Lucene to Solr, Hadoop, Mahout, Ferret and the like. We are plann

Re: no results for query with special characters

2009-05-28 Thread Laura Hollink
Thanks Ian! You were right, the problem was simply the InputStreamReader. (Sorry this reply is a bit off-topic now, I didn't have time to work on this specific problem earlier). Laura On Apr 27, 2009, at 3:24 PM, Ian Lea wrote: Hi The problem may well lie with the reading of the queries
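
The excerpt does not say what was wrong with the InputStreamReader. One frequent cause of lost special characters, assumed here, is constructing the reader without an explicit charset, so that it falls back to the default charset. A minimal sketch of reading queries with the encoding stated explicitly follows; the class and method names are hypothetical, and UTF-8 is an assumption about the query file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: reads one query per line.
final class QueryReader {
    static List<String> readQueries(InputStream in) {
        List<String> queries = new ArrayList<>();
        // Name the charset explicitly (UTF-8 is assumed here) instead of
        // relying on the default charset of the running JVM.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String query = line.trim();
                if (!query.isEmpty()) {
                    queries.add(query); // skip blank lines
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return queries;
    }
}
```

The same charset has to match whatever encoding the query file was actually saved in; if it was written as ISO-8859-1, that is the value to pass instead.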

Re: Help Needed...

2009-05-28 Thread Karl Wettin
On 28 May 2009 at 12:22, Gaurav Kumar wrote: Hi everyone, I am doing a project using Lucene where I need to index HTML files. I am using Tika to parse HTML files. But I need to index files according to their tags, which means that the text inside each different HTML tag (like ) should be s

Re: Help Needed...

2009-05-28 Thread Anshum
Indexing and storing are at the developer's discretion. You may choose to store or not store a field, as your requirements dictate. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw On Thu, Ma

Re: Help Needed...

2009-05-28 Thread Alexander Aristov
You will need to develop a parser and an indexer. But remember that in the current implementation the content is not stored in the Lucene index: it is indexed, but not stored. Best Regards Alexander Aristov 2009/5/28 Gaurav Kumar > Hi everyone, > > I am doing a project using Lucene where i need to index HTML

Re: Help Needed...

2009-05-28 Thread Paul Libbrecht
Kumar, you'll have to build your own documents after parsing the HTML yourself (e.g. with NekoHTML to a DOM). As for the weights of tokens, in addition to IDF, you can set them per field, i.e. when you add a field to the document. paul On 28 May 2009 at 12:22, Gaurav Kumar wrote:
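
The parse-to-DOM step suggested in this reply can be sketched roughly as below. This is an illustration under stated assumptions, not the poster's code: it uses the JDK's own XML parser as a stand-in for NekoHTML, so it only accepts well-formed XHTML, and the class name `TagTextCollector` is hypothetical. It groups text by the name of the enclosing element; each map entry could then become a separate field of the Lucene document, which is left out here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects text per tag name from well-formed XHTML.
final class TagTextCollector {
    static Map<String, List<String>> collect(String xhtml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            Map<String, List<String>> byTag = new LinkedHashMap<>();
            walk(doc.getDocumentElement(), byTag);
            return byTag;
        } catch (Exception e) {
            // Assumption: real-world HTML needs a lenient parser instead.
            throw new IllegalArgumentException("input is not well-formed XHTML", e);
        }
    }

    // Depth-first walk; each text node is filed under its parent element's name.
    private static void walk(Node element, Map<String, List<String>> byTag) {
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                String text = child.getNodeValue().trim();
                if (!text.isEmpty()) {
                    String tag = element.getNodeName().toLowerCase(Locale.ROOT);
                    byTag.computeIfAbsent(tag, k -> new ArrayList<>()).add(text);
                }
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                walk(child, byTag);
            }
        }
    }
}
```

Keeping the per-tag text in a plain map separates parsing from indexing, so the per-field weighting mentioned in the reply can be decided at the point where fields are added to the document.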

Help Needed...

2009-05-28 Thread Gaurav Kumar
Hi everyone, I am doing a project using Lucene where I need to index HTML files. I am using Tika to parse HTML files. But I need to index files according to their tags, which means that the text inside each different HTML tag (like ) should be stored in different fields. Can I do that? If yes, how

Re: highlighting searched results in document

2009-05-28 Thread Ritu choudhary
No, I am not indexing the HTML tags here. I just want to highlight the searched word in the HTML or XML file (the file from which the index was created). Can't I trace that? Is there any function to trace the positions of the term stored in the Lucene index to find where it actually is in the file? Can o
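
One way to get this effect without reading positions back from the index is to re-scan the original text for the searched word once the matching document is known. The sketch below is plain Java and makes no Lucene calls; the class name, the marker strings, and the whole-word, case-insensitive matching rule are all assumptions for illustration, and it does not account for any stemming or other analysis applied at index time.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: wraps each whole-word occurrence of a term in markers.
final class SimpleHighlighter {
    static String highlight(String text, String term, String open, String close) {
        if (term.isEmpty()) {
            return text; // nothing to highlight
        }
        // Quote the term so regex metacharacters in it are matched literally.
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(term) + "\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        Matcher matcher = pattern.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            // Copy the untouched text, then the match wrapped in markers.
            out.append(text, last, matcher.start())
               .append(open).append(matcher.group()).append(close);
            last = matcher.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }
}
```

For example, `SimpleHighlighter.highlight(text, "lucene", "<b>", "</b>")` marks every standalone occurrence of the word while leaving the rest of the text unchanged. Applied to raw HTML or XML it would also match inside tag names and attributes, so it is safer to run it on the extracted text.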

Re: highlighting searched results in document

2009-05-28 Thread KK
As far as I know, you extract the text out of HTML pages; I don't think you want to index the tags as well, right? So what gets indexed by Lucene is just the text, and what you get as a search result is what you've indexed. I'm repeating myself: once you have the search result, it's up to you to do what you wan

Re: highlighting searched results in document

2009-05-28 Thread Ritu choudhary
Is this possible through Lucene, or has anybody tried such a thing? On 28/05/2009, Ritu choudhary wrote: > well friend let me explain the whole thing to you then: > > i created lucene index out of some .xml and .html files and i also > checked this index through luke and its pretty alright till here

Re: highlighting searched results in document

2009-05-28 Thread Ritu choudhary
Well, friend, let me explain the whole thing to you then: I created a Lucene index out of some .xml and .html files, and I also checked this index through Luke, and it's all fine up to here. I searched for the terms and can find them too, but how do I use this result? I want to open the document ,the