Have you considered indexing chapters as documents? Using your example you would have three documents corresponding to your three chapters: A, B, and D. Once you have that structure, the query "pain AND head" returns only chapters A and B. Using the chapter names gained from this new chapter index, you could then run a second query against your existing file-level index: "(pain OR head) AND (chapter:A OR chapter:B)", which returns exactly the four files you expect.
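To make that concrete, here is a minimal, untested sketch of the two-pass search against the 3.x API you are on. It assumes a second IndexSearcher over the chapter-level index, that both indexes use the "contents" and "title" fields from your indexing code (with "title" holding the chapter name), and arbitrary result cutoffs of 1000:

{code}
import java.util.HashSet;
import java.util.Set;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;

public class TwoPassChapterSearch {

    public static TopDocs search(IndexSearcher chapterSearcher,
                                 IndexSearcher fileSearcher,
                                 String word1, String word2) throws Exception {
        // Pass 1: chapters whose text contains BOTH words.
        BooleanQuery chapterQuery = new BooleanQuery();
        chapterQuery.add(new TermQuery(new Term("contents", word1)), Occur.MUST);
        chapterQuery.add(new TermQuery(new Term("contents", word2)), Occur.MUST);

        Set<String> chapters = new HashSet<String>();
        TopDocs chapterHits = chapterSearcher.search(chapterQuery, 1000);
        for (ScoreDoc sd : chapterHits.scoreDocs) {
            // only a handful of chapter documents, so doc() is cheap here
            Document d = chapterSearcher.doc(sd.doc);
            chapters.add(d.get("title"));
        }

        // Pass 2: files containing EITHER word, restricted to those chapters,
        // i.e. (word1 OR word2) AND (title:A OR title:B OR ...)
        BooleanQuery words = new BooleanQuery();
        words.add(new TermQuery(new Term("contents", word1)), Occur.SHOULD);
        words.add(new TermQuery(new Term("contents", word2)), Occur.SHOULD);

        BooleanQuery chapterFilter = new BooleanQuery();
        for (String chapter : chapters) {
            chapterFilter.add(new TermQuery(new Term("title", chapter)), Occur.SHOULD);
        }

        BooleanQuery fileQuery = new BooleanQuery();
        fileQuery.add(words, Occur.MUST);
        fileQuery.add(chapterFilter, Occur.MUST);
        return fileSearcher.search(fileQuery, 1000);
    }
}
{/code}

One caveat: if the first pass can match more than about 1000 chapters, the OR over chapter names will hit BooleanQuery's default 1024-clause limit, so you would want something like contrib's TermsFilter instead of the chapterFilter clause.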
On Fri, Apr 21, 2017 at 10:40 PM, neeraj shah <neerajsha...@gmail.com> wrote:

> Hello,
> Let me explain my case. Suppose I am searching for ("pain" (in same
> chapter) "head"); this is my query. What I need to do is first search
> "pain", then search "head" separately, and then find the file names
> common to both search results.
> Now the criteria is, suppose:
>
> FileA - Chapter A - has only the word "*pain*"
> FileB - Chapter B - has both words "*head*" and "*pain*"
> FileC - Chapter A - has only the word "*head*"
> FileD - Chapter D - has only the word "*head*"
> FileE - Chapter A - has only the word "*pain*"
>
> Then the result should be:
>
> FileA - Chapter A - has only the word "*pain*"
> FileB - Chapter B - has both words "*head*" and "*pain*"
> FileC - Chapter A - has only the word "*head*"
> FileE - Chapter A - has only the word "*pain*"
>
> FileD (Chapter D, only the word "*head*") will not appear in the result,
> because "Chapter D" does not match the name of any chapter that contains
> both search words. In short, I have to show only those files, from any
> book, whose shared chapter name covers both search words; each file
> itself needs to contain at least one of the words, but the chapter name
> must be the same.
>
> That requirement is why I was parsing all hits for "pain" and "head"
> separately, then collecting the common "title" (chapter name) from both
> result sets, together with any result that has at least one search word
> and the same chapter name. In my index the word "pain" alone has 5 lacs
> (500,000) results and "head" has 60K results.
>
> Please suggest another approach if you have one in mind.
>
> Thanks,
> Neeraj
>
> On Sat, Apr 22, 2017 at 12:20 AM, Chris Hostetter
> <hossman_luc...@fucit.org> wrote:
>
> > : then which one is the right tool for text searching in files? Please
> > : can you suggest one?
> >
> > So far all you've done is show us your *indexing* code, and said that
> > after you do a search, calling searcher.doc(docid) on 500,000 documents
> > is slow.
> >
> > But you still haven't described the use case you are trying to solve --
> > i.e.: *WHY* do you want these 500,000 results from your search? Once
> > you get those Documents back, *WHAT* are you going to do with them?
> >
> > If you show us some code, and talk us through your goal, then we can
> > help you -- otherwise all we can do is warn you that the specific
> > searcher.doc(docid) API isn't designed to be efficient at that large a
> > scale. Other APIs in Lucene are designed to be efficient at large
> > scale, but we don't really know what to suggest w/o knowing what you're
> > trying to do...
> >
> > https://people.apache.org/~hossman/#xyproblem
> > XY Problem
> >
> > Your question appears to be an "XY Problem" ... that is: you are
> > dealing with "X", you are assuming "Y" will help you, and you are
> > asking about "Y" without giving more details about the "X" so that we
> > can understand the full issue. Perhaps the best solution doesn't
> > involve "Y" at all?
> > See Also: http://www.perlmonks.org/index.pl?node_id=542341
> >
> > PS: please, Please PLEASE upgrade to Lucene 6.x. 3.6 is more than 5
> > years old, and completely unsupported -- any advice you are given on
> > this list is likely to refer to APIs that are completely different than
> > the version of Lucene you are working with.
> >
> > : On Fri, Apr 21, 2017 at 2:01 PM, Adrien Grand <jpou...@gmail.com>
> > : wrote:
> > :
> > : > Lucene is not designed for retrieving that many results.
> > : > What are you doing with those 5 lacs documents? I suspect this is
> > : > too much to display, so you probably perform some computations on
> > : > them? If so, maybe you could move those computations into Lucene
> > : > using e.g. facets? If that does not work, I'm afraid that Lucene is
> > : > not the right tool for your problem.
> > : >
> > : > On Fri, Apr 21, 2017 at 08:56, neeraj shah <neerajsha...@gmail.com>
> > : > wrote:
> > : >
> > : > > Yes, I am fetching around 5 lacs results from the index searcher.
> > : > > Also I am indexing each line of each file, because while searching
> > : > > I need all the lines of a file which contain the matched term.
> > : > > Please tell me, am I doing it right?
> > : > >
> > : > > {code}
> > : > > InputStream is = new BufferedInputStream(new FileInputStream(file));
> > : > > BufferedReader bufr = new BufferedReader(new InputStreamReader(is));
> > : > > String inputLine;
> > : > >
> > : > > while ((inputLine = bufr.readLine()) != null) {
> > : > >     // one document per line, so a hit can point at the exact line
> > : > >     Document doc = new Document();
> > : > >     doc.add(new Field("contents", inputLine, Field.Store.YES,
> > : > >         Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));
> > : > >     doc.add(new Field("title", section, Field.Store.YES,
> > : > >         Field.Index.NOT_ANALYZED));
> > : > >     doc.add(new Field("fieldsort", rem, Field.Store.YES,
> > : > >         Field.Index.ANALYZED));
> > : > >     doc.add(new Field("fieldsort2",
> > : > >         rem.toLowerCase().replaceAll("-", "").replaceAll(" ", ""),
> > : > >         Field.Store.YES, Field.Index.ANALYZED));
> > : > >     doc.add(new Field("field1", Author, Field.Store.YES,
> > : > >         Field.Index.NOT_ANALYZED));
> > : > >     doc.add(new Field("field2", Book, Field.Store.YES,
> > : > >         Field.Index.NOT_ANALYZED));
> > : > >     doc.add(new Field("field3", sec, Field.Store.YES,
> > : > >         Field.Index.NOT_ANALYZED));
> > : > >     writer.addDocument(doc);
> > : > > }
> > : > > bufr.close();
> > : > > {/code}
> > : > >
> > : > > On Thu, Apr 20, 2017 at 5:57 PM, Adrien Grand <jpou...@gmail.com>
> > : > > wrote:
> > : > >
> > : > > > IndexSearcher.doc is the right way to retrieve documents. If
> > : > > > this is slowing things down for you, I wonder whether you might
> > : > > > be fetching too many results?
> > : > > >
> > : > > > On Thu, Apr 20, 2017 at 14:16, neeraj shah
> > : > > > <neerajsha...@gmail.com> wrote:
> > : > > >
> > : > > > > Hello Everyone,
> > : > > > >
> > : > > > > I am using Lucene 3.6. I have to index around 60k documents.
> > : > > > > After performing the search, when I try to retrieve documents
> > : > > > > from the searcher using searcher.doc(docid), it slows down the
> > : > > > > search. Please, is there any other way to get the documents?
> > : > > > >
> > : > > > > Also, could anyone give me an end-to-end example of working
> > : > > > > with FieldCache? While implementing the cache I have:
> > : > > > >
> > : > > > > int[] fieldIds = FieldCache.DEFAULT.getInts(indexMultiReader, "id");
> > : > > > >
> > : > > > > Now I don't know how to use the fieldIds further to improve
> > : > > > > the search. Please give me an end-to-end example.
> > : > > > >
> > : > > > > Thanks
> > : > > > > Neeraj
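Regarding the FieldCache question buried at the bottom of the thread: the usual way to visit a very large result set without paying the per-hit searcher.doc() cost is a custom Collector that reads the field from the FieldCache instead of from stored fields. Here is a rough, untested sketch against the 3.x API; the "title" field name comes from your indexing code, everything else is an assumption:

{code}
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Collector;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.search.Scorer;

// Collects the distinct chapter names of every matching document,
// without ever touching stored fields.
public class ChapterNameCollector extends Collector {
    private final Set<String> chapters = new HashSet<String>();
    private String[] titles; // "title" values for the current segment

    @Override
    public void setScorer(Scorer scorer) {
        // scores are not needed to collect chapter names
    }

    @Override
    public void setNextReader(IndexReader reader, int docBase) throws IOException {
        // loaded once per segment, then cached by Lucene
        titles = FieldCache.DEFAULT.getStrings(reader, "title");
    }

    @Override
    public void collect(int doc) {
        chapters.add(titles[doc]);
    }

    @Override
    public boolean acceptsDocsOutOfOrder() {
        return true;
    }

    public Set<String> getChapters() {
        return chapters;
    }
}
{/code}

You would run it as searcher.search(query, collector) and read getChapters() afterwards. Two caveats: the field must be indexed un-tokenized (yours is), and the cache holds one value per document in RAM -- one more argument for upgrading to a current Lucene release, where doc values do this job properly.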
> >
> > -Hoss
> > http://www.lucidworks.com/
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
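P.S. For completeness, the chapter-level index I'm suggesting above could be built alongside your existing per-line loop: accumulate each file's lines into a per-chapter buffer keyed by chapter name, then flush one document per chapter. A sketch, with the same caveats as before (chapterWriter and the map are assumptions; field names mirror your current code):

{code}
import java.util.Map;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class ChapterIndexer {

    // One document per chapter, holding the concatenated text of every
    // file that belongs to that chapter.
    public static void writeChapterDocs(IndexWriter chapterWriter,
                                        Map<String, StringBuilder> textByChapter)
            throws Exception {
        for (Map.Entry<String, StringBuilder> e : textByChapter.entrySet()) {
            Document doc = new Document();
            doc.add(new Field("title", e.getKey(),
                              Field.Store.YES, Field.Index.NOT_ANALYZED));
            doc.add(new Field("contents", e.getValue().toString(),
                              Field.Store.NO, Field.Index.ANALYZED));
            chapterWriter.addDocument(doc);
        }
    }
}
{/code}

Populating textByChapter is one extra line inside your existing while loop -- something like textByChapter.get(section).append(inputLine).append('\n'), creating the StringBuilder the first time a chapter is seen.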