The first thing I'd do is get a copy of Luke and look in my index to see exactly what's there. Nothing in your e-mails indicates that you *should* get any hits, although I admit that getting no hits for jakarta lucene in 50M pages seems unlikely.
But Ian's suggestion that you start with a smaller index is spot on.

Best
Erick

On Thu, Jul 16, 2009 at 8:42 AM, prashant ullegaddi <prashullega...@gmail.com> wrote:

> 50 million HTML pages (part of clueweb09 dataset for TREC) were indexed
> using Hadoop into 56 indexes. 56 indexes were merged into a single index.
> Analyzer is the StandardAnalyzer.
>
> On Thu, Jul 16, 2009 at 6:07 PM, Anshum <ansh...@gmail.com> wrote:
>
> > Hi Prashant,
> >
> > What did you index? how did you index? what analyzer did you use? without
> > all of these, perhaps it'd be difficult to figure out the issue.
> >
> > --
> > Anshum Gupta
> > Naukri Labs!
> > http://ai-cafe.blogspot.com
> >
> > The facts expressed here belong to everybody, the opinions to me. The
> > distinction is yours to draw............
> >
> > On Thu, Jul 16, 2009 at 6:04 PM, prashant ullegaddi <
> > prashullega...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I tried searching:
> > > "Apache Jakarta"~10
> > >
> > > Nothing was returned. What might be wrong?
> > >
> > > Regards,
> > > Prashant.