Thanks for the help, Mike. I was quick to jump to a wrong conclusion. My codec does not implement Term-Vectors, Payloads, DocValues and Norms.
It should be trivial to implement Payloads, but I am not sure about the others.
Anyway, I can generate an HTML report and identify failures based on
individual tests.

--
Ravi

On Tue, May 14, 2013 at 3:31 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:

> On Tue, May 14, 2013 at 3:03 AM, Ravikumar Govindarajan
> <ravikumar.govindara...@gmail.com> wrote:
> > We ran the checkIndex and a simple test case. It passes. Actually, I had
> > assumed problem with lucene, whereas it was an issue with our custom
> > codec.
>
> Phew, thanks for bringing closure!
>
> > I do not know how to confirm whether a new codec works correctly. Are
> > there any tools/existing test-cases available for validation?
>
> One really healthy way to test your new codec is to run all Lucene
> tests against it (assume your codec is general, i.e. implements
> everything).
>
> You just need to 1) get your codec onto the test classpath and 2) pass
> -Dtests.codec=YourCodecName to force tests to use it.
>
> I'm not certain about step 1) ... it could be passing -lib to ant does
> that? But I'm not sure that will propagate to the classpath when ant
> runs the tests ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> > --
> > Ravi
> >
> > On Mon, May 13, 2013 at 9:19 PM, Michael McCandless <
> > luc...@mikemccandless.com> wrote:
> >
> >> That code looks correct.
> >>
> >> But can you tie it all together into a runnable test case? Ie add in
> >> the terms enum, calling docFreq and getting 0 when it should be 1.
> >>
> >> Also, if you run CheckIndex on the index produced by the code below,
> >> how many terms/freqs/positions does it report?
> >>
> >> Mike McCandless
> >>
> >> http://blog.mikemccandless.com
> >>
> >> On Mon, May 13, 2013 at 9:25 AM, Ravikumar Govindarajan
> >> <ravikumar.govindara...@gmail.com> wrote:
> >> > Indexing code below. Looks very simple. Is this correct?
> >> >
> >> > IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_42,
> >> >     new StandardAnalyzer(Version.LUCENE_42));
> >> > conf.setOpenMode(OpenMode.CREATE_OR_APPEND);
> >> > String indexPath = "<some-file-path>";
> >> > Directory dir = FSDirectory.open(new File(indexPath));
> >> > IndexWriter writer = new IndexWriter(dir, conf);
> >> > FieldType type = new FieldType();
> >> > type.setTokenized(true);
> >> > type.setIndexed(true);
> >> > type.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
> >> > Field field = new Field("content", "one two two three", type);
> >> > Document luceneDoc = new Document();
> >> > luceneDoc.add(field);
> >> > writer.addDocument(luceneDoc);
> >> > writer.close();
> >> >
> >> > Reading docFreq and totalTermFreq through terms-enum returns 0 and -1,
> >> > for all terms
> >> >
> >> > --
> >> > Ravi
> >> >
> >> > On Fri, May 10, 2013 at 10:19 PM, Michael McCandless <
> >> > luc...@mikemccandless.com> wrote:
> >> >
> >> >> It should not be 0, as long as TermsEnum.next() does not return null
> >> >> ... can you make a small test case? Thanks.
> >> >>
> >> >> Mike McCandless
> >> >>
> >> >> http://blog.mikemccandless.com
> >> >>
> >> >> On Fri, May 10, 2013 at 8:26 AM, Ravikumar Govindarajan
> >> >> <ravikumar.govindara...@gmail.com> wrote:
> >> >> > I have to add that the above code is wrong.
> >> >> >
> >> >> > It has to be
> >> >> >
> >> >> > while((ref=tEnum.next())!=null)
> >> >> > {
> >> >> >   ref = tEnum.term();
> >> >> >   tEnum.docFreq(); // Even here VAL=0
> >> >> > }
> >> >> >
> >> >> > Apologies for the mistake, but the problem remains
> >> >> >
> >> >> > On Fri, May 10, 2013 at 5:54 PM, Ravikumar Govindarajan <
> >> >> > ravikumar.govindara...@gmail.com> wrote:
> >> >> >
> >> >> >> We have the following code
> >> >> >>
> >> >> >> SegmentInfos segments = new SegmentInfos();
> >> >> >> segments.read(luceneDir);
> >> >> >> for(SegmentInfoPerCommit sipc: segments)
> >> >> >> {
> >> >> >>   String name = sipc.info.name;
> >> >> >>   SegmentReader reader = new SegmentReader(sipc, 1, new IOContext());
> >> >> >>   Terms terms = reader.terms("content");
> >> >> >>   TermsEnum tEnum = terms.iterator(null);
> >> >> >>   tEnum.docFreq(); //VAL=0
> >> >> >>   tEnum.totalTermFreq(); //VAL=-1
> >> >> >> }
> >> >> >>
> >> >> >> The field "content" is indexed as DOCS_AND_FREQS_AND_POSITIONS
> >> >> >>
> >> >> >> Why is docFreq returned as 0 for all terms? Is this expected or
> >> >> >> am I doing something wrong?
> >> >> >>
> >> >> >> --
> >> >> >> Ravi
> >> >> >>
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org