Hi,

Sorry for all the code; it got sent out accidentally.
The following code is part of the Benchmark utility in Lucene, specifically SubmissionReport.java

// Here reader is the IndexReader.
Iterator itr = docMap.entrySet().iterator();
int totalNumDocuments = reader.numDocs();
ScoreDoc sd[] = td.scoreDocs;
String sep = " \t ";
DocNameExtractor docext = new DocNameExtractor(docNameField);
for (int i = 0; i < sd.length; i++)
{
    String docName = docext.docName(searcher, sd[i].doc);
    // ***** The Map of documents will help us get the docid
    int indexedDocID = docMap.get(docName);
    Fields fields = reader.getTermVectors(indexedDocID);
    Iterator<String> strItr = fields.iterator();
    /// ********** The following while loop prints the field names, which only show 2 fields out of the 5 that I am looking for.
    while (strItr.hasNext())
    {
        String fieldName = strItr.next();
        System.out.println("next field " + fieldName);
    }
    Document DocList = reader.document(indexedDocID);
    List<IndexableField> field_list = DocList.getFields();
    /// ****** The following for loop prints the five fields and their related information.
    for (int j = 0; j < field_list.size(); j++)
    {
        System.out.println("list field is : " + field_list.get(j).name());
        IndexableFieldType IFT = field_list.get(j).fieldType();
        System.out.println(" Field storeTermVectorOffsets : " + IFT.storeTermVectorOffsets());
        System.out.println(" Field stored :" + IFT.stored());
    }
    // ***************************** //
}

/**** THE OUTPUT for this section of code is
fields size : 2
next field body
next field docname
list field is : docid
 Field storeTermVectorOffsets : false
list field is : docname
 Field storeTermVectorOffsets : false
list field is : docdate
 Field storeTermVectorOffsets : false
list field is : doctitle
 Field storeTermVectorOffsets : false
list field is : body
 Field storeTermVectorOffsets : false
*******/

Hope this code comes out legible in the email.

Thank you.
Regards,
Sachin Kulkarni

On Tue, Aug 19, 2014 at 8:39 AM, Sachin Kulkarni <kulk...@hawk.iit.edu> wrote:

> Hi Kumaran,
>
> The following code is part of the Benchmark utility in Lucene,
> specifically SubmissionReport.java
>
> Iterator itr = docMap.entrySet().iterator();
> int totalNumDocuments = reader.numDocs();
> ScoreDoc sd[] = td.scoreDocs;
> String sep = " \t ";
> DocNameExtractor docext = new DocNameExtractor(docNameField);
> for (int i=0; i<sd.length; i++)
> {
>     System.out.println("i = " + i);
>     String docName = docext.docName(searcher,sd[i].doc);
>     System.out.println("docName : " + docName + "\t map size " + docMap.size());
>     // ***** The Map will help us get the docid and
>     int indexedDocID = docMap.get(docName);
>     System.out.println("indexed doc id : " + indexedDocID + "\t docname : " + docName);
>     // ******** GET THE tf-idf data now ************ //
>     Fields fields = reader.getTermVectors(indexedDocID);
>     System.out.println("fields size : " + fields.size());
>     // **** Print log output for testing **** //
>     Iterator<String> strItr=fields.iterator();
>     while(strItr.hasNext())
>     {
>         String fieldName = strItr.next();
>         System.out.println("next field " + fieldName);
>     }
>     Document DocList= reader.document(indexedDocID);
>     List<IndexableField> field_list = DocList.getFields();
>     for(int j=0; j < field_list.size(); j++)
>     {
>         System.out.println ( "list field is : " + field_list.get(j).name() );
>         IndexableFieldType IFT = field_list.get(j).fieldType();
>         System.out.println(" Field storeTermVectorOffsets : " + IFT.storeTermVectorOffsets());
>         //System.out.println(" Field stored :" + IFT.stored());
>         //for (FieldInfo.IndexOptions c : IFT.indexOptions().values())
>         //    System.out.println(c);
>     }
>     // *****************************88 //
>
>
> On Tue, Aug 19, 2014 at 2:04 AM, Kumaran Ramasubramanian <kums....@gmail.com> wrote:
>
>> Hi Sachin Kulkarni,
>>
>> If possible, Please share your code.
>>
>> -
>> Kumaran R
>>
>> On Tue, Aug 19, 2014 at 9:07 AM, Sachin Kulkarni <kulk...@hawk.iit.edu> wrote:
>>
>> > Hi,
>> >
>> > I am using Lucene 4.6.0.
>> >
>> > I have been storing 5 fields for my documents in the index, namely body,
>> > title, docname, docdate and docid.
>> >
>> > But when I get the fields using IndexReader.getTermVectors(indexedDocID),
>> > I only get the docname and body fields and can retrieve the term vectors
>> > for those fields, but not the others.
>> >
>> > I check whether all five fields are stored using
>> > IndexableFieldType.stored() and all return true. I also check that all
>> > the fields are indexed, and they are, but still when I call
>> > getTermVectors I only receive two fields back.
>> >
>> > Is there any other config setting that I am missing while indexing that
>> > is causing this behavior?
>> >
>> > Thanks to Kumaran and Ian for their answers to my previous questions, but
>> > I have not been able to figure out the above one yet.
>> >
>> > Thank you very much.
>> >
>> > Regards,
>> > Sachin