Hello I'm using lucene & nutch, but I don't now witch type of field of documents are created by nutch, I developed this program in java :
Directory dir = FSDirectory.open(new File("C:/Users/MyWebPage/index")); IndexSearcher search = new IndexSearcher(dir); int numberDoc = search.maxDoc(); System.out.println("number of doc "+ numberDoc); for (int i=0;i<numberDoc;i++) { System.out.println("Document numero "+i); Document doc = search.doc(i); for (Fieldable f :doc.getFields()) { for (Fieldable ff: doc.getFieldables(f.name())) { System.out.println("\t"+ff.name()+" "+ff.stringValue()); } System.out.println("*******************"); } I'have this result .... number of doc 1907 Document numero 0 title Convention and Visitors Office segment 20110502142927 boost 0.13529637 digest d07c6f19b2efaa8739754e9e9ff75fcc tstamp 20110502122931566 url http://ar.info.com/ ..... ... Document numero 90 title Who are we? - Presentation of the Paris Convention Bureau segment 20110502144050 boost 0.0016601664 digest 62ee8c0ff6c2ab7c91599f3c3ff18735 tstamp 20110502125316832 url http://convention.info.com/en/about-us/ my question is : what's segment, boost, digest, tstamp and how can I read it thanks for your help