I was looking at a JIRA issue someone posted about optimizing highlighting when term vectors are present (SOLR-5855). I dug into the details a bit and learned something unexpected: CompressingTermVectorsReader.get(docId) fully loads all term vectors for the document, even though the consuming code might only want the term vectors for a subset of the fields that have them. Was this overlooked, or are there benefits to the current approach? The only one I can think of is that compression may be better over all of a document's data than over smaller per-field chunks, although I’d trade that any day for being able to fetch just a subset of fields. I could imagine it being useful to ask for some fields or all of them, in much the same way we handle stored field data.
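To make the idea concrete, here is a rough sketch of what "ask for some fields or all" could look like. Everything in it is hypothetical: VectorFieldFilter and ToyTermVectorsStore are names I made up for illustration, not Lucene classes, and the store is a plain in-memory map that does not model compression or the on-disk format at all. The filter callback is only loosely modeled on the way a stored-fields visitor says which fields it needs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback: the caller states which fields it wants vectors for.
// This is NOT a Lucene interface.
interface VectorFieldFilter {
    boolean needsField(String fieldName);
}

// Toy stand-in for a term vectors reader (hypothetical, in-memory only).
final class ToyTermVectorsStore {
    // docId -> (field name -> terms of that field's vector)
    private final Map<Integer, Map<String, List<String>>> docs = new LinkedHashMap<>();

    // Counts how many per-field vectors were actually materialized.
    int fieldsDecoded = 0;

    void add(int docId, String field, List<String> terms) {
        docs.computeIfAbsent(docId, k -> new LinkedHashMap<>()).put(field, terms);
    }

    // "All fields" variant, analogous to what get(docId) does today.
    Map<String, List<String>> get(int docId) {
        return get(docId, field -> true);
    }

    // "Some fields" variant: fields rejected by the filter are skipped.
    Map<String, List<String>> get(int docId, VectorFieldFilter filter) {
        Map<String, List<String>> stored =
                docs.getOrDefault(docId, Collections.<String, List<String>>emptyMap());
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : stored.entrySet()) {
            if (!filter.needsField(e.getKey())) {
                continue; // skipped: no decoding work for this field
            }
            fieldsDecoded++;
            result.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        return result;
    }
}

final class TermVectorSubsetDemo {
    public static void main(String[] args) {
        ToyTermVectorsStore store = new ToyTermVectorsStore();
        store.add(7, "title", Arrays.asList("term", "vectors"));
        store.add(7, "body", Arrays.asList("compressing", "reader", "loads", "all"));

        // A highlighter that only cares about "body" asks for just that field.
        Map<String, List<String>> subset = store.get(7, field -> field.equals("body"));
        System.out.println(subset.keySet() + " decoded=" + store.fieldsDecoded);
    }
}
```

Whether the real reader could skip unwanted fields as cheaply as this toy does depends on how the vectors are laid out and compressed, which is exactly the question above.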
~ David Smiley Freelance Apache Lucene/Solr Search Consultant/Developer http://www.linkedin.com/in/davidwsmiley
