Shawn and Alexandre,

On 5/17/23 13:12, Shawn Heisey wrote:
> On 5/17/23 10:01, Christopher Schultz wrote:
>> The "all" field contains copies of the other fields' values for each record I've studied except for "identifier".
>>
>> I have re-indexed the whole document set and the "all" field still does not contain the values I can see (in the search results) for "identifier".
>
> Can you share your schema? If you need to redact sensitive info from it, please do it in a way that ensures we can distinguish one bit of redacted data from other redacted bits.
>
> Part of my intent in asking is to find out the answer to the question that Alexandre asked. It will also provide data to determine what questions I will ask next.

All of my fields are stored (this is how I knew that other field values were in fact available in the "all" field).

I think this might be a false alarm. As careful as I tried to be to describe the situation as accurately and completely as possible, I cannot replicate it. I inserted a new document into the index (via my own software) and the field values were copied as expected.

I wonder if my problem was a timing issue between inserting the document and the server-specified soft auto-commit interval. We know there is a delay between when the data are inserted into the index and when they can be found successfully via a search. I did not take any screenshots of the field values at the time, so I can't even be sure I wasn't just having selective vision.

Thanks for your replies and I apologize for the noise. I'll pick this thread back up if for some reason I am able to reproduce the issue.

Speaking of the lag-between-insert-and-searchability, is there any information Solr is able to provide regarding a core's freshness? I have an administrative interface in my application I've been building which is able to provide some basic information about a core, "freshen" a core schema, and re-index the core with data from my application. I would love to be able to show "last data added to index today 13:34:46" and "last soft commit/searcher-open (or whatever the right term is) today 13:32:00" so the admin can see "okay, we have a blind-spot which extends 00:02:46 into the past". Does the core metadata give that kind of info? I'm currently using SolrJ's CoreAdminRequest.getStatus call to get the metadata.
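To make the "blind-spot" arithmetic concrete, here is a minimal sketch in plain Java. It assumes the application records its own "last add" timestamp and has some "last commit" timestamp available; whether Solr itself exposes either value is exactly the open question, so the class name BlindSpot and both inputs are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

public class BlindSpot {

    /**
     * Formats the gap between the last commit visible to searchers and the
     * last document add as HH:mm:ss. Both timestamps are assumed to be
     * supplied by the caller; nothing here talks to Solr.
     */
    static String format(Instant lastCommit, Instant lastAdd) {
        Duration gap = Duration.between(lastCommit, lastAdd);
        if (gap.isNegative()) {
            // The commit happened after the last add, so there is no blind spot.
            gap = Duration.ZERO;
        }
        return String.format("%02d:%02d:%02d",
                gap.toHours(), gap.toMinutesPart(), gap.toSecondsPart());
    }

    public static void main(String[] args) {
        Instant lastCommit = Instant.parse("2023-05-17T13:32:00Z");
        Instant lastAdd = Instant.parse("2023-05-17T13:34:46Z");
        System.out.println(format(lastCommit, lastAdd)); // prints 00:02:46
    }
}
```

Using Instant rather than a bare time of day keeps the subtraction correct across midnight.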

I can see this kind of data in there (this is old data in a code-comment; please ignore the actual values):
            // index={numDocs=85,maxDoc=90,deletedDocs=5,
            // indexHeapUsageBytes=-1,
            // version=2093,
            // segmentCount=8,
            // current=true,
            // hasDeletions=true,
            // directory=org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(MMapDirectory@/path lockFactory=org.apache.lucene.store.NativeFSLockFactory@52e9883c; maxCacheMB=48.0 maxMergeSizeMB=4.0),
            // segmentsFile=segments_az,
            // segmentsFileSizeInBytes=650,
            // userData={commitCommandVer=0, commitTimeMSec=1678132702948},
            // lastModified=Mon Mar 06 14:58:22 EST 2023,sizeInBytes=56606,size=55.28 KB}}}

Presumably, lastModified gives me the timestamp the last document was added. What about when the index was opened for searching?
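One thing that can be checked from the sample alone: commitTimeMSec looks like epoch milliseconds, and decoding it gives the same second as lastModified. That hints that both values may describe the last commit rather than the last document add, though this is an inference from a single sample and not confirmed behavior. A minimal sketch of the decoding, where the class name CommitTime is hypothetical and the zone America/New_York is an assumption based on the "EST" in the sample:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class CommitTime {

    /**
     * Renders an epoch-millisecond value (such as commitTimeMSec from the
     * status output) as an ISO-8601 timestamp in the given zone, truncated
     * to whole seconds so it can be compared with lastModified.
     */
    static String render(long commitTimeMSec, String zone) {
        ZonedDateTime t = Instant.ofEpochMilli(commitTimeMSec)
                .truncatedTo(ChronoUnit.SECONDS)
                .atZone(ZoneId.of(zone));
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(t);
    }

    public static void main(String[] args) {
        // Value copied from the userData entry in the sample above.
        System.out.println(render(1678132702948L, "America/New_York"));
        // prints 2023-03-06T14:58:22-05:00
    }
}
```

The result, 14:58:22 at UTC-05:00 on 6 March 2023, lines up with the lastModified value in the same sample.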

As always, thank you for your thoughts.

-chris
