Hi,

I'm experimenting with Solr and indexing schemaless JSON content.

I'm using the latest docker image of Solr, and just testing various things.

Indexing and querying work as I would expect for documents of reasonable
size.

However, if I ask it to index a document that is ~100MB, I'm unable to query any results from this document.

Yet, I can't find any indication that there was an error in indexing the 
document.

Indexing:

curl -vv 'http://localhost:8983/solr/gettingstarted/update/json/docs?f=/docs/**&commit=true' -H 'Content-type: application/json' -d @837-10000-2022010415135.json
*   Trying 127.0.0.1:8983...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8983 (#0)
> POST /solr/gettingstarted/update/json/docs?f=/docs/**&commit=true HTTP/1.1
> Host: localhost:8983
> User-Agent: curl/7.68.0
> Accept: */*
> Content-type:application/json
> Content-Length: 97581522
> Expect: 100-continue
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Security-Policy: default-src 'none'; base-uri 'none'; connect-src 'self'; form-action 'self'; font-src 'self'; frame-ancestors 'none'; img-src 'self'; media-src 'self'; style-src 'self' 'unsafe-inline'; script-src 'self'; worker-src 'self';
< X-Content-Type-Options: nosniff
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< Content-Type: text/plain;charset=utf-8
< Vary: Accept-Encoding, User-Agent
< Content-Length: 57
<
{
 "responseHeader":{
   "status":0,
   "QTime":376}}
* Connection #0 to host localhost left intact


No errors are logged in the log file:

2022-03-03 22:57:00.412 INFO (searcherExecutor-26-thread-1-processing-x:gettingstarted) [ x:gettingstarted] o.a.s.c.QuerySenderListener QuerySenderListener done.
2022-03-03 22:57:00.414 INFO (searcherExecutor-26-thread-1-processing-x:gettingstarted) [ x:gettingstarted] o.a.s.c.SolrCore [gettingstarted] Registered new searcher autowarm time: 0 ms
2022-03-03 22:57:00.414 INFO  (qtp1515403487-49) [ x:gettingstarted] o.a.s.u.p.LogUpdateProcessorFactory [gettingstarted]  webapp=/solr path=/update/json/docs params={f=/docs/**&commit=true}{add=[fbb18697-d823-46e8-8571-6dde6750634b (1726321231525838848)],commit=} 0 369

The "Num Docs" count reported in the Solr GUI increases each time I do this.

A query for everything (*:*) gives me the correct doc count.
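
For anyone reproducing this, the checks can be sketched roughly as below. This is only a sketch, assuming the standard /select handler on the same core; the helper name select_url and the SOLR_BASE variable are hypothetical, and the id is the one from the log line above. The script only prints the request URLs, which can then be passed to curl.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr): builds a /select URL for the
# "gettingstarted" core used in this post. SOLR_BASE is an assumed variable.
SOLR_BASE="${SOLR_BASE:-http://localhost:8983/solr/gettingstarted}"

select_url() {
  # $1 = query string, $2 = number of rows to return
  printf '%s/select?q=%s&rows=%s\n' "$SOLR_BASE" "$1" "$2"
}

# Match-all query; numFound in the response should equal "Num Docs".
select_url '*:*' 0

# Look up the large document by the id reported in the update log.
select_url 'id:fbb18697-d823-46e8-8571-6dde6750634b' 1
```

Each printed URL can be fetched with, for example, curl -s "$(select_url '*:*' 0)". If the lookup by id returns the document, its stored fields would show what the /docs/** mapping actually produced.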

But no matter what I query for, I cannot get a result from inside the large document.  Am I hitting some limit that is silently messing up the indexing and/or the query return?

Thanks,

Dan



