Hello, regarding the "400 OK": I don't see that happening locally. I get the correct "Bad Request" status line when making requests directly to the /update handler. Perhaps it's an issue in Solarium-PHP?
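For reference, here is a minimal Python sketch of how the status line can be checked without Solarium-PHP in the path. The host, port, core name `mycore`, and the helper name `fetch_status_line` are assumptions for illustration, not details of the setup discussed in this thread.

```python
import http.client


def fetch_status_line(host, port, path, body=b"[]",
                      content_type="application/json"):
    """POST `body` to `path` and return (status code, reason phrase).

    The reason phrase is returned exactly as the server sent it in the
    HTTP/1.1 status line, so no client library can rewrite it.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("POST", path, body=body,
                     headers={"Content-Type": content_type})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.reason
    finally:
        conn.close()


# Hypothetical usage against a local Solr core named "mycore",
# posting a document that lacks the required "title" field:
#
#   status, reason = fetch_status_line(
#       "localhost", 8983, "/solr/mycore/update",
#       body=b'[{"id": "demo-1"}]')
#   print(status, reason)
```

If a direct request like this reports `Bad Request` while the crawler still logs `OK (400)`, the reason phrase is probably being changed somewhere between Solr and the crawler's log output.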
On Wed, 19 Mar 2025 at 13:26, Ehrenleitner Robert Harald
<robert.ehrenleit...@plus.ac.at> wrote:

> Hi,
>
> that was fast.
>
> Actually, I see that the documents which do not have a title are also
> missing from the index of the older Solr version, which is still fed by the
> older version of Solarium-PHP. So the newer version of Solarium-PHP
> probably exposes an error which was there before but was not logged. I
> don't want to check this now.
>
> As a side note: it seems that Solr responds with the HTTP status "400 OK",
> which is not a good idea. It should be "400 Invalid request".
>
> Thanks for the advice about the filename, that's a good idea. I will modify
> the crawler to fall back to the slug (a WordPress term) or to the
> filename if the title is empty.
>
> Kind regards,
>
> Mag.phil. Robert Ehrenleitner, BEng.
> --
> Web-Developer
> IT-Services | Application & Digitalization Services
> Hellbrunner Straße 34 | 5020 Salzburg | Austria
> Tel.: +43/(0)662/8044 - 6778
> *www.plus.ac.at <http://www.plus.ac.at>*
>
> ------------------------------
> *From:* Colvin Cowie <colvin.cowie....@gmail.com>
> *Sent:* Wednesday, 19 March 2025 11:51
> *To:* users@solr.apache.org <users@solr.apache.org>
> *Subject:* Re: Solr throws errors on empty fields on ingestion
>
> Required fields need non-empty values; as far as I know there are no
> exceptions to that.
>
> Take this from the UX/end-user perspective. If a document has no title, or
> an empty title, what does a user expect to see and do with that?
> If they expect to see *something*, then yes, I think you should insert a
> suitable default or a fallback value like the file name or URL.
> If they don't expect to see something (and you can't always provide a
> title), then the title shouldn't be marked as required.
>
> On Wed, 19 Mar 2025 at 10:03, Ehrenleitner Robert Harald
> <robert.ehrenleit...@plus.ac.at> wrote:
>
> > Hi all,
> >
> > we have a self-built crawler based on Solarium-PHP which feeds documents
> > into Solr. Since I upgraded from 9.6.1 to 9.8.0, I see errors in the
> > crawler's log. It tells me that Solr complains that the field "title" is
> > missing. Actually, it is part of the request, but it's just empty.
> >
> > This is a snippet of the request body (to output this, I inserted a
> > var_dump() in an appropriate place in Solarium-PHP):
> >
> > Content-Disposition: form-data; name="literal.publishDate"
> > Content-Type: text/plain;charset=UTF-8
> >
> > 2023-01-12T10:25:06Z
> > --00000000000002800000000000000000
> > Content-Disposition: form-data; name="literal.title"
> > Content-Type: text/plain;charset=UTF-8
> >
> >
> > --00000000000002800000000000000000
> > Content-Disposition: form-data; name="literal.number"
> >
> > And this is the response:
> >
> > Error indexing document 14935: wp-content/uploads/loremipsum.pdf: Solr HTTP error: OK (400)
> > {
> >   "responseHeader":{
> >     "status":400,
> >     "QTime":121
> >   },
> >   "error":{
> >     "metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.common.SolrException"],
> >     "msg":"[doc=141396] missing required field: title",
> >     "code":400
> >   }
> > }
> >
> > I cannot fix the PDF file having no title (for various non-technical
> > reasons); nevertheless, it was working fine before the upgrade.
> >
> > The schema was created with this JSON data; note the title field in
> > particular:
> > {
> >   /* something left out here */
> >   {
> >     "name": "title",
> >     "type": "text_general",
> >     "stored": true,
> >     "indexed": true,
> >     "multiValued": false,
> >     "required": true
> >   },
> >   /* something left out here */
> > }
> >
> > The document is not being indexed.
> >
> > How can I fix this? Is there perhaps something in the schema (JSON data)
> > that I have to change? Or is it better to replace empty titles with some
> > constant non-empty string (this can be done in the crawler)?
> >
> > I have noticed that the documentation for the field option "required"
> > says:
> >
> > Instructs Solr to reject any attempts to add a document which does not
> > have a value for this field. This property defaults to false.
> >
> > This is ambiguous to me. What is meant by "does not have a value"?
> > Well, the value is present, but it is an empty string.
> >
> > Kind regards,
> >
> > Mag.phil. Robert Ehrenleitner, BEng.
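
The fallback discussed in this thread (use the slug, then the file name, when the title is empty) could be sketched roughly as follows. This is only an illustration in Python, not the crawler's actual code, which is PHP; the function name `effective_title` and its parameters are hypothetical.

```python
from pathlib import PurePosixPath


def effective_title(title, slug=None, path=None):
    """Return the first non-blank value among title, slug and file name.

    Returns None when every candidate is missing or whitespace-only, so
    the caller can decide whether to skip the document or use a constant.
    """
    file_name = PurePosixPath(path).name if path else None
    for candidate in (title, slug, file_name):
        if candidate is not None and candidate.strip():
            return candidate.strip()
    return None


# Example with the file path from the error message in this thread:
# an empty title and no slug fall through to the file name.
print(effective_title("", None, "wp-content/uploads/loremipsum.pdf"))
```

Treating whitespace-only titles as empty is a design choice made here on the assumption that a blank title is no more useful to an end user than a missing one.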