Re: how to fix full-import indexed document not shown on query search

Shawn Heisey Thu, 07 Dec 2023 13:49:01 -0800

On 12/7/23 07:56, Vince McMahon wrote:

{
   "responseHeader": {
     "status": 0,
     "QTime": 0
   },
   "initArgs": [
     "defaults",
     [
       "config",
       "db-data-config.xml"
     ]
   ],
   "command": "status",
   "status": "idle",
   "importResponse": "",
   "statusMessages": {
     "Total Requests made to DataSource": "1",
     "Total Rows Fetched": "915000",
     "Total Documents Processed": "915000",
     "Total Documents Skipped": "0",
     "Full Dump Started": "2023-12-07 02:54:29",
     "": "Indexing completed. Added/Updated: 915000 documents. Deleted 0 documents.",
     "Committed": "2023-12-07 02:54:51",
     "Time taken": "0:0:21.831"
   }
}

There's no way Solr can index 915000 docs in 21 seconds without a LOT of threads in the indexing program, and DIH is single-threaded. As you've already noted, it didn't actually index most of the documents. I don't have an answer as to why it didn't work.
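
To put a number on that claim, here is a quick sketch of the arithmetic using only the values from the status response quoted above. The helper name docs_per_second is made up for illustration; it works out to roughly 42,000 documents per second from a single thread.

```python
def docs_per_second(docs: int, time_taken: str) -> float:
    """Convert a DIH 'Time taken' string (H:M:S.mmm) into a docs/sec rate."""
    hours, minutes, seconds = time_taken.split(":")
    elapsed = int(hours) * 3600 + int(minutes) * 60 + float(seconds)
    return docs / elapsed


# Values copied from the status response quoted above.
rate = docs_per_second(915000, "0:0:21.831")
print(f"{rate:.0f} docs/sec")
```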

DIH lacks decent logging, error handling, and multi-threading. It is not the most reliable way to index. This is why it was deprecated a while back and then removed from 9.x. You would be far better off writing your own indexing program rather than using DIH.
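
As a rough idea of what "your own indexing program" could look like, here is a minimal single-threaded sketch in Python using only the standard library. It assumes the rows have already been read from the database as dicts and that the core accepts a JSON array of documents on its update handler. The function names, the batch size of 1000, and any URL you pass in are placeholders, not anything from your setup.

```python
import json
import urllib.request
from itertools import islice
from typing import Iterable, Iterator


def batched(rows: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield lists of at most `size` rows so no single request is huge."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


def build_update_request(update_url: str, docs: list[dict]) -> urllib.request.Request:
    """Build a POST that sends a JSON array of documents to the update handler."""
    return urllib.request.Request(
        update_url,
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_rows(rows: Iterable[dict], update_url: str, batch_size: int = 1000) -> int:
    """Send rows in batches; an HTTP error raises instead of being swallowed."""
    sent = 0
    for docs in batched(rows, batch_size):
        with urllib.request.urlopen(build_update_request(update_url, docs)) as resp:
            resp.read()  # drain the response body before moving on
        sent += len(docs)
    return sent
```

Calling index_rows(rows, "http://localhost:8983/solr/mycore/update?commit=true") with a hypothetical core name would return the number of documents sent, which you can then compare against numFound from a *:* query. Unlike the DIH status above, a failed batch stops the run with an exception, so a mismatch is visible right away.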

I have an idea for a multi-threaded database->solr indexing program, but haven't had much time to spend on it. If I can ever get it done, it will be freely available.

On the entity, "rows" is not a valid attribute. To control how many DB rows are fetched at a time, set batchSize on the dataSource element. The default batchSize is 500.
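
To illustrate where the attribute belongs, here is a small Python sketch that parses a made-up data-config and flags the two points above. The driver, url, user, entity name, and query in the example are hypothetical placeholders, and check_config is only an illustration, not a full DIH validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-config: driver, url, user, and query are placeholders.
# batchSize sits on dataSource; the stray "rows" on the entity is the mistake.
EXAMPLE_CONFIG = """
<dataConfig>
  <dataSource driver="org.postgresql.Driver"
              url="jdbc:postgresql://localhost:5432/mydb"
              user="solr" batchSize="500"/>
  <document>
    <entity name="item" query="SELECT id, name FROM item" rows="1000"/>
  </document>
</dataConfig>
"""


def check_config(xml_text: str) -> list[str]:
    """Report the two issues described above; not a full DIH validator."""
    root = ET.fromstring(xml_text)
    problems = []
    for entity in root.iter("entity"):
        if "rows" in entity.attrib:
            problems.append(f"entity '{entity.get('name')}': 'rows' is not a valid attribute")
    for source in root.iter("dataSource"):
        if "batchSize" not in source.attrib:
            problems.append("dataSource: batchSize not set, default applies")
    return problems


for problem in check_config(EXAMPLE_CONFIG):
    print(problem)
```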


Thanks,
Shawn
