On 7/22/2021 11:53 AM, Jon Morisi wrote:
RE Shawn and Michael,
I am just looking for a way to speed it up. Mike Drob had mentioned docvalues,
which is why I was researching that route.
I am running my search tests from the Solr admin UI, no facets, no sorting. I am
using -Dsolr.directoryFactory=HdfsDirectoryFactory
Getting good caching with HDFS is something I am not sure how to do. I
would bet that you have to assign a whole bunch of memory to the Solr
heap and then allocate a lot of that to the HDFS client for caching
purposes.
You can take a look at this wiki page I wrote, but keep in mind that it
is tailored for local disks, not HDFS:
https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems
Is there any way you can switch to local disks instead of HDFS? Solr
tends to perform badly with indexes on the network instead of local.
What are you trying to achieve with your usage of HDFS?
URL:
. /select?q=ptokens:8974561 AND ptokens:9844554 AND ptokens:8564484 AND
ptokens:9846541&echoParams=all
Response once it ran (timeout on first attempt, waited 5min for re-try):
responseHeader
zkConnected true
status 0
QTime 2411
params
q "ptokens:243796009 AND ptokens:410512000 AND ptokens:410604004 AND
ptokens:408729009"
df "data"
rows "10"
echoParams "all"
What is the field definition for ptokens and what is the fieldType
definition for the type referenced in the field definition? If this
field is set up as a numeric Point type, you're running into a known
limitation -- single-value lookups on Point fields are slow, and if the
field cardinality is high, then make that VERY slow. The workaround
would be to switch to either a String type or a Trie type, and
completely reindex. Trie types are deprecated and will eventually be
removed from Solr. Or you could turn the query into a range query, and
it would work much better -- Point types are EXCELLENT for range queries.
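A rough sketch of what the range-query rewrite could look like, in Python.
This assumes the suggestion means giving each value identical lower and
upper bounds (ptokens:[N TO N]); the helper names are made up for
illustration, and whether this actually helps should be timed against the
real index rather than taken for granted.

```python
from urllib.parse import urlencode


def exact_terms_query(field, values):
    """Build the original style of query: field:v1 AND field:v2 ..."""
    return " AND ".join(f"{field}:{v}" for v in values)


def range_terms_query(field, values):
    """Build the same conjunction with each value written as an inclusive
    range whose lower and upper bounds are equal (assumed rewrite)."""
    return " AND ".join(f"{field}:[{v} TO {v}]" for v in values)


def select_params(q, rows=10):
    """URL-encode the parameters for a /select request."""
    return urlencode({"q": q, "rows": rows, "echoParams": "all"})


if __name__ == "__main__":
    values = [8974561, 9844554, 8564484, 9846541]
    print(exact_terms_query("ptokens", values))
    print(range_terms_query("ptokens", values))
    print(select_params(range_terms_query("ptokens", values)))
```

Running both forms a few times each and comparing QTime would show
whether the rewrite makes any difference on this index.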
dashboard info:
System: 0.16 0.13 0.14
Physical Memory: 97.7% (368.77 GB used of 377.39 GB)
Swap Space: 4.7% (193.25 MB used of 4.00 GB)
File Descriptor Count: 0.2% (226 of 128000)
JVM-Memory: 22.7% (15.33 GB, 15.33 GB)
If disabling swap as Michael is suggesting DOES make performance better,
I think that would be an indication of some very strange system level
problems. I don't expect it to change anything.
Thanks,
Shawn