Advice in order to optimise resource usage of a huge server

2022-10-06 Thread Dominique Bejean
Hi, One of our customers has huge servers - Bare-metal - 64 CPU - 512 GB RAM - 6x2 TB disks in RAID 6 (so 2 TB disk space available) I think the best way to optimize resource usage of these servers is to install several Solr instances. I imagine 2 scenarios to be tested according to d

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread matthew sporleder
Why do you want to split it up at all?

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread Gus Heck
It depends... on your data, on your usage, etc. The best answers are obtained by testing various configurations, if possible by replaying captured query load from production. There is (for all java programs) an advantage to staying under 32 GB RAM, but without an idea of the number of machines you

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread Deepak Goel
What would the IOPS look like? Deepak

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread dmitri maziuk
On 2022-10-06 2:57 AM, Dominique Bejean wrote: Do not configure disks in RAID 6 but, leave 6 standard volumes (more space disk, more I/O available) If they're running linux: throw out the raid controller, replace with ZFS on 2 SSDs and 4 spinning rust drives. You're not going to have more i/

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread Shawn Heisey
On 10/6/22 01:57, Dominique Bejean wrote: One of our customer have huge servers - Bar-metal - 64 CPU - 512 Gb RAM - 6x2Tb disk in RAID 6 (so 2Tb disk space available) I think the best way to optimize resources usage of these servers is to install several Solr instances. That

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread Dave
I know these machines. Sharding is kind of useless. Set the SSD TB drives up in the fastest-read RAID available, 31 GB for Xms and Xmx, one Solr instance. Buy backup SSD drives for when you burn one out and it fails over to the master server. Multiple Solr instances on one machine make little sense unless they h

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread Walter Underwood
We have kept a 72 CPU machine busy with a single Solr process, so I doubt that multiple processes are needed. The big question is the size of the index. If it is too big to fit in RAM (OS file buffers), then the system is IO bound and CPU doesn’t really matter. Everything will depend on the spe
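A rough sketch of the "does the index fit in OS file buffers" check described above. The 512 GB figure comes from the original post; the 31 GB heap, the 8 GB OS reserve, and the index sizes are illustrative assumptions, not measurements from this system.

```python
def page_cache_budget_gb(total_ram_gb, jvm_heap_gb, os_overhead_gb=8.0):
    """RAM left for OS file buffers after the JVM heap and a rough OS reserve.

    os_overhead_gb is an assumed placeholder, not a measured value.
    """
    return max(total_ram_gb - jvm_heap_gb - os_overhead_gb, 0.0)


def index_fits_in_cache(index_size_gb, total_ram_gb, jvm_heap_gb, os_overhead_gb=8.0):
    """True if the on-disk index could be held entirely in the file cache."""
    budget = page_cache_budget_gb(total_ram_gb, jvm_heap_gb, os_overhead_gb)
    return index_size_gb <= budget


if __name__ == "__main__":
    # Hypothetical index sizes on a 512 GB server with a 31 GB heap.
    for size_gb in (400, 900):
        print(size_gb, "GB index fits in cache:", index_fits_in_cache(size_gb, 512, 31))
```

If the check fails, the reasoning in the message applies: the workload is likely IO bound, and adding CPU-oriented tuning is unlikely to help much.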

Utilizing the Script Update Processor

2022-10-06 Thread Matthew Castrigno
The documentation here: https://solr.apache.org/guide/solr/latest/configuration-guide/script-update-processor.html#javascript provides an example using the Script Update processor. When accessing the script, the example includes reference to it in the parameters. https://solr.apache.org/guide/sol

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread Dominique Bejean
Hi, Thank you all for your responses. I will try to answer your questions in one single message. We are starting to investigate performance issues with a new customer. There are several bad practices (commit, sharding, replicas count and types, heap size, ...), that can explain these issues and w

Re: Solr Admin Connection reset when connecting to Zookeeper

2022-10-06 Thread Vishal Shanbhag
Hi All, I did a further search and found a known bug (now resolved) of Solr https://issues.apache.org/jira/browse/SOLR-15849 Cursory reading of the notes suggest that this may be the root cause of the below mentioned issue. Can anyone confirm ? We are planning to upgrade our Solr environmen

custom sharding

2022-10-06 Thread Dmitry Prus
Hello, dear Solr team. I hope you are doing well. Would you help me with a little question? I need to separate data by shards when I commit with SolrJ. For example, fields of my collection: phoneNo: +9985612525, country: UZ; phoneNo: +3809523636, country: UKR. All UKR goes to Shard 1 of my co

SOLR internal error

2022-10-06 Thread Biswas, Akash (ELS-BLR)
Hello Community Members, I am doing a query in SOLR and solr is throwing error as given below: "error":{ "msg":"0", "trace":"java.lang.ArrayIndexOutOfBoundsException: 0\n\tat org.apache.lucene.util.QueryBuilder.newSynonymQuery(QueryBuilder.java:653)\n\tat org.apache.solr.parser.SolrQu

Re: SOLR removal

2022-10-06 Thread L H
Hello dear members! I registered to receive solr community emails because we were using solr in my previous company. Now, i am no longer using solr in my assessments. COULD ADMIN PLEASE REMOVE MY EMAIL FROM THE MAILING LIST? MY EMAIL: leoncehavugim...@gmail.com Thank you in advance! Kind

Re: Utilizing the Script Update Processor

2022-10-06 Thread Matthew Castrigno
I attempted to add the scripting module as described here by adding    SOLR_MODULES=scripting in solr.xml but got the following error message when I attempted to start: HTTP ERROR 404 javax.servlet.UnavailableException: Error processing the request. CoreContainer is either not initialized or shut

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread dmitri maziuk
On 2022-10-06 4:54 PM, Dominique Bejean wrote: Storage configuration is the second point that I would like to investigate in order to better share disk resources. Instead of having one single RAID 6 volume, isn't it better to have one distinct non-RAID volume per Solr node (if multiple Solr nodes are

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread Dominique Bejean
Thank you Dima. Updates are highly multi-threaded batch processes at any time. We won't have the whole index in RAM cache. Disks are SSD. Dominique

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread Dave
You should never index directly into your query servers, by the way. Index to the indexing server, replicate out to your query servers, and tune each as needed.

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread dmitri maziuk
You'd have to benchmark, preferably with your real jobs, on RAID-10 (as per my previous e-mail) vs JBOD. I suspect you wo

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread Walter Underwood
Run a GC analyzer on that JVM. I cannot imagine that they need 80 GB of heap. I’ve never run with more than 16 GB, even for a collection with 70 million documents. Look at the amount of heap used after full collections. Add a safety factor to that, then use that heap size. wunder Walter Underw
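The sizing rule above (heap used after full collections, plus a safety factor) can be sketched as simple arithmetic. The sample readings and the 1.5 safety factor are hypothetical; the message does not give a specific factor, and the readings would come from whatever GC analyzer is used.

```python
def recommended_heap_gb(used_after_full_gc_gb, safety_factor=1.5):
    """Suggest a heap size from post-full-GC heap usage samples.

    used_after_full_gc_gb: heap occupancy (GB) observed right after full
    collections, e.g. read from a GC log. safety_factor is an assumed
    multiplier, not a value taken from the thread.
    """
    if not used_after_full_gc_gb:
        raise ValueError("need at least one post-full-GC sample")
    if safety_factor < 1.0:
        raise ValueError("safety_factor should be >= 1.0")
    # Size for the worst observed live set, then add headroom.
    return max(used_after_full_gc_gb) * safety_factor


if __name__ == "__main__":
    samples = [6.0, 7.5, 8.0]  # hypothetical readings, in GB
    print("suggested heap (GB):", recommended_heap_gb(samples))
```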

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread James Greene
A reason for sharding on a single server is the limit of about 2.1 billion documents per core.
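To illustrate the per-core document limit mentioned above, here is a minimal sketch that estimates the fewest shards a collection would need. It uses the signed 32-bit integer maximum as an upper bound for the limit, which is an assumption made for illustration; the exact ceiling enforced by the underlying library is somewhat lower, and the document counts are hypothetical.

```python
# Upper bound on documents per core, assumed here to be the signed
# 32-bit integer maximum (about 2.1 billion).
MAX_DOCS_UPPER_BOUND = 2**31 - 1


def min_shards(total_docs, max_docs_per_core=MAX_DOCS_UPPER_BOUND):
    """Smallest number of shards so that no core exceeds the per-core limit."""
    if total_docs < 0:
        raise ValueError("total_docs must be non-negative")
    if max_docs_per_core <= 0:
        raise ValueError("max_docs_per_core must be positive")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return max(1, -(-total_docs // max_docs_per_core))


if __name__ == "__main__":
    for docs in (70_000_000, 5_000_000_000):  # hypothetical collection sizes
        print(docs, "docs ->", min_shards(docs), "shard(s) at minimum")
```

This only gives a lower bound; in practice the shard count would also depend on query latency and indexing throughput, as discussed elsewhere in the thread.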

Re: Advice in order to optimise resource usage of a huge server

2022-10-06 Thread Dominique Bejean
Hi Dave, Are you suggesting using the historical Solr master/slave architecture? In a SolrCloud / SolrJ architecture this can be achieved by creating only TLOG replicas, then FORCELEADER located on a specific server (the indexing server), and searching only on TLOG replicas with the parameter "shards.pre