Re: Slow softCommits under heavy load?

2023-07-24 Thread Koen De Groote
Thanks for your advice, I'll try this out.

Regards,
Koen

On Mon, Jul 24, 2023 at 3:47 AM Rahul Goswami  wrote:

> Ok if the count for cache autoWarm for the various caches is 0 then there
> is no cache warmup, so that shouldn’t be contributing to the slowness.
>
> For now, I would recommend increasing the autoSoftCommit interval to a
> higher number like 60000 (1 min) and see if you observe any difference in
> performance, although stopping those softCommits upon each client update
> call (especially during heavy write load) is what should help more.
>
> -Rahul
>
> On Sun, Jul 23, 2023 at 3:54 PM Koen De Groote  wrote:
>
> > Point taken.
> >
> > Going over the code, I am seeing autowarmCount="0" a few times in the
> > config XML near various LRU and fastLRU cache definitions. Not seeing
> > specific queries defined in any XML.
> >
> > Regards,
> > Koen
> >
> > On Sun, Jul 23, 2023 at 7:10 PM Rahul Goswami 
> > wrote:
> >
> > > “The application in question was creating a document per interaction
> > > and doing a soft commit at the end of the interaction.“
> > >
> > > You also mentioned your autoSoftCommit interval is 1 sec. If you really
> > > need NRT, I would suggest the client stop sending a softCommit upon each
> > > insert since the (extremely) short autoSoftCommit interval is anyway
> > > taking care of making the writes available immediately.
> > >
> > > If that is not possible, try increasing the autoSoftCommit interval in
> > > solrconfig. You don’t need both. Also, do you have any autoWarm cache
> > > queries?
> > >
> > > -Rahul
> > >
> > >
> > >
> > > On Sun, Jul 23, 2023 at 1:00 PM Koen De Groote wrote:
> > >
> > > > According to monitoring, there's no increase in use of file handles or
> > > > file descriptors in the period of heavy load, on the entire system.
> > > >
> > > > On Sun, Jul 23, 2023 at 3:30 PM ufuk yılmaz wrote:
> > > >
> > > > > Can it be related to file descriptor/open file handles limit?
> > > > >
> > > > > —
> > > > >
> > > > > > On 23 Jul 2023, at 14:24, Koen De Groote wrote:
> > > > > >
> > > > > > Shawn,
> > > > > >
> > > > > > After having a look at these files: No, I cannot share them.
> > > > > >
> > > > > > What I can say is that there's a couple hundred fields,
> > > > > > dynamicFields and copyFields (each).
> > > > > >
> > > > > > The updatehandler uses solr.DirectUpdateHandler2 (the only one I can
> > > > > > see in the source code extending the regular updateHandler), with a
> > > > > > max autoCommit time of 6 and a max autoSoftCommit time of 1000
> > > > > >
> > > > > > Regards,
> > > > > > Koen
> > > > > >
> > > > > >
> > > > > >> On Sun, Jul 23, 2023 at 1:43 AM Shawn Heisey <elyog...@elyograg.org> wrote:
> > > > > >>
> > > > > >>> On 7/22/23 17:09, Koen De Groote wrote:
> > > > > >>> Recently, I experienced softCommits taking up to 30 seconds to
> > > > > >>> return. The application in question was creating a document per
> > > > > >>> interaction and doing a soft commit at the end of the interaction.
> > > > > >>> After a period of a few dozen clients sending a continuous stream
> > > > > >>> of such interactions, I could see it getting slower and slower and
> > > > > >>> the source appears to be the softCommit.
> > > > > >>>
> > > > > >>> No changes in the XML config have been provided in terms of commit
> > > > > >>> timings. The JVM is given 20GB heap space, of which it seems to
> > > > > >>> hang steady at 10GB at all times throughout usage, and there's some
> > > > > >>> 25M documents getting up to a total of 150GB of data on disk.
> > > > > >>> Everything is on 1 shard, with 2 hosts each having 1 instance of
> > > > > >>> the collection. The underlying disk is an SSD on both hosts.
> > > > > >>>
> > > > > >>> Before I dive into documentation or code, I was wondering if anyone
> > > > > >>> here might have immediate ideas of what could cause such behavior
> > > > > >>> for soft commits.
> > > > > >>>
> > > > > >>> If someone has an immediate bit of knowledge towards what causes
> > > > > >>> softCommits to take up to 30 seconds, that'd be appreciated.
> > > > > >>
> > > > > >> Can you share the whole config -- solrconfig.xml, the schema, and
> > > > > >> any file(s) referenced by either of those.  The schema may be named
> > > > > >> managed-schema.xml, managed-schema, or schema.xml (or even something
> > > > > >> different) depending on Solr version and the rest of the config.
> > > > > >>
> > > > > >> Normally there isn't anything sensitive in these files, but if you
> > > > > >> do have something, redact it as minimally as possible, don't just
> > > > > >> delete the whole section with the sensitive data.
> > >
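
A minimal sketch of the client-side change suggested in this thread, assuming the application talks to Solr over HTTP with JSON. The host, the collection name "interactions", and the document fields are hypothetical. The request is built without softCommit=true or commit=true, so visibility of new documents is left to the autoSoftCommit interval in solrconfig.xml; the optional commitWithin parameter is shown only as an alternative.

```python
import json
import urllib.parse
import urllib.request


def build_add_request(base_url, collection, docs, commit_within_ms=None):
    """Build (but do not send) a JSON update request with no explicit commit."""
    params = {}
    if commit_within_ms is not None:
        # commitWithin is a standard Solr update parameter, in milliseconds.
        params["commitWithin"] = str(commit_within_ms)
    url = f"{base_url.rstrip('/')}/{collection}/update"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, collection, and document, for illustration only.
req = build_add_request(
    "http://localhost:8983/solr",
    "interactions",
    [{"id": "interaction-1", "type_s": "example"}],
)
print(req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

With many clients writing concurrently, this leaves searcher reopening to one timer on the server instead of one soft commit per interaction.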

LTR Features on nested documents

2023-07-24 Thread Sergio García Maroto
Hi,

I am trying to set up a list of features within LTR.
I have a collection "person" with a design of two levels. I have Person
documents with nested documents classified as jobs.

Within the job level I have two fields describing if the job is current and
recency. I would like to incorporate these two as features.
Sample of two documents, one for a person and another one for a job.
{ "PersonID":22095, "NameFullD":"Peter Peter", "_root_":"22095", "type_level":"parent" },
{ "type_level":"job", "_root_":"22095", "IsCurrent":"true", "JobEndDate":"2021-05-30" }

My query runs as a block join query targeting the child job documents and
returns people as parent documents.
q="({!type=parent which=type_level:parent v='((CompanyNameNSD:ibm) AND
(type_level:(job)))' score=total} AND type_level:(parent)))"

My question is about features on nested documents. Is it possible to get
the feature value back?
I tried it this way, but it seems to work only when the query targets only
children documents and gets back children. When I introduce {!type=parent
which=type_level:parent
it no longer works. I get back

isCurrentJob=0.0,originalScore=1.7668228"


Feature store sample
 [
  {
"store" : "personFeatureStore",
"name" : "isCurrentJob",
"class" : "org.apache.solr.ltr.feature.SolrFeature",
"params" : {
  "fq": ["{!terms f=PrimaryNS}true"]
}
  },
  {
"store" : "personFeatureStore",
"name" : "originalScore",
"class" : "org.apache.solr.ltr.feature.OriginalScoreFeature",
"params" : {}
  }
]
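
One possible explanation, stated as an assumption and not a confirmed diagnosis: LTR features are evaluated against the documents the query returns, which here are the parents, so a filter on a field that exists only on the job documents would come back as 0.0. The sketch below builds a hypothetical alternative feature, "hasCurrentJob", whose "q" is itself a block join so that it can match parent documents, plus the fl parameter that asks for feature values. The feature name, the use of IsCurrent in the child query, and the simplified main query are illustrative and untested.

```python
import json
import urllib.parse

STORE = "personFeatureStore"

# Hypothetical replacement for the isCurrentJob feature: a SolrFeature whose
# "q" is a block join, so it is evaluated against parent documents.
# Assumption, untested: IsCurrent is indexed on the child (job) documents.
features = [
    {
        "store": STORE,
        "name": "hasCurrentJob",
        "class": "org.apache.solr.ltr.feature.SolrFeature",
        "params": {"q": "{!parent which=type_level:parent}IsCurrent:true"},
    },
    {
        "store": STORE,
        "name": "originalScore",
        "class": "org.apache.solr.ltr.feature.OriginalScoreFeature",
        "params": {},
    },
]


def feature_logging_params(query, store, id_field="PersonID"):
    """Query parameters asking Solr to return feature values for each hit."""
    return {"q": query, "fl": f"{id_field},score,[features store={store}]"}


# Body for an upload to the collection's schema/feature-store endpoint.
payload = json.dumps(features, indent=2)

# Simplified form of the block join query from the message above.
params = feature_logging_params(
    "{!parent which=type_level:parent score=total}CompanyNameNSD:ibm", STORE
)
print(urllib.parse.urlencode(params))
```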


Regards,
Sergio Maroto


Re: [EXTERNAL] Re: upgrade to 8.6 to 9.2

2023-07-24 Thread Adam Constabaris
If the API endpoints and querying still work (you didn't mention this, but
the fact that you mentioned the admin console specifically suggests that
perhaps that's the only thing not working), and you've previously accessed
the console using the same hostname, perhaps it's cached JS/CSS that's
getting in the way?  You can clear the cache in most browsers by holding
down shift and reloading the page.



On Fri, Jul 21, 2023 at 11:04 AM Oakley, Craig (NIH/NLM/NCBI) [C]
 wrote:

> One thing that comes to mind is to have this in your start.sh script:
>
> export SOLR_JETTY_HOST="0.0.0.0"
>
> -Original Message-
> From: Shawn Heisey 
> Sent: Friday, July 21, 2023 8:48 AM
> To: users@solr.apache.org
> Subject: [EXTERNAL] Re: upgrade to 8.6 to 9.2
>
> On 7/20/23 20:23, Arin Ekandem wrote:
> > I performed upgrade from 8.6 to 9.2. Solr starts but the console no
> longer renders.
> >
> > Are there particular areas I should look at that would cause this? I
> have verified that /etc/default/solr.in.sh is configured correctly.
>
> By "console" are you talking about the admin UI?  I am not aware of any
> kind of console for Solr.
>
> We will need to start with solr.log and see if there is anything useful
> there.
>
> Attachments don't work on the mailing list, so place the file on a paste
> or file sharing site and give us a URL.
>
> Thanks,
> Shawn


Add a new Shard to the collection

2023-07-24 Thread HariBabu kuruva
Hi All,

I would like to add a new shard to the existing collection to have better
performance.  Currently we have one shard.

Solr - 8.11.1
Nodes(servers) - 10 (Non prod - 4 nodes)
Zookeepers-5

I have tried the SPLITSHARD command in one of the non prod environments.

https://solrserver.corp.company.com:8981/solr/admin/collections?action=SPLITSHARD&collection=abcStore&shard=shard1

Now I can see a total of 3 shards:
Shard1
Shard1_0
Shard1_1

But Shard1 is shown as inactive. Please let me know if we need to remove
it.

Please let me know whether this is the correct way of splitting the shard.
Is there any impact on the data because of this?
What measures should be taken when doing this in a PROD environment?

-- 

Thanks and Regards,
 Hari
Mobile:9790756568
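
For reference, a small sketch of the Collections API calls involved, using the host and collection from the message above. After a successful split the parent shard is expected to show as inactive while shard1_0 and shard1_1 serve the data; the inactive parent can later be removed with DELETESHARD. The async request id "split-shard1" is a made-up value, and the sketch only builds URLs without sending anything.

```python
import urllib.parse


def collections_api_url(base_url, action, params):
    """Build a Collections API URL for the given action and parameters."""
    query = urllib.parse.urlencode({"action": action, **params})
    return f"{base_url.rstrip('/')}/admin/collections?{query}"


BASE = "https://solrserver.corp.company.com:8981/solr"

# Run the split asynchronously so a long split does not hit an HTTP timeout.
split_url = collections_api_url(
    BASE,
    "SPLITSHARD",
    {"collection": "abcStore", "shard": "shard1", "async": "split-shard1"},
)

# Poll the async request, then inspect shard states before any cleanup.
status_url = collections_api_url(BASE, "REQUESTSTATUS", {"requestid": "split-shard1"})
cluster_url = collections_api_url(BASE, "CLUSTERSTATUS", {"collection": "abcStore"})

# Remove the inactive parent only after the sub-shards are confirmed active.
delete_url = collections_api_url(
    BASE, "DELETESHARD", {"collection": "abcStore", "shard": "shard1"}
)

for url in (split_url, status_url, cluster_url, delete_url):
    print(url)
```

Checking CLUSTERSTATUS between the split and the delete is a cautious ordering for a PROD run, not a requirement stated in this thread.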


Json ambiguous condition b/n a flat and child doc for partial updates

2023-07-24 Thread rajani m
Hi Solr Users,

Sending a partial update in json format to a doc that has only 2 fields {
"id":"10", "contributor_name":"john"} fails with an error as seen below.

{ "id":"10", "contributor_name":{"set":"Doe"}}

Unable to index docs with children: the schema must include definitions for
both a uniqueKey field and the '_root_' field, using the exact same
fieldType => org.apache.solr.common.SolrException: Unable to index docs
with children: the schema must include definitions for both a uniqueKey
field and the '_root_' field, using the exact same fieldType

The same does not fail if the payload is xml.


  
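<add>
  <doc>
    <field name="id">10</field>
    <field name="contributor_name" update="set">"Doe"</field>
  </doc>
</add>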
  


There are no parent child docs in the index and the index has an "id" field
as a unique string type key. There is no _root_ field in the index.

Trying to figure out the cause and any alternative to avoid the issue while
sending doc as json?

Thank you,
Rajani
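
A possible explanation, offered as an assumption since the message does not say which endpoint or tool was used: if the JSON goes to the /update/json/docs handler, a nested object such as {"set":"Doe"} may be read as a child document, which would match the error text. The sketch below builds the same atomic update as a JSON array of documents aimed at the plain /update handler. The host and the collection name "mycollection" are hypothetical.

```python
import json
import urllib.request


def build_atomic_set(base_url, collection, doc_id, field, value):
    """Build (but do not send) an atomic 'set' update as a JSON array of docs."""
    payload = [{"id": doc_id, field: {"set": value}}]
    url = f"{base_url.rstrip('/')}/{collection}/update"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and collection name, for illustration only.
req = build_atomic_set(
    "http://localhost:8983/solr", "mycollection", "10", "contributor_name", "Doe"
)
print(req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```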


Re: Json ambiguous condition b/n a flat and child doc for partial updates

2023-07-24 Thread rajani m
The field type is SortableTextField



On Mon, Jul 24, 2023 at 2:03 PM rajani m  wrote:

> Hi Solr Users,
>
> Sending a partial update in json format to a doc that has only 2 fields {
> "id":"10", "contributor_name":"john"} fails with an error as seen below.
>
> { "id":"10", "contributor_name":{"set":"Doe"}}
>
> Unable to index docs with children: the schema must include definitions
> for both a uniqueKey field and the '_root_' field, using the exact same
> fieldType => org.apache.solr.common.SolrException: Unable to index docs
> with children: the schema must include definitions for both a uniqueKey
> field and the '_root_' field, using the exact same fieldType
>
> The same does not fail if the payload is xml.
>
> 
>   
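> <add>
>   <doc>
>     <field name="id">10</field>
>     <field name="contributor_name" update="set">"Doe"</field>
>   </doc>
> </add>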
>   
> 
>
> There are no parent child docs in the index and the index has an "id"
> field as a unique string type key. There is no _root_ field in the index.
>
> Trying to figure out the cause and any alternative to avoid the issue
> while sending doc as json?
>
> Thank you,
> Rajani
>
>


BlendedInfixSuggester replication

2023-07-24 Thread r ohara
Hello all,

We are using Solr 8.11.2 in solrcloud mode and using the
BlendedInfixSuggester for autocomplete for our site. We have a very large
index and it takes almost 2 days to finish building so during this time,
autosuggest isn't available. It's a TLOG/PULL replica setup, so we tried to
build on the TLOG, and copy over the blendedInfixSuggesterIndexDir to the
PULL replicas but we just get empty results back. I found this ticket
(https://issues.apache.org/jira/browse/SOLR-866) which implies that
replication is not supported. Is there a good way to deal with this? We
have continuous updates so we would like to build at least once a week.

Thanks in advance