Re: LTR Features on nested documents

2023-07-27 Thread Alessandro Benedetti
Hi Sergio,
in the block join, under the hood, nested docs are separate Lucene (Solr)
documents.
Assuming you are retrieving parents after querying children (
https://solr.apache.org/guide/solr/latest/query-guide/block-join-query-parser.html#block-join-parent-query-parser),
the parents are all you have for reranking.
So you can't calculate features on children (which are separate docs
anyway).
It could be a nice contribution though; if you want to work on this, I'm happy
to review!
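
To illustrate the point, here is a toy Python sketch. It is not Solr code: the dictionaries, the child id "22095-job1", and the helper names block_join_parents and feature_is_current are made up for illustration. It models parent and child as separate documents and shows that a filter-style feature evaluated against the returned parents never sees the children's fields:

```python
# Toy model, not Solr internals: a block is a set of separate documents
# that share the same _root_. The child id below is hypothetical.
docs = [
    {"id": "22095", "_root_": "22095", "type_level": "parent",
     "NameFullD": "Peter Peter"},
    {"id": "22095-job1", "_root_": "22095", "type_level": "job",
     "CompanyNameNSD": "ibm", "IsCurrent": "true", "JobEndDate": "2021-05-30"},
]


def block_join_parents(all_docs, child_match):
    """Return the parent documents of every child accepted by child_match."""
    roots = {d["_root_"] for d in all_docs
             if d["type_level"] != "parent" and child_match(d)}
    return [d for d in all_docs
            if d["type_level"] == "parent" and d["_root_"] in roots]


def feature_is_current(doc):
    """Filter-style feature: 1.0 if this document itself has IsCurrent=true."""
    return 1.0 if doc.get("IsCurrent") == "true" else 0.0


# The child query matches the job, but only the parent is returned.
parents = block_join_parents(docs, lambda d: d.get("CompanyNameNSD") == "ibm")

# The feature is evaluated on the returned documents (the parents) only.
parent_features = [feature_is_current(d) for d in parents]
child_features = [feature_is_current(d) for d in docs if d["type_level"] == "job"]
```

In this toy model the feature is 0.0 on the parent and 1.0 on the child, which mirrors the isCurrentJob=0.0 reported in the original message.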

Cheers
--
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*

e-mail: a.benede...@sease.io


*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io 



On Mon, 24 Jul 2023 at 12:23, Sergio García Maroto 
wrote:

> Hi,
>
> I am trying to set up a list of features within LTR.
> I have a collection *"person"* with a two-level design. I have Person
> documents with nested documents classified as jobs.
>
> At the job level I have two fields describing whether the job is current and
> its recency. I would like to incorporate these two as features.
> Sample of two documents, one for a person and another one for a job.
> { "PersonID":22095, "NameFullD":"Peter Peter", "_root_":"22095",
>   "type_level":"parent" },
> { "type_level":"job", "_root_":"22095", "IsCurrent":"true",
>   "JobEndDate":"2021-05-30" }
>
> My query runs as a block join query targeting the child job documents and
> returns people as parent documents.
> q="({!type=parent which=type_level:parent v='((CompanyNameNSD:ibm) AND
> (type_level:(job)))' score=total} AND type_level:(parent)))"
>
> My question is about features on nested documents. Is it
> possible to get the feature value back?
> I tried it this way, but it seems to work only when the query targets only
> child documents and returns children. When I introduce {!type=parent
> which=type_level:parent
> it no longer works. I get back
>
> isCurrentJob=0.0,originalScore=1.7668228"
>
>
> Feature store sample
> [
>   {
>     "store" : "personFeatureStore",
>     "name" : "isCurrentJob",
>     "class" : "org.apache.solr.ltr.feature.SolrFeature",
>     "params" : {
>       "fq": ["{!terms f=PrimaryNS}true"]
>     }
>   },
>   {
>     "store" : "personFeatureStore",
>     "name" : "originalScore",
>     "class" : "org.apache.solr.ltr.feature.OriginalScoreFeature",
>     "params" : {}
>   }
> ]
>
>
> Regards,
> Sergio Maroto
>


[Solr 9.3.0] Difficulties creating Category-Routed Aliases when using the SolrJ client

2023-07-27 Thread David Driver
Hi,

I'm looking at upgrading our Solr Docker image from 9.2.1 to 9.3.0. However,
after upgrading I'm finding that our unit tests fail in certain
scenarios when a SolrJ client is used. We use SolrJ 8.11.2, and our
applications cannot upgrade the client.

I am unable to create a category-routed alias when using SolrJ.
Specifically, it fails to create the initial collection. I can create a
category-routed alias when using the REST endpoint (the new one for 9.3.0).
I also have no issue creating normal collections using either SolrJ or
REST, so I am confused why only one of these four scenarios fails.

To reproduce this, you need to first initialise collection defaults on the
Solr Cluster. For the servers we test on we send the following to http://
:8983/api/cluster

{
  "set-obj-property": {
    "defaults" : {
      "collection": {
        "numShards": 3,
        "nrtReplicas": 3
      }
    }
  }
}

Our applications can then omit these values from any request used to create
a collection, since they aren't the applications' concern (in SolrJ, set these
values to null). When the issue occurs with SolrJ you will get an exception
with the message "numShards is a required param (when using CompositeId router)".
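
For anyone reproducing this without curl, the defaults request above can be sketched in Python using only the standard library. This is a minimal sketch: the base URL http://localhost:8983 is a placeholder (the host is elided in this message), and the function name build_cluster_defaults_request is made up for illustration. The request is built but not sent:

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the address of your own Solr node.
SOLR_BASE = "http://localhost:8983"


def build_cluster_defaults_request(num_shards, nrt_replicas, base_url=SOLR_BASE):
    """Build (but do not send) the POST that sets the collection defaults."""
    payload = {
        "set-obj-property": {
            "defaults": {
                "collection": {
                    "numShards": num_shards,
                    "nrtReplicas": nrt_replicas,
                }
            }
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/api/cluster",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_cluster_defaults_request(3, 3)
# To actually send it against a running cluster:
#     urllib.request.urlopen(request)
```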

Is there a way for me to be able to get this working using SolrJ?
Category-routed aliases are the main structure we use for indexes. If there
isn't a workaround we would need to keep our docker images on 9.2.1 since
we know that version works.

Cheers
Dave Driver


Re: solr 9.3.0 file permissions issues snapshot_metadata dir

2023-07-27 Thread rajani m
It was due to https://issues.apache.org/jira/browse/SOLR-16457: the home
was set to an empty string, and fixing that resolved it.

Thank you Justin, I appreciate the response and the GitHub link, which
led me to the above Jira issue.


On Wed, Jul 26, 2023 at 11:54 AM Justin Sweeney 
wrote:

> You'll probably want to look into your Java Security Policy settings
> as that is what causes this error. You can see default security policy
> included in the Solr distribution here:
>
> https://github.com/apache/solr/blob/b9100ba775defbed5114dd92526382047ce611dc/solr/server/etc/security.policy#L4
> .
> You'll want to make sure the right properties are being set for your
> Solr data directory to ensure Solr has file permissions to that
> directory.
>
> On Wed, Jul 26, 2023 at 10:48 AM rajani m  wrote:
> >
> > Hi,
> >
> >Trying to upgrade from 9.1.1 to the latest version solr 9.3.0,
> > encountered a file permissions issue specific to this directory
> > "snapshot_metadata" which is not seen in 9.1.1. It is there in 9.2.x and
> > the latest version.
> >
> >
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > access denied ("java.io.FilePermission"
> >
> "/mnt/data/solr/solr/legacy_v1_s1_shard3_replica_n1/data/snapshot_metadata"
> > "read")
> >
> > What is causing it?  If this directory is only accessed by backup and
> > restore features, and if we don't use that feature, can we delete this
> > directory and work without it?
> >
> > Thanks,
> > Rajani
>


what is SolrAuthV2 and why does it break replication

2023-07-27 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I have still not received any suggestions or clarifications, so I am resending 
this with a different subject.

I found that if I completely eliminate security.json, Leader/Follower 
replication succeeds; but for obvious reasons, we do want security.json to be 
there.

Setting -Dsolr.pki.sendVersion=v1 -Dsolr.pki.acceptVersions=v1,v2 does not 
help; nor does it work to set up security.json to allow replication without a 
password and to remove httpBasicAuthUser and httpBasicAuthPassword from 
solrconfig.xml on the Follower side.

Does anybody have any suggestions?

-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C]  
Sent: Tuesday, July 18, 2023 3:12 PM
To: users@solr.apache.org
Subject: RE: authentication for Leader/Follower replication

I am wondering whether anyone yet has any suggestions how to proceed

-Original Message-
From: Oakley, Craig (NIH/NLM/NCBI) [C]  
Sent: Thursday, July 6, 2023 4:00 PM
To: users@solr.apache.org
Subject: authentication for Leader/Follower replication

We are having problems transitioning Leader/Follower replication to Solr9.2.1

In Solr8.5 and below, what was then called Master/Slave replication had the 
annoying problem that, even though we specified httpBasicAuthUser and 
httpBasicAuthPassword, it would always attempt to connect first without a 
password before retrying with a password. This made solr.log noisy with lots of 
unnecessary login failures: but at least it worked.

When we transitioned to Solr8.11 (with the nomenclature changed to be less 
oppressive) we found that this version of Leader/Follower replication refused 
to retry (and refused to do anything with the values specified for 
httpBasicAuthUser and httpBasicAuthPassword). We needed to open up replication 
in security.json to be available without a password.

Now when we are preparing to upgrade to Solr9.2.1, we are having issues with 
the following:
2023-07-06 15:46:53.315 INFO  (indexFetcher-39-thread-1) [   ] 
o.a.s.h.IndexFetcher Last replication failed, so I'll force replication
2023-07-06 15:46:53.320 WARN  (indexFetcher-39-thread-1) [   ] 
o.a.s.h.IndexFetcher Leader at: 
http://[REDACTED]/solr/sequence2_shard1_replica_n1 is not available. Index 
fetch failed by exception: 
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException: Error 
from server at http://[REDACTED]/solr/sequence2_shard1_replica_n1: Expected 
mime type in [application/octet-stream, application/vnd.apache.solr.javabin] 
but got text/html. 


Error 401 Could not load principal from SolrAuthV2 header.

HTTP ERROR 401 Could not load principal from SolrAuthV2 header.

URI: /solr/sequence2_shard1_replica_n1/replication
STATUS: 401
MESSAGE: Could not load principal from SolrAuthV2 header.
SERVLET: default

I have added "blockUnknown":false to security.json and have confirmed that the 
replication?command=indexversion command can be run without a password, and 
that it can be run with the login and password specified in httpBasicAuthUser 
and httpBasicAuthPassword.
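
For reference, the two checks described above (anonymous and with Basic auth) can be sketched in Python with the standard library. This is only a sketch: the host leader.example.com, the credentials, and the helper name build_indexversion_request are hypothetical, and the requests are built but not sent:

```python
import base64
import urllib.request


def build_indexversion_request(core_url, user=None, password=None):
    """Build (but do not send) a GET for <core_url>/replication?command=indexversion."""
    request = urllib.request.Request(f"{core_url}/replication?command=indexversion")
    if user is not None:
        # HTTP Basic auth: base64 of "user:password" in the Authorization header.
        token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical host and credentials, for illustration only.
core = "http://leader.example.com:8983/solr/sequence2_shard1_replica_n1"
anonymous = build_indexversion_request(core)
authenticated = build_indexversion_request(core, "repl_user", "repl_pass")
# To actually run a check: urllib.request.urlopen(anonymous)
```

Both of these requests are reported to succeed here, so the 401 in the log is specific to the request the Follower's IndexFetcher sends, which is rejected over its SolrAuthV2 header.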

I have tried tweaking security.json with forwardCredentials values, but that 
has not helped.

Any suggestions?





Re: knn parser not working as expected

2023-07-27 Thread gnandre
But the q parameter is still not working. I am stumped.

On Fri, Jul 21, 2023 at 1:43 AM gnandre  wrote:

> Thanks. If I move the knn parser syntax and value to the fq param and set q
> to *:*, it works and starts giving relevant results instantly.
>
> curl -X POST -H "Content-Type: application/json" -d '{
>   "params": {
> "q": "*:*", "fq": "{!knn f=dense_vector
> topK=1}[0.06525743007659912,0.015727980062365532,0.003069591475650668,-0.016254400834441185,0.003478930564597249,-0.02475954219698906,0.020238326862454414,0.010255611501634121,0.05522076040506363,0.020635411143302917,0.05825875699520111,-0.05110647529363632,-0.04696913808584213,0.05991407483816147,-0.0003015052934642881,0.03625837340950966,-0.044656239449977875,-0.06582673639059067,-0.06842341274023056,-0.022927379235625267,0.048230838030576706,-0.12659960985183716,-0.019311215728521347,-0.04432906210422516,0.03600681200623512,0.010301047936081886,0.08415472507476807,0.04727723449468613,-0.0584205724298954,-0.045265913009643555,0.012285877950489521,0.0034233061596751213,-0.00982636958360672,-0.013216182589530945,-0.038882751017808914,-0.05872005969285965,-0.029350444674491882,0.04930287227034569,0.0022274062503129244,0.01728842593729496,-0.08762819767,-0.045831114053726196,0.072530098259449,0.03804686293005943,0.0021682181395590305,-0.05424166098237038,-0.004494055639952421,0.05843663960695267,0.058729417622089386,0.016252348199486732,0.0019551776349544525,-0.012190568260848522,-0.08235936611890793,-0.003848800901323557,0.028969185426831245,0.047798849642276764,-0.04074695333838463,-0.10175333172082901,0.06699151545763016,-0.06788542866706848,-0.01607389748096466,0.07294511049985886,0.007754810154438019,0.039606861770153046,0.07451225817203522,-0.02967959391212,0.014015864580869675,0.08055979013442993,0.0010412412229925394,0.13284511864185333,-0.013288799673318863,-0.05446619912981987,-0.03510258346796036,-0.12459734082221985,-0.017629574984312057,-0.04287091642618179,-0.019087448716163635,0.027409998700022697,-0.040427371859550476,-0.1713477075099945,-0.0035959691740572453,0.01750982739031315,-0.06452985852956772,0.10622204840183258,-0.06865541636943817,0.06022517383098602,0.03378240391612053,0.02320132404565811,0.02072194404900074,0.03390982002019882,0.0051648980006575584,0.05843415856361389,-0.07012602686882019,0.046549294143915176,0.005304296966642141,0.091836
98892593384,0.060101959854364395,-0.031673040241003036,0.03126641735434532,0.10213921219110489,0.07624002546072006,-0.09995660930871964,0.03316718339920044,-0.040208760648965836,-0.016963355243206024,-0.01603076048195362,-0.00566966412588954,0.0570228286087513,0.006566803902387619,0.028397461399435997,-0.03737075999379158,-0.03357473015785217,-0.05060608312487602,0.0882791057229042,0.14182551205158234,0.01651209406554699,0.047577112913131714,-0.028357332572340965,-0.12397051602602005,0.03264006972312927,0.030581200495362282,0.025287700816988945,-0.08509892970323563,0.032361947267,-0.06732083112001419,0.0193667970597744,0.07096285372972488,-5.732041797079612e-33,0.033934514969587326,0.029480531811714172,-0.024119360372424126,0.03248802572488785,0.060654137283563614,-0.04089922457933426,-0.06845896691083908,0.015865417197346687,-0.03816983848810196,0.12768638134002686,-0.047979939728975296,0.01888129487633705,0.01966758444905281,-0.021792754530906677,-0.00209379056468606,-0.060791824012994766,0.07595516741275787,-0.05137578397989273,-0.020345840603113174,0.02730456180870533,-0.08421282470226288,0.0052170781418681145,-0.0396740548312664,0.013655638322234154,0.043763574212789536,0.0368662029504776,-0.021710995584726334,0.03603581339120865,0.04991370812058449,-0.007524373475462198,0.033250145614147186,0.0669487863779068,-0.012807670049369335,-0.08904062211513519,-0.04803512617945671,-0.0461772084236145,0.018098553642630577,0.01096352282911539,0.0617918036878109,0.014066621661186218,-0.03305654972791672,-0.08129353821277618,-0.025270603597164154,0.03537251427769661,0.06029881164431572,0.06169535592198372,0.0355769582092762,0.03534447401762009,-0.047377053648233414,0.053076375275850296,-0.019250469282269478,-0.03837420791387558,-0.00834209006279707,0.031550273299217224,0.004682184662669897,0.0590718574821949,0.0326957181096077,-0.041941817849874496,-0.04179370403289795,-0.010403091087937355,0.11914990842342377,-0.049126915633678436,0.015761952847242355,-0.01216251496225595
5,-0.05942496284842491,0.04794146493077278,-0.06834675371646881,-0.03294386342167854,0.02242257259786129,0.0774146020412445,-0.1095564718246,0.023828692734241486,0.054935190826654434,0.0202674251049757,-0.057155776768922806,-0.009578827768564224,-0.051850661635398865,0.09117215871810913,-0.07315851002931595,-0.0019339871359989047,-0.05835318937897682,-0.058747921139001846,-0.05519327148795128,-0.014699703082442284,-0.0020833320450037718,-0.05721793323755264,0.055632084608078,0.006448595318943262,0.0034963993821293116,-0.031087594106793404,-0.09541762620210648,0.03679275885224342,-0.012651922181248665,-0.03897
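
The request body used in the working variant above (q set to *:* with the knn clause in fq) can be generated with a short Python sketch. The helper names knn_clause and knn_as_filter_body and the three-element vector are made up for illustration; a real vector must match the dimension configured for the dense vector field:

```python
import json


def knn_clause(field, vector, top_k):
    """Format a knn query parser clause for the given dense vector field."""
    values = ",".join(repr(float(v)) for v in vector)
    return f"{{!knn f={field} topK={top_k}}}[{values}]"


def knn_as_filter_body(field, vector, top_k):
    """JSON body for the working variant: q=*:* with the knn clause in fq."""
    return json.dumps(
        {"params": {"q": "*:*", "fq": knn_clause(field, vector, top_k)}}
    )


# Short made-up vector, for illustration only.
body = knn_as_filter_body("dense_vector", [0.25, -0.5, 1.0], 1)
```

The resulting string can be passed to curl with -d, as in the command above.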

Crawling Italian language site in Solr

2023-07-27 Thread Fiz N
Hi SOLR Experts,

 On an Azure VM (Linux), we have installed Solr 8.11.2 and the Nutch
crawler (apache-nutch-1.19). To crawl an Italian-language site, we
added the tokenizer. *In the Solr admin screen we can see the documents,
but in English.*

Please see the managed-schema changes attached below.



Regards

Fiz A.