Re: Zk big files issues and model store

2023-10-19 Thread Christine Poerschke (BLOOMBERG/ LONDON)
Hello Florin.

Of course, do feel free to open an issue and/or draft pull request and/or pull 
request.

If the model were wrapped internally, it would be smaller than the original 
(since there would be no two-space indentation) but slightly bigger than the 
compacted form (with zero-space indentation) due to \" escaping of the " 
characters.

Illustration:

$cat a.txt
"x" : {
  "y" : {
"z" : {
  "foobar" : 42
}
  }
}
$wc -c a.txt
  62 a.txt

$cat b.txt
"x" : {
"y" : {
"z" : {
"foobar" : 42
}
}
}
$wc -c b.txt
  44 b.txt

$cat c.txt
\"x\" : { \"y\" : { \"z\" : { \"foobar\" : 42 } } }
$wc -c c.txt
  52 c.txt
$
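
For anyone curious where the \" escaping comes from: embedding a JSON document 
as a string value inside another JSON document escapes its quote characters. A 
quick illustrative sketch with jq (assuming jq is available):

$echo '{"foobar": 42}' | jq -Rs .
"{\"foobar\": 42}\n"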

From: users@solr.apache.org At: 10/18/23 20:40:24 UTC+1:00
To: users@solr.apache.org
Subject: Re: Zk big files issues and model store

Thanks for the suggestion Matthias. I will look into this.

Hello Christine. One of the concerns is the split nature, but also that
if the file does not exist on disk when the replica reloads, the core
will not load. Keeping the models in sync on each node can be quite
complicated. For example, you must only reload the collection after the
main model is present on all nodes; if you do it before that, the
replicas will be unusable. For now we would like to load models up to
100MB, and that's why I explored this option.
I made some modifications to the code but I haven't tested them yet.
After I do the tests, I will follow up with a PR. Can I open an issue
for this?

If the model were wrapped internally, wouldn't that be the same as
saving it as compacted JSON? It would be approximately the same size,
and we would still need to load the decoded object into memory. To save
space we could shorten the feature names to abbreviations, but that
would complicate score debugging.
I haven't yet looked into storing models in another format. Walter
could have a point about Avro.

Thanks for the suggestion Eric. I am not familiar with the
/api/cluster/files endpoint. I will look into it.


On Wed, 18 Oct 2023 at 01:47, Dmitri Maziuk  wrote:
>
> On 10/17/23 13:20, Walter Underwood wrote:
> >
> > Gzipping the JSON can be a big win, especially if there are lots of
> > repeated keys, like in state.json. Gzip has the advantage that some
> > editors can natively unpack it.
>
> It may save you some transfer time, provided the transport subsystem
> doesn't compress on the fly, but with JSON being an all-or-nothing
> format, your problem's going to be RAM for the string representation
> plus RAM for the decoded object representation of the entire store.
>
> If you want it scalable, you want an "incremental" format like ASN.1,
> Protocol Buffers, or Avro.
>
> Dima
>
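
As a quick illustration of the gzip suggestion, on-disk sizes can be compared 
directly; the file name below is a placeholder and the actual ratio depends on 
how repetitive the keys are (and, as Dima notes, this does not change the RAM 
needed once the JSON is decompressed and decoded):

$gzip -k state.json            # -k keeps the uncompressed original
$wc -c state.json state.json.gz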




Re: knn query parser, number of results and filtering by score

2023-10-19 Thread Mirko Sertic
I've prepared a testcase with the following documents, where 
TESTEMBEDDING_EU_3 is a DenseVectorField with length 3 and the euclidean 
distance function. They are written to a collection made of two shards 
with no further routing strategy, so they should be more or less evenly 
distributed between the two shards:


{
  id: 'Position1',
  TESTEMBEDDING_EU_3: [0, 0, 0]
}
{
  id: 'Position2',
  TESTEMBEDDING_EU_3: [0.1, 0.1, 0.1]
}
{
  id: 'Position3',
  TESTEMBEDDING_EU_3: [0.2, 0.2, 0.2]
}
{
  id: 'Position4',
  TESTEMBEDDING_EU_3: [0.3, 0.3, 0.3]
}
{
  id: 'Position5',
  TESTEMBEDDING_EU_3: [0.4, 0.4, 0.4]
}
{
  id: 'Position6',
  TESTEMBEDDING_EU_3: [0.5, 0.5, 0.5]
}
{
  id: 'Position7',
  TESTEMBEDDING_EU_3: [0.6, 0.6, 0.6]
}
{
  id: 'Position8',
  TESTEMBEDDING_EU_3: [0.7, 0.7, 0.7]
}
{
  id: 'Position9',
  TESTEMBEDDING_EU_3: [0.8, 0.8, 0.8]
}
{
  id: 'Position10',
  TESTEMBEDDING_EU_3: [0.9, 0.9, 0.9]
}
{
  id: 'Position11',
  TESTEMBEDDING_EU_3: [1.0, 1.0, 1.0]
}
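
For reference, a rough sketch of how documents like these can be indexed and 
queried via curl; the host and collection name are placeholders and only a few 
of the documents are shown:

$curl -X POST -H 'Content-Type: application/json' \
    'http://localhost:8983/solr/testcollection/update?commit=true' \
    --data-binary '[
      {"id": "Position9",  "TESTEMBEDDING_EU_3": [0.8, 0.8, 0.8]},
      {"id": "Position10", "TESTEMBEDDING_EU_3": [0.9, 0.9, 0.9]},
      {"id": "Position11", "TESTEMBEDDING_EU_3": [1.0, 1.0, 1.0]}
    ]'

$curl 'http://localhost:8983/solr/testcollection/select' \
    --data-urlencode 'q={!knn f=TESTEMBEDDING_EU_3 topK=3}[1.0,1.0,1.0]' \
    --data-urlencode 'fl=id,TESTEMBEDDING_EU_3,score,[shard],[explain]'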

Now I'll do a {!knn f=TESTEMBEDDING_EU_3 topK=3}[1.0,1.0,1.0] query. 
I'd expect a result with 3 documents: id:Position11 should be an exact 
match, and the nearest neighbors should be id:Position10 and 
id:Position9. I'd also expect the explain output to mark these three as 
part of the topK=3. I get the following search result:


{
  "responseHeader": {
    "zkConnected": true,
    "status": 0,
    "QTime": 35
  },
  "response": {
    "numFound": 6,
    "start": 0,
    "maxScore": 1.0,
    "numFoundExact": true,
    "docs": [
      {
        "id": "Position11",
        "TESTEMBEDDING_3": [
          "1.0",
          "1.0",
          "1.0"
        ],
        "[shard]": "http://fusion-integ-solr-search-200gb-0.fusion-integ-solr-search-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard1_replica_p9/|http://fusion-integ-solr-analytics-200gb-1.fusion-integ-solr-analytics-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard1_replica_t7/|http://fusion-integ-solr-search-200gb-1.fusion-integ-solr-search-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard1_replica_p11/|http://fusion-integ-solr-analytics-200gb-0.fusion-integ-solr-analytics-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard1_replica_t5/",
        "[explain]": "0.0 = not in top 3\n",
        "score": 1.0
      },
      {
        "id": "Position10",
        "TESTEMBEDDING_3": [
          "0.9",
          "0.9",
          "0.9"
        ],
        "[shard]": "http://fusion-integ-solr-search-200gb-0.fusion-integ-solr-search-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard1_replica_p9/|http://fusion-integ-solr-analytics-200gb-1.fusion-integ-solr-analytics-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard1_replica_t7/|http://fusion-integ-solr-search-200gb-1.fusion-integ-solr-search-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard1_replica_p11/|http://fusion-integ-solr-analytics-200gb-0.fusion-integ-solr-analytics-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard1_replica_t5/",
        "[explain]": "0.0 = not in top 3\n",
        "score": 0.97087383
      },
      {
        "id": "Position9",
        "TESTEMBEDDING_3": [
          "0.8",
          "0.8",
          "0.8"
        ],
        "[shard]": "http://fusion-integ-solr-search-200gb-0.fusion-integ-solr-search-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard2_replica_p17/|http://fusion-integ-solr-search-200gb-1.fusion-integ-solr-search-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard2_replica_p21/|http://fusion-integ-solr-analytics-200gb-0.fusion-integ-solr-analytics-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard2_replica_t13/|http://fusion-integ-solr-analytics-200gb-1.fusion-integ-solr-analytics-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard2_replica_t15/",
        "[explain]": "0.0 = not in top 3\n",
        "score": 0.89285713
      },
      {
        "id": "Position8",
        "TESTEMBEDDING_3": [
          "0.7",
          "0.7",
          "0.7"
        ],
        "[shard]": "http://fusion-integ-solr-search-200gb-0.fusion-integ-solr-search-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard2_replica_p17/|http://fusion-integ-solr-search-200gb-1.fusion-integ-solr-search-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard2_replica_p21/|http://fusion-integ-solr-analytics-200gb-0.fusion-integ-solr-analytics-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard2_replica_t13/|http://fusion-integ-solr-analytics-200gb-1.fusion-integ-solr-analytics-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard2_replica_t15/",
        "[explain]": "0.0 = not in top 3\n",
        "score": 0.78740156
      },
      {
        "id": "Position7",
        "TESTEMBEDDING_3": [
          "0.6",
          "0.6",
          "0.6"
        ],
        "[shard]": "http://fusion-integ-solr-search-200gb-0.fusion-integ-solr-search-200gb-headless:8983/solr/suchpool_atlas_2023_10_08_shard1_replica_p9/|http://fusion-integ-solr-analytics-200gb-1.fusion-i

Trouble with ZK Status and TLS

2023-10-19 Thread Jamie Gruener
We’re working on standing up a new Solr 9.4.0 cluster with ZooKeeper 3.8.3 
ensemble. We’ve configured mTLS for authentication, authorization, and comms 
for client <-> solr; TLS for solr <-> solr intra-cluster comms, and TLS for zk 
<-> zk intra-ensemble comms.

Where we are stuck is at the TLS configuration for solr<->zk comms. At least 
some parts are working since we can configure the url scheme and the 
security.json file, but when we try to browse the Solr UI to get ZK Status it 
doesn’t populate with any data. On the ZooKeeper side, we see these errors:

2023-10-19 16:08:06,403 [myid:] - ERROR 
[nioEventLoopGroup-7-1:o.a.z.s.NettyServerCnxnFactory$CertificateVerifier@468] 
- Unsuccessful handshake with session 0x0

From our testing with the `solr zk cp` command (used to upload the 
security.json file), we're pretty sure that the problem is that solr isn't 
trying to establish a TLS connection to satisfy the ZK Status request.
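
For context, the Solr-to-ZooKeeper client TLS settings are typically passed as 
ZooKeeper client system properties via SOLR_OPTS in solr.in.sh; a rough sketch, 
with placeholder keystore/truststore paths and passwords:

SOLR_OPTS="$SOLR_OPTS \
  -Dzookeeper.client.secure=true \
  -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.ClientCnxnSocketNetty \
  -Dzookeeper.ssl.keyStore.location=/path/to/zk-keystore.p12 \
  -Dzookeeper.ssl.keyStore.password=CHANGEME \
  -Dzookeeper.ssl.trustStore.location=/path/to/zk-truststore.p12 \
  -Dzookeeper.ssl.trustStore.password=CHANGEME"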

This ticket states that the TLS configuration works for at least one person, 
https://issues.apache.org/jira/browse/SOLR-16115, but I can’t find any more 
documentation about configuring this.

Any hints? Anyone get this working?

Thanks,

--Jamie


Re: Trouble with ZK Status and TLS

2023-10-19 Thread Jan Høydahl
The Admin UI ZK Status screen is not intended to work over an SSL connection to 
ZK, since the status commands are not supported on that protocol. Instead, we 
need to add support for querying the ZK AdminServer to get the same 
information. Contributions are welcome.
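
In the meantime, roughly the same information can be fetched directly from 
ZooKeeper's AdminServer, assuming it is enabled on its default port (zk-host 
below is a placeholder):

$curl http://zk-host:8080/commands/monitor
$curl http://zk-host:8080/commands/ruok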

Jan

> 19. okt. 2023 kl. 22:29 skrev Jamie Gruener :
> 
> We’re working on standing up a new Solr 9.4.0 cluster with ZooKeeper 3.8.3 
> ensemble. We’ve configured mTLS for authentication, authorization, and comms 
> for client <-> solr; TLS for solr <-> solr intra-cluster comms, and TLS for 
> zk <-> zk intra-ensemble comms.
> 
> Where we are stuck is at the TLS configuration for solr<->zk comms. At least 
> some parts are working since we can configure the url scheme and the 
> security.json file, but when we try to browse the Solr UI to get ZK Status it 
> doesn’t populate with any data. On the ZooKeeper side, we see these errors:
> 
> 2023-10-19 16:08:06,403 [myid:] - ERROR 
> [nioEventLoopGroup-7-1:o.a.z.s.NettyServerCnxnFactory$CertificateVerifier@468]
>  - Unsuccessful handshake with session 0x0
> 
> From our testing with running `solr zk cp` command (used to upload the 
> security.json file), we’re pretty sure that the problem is that solr isn’t 
> trying to establish a TLS connection to satisfy the ZK Status request.
> 
> This ticket states that the TLS configuration works for at least one person, 
> https://issues.apache.org/jira/browse/SOLR-16115, but I can’t find any more 
> documentation about configuring this.
> 
> Any hints? Anyone get this working?
> 
> Thanks,
> 
> --Jamie



Newbie Help: Replicating Between Two SolrCloud Instances (Solr 9.2.1)

2023-10-19 Thread David Filip
Dear Solr Community,

I am a Solr newbie, and have spent quite a few hours (across a couple of days) 
trying to research this, but am not quite sure what direction to go in, so 
would appreciate any suggestions.

I think I am getting confused between differences in Solr versions (most links 
seem to talk about Solr 6, and I’ve installed Solr 9), and SolrCloud vs. 
Standalone, when searching the ’Net …  so I am hoping that someone can point me 
towards what I need to do.  Apologies in advance for perhaps not using the 
correct Solr terminology.

I will describe what I have, and what I want to accomplish, to the best of my 
abilities.

I have installed Solr 9.2.1 on two separate physical nodes (different physical 
computers).  Both are running SolrCloud, and are running with the same 
(duplicate) configuration files.  Both are running their own local zookeeper, 
and are separate cores.  Let’s call them solr1 and solr2.  Right now I can 
index content on and search each one individually, but they do not know about 
each other (which is I think the fundamental problem I am trying to solve).

My goal is to replicate content from one to the other, so that I can take one 
down (e.g., solr1) and still search current collections (e.g., on solr2).  When 
I run Solr Admin web page, I can select: Collections=> {collection}, click on a 
Shard, and I see the [+ add replica] button, but I can’t add a new replica on 
the “other" node, because only the local node appears (e.g., 
10.0.x.xxx:8983_solr).  What I think I need to do is add the nodes (solr1 and 
solr2) together (?) so that I can add a new replica on the “other” node.
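
For illustration, the [+ add replica] button corresponds roughly to the 
Collections API ADDREPLICA call sketched below (collection, shard, and node 
names are placeholders); it can only place a replica on the other node once 
both nodes are registered in the same ZooKeeper ensemble:

$curl 'http://solr1:8983/solr/admin/collections?action=ADDREPLICA&collection=mycollection&shard=shard1&node=solr2:8983_solr'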

I’ve found references that tell me I need an odd number of zookeeper nodes (for 
quorum), so I’m not sure if I want both nodes to share a single zookeeper 
instance?  If I did do that, and let’s say that I pointed solr2 to zookeeper on 
solr1, could I still search against solr2 if solr1 zookeeper was down?  I would 
think not, but I’m not sure.

FWIW - this is NOT a high volume application, or an enterprise application, but 
I want to be able to take one node down — e.g., for maintenance or upgrades or 
backups — and not lose all of my search capabilities.

The gist of the problem, I think, is that Cloud => Nodes only shows the local 
node, as does Cloud => Graph for each shard.  I am trying to figure out how to 
replicate shards across both nodes, so that they are in-sync.  Ideally I’d like 
to be able to update from either, but if I have to designate one as master for 
updating and one as slave, I can probably live with that, as long as I could 
still search on the master.  Although from what I’ve read, I think (?) that 
with SolrCloud that master/slave concept goes away, and one node now becomes a 
leader?  But I may be confused about that ...

Nonetheless, I think I am also getting confused by terminology. What I'm 
looking for is the easiest (? best? most straightforward?) path so that I can 
search against either node if the other is down, and still search against the 
most current collection data.  What is the best way to make that happen?

Comments?  Suggestions?

Thanks,

Dave.