Solr @ Windows 10: how to delete an index

2023-03-20 Thread solr

Hi all,

I have to run a Solr project on a Windows 10 PC.
Everything went fine.
But now I have to delete an existing index, and neither the *ix  
command nor some "googled" Windows commands work.


This works fine on *ix:

curl 'http://localhost:8983/solr/my_core/update?commit=true' -d '<delete><query>*:*</query></delete>'


These 2 do not work on Windows:

1) C:\solr-9.1.1> java -Dc=my_core -Drecursive=yes -Dauto -jar example\exampledocs\post.jar '<delete><query>*:*</query></delete>'

SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/core_qst_2023/update...
Entering auto mode. File endings considered are xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
Entering recursive mode, max depth=999, delay=0s
SimplePostTool: WARNING: No files or directories matching <delete><query>*:*<\query><\delete>
0 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/core_qst_2023/update...
Time spent: 0:00:00.497

2) PS C:\solr-9.1.1> java -Dc=my_core bin\post '<delete><query>*:*</query></delete>'

Error: Main class bin\post could not be found or loaded (my translation from German)
Reason: java.lang.ClassNotFoundException: bin\post

Any idea on this?
Is there an *actual* manual / tutorial for Solr on Windows?

Thanks
Walter Claassen



Solr Security - Replication

2023-03-20 Thread Paul Ryder
Hi All

I've managed to configure my security.json so that a read-only user can access
the admin panel but not update any docs or create/edit collections... 
security.json is as below

One thing they *can* do, which I'd rather they couldn't, is click the "Disable 
Replication" button on the core replication screen and disable the 
replication... Any idea how to disable this for a given user/role?

Ta! Paul

{
  "authentication":{
    "blockUnknown":true,
    "class":"solr.BasicAuthPlugin",
    "credentials":{
      "solr-admin":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c=",
      "solr-read":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="},
    "forwardCredentials":false,
    "":{"v":0}},
  "authorization":{
    "class":"solr.RuleBasedAuthorizationPlugin",
    "user-role":{
      "solr-admin":["admin"],
      "solr-read":["readonly"]},
    "permissions":[
      {
        "name":"update",
        "role":["admin"],
        "index":1},
      {
        "name":"read",
        "role":["admin", "readonly"],
        "index":2},
      {
        "name":"security-edit",
        "role":["admin"],
        "index":3},
      {
        "name":"security-read",
        "role":["admin"],
        "index":4},
      {
        "name":"core-admin-edit",
        "role":["admin"],
        "index":5},
      {
        "name":"collection-admin-edit",
        "role":["admin"],
        "index":6},
      {
        "name":"config-edit",
        "role":["admin"],
        "index":7},
      {
        "name":"config-read",
        "role":["admin"],
        "index":8},
      {
        "name":"schema-edit",
        "role":["admin"],
        "index":9},
      {
        "name":"filestore-write",
        "role":["admin"],
        "index":10},
      {
        "name":"package-edit",
        "role":["admin"],
        "index":11},
      {
        "name":"all",
        "role":["admin", "readonly"],
        "index":12}],
    "":{"v":0}}}



Request processing stalling when running full import from datasource

2023-03-20 Thread Abhinav Sathy
Hi,

We have a solrcloud setup indexing data for 4 data collections and we
recently upgraded our solrcloud system and version to improve scalability
and availability. For comparison:

Old system:
1. 2 solr nodes
2. 1 shard per collection
3. 2 tlog replicas per shard(1 leader and 1 follower). So, 2 replicas per
collection
4. Each solr node hosts 1 replica for each collection
5. SOLR 7.7.1
6. SOLR node information from admin console for one of the nodes(Both solr
nodes are nearly identical in configuration and resource allocation, but
running on different hosts):
Linux 4.19.0-8-amd64, 16cpu
   Memory: 94.4Gb
   File descriptors: 255/65535
   Disk: 6.0Tb used: 59%
   Load: 0.23

New system:
1. 4 solr nodes
2. 2 shards per collection
3. 2 tlog replicas per shard(1 leader and 1 follower). So, 4 replicas per
collection
4. Each solr node hosts 1 replica for each collection
5. SOLR 8.11.1
6. SOLR node information from admin console for one of the nodes(All solr
nodes are nearly identical in configuration and resource allocation, but
running on different hosts):
Linux 4.19.0-18-amd64, 16cpu
   Memory: 78.6Gb
   File descriptors: 321/65535
   Disk: 3.6Tb used: 33%
   Load: 2.86

We run a full indexing(full data import) job during the weekend for 3 of
the collections one after the other(1st one on Friday, 2nd one on Saturday
and 3rd one on Sunday). These jobs usually take anywhere between 17-30hrs
to finish running full import for the entire data in solr. The full import
happens in batches and we leverage the data import handler threads to
spread out the workload amongst 10 handlers on the leader replica. We send
a /dataimport request for a batch of ids to import with the handler
number/name(like /dataimport1). The data can be sparse based on the batch
that solr is importing and can vary in size.

In the new system we regularly see the SQL queries that SOLR runs during
full import(range query) getting stuck in the state "writing to net".
Looking at the process list and running transactions at the time,
the queries seem to have fetched the data, but seem to take a long time to
send it over the network. We also have a delta import that we run every
minute to index any new data that is added to the datasource after the max
indexed id in SOLR. So, whenever the full import stalls during the weekend,
it seems to take down the delta import with it causing the whole indexing
system to stall/hang.

When I looked at the SOLR server logs, I see the following exception being
thrown multiple times:

2023-03-18 00:38:47.807 ERROR (Thread-9693) [   ] o.a.s.u.SolrCmdDistributor java.io.IOException: Request processing has stalled for 100079ms with 100 remaining elements in the queue.
    at org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient.request(ConcurrentUpdateHttp2SolrClient.java:449) ~[?:?]
    at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290) ~[?:?]
    at org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:345) ~[?:?]
    at org.apache.solr.update.SolrCmdDistributor.submit(SolrCmdDistributor.java:338) ~[?:?]
    at org.apache.solr.update.SolrCmdDistributor.distribAdd(SolrCmdDistributor.java:244) ~[?:?]
    at org.apache.solr.update.processor.DistributedZkUpdateProcessor.doDistribAdd(DistributedZkUpdateProcessor.java:300) ~[?:?]
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:237) ~[?:?]
    at org.apache.solr.update.processor.DistributedZkUpdateProcessor.processAdd(DistributedZkUpdateProcessor.java:245) ~[?:?]
    at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:106) ~[?:?]
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55) ~[?:?]
    at org.apache.solr.update.processor.FieldMutatingUpdateProcessor.processAdd(FieldMutatingUpdateProcessor.java:118) ~[?:?]
    at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55) ~[?:?]
    at org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:469) ~[?:?]
    at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:80) ~[?:?]
    at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:271) ~[?:?]
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:547) ~[?:?]
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:435) ~[?:?]
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:350) ~[?:?]
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:235) ~[?:?]
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:427) ~[?:?]
    at org.apache.solr.handler.dataimport.

Re: Suggester index replication

2023-03-20 Thread r ohara
Would it work if we just copied over the directory? In my case the
blendedInfixSuggesterIndexDir?

Thanks

On Thu, Mar 2, 2023 at 7:17 PM Walter Underwood wrote:

> When we were using old style replication, I did have the suggester lexicon
> replicated along with other config files, and I think I triggered a
> suggester build on replication or maybe commit (which happens with every
> replication). I remember it being kind of fussy to set up. You might want
> to set up an extra downstream machine to play with until you get it right.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Mar 2, 2023, at 10:42 AM, gnandre wrote:
> >
> > Thanks! I am using non-cloud mode at the moment. So, there is no way to
> > just index it to the index node and get it replicated to the search
> > nodes? Do I have to index to each search node?
> >
> > Do you know why the suggester indexing does not follow the usual search
> > indexing model?
> >
> > On Thu, Mar 2, 2023, 12:22 PM Walter Underwood wrote:
> >
> >> You need to send a build request to each node. I used to have some code
> >> to dig out the nodes from a cluster status, then send a build to each
> >> one, but I think that is marooned at my previous company. It isn’t super
> >> hard, just dig it out of the JSON.
> >>
> >> wunder
> >> Walter Underwood
> >> wun...@wunderwood.org
> >> http://observer.wunderwood.org/  (my blog)
> >>
> >>> On Mar 2, 2023, at 9:03 AM, gnandre wrote:
> >>>
> >>> Can anybody please answer this? Many thanks in advance!
> >>>
> >>> On Wed, Feb 16, 2022 at 12:52 AM gnandre wrote:
> >>>
> >>>> Is there a way to get suggester index replicated to all search nodes
> >>>> from index node? Do I need to build suggester index for each search
> >>>> node separately?
>
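
The advice quoted above, to send a build request to each node, could be sketched roughly as follows in Python. This is only an illustration under assumptions: the host names, the core name, the `/suggest` handler path and the dictionary name are hypothetical and must match whatever is configured in solrconfig.xml. The sketch only builds the URLs, using the suggester request parameters `suggest`, `suggest.build` and `suggest.dictionary`.

```python
from urllib.parse import urlencode


def suggester_build_urls(core_urls, handler="/suggest", dictionary=None):
    """Return one suggester build URL per core URL.

    core_urls  -- base URLs of the cores on each search node (hypothetical here)
    handler    -- request handler path; "/suggest" is an assumption
    dictionary -- optional suggester name, passed as suggest.dictionary
    """
    params = {"suggest": "true", "suggest.build": "true"}
    if dictionary:
        params["suggest.dictionary"] = dictionary
    query = urlencode(params)
    # Strip a trailing slash so the handler path is not doubled up.
    return [f"{url.rstrip('/')}{handler}?{query}" for url in core_urls]


# Hypothetical search nodes; replace with the real host names.
nodes = [
    "http://search1:8983/solr/my_core",
    "http://search2:8983/solr/my_core",
]
for build_url in suggester_build_urls(nodes, dictionary="mySuggester"):
    print(build_url)
```

Each printed URL can then be requested with any HTTP client, for example after every replication, so that each search node rebuilds its own suggester index.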


Re: Solr @ Windows 10: how to delete an index

2023-03-20 Thread Jan Høydahl
java -jar example\exampledocs\post.jar -h

Or from the Admin UI "Documents" screen:

http://localhost:8983/solr/#/my_core/documents
Change the document type to XML and paste <delete><query>*:*</query></delete> into the box

Jan





Re: Solr @ Windows 10: how to delete an index

2023-03-20 Thread Shawn Heisey

On 3/20/23 07:57, s...@cid.is wrote:
> This works fine on *ix:
>
> curl 'http://localhost:8983/solr/my_core/update?commit=true' -d '<delete><query>*:*</query></delete>'
>
> These 2 do not work on Windows:
>
> 1) C:\solr-9.1.1> java -Dc=my_core -Drecursive=yes -Dauto -jar example\exampledocs\post.jar '<delete><query>*:*</query></delete>'
>
> 2) PS C:\solr-9.1.1> java -Dc=my_core bin\post '<delete><query>*:*</query></delete>'



The SimplePostTool is designed for indexing FILES to Solr, not a text 
string.  It is trying to interpret that XML data as a filename.


If you put that XML content into a file with an xml extension and run 
command number 1 with the path to that file instead of the xml data in 
single quotes, it will work.


Command number 2 is invalid.  Wherever you saw that is sharing incorrect 
information.


Or you can use curl on Windows.

Windows 10 and later come with curl.  But it has very different options 
compared to "standard" curl, and I haven't worked out how to change it 
to work with that command.


If you install Git for Windows, then you can run Git Bash and use curl 
in that prompt just like you're used to on *NIX platforms.  It even 
includes perl.


Or you can get curl for Windows.  https://curl.se/windows/

Thanks,
Shawn
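
For readers who would rather avoid shell quoting differences altogether, the same delete-by-query request can be sent from a short script. The following is a minimal Python sketch, assuming Python 3 is installed on the Windows machine and that Solr listens on localhost:8983 with a core named my_core as in the question; the function names are hypothetical. It posts the XML delete command to the core's /update handler with commit=true, which is what the *NIX curl command does.

```python
import urllib.request

# XML delete-by-query command that matches every document in the core.
DELETE_ALL_XML = b"<delete><query>*:*</query></delete>"


def build_delete_all_request(base_url, core):
    """Build (but do not send) the HTTP request that deletes all documents."""
    url = f"{base_url}/{core}/update?commit=true"
    return urllib.request.Request(
        url,
        data=DELETE_ALL_XML,
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


def delete_all_documents(base_url, core):
    """Send the request and return the HTTP status code and response body."""
    request = build_delete_all_request(base_url, core)
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

Calling `delete_all_documents("http://localhost:8983/solr", "my_core")` against a running Solr sends the request. If authentication is enabled, the request would additionally need credentials, which this sketch does not handle.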


Re: Solr Security - Replication

2023-03-20 Thread Shawn Heisey

On 3/20/23 10:19, Paul Ryder wrote:
> I've managed to configure my security.json so that a read-only user can access
> the admin panel but not update any docs or create/edit collections...
> security.json is as below

>   {
>     "name":"all",
>     "role":[
>       "admin",
>       "readonly"],
>     "index":12}


That section is most likely the problem.  You gave the readonly role the 
"all" permission.  Which means if the permission is not explicitly 
listed in the prior rules, the readonly role will be allowed to do it. 
Remove readonly from the all permission.  That might fix it.  I don't 
think there is an explicit permission that covers enabling or disabling 
replication, so it would fall under "all".
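
As a rough illustration of that change, here is a minimal Python sketch that removes one role from a named permission in a security.json structure. It assumes the file is edited locally and uploaded again afterwards; the function name is hypothetical, and only permissions whose "role" value is a list are handled.

```python
import copy
import json


def drop_role_from_permission(security, permission, role):
    """Return a copy of a security.json dict with `role` removed from `permission`."""
    updated = copy.deepcopy(security)
    for rule in updated["authorization"]["permissions"]:
        # Only list-valued roles are handled; a string or null role is left alone.
        if rule.get("name") == permission and isinstance(rule.get("role"), list):
            rule["role"] = [r for r in rule["role"] if r != role]
    return updated


# Trimmed-down stand-in for the configuration from the question.
sample = {
    "authorization": {
        "permissions": [
            {"name": "read", "role": ["admin", "readonly"], "index": 2},
            {"name": "all", "role": ["admin", "readonly"], "index": 12},
        ]
    }
}

fixed = drop_role_from_permission(sample, "all", "readonly")
print(json.dumps(fixed["authorization"]["permissions"], indent=2))
```

The "read" permission keeps both roles, while "all" is left with admin only.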


Because you have the "all" permission at the end, you do not need any of 
the other permissions that have been assigned only to admin, but those 
permissions are not hurting anything with the rest of the config the way 
it is.  The "read" permission DOES need both roles.  If you remove admin 
from that permission, then the admin user would not be able to do queries.


If you are running in Cloud mode or have a distributed index in 
standalone mode, disabling forwardCredentials as you have will most 
likely break queries.  Someone with more authentication expertise will 
need to confirm or refute that statement.


It took me a few hours to fully grasp how the rule-based authorization 
in Solr works.  Once it clicked, I could see why it was designed to work 
that way, which I admit is very non-intuitive.  It's enormously flexible.


--

Below is a security.json that I constructed entirely in the webui after 
uploading the sample authentication and authorization security.json 
found in the reference guide to zookeeper.  It features three users - 
read, update, and admin.


The read role has most of the read permissions found in Solr 
9.3.0-SNAPSHOT, plus health.  I excluded filestore-read and security-read.


The update role only has the "update" permission.  Which means that it 
can send update requests.


The admin role has the "all" permission.

The admin user has all three roles.  The update user has read and 
update.  The read user only gets the read role.  Due to the way the 
config is parsed, if you remove the read and update roles from the admin 
user, it won't work the way I envisioned it.


The read user can obtain a large subset of information from Solr but 
cannot make changes.  The update user can do everything read can do, 
plus make changes to the index.  The admin user has all permissions.


For better security the read role probably needs a few more permissions 
removed, particularly zk-read so only the admin can look directly into 
the ZK database.  Some experimentation is required.


I have set the core-admin-edit permission to not allow ANYONE to do it, 
as my setup is in cloud mode.  Using CoreAdmin to make changes in cloud 
mode is a recipe for disaster.  It's probably not a good idea in 
standalone mode either, but it's FAR less likely to cause BIG problems 
in standalone mode.


-

{
  "authentication":{
    "class":"solr.BasicAuthPlugin",
    "credentials":{
      "admin":"REDACTED REDACTED",
      "read":"REDACTED REDACTED",
      "update":"REDACTED REDACTED"},
    "":{"v":153}},
  "authorization":{
    "class":"solr.RuleBasedAuthorizationPlugin",
    "permissions":[
      {
        "name":"read",
        "role":["read"],
        "index":1},
      {
        "name":"collection-admin-read",
        "role":["read"],
        "index":2},
      {
        "name":"config-read",
        "role":["read"],
        "index":3},
      {
        "name":"core-admin-read",
        "role":["read"],
        "index":4},
      {
        "name":"package-read",
        "role":["read"],
        "index":5},
      {
        "name":"health",
        "role":["read"],
        "index":6},
      {
        "name":"zk-read",
        "role":["read"],
        "index":7},
      {
        "name":"schema-read",
        "role":["read"],
        "index":8},
      {
        "name":"metrics-read",
        "role":["read"],
        "index":9},
      {
        "name":"core-admin-edit",
        "role":null,
        "index":10},
      {
        "name":"update",
        "role":["update"],
        "index":11},
      {
        "name":"all",
        "role":["admin"],
        "index":12}],
    "user-role":{
      "admin":["admin", "read", "update"],
      "read":["read"],
      "update":["read", "update"]},
    "":{"v":37}}}

Thanks,
Shawn


Re: Solr Security - Replication

2023-03-20 Thread Paul Ryder
Hi Shawn

Excellent answer as always….

Ta! Paul



Re: Donating to Solr

2023-03-20 Thread gnandre
Thanks for all the wonderful suggestions!

On Thu, Mar 2, 2023 at 2:24 PM David Smiley  wrote:

> Another way to donate if it's a significant sum is to fund Outreachy
> https://www.outreachy.org with some note that it's intended for sponsoring
> an Apache Solr based intern project.  This basically pays stipends to an
> intern.  Unfortunately, sending money to the ASF doesn't work to fund this
> sort of thing due to ASF's policies to avoid any appearance of
> pay-for-work.  I'm wrapping up an Outreachy project as a mentor but it
> almost didn't happen due to lack of funding.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Thu, Mar 2, 2023 at 1:47 PM Jan Høydahl wrote:
>
> > I know the project has been looking for more hardware resources for
> > running tests, and some companies are currently sponsoring benchmark
> > hardware and Jenkins servers for other OS'es. So that is one non-money
> > way to contribute. In any case you could start the dialogue with Apache
> > centrally and they will contact the Solr project for coordination of how
> > to channel solr-labeled donations.
> >
> > Jan
> >
> > > 2. mar. 2023 kl. 19:36 skrev gnandre:
> > >
> > > Thanks!
> > >
> > > On Thu, Mar 2, 2023, 1:02 PM Doug Turnbull wrote:
> > >
> > >> Not sure about Solr, but you can donate to the Apache Software
> > >> Foundation:
> > >>
> > >> https://www.apache.org/foundation/contributing.html
> > >>
> > >> On Thu, Mar 2, 2023 at 12:04 PM gnandre wrote:
> > >>
> > >>> I find this open source project very useful. Is there any way to
> > >>> donate money for it?


Re: docker arguments

2023-03-20 Thread Tim Clarke
Thanks Dima, that FM reference is the 8.x version of the 9.x one I've been
following :D
https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-with-tika.html

But there's no solrconfig.xml in that data folder - it must be inside the
docker image/container I think.

Tim

On Sat, 18 Mar 2023 at 18:37, dmitri maziuk  wrote:

> On 2023-03-18 12:17 PM, Tim Clarke wrote:
> > Presumably that would be the CLASSPATH inside the docker image and/or
> > the libraries available within the image? I'd have to find out how to
> > check that. Not really a question for here, but does a docker image have
> > a source to check?
>
> I can't tell you about the container you're using, but in general
> Dockerfile and `docker exec ... /bin/sh` are where to look.
>
> Where's solrconfig.xml coming from, is it in your
> /home/timc/dev/solrdata ? -- maybe start there, check it against TFM
> link I sent earlier ... this:
>
> https://solr.apache.org/guide/8_11/uploading-data-with-solr-cell-using-apache-tika.html#configuring-the-extractingrequesthandler-in-solrconfig-xml
>
> Dima
>
>


How to specify multiple values as default for a multivalued field?

2023-03-20 Thread gnandre
This seems very trivial but it is not working for me and I am not able to
figure out why.

If I have multivalued field like below,



When I index a document, instead of creating an array of strings, it
creates just a single string like "en,jp".
How can I define the default values such that they show up as ["en","jp"]
instead?


Re: docker arguments

2023-03-20 Thread Tim Clarke
Sorry, I'd stopped it before answering you. I get this after starting:

timc@Debian:$ docker exec 9a65798bda0d ps aux | grep solr
solr   1  0.0  0.0   2504   524 ?Ss   20:34   0:00 tini --
solr -f
solr  12 11.7  5.0 5424836 715596 ?  Sl   20:34   0:14
/opt/java/openjdk/bin/java -server -Xms512m -Xmx512m -XX:+UseG1GC
-XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled
-XX:MaxGCPauseMillis=250 -XX:+UseLargePages -XX:+AlwaysPreTouch
-XX:+ExplicitGCInvokesConcurrent
-Xlog:gc*:file=/var/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
-Dsolr.jetty.inetaccess.includes= -Dsolr.jetty.inetaccess.excludes=
-Dsolr.log.dir=/var/solr/logs -Djetty.port=8983 -DSTOP.PORT=7983
-DSTOP.KEY=solrrocks -Duser.timezone=UTC -XX:-OmitStackTraceInFastThrow
-XX:OnOutOfMemoryError=/opt/solr/bin/oom_solr.sh 8983 /var/solr/logs
-Djetty.home=/opt/solr/server -Dsolr.solr.home=/var/solr/data
-Dsolr.data.home= -Dsolr.install.dir=/opt/solr
-Dsolr.default.confdir=/opt/solr/server/solr/configsets/_default/conf
-Dlog4j.configurationFile=/var/solr/log4j2.xml -Dsolr.jetty.host=0.0.0.0
-Xss256k
-XX:CompileCommand=exclude,com.github.benmanes.caffeine.cache.BoundedLocalCache::put
-Djava.security.manager
-Djava.security.policy=/opt/solr/server/etc/security.policy
-Djava.security.properties=/opt/solr/server/etc/security.properties
-Dsolr.internal.network.permission=* -DdisableAdminUI=false -jar start.jar
--module=http --module=requestlog --module=gzip
solr 119  0.0  0.0   8896  3220 ?Rs   20:36   0:00 ps aux
timc@Debian:$

Tim

On Sun, 19 Mar 2023 at 14:56, Shawn Heisey  wrote:

> On 3/18/23 12:21, Tim Clarke wrote:
> > CONTAINER ID   IMAGE         COMMAND                  CREATED       STATUS                        PORTS   NAMES
> > 3eebcc1a2b25   solr          "docker-entrypoint.s…"   5 hours ago   Exited (137) 23 minutes ago           my_solr2
> > 9a65798bda0d   solr          "docker-entrypoint.s…"   9 days ago    Exited (143) 5 hours ago              my_solr
> > 25dd9eee328b   hello-world   "/hello"                 9 days ago    Exited (0) 9 days ago                 cool_davinci
> > 8b5591632935   hello-world   "/hello"                 9 days ago    Exited (0) 9 days ago                 vigilant_ride
> >
> > (The top solr is the container I'm running with the additional " -e
> > schemaless -Dsolr.modules=extraction" arguments per the solr set-up page
> > previously cited.)
>
> It says none of those containers is running.  When you have one that IS
> running, use its ID value in a command like this:
>
> docker exec 3eebcc1a2b25 ps auxw | grep solr
>
> And send us the output.  Paste it into the message.  If you make it an
> attachment, the mailing list is likely to eat it so we never see it.
>
> Thanks,
> Shawn
>


Re: docker arguments

2023-03-20 Thread dmitri maziuk

On 2023-03-20 3:33 PM, Tim Clarke wrote:
> Thanks Dima, that FM reference is the 8.x version of the 9.x one I've been
> following :D
> https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-with-tika.html
>
> But there's no solrconfig.xml in that data folder - it must be inside the
> docker image/container I think.

We're still on 8x so that's what I have in my bookmarks.

If the container is actually running, creating a core should set it all 
up incl. solrconfig.xml -- presumably somewhere under your 
/home/timc/dev/solrdata


So what happens when you create the core for your documents?

Dima



Re: docker arguments

2023-03-20 Thread Tim Clarke
The example instruction I'm following uses this line:
docker exec -it my_solr post -c gettingstarted example/exampledocs/manufacturers.xml
which sets up the folder
~/dev/solrdata/data/gettingstarted

And that's a good lead - thanks - it contains the elusive solrconfig.xml
file which I'd not been finding since I unexpectedly had no read rights to
the folder.

On Mon, 20 Mar 2023 at 20:52, dmitri maziuk  wrote:

> On 2023-03-20 3:33 PM, Tim Clarke wrote:
> > Thanks Dima, that FM reference is the 8.x version of the 9.x one I've
> > been following :D
> > https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-with-tika.html
> >
> > But there's no solrconfig.xml in that data folder - it must be inside the
> > docker image/container I think.
>
> We're still on 8x so that's what I have in my bookmarks.
>
> If the container is actually running, creating a core should set it all
> up incl. solrconfig.xml -- presumably somewhere under your
> /home/timc/dev/solrdata
>
> So what happens when you create the core for your documents?
>
> Dima
>
>


Re: docker arguments

2023-03-20 Thread dmitri maziuk

On 2023-03-20 5:29 PM, Tim Clarke wrote:
> And that's a good lead - thanks - it contains the elusive solrconfig.xml
> file which I'd not been finding since I unexpectedly had no read rights to
> the folder.


Yeah, right: don't never ever run as root. Except when you actually need to.

Dima



Fwd: Undeliverable: Re: docker arguments

2023-03-20 Thread dmitri maziuk
Could somebody please unsubscribe srinivas.as...@live.com ? -- best I 
can tell the "Resent-From: " is the culprit.


Or am I the only one getting these:


 Forwarded Message 
Subject:Undeliverable: Re: docker arguments
Date:   Tue, 21 Mar 2023 00:24:50 +
From:   postmas...@outlook.com
To: dmitri.maz...@gmail.com



*mx.google.com rejected your message to the following email addresses:*

srnvs.as...@gmail.com 
Your message wasn't delivered because the recipient's email provider 
rejected it.


*mx.google.com gave this error:
This mail is unauthenticated, which poses a security risk to the sender 
and Gmail users, and has been blocked. The sender must authenticate with 
at least one of SPF or DKIM. For this message, DKIM checks did not pass 
and SPF check for [gmail.com] did not pass with ip: 
[2a01:111:f400:7e8a::206]. The sender should visit 
https://support.google.com/mail/answer/81126#authentication for 
instructions on setting up authentication. 
d16-20020a056402001000b004fe92372d63si10165121edu.613 - gsmtp

*







*Diagnostic information for administrators:*

Generating server: SJ0PR10MB4510.namprd10.prod.outlook.com

srnvs.as...@gmail.com
mx.google.com
Remote server returned '550-5.7.26 This mail is unauthenticated, which 
poses a security risk to the 550-5.7.26 sender and Gmail users, and has 
been blocked. The sender must 550-5.7.26 authenticate with at least one 
of SPF or DKIM. For this message, 550-5.7.26 DKIM checks did not pass 
and SPF check for [gmail.com] did not pass 550-5.7.26 with ip: 
[2a01:111:f400:7e8a::206]. The sender should visit 550-5.7.26 
https://support.google.com/mail/answer/81126#authentication for 550 
5.7.26 instructions on setting up authentication. 
d16-20020a056402001000b004fe92372d63si10165121edu.613 - gsmtp'


Original message headers:

Received: from PH0PR10MB6959.namprd10.prod.outlook.com 
(2603:10b6:510:28f::15)

  by SJ0PR10MB4510.namprd10.prod.outlook.com (2603:10b6:a03:2d6::22) with
  Microsoft SMTP Server (version=TLS1_2,
  cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6178.37; Tue, 
21 Mar

  2023 00:24:49 +
Resent-From: 
Received: from PH0PR10MB6959.namprd10.prod.outlook.com ([::1]) by
  PH0PR10MB6959.namprd10.prod.outlook.com 
([fe80::6a24:a197:e583:e64b%8]) with

  Microsoft SMTP Server id 15.20.6178.037; Tue, 21 Mar 2023 00:24:48 +
Authentication-Results: spf=pass (sender IP is 3.227.148.255)
  smtp.mailfrom=solr.apache.org; dkim=fail (signature did not verify)
  header.d=gmail.com;dmarc=fail action=none header.from=gmail.com;
Received-SPF: Pass (protection.outlook.com: domain of solr.apache.org
  designates 3.227.148.255 as permitted sender)
  receiver=protection.outlook.com; client-ip=3.227.148.255;
  helo=mxout1-ec2-va.apache.org; pr=C
X-IncomingTopHeaderMarker: 
OriginalChecksum:EAEEF0134388AD7C2E3945577F2DE8B559080F0C5E53C755508160B41A2EE00D;UpperCasedChecksum:BC5697E83B6B56B00686A5BDEA35C8F34C1919A632482AD20F7C2BFE5232BEC1;SizeAsReceived:5923;Count:42

Mailing-List: contact users-h...@solr.apache.org; run by ezmlm
Precedence: bulk
List-Help: 
List-Unsubscribe: 
List-Post: 
List-Id: 
Reply-To: users@solr.apache.org
Delivered-To: mailing list users@solr.apache.org
Authentication-Results-Original: apache.org; auth=none
X-Virus-Scanned: Debian amavisd-new at spamproc1-he-fi.apache.org
X-Spam-Flag: NO
X-Spam-Score: -0.29
X-Spam-Level:
X-Spam-Status: No, score=-0.29 tagged_above=-999 required=6.31
tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,
DKIM_VALID_EF=-0.1, NICE_REPLY_A=-0.089, SPF_PASS=-0.001]
autolearn=disabled
Authentication-Results-Original: spamproc1-he-fi.apache.org (amavisd-new);
dkim=pass (2048-bit key) header.d=gmail.com
Received-SPF: Pass (mailfrom) identity=mailfrom; 
client-ip=2607:f8b0:4864:20::82c; helo=mail-qt1-x82c.google.com; 
envelope-from=dmitri.maz...@gmail.com; receiver=

DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20210112; t=1679358252;
 h=content-transfer-encoding:in-reply-to:from:references:to
  :content-language:subject:user-agent:mime-version:date:message-id
  :from:to:cc:subject:date:message-id:reply-to;
 bh=T4buuetxEHOOKUhhnrba0c4MgR8NN40tFCfewMQyN5Q=;

b=S36/5p20Q057/OHwEjZBnSTpn+Gqf79c72r+Ck5vjZTMTdoJLDO2OxLzOt4mBAQcwv

i5o2k4to3Rp94gRqrsfd9AD3K6IFflDywBoXqDw00P3v9Sg6CrXF9hP4uW91C8NnE7zd

9tLbT32PXvY1MZghB2L26OgfZLaaGg+IaNh0MHZyZLishIFHMzT0ya6wVMWvSy52u0J3

L2glmbJqMqeAyHBH3d7fcUXDkmmDy2j++nMy51IwYQJSjPaRB6S9jf7I5owvluiCMjMT

gut4VwhawmowzdxnowMUPmcSGoTmG6hNze8yzFJsH/Y5yD6Mi0ZMxyOL7UmXvtjmJVKu
  CMNg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112; t=1679358252;
 h=content-transfer-encoding:in-reply-to:from:references:to
  :content-language:subject:user-a