Re: Solr GC Tuning causes issues and doesn't start Solr url

2022-04-27 Thread Shawn Heisey

On 4/27/22 05:08, Deeksha Shrivastava wrote:


> The "etc" itself is included in the setting format. Please refer to the
> screenshot below:




The screenshot says "set GC_TUNE="-XX:NewRatio=3 -XX:SurvivorRatio=4 
etc." ... it does not say what the "etc" is, and we are going to need 
that information.



>  2. As far as the Java version is concerned, I mentioned in the previous
> email that I doubt there is a JRE installed on the machine, because
> when I run “java -version” in cmd it says “java is not recognized as
> an internal or external command”, which probably means it's not
> installed on that particular server.



You can't run Solr without Java.  If there really is no Java installed, 
then Solr is not going to start, no matter what options you have 
configured for it.  Maybe you have the JAVA_HOME environment variable 
defined, which tells the solr start script where to find Java?


Thanks,
Shawn



Regarding maximum number of documents that can be returned safely from SOLR to Java Application.

2022-04-27 Thread Neha Gupta

Dear Solr Community,

I would like to know how many documents can safely be returned from
Solr in a single request.


For information, I will be sending queries from a Java application to
Solr using SolrJ, and would like to know the maximum number of documents
(i.e. the maximum number of rows that I can request in a query) that can
be returned safely from Solr.


It would be great if you can please share your experience with regard to 
the same.



Thanks and Regards
Neha Gupta



Re: Regarding maximum number of documents that can be returned safely from SOLR to Java Application.

2022-04-27 Thread Andy Lester


> On Apr 27, 2022, at 3:23 PM, Neha Gupta wrote:
> 
> Just for information I will be firing queries from Java application to SOLR 
> using SOLRJ and would like to know how much maximum documents (i.e  maximum 
> number of rows that i can request in the query) can be returned safely from 
> SOLR.

It’s impossible to answer that. First, how do you mean “safe”? How big are your 
documents?

Let’s turn it around. Do you have a number in mind where you’re wondering if 
Solr can handle it? Like you’re thinking “Can Solr handle 10 million documents 
averaging 10K each”?  That’s much easier to address.

Andy

Re: Regarding maximum number of documents that can be returned safely from SOLR to Java Application.

2022-04-27 Thread Neha Gupta

Hi Andy,

I have different cores with different numbers of documents.

1) Core 1: 227625 docs, each with approx. 10 string fields.

2) Core 2: approx. 3.5 million documents, each with 3 string fields.

So my question is: if I request, say, approximately 10K documents in a
single request using SolrJ, will that be OK? By "safe" I mean the
approximate maximum number of documents that I can request without
causing any problem in receiving a response from Solr.


Is that enough to answer the question?


Re: Cannot post to SSL-secured core from command line [solved]

2022-04-27 Thread Christopher Schultz

Victoria,

On 4/26/22 16:17, Victoria Stuart (VictoriasJourney.com) wrote:
>
> [snip]
>

[victoria@victoria etc]$ sudo systemctl restart httpd
   [sudo] password for victoria:


I think this httpd restart/status are not relevant, no?


# 
# ADD CERTIFICATE TO JAVA TRUST STORE (cacerts):
# --

## cacerts p/w generally defaults to: changeit

[victoria@victoria etc]$ sudo keytool -import -trustcacerts -cacerts \
   -storepass *** -noprompt -alias solr-ssl \
   -file /mnt/Vancouver/apps/solr/solr-8.11.1/server/etc/solr-ssl-cert

   Certificate was added to keystore


I would highly recommend *against* modifying the platform's cacerts 
trust store. It should be possible to use a specific trust store for any 
client who needs to access your Solr server.



# 
# 2. INDEX DOCUMENTS TO SSL-HARDENED SOLR
# ===


> [snip]


# 
# solr.in.sh :
# 

## Note: basic authentication allows access to SSL-protected Solr from the console / command-line.

   SOLR_SSL_ENABLED=true

   SOLR_SSL_KEY_STORE=/mnt/Vancouver/apps/solr/solr-8.11.1/server/etc/solr-ssl.keystore.p12
   SOLR_SSL_KEY_STORE_PASSWORD=secret
   SOLR_SSL_KEY_STORE_TYPE=PKCS12

   SOLR_SSL_TRUST_STORE=/mnt/Vancouver/apps/solr/solr-8.11.1/server/etc/solr-ssl.keystore.p12
   SOLR_SSL_TRUST_STORE_PASSWORD=secret
   SOLR_SSL_TRUST_STORE_TYPE=PKCS12

   SOLR_AUTH_TYPE="basic"
   SOLR_AUTHENTICATION_OPTS="-Dbasicauth=pg-solr-admin:secret"

   SOLR_SSL_NEED_CLIENT_AUTH=false
   SOLR_SSL_WANT_CLIENT_AUTH=false


Hmm I could have sworn you were using mutual-TLS. Maybe not.


# 
# SOLR INDEXING (old, for reference; note: http://...):
# -

   /usr/lib/jvm/java-8-openjdk/jre//bin/java \
   -classpath /mnt/Vancouver/apps/solr/solr-8.7.0/dist/solr-core-8.7.0.jar \
   -Dauto=yes -Dc=core0 -Ddata=files org.apache.solr.util.SimplePostTool \
   /mnt/Vancouver/programming/datasci/solr/test/d1.html \
   /mnt/Vancouver/programming/datasci/solr/test/d2.html \
   /mnt/Vancouver/programming/datasci/solr/test/d3.html \
   /mnt/Vancouver/programming/datasci/solr/test/d4.html


If you add:
 -Djavax.net.ssl.trustStore=[path to trust store]
 -Djavax.net.ssl.trustStorePassword=[password]
 -Djavax.net.ssl.trustStoreType=[type]

... then you should not have to modify the platform's cacerts trust store.
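
For a Java client that builds its own connections (rather than relying on the -D flags above), the same idea can be sketched in code: load a dedicated trust store and build an SSLContext from it, leaving the platform's cacerts untouched. This is only a minimal sketch using standard JDK classes; the class and method names (ClientTrustStore, loadPkcs12, contextTrusting) are made up for illustration, and the store path and password are whatever the caller supplies.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper, not part of Solr or SolrJ.
final class ClientTrustStore {

    /** Loads a PKCS12 store from disk; path and password come from the caller. */
    static KeyStore loadPkcs12(Path file, char[] password)
            throws IOException, GeneralSecurityException {
        KeyStore store = KeyStore.getInstance("PKCS12");
        try (InputStream in = Files.newInputStream(file)) {
            store.load(in, password);
        }
        return store;
    }

    /** Builds an SSLContext whose trust decisions use only the given store. */
    static SSLContext contextTrusting(KeyStore trustStore) throws GeneralSecurityException {
        TrustManagerFactory tmf =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);
        SSLContext context = SSLContext.getInstance("TLS");
        // No client key managers: this sketch covers server verification only.
        context.init(null, tmf.getTrustManagers(), null);
        return context;
    }
}
```

How the resulting context is handed to a particular HTTP client or SolrJ builder depends on the client and its version, so that part is deliberately left out here.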


   /usr/lib/jvm/java-18-openjdk/bin/java \
   -classpath /mnt/Vancouver/apps/solr/solr-8.11.1/dist/solr-core-8.11.1.jar \
   -Dbasicauth=pg-solr-admin:secret \
   -Dsolr.default.confdir=/mnt/Vancouver/apps/solr/solr-8.11.1/server/solr/configsets/_default/conf/ \
   -Djavax.net.ssl.keyStore=/mnt/Vancouver/apps/solr/solr-8.11.1/server/etc/solr-ssl.keystore.p12 \
   -Djavax.net.ssl.keyStoreType=PKCS12 \
   -Djavax.net.ssl.keyStorePassword=secret \
   -Djavax.net.ssl.trustStore=/mnt/Vancouver/apps/solr/solr-8.11.1/server/etc/solr-ssl.keystore.p12 \
   -Djavax.net.ssl.trustStoreType=PKCS12 \
   -Djavax.net.ssl.trustStorePassword=secret \


Yes, just like the above.

-chris


Problem with indexing a String field in SOLR.

2022-04-27 Thread Neha Gupta

Dear Solr Community,

I have a very strange situation with Solr indexing, and even after spending
a day on it I have not been able to find the cause, so I am asking for your help.


I tried to index a string field named "host_common_name". I created
the field in the schema via the Solr Admin GUI (the schema was updated as well),
but after a data import this field does not seem to be indexed.


After some searching I found that in the Admin GUI, if I select this
field, only the Properties values are shown, whereas for other fields
that are indexed properly the Schema and Indexed information is shown
along with the Properties.


I tried several things, such as deleting the whole schema and creating
a new one, but the field with this name is still not being indexed.


Finally, just as an experiment, I created a different field with a different
name, "hcn". With this name the field is indexed, and the Admin GUI shows all
the values (Properties, Schema and so on).


So I was wondering what the issue with the name "host_common_name"
could be. Has anyone come across a similar issue and would like
to share some information on it?


Thanks in advance for all the help this community always offers.


Regards
Neha Gupta


Re: Regarding maximum number of documents that can be returned safely from SOLR to Java Application.

2022-04-27 Thread Andy Lester
> 
> So  my question is if i request in one request lets say approximate 10K 
> documents using SOLRJ will that be OK. By safe here i mean approx. maximum 
> number of documents that i can request without causing any problem in 
> receiving a response from SOLR.

I’m still not clear what you’re asking. Are you asking if Solr can handle 
returning 10K docs in a result set? It seems like it to me, but “causing any 
problem” could be pretty much anything.

Do you have reason to think that it would be a problem? Did you try something 
that failed? If so, what did you try and what happened?

And if you haven’t tried it out, then I’d suggest you do that.

Andy

Re: Nested Facets and SortableTextField

2022-04-27 Thread WU, Zhiqing
Hi Michael,
Thanks for your responsible recommendation.
Yes, we could use TextField in our application but still hope to use
SortableTextField due to its sorting functions.
I have read your previous comments (Mar, 2019) in
https://issues.apache.org/jira/browse/SOLR-13056
Could your previous patch solve or partially solve the problem?
Kind regards,
Zhiqing

On Tue, 26 Apr 2022 at 01:03, Michael Gibney 
wrote:

> I was hoping that would "just work"; since it didn't, I dug a little more
> and I'm afraid that explicitly setting `method:uif` has no effect -- if
> docValues are there, they will be used:
>
>
> https://github.com/apache/solr/blob/c99af207c761ec34812ef1cc3054eb2804b7448b/solr/core/src/java/org/apache/solr/search/facet/FacetField.java#L161-L167
>
> Pending SOLR-8362 (or some other more narrow solution?), I think the only
> responsible recommendation is: don't use SortableTextField for faceting.
> Would it work to use TextField instead? TextField has to be uninverted, but
> at least it meets the requirement of indexed values being compatible with
> values over which bulk facet collection takes place.
>
> On Mon, Apr 25, 2022 at 3:52 PM WU, Zhiqing  wrote:
>
> > Hi Michael,
> > Thanks for your quick reply and related information.
> > I added "method":"uif" at 3 different places but it does not address my
> > problem -
> > 1.
> > {
> >   "query": "*:*",
> >   "method":"uif",
> >   "facet": {
> >     "categories": {
> >       "type": "terms",
> >       "field": "name_txt_sort",
> >       "limit": -1,
> >       "facet": {
> >         "sex_s": {
> >           "type": "terms",
> >           "field": "sex_s",
> >           "limit": -1
> >         }
> >       }
> >     }
> >   }
> > }
> >
> > Response:
> > "error":{
> >   "metadata":[
> >     "error-class"...]}
> >
> > 2.
> > {
> >   "query": "*:*",
> >   "facet": {
> >     "method":"uif",
> >     "categories": {
> >       "type": "terms",
> >       "field": "name_txt_sort",
> >       "limit": -1,
> >       "facet": {
> >         "sex_s": {
> >           "type": "terms",
> >           "field": "sex_s",
> >           "limit": -1
> >         }
> >       }
> >     }
> >   }
> > }
> >
> > Response:
> > "error":{
> >   "metadata":[
> >     "error-class", ...
> >
> > 3.
> > {
> >   "query": "*:*",
> >   "facet": {
> >     "categories": {
> >       "method":"uif",
> >       "type": "terms",
> >       "field": "name_txt_sort",
> >       "limit": -1,
> >       "facet": {
> >         "sex_s": {
> >           "type": "terms",
> >           "field": "sex_s",
> >           "limit": -1
> >         }
> >       }
> >     }
> >   }
> > }
> >
> > Response:
> > "facets":{
> >   "count":3,
> >   "categories":{
> >     "buckets":[{
> >         "val":"Amelia Harris",
> >         "count":1},
> >       {
> >         "val":"George Smith",
> >         "count":1},
> >       {
> >         "val":"Olivia Wilson",
> >         "count":1}]}}}
> >
> > Should I try "method":"uif" at another place?
> > Kind regards,
> > Zhiqing
> >
> > On Mon, 25 Apr 2022 at 17:47, Michael Gibney 
> > wrote:
> >
> > > This is related to https://issues.apache.org/jira/browse/SOLR-13056
> > >
> > > I'm curious: if you set `method:uif` on the top-level facet, are you able
> > > to achieve the desired results? (Note that `method:uif` incurs the same
> > > heap memory overhead -- uninverting the indexed values -- as faceting over
> > > a regular TextField). Doing this (if it works as I think it might) could
> > > address the core problem with faceting on SortableTextField: that DocValues
> > > for SortableTextField are appropriate for _sorting_, but are different from
> > > the _indexed_ values that would be used for refinement and nested domain
> > > filtering.
> > >
> > > See also https://issues.apache.org/jira/browse/SOLR-8362
> > >
> > > On Mon, Apr 25, 2022 at 11:59 AM WU, Zhiqing wrote:
> > >
> > > > Hello,
> > > > I do not know why Nested Facets (
> > > > https://solr.apache.org/guide/8_11/json-facet-api.html#nested-facets )
> > > > does not work for _txt_sort field (SortableTextField).
> > > >
> > > > To reproduce the problem,
> > > > I created a new collection (Config set: _default) and add the following to
> > > > the collection
> > > > {
> > > > "name_txt_sort": ["Amelia Harris"],
> > > > "name_txt": ["Amelia Harris"],
> > > > "sex_s": "female"
> > > > },
> > > > {
> > > > "name_txt_sort": ["Olivia Wilson"],
> > > > "name_txt": ["Olivia Wilson"],
> > > > "sex_s": "female"
> > > > },
> > > > {
> > > > "name_txt_sort": ["George Smith"],
> > > > "name_txt": ["George Smith"],
> > > > "sex_s": "male"
> > > > }
> > > >
> > > > If my query is:
> > > > {
> > > >   "query": "*:*",
> > > >   "facet": {
> > > >     "categories": {
> > > >       "type": "terms",
> > > >       "field": "name_txt",
> > > >       "limit": -1,
> > > >       "facet": {
> > > >         "sex_s": {
> > > >           "type": "terms",
> > > >   

Re: Cannot post to SSL-secured core from command line [solved] [addendum: passwords - character issues]

2022-04-27 Thread Christopher Schultz

Victoria,

On 4/26/22 21:46, Victoria Stuart (VictoriasJourney.com) wrote:

# 
# Addendum - passwords - character issues.
# 


Hmm. You should not have had any of these issues. Can you please confirm:

1. You are saying that # does not work in a "SSL certificate password".
Do you mean the keystore password?


Remember that you are using a bourne-shell style .sh script to configure 
Solr, and that # is a special character.


SOLR_SSL_KEY_STORE_PASSWORD=secret#password

Isn't going to work as you expect. You may need to escape the # to get 
the whole password:


SOLR_SSL_KEY_STORE_PASSWORD=secret\#password

You could also use quotes:
SOLR_SSL_KEY_STORE_PASSWORD="secret#password"

2. Are you saying that # does not work in an "HTTP Basic" authentication 
scheme? If that's the case (and the first report I read showed a URL 
with http://username:password@hostname:port/...), then the problem is 
that the client is putting the authentication information into the URL 
and not into the HTTP headers where they belong.


Perhaps this is a problem with one of the tools being provided by Solr 
(e.g. 'post'); if so, please file a bug so it can be fixed.
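
Point 2 can be illustrated with a short Java sketch using only JDK classes. It shows why a '#' inside credentials embedded in a URL gets cut off (everything after '#' is parsed as the URI fragment) and how the same credentials travel intact when Base64-encoded into an "Authorization: Basic" header. The class name, the helper names and the sample password "sec#ret" are made up for illustration; this is not how Solr's own tools are implemented.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class, not part of Solr.
final class BasicAuthDemo {

    /** Builds the value of an HTTP "Authorization" header for Basic authentication. */
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    /** Returns whatever the URI parser treats as the fragment (the part after '#'). */
    static String fragmentOf(String url) {
        return URI.create(url).getFragment();
    }

    public static void main(String[] args) {
        String user = "pg-solr-admin";
        String password = "sec#ret"; // sample password containing '#'

        // Credentials in the URL: the parser stops the authority at '#',
        // so the rest of the password and the host end up in the fragment.
        String url = "https://" + user + ":" + password + "@localhost:8983/solr/core0/update";
        System.out.println("fragment      = " + fragmentOf(url));

        // Credentials in the header: '#' is just another byte to encode.
        System.out.println("Authorization: " + basicAuthHeader(user, password));
    }
}
```

A client that sends the header instead of building a user:password@host URL is not affected by '#', '@' or similar characters in the password.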


-chris


Per my earlier message [appended below], I should have mentioned that in 
sorting out both my Solr Basic Authentication and SSL configuration that I had 
been vexed by keystore and Solr passwords (I use a password generator) 
containing special characters (# $ etc.), that silently cause Basic 
Authentication / SSL connection issues.

Particularly, I had issues with passwords containing the number/hash/pound
character -  #  - echoed here:

   
https://www.wpsolr.com/forums/topic/unable-to-connect-to-index-when-solr-authentication-and-authorization/

 From various documentation on the web:

  ** This advice was errant:

 
https://docs.oracle.com/cd/E14571_01/install./e12002/oimscrn011.htm#INOIM1372=

 KeyStore password : a valid password can contain 6 to 30 characters, begin 
with an alphabetic character,
 and use only alphanumeric characters and special characters like 
underscore (_), dollar ($), pound (#).
 The password must contain at least one number.

  ** https://getfishtank.ca/blog/updating-ssl-certificates-in-solr

   Point of Note: when updating to Solr certificate, there's one thing you need 
to be aware of: the password should not contain any special characters.

   It's not uncommon for SSL certs to contain special characters, but Solr 
doesn't like them in the format we have to work with. It should be purely 
alpha-numeric.

   If it does, during the restart you may get a message that the service failed 
to restart. If you get that error, this is certainly something to check.

One of my original certificate p/w was apparently silently causing issues, such 
as the esoteric Solr console message:

   "... Javax.crypto.BadPaddingException:Given final block not properly padded 
solution ..."

While user passwords generated in the Solr Admin UI may caution

   Password not strong enough! Must contain at least one lowercase letter, one
   uppercase letter, one digit, and one of these special characters: 
!@#$%^&*_-[]()

As mentioned, one of my p/w contained # and so it - or the hashing/salt 
algorithm - resulted in silent errors (by silent I mean errors that gave no 
indication that the password character coding was an issue).

Here is a jetty post cautioning against the use of @ in passwords:

   https://www.eclipse.org/lists/jetty-users/msg07410.html

I would be wary of using non-alphanumeric "special characters" in
keystore and Solr passwords. (If needed, one can instead increase the password
length and complexity, e.g. with mixed case.)

* What Are Alphanumeric Characters?
   https://studyqueries.com/alphanumeric-characters/

 Alphanumeric characters comprise the combination of the twenty-six characters 
of the alphabet (from A to Z) and the numbers 0 to 9. Therefore, 1, 2, q, f, m, p, 
and 10 are all examples of alphanumeric characters. Symbols like *, & and @ are 
also considered alphanumeric characters.

 These characters can also be used in combination. Examples of alphanumeric 
characters made of the combination of special symbols, numbers, and the characters 
of the alphabet are &AF54hh, jjHF47, @qw99O. The characters of the alphabet can 
either be in lower case or upper case. The context of use determines whether or not 
case sensitivity is applied.

* See also:

  ** 
https://stackoverflow.com/questions/34675756/http-basic-authentication-fail-with-password-with-non-iso-8859-1-characters

  ** https://bz.apache.org/bugzilla/show_bug.cgi?id=48985

  ** https://bugs.openjdk.java.net/browse/JDK-6979740

  ** https://issuetracker.google.com/issues/37135737 >> ... When keytool 
creates a KeyStore or key which is protected with a password containing non-ASCII 
characters, keytool may encode the password using the console's

Re: Regarding maximum number of documents that can be returned safely from SOLR to Java Application.

2022-04-27 Thread Vincenzo D'Amore
Hi, I strongly discourage you from downloading so many documents in one
shot. Doing this on a regular basis creates humongous memory allocations in
the JVM, which usually leads to GC problems.
https://stackoverflow.com/questions/10039778/how-to-get-all-results-from-solr-query

There are many options for reading a large number of documents.

Natively you can use the export request handler

https://solr.apache.org/guide/8_11/exporting-result-sets.html#the-export-requesthandler

Here is an app I wrote long ago that uses Solr cursors

https://github.com/freedev/solr-import-export-json
https://solr.apache.org/guide/6_6/pagination-of-results.html#using-cursors

But even simple Solr pagination with the start and rows parameters can
do better than using rows alone.
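
The cursor approach mentioned above can be sketched in Java roughly as follows. To keep the sketch runnable without a Solr server, the actual request is hidden behind a function argument: in a real SolrJ client that function would presumably run a query sorted on the uniqueKey field, set CursorMarkParams.CURSOR_MARK_PARAM, and read QueryResponse.getNextCursorMark(). The class CursorPaging, its Page type and the in-memory fakeSolr stand-in are all made up for illustration.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Hypothetical sketch of a cursor-paging loop; not SolrJ code.
final class CursorPaging {

    /** One page of results plus the cursor to send with the next request. */
    static final class Page {
        final List<String> docs;
        final String nextCursorMark;

        Page(List<String> docs, String nextCursorMark) {
            this.docs = docs;
            this.nextCursorMark = nextCursorMark;
        }
    }

    /**
     * Fetches results page by page and passes each page to the handler, so only
     * `rows` documents are held at a time. Stops when the server returns the same
     * cursor it was given, which is how Solr signals the end of a cursor.
     * Returns the number of documents seen.
     */
    static int forEachPage(BiFunction<String, Integer, Page> fetch, int rows,
                           Consumer<List<String>> handler) {
        String cursor = "*"; // Solr's starting cursorMark
        int total = 0;
        while (true) {
            Page page = fetch.apply(cursor, rows);
            if (!page.docs.isEmpty()) {
                handler.accept(page.docs);
                total += page.docs.size();
            }
            if (cursor.equals(page.nextCursorMark)) {
                return total;
            }
            cursor = page.nextCursorMark;
        }
    }

    /** In-memory stand-in for Solr, for demonstration only: the cursor is an offset. */
    static BiFunction<String, Integer, Page> fakeSolr(List<String> docs) {
        return (cursor, rows) -> {
            int start = cursor.equals("*") ? 0 : Integer.parseInt(cursor);
            int end = Math.min(start + rows, docs.size());
            // Echo the incoming cursor back once nothing is left to return.
            String next = (start >= docs.size()) ? cursor : String.valueOf(end);
            return new Page(docs.subList(start, end), next);
        };
    }
}
```

With this shape the client never asks for more than `rows` documents per request, which is the point being made about avoiding one very large response.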





-- 
Vincenzo D'Amore


Re: Nested Facets and SortableTextField

2022-04-27 Thread Michael Gibney
Do you want faceting to be based on tokenized values, or the original input
as a monolithic string? In any case, the patch attached to SOLR-13056 is
unlikely to help. The patch associated with SOLR-8362 might help (is
_designed_ to help with this kind of situation, in fact!). But that's a
monumental patch and I wouldn't recommend using it provisionally.

The good news is that SortableTextField is kind of a convenience, so
depending on whether you want to facet on the original string or the
post-tokenization values, you can probably achieve the outcome you want by
leveraging creative copyFields, etc...

One thing that I've always wondered about is the utility of
SortableTextField given that IIUC the sort values are not normalized
(casefolding, etc.). You configure one-and-only-one index-time analyzer,
which (again, IIUC) is used for tokenization. But the sort value (which one
might ordinarily normalize with KeywordTokenizer or something?) is based on
the pre-analysis raw input.

I'm making a bunch of assumptions here, but my recommendation if you want
normalized sort on full value _and_ faceting on post-analysis token values:
use a copyField to direct input to two separate fields -- one for sorting
(maybe ICUCollationField?) and one for faceting (TextField). The faceting
would require uninversion (no docValues for faceting over TextField). Some
interesting general discussion about post-tokenization faceting use cases
(mostly advising against) can be found here [1].

[1] https://issues.apache.org/jira/browse/LUCENE-10023

Michael


Re: Regarding maximum number of documents that can be returned safely from SOLR to Java Application.

2022-04-27 Thread David Hastings
Very often I have used Solr to return over 30 million documents with no
ramifications via a wget call and a LONG timeout. Granted, it took a while
and the resulting file was multiple GB in size, but I never encountered any
issues with it. I also used about a 31 GB JVM heap and had a few hundred GB
of memory on the server, but it never got taxed too heavily.



Re: Regarding maximum number of documents that can be returned safely from SOLR to Java Application.

2022-04-27 Thread Vincenzo D'Amore
Ok, but the OP has to know that doing this often can be a serious issue.
For example if you are implementing an endpoint that can be called 10/100
times per hour, each call will result in a few humongous objects allocated
in the JVM.