Solr and Digital Humanities

2025-05-19 Thread Jason Gerlowski
Hey all, Is anyone out there using Solr as a part of a "Digital Humanities" project? I'd be very curious to hear what folks are doing in that domain, and what makes Solr a good (or bad) tool for those sorts of problems. If anyone is involved with a use case (or knows someone who is) and is willin

Re: Re: Possible OOM risk after huge REINDEXCOLLECTION requests

2025-05-16 Thread Jason Gerlowski
ng > the problem myself. I could share them via download with you if you want to. > > Florian > > On 2025/04/30 12:28:19 Jan Høydahl wrote: > > We provide jattach for these purposes in the official docker image: > > https://solr.apache.org/guide/solr/latest/deployment-guide

Re: Unexpected behaviour of solr.NumFieldLimitingUpdateRequestProcessorFactory

2025-05-09 Thread Jason Gerlowski
Hi Andreas, Thanks for flagging this behavior; I'm sorry you hit this. I'm able to reproduce in a local setup, so I think this is an actual bug. I've filed https://issues.apache.org/jira/browse/SOLR-17758 describing things, and proposing a fix. Hopefully we can get this merged and released in S

Re: solrj: Asynchronous requests and batched updates in Solr Cloud

2025-05-05 Thread Jason Gerlowski
Hi Markos, I'll answer the easiest question first. The "requestAsync" method is relatively new to our SolrJ API. I don't know of any concrete plans, but I would expect it to be added to more client implementations over time (and ultimately end up on the SolrClient interface). Update batching is

Re: Possible OOM risk after huge REINDEXCOLLECTION requests

2025-04-30 Thread Jason Gerlowski
Hi Florian, I haven't heard any reports of a memory leak triggered by the REINDEXCOLLECTION codepath, but such a leak is always possible. I'd love to see what you find if you're able to take a heap dump! The typical way (afaik) to create heap dumps "on demand" is with jcmd or jmap. If that's no
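
As a rough illustration of the "on demand" heap dump approach mentioned above, the sketch below only assembles the command lines for the two standard JDK tools, jcmd and jmap. The process id and output path are hypothetical placeholders, and nothing is executed against a running JVM.

```python
import shlex


def heap_dump_commands(pid: int, out_file: str) -> dict[str, list[str]]:
    """Build argv lists for two JDK tools that can write a heap dump.

    pid and out_file are caller-supplied; the values used below are
    placeholders, not taken from the thread.
    """
    return {
        # jcmd <pid> GC.heap_dump <file>
        "jcmd": ["jcmd", str(pid), "GC.heap_dump", out_file],
        # jmap -dump:live,format=b,file=<file> <pid>
        "jmap": ["jmap", f"-dump:live,format=b,file={out_file}", str(pid)],
    }


if __name__ == "__main__":
    # Hypothetical Solr JVM pid and dump location.
    for tool, argv in heap_dump_commands(12345, "/tmp/solr-heap.hprof").items():
        print(tool, "->", shlex.join(argv))
```

Either command line would need to be run as the same OS user that owns the Solr JVM.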

[Operator] [ANNOUNCE] Apache Solr Operator v0.9.1 released

2025-03-25 Thread Jason Gerlowski
The Apache Solr PMC is pleased to announce the release of the Apache Solr Operator v0.9.1. The Apache Solr Operator is a safe and easy way of managing a Solr ecosystem in Kubernetes. This release contains numerous bug fixes and optimizations, some of which are highlighted below. The release is a

Re: Medium vulnerability CVE-2024-6763 found in org.eclipse.jetty:jetty-http 10.0.22

2025-02-24 Thread Jason Gerlowski
Hi all, Published CVEs are public information, so as a project we try to discuss them on our "public" mailing lists only. So, no need to loop in 'secur...@solr.apache.org' in the future - that list is reserved for potential "new" vulnerabilities. See our Security Policy for more details. [1] T

CVE-2024-52012: Apache Solr: Configset upload on Windows allows arbitrary path write-access

2025-01-26 Thread Jason Gerlowski
Severity: moderate Affected versions: - Apache Solr 6.6 through 9.7.0 Description: Relative Path Traversal vulnerability in Apache Solr. Solr instances running on Windows are vulnerable to arbitrary filepath write-access, due to a lack of input-sanitation in the "configset upload" API.  Comm

CVE-2025-24814: Apache Solr: Core-creation with "trusted" configset can use arbitrary untrusted files

2025-01-26 Thread Jason Gerlowski
Severity: moderate Affected versions: - Apache Solr through 9.7 Description: Core creation allows users to replace "trusted" configset files with arbitrary configuration Solr instances that (1) use the "FileSystemConfigSetService" component (the default in "standalone" or "user-managed" mode

[Operator] [ANNOUNCE] Apache Solr Operator v0.9.0 released

2025-01-22 Thread Jason Gerlowski
The Apache Solr PMC is pleased to announce the release of the Apache Solr Operator v0.9.0, available for immediate download at: https://solr.apache.org/operator/artifacts.html The Apache Solr Operator is the official and recommended way of managing your Solr ecosystem on Kubernetes. Please report

Re: Admin UI Review Meeting - Feedback Requested

2024-11-22 Thread Jason Gerlowski
Hey Christos, The 26th works best for me personally. Though you and I have talked a good bit about the UI already, so feel free to ignore my vote if there's an opportunity to get new voices in the mix! Best, Jason On Wed, Nov 20, 2024 at 1:38 PM Christos Malliaridis wrote: > > Dear Solr-Commu

Re: "this.authenticationStore" is null after 9.7.0 Update

2024-10-28 Thread Jason Gerlowski
ot a fix that's going through review upstream. Will share here if there's any update on what release(s) that'll be going into. Best, Jason On Fri, Oct 25, 2024 at 10:34 AM Jason Gerlowski wrote: > > Hi guys, > > Thanks for the info! I managed to reproduce this local

Re: "this.authenticationStore" is null after 9.7.0 Update

2024-10-25 Thread Jason Gerlowski
Hi guys, Thanks for the info! I managed to reproduce this locally and created a JIRA ticket to investigate and release a fix: https://issues.apache.org/jira/browse/SOLR-17515 To me at least it looks like a pretty serious bug, and might end up resulting in a 9.7.1 release if other folks agree (an

Re: "this.authenticationStore" is null after 9.7.0 Update

2024-10-25 Thread Jason Gerlowski
Hi guys, Thanks for sharing your experience; sorry you've run into trouble here! I'm going to take a look at reproducing if I can - can you share some details about your cluster setup? - what authc/authz plugins are enabled on your cluster? If basicAuth is in use (as the stack suggests), is "fo

Re: New Solr Admin UI - POC Presentation

2024-10-06 Thread Jason Gerlowski
Thanks for putting this together Christos. Gonna aim to be at the Monday one! Jason On Sun, Oct 6, 2024 at 12:23 AM David Smiley wrote: > > Looking forward to it! > > On Mon, Sep 30, 2024 at 2:02 AM Christos Malliaridis < > c.malliari...@gmail.com> wrote: > > > Hello everyone, > > > > I would l

Re: TLOG/PULL query distribution

2024-08-26 Thread Jason Gerlowski
Hey Kevin, I hope I'm not replying here too late. The best docs on this are in the "SolrCloud Distributed Requests" page [1]. In short though - by default Solr won't have any preference, it does "just" a round-robin or random choice among the healthy replicas for each shard. Users may provide a
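
The default behavior described above (no preference, just a round-robin or random pick among the healthy replicas of a shard) can be pictured with a toy model. This is not Solr's code; the replica names and the `healthy` flag are made up purely for illustration.

```python
import itertools
import random


def healthy(replicas):
    """Keep only the replicas that are currently marked healthy."""
    return [r for r in replicas if r["healthy"]]


def pick_random(replicas, rng=random):
    """Random choice among healthy replicas, ignoring replica type."""
    return rng.choice(healthy(replicas))["name"]


def round_robin(replicas):
    """Endless iterator cycling through the healthy replicas in order."""
    return itertools.cycle([r["name"] for r in healthy(replicas)])


if __name__ == "__main__":
    # Hypothetical shard with one TLOG and two PULL replicas.
    shard = [
        {"name": "tlog-1", "healthy": True},
        {"name": "pull-1", "healthy": False},
        {"name": "pull-2", "healthy": True},
    ]
    rr = round_robin(shard)
    print([next(rr) for _ in range(4)])
    print(pick_random(shard))
```

In this model the replica type plays no part in the choice, which matches the "no preference by default" point in the reply.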

[Operator] [ANNOUNCE] Apache Solr Operator v0.8.1 released

2024-04-12 Thread Jason Gerlowski
The Apache Solr PMC is pleased to announce the release of the Apache Solr Operator v0.8.1. The Apache Solr Operator is a safe and easy way of managing a Solr ecosystem in Kubernetes. This release contains several bug fixes, some of which are highlighted below. It also resolves CVE-2024-31391, a c

CVE-2024-31391: Apache Solr Operator: Solr-Operator liveness and readiness probes may leak basic auth credentials

2024-04-12 Thread Jason Gerlowski
Severity: moderate Affected versions: - Apache Solr Operator 0.3.0 through 0.8.0 Description: Insertion of Sensitive Information into Log File vulnerability in the Apache Solr Operator. This issue affects all versions of the Apache Solr Operator from 0.3.0 through 0.8.0. When asked to boots

[ANNOUNCE] Apache Solr 9.5.0 released

2024-02-12 Thread Jason Gerlowski
The Solr PMC is pleased to announce the release of Apache Solr 9.5.0. Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Solr project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration,

Re: Receiving 405 error messages for alias deletion

2023-11-27 Thread Jason Gerlowski
uide/solr/latest/deployment-guide/collection-management.html#backup > 2. > > https://solr.apache.org/guide/solr/latest/deployment-guide/replica-management.html#addreplica > > > On Mon, 20 Nov 2023 at 17:39, Jason Gerlowski > wrote: > > > Good catch! This is definitely a

Re: Receiving 405 error messages for alias deletion

2023-11-20 Thread Jason Gerlowski
Good catch! This is definitely a bug that I introduced as a part of SOLR-16393 - sorry for the trouble. The problem, counter-intuitively, is this line. [1]. The annotations in this file are overriding the ones we need in 'DeleteAliasApi' (which is where the path and verb are specified). I'll get

Re: Solr Operator Tutorial

2023-11-07 Thread Jason Gerlowski
Hey, thanks for sharing. Each version of the operator supports a range of Solr versions. The latest operator version (0.8.0) only supports Solr versions >= 8.11. It looks like the tutorial you were following along with hasn't been updated to match the range of Solr versions, which is definitely

Re: Feedback/Review Requested: "Official" Solr Python Client

2023-10-30 Thread Jason Gerlowski
than just admin/management use-cases, but for the first pass that's where things are. Best, Jason On Mon, Oct 30, 2023 at 1:03 PM Jason Gerlowski wrote: > Hi all, > > On the development side of Solr I've been experimenting with creating API > clients in different program

Feedback/Review Requested: "Official" Solr Python Client

2023-10-30 Thread Jason Gerlowski
Hi all, On the development side of Solr I've been experimenting with creating API clients in different programming languages. These might not end up being as feature-rich as SolrJ (e.g. topology-aware routing, etc.), but the hope is that they'd give users a solid entry-point for interacting with

[Operator] [ANNOUNCE] Apache Solr Operator v0.8.0 released

2023-10-20 Thread Jason Gerlowski
The Apache Solr PMC is pleased to announce the release of the Apache Solr Operator v0.8.0. The Apache Solr Operator is a safe and easy way of managing a Solr ecosystem in Kubernetes. This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. Th

Re: [SOLR] JSONP callback wrapper not working on Solr 9.3.1

2023-09-19 Thread Jason Gerlowski
Hi Chris, I'm not all that familiar with the "json.wrf" functionality, so I could be off here. But starting in 9.3, Solr switched to using Jackson for serialization and away from the homegrown code it's used up to this point. I wonder whether that switch might've broken "json.wrf" in 9.3? I tri

Re: security.json not working for V2 syntax

2023-09-11 Thread Jason Gerlowski
Hi Craig, To be honest, I'm having a little trouble following all of the messages in this thread. This is at least my mail client's fault, as copy/paste has left the thread looking very jumbled in GMail. But I think it's also harder for us all to understand and help because we've been using a si

Re: Limiting Backup IO

2023-06-26 Thread Jason Gerlowski
Sounds like something that would be very useful for folks. I'm sure it'd be very dependent on your data and the type of backup, but I'm curious - if you can share Pierre - is there a number of cores-per-node being backed up where you start to see problems? Jason On Wed, Jun 21, 2023 at 8:34 AM P

Re: Solr Contributor Office Hours and Third Workshop on Testing Scheduled!

2022-11-16 Thread Jason Gerlowski
Small update: for a variety of scheduling reasons and to have sufficient prep time, we're pushing the third workshop back two weeks to December 1st. Hope to see you there! Best, Jason On Sat, Nov 5, 2022 at 11:53 AM Eric Pugh wrote: > > Need help setting up your IDE to write some code? Not sur

Re: Second Contributor Workshop Scheduled for November 3rd

2022-11-04 Thread Jason Gerlowski
drop in to meet, chat, and discuss/troubleshoot their newdev contributions. We haven't quite nailed down the timing yet, but look for an announcement about that in the coming few days. Thanks again to all who participated! Best, Jason On Wed, Oct 26, 2022 at 1:47 PM Eric Pugh wrote: > >

Re: V2 API 404 on all endpoints

2022-11-03 Thread Jason Gerlowski
Additionally, can you include the specific API requests you've tried? That, combined with the information Houston suggested above would help others to reproduce and debug. Best, Jason On Mon, Oct 31, 2022 at 3:41 PM Houston Putman wrote: > > Are there any differences between your jetty and solr

Re: Solr Contributor Bootcamp announced to coincide with ApacheCon USA

2022-10-21 Thread Jason Gerlowski
ng for > what I missed > > On Thu, Oct 20, 2022, 3:26 PM Jason Gerlowski wrote: > > > Hi Anakhe, > > > > Yes; still planned for today! We actually just finished our first > > time slot a few minutes ago. We'll be doing another session covering > >

Re: Solr Contributor Bootcamp announced to coincide with ApacheCon USA

2022-10-20 Thread Jason Gerlowski
oth on this coming Thursday, October 20th. > > > > Option 1: 9:00am EST (2pm GMT) using the following Zoom link: > > > > https://us02web.zoom.us/j/86936847178 > > > >

Re: Solr Contributor Bootcamp announced to coincide with ApacheCon USA

2022-10-17 Thread Jason Gerlowski
> people who would like a gentle introduction to contributing to this open > source search engine. You'll be mentored by Eric Pugh and Jason > Gerlowski, both active Solr committers - take a look! > https://opensourceconnections.com/blog/2022/10/03/solr-contributor-bootcamp/ > >

[Operator] [ANNOUNCE] Apache Solr Operator v0.6.0 released

2022-08-15 Thread Jason Gerlowski
The Apache Solr PMC is pleased to announce the release of the Apache Solr Operator v0.6.0. The Apache Solr Operator is a safe and easy way of managing a Solr ecosystem in Kubernetes. This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. Th

Re: Howto restore backup from solr cloud to solr standalone after upgrading to 9.0

2022-07-25 Thread Jason Gerlowski
e support then? With version 9.1? Or rather > 10.x? > > Michael > > On Thu, Jul 14, 2022 at 5:39 PM Jason Gerlowski > wrote: > > > Haha, a "happy accident" for sure! I never relish trimming > > functionality; it's necessary to keep the project even r

Re: Howto restore backup from solr cloud to solr standalone after upgrading to 9.0

2022-07-14 Thread Jason Gerlowski
ce software, i > don't know what is :) > > On Thu, Jul 14, 2022 at 8:41 AM Jason Gerlowski > wrote: > > > Hi Michael, > > > > As you mentioned, the community originally planned to remove support > > for creating new "snapshot-based" backups in 9

Re: Howto restore backup from solr cloud to solr standalone after upgrading to 9.0

2022-07-14 Thread Jason Gerlowski
Hi Michael, As you mentioned, the community originally planned to remove support for creating new "snapshot-based" backups in 9.0. But, luckily for your case: it fell off my radar and was never actually removed. So you should still be able to create snapshot backups using the incremental=false f
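
To make the `incremental=false` option concrete, here is a minimal sketch that builds a Collections API BACKUP request URL. The host, collection name, backup name, and location are hypothetical, and the exact parameter set should be checked against the Ref Guide for the Solr version in use.

```python
from urllib.parse import urlencode


def backup_url(base_url, collection, backup_name, location, incremental):
    """Build a Collections API BACKUP URL (the request is not sent)."""
    params = {
        "action": "BACKUP",
        "name": backup_name,
        "collection": collection,
        "location": location,
        # "false" requests the older snapshot-style backup format.
        "incremental": "true" if incremental else "false",
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    print(backup_url("http://localhost:8983/solr", "techproducts",
                     "snapshot-1", "/var/solr/backups", incremental=False))
```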

Re: backup and restore, standalone instance, windows paths

2022-02-01 Thread Jason Gerlowski
The "Provider \"c\" not installed" message seems to indicate that Solr is reading the "c:/" at the start of your "location" parameter as a URI protocol in the vein of "http://", "file://", "hdfs://", etc. Outside of that though I'm not sure how you could tweak the "location" to prevent that, and
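
The same ambiguity is easy to reproduce outside Solr. The sketch below uses Python's URL parser as an analogy only (Solr itself runs on Java, and its parsing code is not shown in the thread): a Windows drive letter followed by a colon looks exactly like a one-letter URI scheme. The sample paths are hypothetical.

```python
from urllib.parse import urlparse


def scheme_of(location: str) -> str:
    """Return whatever a generic URI parser treats as the scheme."""
    return urlparse(location).scheme


if __name__ == "__main__":
    # Hypothetical "location" values for illustration.
    for location in ["c:/solr/backups", "hdfs://namenode/backups",
                     "/var/solr/backups"]:
        print(f"{location!r} -> scheme {scheme_of(location)!r}")
```

A plain POSIX-style path yields an empty scheme, while the drive-letter form yields "c", which is consistent with the "Provider \"c\" not installed" message.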

Re: Solrj BucketBasedJsonFacet.java not parsing "missing" counts

2022-02-01 Thread Jason Gerlowski
+1 - this looks like a gap in SolrJ's coverage of JSON Facets. Please create a JIRA ticket and mention the ticket here. I'm familiar with that piece of SolrJ and would be happy to review/merge if anyone has a chance to write a patch. Best, Jason On Tue, Jan 18, 2022 at 9:44 AM Joel Bernstein

Re: Incremental backup for Standalone Solr

2021-11-15 Thread Jason Gerlowski
Hey Artem, Incremental backups were written primarily with SolrCloud in mind. Many of the APIs (backup listing, backup deletion, etc.) work only in SolrCloud, and most of our automated tests around backups focus on SolrCloud setups. That said, incremental backup in SolrCloud relies on doing incre

Re: Can Solr 8.10 S3BackupRepository work without a shared NFS drive?

2021-11-02 Thread Jason Gerlowski
To your second question: no. Solr's backup process works by sending a message to each shard leader to fetch and restore the data in that shard. Shard leaders fetch this data from the backup repository (S3 in this case), and then send copies of this data to any other replicas that might exist in t

Re: Backup solr fail with error Set system property 'solr.allowPaths' to add other allowed paths."},

2021-10-18 Thread Jason Gerlowski
Hi Tran, I think you're specifying 'solr.allowPaths' in the right place, but you probably need to remove the wildcard ('*') from the path you're using. Most 'solr.allowPaths' usages I've seen (including the doc example you mentioned) specify paths as absolute paths without a trailing wildcard. H

Re: Collection configuration related...

2021-08-09 Thread Jason Gerlowski
Hey Jigar, I don't think the sort of thing you're asking about is possible today. Solr allows users to define default param-values on the "request-handler" backed APIs that are defined in each solrconfig.xml of each configset, but it has no equivalent of this (afaik) that would work for the collec

Re: Schema Changes are not Visible after Collection Restore - Solr 8.8.2

2021-06-15 Thread Jason Gerlowski
reated https://issues.apache.org/jira/browse/SOLR-15478 for it. I attached > your steps as a comment. > > Yes, we tried to use RELOAD the collection, but it did not help to make the > schema changes visible. > I'll try RESTORE with collection.configName > > Kind Regar

Re: Schema Changes are not Visible after Collection Restore - Solr 8.8.2

2021-06-11 Thread Jason Gerlowski
ORE call - this will cause the backed up configset to be uploaded to ZK under a different name - hopefully avoiding whatever caching is causing the trouble here. Best, Jason On Fri, Jun 11, 2021 at 8:38 AM Jason Gerlowski wrote: > > Hey Steffen, > > I took a quick look at the backup/res

Re: Schema Changes are not Visible after Collection Restore - Solr 8.8.2

2021-06-11 Thread Jason Gerlowski
Hey Steffen, I took a quick look at the backup/restore codepath involved here - surprisingly the restore code itself hasn't changed between 8.6.3 and 8.8.2. In both 8.6.3 and 8.8.2, if the configset mentioned in the backup has the same name as a config currently in ZooKeeper, the version in ZK is

Re: Cannot import documents in Tutorial, exercise 3

2021-05-10 Thread Jason Gerlowski
Hey Thomas, Thanks for reporting the discrepancy so thoroughly! The "/extract" API used by "post.jar" still exists but was removed from default configurations starting back around 8.4. The tutorial in Solr's Ref Guide should have been updated around the same time, but it looks like a few nec

Re: Permission "all" gets evaluated before more specific ones

2021-05-10 Thread Jason Gerlowski
Hi Luca, Your permissions look correct, generally speaking. What version of Solr are you running? There are some known problems using the RuleBasedAuthorizationPlugin in standalone mode - see https://issues.apache.org/jira/browse/SOLR-13097 for more details. Normally I would suspect that you're

Re: Build Solr from Source - Failing Unit Tests in Intellij

2021-04-15 Thread Jason Gerlowski
Hi Phil, Solr has a number of tests that are flaky and fail seemingly at random. Some of this is true flakiness: bugs that only occur with certain timing behavior. Some of it is driven by the Solr Test Framework's heavy use of randomization in running test cases. ("ant test" assigns seeds for te
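
The role of seeds in randomized testing can be shown with a generic sketch. It is not the Solr Test Framework (which is Java); it only demonstrates why re-running with the seed from a failed run reproduces the same "random" inputs. The seed value is arbitrary.

```python
import random


def randomized_inputs(seed: int, size: int = 10) -> list[int]:
    """Generate a pseudo-random ordering that depends only on the seed."""
    rng = random.Random(seed)
    data = list(range(size))
    rng.shuffle(data)
    return data


if __name__ == "__main__":
    failing_seed = 20210415  # arbitrary example seed
    first = randomized_inputs(failing_seed)
    second = randomized_inputs(failing_seed)
    # Same seed, same inputs: a seed-dependent failure becomes repeatable.
    print(first == second)
```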

Re: Problem with Backup - Standalone Mode

2021-03-22 Thread Jason Gerlowski
Hi Adam, Solr's backup functionality integrated into the /replication handler is relatively simple - it just iterates over index files and copies them to the requested location. The only time the replication handler backup deletes files is when the backup fails and Solr tries to clean up after it
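
The copy-then-clean-up-on-failure pattern described above can be sketched as follows. This is a simplified stand-in, not the replication handler's implementation; the function name and directory layout are assumptions made for the example.

```python
import shutil
from pathlib import Path


def copy_index_files(index_dir: Path, backup_dir: Path) -> list[str]:
    """Copy every file in index_dir into a new backup_dir.

    Files are deleted only if the copy fails part-way, in which case the
    partial backup directory is removed before the error is re-raised.
    """
    backup_dir.mkdir(parents=True, exist_ok=False)
    copied = []
    try:
        for src in sorted(index_dir.iterdir()):
            if src.is_file():
                shutil.copy2(src, backup_dir / src.name)
                copied.append(src.name)
    except OSError:
        # Clean up the incomplete backup; the source index is never touched.
        shutil.rmtree(backup_dir, ignore_errors=True)
        raise
    return copied
```

On the success path nothing is ever deleted, so in this model missing files would point to something other than the backup itself.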

Re: Rule Based Authorization

2021-03-17 Thread Jason Gerlowski
Hi, Solr's Rule-Based Authz is complex to configure but this should be possible. If you share a security.json configuration you attempted, I can suggest tweaks from there to get you what you need. In case you haven't seen them, the docs here might be helpful: https://solr.apache.org/guide/8_8/ru