Term Vector Component

2025-01-08 Thread Christine Feldmann
I’m looking at using the Term Vector Component to get the document term frequency for a field. To use the Term Vector Component, the termVector attribute must be enabled for the field. Questions: 1. Will enabling the termVector attribute on a field cause term vector data to be stored in th
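
A minimal sketch of the kind of request the question describes, assuming the field is declared with `termVectors="true"` in the schema and that a request handler with the Term Vector Component is registered at `/tvrh` (as in Solr's sample configs). The host, collection name, and field name are hypothetical placeholders. It only builds the request URL and does not contact a server.

```python
from urllib.parse import urlencode


def term_vector_params(query, field, rows=10):
    """Build parameters asking the Term Vector Component for term frequencies."""
    return {
        "q": query,
        "rows": rows,
        "tv": "true",      # turn the component on for this request
        "tv.tf": "true",   # include per-document term frequency
        "tv.fl": field,    # restrict term vector output to this field
    }


def term_vector_url(base_url, collection, handler, params):
    """Join the pieces into a request URL (handler path is an assumption)."""
    return f"{base_url}/{collection}/{handler.lstrip('/')}?{urlencode(params)}"


# Hypothetical host, collection, handler, and field names.
params = term_vector_params("id:doc1", "content")
url = term_vector_url("http://localhost:8983/solr", "mycollection", "/tvrh", params)
print(url)
```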

Re: I may fork nutch. Is it a good plan?

2025-01-08 Thread anon anon
https://github.com/sourcegraph/zoekt is owned by Sourcegraph. I will choose livegrep or opengrok. On Wed, Jan 8, 2025 at 22:16, Jan Høydahl wrote: > I think a tailored code search engine is better for your job. Like > livegrep, zoekt or opengrok. > > Jan Høydahl > > > 8. jan. 2025 kl. 20:41 skr

Re: I may fork nutch. Is it a good plan?

2025-01-08 Thread Jan Høydahl
I think a tailored code search engine is better for your job. Like livegrep, zoekt or opengrok. Jan Høydahl > On Jan 8, 2025 at 20:41, anon anon wrote: > > I already knew and tested grep.app. It is definitely great software! > > I need my own search because: > - I need regex on ALL search in a wa

Deprecating "collection" parameter for routing

2025-01-08 Thread David Smiley
For many years[1], SolrCloud has supported a "collection" query parameter, both for routing a request to a collection as an alternative to "/solr/collectionNameHere/handler" [2],[3], and for doing distributed search across a number of collections[4]. It's actually a comma-delimited list of c
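
To illustrate the multi-collection use of the parameter mentioned above, here is a small sketch that builds a request URL with a comma-delimited "collection" value. The host and the collection names are hypothetical, and the sketch only constructs the URL; it says nothing about how the parameter will behave after any deprecation.

```python
from urllib.parse import urlencode


def multi_collection_url(base_url, routed_collection, handler, query, collections):
    """Build a search URL whose "collection" parameter lists several collections."""
    params = {
        "q": query,
        "collection": ",".join(collections),  # comma-delimited list
    }
    # Keep commas readable instead of percent-encoding them.
    return f"{base_url}/{routed_collection}/{handler}?{urlencode(params, safe=',')}"


# Hypothetical host and collection names.
url = multi_collection_url(
    "http://localhost:8983/solr",
    "collectionA",
    "select",
    "title:solr",
    ["collectionA", "collectionB"],
)
print(url)
```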

Re: I may fork nutch. Is it a good plan?

2025-01-08 Thread anon anon
Hello Gus! I already knew and tested grep.app. It is definitely great software! I need my own search because: - I need regex on ALL search in a way similar to sourcegraph that already does it - I plan to edit the search query in a custom way to set codeql rules in the distant future - I need to choose a

Re: I may fork nutch. Is it a good plan?

2025-01-08 Thread Gus Heck
Perhaps you're looking for https://grep.app/ ? It does regex search vs github and was recently acquired by Vercel. It was written by a friend of mine. On Wed, Jan 8, 2025 at 9:44 AM anon anon wrote: > Markus: I probably misunderstood your remark. > > Could it be possible to use a git clone proto

Re: solr9.5 facet exclusion not working properly with vector search

2025-01-08 Thread Yue Yu
Thank you Alessandro. I've filed the Jira with more details and an example: https://issues.apache.org/jira/browse/SOLR-17615 On Wed, Jan 8, 2025 at 8:56 AM Alessandro Benedetti wrote: > The best I got is : > > > https://solr.apache.org/guide/solr/latest/query-guide/dense-vector-search.html#implicit-

Re: solr9.5 facet exclusion not working properly with vector search

2025-01-08 Thread Alessandro Benedetti
The best I got is : https://solr.apache.org/guide/solr/latest/query-guide/dense-vector-search.html#implicit-pre-filtering This piece of work was added by Chris and it's quite complicated. If you feel there's a bug, please elaborate as much as possible, reproduce it and open a Jira ticket. We as

Re: Hybrid search with BoolQParser

2025-01-08 Thread Alessandro Benedetti
Hi, what do you mean by "The rows parameter is being overwritten by the topK parameter" ? The knn query parser builds a Lucene query that returns topK results max. This means that if you set rows>topK, you will get topK results, it's expected, it's how it works. If you want more results, you need
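
A short sketch of the rows/topK relationship described in this reply: since the knn query returns at most topK results, a client that wants up to `rows` hits can make sure topK is at least `rows`. The field name `vector` and the helper names are hypothetical; only the `{!knn f=... topK=...}` local-params syntax comes from Solr.

```python
def knn_query(field, vector, top_k):
    """Format a knn query parser string for the given field and vector."""
    vec = "[" + ", ".join(str(float(x)) for x in vector) + "]"
    return f"{{!knn f={field} topK={top_k}}}{vec}"


def knn_params(field, vector, rows, top_k=None):
    """Build request parameters, raising topK so it is never below rows."""
    effective_top_k = max(rows, top_k or 0)
    return {"q": knn_query(field, vector, effective_top_k), "rows": rows}


# Hypothetical field name and vector values.
params = knn_params("vector", [1, 2.5, 3], rows=20, top_k=10)
print(params["q"])
```

With `rows=20` and `top_k=10`, the helper raises topK to 20, so the knn query itself no longer caps the result count below `rows`.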

Re: [TOKYO Lucene/Solr Meetup] Lecture by Alessandro Benedetti and Mingchun Zhao

2025-01-08 Thread Alessandro Benedetti
Hi Roopa, this contribution has no target release. Actually, the developments are frozen and won't be resumed until we gather some sponsorship/funding. Cheers -- *Alessandro Benedetti* Director @ Sease Ltd. *Apache Lucene/Solr Committer* *Apache Solr PMC Member* e-mail: a

Re: I may fork nutch. Is it a good plan?

2025-01-08 Thread anon anon
Markus: I probably misunderstood your remark. Could it be possible to use a git clone protocol plugin please? On Wed, Jan 8, 2025 at 15:41, anon anon wrote: > David: > > I also would like to ensure I clarified correctly. > > I absolutely need to index source code to my personal search engine

Re: I may fork nutch. Is it a good plan?

2025-01-08 Thread anon anon
David: I also would like to ensure I clarified correctly. I absolutely need to index source code to my personal search engine to run a regex in solr. I want to look for vulnerabilities with the regex. Could you provide the steps for such a configuration of nutch and possibly solr please? Best

Re: I may fork nutch. Is it a good plan?

2025-01-08 Thread anon anon
Hello David, I need a git "clone" indexer to index as large a database of repos as possible to do cyber security research for my job. Hello Markus, I am open to any proposition. I did not find in the doc how to make a git clone only of a repo url from the crawler indexer config regex. I also