I’m a big believer in the right tool for the job. As was said before, if you’re just doing a field:value query or four with no complications, sure, use a standard RDBMS. But if you inform the client that something like Leaves AND whitm* title^3 with bf:title^3 author^2 is possible, the conversation changes, and the right questions start coming up.
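
For illustration, here is roughly how a boosted query like that could be sent to Solr's edismax parser. This is only a sketch under assumptions: the host, the "books" collection name, and my reading of the boosts as qf=title^3 author^2 are all made up for the example, and I've left bf out because it expects a function query rather than a plain field boost.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; the host and collection name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/books/select"


def build_edismax_url(user_query, field_boosts, rows=10):
    """Build a Solr select URL that uses edismax with per-field boosts."""
    params = {
        "defType": "edismax",
        "q": user_query,
        # qf takes space-separated field^boost pairs, e.g. "title^3 author^2".
        "qf": " ".join(f"{field}^{boost}" for field, boost in field_boosts.items()),
        "rows": rows,
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


# Boolean AND, a trailing wildcard, and different weights for title and author.
url = build_edismax_url("leaves AND whitm*", {"title": 3, "author": 2})
print(url)
```

Getting the same mix of wildcard matching and per-field weighting out of a plain field:value SQL query usually takes noticeably more hand-written work, which is the point of the comparison.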
> On Mar 17, 2022, at 3:17 PM, Davis, Daniel (NIH/NLM) [C]
> <daniel.da...@nih.gov.invalid> wrote:
>
> This is really a question of how big the haystack is and what sort of search
> task users are trying to accomplish.
>
> If there is no IDF (a mistake I did *not* make at
> https://www.indexengines.com/ despite using home-grown search BTW), then
> there is an assumption both on the size of the documents being similar and
> also on corpora linguistics.
>
> In any case, if users are basically doing "Known Item Search", e.g. entering
> in keywords from a title, then PostgreSQL should do OK.
>
> On 3/17/22, 1:34 PM, "Alessandro Benedetti" <a.benede...@sease.io> wrote:
>
> Ok Charlie, Eric,
> we are on the same page.
> I agree it's definitely possible with some custom proxy work on both Quepid
> and RRE, I meant it's not possible to directly point to the DB (for example
> via JDBC).
> Thanks!
>
> Cheers
> --------------------------
> Alessandro Benedetti
> Apache Lucene/Solr PMC member and Committer
> Director, R&D Software Engineer, Search Consultant
>
> www.sease.io
>
>> On Thu, 17 Mar 2022 at 17:03, Bayer, Samuel <s...@mitre.org> wrote:
>>
>> You are, indeed :-).
>>
>> What appears to be the problem - and I'm not sure yet, but it sure seems
>> like a good culprit - is that Postgres search, for reasons that mystify me,
>> was implemented with TF but no notion of IDF. There are various extensions
>> that add IDF-like properties to Postgres search. Why it didn't start out
>> that way is a mystery to me, and I don't know how stable any of the
>> extensions that do this actually are.
>>
>> At the moment, that's my diagnosis of the discrepancy. I'll probably
>> follow up with the Postgres folks to see if they have any more insight into
>> those extensions.
>>
>> Thanks to all who responded.
>>
>> Cordially,
>> Sam Bayer
>> The MITRE Corporation
>>
>>> On 3/17/22 12:42 PM, Eric Pugh wrote:
>>> What I’ve done to compare other search engines with RRE and Quepid is to
>>> put a proxy in the middle that converts your query into what looks like a
>>> Solr request/response ;-). This works great for custom search APIs, and I
>>> *guess* you could do it with database-backed search?
>>>
>>> Now we are probably getting beyond what Sam was hoping to do!
>>>
>>>> On Mar 17, 2022, at 11:56 AM, Alessandro Benedetti <a.benede...@sease.io> wrote:
>>>>
>>>> This is an interesting question.
>>>> I second both comments so far (from Eric and David), but I am afraid at the
>>>> moment the open-source tools for search quality evaluation can't really
>>>> compare Postgres to Solr.
>>>> As far as I know, both Quepid (Eric, correct me if I am wrong) and RRE
>>>> (https://github.com/SeaseLtd/rated-ranking-evaluator and also the Enterprise
>>>> version) are able to compare only Apache Solr and Elasticsearch backed
>>>> systems (against each other, or against different configurations).
>>>>
>>>> In general, I would recommend following David's suggestions:
>>>> - collect your requirements (both functional and performance-wise)
>>>> - compare
>>>>
>>>> I have seen in the past many times DBs used as terrible search engines and
>>>> search engines used as terrible DBs.
>>>> Many times I have seen queries on a search engine perform poorly because
>>>> they were designed as if they were DB queries.
>>>>
>>>> Cheers
>>>>
>>>> --------------------------
>>>> Alessandro Benedetti
>>>> Apache Lucene/Solr PMC member and Committer
>>>> Director, R&D Software Engineer, Search Consultant
>>>>
>>>> www.sease.io
>>>>
>>>> On Sat, 5 Mar 2022 at 05:04, David Smiley <dsmi...@apache.org> wrote:
>>>>
>>>>> Hello Sam,
>>>>>
>>>>> You are a familiar name from my MITRE days :-)
>>>>>
>>>>> Check out Solr's feature list and see how it compares to that of Postgres.
>>>>> If you are only doing the most basic default relevancy ranked top-N search
>>>>> with default text analysis, then the tech/maintenance overhead might not be
>>>>> worth it. I'm looking at this as such an example:
>>>>> https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=solr
>>>>>
>>>>> On the other hand, if you want to ensure that you're able to make search
>>>>> the best it can be for your users, then keeping Solr and using it more will
>>>>> get you there; a database won't. To a database, full-text search is just
>>>>> one checkbox of many concerns. The capabilities there are usually very
>>>>> simple. It's fine for a demo/POC -- getting started.
>>>>>
>>>>> One feature in particular I want to call out is faceting. To some apps,
>>>>> it's a game changer that can pivot the UX from merely having a basic search
>>>>> box to having navigation filters and everything else, at which point Solr
>>>>> is the foundation of what's driving the UX. I've seen people/apps miss
>>>>> this -- the user experience is so clumsy without it for rich/structured
>>>>> data in particular. If you've ever used a Maven repository manager like
>>>>> Nexus or its competitors (last I checked), they are still stuck in the
>>>>> stone age -- it's painful when you've been exposed to so much better. On
>>>>> the backend, if all you know is a database, you may not see how to make a
>>>>> faceting UI work because it's rather unnatural for SQL.
>>>>>
>>>>> Eric's response was great too.
>>>>>
>>>>> ~ David Smiley
>>>>> Apache Lucene/Solr Search Developer
>>>>> http://www.linkedin.com/in/davidwsmiley
>>>>>
>>>>> On Fri, Mar 4, 2022 at 9:33 AM Bayer, Samuel <s...@mitre.org> wrote:
>>>>>
>>>>>> Hi all -
>>>>>>
>>>>>> In the interest of reducing my technology stack, I'm exploring whether
>>>>>> using Postgres full-text search instead of Solr might be an option when I
>>>>>> need both complex querying and full-text search. In my experience, so far,
>>>>>> Postgres can't compare to Solr, but I'm trying to understand why, in order
>>>>>> to have more of an ability to evaluate the functionality/complexity
>>>>>> tradeoffs. I know something about search technologies, but I'm not an
>>>>>> expert by any stretch of the imagination, and I've been looking for sources
>>>>>> that talk about the comparison in an informed way - people, blogs,
>>>>>> articles. So far, everything I've found is extremely basic. Does anyone
>>>>>> have any pointers for me?
>>>>>>
>>>>>> Thanks in advance -
>>>>>> Sam Bayer
>>>>>> The MITRE Corporation
>>>>>> s...@mitre.org
>>>>>>
>>>
>>> _______________________
>>> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 |
>>> http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
>>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>>> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
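
On the "TF but no IDF" diagnosis quoted above, a toy scorer may make the effect easier to see. This is a minimal sketch with a made-up three-document corpus and a textbook log(N/df) weighting; it is not how Postgres's ts_rank or Lucene's scoring actually compute their numbers, only an illustration of why ignoring document frequency can let a common word dominate the ranking.

```python
import math
from collections import Counter

# Toy corpus, invented purely for illustration.
docs = {
    "d1": "the solr search the index the query",
    "d2": "postgres the ranking",
    "d3": "the the the the database",
}


def tf_score(query, text):
    """Sum of raw term frequencies: every query term counts the same."""
    counts = Counter(text.split())
    return sum(counts[term] for term in query.split())


def tfidf_score(query, text, corpus):
    """Term frequency weighted by log(N / df), so rare terms count more."""
    counts = Counter(text.split())
    n_docs = len(corpus)
    score = 0.0
    for term in query.split():
        df = sum(1 for body in corpus.values() if term in body.split())
        if df == 0:
            continue  # term appears nowhere, so it contributes nothing
        score += counts[term] * math.log(n_docs / df)
    return score


query = "the ranking"
tf_ranking = sorted(docs, key=lambda d: tf_score(query, docs[d]), reverse=True)
tfidf_ranking = sorted(
    docs, key=lambda d: tfidf_score(query, docs[d], docs), reverse=True
)
print("TF only:", tf_ranking)
print("TF-IDF: ", tfidf_ranking)
```

With TF alone, the document that merely repeats "the" comes out on top; once "the" is down-weighted for appearing in every document, the only document that mentions "ranking" wins.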