Ok Charlie, Eric, we are on the same page. I agree it's definitely possible with some custom proxy work on both Quepid and RRE; what I meant is that it's not possible to point them directly at the DB (for example via JDBC). Thanks!
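For anyone curious what such a proxy might look like, here is a minimal sketch in Python. It only shows the translation step: Solr-style request parameters in, a Solr-style JSON response out. Everything specific is an assumption for illustration: the `documents` table, the `search_vector` column, the `id`/`title`/`score` fields, and the `run_query` callable that stands in for a real database client. Which response fields Quepid or RRE actually read is not verified here.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical schema: a "documents" table with a tsvector column "search_vector".
# The %s placeholders follow the psycopg parameter style.
SQL_TEMPLATE = (
    "SELECT id, title, ts_rank(search_vector, query) AS score "
    "FROM documents, plainto_tsquery('english', %s) AS query "
    "WHERE search_vector @@ query "
    "ORDER BY score DESC LIMIT %s OFFSET %s"
)

Row = Tuple[str, str, float]  # (id, title, score)


def solr_params_to_sql(params: Dict[str, str]) -> Tuple[str, Tuple[str, int, int]]:
    """Map the Solr parameters q/rows/start onto a parameterized SQL query."""
    q = params.get("q", "")
    rows = int(params.get("rows", 10))
    start = int(params.get("start", 0))
    return SQL_TEMPLATE, (q, rows, start)


def rows_to_solr_response(params: Dict[str, str], rows: Sequence[Row]) -> dict:
    """Wrap database rows in the general shape of a Solr JSON select response."""
    docs: List[dict] = [
        {"id": doc_id, "title": title, "score": score}
        for doc_id, title, score in rows
    ]
    return {
        "responseHeader": {"status": 0, "QTime": 0, "params": dict(params)},
        "response": {
            # Simplification: counts only the returned page, not all matches.
            "numFound": len(docs),
            "start": int(params.get("start", 0)),
            "docs": docs,
        },
    }


def handle_select(
    params: Dict[str, str],
    run_query: Callable[[str, tuple], Sequence[Row]],
) -> dict:
    """Translate the request, run it through the supplied client, translate back."""
    sql, args = solr_params_to_sql(params)
    return rows_to_solr_response(params, run_query(sql, args))
```

A real proxy would put `handle_select` behind an HTTP endpoint and pass in a function that executes the SQL against Postgres; a separate count query would be needed for an accurate `numFound`.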
Cheers
--------------------------
Alessandro Benedetti
Apache Lucene/Solr PMC member and Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io


On Thu, 17 Mar 2022 at 17:03, Bayer, Samuel <s...@mitre.org> wrote:

> You are, indeed :-).
>
> What appears to be the problem - and I'm not sure yet, but it sure seems
> like a good culprit - is that Postgres search, for reasons that mystify me,
> was implemented with TF but no notion of IDF. There are various extensions
> that add IDF-like properties to Postgres search. Why it didn't start out
> that way is a mystery to me, and I don't know how stable any of the
> extensions that do this actually are.
>
> At the moment, that's my diagnosis of the discrepancy. I'll probably
> follow up with the Postgres folks to see if they have any more insight into
> those extensions.
>
> Thanks to all who responded.
>
> Cordially,
> Sam Bayer
> The MITRE Corporation
>
> On 3/17/22 12:42 PM, Eric Pugh wrote:
> > What I’ve done to compare other search engines with RRE and Quepid is to
> > put a proxy in the middle that converts your query into what looks like a
> > Solr request/response ;-). This works great for custom Search API’s, and I
> > *guess* you could do it with database backed search?
> >
> > Now we are probably getting beyond what Sam was hoping to do!
> >
> >> On Mar 17, 2022, at 11:56 AM, Alessandro Benedetti <a.benede...@sease.io> wrote:
> >>
> >> This is an interesting question.
> >> I second both comments so far (from Eric and David), but I am afraid at the
> >> moment the open-source tools for search quality evaluation can't really
> >> compare Postgres to Solr.
> >> As far as I know, both Quepid (Eric correct me if I am wrong) and RRE
> >> (https://github.com/SeaseLtd/rated-ranking-evaluator and also the Enterprise
> >> version) are able to compare only Apache Solr and Elasticsearch backed
> >> systems (against each other, or against different configurations).
> >>
> >> In general, I would recommend following David's suggestions:
> >> - collect your requirements (both functional and performance-wise)
> >> - compare
> >>
> >> Many times in the past I have seen DBs used as terrible search engines and
> >> search engines used as terrible DBs.
> >> Many times I have seen queries on a search engine perform poorly because
> >> they were designed as if they were DB queries.
> >>
> >> Cheers
> >>
> >> --------------------------
> >> Alessandro Benedetti
> >> Apache Lucene/Solr PMC member and Committer
> >> Director, R&D Software Engineer, Search Consultant
> >>
> >> www.sease.io
> >>
> >>
> >> On Sat, 5 Mar 2022 at 05:04, David Smiley <dsmi...@apache.org> wrote:
> >>
> >>> Hello Sam,
> >>>
> >>> You are a familiar name from my MITRE days :-)
> >>>
> >>> Check out Solr's feature list and see how it compares to that of Postgres.
> >>> If you are only doing the most basic default relevancy ranked top-N search
> >>> with default text analysis, then the tech/maintenance overhead might not be
> >>> worth it. I'm looking at this as such an example:
> >>> https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=solr
> >>>
> >>> On the other hand, if you want to ensure that you're able to make search
> >>> the best it can be for your users, then keeping Solr and using it more will
> >>> get you there; a database won't. To a database, full-text-search is just
> >>> one checkbox of many concerns. The capabilities there are usually very
> >>> simple. It's fine for a demo/POC -- getting started.
> >>>
> >>> One feature in particular I want to call out is faceting. To some apps,
> >>> it's a game changer that can pivot the UX from merely having a basic search
> >>> box to having navigation filters and everything else, at which point Solr
> >>> is the foundation of what's driving the UX. I've seen people/apps miss
> >>> this -- the user experience is so clumsy without it for rich/structured
> >>> data in particular.
> >>> If you've ever used a Maven repository manager like
> >>> Nexus or its competitors (last I checked), they are still stuck in the
> >>> stone-age -- it's painful when you've been exposed to so much better. On
> >>> the backend, if all you know is a database, you may not see how to make a
> >>> faceting UI work because it's rather unnatural for SQL.
> >>>
> >>> Eric's response was great too.
> >>>
> >>> ~ David Smiley
> >>> Apache Lucene/Solr Search Developer
> >>> http://www.linkedin.com/in/davidwsmiley
> >>>
> >>>
> >>> On Fri, Mar 4, 2022 at 9:33 AM Bayer, Samuel <s...@mitre.org> wrote:
> >>>
> >>>> Hi all -
> >>>>
> >>>> In the interest of reducing my technology stack, I'm exploring whether
> >>>> using Postgres full-text search instead of Solr might be an option when I
> >>>> need both complex querying and full-text search. In my experience, so far,
> >>>> Postgres can't compare to Solr, but I'm trying to understand why, in order
> >>>> to have more of an ability to evaluate the functionality/complexity
> >>>> tradeoffs. I know something about search technologies, but I'm not an
> >>>> expert by any stretch of the imagination, and I've been looking for sources
> >>>> that talk about the comparison in an informed way - people, blogs,
> >>>> articles. So far, everything I've found is extremely basic. Does anyone
> >>>> have any pointers for me?
> >>>>
> >>>> Thanks in advance -
> >>>> Sam Bayer
> >>>> The MITRE Corporation
> >>>> s...@mitre.org
> >>>>
> >
> > _______________________
> > Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 |
> > http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
> > Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
> > <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
> >
> > This e-mail and all contents, including attachments, is considered to be
> > Company Confidential unless explicitly stated otherwise, regardless of
> > whether attachments are marked as such.
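
To make the "TF but no notion of IDF" diagnosis from earlier in the thread a bit more concrete, here is a toy Python sketch that ranks three tiny documents twice: once by raw term frequency and once with each term weighted by a plain log(N/df) inverse document frequency. It reproduces neither Postgres's ts_rank formula nor Lucene's similarity; the corpus, the query, and the function names are invented for illustration only.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_score(query: List[str], doc: List[str]) -> float:
    """Sum of raw term counts: every query term counts the same."""
    counts = Counter(doc)
    return float(sum(counts[term] for term in query))


def idf(term: str, corpus: Dict[str, List[str]]) -> float:
    """Plain log(N / df); terms found in every document get weight 0."""
    df = sum(1 for doc in corpus.values() if term in doc)
    if df == 0:
        return 0.0
    return math.log(len(corpus) / df)


def tf_idf_score(query: List[str], doc: List[str], corpus: Dict[str, List[str]]) -> float:
    """Term counts weighted by how rare each term is across the corpus."""
    counts = Counter(doc)
    return sum(counts[term] * idf(term, corpus) for term in query)


def rank(scores: Dict[str, float]) -> List[str]:
    """Document ids ordered from highest to lowest score."""
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Invented toy corpus: "search" occurs everywhere, "faceting" only in doc b.
corpus = {
    "a": "search search search".split(),
    "b": "faceting search".split(),
    "c": "search basics".split(),
}
query = ["faceting", "search"]

tf_ranking = rank({d: tf_score(query, words) for d, words in corpus.items()})
tfidf_ranking = rank({d: tf_idf_score(query, words, corpus) for d, words in corpus.items()})

print("TF only:", tf_ranking)   # doc "a" comes first
print("TF-IDF: ", tfidf_ranking)  # doc "b" comes first
```

With counts alone, the document that merely repeats the ubiquitous word "search" wins; once rarity is taken into account, the only document that mentions "faceting" moves to the top. A difference of this kind is one plausible source of the ranking discrepancy described above.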