As you mentioned, only equi-joins are currently supported. But you could pretty quickly adapt an existing join to do what you want.
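To make the idea concrete, here is a minimal stand-alone sketch of a left outer join driven by an arbitrary predicate instead of field equality. It is not Solr code: plain Java maps stand in for tuples, and the class name PrefixJoinSketch, the isDescendantPath helper, and the "right." field prefix are all made up for illustration. A real operator would have to be written against SolrJ's TupleStream classes and read from the two underlying streams rather than from in-memory lists.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical sketch, independent of Solr: maps play the role of tuples.
final class PrefixJoinSketch {

    // True when child equals parent or sits below it in the hierarchy.
    // The "/" guard keeps "/categories/category1" from matching
    // "/categories/category10".
    static boolean isDescendantPath(String child, String parent) {
        return child.equals(parent) || child.startsWith(parent + "/");
    }

    // Nested-loop left outer join: every left tuple is compared with
    // every right tuple using the supplied predicate.
    static List<Map<String, Object>> leftOuterJoin(
            List<Map<String, Object>> left,
            List<Map<String, Object>> right,
            BiPredicate<Map<String, Object>, Map<String, Object>> on) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> l : left) {
            boolean matched = false;
            for (Map<String, Object> r : right) {
                if (on.test(l, r)) {
                    // Copy the left tuple, then add the right tuple's fields
                    // under a "right." prefix (an illustrative choice) so
                    // that same-named fields such as "path" do not collide.
                    Map<String, Object> joined = new LinkedHashMap<>(l);
                    r.forEach((k, v) -> joined.put("right." + k, v));
                    out.add(joined);
                    matched = true;
                }
            }
            // Left outer semantics: keep unmatched left tuples as they are.
            if (!matched) {
                out.add(new LinkedHashMap<>(l));
            }
        }
        return out;
    }
}
```

Passing a predicate such as (l, r) -> isDescendantPath((String) r.get("path"), (String) l.get("path")) joins the two example paths from the question into a single tuple. The sketch compares every pair of tuples and holds the right side in memory, so it only makes sense when the right-hand side is small.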
https://github.com/apache/solr/blob/main/solr/solrj/src/java/org/apache/solr/client/solrj/io/stream/LeftOuterJoinStream.java

Joel Bernstein
http://joelsolr.blogspot.com/

On Wed, Dec 29, 2021 at 10:23 AM Eric Pugh <ep...@opensourceconnections.com> wrote:

> https://github.com/epugh/playing-with-solr-streaming-expressions/tree/master/streaming_expressions/src/main/java/com/o19s/solr/streaming
> has an example of parsing JSONL formatted docs and an example of using
> atomic updates ;-)
>
> https://github.com/epugh/playing-with-solr-streaming-expressions/blob/interact_with_tika_server/streaming_expressions/src/main/java/com/o19s/solr/streaming/SpaCyStream.java
> is an example of interacting with SpaCy ;-)
>
> > On Dec 29, 2021, at 10:01 AM, Damiano Albani <damiano.alb...@gmail.com> wrote:
> >
> > Hi Eric,
> >
> > Thanks for your feedback, I highly appreciate it.
> > I don't mind going the route of implementing something myself. I will
> > have a try.
> > By any chance, apart from looking at the official codebase, do you know
> > of any examples out there I could draw my inspiration from?
> >
> > Regards,
> >
> > On Wed, Dec 29, 2021 at 3:08 PM Eric Pugh <ep...@opensourceconnections.com>
> > wrote:
> >
> >> Damiano, I don’t really have a direct answer for you. However, one of
> >> the aspects of Streaming that I really like is that it’s relatively
> >> easy to create your own operators and add them to Solr. I find that I
> >> often just create my own operator to fill in the gap of what is
> >> available.
> >>
> >> I do think joining disparate datasets to make new datasets is one of the
> >> most interesting uses of Streaming, so would love to see what you cook
> >> up.
> >>
> >> Eric
> >>
> >>> On Dec 29, 2021, at 6:39 AM, Damiano Albani <damiano.alb...@gmail.com>
> >>> wrote:
> >>>
> >>> Hello,
> >>>
> >>> I'm new to streaming expressions, so I'm trying to understand their
> >>> features and limitations.
> >>> In particular the so-called "stream operators" implementing join
> >>> operations. Like "innerJoin", "leftOuterJoin", etc.
> >>>
> >>> I see that they support a "on" parameter, defining the *equality* check
> >>> to be performed.
> >>> But, coming from the SQL world, I'm used to being able to use a variety
> >>> of comparison operators in join predicates. That is, not only equality,
> >>> as in "equi-joins".
> >>>
> >>> Is there a reason why the current implementation of Solr supports
> >>> equi-joins only? Would it be technically possible (and desired) to
> >>> support other comparison operators with joins?
> >>> And maybe somehow allow the use of the available stream evaluators
> >>> <https://solr.apache.org/guide/8_11/stream-evaluator-reference.html>?
> >>>
> >>> To give the context of my question: I'm trying to join 2 sets of
> >>> documents with a hierarchical relationship.
> >>> My goal is to join them using a "path" field on one side and
> >>> "descendent_path" field on the other side.
> >>> But it looks like that only doc values are accessible (and not analyzed
> >>> ones) in streams, so I suppose I'd be left with a join criteria like
> >>> this pseudo-code:
> >>>
> >>>> on="starts_with(right.path, left.path)"
> >>>
> >>> Where, in this hypothetical example:
> >>>
> >>>> left.path="/categories/category1"
> >>>> right.path="/categories/category1/sub-categories/sub-category-a"
> >>>
> >>> Or do I completely misunderstand how Solr (streams) work? ;-)
> >>> Thanks for your help!
> >>>
> >>> Regards,
> >>>
> >>> --
> >>> Damiano Albani
> >>
> >> _______________________
> >> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 |
> >> http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
> >> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
> >> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
> >>
> >> This e-mail and all contents, including attachments, is considered to be
> >> Company Confidential unless explicitly stated otherwise, regardless of
> >> whether attachments are marked as such.
> >
> > --
> > Damiano Albani