[jira] [Resolved] (SOLR-6526) Solr Streaming API

Joel Bernstein (JIRA) Thu, 05 Feb 2015 11:57:06 -0800

     [ 
https://issues.apache.org/jira/browse/SOLR-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Joel Bernstein resolved SOLR-6526.
----------------------------------
    Resolution: Duplicate

This ticket has been superseded by SOLR-7082. 

> Solr Streaming API
> ------------------
>
>                 Key: SOLR-6526
>                 URL: https://issues.apache.org/jira/browse/SOLR-6526
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java
>            Reporter: Joel Bernstein
>             Fix For: Trunk
>
>         Attachments: SOLR-6526.patch
>
>
> It would be great if there were a SolrJ library that could connect to Solr's 
> /export handler (SOLR-5244) and perform streaming operations on the sorted 
> result sets.
> This ticket defines the base interfaces and implementations for the Streaming 
> API. The base API contains three classes:
> *SolrStream*: This represents a stream from a single Solr instance. It speaks 
> directly to the /export handler and provides methods to read() Tuples and 
> close() the stream.
> *CloudSolrStream*: This represents a stream from a SolrCloud collection. It 
> speaks with ZooKeeper (Zk) to discover the Solr instances in the collection 
> and then creates SolrStreams to make the requests. The results from the 
> underlying streams are merged inline to produce a single sorted stream of 
> tuples.
> *Tuple*: The data structure returned by the read() method of the SolrStream 
> API. It is nested to support grouping and Cartesian product set operations.
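The "merged inline" step described for CloudSolrStream can be pictured as a k-way merge over already-sorted inputs. The following is only a minimal sketch of that general technique, not the code from the attached patch. The names SortedMergeSketch, Source, fromList, merge and drain are hypothetical, and plain integers stand in for Tuples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class SortedMergeSketch {

    /** Hypothetical stand-in for a per-instance stream: read() returns null when exhausted. */
    interface Source<T> {
        T read();
    }

    /** Wraps an already-sorted list as a Source, for demonstration only. */
    static <T> Source<T> fromList(List<T> sorted) {
        Iterator<T> it = sorted.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** One heap entry: the current head item of a source plus the source it came from. */
    private static final class Head<T> {
        final T item;
        final Source<T> source;

        Head(T item, Source<T> source) {
            this.item = item;
            this.source = source;
        }
    }

    /** Merges sources that are each sorted by cmp into a single sorted Source. */
    static <T> Source<T> merge(List<Source<T>> sources, Comparator<T> cmp) {
        PriorityQueue<Head<T>> heap =
                new PriorityQueue<>((a, b) -> cmp.compare(a.item, b.item));
        // Prime the heap with the first item of every non-empty source.
        for (Source<T> s : sources) {
            T first = s.read();
            if (first != null) {
                heap.add(new Head<>(first, s));
            }
        }
        return () -> {
            Head<T> smallest = heap.poll();
            if (smallest == null) {
                return null; // every source is exhausted
            }
            // Refill the heap from the source that just supplied an item.
            T next = smallest.source.read();
            if (next != null) {
                heap.add(new Head<>(next, smallest.source));
            }
            return smallest.item;
        };
    }

    /** Reads a source to the end and collects the items in order. */
    static <T> List<T> drain(Source<T> s) {
        List<T> out = new ArrayList<>();
        T item;
        while ((item = s.read()) != null) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Source<Integer>> shards = new ArrayList<>();
        shards.add(fromList(Arrays.asList(1, 4, 9)));
        shards.add(fromList(Arrays.asList(2, 3, 10)));
        shards.add(fromList(Arrays.asList(5)));
        // prints [1, 2, 3, 4, 5, 9, 10]
        System.out.println(drain(merge(shards, Comparator.<Integer>naturalOrder())));
    }
}
```

Only one item per source is held in memory at a time, which is why this approach suits large sorted result sets.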
> Once these base classes are implemented, they pave the way for building 
> *Decorator* streams that perform operations on the sorted Tuple sets. For 
> example:
> {code}
> // Create three CloudSolrStreams to different SolrCloud clusters.
> // They could be anywhere in the world.
> SolrStream stream1 = new CloudSolrStream(zkUrl1, queryRequest1, "a"); // Alias this stream as "a"
> SolrStream stream2 = new CloudSolrStream(zkUrl2, queryRequest2, "b"); // Alias this stream as "b"
> SolrStream stream3 = new CloudSolrStream(zkUrl3, queryRequest3, "c"); // Alias this stream as "c"
> // Merge join stream1 and stream2, using a comparator to compare tuples.
> MergeJoinStream joinStream1 = new MergeJoinStream(stream1, stream2, new MyComp());
> // Hash join the tuples from joinStream1 with stream3.
> // The HashKey instances define the hash keys for the tuples.
> HashJoinStream joinStream2 = new HashJoinStream(joinStream1, stream3, new HashKey(), new HashKey());
> //Sum the aliased fields from the joined tuples.
> SumStream sumStream1 = new SumStream(joinStream2, "a.field1");
> SumStream sumStream2 = new SumStream(sumStream1, "b.field2");
> Tuple t = null;
> //Read from the stream until it's finished.
> while((t = sumStream2.read()) != null);
> //Get the sums from the joined data.
> long sum1 = sumStream1.getSum();
> long sum2 = sumStream2.getSum();
> {code}
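
To make the decorator idea in the quoted example concrete, here is a minimal, self-contained sketch of how a summing decorator could behave. It is an illustration under stated assumptions, not the API from the attached patch. The TupleStream interface, the fromList and tuple helpers, and the use of a Map in place of Tuple are all hypothetical. Only the SumStream name, the getSum() call and the field names come from the example above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class DecoratorSketch {

    /** Hypothetical stream contract: read() returns the next tuple, or null at end of stream. */
    interface TupleStream {
        Map<String, Object> read();
    }

    /** Decorator that passes tuples through unchanged while summing one numeric field. */
    static final class SumStream implements TupleStream {
        private final TupleStream inner;
        private final String field;
        private long sum;

        SumStream(TupleStream inner, String field) {
            this.inner = inner;
            this.field = field;
        }

        @Override
        public Map<String, Object> read() {
            Map<String, Object> t = inner.read();
            if (t != null) {
                Object v = t.get(field);
                if (v instanceof Number) {
                    sum += ((Number) v).longValue();
                }
            }
            return t; // null is forwarded so callers still see end of stream
        }

        long getSum() {
            return sum;
        }
    }

    /** Wraps a list of tuples as a stream, for demonstration only. */
    static TupleStream fromList(List<Map<String, Object>> tuples) {
        Iterator<Map<String, Object>> it = tuples.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Builds a tuple carrying the two aliased fields used in the example. */
    static Map<String, Object> tuple(long a, long b) {
        Map<String, Object> t = new HashMap<>();
        t.put("a.field1", a);
        t.put("b.field2", b);
        return t;
    }

    public static void main(String[] args) {
        TupleStream joined = fromList(Arrays.asList(tuple(1, 10), tuple(2, 20), tuple(3, 30)));
        SumStream sumStream1 = new SumStream(joined, "a.field1");
        SumStream sumStream2 = new SumStream(sumStream1, "b.field2");
        // Reading the outermost stream drives every wrapped stream.
        while (sumStream2.read() != null) {
            // nothing to do per tuple; the decorators accumulate as tuples pass through
        }
        // prints 6 60
        System.out.println(sumStream1.getSum() + " " + sumStream2.getSum());
    }
}
```

Because each decorator forwards the tuple it reads, the two sums are computed in a single pass over the joined data.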



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

