My team and I are building on this Solcolator:
https://github.com/SOLR4189/solcolator

Currently the processor is built for Solr 6.5.1. We are working on upgrading
our Solr, and I hope to release a complete version of our Solcolator as
open source then (it will target version 8.6.x).

We are making it an update processor. It can either be the last element of
the chain, replacing the usual processor that indexes the document, or be
the second-to-last processor in the chain, which also allows monitoring
atomic updates (although that is relatively costly).

By making it an update processor we do not rely on the streaming daemon,
which we found unsatisfactory because we want to let users define their own
monitors over the index.
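
A minimal sketch of the idea, to make the chain placement concrete. It does
not use the Solr API: the Stage interface, MonitorStage, IndexingStage and
the predicate-based "queries" are all hypothetical stand-ins chosen for
illustration. In Solr itself the monitor would presumably be a subclass of
UpdateRequestProcessor that overrides processAdd, and the registered queries
would be real Lucene queries rather than Java predicates.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class MonitorChainSketch {

    /** One stage of a (hypothetical) update chain. */
    interface Stage {
        void processAdd(Map<String, String> doc);
    }

    /** Stand-in for the stage that actually writes the document to the index. */
    static final class IndexingStage implements Stage {
        final List<Map<String, String>> indexed = new ArrayList<>();

        @Override
        public void processAdd(Map<String, String> doc) {
            indexed.add(doc);
        }
    }

    /** Matches every incoming document against the user-registered queries. */
    static final class MonitorStage implements Stage {
        // Query id -> predicate; the predicate is an assumption standing in for a stored query.
        private final Map<String, Predicate<Map<String, String>>> queries = new LinkedHashMap<>();
        // Next stage in the chain; null when the monitor is the last stage.
        private final Stage next;
        // Collected "queryId:docId" pairs; a real system would notify the query owner instead.
        final List<String> alerts = new ArrayList<>();

        MonitorStage(Stage next) {
            this.next = next;
        }

        void register(String queryId, Predicate<Map<String, String>> query) {
            queries.put(queryId, query);
        }

        @Override
        public void processAdd(Map<String, String> doc) {
            for (Map.Entry<String, Predicate<Map<String, String>>> e : queries.entrySet()) {
                if (e.getValue().test(doc)) {
                    alerts.add(e.getKey() + ":" + doc.get("id"));
                }
            }
            // Second-to-last placement: hand the document on to the indexing stage.
            if (next != null) {
                next.processAdd(doc);
            }
        }
    }

    public static void main(String[] args) {
        IndexingStage indexer = new IndexingStage();
        MonitorStage monitor = new MonitorStage(indexer);
        monitor.register("solr-alert", doc -> doc.getOrDefault("title", "").contains("Solr"));

        monitor.processAdd(Map.of("id", "1", "title", "Solr monitoring"));
        monitor.processAdd(Map.of("id", "2", "title", "Unrelated"));

        System.out.println("alerts  = " + monitor.alerts);
        System.out.println("indexed = " + indexer.indexed.size());
    }
}
```

Passing null as the next stage models the other placement, where the monitor
is the last element and no separate indexing stage runs after it.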

On Mon, Sep 6, 2021, 8:25 PM Charlie Hull <ch...@opensourceconnections.com>
wrote:

> Are you trying to monitor a stream of emails for certain patterns? In
> which case you might look at the Lucene Monitor
>
> https://lucene.apache.org/core/8_2_0/monitor/index.html?overview-summary.html
> https://issues.apache.org/jira/browse/LUCENE-8766, which was originally
> Luwak - at my previous company Flax we helped build several large-scale
> monitoring systems with this https://github.com/flaxsearch/luwak . It's
> not officially surfaced in Solr yet although my colleague Scott Stults
> has been working on some ideas: https://github.com/o19s/solr-monitor
>
> best
> Charlie
>
> On 06/09/2021 14:32, Dan Rosher wrote:
> > Hi,
> >
> > I was wondering if anyone had tried email alerts with streaming
> > expressions, and what their experience was if attempting this with say 12
> > million emails / day? Traditionally this might have been done with a
> > database cursor iterator daily.
> >
> > I was thinking of something like the following pseudocode expression, with
> > 'kafka' as a custom push expression:
> >
> > daemon(id="alertId",
> >         runInterval="1000",
> >         kafka(
> >          kafka_topic,
> >          alertId,
> >          topic(email_alerts,
> >            doc_collection,
> >            q="email query",
> >            fl="id, title, abstract",
> >            id="alertId",
> >            initialCheckpoint=0)
> >          )
> > )
> >
> > If you have done something like this 'where' would you typically run the
> > daemon, on replicas away from replicas running web queries?
> >
> > Many thanks in advance for any advice / suggestions,
> >
> > Dan
> >
>
> --
> Charlie Hull - Managing Consultant at OpenSource Connections Limited
> <www.o19s.com>
> Founding member of The Search Network <https://thesearchnetwork.com/>
> and co-author of Searching the Enterprise
> <https://opensourceconnections.com/about-us/books-resources/>
> tel/fax: +44 (0)8700 118334
> mobile: +44 (0)7767 825828
>
> OpenSource Connections Europe GmbH | Pappelallee 78/79 | 10437 Berlin
> Amtsgericht Charlottenburg | HRB 230712 B
> Geschäftsführer: John M. Woodell | David E. Pugh
> Finanzamt: Berlin Finanzamt für Körperschaften II
>
