On 18 April 2011 13:30, Roland Villemoes <r...@alpha-solutions.dk> wrote:
> Hi Gérard
>
> Thank you for your reply.

You're welcome ;-)

> I will absolutely look into it :-)
>
> Still wondering where the Solr Community will bring this in the future?

As with Lucene, Solr is mainly focused on indexing, not preprocessing.
I'm convinced that it's good for an open source project to stick to its
core added value. I've seen too many projects die because they tried to
do everything. But perhaps this could be done in Nutch, which covers
almost every part of a search engine. UIMA (as was suggested) may also
be a good solution.

> Looking at commercial products (we use these a lot here at Alpha Solutions),
> products like Exalead and FAST really do have impressive content (and
> search) pipelines and, most of all, impressive tools included. And as the
> future of FAST is extremely uncertain now, FAST customers moving to Solr
> will lack the pipelines and the tools.

But I don't suggest that open source projects follow any of these
commercial product roadmaps ;-) I prefer modular, self-contained and
efficient open source projects. Then, for integration, we need another
layer like UIMA or WebLab.

> Well, as consultants we can establish functionality by developing the missing
> pieces - but tools are still missing. And where customers could (almost)
> administer and work on pipelines themselves - they now need developers.

That's the tricky part. Hopefully these projects will mature to a level
where administration of high-level orchestration is easy. But to be
frank, it's not really easy in many ways, and if end users want to
administer this part themselves, they will still need some basic
understanding and training.

> Thanks for the input - looking forward to seeing more :-)

Good luck, keep me informed.

gd

> Roland Villemoes
>
> From: Gérard Dupont [mailto:ger.dup...@gmail.com]
> Sent: 18 April 2011 12:50
> To: dev@lucene.apache.org
> Subject: Re: PipeLine for Solr
>
> Hi Roland,
>
> We are proposing exactly this kind of integration facility with our open
> source WebLab project (see weblab-project.org). The tutorials are not
> perfect, but we are a team of about 15 engineers on the project, which has
> more than 4 years of history and is currently used in our projects. Our goal
> is to rely as much as possible on standards, and thus each processing step
> (SourceReader, Normaliser, Analyser...) is defined as a web service. The
> global orchestration is then done in BPEL. On the plus side we have a Solr
> indexer, but I'm quite sure it's not very optimised ;-).
>
> If you are interested I'll be happy to support you (I'm paid for that
> already ;-).
>
> cheers
>
> On 18 April 2011 12:37, Roland Villemoes <r...@alpha-solutions.dk> wrote:
>
> Hi All,
>
> I know this question may have been asked before - but I really did not find
> any usable answers browsing the archives. So I have to try the developer
> list here.
>
> We at Alpha Solutions often need a pipeline for handling crawling,
> analyzing and routing before we hit the UpdateRequestHandler in Solr. I know
> we could actually use the UpdateRequestHandler for this - but often we like
> to perform all these tasks before hitting Solr.
>
> We have been using OpenPipeline, which also offers a GUI that makes it
> rather nice to administer (if you tweak the GUI a bit!). It does seem,
> though, that OpenPipeline will not really get going. Nothing happens, and
> there is not really any community around it - and it doesn’t seem that the
> people behind it will ever move it further.
>
> So we are looking around for other “pipeline” projects that can work
> well with Solr.
>
> So - do any of you have any ideas on this? Any recommendations? Or any
> plans for this in Solr?
>
> Thanks a lot
>
> Best regards
>
> Roland Villemoes
> Tel: (+45) 22 69 59 62
> E-mail: r...@alpha-solutions.dk
>
> Alpha Solutions A/S
> Borgergade 2, 3.sal, DK-1300 Copenhagen K
> Tel: (+45) 70 20 65 38
> Web: www.alpha-solutions.dk
>
> --
> Gérard Dupont
> Information Processing Control and Cognition (IPCC)
> CASSIDIAN - an EADS company
>
> Document & Learning team - LITIS Laboratory

--
Gérard Dupont
Information Processing Control and Cognition (IPCC)
CASSIDIAN - an EADS company

Document & Learning team - LITIS Laboratory