Hi, if we also need other complementary connectors, we could consider using Apache ManifoldCF for crawling. We now have a lot of new connectors and a new GUI coming in the next release!
^__^

2017-03-28 21:44 GMT+02:00 Grant Ingersoll <gsing...@apache.org>:

> https://github.com/lucidworks/searchhub has all the crawlers/setup already
> setup for a number of ASF projects (email, Github, websites, wikis, Stack
> Overflow) and a pretty easy framework for specifying others (I looked at
> the FOAF stuff, but it wasn't consistent enough to automate). Lucidworks
> (my employer/company) is happy to donate licenses of Fusion, our commercial
> product on top of Solr and Spark, if the ASF will provide hardware. Or, if
> someone will put up the Pull Request to add all the projects, we can host
> it, as we already have a multinode cluster setup and we have read only APIs
> available, so it would just take UI integration.
>
> -Grant
>
>
> On Tue, Mar 28, 2017 at 1:16 PM Dave Fisher <dave2w...@comcast.net> wrote:
>
> > Hi -
> >
> > I’ve got knowledge too and I also have some ideas I am thinking about. I
> > also have some bandwidth now that I am going into job search mode.
> >
> > I think an important step is to think through what the taxonomy should be
> > as that will help inform the common schema.
> >
> > Regards,
> > Dave
> >
> > > On Mar 28, 2017, at 9:34 AM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
> > >
> > > Just to provide links:
> > > http://jirasearch.mikemccandless.com/search.py?index=jira - Lucene
> > > (not Solr) based search of issues for several projects. Very deep
> > > understanding of the domain. Adding more is probably not that hard.
> > > http://search-lucene.com/ - Solr-based, search over mailing lists,
> > > wikis, issues, etc for a bunch (a larger number) of projects. Run by
> > > Sematext (Otis' company)
> > > http://find.searchhub.org/ - commercial LucidWorks' Fusion-based IIRC
> > > (though some bits are open-source). Lots of projects and sources. But
> > > it feels a bit dogfoody, so the attention it gets is uneven.
> > >
> > > So, I think Nick/Chris' point is valid that the definition of the
> > > project may need to take this into account and it is entirely possible
> > > that expanding these (if the project owners would agree) might be
> > > actually the easiest path forward.
> > >
> > >
> > > Regards,
> > > Alex.
> > > ----
> > > http://www.solr-start.com/ - Resources for Solr users, new and experienced
> > >
> > >
> > > On 28 March 2017 at 12:20, Chris Mattmann <mattm...@apache.org> wrote:
> > >> +1 I think that minimizing the requirement to run specific infrastructure, and trying
> > >> to convince those already running such services I believe like Otis and Grant/others
> > >> from Lucid are optimal choices.
> > >>
> > >> Cheers,
> > >> Chris
> > >>
> > >>
> > >> On 3/28/17, 12:19 PM, "Nick Burch" <n...@apache.org> wrote:
> > >>
> > >> On Tue, 28 Mar 2017, Shane Curcuru wrote:
> > >>> As has been pondered many times (recently by Rich and Sally, among many
> > >>> others), it would be really nice to better help newcomers find the right
> > >>> information at the ASF or our projects. We have one of the industry's
> > >>> leading search tools right here: why aren't we using it, and even
> > >>> better, semi-consistently across apache.org sites that want to?
> > >>
> > >> Some Apache projects do have externally hosted instances of SOLR indexing
> > >> and searching their project sites. Tika and Lucene are two such sites, off
> > >> the top of my head. Would asking the committers maintaining those about
> > >> adding some more sites be an option?
> > >>
> > >> Nick
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: dev-unsubscr...@community.apache.org
> > >> For additional commands, e-mail: dev-h...@community.apache.org
> > >>

--
Piergiorgio Lucidi
Technology Evangelist @ Sourcesense
Author and Technical Reviewer @ Packt Publishing
Mentor / PMC Member / Committer @ Apache Software Foundation
Wiki Gardener / Forum Moderator / Certified Instructor, Engineer and Administrator @ Alfresco
Top Community Contributor @ Crafter
Project Leader / Committer @ JBoss
Technology Advisory Team Member @ Microsoft
http://www.open4dev.com