Re: About Using Hadoop in SolrCloud

David Smiley Thu, 23 Feb 2023 11:41:18 -0800

I agree with Eric, but wish to add one point: separating compute from
storage can give you better redundancy (HDFS or S3 will do it better, maybe
cheaper), better elasticity (since Solr nodes become stateless, it is easy to
add more nodes), and possibly better cost.  In exchange you sacrifice indexing
performance and a bit of query performance.  Admittedly I don't have real
experience here, but this is my thinking.  The most annoying thing about
Solr's HDFS support is that SolrCloud's replication is quite
redundant/wasteful on top of the replication already done at the storage
layer, thus adding cost inefficiency.  There is potential for improvements
there.
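
To make the redundancy point concrete, here is a rough back-of-the-envelope
sketch. It assumes each SolrCloud replica writes its own full copy of the
index to HDFS and that HDFS then replicates every block (3 is the stock
dfs.replication default). The 100 GiB index size, the replica counts, and the
function name are hypothetical, chosen only for illustration.

```python
GIB = 1024 ** 3


def raw_storage_bytes(index_bytes, solr_replicas, hdfs_replication):
    """Raw bytes on disk when replication is stacked at both layers.

    Assumption: every Solr replica keeps its own index copy in HDFS,
    and HDFS replicates each of those copies independently.
    """
    return index_bytes * solr_replicas * hdfs_replication


index_size = 100 * GIB  # hypothetical logical index size

# Two SolrCloud replicas on top of HDFS replication factor 3.
stacked = raw_storage_bytes(index_size, solr_replicas=2, hdfs_replication=3)

# A single Solr copy, relying on the storage layer alone for redundancy.
storage_only = raw_storage_bytes(index_size, solr_replicas=1, hdfs_replication=3)

print(stacked // GIB, "GiB raw with stacked replication")
print(storage_only // GIB, "GiB raw with storage-layer replication only")
```

Under these assumptions the stacked setup stores six physical copies of the
same data, which is the cost inefficiency described above.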


~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Feb 23, 2023 at 7:45 AM Eric Pugh <ep...@opensourceconnections.com>
wrote:

> I am replying, but just to the users mailing list, as it’s not appropriate
> for dev@.
>
> I think the short answer is that if you are already super into the Hadoop
> ecosystem, then you already have strong reasons why, and you can answer all
> of your questions listed already ;-).  You then look at Solr on Hadoop as
> “hey, it works with what I am already doing” at my enterprise.
>
> If you aren’t already in the Hadoop ecosystem, then there isn’t any
> special Solr specific reason to go this way, and indeed many reasons NOT
> to.   Hadoop isn’t for the faint of heart….
>
> Not an answer per se….
>
> > On Feb 23, 2023, at 5:57 AM, Zara Parst <edotserv...@gmail.com> wrote:
> >
> > Hi,
> >
> > I have read in many places about using Hadoop with SolrCloud. I am trying
> > to find the reason to use Hadoop in place of a local file system. Can
> > someone briefly explain why to use Hadoop with SolrCloud when Solr is just
> > using Hadoop for indexing and storing logs? Is there any compelling
> > reason to do that?
> >
> > Does Hadoop have any advantage over the local file system with Solr, since
> > I can run in cloud mode storing the index in the local file system and can
> > still use shards and replicas?  So my question is what advantage Hadoop
> > will give me: does Hadoop index faster, does Hadoop take less space to
> > store the index, is the distributed file system in Hadoop better at things
> > like sharding, replication etc.? Or does it take backups automatically?
> >
> > Please do answer this question as much as possible,
>
> _______________________
> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 |
> http://www.opensourceconnections.com <
> http://www.opensourceconnections.com/> | My Free/Busy <
> http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <
> https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless of
> whether attachments are marked as such.
>
>
