I'm curious about this too. There's a bunch of difficult-to-maintain code
in our codebase relating to HDFS, and a lot of the HDFS tests are very
flaky. My impression is that it was mostly added at a point when "HDFS
all the things" was a fad. I have never personally seen it in actual use
at a customer. I suspect it's mostly attractive to places that have
already made a big HDFS investment. Not having heard any success stories,
if a customer asked I'd advise them: "Don't do it unless someone is
forcing you onto HDFS. I don't know anyone using it, and the tests fail
frequently, so it may be buggy." I don't think I've ever gone to
http://fucit.org/solr-jenkins-reports/failure-report.html without seeing
that 1/4 to 1/2 of the tests with recent failures have HDFS in the
name...

On Sat, Aug 31, 2024 at 3:04 AM ufuk yılmaz <uyil...@vivaldi.net.invalid>
wrote:

> Hi,
>
> It is possible to put Solr index on  hdfs instead of a regular disk, but I
> wonder if there is a significant upside of that approach?
>
> Is it to take advantage of hdfs’ replication to protect data from disk
> failures?
>
> Is it mostly for the situation “I already have a functioning hdfs cluster,
> lets just reuse that instead of dealing with local disks” or is there still
> an upside of setting up hdfs just to use with Solr?
>
> —ufuk
>
> —
>


-- 
http://www.needhamsoftware.com (work)
https://a.co/d/b2sZLD9 (my fantasy fiction book)
