[ https://issues.apache.org/jira/browse/SOLR-7393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Pugh resolved SOLR-7393.
-----------------------------
    Resolution: Won't Fix

HDFS has been removed in Solr 10.

> HDFS poor indexing performance
> ------------------------------
>
>                 Key: SOLR-7393
>                 URL: https://issues.apache.org/jira/browse/SOLR-7393
>             Project: Solr
>          Issue Type: Bug
>          Components: Hadoop Integration, hdfs, SolrCloud
>    Affects Versions: 4.7.2, 4.10.3
>         Environment: HDP 2.2 / HDP Search + LucidWorks Hive SerDe
>            Reporter: Hari Sekhon
>            Priority: Critical
>
> When switching SolrCloud from a local dataDir to the HDFS directory factory,
> indexing performance falls through the floor.
>
> I've also observed very high latency on both QTime and the code timer for
> HDFS writes compared with local dataDir writes (using check_solr_write.pl from
> https://github.com/harisekhon/nagios-plugins). Single test document write
> latency jumps from a few dozen milliseconds to 700-1700 milliseconds, and
> over 2000 on some runs.
>
> A previous bulk online indexing job from Hive to SolrCloud that took 2 hours
> for 620M rows ended up taking a projected 20+ hours and never completed,
> usually breaking around the 16-17 hour mark when left overnight.
>
> It's worth noting that I had to disable the HDFS write cache, which was
> causing index corruption (SOLR-7255), on the advice of Mark Miller, who tells
> me this doesn't make much performance difference anyway.
>
> This is probably also related to SolrCloud not respecting the HDFS
> replication factor, effectively making 4 copies of the data instead of 2
> (SOLR-6528), but that alone doesn't account for the massive performance drop
> going from vanilla SolrCloud to SolrCloud on HDFS HA + Kerberos.
>
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon