[ https://issues.apache.org/jira/browse/SOLR-15381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mayya Sharipova updated SOLR-15381:
-----------------------------------
    Security:     (was: Public)

> SimplePostTool.java PageFetcher error
> -------------------------------------
>
>                 Key: SOLR-15381
>                 URL: https://issues.apache.org/jira/browse/SOLR-15381
>             Project: Solr
>          Issue Type: Bug
>          Components: SimplePostTool
>            Reporter: QualiteSys QualiteSys
>            Priority: Major
>
> The SimplePostTool fails to fetch web pages, even in simple cases.
> The getLinksFromWebPage process fails to detect URLs within the HTML page at
> line 1252. This seems to be a problem when the HTML page is not well-formed
> from the XML point of view.
>
> Example to reproduce the problem:
> java -Dc=techproducts -Ddata=web -Drecursive=3 -jar example\exampledocs\post.jar [http://www.google.com/]
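
To illustrate the failure mode the report describes, here is a minimal sketch. It is not the SimplePostTool code and not a proposed patch. It only shows, under stated assumptions, that a strict XML parse rejects typical real-world HTML while a lenient scan can still recover href values. The class name LenientLinkScan, the regular expression, and the sample markup are all hypothetical and chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only; not part of Solr.
public class LenientLinkScan {

    // Matches href values in <a> tags: double-quoted, single-quoted, or unquoted.
    private static final Pattern HREF = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
        Pattern.CASE_INSENSITIVE);

    // Returns true only if the markup parses as well-formed XML.
    static boolean isWellFormedXml(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(
                markup.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    // Collects href values without requiring the page to be well-formed.
    static List<String> extractHrefs(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            // Exactly one of the three capture groups is non-null per match.
            for (int g = 1; g <= 3; g++) {
                if (m.group(g) != null) {
                    links.add(m.group(g));
                    break;
                }
            }
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample page with an unclosed <p>, a bare <br>, and an unquoted href.
        String html = "<html><body><p>Unclosed paragraph<br>"
            + "<a href=\"http://example.com/a\">A</a>"
            + "<a href=/b>B</a></body></html>";
        System.out.println("well-formed XML: " + isWellFormedXml(html));
        System.out.println("links: " + extractHrefs(html));
    }
}
```

A regex scan like this is deliberately crude: it does not resolve relative links and does not skip hrefs inside comments or scripts. A tolerant HTML parser would be the more robust choice in practice.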