[ 
https://issues.apache.org/jira/browse/SOLR-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-10017:
----------------------------------
    Description: The crawl Streaming Expression will wrap a stream that emits 
root URLs to crawl. It will then crawl those URLs using a library such as 
Crawler4j and emit tuples that can be indexed into a SolrCloud collection 
using the update function. Solr's classifier can be used to curate content as 
it is being crawled, or to classify sites based on the content they contain. 
The links between pages and sites can be indexed as graphs and then explored 
and visualized with graph expressions.  (was: The crawl Streaming Expression 
will wrap a stream that emits root URL's to crawl. It will then crawl the URL's 
using a library such as Crawl4j. It will emit tuples that can be indexed into a 
Solr Cloud collection using the update function. Solr's classifier can be used 
to curate content as it's being crawled or classify sites based on the content 
which it contains. The links between pages and sites can be indexed as graphs 
and then explored and visualized with graph expressions.)

> Add the crawl Streaming Expression
> ----------------------------------
>
>                 Key: SOLR-10017
>                 URL: https://issues.apache.org/jira/browse/SOLR-10017
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Joel Bernstein
>
> The crawl Streaming Expression will wrap a stream that emits root URLs to 
> crawl. It will then crawl those URLs using a library such as Crawler4j and 
> emit tuples that can be indexed into a SolrCloud collection using the 
> update function. Solr's classifier can be used to curate content as it is 
> being crawled, or to classify sites based on the content they contain. The 
> links between pages and sites can be indexed as graphs and then explored and 
> visualized with graph expressions.
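
The wrap-and-emit flow described in the issue can be illustrated with a minimal sketch. It is not Solr's or Crawler4j's API and does not reflect how the feature would actually be implemented. Every name here is a hypothetical chosen for illustration: the `crawl` generator, the `fetch` callable, the `max_pages` limit, and the tuple fields `url`, `id`, `content`, and `links`. Python dicts stand in for tuples, and fetching is injected so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(root_stream, fetch, max_pages=100):
    """Wrap a stream of root-URL tuples and emit one tuple per fetched page.

    root_stream: iterable of dicts with a "url" field (assumed tuple shape).
    fetch: callable taking a URL and returning HTML text, or None on failure
           (a stand-in for a real crawler library).
    max_pages: upper bound on the number of URLs visited.
    """
    seen = set()
    queue = deque(t["url"] for t in root_stream)
    while queue and len(seen) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        html = fetch(url)
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        # Resolve relative hrefs against the page URL so links are absolute.
        links = [urljoin(url, href) for href in parser.links]
        # The emitted tuple carries both the content (for indexing or
        # classification) and the outgoing links (for graph indexing).
        yield {"id": url, "content": html, "links": links}
        queue.extend(links)
```

In this sketch a downstream consumer would iterate over `crawl(...)` and send each emitted dict to an indexer, much as the issue describes the update function consuming the tuples. The `links` field is what would let page-to-page edges be indexed as a graph.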


