[ https://issues.apache.org/jira/browse/SOLR-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15577338#comment-15577338 ]
Dennis Gove edited comment on SOLR-9525 at 10/15/16 4:27 AM:
-------------------------------------------------------------
Full implementation and tests for a split operation. Because it's implemented
as an operation, it will work as part of a select(...) stream.
Valid expression forms:
{code}
split(fieldA, on=",") // replaces the value of fieldA with a List<String> of the split values
split(fieldA, on=",", as="fieldB") // splits the value of fieldA into a List<String> and puts it into fieldB
{code}
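To make the two forms concrete, here is a minimal sketch of the intended semantics in plain Java. It is not the code from SOLR-9525.patch: the class name SplitSketch, the use of a Map as a stand-in for a tuple, the literal (non-regex) treatment of the delimiter, and the pass-through of tuples that lack the field are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the implementation in SOLR-9525.patch.
class SplitSketch {

    // Splits tuple[field] on the literal delimiter `on`. The resulting list is
    // stored under `as`, or back under `field` when `as` is null.
    static Map<String, Object> split(Map<String, Object> tuple, String field, String on, String as) {
        Map<String, Object> out = new LinkedHashMap<>(tuple);
        Object value = tuple.get(field);
        if (value == null) {
            return out; // assumption: a missing field leaves the tuple unchanged
        }
        // Pattern.quote treats the delimiter literally; limit -1 keeps trailing empty strings.
        String[] pieces = value.toString().split(Pattern.quote(on), -1);
        List<String> parts = new ArrayList<>(Arrays.asList(pieces));
        out.put(as != null ? as : field, parts);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> tuple = new LinkedHashMap<>();
        tuple.put("id", "1");
        tuple.put("fieldA", "a,b,c");

        // Mirrors split(fieldA, on=","): fieldA itself becomes a list.
        System.out.println(split(tuple, "fieldA", ",", null));
        // Mirrors split(fieldA, on=",", as="fieldB"): fieldA is kept, fieldB holds the list.
        System.out.println(split(tuple, "fieldA", ",", "fieldB"));
    }
}
```

In a streaming expression, the same operation would be passed to select() alongside the source stream, as in the select(jdbc(...), split(...)) shape proposed in the issue description.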
was (Author: dpgove):
Full implementation and tests for a split operation. Because it's implementing
as an operation this will work as part of a select(....) stream.
Valid expression forms:
{code}
split(fieldA, on=",") // replace value of fieldA with List<String> of split values
split(fieldA, on=",", as="fieldB") // splits value of fieldA into List<String> and puts into fieldB
{code}
> split() function for streaming
> ------------------------------
>
> Key: SOLR-9525
> URL: https://issues.apache.org/jira/browse/SOLR-9525
> Project: Solr
> Issue Type: Wish
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Mike Thomsen
> Attachments: SOLR-9525.patch
>
>
> This is the original description I posted on solr-user:
> Read this article and thought it could be interesting as a way to do
> ingestion:
> https://dzone.com/articles/solr-streaming-expressions-for-collection-auto-upd-1
> Example from the article:
> daemon(id="12345",
>   runInterval="60000",
>   update(users,
>     batchSize=10,
>     jdbc(connection="jdbc:mysql://localhost/users?user=root&password=solr",
>       sql="SELECT id, name FROM users", sort="id asc",
>       driver="com.mysql.jdbc.Driver")
>   )
> )
> What's the best way to handle a multivalue field using this API? Is there a
> way to tokenize something returned in a database field?
> Joel Bernstein responded with this:
> Unfortunately there currently isn't a way to split a field. But this would
> be nice functionality to add.
> The approach would be to add a split operation that would be used by the
> select() function. It would look like this:
> select(jdbc(...), split(fieldA, delim=","), ...)
> This would make a good jira issue.
> So the TL;DR version is that I need the ability to specify, in such a
> streaming operation, certain fields to tokenize into multivalue fields. In one
> schema I may have to support, there are probably half a dozen such fields.
> Perhaps I am missing a feature here, but it looks like this new capability
> cannot handle multivalue fields until something like this is in place.