epugh commented on code in PR #2479:
URL: https://github.com/apache/solr/pull/2479#discussion_r1774352198
##########
solr/solr-ref-guide/modules/query-guide/pages/stream-tool.adoc:
##########
@@ -0,0 +1,167 @@
+= Stream Tool
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+The Stream tool allows you to run a xref:streaming-expressions.adoc[] and see the results from the command line.
+It is very similar to the xref:stream-screen.adoc[], but is part of the `bin/solr` CLI.
+
+To run it, open a window and enter:
+
+[,console]
+----
+$ bin/solr stream --header -c techproducts --delimiter=\| 'search(techproducts,q="name:memory",fl="name,price")'
+----
+
+This will run the provided streaming expression on the `techproducts` collection and produce:
+
+[,console]
+----
+name|price
+CORSAIR XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) Dual Channel Kit System Memory - Retail|185.0
+CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail|74.99
+A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM|
+----
+
+TIP: Notice how we used the pipe character (|) as the delimiter? It required a backslash for escaping it so it wouldn't be treated as a pipe with in the shell script.
+
+You can also specify a file with the suffix `.expr` containing your streaming expression.
+This is useful for longer expressions or if you having command line parsing issues with your expression.
+
+Assuming you have create the file `stream.expr` with the contents:
+
+----
+# Stream a search
+
+search(
+  techproducts,
+  q="name:memory",
+  fl="name,price",
+  sort="price desc"
+)
+----
+
+Then you can run it on the Solr collection `techproducts`, specifying you want a header row:
+
+[,console]
+----
+$ bin/solr stream --header -c techproducts stream.expr
+----
+
+And this will produce:
+
+[,console]
+----
+name price
+CORSAIR XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) Dual Channel Kit System Memory - Retail 185.0
+CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail 74.99
+A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM
+----
+
+The `--help` (or simply `-h`) option will output information on its usage (i.e., `bin/solr stream --help)`.
+
+== Using the bin/solr stream Tool
+
+To use the tool you need to provide the streaming expression either inline as the last argument, or provide a file ending in `.expr` that contains the expression.
+
+The basic usage of `bin/solr stream` is:
+
+[source,plain]
+----
+usage: bin/solr stream [--array-delimiter <CHARACTER>] [-c <NAME>] [--delimiter <CHARACTER>] [-f <FIELDS>] [--header]
+                       [-u <credentials>] [-url <HOST>] [--workers <arg>] [-z <HOST>]
+
+List of options:
+    --array-delimiter <CHARACTER>   The delimiter multi-valued fields. (default=|)
+ -c,--name <NAME>                   Name of the collection to execute on if workers are 'solr'. Required for 'solr'
+                                    worker.
+    --delimiter <CHARACTER>         The output delimiter. (default=tab).
+ -f,--fields <FIELDS>               The fields in the tuples to output. (defaults to fields in the first tuple of result
+                                    set).
+    --header                        Whether or not to include a header line. (default=false)
+ -u,--credentials <credentials>     Credentials in the format username:password. Example: --credentials solr:SolrRocks
+ -url,--solr-url <HOST>             Base Solr URL, which can be used to determine the zk-host if that's not known;
+                                    defaults to: http://localhost:8983.
+ -e,--execution <CONTEXT>           Execution context is either 'local' or 'solr'. Default is 'solr'
+ -z,--zk-host <HOST>                Zookeeper connection string; unnecessary if ZK_HOST is defined in solr.in.sh;
+                                    otherwise, defaults to localhost:9983.
+----
+
+== Examples Using bin/solr stream
+
+There are several ways to use `bin/solr stream`.
+This section presents several examples.
+
+=== Executing Expression Locally
+
+Streaming Expressions by default are executed in the Solr cluster.
+However there are use cases where you want to interact with data in your local environment, or even run a streaming expression independent of Solr.
+
+The Stream Tool allows you to specify `--execution local` to process the expression in the Solr CLI's JVM.
+
+Assuming you have create the file `load_data.expr` with the contents:
+
+----
+# Index CSV File
+
+update(
+  gettingstarted,
+  parseCSV(
+    cat(./example/exampledocs/books.csv, maxLines=2)
+  )
+)
+----
+
+Running this expression will read in the local file and send the first two lines to the collection `gettingstarted`.

Review Comment:
   okay, I took a stab at it... If I didn't quite nail it, please edit it, but maybe with the suggestion thing?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org