[jira] [Commented] (FLINK-3640) Add support for SQL queries in DataSet programs

ASF GitHub Bot (JIRA) Mon, 11 Apr 2016 08:29:39 -0700

    [ https://issues.apache.org/jira/browse/FLINK-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235289#comment-15235289 ]


ASF GitHub Bot commented on FLINK-3640:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1867#discussion_r59224953
  
    --- Diff: docs/apis/batch/libs/table.md ---
    @@ -408,3 +428,132 @@ Here, `literal` is a valid Java literal and `field reference` specifies a column
     column names follow Java identifier syntax.
     
     Only the types `LONG` and `STRING` can be casted to `DATE` and vice versa. A `LONG` casted to `DATE` must be a milliseconds timestamp. A `STRING` casted to `DATE` must have the format "`yyyy-MM-dd HH:mm:ss.SSS`", "`yyyy-MM-dd`", "`HH:mm:ss`", or a milliseconds timestamp. By default, all timestamps refer to the UTC timezone beginning from January 1, 1970, 00:00:00 in milliseconds.
    +
    +{% top %}
    +
    +SQL
    +----
    +The Table API also supports embedded SQL queries.
    +In order to use a `Table` or `DataSet` in a SQL query, it has to be registered in the `TableEnvironment`, using a unique name.
    +A registered `Table` can be retrieved back from the `TableEnvironment` using the `scan` method:
    +
    +<div class="codetabs" markdown="1">
    +<div data-lang="java" markdown="1">
    +{% highlight java %}
    +ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    +// create a Table environment
    +TableEnvironment tableEnv = new TableEnvironment();
    +// reset the translation context: this will erase existing registered Tables
    +TranslationContext.reset();
    +// read a DataSet from an external source
    +DataSet<Tuple2<Integer, Long>> ds = env.readTextFile(...);
    +// register the DataSet under the name "MyTable"
    +tableEnv.registerDataSet("MyTable", ds);
    +// retrieve "MyTable" into a new Table
    +Table t = tableEnv.scan("MyTable");
    +{% endhighlight %}
    +</div>
    +
    +<div data-lang="scala" markdown="1">
    +{% highlight scala %}
    +val env = ExecutionEnvironment.getExecutionEnvironment
    +// create a Table environment
    +val tEnv = new TableEnvironment
    +// reset the translation context: this will erase existing registered Tables
    +TranslationContext.reset()
    +// read a DataSet from an external source
    +val ds = env.readTextFile(...)
    +// register the DataSet under the name "MyTable"
    +tEnv.registerDataSet("MyTable", ds)
    +// retrieve "MyTable" into a new Table
    +val t = tEnv.scan("MyTable")
    +{% endhighlight %}
    +</div>
    +</div>
    +
    +*Note: Table names are not allowed to follow the `^_DataSetTable_[0-9]+` pattern, as this is reserved for internal use only.*
    +
    +When registering a `DataSet`, one can also give names to the `Table` columns. For example, if "MyTable" has three columns, `user`, `product`, and `order`, we can give them names upon registering the `DataSet` as shown below:
    +
    +<div class="codetabs" markdown="1">
    +<div data-lang="java" markdown="1">
    +{% highlight java %}
    +// register the DataSet under the name "MyTable" with columns user, product, and order
    +tableEnv.registerDataSet("MyTable", ds, "user, product, order");
    +{% endhighlight %}
    +</div>
    +
    +<div data-lang="scala" markdown="1">
    +{% highlight scala %}
    +// register the DataSet under the name "MyTable" with columns user, product, and order
    +tEnv.registerDataSet("MyTable", ds, 'user, 'product, 'order)
    +{% endhighlight %}
    +</div>
    +</div>
    +
    +A `Table` can be registered in a similar way:
    +
    +<div class="codetabs" markdown="1">
    +<div data-lang="java" markdown="1">
    +{% highlight java %}
    +// read a DataSet from an external source
    +DataSet<Tuple2<Integer, Long>> ds = env.readTextFile(...);
    +// create a Table from the DataSet with columns user, product, and order
    +Table t = tableEnv.fromDataSet(ds).as("user, product, order");
    +// register the Table under the name "MyTable"
    +tableEnv.registerTable("MyTable", t);
    +{% endhighlight %}
    +</div>
    +
    +<div data-lang="scala" markdown="1">
    +{% highlight scala %}
    +// read a DataSet from an external source and
    +// create a Table from the DataSet with columns user, product, and order
    +val t = env.readTextFile(...).as('user, 'product, 'order)
    +// register the Table under the name "MyTable"
    +tEnv.registerTable("MyTable", t)
    +{% endhighlight %}
    +</div>
    +</div>
    +
    +After registering a `Table` or `DataSet`, one can use them in SQL queries. A SQL query is executed using the `sql` method of the `TableEnvironment`.
    --- End diff --
    
    "A SQL query is executed" -> "A SQL query is defined"
    
    The `sql` method only defines the query. Execution happens later, when the program is executed (`ExecutionEnvironment.execute()`, or `print()`/`collect()` on a `DataSet`).
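
    The define-versus-execute distinction can be illustrated with a small plain-Java sketch that does not use Flink at all. `LazyQueryDemo`, `LazyTable`, and `define` are hypothetical names made up for this illustration and are not part of the Table API; the sketch only assumes that defining a query builds a plan and that a separate call triggers the work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

public class LazyQueryDemo {

    /** Toy stand-in for a query plan: it holds the work to do but does not run it. */
    static final class LazyTable {
        private final Supplier<List<String>> plan;

        LazyTable(Supplier<List<String>> plan) {
            this.plan = plan;
        }

        /** Runs the plan; loosely analogous to triggering program execution. */
        List<String> collect() {
            return plan.get();
        }
    }

    /** Counts how many times a query body has actually run. */
    static int executions = 0;

    /** Defines a filter query over the rows; nothing is evaluated here. */
    static LazyTable define(List<String> rows, String prefix) {
        return new LazyTable(() -> {
            executions++;
            List<String> out = new ArrayList<>();
            for (String row : rows) {
                if (row.startsWith(prefix)) {
                    out.add(row);
                }
            }
            return out;
        });
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("alice,book", "bob,pen", "alice,lamp");

        // Defining the query does not run it.
        LazyTable result = define(rows, "alice");
        System.out.println("executions after define: " + executions);   // 0

        // The work happens only when the result is requested.
        List<String> collected = result.collect();
        System.out.println("executions after collect: " + executions);  // 1
        System.out.println(collected);  // [alice,book, alice,lamp]
    }
}
```

    In this toy model, `define` plays the role the comment assigns to `sql` (building a query definition), and `collect` plays the role of the later call that executes the program.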


> Add support for SQL queries in DataSet programs
> -----------------------------------------------
>
>                 Key: FLINK-3640
>                 URL: https://issues.apache.org/jira/browse/FLINK-3640
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API
>    Affects Versions: 1.1.0
>            Reporter: Vasia Kalavri
>            Assignee: Vasia Kalavri
>
> This issue covers the task of supporting SQL queries embedded in DataSet 
> programs. In this mode, the input and output of a SQL query is a Table. For 
> this issue, we need to make the following additions to the Table API:
> - add a {{tEnv.sql(query: String): Table}} method for converting a query 
> result into a Table
> - integrate Calcite's SQL parser into the batch Table API translation process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
