sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
URL: https://github.com/apache/flink/pull/9161#discussion_r305026826
 
 

 ##########
 File path: docs/dev/table/types.md
 ##########
 @@ -0,0 +1,1201 @@
+---
+title: "Data Types"
+nav-parent_id: tableapi
+nav-pos: 1
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Due to historical reasons, the data types of Flink's Table & SQL API were closely coupled to Flink's
+`TypeInformation` before Flink 1.9. `TypeInformation` is used in the DataSet and DataStream APIs and is
+sufficient to describe all information needed to serialize and deserialize JVM-based objects in a
+distributed setting.
+
+However, `TypeInformation` was not designed to properly represent logical types independent of an
+actual JVM class. In the past, it was difficult to properly map SQL standard types to this abstraction.
+Furthermore, some types were not SQL-compliant and were introduced without a bigger picture in mind.
+
+Starting with Flink 1.9, the Table & SQL API will receive a new type system that serves as a long-term
+solution for API stability and standard compliance.
+
+Reworking the type system is a major effort that touches almost all user-facing interfaces. Therefore, its
+introduction spans multiple releases, and the community aims to finish this effort by Flink 1.10.
+
+Due to the simultaneous addition of a new planner for table programs (see [FLINK-11439](https://issues.apache.org/jira/browse/FLINK-11439)),
+not every combination of planner and data type is supported. Furthermore, planners might not support every
+data type with the desired precision or parameter.
+
+<span class="label label-danger">Attention</span> Please see the planner compatibility table and limitations
+section before using a data type.
+
+* This will be replaced by the TOC
+{:toc}
+
+Data Type
+---------
+
+A *data type* describes the logical type of a value in the table ecosystem. It can be used to declare input and/or
+output types of operations.
+
+Flink's data types are similar to the SQL standard's *data type* terminology but also contain information
+about the nullability of a value for efficient handling of scalar expressions.
+
+Examples of data types are:
+- `INT`
+- `INT NOT NULL`
+- `INTERVAL DAY TO SECOND(3)`
+- `ROW<myField ARRAY<BOOLEAN>, myOtherField TIMESTAMP(3)>`
+
+A list of all pre-defined data types can be found [below](#list-of-data-types).
+
+### Data Types in the Table API
+
+Users of the JVM-based API deal with instances of `org.apache.flink.table.types.DataType` within the Table API or when
+defining connectors, catalogs, or user-defined functions.
+
+A `DataType` instance has two responsibilities:
+- **Declaration of a logical type** which does not imply a concrete physical representation for transmission
+or storage but defines the boundaries between JVM-based languages and the table ecosystem.
+- *Optional:* **Giving hints about the physical representation of data to the planner** which is useful at
+the edges to other APIs.
+
+For JVM-based languages, all pre-defined data types are available in `org.apache.flink.table.api.DataTypes`.
+
+It is recommended to add a star import to your table programs for a fluent API:
+
+<div class="codetabs" markdown="1">
+
+<div data-lang="Java" markdown="1">
+{% highlight java %}
+import static org.apache.flink.table.api.DataTypes.*;
+
+DataType t = INTERVAL(DAY(), SECOND(3));
+{% endhighlight %}
+</div>
+
+<div data-lang="Scala" markdown="1">
+{% highlight scala %}
+import org.apache.flink.table.api.DataTypes._
+
+val t: DataType = INTERVAL(DAY(), SECOND(3))
+{% endhighlight %}
+</div>
+
+</div>
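
For reference, the example data types listed earlier on this page (`INT`, `INT NOT NULL`, `INTERVAL DAY TO SECOND(3)`, and the `ROW` type) can be declared with the same factory methods. This is a sketch assuming the `ROW`, `FIELD`, `ARRAY`, and `BOOLEAN` builders of `org.apache.flink.table.api.DataTypes`, analogous to the `INTERVAL` call above:

```java
import static org.apache.flink.table.api.DataTypes.*;

import org.apache.flink.table.types.DataType;

// INT (nullable by default)
DataType a = INT();

// INT NOT NULL
DataType b = INT().notNull();

// INTERVAL DAY TO SECOND(3)
DataType c = INTERVAL(DAY(), SECOND(3));

// ROW<myField ARRAY<BOOLEAN>, myOtherField TIMESTAMP(3)>
DataType d = ROW(
    FIELD("myField", ARRAY(BOOLEAN())),
    FIELD("myOtherField", TIMESTAMP(3)));
```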
+
+#### Physical Hints
+
+Physical hints are required at the edges of the table ecosystem. Hints indicate the data format that an
+implementation expects.
+
+For example, a data source could express that it produces values for logical `TIMESTAMP`s using a `java.sql.Timestamp` class
+instead of using `java.time.LocalDateTime` which would be the default. With this information, the runtime is able to convert
+the produced class into its internal data format. In return, a data sink can declare the data format it consumes from the runtime.
+
+Here are some examples of how to declare a bridging conversion class:
+
+<div class="codetabs" markdown="1">
+
+<div data-lang="Java" markdown="1">
+{% highlight java %}
+// tell the runtime to not produce or consume java.time.LocalDateTime instances
+// but java.sql.Timestamp
+DataType t = DataTypes.TIMESTAMP(3).bridgedTo(java.sql.Timestamp.class);
+
+// tell the runtime to not produce or consume boxed integer arrays
+// but primitive int arrays
+DataType t = DataTypes.ARRAY(DataTypes.INT().notNull()).bridgedTo(int[].class);
+{% endhighlight %}
+</div>
+
+<div data-lang="Scala" markdown="1">
+{% highlight scala %}
+// tell the runtime to not produce or consume java.time.LocalDateTime instances
+// but java.sql.Timestamp
+val t: DataType = DataTypes.TIMESTAMP(3).bridgedTo(classOf[java.sql.Timestamp])
+
+// tell the runtime to not produce or consume boxed integer arrays
+// but primitive int arrays
+val t: DataType = DataTypes.ARRAY(DataTypes.INT().notNull()).bridgedTo(classOf[Array[Int]])
+{% endhighlight %}
+</div>
+
+</div>
+
+<span class="label label-danger">Attention</span> Please note that physical hints are usually only required if the
+API is extended. Users of predefined sources/sinks/functions do not need to define such hints. Hints within
+a table program (e.g. `field.cast(TIMESTAMP(3).bridgedTo(Timestamp.class))`) are ignored.
+
+Planner Compatibility
+---------------------
+
+As mentioned in the introduction, reworking the type system will span multiple releases, and the support of
+each data type depends on the planner being used. This section aims to summarize the most significant differences.
+
+### Old Planner
+
+Flink's old planner that was introduced before Flink 1.9 primarily supports type information. It has only limited
+support of data types. It is possible to declare data types that can be translated into type information such that the
 
 Review comment:
   ```suggestion
   support for data types. It is possible to declare data types that can be translated into type information such that the
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 