This is an automated email from the ASF dual-hosted git repository.

wanghailin pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/seatunnel.git
The following commit(s) were added to refs/heads/dev by this push:
     new e5da7e840d  [Feature][doc][Connector-V2][Common] Add Common connector documentation (#5453)
e5da7e840d is described below

commit e5da7e840d67f0e845166d2ab8c9c70afe143cdb
Author: ZhilinLi <zhilinli0...@gmail.com>
AuthorDate: Sat Jun 15 10:02:13 2024 +0800

    [Feature][doc][Connector-V2][Common] Add Common connector documentation (#5453)
---
 docs/en/connector-v2/sink/common-options.md   | 21 +++-----
 docs/en/connector-v2/source/common-options.md | 78 +++++++++++++++++++++------
 docs/en/transform-v2/common-options.md        | 66 ++++++++++++++++++-----
 3 files changed, 125 insertions(+), 40 deletions(-)

diff --git a/docs/en/connector-v2/sink/common-options.md b/docs/en/connector-v2/sink/common-options.md
index 2addc49278..bfcdc26a2b 100644
--- a/docs/en/connector-v2/sink/common-options.md
+++ b/docs/en/connector-v2/sink/common-options.md
@@ -2,24 +2,19 @@
 
 > Common parameters of sink connectors
 
-| name              | type   | required | default value |
-|-------------------|--------|----------|---------------|
-| source_table_name | string | no       | -             |
-| parallelism       | int    | no       | -             |
+| Name              | Type   | Required | Default | Description |
+|-------------------|--------|----------|---------|-------------|
+| source_table_name | String | No       | -       | When `source_table_name` is not specified, the current plugin processes the data set `dataset` output by the previous plugin in the configuration file. <br/> When `source_table_name` is specified, the current plugin processes the data set corresponding to this parameter. |
 
-### source_table_name [string]
+## Important note
 
-When `source_table_name` is not specified, the current plug-in processes the data set `dataset` output by the previous plugin in the configuration file;
+When you specify `source_table_name` in a job configuration, you must also set the `result_table_name` parameter on the plugin that produces the data.
 
-When `source_table_name` is specified, the current plug-in is processing the data set corresponding to this parameter.
+## Task Example
 
-### parallelism [int]
+### Simple:
 
-When `parallelism` is not specified, the `parallelism` in env is used by default.
-
-When parallelism is specified, it will override the parallelism in env.
-
-## Examples
+> This example passes one data source through two transforms and delivers two different pipelines to different sinks
 
 ```bash
 source {
diff --git a/docs/en/connector-v2/source/common-options.md b/docs/en/connector-v2/source/common-options.md
index a9e607b28e..079f40663a 100644
--- a/docs/en/connector-v2/source/common-options.md
+++ b/docs/en/connector-v2/source/common-options.md
@@ -2,32 +2,80 @@
 
 > Common parameters of source connectors
 
-| name              | type   | required | default value |
-|-------------------|--------|----------|---------------|
-| result_table_name | string | no       | -             |
-| parallelism       | int    | no       | -             |
+| Name              | Type   | Required | Default | Description |
+|-------------------|--------|----------|---------|-------------|
+| result_table_name | String | No       | -       | When `result_table_name` is not specified, the data processed by this plugin will not be registered as a data set `(dataStream/dataset)` that can be directly accessed by other plugins, or called a temporary table `(table)`. <br/>When `result_table_name` is specified, the data processed by this plugin will be registered as a data set `(dataStream/dataset)` that can be directly accessed by other plugins, or called a temporary table `(table)`. The data set registered here can be directly accessed by other plugins by specifying `source_table_name`. |
+| parallelism       | Int    | No       | -       | When `parallelism` is not specified, the `parallelism` in env is used by default. <br/>When `parallelism` is specified, it will override the `parallelism` in env. |
 
-### result_table_name [string]
+## Important note
 
-When `result_table_name` is not specified, the data processed by this plugin will not be registered as a data set `(dataStream/dataset)` that can be directly accessed by other plugins, or called a temporary table `(table)` ;
+When a job configuration specifies `result_table_name`, you must also set the `source_table_name` parameter on the plugin that consumes the data.
 
-When `result_table_name` is specified, the data processed by this plugin will be registered as a data set `(dataStream/dataset)` that can be directly accessed by other plugins, or called a temporary table `(table)` . The data set `(dataStream/dataset)` registered here can be directly accessed by other plugins by specifying `source_table_name` .
+## Task Example
 
-### parallelism [int]
+### Simple:
 
-When `parallelism` is not specified, the `parallelism` in env is used by default.
-
-When parallelism is specified, it will override the parallelism in env.
-
-## Example
+> This registers a stream or batch data source and returns the registered table name `fake_table`
 
 ```bash
 source {
   FakeSourceStream {
-    result_table_name = "fake"
+    result_table_name = "fake_table"
   }
 }
 ```
 
-> The result of the data source `FakeSourceStream` will be registered as a temporary table named `fake` . This temporary table can be used by any `Transform` or `Sink` plugin by specifying `source_table_name` .
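+
+> For illustration, a minimal sketch (reusing the `Console` sink that appears in the examples below) of how a downstream plugin consumes the registered table; its `source_table_name` must match the `result_table_name` set above:
+
+```bash
+sink {
+  # Reads the data set registered by the source above
+  Console {
+    source_table_name = "fake_table"
+  }
+}
+```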
+### Multiple Pipeline Simple
+
+> This example converts the data source `fake` and writes it to two different sinks
+
+```bash
+env {
+  job.mode = "BATCH"
+}
+
+source {
+  FakeSource {
+    result_table_name = "fake"
+    row.num = 100
+    schema = {
+      fields {
+        id = "int"
+        name = "string"
+        age = "int"
+        c_timestamp = "timestamp"
+        c_date = "date"
+        c_map = "map<string, string>"
+        c_array = "array<int>"
+        c_decimal = "decimal(30, 8)"
+        c_row = {
+          c_row = {
+            c_int = int
+          }
+        }
+      }
+    }
+  }
+}
+
+transform {
+  Sql {
+    source_table_name = "fake"
+    result_table_name = "fake1"
+    # The table name in the query must be the same as the value of 'source_table_name'
+    query = "select id, regexp_replace(name, '.+', 'b') as name, age+1 as age, pi() as pi, c_timestamp, c_date, c_map, c_array, c_decimal, c_row from fake"
+  }
+  # The SQL transform supports basic functions and filter operations,
+  # but complex SQL is not supported yet, including multi-table JOINs, aggregations, and the like
+}
+
+sink {
+  Console {
+    source_table_name = "fake1"
+  }
+  Console {
+    source_table_name = "fake"
+  }
+}
+```
diff --git a/docs/en/transform-v2/common-options.md b/docs/en/transform-v2/common-options.md
index c45b4ba167..ce88ce8528 100644
--- a/docs/en/transform-v2/common-options.md
+++ b/docs/en/transform-v2/common-options.md
@@ -1,23 +1,65 @@
 # Transform Common Options
 
-> Common parameters of source connectors
+> Transforms are the intermediate conversion step between the source and the sink; you can use SQL statements to complete the conversion process smoothly
 
-| name              | type   | required | default value |
-|-------------------|--------|----------|---------------|
-| result_table_name | string | no       | -             |
-| source_table_name | string | no       | -             |
+| Name              | Type   | Required | Default | Description |
+|-------------------|--------|----------|---------|-------------|
+| result_table_name | String | No       | -       | When `result_table_name` is not specified, the data processed by this plugin will not be registered as a data set that can be directly accessed by other plugins, or called a temporary table `(table)`; <br/>When `result_table_name` is specified, the data processed by this plugin will be registered as a data set `(dataset)` that can be directly accessed by other plugins, or called a temporary table `(table)`. The data set registered here can be directly accessed by other plugins by specifying `source_table_name`. |
+| source_table_name | String | No       | -       | When `source_table_name` is not specified, the current plugin processes the data set `(dataset)` output by the previous plugin in the configuration file; <br/>When `source_table_name` is specified, the current plugin processes the data set corresponding to this parameter. |
 
-### source_table_name [string]
+## Task Example
 
-When `source_table_name` is not specified, the current plug-in processes the data set `(dataset)` output by the previous plug-in in the configuration file;
+### Simple:
 
-When `source_table_name` is specified, the current plugin is processing the data set corresponding to this parameter.
+> This example converts the data source `fake` and writes it to two different sinks; see the `transform` documentation for details
 
-### result_table_name [string]
+```bash
+env {
+  job.mode = "BATCH"
+}
 
-When `result_table_name` is not specified, the data processed by this plugin will not be registered as a data set that can be directly accessed by other plugins, or called a temporary table `(table)`;
+source {
+  FakeSource {
+    result_table_name = "fake"
+    row.num = 100
+    schema = {
+      fields {
+        id = "int"
+        name = "string"
+        age = "int"
+        c_timestamp = "timestamp"
+        c_date = "date"
+        c_map = "map<string, string>"
+        c_array = "array<int>"
+        c_decimal = "decimal(30, 8)"
+        c_row = {
+          c_row = {
+            c_int = int
+          }
+        }
+      }
+    }
+  }
+}
 
-When `result_table_name` is specified, the data processed by this plugin will be registered as a data set `(dataset)` that can be directly accessed by other plugins, or called a temporary table `(table)` . The dataset registered here can be directly accessed by other plugins by specifying `source_table_name` .
+transform {
+  Sql {
+    source_table_name = "fake"
+    result_table_name = "fake1"
+    # The table name in the query must be the same as the value of 'source_table_name'
+    query = "select id, regexp_replace(name, '.+', 'b') as name, age+1 as age, pi() as pi, c_timestamp, c_date, c_map, c_array, c_decimal, c_row from fake"
+  }
+  # The SQL transform supports basic functions and filter operations,
+  # but complex SQL is not supported yet, including multi-table JOINs, aggregations, and the like
+}
 
-## Examples
+sink {
+  Console {
+    source_table_name = "fake1"
+  }
+  Console {
+    source_table_name = "fake"
+  }
+}
+```
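+
+> As a minimal sketch of the default wiring described in the table above (illustrative only; it assumes the `Filter` transform is available): when neither `result_table_name` nor `source_table_name` is set, each plugin simply consumes the output of the plugin declared before it:
+
+```bash
+env {
+  job.mode = "BATCH"
+}
+
+source {
+  # No result_table_name: the output is not registered as a named
+  # temporary table; it flows to the next plugin directly
+  FakeSource {
+    row.num = 10
+    schema = {
+      fields {
+        id = "int"
+        name = "string"
+      }
+    }
+  }
+}
+
+transform {
+  # No source_table_name / result_table_name: reads the previous
+  # plugin's output and passes its result straight on to the sink
+  Filter {
+    fields = [id, name]
+  }
+}
+
+sink {
+  Console {
+  }
+}
+```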