chncaesar opened a new pull request, #9593:
URL: https://github.com/apache/seatunnel/pull/9593

   ### Purpose of this pull request
   This PR addresses 
[issue-9592](https://github.com/apache/seatunnel/issues/9592). It adds two 
options to the TDengine source, sub_table and field_names, and one option to 
the TDengine sink, field_names.
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. New TDengine connector options are added. Users can specify sub_table 
and field_names when retrieving data from TDengine, and field_names when 
inserting into TDengine. All of these options are optional, so existing 
TDengine source and sink pipelines are not affected by the change and still 
work as expected. 
   
   The documentation is updated in this PR.  
   
   ### How to use new options to retrieve data
   The following pipeline reads only 3 sub tables of the super table 
signal_data and ignores the rest. It also selects only the specified fields.
   Please note that the tag columns must always be the last elements of 
field_names; in this example the tag is signal_id. 
   
   ```
   env {
     execution.parallelism = 1
     job.mode = "BATCH"
   }
   
   source {
     TDengine {
       url : "jdbc:TAOS-RS://192.168.1.1:6041/"
       username : "root"
       password : "taosdata"
       database : "signal"
       stable : "signal_data"
       lower_bound : "2024-07-15 00:00:00.000"
       upper_bound : "2025-07-11 00:00:00.000"
       sub_table: "signal_data_342,signal_data_358,signal_data_349"
       field_names = "ts,gateway_id,signal_name,return_value,signal_id"
     }
   }
   
   sink {
     LocalFile {
       path = "/mnt/sdc/data/taos/"
       file_format_type = "parquet"
       compress_codec = "snappy"
     }
   }
   ```
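
To make the effect of the two source options concrete, here is a minimal Python sketch of how sub_table and field_names could narrow the read. It is not the connector's actual code (the connector is written in Java). The helper names, the `ts` timestamp column in the WHERE clause, and the `>=` / `<` bounds are assumptions made for illustration only.

```python
from typing import List, Optional


def split_csv(value: Optional[str]) -> Optional[List[str]]:
    """Split a comma-separated option value; None or blank means 'not set'."""
    if value is None or not value.strip():
        return None
    return [item.strip() for item in value.split(",") if item.strip()]


def select_sub_tables(all_sub_tables: List[str], sub_table: Optional[str]) -> List[str]:
    """Keep only the requested sub tables; keep all of them when the option is unset."""
    wanted = split_csv(sub_table)
    if wanted is None:
        return list(all_sub_tables)
    return [name for name in all_sub_tables if name in wanted]


def check_tags_last(fields: List[str], tag_names: List[str]) -> None:
    """Raise if a non-tag column appears after a tag column in field_names."""
    seen_tag = False
    for field in fields:
        if field in tag_names:
            seen_tag = True
        elif seen_tag:
            raise ValueError(f"tag columns must come last, but {field!r} follows a tag")


def build_query(database: str, table: str, field_names: Optional[str],
                tag_names: List[str], lower_bound: str, upper_bound: str) -> str:
    """Build an illustrative SELECT for one sub table (hypothetical query shape)."""
    fields = split_csv(field_names)
    if fields is None:
        projection = "*"  # option unset: behave as before and read every column
    else:
        check_tags_last(fields, tag_names)
        projection = ",".join(fields)
    # Assumption: the timestamp column is named ts and the upper bound is exclusive.
    return (
        f"SELECT {projection} FROM {database}.{table} "
        f"WHERE ts >= '{lower_bound}' AND ts < '{upper_bound}'"
    )
```

With the values from the pipeline above, `select_sub_tables` would keep only signal_data_342, signal_data_358 and signal_data_349, and `build_query` would reject a field_names value such as "ts,signal_id,gateway_id" because the tag is not last.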
   
   ### How to use new options on the sink side
   ```
   env {
     execution.parallelism = 2
     job.mode = "BATCH"
   }
   
   source {
     LocalFile {
       path = "/work/data/taos_mfrs_signal_data_current/"
       file_format_type = "parquet"
       compress_codec = "snappy"
     }
   }
   
   
   sink {
     TDengine {
       url : "jdbc:TAOS-RS://192.168.1.1:6041/"
       username : "root"
       password : "taosdata"
       database : "signal"
       stable : "signal_data"
       field_names = "ts,gateway_id,signal_name,return_value"
     }
   }
   ```
   When specifying field_names on the sink side, please omit the tag column 
("signal_id" in this example). The TDengine sink automatically adds the tag 
column to the insert statement. Here's an example.
   ```sql
   insert into signal.signal_data_342 
   using signal_data 
   tags ( 1 )
   ( ts,gateway_id,signal_name,return_value) 
   values ("2025-01-01 00:00:00", 1, "name", 0.1)
   ```
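
The statement shape above can be sketched in Python as follows. This is only an illustration of how field_names (without the tag column) and separately supplied tag values fit into an auto-create insert. The function names are hypothetical, the literal rendering is deliberately minimal, and the real sink is Java code that may build its statement differently.

```python
from typing import List, Sequence


def sql_literal(value) -> str:
    """Render a value as a SQL literal; minimal on purpose, with no escaping support."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    if "'" in text or "\\" in text:
        raise ValueError("this sketch does not handle quotes or backslashes")
    return f"'{text}'"


def build_insert(database: str, stable: str, sub_table: str,
                 tag_values: Sequence, field_names: str, row: Sequence) -> str:
    """Build an illustrative insert with an explicit column list."""
    fields: List[str] = [name.strip() for name in field_names.split(",") if name.strip()]
    if len(fields) != len(row):
        raise ValueError("row length must match the number of field_names")
    columns = ",".join(fields)
    tags = ",".join(sql_literal(v) for v in tag_values)
    values = ",".join(sql_literal(v) for v in row)
    # Tags are supplied separately, so field_names lists only non-tag columns.
    return (
        f"INSERT INTO {database}.{sub_table} "
        f"USING {database}.{stable} TAGS ({tags}) "
        f"({columns}) VALUES ({values})"
    )
```

Calling `build_insert("signal", "signal_data", "signal_data_342", [1], "ts,gateway_id,signal_name,return_value", ["2025-01-01 00:00:00", 1, "name", 0.1])` yields a statement with the same structure as the SQL example, with the tag value placed in the TAGS clause rather than in the column list.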
   
   ### How was this patch tested?
   - Added a unit test to cover sub_table and field_names on the source side. 
Please see TDengineSourceReaderTest.java 
   - Added an e2e test to cover sub_table and field_names on the source side 
and field_names on the sink side. Please see TDengine.java 
   
   
   ### Check list
   
   * [x] If any new Jar binary package adding in your PR, please add License 
Notice according
     [New License 
Guide](https://github.com/apache/seatunnel/blob/dev/docs/en/contribution/new-license.md)
   * [x] If necessary, please update the documentation to describe the new 
feature. https://github.com/apache/seatunnel/tree/dev/docs
   * [x] If you are contributing the connector code, please check that the 
following files are updated:
     1. Update 
[plugin-mapping.properties](https://github.com/apache/seatunnel/blob/dev/plugin-mapping.properties)
 and add new connector information in it
     2. Update the pom file of 
[seatunnel-dist](https://github.com/apache/seatunnel/blob/dev/seatunnel-dist/pom.xml)
     3. Add ci label in 
[label-scope-conf](https://github.com/apache/seatunnel/blob/dev/.github/workflows/labeler/label-scope-conf.yml)
     4. Add e2e testcase in 
[seatunnel-e2e](https://github.com/apache/seatunnel/tree/dev/seatunnel-e2e/seatunnel-connector-v2-e2e/)
     5. Update connector 
[plugin_config](https://github.com/apache/seatunnel/blob/dev/config/plugin_config)


