Hi Hong, Thanks for the comments and suggestions, really appreciate it!
> Clarify Cloudwatch API limitations and handling in the sink. [1] Great to see
> we're being explicit with the Java API model! Let's be explicit about the
> PutMetricData API constraints and how we handle them (it would be good to
> have a separate section):
>   1. Request size and count. Maximum size per request is 1MB, with limit
>      of 1000 metrics. We need to enforce this during batching to prevent users
>      from shooting themselves in the foot!
>   2. For Values/Counts, limit is 150 unique metrics within a single request.
>   3. Data type limitations. CloudWatch API uses Java double, but it
>      doesn't support Double.NaN. We need to be explicit to handle improperly
>      formatted data. We can consider failing fast/slow as you have suggested.
>      Consider using "StrictEntityValidation" in the failure handling. [1]
>      (For the design, we can simply mention, but work out the details when
>      we implement)
>   4. Timestamp limitations. Cloudwatch also has limitations around
>      accepted timestamps (as you have noted). Metric data can be 48h in the
>      past or 2h in the future. Let's clarify how we handle invalid values.
>   5. Data ordering. CW API doesn't seem to specify limitations around
>      out-of-order / repeat data. That's great, and it would be good to be
>      explicit about and validate this behavior.

This is very detailed, thank you. I have updated the FLIP outlining these limitations. In summary, here's how they translate to limitations in the AsyncSink configuration:

* Maximum size per CW PutMetricDataRequest is 1 MB → maxBatchSizeInBytes cannot be more than 1 MB
* Maximum number of MetricDatum per CW PutMetricDataRequest is 1000 → maxBatchSize cannot be more than 1000
* Maximum 150 unique values in MetricDatum.Values → maxRecordSizeInBytes cannot be more than 150 bytes (assuming each value is 1 byte)
* CloudWatch API uses Java double but doesn't support Double.NaN → use StrictEntityValidation
* MetricDatum timestamp limitations (up to 2 weeks in the past and up to 2 hours into the future) → validate against this, with a user choice of error-handling behavior for this case
* Data ordering: yes, I have validated that CW accepts out-of-order data, and I have updated the FLIP to point this out.
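To make this concrete, here is a rough sketch of how these caps could surface when building the sink. The setMaxBatchSize/setMaxBatchSizeInBytes/setMaxRecordSizeInBytes setters come from the AsyncSink base builder; the CloudWatchSink class and setNamespace are illustrative placeholders, not the final API:

    // Illustrative sketch only: CloudWatchSink and setNamespace are placeholders.
    CloudWatchSink<MetricWriteRequest> sink =
        CloudWatchSink.<MetricWriteRequest>builder()
            .setNamespace("MyApp/Metrics")        // one namespace per sink
            .setMaxBatchSize(1000)                // <= 1000 MetricDatum per PutMetricData request
            .setMaxBatchSizeInBytes(1_048_576L)   // <= 1 MB per PutMetricData request
            .setMaxRecordSizeInBytes(150L)        // <= 150 unique values per MetricDatum
            .build();

The intent is to validate these caps at sink build time, so a misconfiguration fails fast instead of surfacing as request failures at runtime.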
> The PutMetricData API supports two data modes, EntityMetricData and
> MetricData [1]. Since we are only supporting MetricData for now, let's make
> sure our interface allows the extension to support EntityMetricData in the
> future. For example, we should make sure we use our own data model classes
> instead of using AWS SDK classes. Also, we currently propose to use wrappers
> instead of primitives. Let's use the primitive where we can

Yes, agree on making the interface allow extension to support EntityMetricData in the future. We are using our own data model "MetricWriteRequest", and I have updated the FLIP to use primitives.

> - PutMetricData supports StrictEntityValidation [1]. As mentioned above,
> let's use this.
> - I like that we enforce a single namespace per sink, since that is aligned
> with the PutMetricData API interface. Let's be explicit on the reasoning in
> the FLIP!
> - Clarify sink data semantics. Since we're using the async sink, we only
> provide at-least-once semantics. Let's make this guarantee explicit.

Agree, and I have updated the FLIP.

> 4. CW sink interface. Currently we are proposing to have a static input data
> type instead of generic input type. This would require user to use a map
> separately (as you have noted). For future extensibility, I would prefer if
> we exposed an ElementConverter directly to the user. That way, we can provide
> a custom class "MetricWriteRequest" in the output interface of the
> ElementConverter that can be extended to support additional features (e.g.
> EntityMetricData) in the future.

Thanks, I agree with both suggestions: exposing the ElementConverter to the user, and providing a custom class "MetricWriteRequest" in its output for extensibility. I have updated the FLIP as well.
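For illustration, a user-facing converter could then look roughly like this (a sketch only: Sample is a made-up input type, and the MetricWriteRequest builder methods shown are assumptions rather than the agreed model):

    // Sketch of a user-supplied ElementConverter producing the sink's own
    // data model instead of the AWS SDK's MetricDatum.
    public class SampleElementConverter
            implements ElementConverter<Sample, MetricWriteRequest> {
        @Override
        public MetricWriteRequest apply(Sample sample, SinkWriter.Context context) {
            return MetricWriteRequest.builder()
                    .withMetricName(sample.getName())
                    .addDimension("InstanceId", sample.getInstanceId())
                    .addValue(sample.getValue())  // primitive double, not Double
                    .build();
        }
    }

Because the converter's output is our own model class, supporting EntityMetricData later only means extending MetricWriteRequest, without breaking the converter contract.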
> 5. Table API design.
> - I'm not a fan of the way we currently use dimensions in the properties.
> - It would be better to use a Flink-native SQL support like PRIMARY KEY
> instead [2]. This also enforces that the specified dimension cannot be null.

Thanks for the suggestion, but I also see a limitation in this approach when a user wants to define more than one dimension column with PRIMARY KEY, and CloudWatch also allows dimensions to be optional. Hence, I see the current approach as more flexible for users to configure; let me know what your thoughts are here.
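For reference, the property-based approach currently in the FLIP looks roughly like the sketch below ('metric.name.key' and 'metric.dimension.keys' follow the draft; 'metric.namespace' and 'metric.value.key' are placeholder option names for illustration):

    CREATE TABLE cw_sink (
        `cw_metric_name` STRING,
        `cw_dim`         STRING,
        `cw_value`       BIGINT
    ) WITH (
        'connector' = 'cloudwatch',
        'metric.namespace' = 'MyApp/Metrics',
        'metric.name.key' = 'cw_metric_name',
        'metric.dimension.keys' = 'cw_dim',
        'metric.value.key' = 'cw_value'
    );

Here `cw_dim` stays nullable, and any number of columns could be listed in 'metric.dimension.keys', whereas PRIMARY KEY columns must be NOT NULL.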
> 6. Housekeeping
> - It would be good to tidy up the public interfaces linked. For example, we
> don't make any explicit usage of the public interfaces in FLIP-191, so we can
> remove that.

Thanks for raising this; agreed, and I have updated the FLIP.

Regards,
Daren

On 08/04/2025, 12:02, "Hong Liang" <h...@apache.org> wrote:

Hi Daren,

Thanks for the contribution — exciting to see support for new sinks! I’ve added a few comments and suggestions below:

1. Clarify Cloudwatch API limitations and handling in the sink. [1]
Great to see we're being explicit with the Java API model! Let's be explicit about the PutMetricData API constraints and how we handle them (it would be good to have a separate section):
  1. Request size and count. Maximum size per request is 1MB, with limit of 1000 metrics. We need to enforce this during batching to prevent users from shooting themselves in the foot!
  2. For Values/Counts, limit is 150 unique metrics within a single request.
  3. Data type limitations. CloudWatch API uses Java double, but it doesn't support Double.NaN. We need to be explicit to handle improperly formatted data. We can consider failing fast/slow as you have suggested. Consider using "StrictEntityValidation" in the failure handling. [1] (For the design, we can simply mention, but work out the details when we implement)
  4. Timestamp limitations. Cloudwatch also has limitations around accepted timestamps (as you have noted). Metric data can be 48h in the past or 2h in the future. Let's clarify how we handle invalid values.
  5. Data ordering. CW API doesn't seem to specify limitations around out-of-order / repeat data. That's great, and it would be good to be explicit about and validate this behavior.

2. Clarify supported features [1]
- The PutMetricData API supports two data modes, EntityMetricData and MetricData [1]. Since we are only supporting MetricData for now, let's make sure our interface allows the extension to support EntityMetricData in the future. For example, we should make sure we use our own data model classes instead of using AWS SDK classes. Also, we currently propose to use wrappers instead of primitives. Let's use the primitive where we can :).
- PutMetricData supports StrictEntityValidation [1]. As mentioned above, let's use this.
- I like that we enforce a single namespace per sink, since that is aligned with the PutMetricData API interface. Let's be explicit on the reasoning in the FLIP!

3. Clarify sink data semantics. Since we're using the async sink, we only provide at-least-once semantics. Let’s make this guarantee explicit.

4. CW sink interface. Currently we are proposing to have a static input data type instead of generic input type. This would require user to use a map separately (as you have noted). For future extensibility, I would prefer if we exposed an ElementConverter directly to the user. That way, we can provide a custom class "MetricWriteRequest" in the output interface of the ElementConverter that can be extended to support additional features (e.g. EntityMetricData) in the future.

5. Table API design.
- I'm not a fan of the way we currently use dimensions in the properties.
- It would be better to use a Flink-native SQL support like PRIMARY KEY instead [2]. This also enforces that the specified dimension cannot be null.

6. Housekeeping
- It would be good to tidy up the public interfaces linked. For example, we don't make any explicit usage of the public interfaces in FLIP-191, so we can remove that.

Overall, nice FLIP! Thanks for the detail and making it an easy read. Hope the above helps!

Cheers,
Hong

[1] https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_PutMetricData.html
[2] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#primary-key

On Mon, Apr 7, 2025 at 9:24 PM Ahmed Hamdy <hamdy10...@gmail.com> wrote:

> Hi Daren, thanks for the FLIP
>
> Just a couple of questions and comments:
>
> > Usable in both DataStream and Table API/SQL
>
> What about the Python API? This is something we should consider ahead of
> time, since the abstract element converter doesn't have a Flink type
> mapping to be used from Python; this is an issue we faced with DDB before.
>
> > Therefore, the connector will provide a CloudWatchMetricInput model that
> > user can use to pass as input to the connector. For example, in DataStream
> > API, it could be a MapFunction called just before passing to the sink as
> > follows:
>
> I am not quite sure I follow: are you suggesting we introduce a specific
> new converter class, or relay that to users? Also, since you mentioned
> FLIP-171, are you suggesting to implement this sink as an extension of the
> Async Sink? In that case, it is more confusing to me how we are going to
> use the map function with the AsyncSink.ElementConvertor.
>
> > public class SampleToCloudWatchMetricInputMapper implements MapFunction<
> > Sample, CloudWatchMetricInput>
>
> Is CloudWatchMetricInput a newly introduced model class? I couldn't find
> it in the SDK v2. If we are introducing it, then it might be useful to add
> it to the FLIP, since this is part of the API.
>
> > Supports both Bounded (Batch) and Unbounded (Streaming)
>
> What do you propose to handle them differently? I can't find a specific
> thing in the FLIP.
>
> Regarding the Table API:
>
> > 'metric.dimension.keys' = 'cw_dim',
>
> I am not in favor of doing this, as it will complicate the schema
> validation on table creation. Maybe we can use the whole schema as
> dimensions, excluding the values and the count; let me know your thoughts
> here.
>
> > 'metric.name.key' = 'cw_metric_name',
>
> So we are making the metric name part of the row data? Have we considered
> not doing that, and instead having 1 table map to 1 metric instead of a
> namespace? It might be more suitable to enforce some validations on the
> dimensions schema this way. Of course, this will probably require us to
> introduce some intermediate class in the model to hold the dimensions,
> values and counts without the metric name and namespace, which we would
> extract from the sink definition; let me know your thoughts here.
>
> > `cw_value` BIGINT,
>
> Are we going to allow all numeric types for values?
>
> > protected void submitRequestEntries(
> > List<MetricDatum> requestEntries,
> > Consumer<List<MetricDatum>> requestResult)
>
> nit: This method should be deprecated after 1.20. I hope the repo is
> upgraded by the time we implement this.
>
> > Error Handling
>
> Away from poison pills, what error handling are you suggesting? Are we
> following in the footsteps of the other AWS connectors with error
> classification? Is there any effort to abstract it on the AWS side?
>
> And on the topic of poison pills: if I understand correctly, that is a
> topic that has been discussed for a while. This of course breaks the
> at-least-once semantics and might be confusing to users. Additionally,
> since the CloudWatch API fails the full batch, how are you suggesting we
> identify the poison pills? I am personally in favor of global failures in
> this case, but would love to hear the feedback here.
>
> Best Regards
> Ahmed Hamdy
>
> On Mon, 7 Apr 2025 at 11:29, Wong, Daren <daren...@amazon.co.uk.invalid>
> wrote:
>
> > Hi Dev,
> >
> > I would like to start a discussion about FLIP: Amazon CloudWatch Metric
> > Sink Connector
> > https://docs.google.com/document/d/1G2sQogV8S6M51qeAaTmvpClOSvklejjEXbRFFCv_T-c/edit?usp=sharing
> >
> > This FLIP is proposing to add support for Amazon CloudWatch Metric sink
> > in flink-connector-aws repo. Looking forward to your feedback, thank you
> >
> > Regards,
> > Daren