[ https://issues.apache.org/jira/browse/BEAM-14511?focusedWorklogId=775270&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-775270 ]
ASF GitHub Bot logged work on BEAM-14511: ----------------------------------------- Author: ASF GitHub Bot Created on: 26/May/22 22:27 Start Date: 26/May/22 22:27 Worklog Time Spent: 10m Work Description: lostluck commented on code in PR #17754: URL: https://github.com/apache/beam/pull/17754#discussion_r883127503 ########## sdks/go/pkg/beam/io/rtrackers/offsetrange/offsetrange.go: ########## @@ -223,3 +227,102 @@ func (tracker *Tracker) GetRestriction() interface{} { func (tracker *Tracker) IsBounded() bool { return true } + +// RangeEndEstimator provides the estimated end offset of the range. Users must implement this interface to +// use the offsetrange.GrowableTracker. +type RangeEndEstimator interface { + // Estimate is called to get the end offset in TrySplit() or GetProgress() functions. + // + // The end offset is exclusive for the range. The estimated end is not required to + // monotonically increase as it will only be taken into consideration when the + // estimated end offset is larger than the current position. + // Returning math.MaxInt64 as the estimate implies the largest possible position for the range + // is math.MaxInt64 - 1. Return math.MinInt64 if an estimate can not be provided. + // + // Providing a good estimate is important for an accurate progress signal and will impact + // splitting decisions by the runner. + Estimate() int64 +} + +// GrowableTracker tracks a growable offset range restriction that can be represented as a range of integer values, +// for example for byte offsets in a file, or indices in an array. Note that this tracker makes +// no assumptions about the positions of blocks within the range, so users must handle validation +// of block positions if needed. +type GrowableTracker struct { + Tracker + rangeEndEstimator RangeEndEstimator +} + +// NewGrowableTracker is a constructor for an GrowableTracker given a offsetrange.Restriction and RangeEndEstimator. +// This tracker should be used when dealing with streaming use cases where the end of the restriction is +// undefined (math.MaxInt64) in this case. Otherwise, this tracker works the same as offsetrange.Tracker, so it is +// recommended to use that directly for bounded restrictions. Review Comment: I think the JavaDoc is clearer in this case, at describing how the tracker behaves for user use. As written, the doc here repeats the constructor method signature, calls it a constructor (which has no special meaning in Go), and then buries the critical detail on how infinite ranges are handled in the middle of the paragraph. The last sentence though is excellent information useful to the user. So here's what I would suggest as an alternative, as cribbed from the Java doc version: ```suggestion // NewGrowableTracker creates a GrowableTracker for handling a growable offset range. // math.MaxInt64 is used as the end of the range to indicate infinity for an unbounded range. // // An OffsetRange is considered growable when the end offset could grow (or change) // during execution time (e.g. Kafka topic partition offset, appended file, ...). // // The growable range is marked as done by claiming math.MaxInt64. // // For bounded restrictions, this tracker works the same as offsetrange.Tracker. // Use that directly if you have no need of estimating the end of a bound. ``` Issue Time Tracking ------------------- Worklog Id: (was: 775270) Time Spent: 4h 20m (was: 4h 10m) > Implement Growable Tracker for Go SDK > ------------------------------------- > > Key: BEAM-14511 > URL: https://issues.apache.org/jira/browse/BEAM-14511 > Project: Beam > Issue Type: Improvement > Components: sdk-go > Reporter: Ritesh Ghorse > Assignee: Ritesh Ghorse > Priority: P2 > Time Spent: 4h 20m > Remaining Estimate: 0h > > Add a growable tracker for > [OffsetRange|https://github.com/apache/beam/blob/3e683606d9a03e7da3d37a83eb16c3a6b96068cd/sdks/go/pkg/beam/io/rtrackers/offsetrange/offsetrange.go#L60] > Restriction in Go SDK. This would be useful for strreaming/unbounded > restrictions in SDF. -- This message was sent by Atlassian Jira (v8.20.7#820007)