Related to a previous thread about custom triggering on GlobalWindows [0], are there general recommendations for controlling the size of output files from FileIO.Write?
A general pattern I've seen in systems that need to batch individual records to files is that they offer both a maximum file size and a maximum latency. If you specify 1 GB and 1 minute respectively, the system creates multiple 1 GB files per minute when throughput is high, and a single smaller file per minute when throughput is below 1 GB/minute.

From the discussion in [0], it sounds like windowing and triggering semantics are not sufficient to provide such guarantees. Bounded runners are free to ignore triggers, since triggers are non-deterministic.

Are there other techniques I'm missing for limiting file sizes, or is windowing on record timestamp the only tool available that applies to both batch and streaming?

[0] https://lists.apache.org/thread.html/7b583c73d55d13389a49a35dec2b42128d114361de3c1f0822d9ded4@%3Cuser.beam.apache.org%3E
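
To make the size-plus-latency policy described above concrete, here is a rough sketch in plain Python. It is not Beam code and does not use any Beam API; the names `FileBatch`, `batch_records`, `max_bytes`, and `max_latency_s` are made up for illustration. It assumes records arrive in order of arrival time, and it simplifies the latency limit by checking it only when the next record arrives (a real system would presumably flush on a timer).

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass
class FileBatch:
    """One output file: hypothetical container used only in this sketch."""
    opened_at: float                      # arrival time of the first record
    records: List[bytes] = field(default_factory=list)
    size_bytes: int = 0


def batch_records(
    records: Iterable[Tuple[float, bytes]],
    max_bytes: int,
    max_latency_s: float,
) -> List[FileBatch]:
    """Group (arrival_time, payload) pairs into files.

    The current file is closed when adding the next record would exceed
    max_bytes, or when the next record arrives max_latency_s or more
    after the file was opened. Input is assumed sorted by arrival time.
    """
    closed: List[FileBatch] = []
    current: Optional[FileBatch] = None
    for arrival, payload in records:
        if current is not None and (
            arrival - current.opened_at >= max_latency_s
            or current.size_bytes + len(payload) > max_bytes
        ):
            closed.append(current)
            current = None
        if current is None:
            current = FileBatch(opened_at=arrival)
        current.records.append(payload)
        current.size_bytes += len(payload)
    if current is not None:
        closed.append(current)  # flush whatever is left at end of input
    return closed
```

With a small `max_bytes` and a steady stream of records, this produces many full files per latency interval; with a large `max_bytes` and sparse records, it produces one smaller file per interval. That is the behavior I'm hoping to get from FileIO.Write in both batch and streaming.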