[ https://issues.apache.org/jira/browse/BEAM-12857 ]
Kenneth Knowles deleted comment on BEAM-12857:
----------------------------------------

    was (Author: beamjirabot):
This issue is P2 but has been unassigned without any comment for 60 days so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean.

> Unable to write to GCS due to IndexOutOfBoundsException in FileSystems
> ----------------------------------------------------------------------
>
>                 Key: BEAM-12857
>                 URL: https://issues.apache.org/jira/browse/BEAM-12857
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>    Affects Versions: 2.31.0, 2.32.0
>         Environment: Beam 2.31.0/2.32.0, Java 11, GCP Dataflow
>            Reporter: Patrick Lucas
>            Priority: P2
>              Labels: stale-P2
>
> I have a simple batch job, running on Dataflow, that reads from a GCS bucket,
> filters the data, and windows and writes the matching data back to a
> different path in the same bucket.
> The job seems to succeed in reading and filtering the data, as well as
> writing temporary files to GCS, but appears to fail when trying to rename the
> temporary files to their final destination.
> The IndexOutOfBoundsException is thrown from
> [FileSystems.java:429|https://github.com/apache/beam/blob/v2.32.0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java#L429]
> (in 2.32.0), when the code calls {{.get(0)}} on the list returned by a call
> to {{MatchResult#metadata()}}.
> The javadoc for
> [{{MatchResult#metadata()}}|https://github.com/apache/beam/blob/v2.32.0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/MatchResult.java#L75-L80]
> says,
> {code:java}
>   /**
>    * {@link Metadata} of matched files. Note that if {@link #status()} is {@link Status#NOT_FOUND},
>    * this may either throw a {@link java.io.FileNotFoundException} or return an empty list,
>    * depending on the {@link EmptyMatchTreatment} used in the {@link FileSystems#match} call.
>    */
> {code}
> So possibly GCS is not returning any metadata for the (missing) destination
> object? That seems unlikely, as I would expect many others would have already
> run into this, but I don't see how this could be caused by my user code.
> I have tested this on 2.31.0 and 2.32.0 getting the same error, but it's
> worth noting that the logic in FileSystems.java changed a decent amount
> recently in [#15301|https://github.com/apache/beam/pull/15301], maybe having
> an effect on this, but I haven't been able to test it since I'm working in a
> closed environment and can only easily use released versions of Beam. Once a
> version containing this change is released, I will upgrade and try again.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
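
For reference, the failure mode described in the issue (calling {{.get(0)}} on a metadata list that may be empty) can be illustrated with a minimal, self-contained sketch. It does not use Beam at all: {{MatchSketch}}, {{firstUnchecked}}, and {{firstChecked}} are hypothetical names, and a plain {{List<String>}} stands in for the list returned by {{MatchResult#metadata()}}. It is not the code in FileSystems.java and is not a proposed fix, only a demonstration of why an empty list leads to this exception.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for handling a match result; NOT Beam's API.
final class MatchSketch {

    // Mirrors the pattern from the report: assumes at least one entry exists.
    // Throws IndexOutOfBoundsException when the list is empty.
    static String firstUnchecked(List<String> metadata) {
        return metadata.get(0);
    }

    // Defensive variant: an empty list becomes an empty Optional
    // instead of an exception.
    static Optional<String> firstChecked(List<String> metadata) {
        return metadata.isEmpty() ? Optional.empty() : Optional.of(metadata.get(0));
    }

    public static void main(String[] args) {
        // Simulates a "not found" match that yields an empty metadata list.
        List<String> empty = Collections.emptyList();

        System.out.println("checked lookup present: " + firstChecked(empty).isPresent());

        try {
            firstUnchecked(empty);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unchecked lookup threw IndexOutOfBoundsException");
        }
    }
}
```

Whether the empty list in the reported job comes from the missing destination object, as the reporter suspects, is not established by this sketch.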