I came across an interesting user report at https://github.com/apache/beam/issues/32866 which made me realize that, while providing metadata about a bad element in the "bad records" output is useful, we don't make it easy to extract that output into a PCollection of the original elements. The output schema contains the original element as well as metadata about what error occurred. In an ordinary Beam pipeline one could easily apply a Map(lambda error_row: error_row.element), but YAML doesn't have Map, just MapToFields (primarily to be more schema friendly).
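To make the gap concrete, here is a minimal plain-Python sketch of the two shapes involved. It does not use Beam itself. ErrorRow, Order, the msg field, and the helper names are hypothetical stand-ins for illustration only; the real error output schema may carry different metadata fields.

```python
from typing import Any, NamedTuple


class Order(NamedTuple):
    # Hypothetical schema of the original element.
    fld1: str
    fld2: int


class ErrorRow(NamedTuple):
    # Stand-in for one "bad records" row: the original element plus
    # error metadata (the metadata field name is an assumption).
    element: Any
    msg: str


def unnest(error_row: ErrorRow) -> Any:
    # Same shape as Map(lambda error_row: error_row.element):
    # no knowledge of the element's fields is needed.
    return error_row.element


def enumerate_fields(error_row: ErrorRow) -> dict:
    # What a MapToFields-style mapping requires: every original
    # field must be known and named by hand.
    return {
        "fld1": error_row.element.fld1,
        "fld2": error_row.element.fld2,
    }


bad_records = [
    ErrorRow(element=Order("a", 1), msg="division by zero"),
    ErrorRow(element=Order("b", 2), msg="bad cast"),
]

recovered = [unnest(row) for row in bad_records]
by_hand = [enumerate_fields(row) for row in bad_records]
```

Both approaches recover the same data, but only the first keeps working unchanged if the original schema gains or loses a field.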
There are a couple of options:

(0) Leave things as they are. One can write

    type: MapToFields
    config:
      fields:
        fld1: element.fld1
        fld2: element.fld2
        ...

This is of course a bit ugly as one needs to enumerate (and know) the set of original fields.

(1a) Provide a special operation "Unnest" that takes a single field and emits it as the top-level element. This can of course result in unschema'd PCollections (which are supported, but generally don't play as well with the other operations, including xlang ones).

(1b) Just provide a Map. This is a generalization of 1a, but on the other hand would be more prone to abuse.

(1c) We could name this

    type: MapToFields
    config:
      fields:
        *: element

IIRC, we already have the special case of "*" in our join syntax, and we could re-use a bunch of the MapToFields infrastructure. But maybe it's too obscure?

(2) Add an optional argument to error_handling to omit the metadata. This would require a bit of a hack to support ubiquitously, and wouldn't solve the more general problem.

Maybe there are some other ideas as well?