To the first issue your are facing:
In BeamSQL, we tried to solve the similar requirement.
BeamSQL supports reading JSON format message from Pubsub, writing to
Bigquery and writing messages that fail to parse in another Pubsub topic.
BeamSQL uses the pre-processing transform to parse JSON payload
I'm trying to write a robust pipeline that takes input from PubSub and
writes to BigQuery. For every PubsubMessage that is not successfully
written to BigQuery, I'd like to get the original PubsubMessage back and be
able to write to an error output collection. I'm not sure this is quite
possible, t