Thanks Ankur, I will explore your suggestions and get back if I am lost.
Regards
On Wed, Jan 4, 2023, 5:36 AM Ankur Goenka <ankurgoe...@gmail.com> wrote:

> Hi Marco,
> It is not very clear which checks you are interested in.
> Beam does not have any standard business-specific data quality checks.
> However, you can add your own checks at various stages of the pipeline.
> The checks will broadly fall into 2 categories.
> 1. Checks on a single element: These are easy to do, as you can write a
> transform to check a single element.
> 2. Checks that correlate data across elements, such as "at least a single
> domain has 2 pages", etc.: For these, you can use aggregation and then
> apply the check that you need.
>
> Thanks,
> Ankur
>
>
> On Sun, 1 Jan 2023 at 11:53, Sofia’s World <mmistr...@gmail.com> wrote:
>
>> Hi all
>> Are there any facilities to do DQ checks on Apache Beam?
>> I have a few jobs that download data from the web, then do some
>> filtering, transformation and aggregation.
>> I want to introduce DQ checks so that the job fails if certain
>> conditions are not met, e.g. the number of outputs.
>> Is that achievable in Beam?
>> Thanks. Marco
>>
>
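
The two categories Ankur describes could be sketched roughly as below with the Beam Python SDK. This is only an illustration under assumptions, not something from the thread: the record fields ("domain", "url"), the names DataQualityError, check_element, check_min_count and run_pipeline, and the min_outputs threshold are all made up for the example. The idea is that an exception raised inside a transform makes the job fail, which is the behaviour Marco asked for.

```python
from typing import Any, Dict, List


class DataQualityError(ValueError):
    """Raised when a data quality check fails (hypothetical name)."""


def check_element(record: Dict[str, Any]) -> Dict[str, Any]:
    """Category 1: validate one element and pass it through unchanged."""
    # Assumed rule for illustration: every record needs a domain and a url.
    if not record.get("domain") or not record.get("url"):
        raise DataQualityError(f"missing domain or url: {record!r}")
    return record


def check_min_count(count: int, minimum: int) -> int:
    """Category 2: validate a value aggregated across all elements."""
    if count < minimum:
        raise DataQualityError(
            f"expected at least {minimum} outputs, got {count}"
        )
    return count


def run_pipeline(records: List[Dict[str, Any]], min_outputs: int) -> None:
    """Wire both checks into a pipeline; a failed check fails the job."""
    # Imported here so the check functions above can be unit-tested
    # without Beam installed.
    import apache_beam as beam
    from apache_beam.transforms import combiners

    with beam.Pipeline() as pipeline:
        checked = (
            pipeline
            | "Read" >> beam.Create(records)
            # Per-element check: raises on the first bad record.
            | "CheckEach" >> beam.Map(check_element)
        )
        _ = (
            checked
            # Aggregate first, then apply the check to the aggregate.
            | "CountOutputs" >> combiners.Count.Globally()
            | "CheckCount" >> beam.Map(check_min_count, minimum=min_outputs)
        )
```

Calling run_pipeline(records, min_outputs=2) with fewer than two valid records should make the pipeline run fail. The exact exception type surfaced to the caller depends on the runner, so it is safer to treat any pipeline failure as a failed check. A check such as "at least a single domain has 2 pages" would follow the same pattern with a per-key aggregation in place of the global count.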