I'll sketch out a PR so we can talk code and move the discussion there.
On 18.03.21 at 14:55, Wenchen Fan wrote:
I think a listener-based API makes sense for streaming (since you need to
keep watching the result), but may not be very reasonable for batch queries
(you only get the result once). The idea of Observation looks good, but we
should define what happens if `observation.get` is called before the batch
query finishes.
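To make that open question concrete, here is a minimal, Spark-free sketch of how an Observation-style handle could behave, with `get` blocking until the query's listener has published the metrics. Class and method names here are illustrative assumptions, not Spark's actual implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Illustrative stand-in for an Observation handle. The query listener
// calls publish() once; callers of get() block until that happens.
public class ObservationSketch {
    private final CountDownLatch done = new CountDownLatch(1);
    private final Map<String, Object> metrics = new ConcurrentHashMap<>();

    // Called by the (hypothetical) query listener when the batch completes.
    public void publish(Map<String, Object> observed) {
        metrics.putAll(observed);
        done.countDown();
    }

    // One possible answer to "get() before the batch finishes": block
    // the caller until the metrics are available.
    public Map<String, Object> get() throws InterruptedException {
        done.await();
        return metrics;
    }

    public static void main(String[] args) throws Exception {
        ObservationSketch obs = new ObservationSketch();
        // Simulate the query finishing on another thread.
        new Thread(() -> obs.publish(Map.of("rowCount", 42L))).start();
        System.out.println(obs.get().get("rowCount"));
    }
}
```

Other answers are possible too (throw, return an `Optional`, time out); blocking is just the simplest semantics to specify.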
Please follow up on the discussion in the original PR:
https://github.com/apache/spark/pull/26127
Dataset.observe() relies on the query listener for batch queries, which is
an "unstable" API - that's why we decided not to add an example for the
batch query. For streaming queries, it relies on the streaming query listener.
I am focusing on batch mode, not streaming mode. I would argue that
Dataset.observe() is equally useful for large batch processing. If you
need some motivating use cases, please let me know.
Anyhow, the documentation of observe states that it works for both batch and
streaming. And in batch mode,
If I remember correctly, the major audience of the "observe" API is
Structured Streaming in micro-batch mode. From the example, the abstraction
in 2 isn't something that works with Structured Streaming. It could still be
done with a callback, but the question remains how much complexity is
hidden from