I have been using Athena/Presto to read the Parquet files in the data lake;
if you are already saving data to S3, I think this is the easiest option.
Then I use Redash or Metabase to build dashboards (they have different
limitations); both are very intuitive to use and easy to set up with Docker.
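For reference, pointing Athena/Presto at Parquet already in S3 is just an
external table over the S3 prefix. The database, table, columns, and bucket
path below are placeholders, not anything from this thread:

```sql
-- Hypothetical Athena DDL over existing Parquet files;
-- adjust the columns and the LOCATION prefix to your own layout.
CREATE EXTERNAL TABLE IF NOT EXISTS datalake.events (
  event_time timestamp,
  user_id    string,
  value      double
)
STORED AS PARQUET
LOCATION 's3://your-bucket/path/to/parquet/';
```

After that, the table is queryable with plain SQL from Athena, Redash, or
Metabase.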
--
S
Hi Roland,
Thank you for your reply. That's quite helpful. I think I should try
InfluxDB then. But I am curious: in the case of Prometheus, would writing a
custom exporter be a good choice and solve the purpose efficiently? Grafana
is not something I want to drop.
Best,
Aniruddha
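For context on the InfluxDB route mentioned above: points are written to
InfluxDB in its line protocol (measurement, tags, fields, timestamp). A
minimal, simplified sketch; the measurement, tag, and field names are made up
for illustration:

```python
# Simplified line-protocol rendering. Real line protocol additionally
# marks integer fields with a trailing "i" and quotes string fields;
# this sketch only covers bare numeric fields.
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Render one point: measurement,tag=... field=... timestamp."""
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"

# Hypothetical metric from a streaming job:
line = to_line_protocol(
    "spark_job",
    {"app": "delta-stream"},
    {"batch_seconds": 2.5, "input_rows": 100},
    1_600_000_000_000_000_000,
)
```

Lines like this are what the InfluxDB write endpoint ingests, and Grafana can
then query them directly.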
---
On Tue, F
Hi Ani,
Prometheus is not well suited for ingesting arbitrary time-series data; its
purpose is technical monitoring. If you want to monitor your Spark jobs
with Prometheus, you can publish the metrics so Prometheus can scrape them.
What you probably are looking for is a timeseries database that you
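To illustrate the "publish metrics so Prometheus can scrape them" point:
Prometheus pulls from a plain-text HTTP endpoint in its exposition format, so
a custom exporter can be as small as a process serving that text. A
standard-library-only sketch; the metric name and value are illustrative, not
from this thread (in practice the prometheus_client library does this for you):

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Value you would update from your pipeline (illustrative).
rows_processed = 12345

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_response(404)
            self.end_headers()
            return
        # Prometheus text exposition format: HELP, TYPE, then samples.
        body = (
            "# HELP pipeline_rows_processed_total Rows written by the job.\n"
            "# TYPE pipeline_rows_processed_total counter\n"
            f"pipeline_rows_processed_total {rows_processed}\n"
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

def start_exporter(port=0):
    """Serve the scrape endpoint in a background thread; returns the server."""
    server = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

You would then point a Prometheus scrape job at that port and build the
Grafana dashboard on top.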
Hello,
I am trying to build a data pipeline that uses Spark Structured Streaming
with the Delta project and runs on Kubernetes. Because of this, I get my
output files only in Parquet format. Since I am asked to use Prometheus and
Grafana for building the dashboard for this pipeline, I run an anoth