Subject: Data copy from HDFS to MinIO regularly
 
Hello Team,

We have a legacy application that processes 10 GB of binary data per hour 
using MapReduce and generates 100 GB of output, which is written to HDFS.

My goal is to move a portion of the processed data (approximately 25%) to a 
MinIO cluster that I plan to use as new object storage. I want this operation 
to be repeated every time new data is added to the HDFS cluster.

What kind of solution would you suggest for this task? Note that I also 
need to be able to monitor the pipeline I am developing.

Thank you.

  
