Hello, I have written a Go program which downloads a 5 GB compressed CSV from Amazon S3, decompresses it, and uploads the decompressed CSV (20 GB) back to Amazon S3.
Amazon S3 provides a default concurrent uploader/downloader, and I am using a multithreaded approach: download the files in parallel, decompress, and upload. The program seems to work fine, but I believe it could be optimized further. Not all cores are used even though I have parallelized across the number of CPUs available: CPU usage is only around 30-40%, and I see I/O wait of around 30-40% as well. The download is fast, decompression takes 5-6 minutes, and the uploads run in parallel but take almost an hour for a set of 8 files.

For decompression, I use:

    reader, err := gzip.NewReader(gzipfile)
    writer, err := os.Create(outputFile)
    _, err = io.Copy(writer, reader)

I run this on a 16-CPU, 122 GB RAM, 500 GB SSD instance.

Are there any other methodologies by which I can optimize the decompression and upload parts? I am pretty new to Go. Any guidance is very much appreciated.

Regards,
Mukund