Hi there,

This thread is to address John's review comments on S3TemplateDownloader 
implementation. From previous thread, there are two major concerns for this 
class implementation.

1. We have used HttpClient library in this class.  For this comment, I can 
explain why I need that HttpClient during downloading object to S3. Current our 
download logic is like this:

-- get object total size and InputStream from a http url by invoking HttpClient 
library method.
-- invoke S3Utils api to download an InputStream to S3, this is totally S3 api, 
and get actual object size downloaded on S3 once completion.
-- compare object total size and actual download size to check if they are 
equal to report any truncation error.

John's concern is on step 1 above. We can get ride of HttpClient library use to 
get InputStream from an URL, but I don't know how I can easily get the object 
size from a URL. In previous email, John you mentioned that I can use S3 api 
getObjectMetaData to get the object size, but my understanding is that that API 
only applies to the object already in S3. In my flow, I need to get the size of 
object that is to be downloaded to S3, but not in S3. Willing to hear your 
suggestion here.

2. John pointed out an issue with current download method implementation in 
this class, where I have used S3 low-level api PutObjectRequest to put an 
InputStream to S3, this has a bug that it cannot handle object > 5GB. That is 
true after reading several S3 documentation on MultiPart upload, sorry that I 
am not expert on S3 and thus didn't know that earlier when I implemented this 
method. To change that, it should not take too long to code based on this 
sample on AWS 
(http://docs.aws.amazon.com/AmazonS3/latest/dev/HLTrackProgressMPUJava.html) by 
using TransferManager, just need some testing time.  IMHO, this bug should not 
become a major issue blocking object_store branch merge, just need several days 
to fix and address assuming that we have extension. Even without extension, I 
personally think that this definitely can be resolved in master with a simple 
bug fix.

Thanks
-min

Reply via email to