Re: [MERGE]object_store branch into master

Min Chen Thu, 23 May 2013 13:54:40 -0700

Thanks for your suggestions, John. Current code is working fine
functionality-wise, can we do such code cleanup and optimization
post-merge?


-min

On 5/23/13 12:54 PM, "John Burwell" <[email protected]> wrote:

>Min,
>
>TL;DR There are rare circumstances where directly access the AWS HTTP API
>makes sense, but generally, you should use a client.
>
>If you need the size of an S3 object in advance, I recommend using
>AmazonS3Client#getObjectMeta method.  Also, large swathes of the class
>appear to be copied directly from HTTPTemplateDownloader and don't make
>sense when interacting S3.  For example, S3 does not support chunked HTTP
>operations or HTTP authentication.  As such, this class could be greatly
>simplified to operate in terms of S3Utils (likely with a few additional
>methods) to get this information.
>
>In terms of progress tracking of downloads and transfer operations, I
>recommend retrofitting S3Utils to leverage the TransferManager mechanism.
> In addition to providing progress information (TransferProgress), it
>also provides various operation optimizations (particularly for uploads).
> It also simplifies the interaction tremendously (e.g. it builds the
>PutRequest objects for you).
>
>Thanks,
>-John  
>
>On May 23, 2013, at 12:12 PM, Min Chen <[email protected]> wrote:
>
>> John, sorry for my ignorance. In S3TemplateDownloader.download, we are
>> only using HttpClient method to get the size and InputStream from a
>>given
>> URL, then we are invoking S3 client library putObject(PutObjectRequest)
>>to
>> download the content to S3. By using PutObjectRequest, I can set
>> ProgressListener to show download progress to end user. What is your
>> recommendation in this case?
>> 
>> Thanks
>> -min
>> 
>> On 5/23/13 7:20 AM, "John Burwell" <[email protected]> wrote:
>> 
>>> Min,
>>> 
>>> The com.cloud.storage.template.S3TemplateDownloader is directly
>>>accessing
>>> the S3 API using HTTP client.
>>> 
>>> Thanks,
>>> John
>>> 
>>> On May 22, 2013, at 5:57 PM, Min Chen <[email protected]> wrote:
>>> 
>>>> John,
>>>> 
>>>>    Can you clarify a bit on your last comment about directly accessing
>>>>S3
>>>> HTTP API? We are only invoking routines in S3Utils to perform
>>>>operations
>>>> with S3, not invoke any REST api if that is what you meant.
>>>> 
>>>>    Thanks
>>>>    -min
>>>> 
>>>> On 5/22/13 2:49 PM, "John Burwell" <[email protected]> wrote:
>>>> 
>>>>> Edison,
>>>>> 
>>>>> For changes that take as long as described, it should be expected
>>>>>that
>>>>> the review will take a proportional amount of time.  In future
>>>>> releases, we should think through ways to divide changes such as
>>>>>these
>>>>> into a set of smaller patches submitted throughout the course of the
>>>>> release cycle.
>>>>> 
>>>>> So far, I can say I am very concerned about failure scenarios and
>>>>> potential race conditions around the NFS cache. However, I am only a
>>>>> quarter of the way through the code so my concerns may be resolved by
>>>>> the end of the process.
>>>>> 
>>>>> I am also concerned about the correctness S3 implementation.  Why did
>>>>> you choose to directly access the S3 HTTP API rather using the client
>>>>> library?
>>>>> 
>>>>> Thanks,
>>>>> -John
>>>>> 
>>>>> On May 22, 2013, at 5:25 PM, Edison Su <[email protected]> wrote:
>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Chip Childers [mailto:[email protected]]
>>>>>>> Sent: Wednesday, May 22, 2013 1:26 PM
>>>>>>> To: [email protected]
>>>>>>> Subject: Re: [MERGE]object_store branch into master
>>>>>>> 
>>>>>>> On Wed, May 22, 2013 at 08:15:41PM +0000, Animesh Chaturvedi wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Chip Childers [mailto:[email protected]]
>>>>>>>>> Sent: Wednesday, May 22, 2013 12:08 PM
>>>>>>>>> To: [email protected]
>>>>>>>>> Subject: Re: [MERGE]object_store branch into master
>>>>>>>>> 
>>>>>>>>> On Wed, May 22, 2013 at 07:00:51PM +0000, Animesh Chaturvedi
>>>>>>>>>wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: John Burwell [mailto:[email protected]]
>>>>>>>>>>> Sent: Tuesday, May 21, 2013 8:51 AM
>>>>>>>>>>> To: [email protected]
>>>>>>>>>>> Subject: Re: [MERGE]object_store branch into master
>>>>>>>>>>> 
>>>>>>>>>>> Edison,
>>>>>>>>>>> 
>>>>>>>>>>> Thanks, I will start going through it today.  Based on other
>>>>>>>>>>> $dayjob responsibilities, it may take me a couple of days.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -John
>>>>>>>>>> [Animesh>] John we are just a few days away  from 4.2 feature
>>>>>>>>>> freeze, can
>>>>>>>>> you provide your comments by Friday 5/24.   I would like all
>>>>>>>>> feature
>>>>>>> threads
>>>>>>>>> to be resolved sooner so that we don't have last minute rush.
>>>>>>>>> 
>>>>>>>>> I'm just going to comment on this, but not take it much
>>>>>>>>>further...
>>>>>>>>> this type of change is an "architectural" change.  We had
>>>>>>>>> previously
>>>>>>>>> discussed (on several
>>>>>>>>> threads) that the appropriate time for this sort of thing to hit
>>>>>>>>> master was
>>>>>>>>> *early* in the release cycle.  Any reason that that consensus
>>>>>>>>> doesn't apply here?
>>>>>>>> [Animesh>] Yes it is an architectural change and discussion on
>>>>>>>>this
>>>>>>>> started a
>>>>>>> few weeks back already, Min and Edison wanted to get it in sooner
>>>>>>>by
>>>>>>> 4/30
>>>>>>> but it took longer than anticipated in  preparing for merge and
>>>>>>> testing on
>>>>>>> feature branch.
>>>>>>> 
>>>>>>> You're not following me I think.  See this thread on the Javelin
>>>>>>> merge:
>>>>>>> 
>>>>>>> http://markmail.org/message/e6peml5ddkqa6jp4
>>>>>>> 
>>>>>>> We have discussed that our preference is for architectural changes
>>>>>>>to
>>>>>>> hit
>>>>>>> master shortly after a feature branch is cut.  Why are we not doing
>>>>>>> that here?
>>>>>> 
>>>>>> This kind of refactor takes time, a lot of time. I think I worked on
>>>>>> the merge of primary storage refactor into master and bug fixes
>>>>>>during
>>>>>> 
>>>>>> 
>>>>>>March(http://comments.gmane.org/gmane.comp.apache.cloudstack.devel/14
>>>>>>46
>>>>>> 9)
>>>>>> , then started to work on the secondary storage refactor in
>>>>>> April(http://markmail.org/message/cspb6xweeupfvpit). Min and I
>>>>>> finished
>>>>>> the coding at end of April, then tested for two weeks, send out the
>>>>>> merge request at middle of May.
>>>>>> With the refactor, the  storage code will be much cleaner, and the
>>>>>> performance of S3 will be improved, and integration with other
>>>>>>storage
>>>>>> vendor will be much easier, and the quality is ok(33 bugs fired,
>>>>>>only
>>>>>> 5
>>>>>> left: 
>>>>>> 
>>>>>> 
>>>>>>https://issues.apache.org/jira/issues/?jql=text%20~%20%22Object_Store
>>>>>>_R
>>>>>> ef
>>>>>> actor%22). Anyway, it's up to the community to decide, merge it or
>>>>>> not,
>>>>>> we already tried our best to get it done ASAP.
>>>>>> 
>>>> 
>>> 
>> 
>

Re: [MERGE]object_store branch into master

Reply via email to