Re: [MERGE]object_store branch into master

John Burwell Thu, 23 May 2013 13:56:24 -0700

Min,

Yes, I feel that class needs to be fixed before it can be merged into master.


Thanks,
-John

On May 23, 2013, at 4:54 PM, Min Chen <min.c...@citrix.com> wrote:

> Thanks for your suggestions, John. Current code is working fine
> functionality-wise, can we do such code cleanup and optimization
> post-merge?
> 
> -min
> 
> On 5/23/13 12:54 PM, "John Burwell" <jburw...@basho.com> wrote:
> 
>> Min,
>> 
>> TL;DR There are rare circumstances where directly access the AWS HTTP API
>> makes sense, but generally, you should use a client.
>> 
>> If you need the size of an S3 object in advance, I recommend using
>> AmazonS3Client#getObjectMeta method.  Also, large swathes of the class
>> appear to be copied directly from HTTPTemplateDownloader and don't make
>> sense when interacting S3.  For example, S3 does not support chunked HTTP
>> operations or HTTP authentication.  As such, this class could be greatly
>> simplified to operate in terms of S3Utils (likely with a few additional
>> methods) to get this information.
>> 
>> In terms of progress tracking of downloads and transfer operations, I
>> recommend retrofitting S3Utils to leverage the TransferManager mechanism.
>> In addition to providing progress information (TransferProgress), it
>> also provides various operation optimizations (particularly for uploads).
>> It also simplifies the interaction tremendously (e.g. it builds the
>> PutRequest objects for you).
>> 
>> Thanks,
>> -John  
>> 
>> On May 23, 2013, at 12:12 PM, Min Chen <min.c...@citrix.com> wrote:
>> 
>>> John, sorry for my ignorance. In S3TemplateDownloader.download, we are
>>> only using HttpClient method to get the size and InputStream from a
>>> given
>>> URL, then we are invoking S3 client library putObject(PutObjectRequest)
>>> to
>>> download the content to S3. By using PutObjectRequest, I can set
>>> ProgressListener to show download progress to end user. What is your
>>> recommendation in this case?
>>> 
>>> Thanks
>>> -min
>>> 
>>> On 5/23/13 7:20 AM, "John Burwell" <jburw...@basho.com> wrote:
>>> 
>>>> Min,
>>>> 
>>>> The com.cloud.storage.template.S3TemplateDownloader is directly
>>>> accessing
>>>> the S3 API using HTTP client.
>>>> 
>>>> Thanks,
>>>> John
>>>> 
>>>> On May 22, 2013, at 5:57 PM, Min Chen <min.c...@citrix.com> wrote:
>>>> 
>>>>> John,
>>>>> 
>>>>>   Can you clarify a bit on your last comment about directly accessing
>>>>> S3
>>>>> HTTP API? We are only invoking routines in S3Utils to perform
>>>>> operations
>>>>> with S3, not invoke any REST api if that is what you meant.
>>>>> 
>>>>>   Thanks
>>>>>   -min
>>>>> 
>>>>> On 5/22/13 2:49 PM, "John Burwell" <jburw...@basho.com> wrote:
>>>>> 
>>>>>> Edison,
>>>>>> 
>>>>>> For changes that take as long as described, it should be expected
>>>>>> that
>>>>>> the review will take a proportional amount of time.  In future
>>>>>> releases, we should think through ways to divide changes such as
>>>>>> these
>>>>>> into a set of smaller patches submitted throughout the course of the
>>>>>> release cycle.
>>>>>> 
>>>>>> So far, I can say I am very concerned about failure scenarios and
>>>>>> potential race conditions around the NFS cache. However, I am only a
>>>>>> quarter of the way through the code so my concerns may be resolved by
>>>>>> the end of the process.
>>>>>> 
>>>>>> I am also concerned about the correctness S3 implementation.  Why did
>>>>>> you choose to directly access the S3 HTTP API rather using the client
>>>>>> library?
>>>>>> 
>>>>>> Thanks,
>>>>>> -John
>>>>>> 
>>>>>> On May 22, 2013, at 5:25 PM, Edison Su <edison...@citrix.com> wrote:
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Chip Childers [mailto:chip.child...@sungard.com]
>>>>>>>> Sent: Wednesday, May 22, 2013 1:26 PM
>>>>>>>> To: dev@cloudstack.apache.org
>>>>>>>> Subject: Re: [MERGE]object_store branch into master
>>>>>>>> 
>>>>>>>> On Wed, May 22, 2013 at 08:15:41PM +0000, Animesh Chaturvedi wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Chip Childers [mailto:chip.child...@sungard.com]
>>>>>>>>>> Sent: Wednesday, May 22, 2013 12:08 PM
>>>>>>>>>> To: dev@cloudstack.apache.org
>>>>>>>>>> Subject: Re: [MERGE]object_store branch into master
>>>>>>>>>> 
>>>>>>>>>> On Wed, May 22, 2013 at 07:00:51PM +0000, Animesh Chaturvedi
>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: John Burwell [mailto:jburw...@basho.com]
>>>>>>>>>>>> Sent: Tuesday, May 21, 2013 8:51 AM
>>>>>>>>>>>> To: dev@cloudstack.apache.org
>>>>>>>>>>>> Subject: Re: [MERGE]object_store branch into master
>>>>>>>>>>>> 
>>>>>>>>>>>> Edison,
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks, I will start going through it today.  Based on other
>>>>>>>>>>>> $dayjob responsibilities, it may take me a couple of days.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> -John
>>>>>>>>>>> [Animesh>] John we are just a few days away  from 4.2 feature
>>>>>>>>>>> freeze, can
>>>>>>>>>> you provide your comments by Friday 5/24.   I would like all
>>>>>>>>>> feature
>>>>>>>> threads
>>>>>>>>>> to be resolved sooner so that we don't have last minute rush.
>>>>>>>>>> 
>>>>>>>>>> I'm just going to comment on this, but not take it much
>>>>>>>>>> further...
>>>>>>>>>> this type of change is an "architectural" change.  We had
>>>>>>>>>> previously
>>>>>>>>>> discussed (on several
>>>>>>>>>> threads) that the appropriate time for this sort of thing to hit
>>>>>>>>>> master was
>>>>>>>>>> *early* in the release cycle.  Any reason that that consensus
>>>>>>>>>> doesn't apply here?
>>>>>>>>> [Animesh>] Yes it is an architectural change and discussion on
>>>>>>>>> this
>>>>>>>>> started a
>>>>>>>> few weeks back already, Min and Edison wanted to get it in sooner
>>>>>>>> by
>>>>>>>> 4/30
>>>>>>>> but it took longer than anticipated in  preparing for merge and
>>>>>>>> testing on
>>>>>>>> feature branch.
>>>>>>>> 
>>>>>>>> You're not following me I think.  See this thread on the Javelin
>>>>>>>> merge:
>>>>>>>> 
>>>>>>>> http://markmail.org/message/e6peml5ddkqa6jp4
>>>>>>>> 
>>>>>>>> We have discussed that our preference is for architectural changes
>>>>>>>> to
>>>>>>>> hit
>>>>>>>> master shortly after a feature branch is cut.  Why are we not doing
>>>>>>>> that here?
>>>>>>> 
>>>>>>> This kind of refactor takes time, a lot of time. I think I worked on
>>>>>>> the merge of primary storage refactor into master and bug fixes
>>>>>>> during
>>>>>>> 
>>>>>>> 
>>>>>>> March(http://comments.gmane.org/gmane.comp.apache.cloudstack.devel/14
>>>>>>> 46
>>>>>>> 9)
>>>>>>> , then started to work on the secondary storage refactor in
>>>>>>> April(http://markmail.org/message/cspb6xweeupfvpit). Min and I
>>>>>>> finished
>>>>>>> the coding at end of April, then tested for two weeks, send out the
>>>>>>> merge request at middle of May.
>>>>>>> With the refactor, the  storage code will be much cleaner, and the
>>>>>>> performance of S3 will be improved, and integration with other
>>>>>>> storage
>>>>>>> vendor will be much easier, and the quality is ok(33 bugs fired,
>>>>>>> only
>>>>>>> 5
>>>>>>> left: 
>>>>>>> 
>>>>>>> 
>>>>>>> https://issues.apache.org/jira/issues/?jql=text%20~%20%22Object_Store
>>>>>>> _R
>>>>>>> ef
>>>>>>> actor%22). Anyway, it's up to the community to decide, merge it or
>>>>>>> not,
>>>>>>> we already tried our best to get it done ASAP.
>>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>

Re: [MERGE]object_store branch into master

Reply via email to