Thanks for your suggestions, John. Current code is working fine functionality-wise, can we do such code cleanup and optimization post-merge?
-min On 5/23/13 12:54 PM, "John Burwell" <jburw...@basho.com> wrote: >Min, > >TL;DR There are rare circumstances where directly access the AWS HTTP API >makes sense, but generally, you should use a client. > >If you need the size of an S3 object in advance, I recommend using >AmazonS3Client#getObjectMeta method. Also, large swathes of the class >appear to be copied directly from HTTPTemplateDownloader and don't make >sense when interacting S3. For example, S3 does not support chunked HTTP >operations or HTTP authentication. As such, this class could be greatly >simplified to operate in terms of S3Utils (likely with a few additional >methods) to get this information. > >In terms of progress tracking of downloads and transfer operations, I >recommend retrofitting S3Utils to leverage the TransferManager mechanism. > In addition to providing progress information (TransferProgress), it >also provides various operation optimizations (particularly for uploads). > It also simplifies the interaction tremendously (e.g. it builds the >PutRequest objects for you). > >Thanks, >-John > >On May 23, 2013, at 12:12 PM, Min Chen <min.c...@citrix.com> wrote: > >> John, sorry for my ignorance. In S3TemplateDownloader.download, we are >> only using HttpClient method to get the size and InputStream from a >>given >> URL, then we are invoking S3 client library putObject(PutObjectRequest) >>to >> download the content to S3. By using PutObjectRequest, I can set >> ProgressListener to show download progress to end user. What is your >> recommendation in this case? >> >> Thanks >> -min >> >> On 5/23/13 7:20 AM, "John Burwell" <jburw...@basho.com> wrote: >> >>> Min, >>> >>> The com.cloud.storage.template.S3TemplateDownloader is directly >>>accessing >>> the S3 API using HTTP client. >>> >>> Thanks, >>> John >>> >>> On May 22, 2013, at 5:57 PM, Min Chen <min.c...@citrix.com> wrote: >>> >>>> John, >>>> >>>> Can you clarify a bit on your last comment about directly accessing >>>>S3 >>>> HTTP API? We are only invoking routines in S3Utils to perform >>>>operations >>>> with S3, not invoke any REST api if that is what you meant. >>>> >>>> Thanks >>>> -min >>>> >>>> On 5/22/13 2:49 PM, "John Burwell" <jburw...@basho.com> wrote: >>>> >>>>> Edison, >>>>> >>>>> For changes that take as long as described, it should be expected >>>>>that >>>>> the review will take a proportional amount of time. In future >>>>> releases, we should think through ways to divide changes such as >>>>>these >>>>> into a set of smaller patches submitted throughout the course of the >>>>> release cycle. >>>>> >>>>> So far, I can say I am very concerned about failure scenarios and >>>>> potential race conditions around the NFS cache. However, I am only a >>>>> quarter of the way through the code so my concerns may be resolved by >>>>> the end of the process. >>>>> >>>>> I am also concerned about the correctness S3 implementation. Why did >>>>> you choose to directly access the S3 HTTP API rather using the client >>>>> library? >>>>> >>>>> Thanks, >>>>> -John >>>>> >>>>> On May 22, 2013, at 5:25 PM, Edison Su <edison...@citrix.com> wrote: >>>>> >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Chip Childers [mailto:chip.child...@sungard.com] >>>>>>> Sent: Wednesday, May 22, 2013 1:26 PM >>>>>>> To: dev@cloudstack.apache.org >>>>>>> Subject: Re: [MERGE]object_store branch into master >>>>>>> >>>>>>> On Wed, May 22, 2013 at 08:15:41PM +0000, Animesh Chaturvedi wrote: >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: Chip Childers [mailto:chip.child...@sungard.com] >>>>>>>>> Sent: Wednesday, May 22, 2013 12:08 PM >>>>>>>>> To: dev@cloudstack.apache.org >>>>>>>>> Subject: Re: [MERGE]object_store branch into master >>>>>>>>> >>>>>>>>> On Wed, May 22, 2013 at 07:00:51PM +0000, Animesh Chaturvedi >>>>>>>>>wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: John Burwell [mailto:jburw...@basho.com] >>>>>>>>>>> Sent: Tuesday, May 21, 2013 8:51 AM >>>>>>>>>>> To: dev@cloudstack.apache.org >>>>>>>>>>> Subject: Re: [MERGE]object_store branch into master >>>>>>>>>>> >>>>>>>>>>> Edison, >>>>>>>>>>> >>>>>>>>>>> Thanks, I will start going through it today. Based on other >>>>>>>>>>> $dayjob responsibilities, it may take me a couple of days. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> -John >>>>>>>>>> [Animesh>] John we are just a few days away from 4.2 feature >>>>>>>>>> freeze, can >>>>>>>>> you provide your comments by Friday 5/24. I would like all >>>>>>>>> feature >>>>>>> threads >>>>>>>>> to be resolved sooner so that we don't have last minute rush. >>>>>>>>> >>>>>>>>> I'm just going to comment on this, but not take it much >>>>>>>>>further... >>>>>>>>> this type of change is an "architectural" change. We had >>>>>>>>> previously >>>>>>>>> discussed (on several >>>>>>>>> threads) that the appropriate time for this sort of thing to hit >>>>>>>>> master was >>>>>>>>> *early* in the release cycle. Any reason that that consensus >>>>>>>>> doesn't apply here? >>>>>>>> [Animesh>] Yes it is an architectural change and discussion on >>>>>>>>this >>>>>>>> started a >>>>>>> few weeks back already, Min and Edison wanted to get it in sooner >>>>>>>by >>>>>>> 4/30 >>>>>>> but it took longer than anticipated in preparing for merge and >>>>>>> testing on >>>>>>> feature branch. >>>>>>> >>>>>>> You're not following me I think. See this thread on the Javelin >>>>>>> merge: >>>>>>> >>>>>>> http://markmail.org/message/e6peml5ddkqa6jp4 >>>>>>> >>>>>>> We have discussed that our preference is for architectural changes >>>>>>>to >>>>>>> hit >>>>>>> master shortly after a feature branch is cut. Why are we not doing >>>>>>> that here? >>>>>> >>>>>> This kind of refactor takes time, a lot of time. I think I worked on >>>>>> the merge of primary storage refactor into master and bug fixes >>>>>>during >>>>>> >>>>>> >>>>>>March(http://comments.gmane.org/gmane.comp.apache.cloudstack.devel/14 >>>>>>46 >>>>>> 9) >>>>>> , then started to work on the secondary storage refactor in >>>>>> April(http://markmail.org/message/cspb6xweeupfvpit). Min and I >>>>>> finished >>>>>> the coding at end of April, then tested for two weeks, send out the >>>>>> merge request at middle of May. >>>>>> With the refactor, the storage code will be much cleaner, and the >>>>>> performance of S3 will be improved, and integration with other >>>>>>storage >>>>>> vendor will be much easier, and the quality is ok(33 bugs fired, >>>>>>only >>>>>> 5 >>>>>> left: >>>>>> >>>>>> >>>>>>https://issues.apache.org/jira/issues/?jql=text%20~%20%22Object_Store >>>>>>_R >>>>>> ef >>>>>> actor%22). Anyway, it's up to the community to decide, merge it or >>>>>> not, >>>>>> we already tried our best to get it done ASAP. >>>>>> >>>> >>> >> >