Min, Yes, I feel that class needs to be fixed before it can be merged into master.
Thanks, -John On May 23, 2013, at 4:54 PM, Min Chen <min.c...@citrix.com> wrote: > Thanks for your suggestions, John. Current code is working fine > functionality-wise, can we do such code cleanup and optimization > post-merge? > > -min > > On 5/23/13 12:54 PM, "John Burwell" <jburw...@basho.com> wrote: > >> Min, >> >> TL;DR There are rare circumstances where directly access the AWS HTTP API >> makes sense, but generally, you should use a client. >> >> If you need the size of an S3 object in advance, I recommend using >> AmazonS3Client#getObjectMeta method. Also, large swathes of the class >> appear to be copied directly from HTTPTemplateDownloader and don't make >> sense when interacting S3. For example, S3 does not support chunked HTTP >> operations or HTTP authentication. As such, this class could be greatly >> simplified to operate in terms of S3Utils (likely with a few additional >> methods) to get this information. >> >> In terms of progress tracking of downloads and transfer operations, I >> recommend retrofitting S3Utils to leverage the TransferManager mechanism. >> In addition to providing progress information (TransferProgress), it >> also provides various operation optimizations (particularly for uploads). >> It also simplifies the interaction tremendously (e.g. it builds the >> PutRequest objects for you). >> >> Thanks, >> -John >> >> On May 23, 2013, at 12:12 PM, Min Chen <min.c...@citrix.com> wrote: >> >>> John, sorry for my ignorance. In S3TemplateDownloader.download, we are >>> only using HttpClient method to get the size and InputStream from a >>> given >>> URL, then we are invoking S3 client library putObject(PutObjectRequest) >>> to >>> download the content to S3. By using PutObjectRequest, I can set >>> ProgressListener to show download progress to end user. What is your >>> recommendation in this case? >>> >>> Thanks >>> -min >>> >>> On 5/23/13 7:20 AM, "John Burwell" <jburw...@basho.com> wrote: >>> >>>> Min, >>>> >>>> The com.cloud.storage.template.S3TemplateDownloader is directly >>>> accessing >>>> the S3 API using HTTP client. >>>> >>>> Thanks, >>>> John >>>> >>>> On May 22, 2013, at 5:57 PM, Min Chen <min.c...@citrix.com> wrote: >>>> >>>>> John, >>>>> >>>>> Can you clarify a bit on your last comment about directly accessing >>>>> S3 >>>>> HTTP API? We are only invoking routines in S3Utils to perform >>>>> operations >>>>> with S3, not invoke any REST api if that is what you meant. >>>>> >>>>> Thanks >>>>> -min >>>>> >>>>> On 5/22/13 2:49 PM, "John Burwell" <jburw...@basho.com> wrote: >>>>> >>>>>> Edison, >>>>>> >>>>>> For changes that take as long as described, it should be expected >>>>>> that >>>>>> the review will take a proportional amount of time. In future >>>>>> releases, we should think through ways to divide changes such as >>>>>> these >>>>>> into a set of smaller patches submitted throughout the course of the >>>>>> release cycle. >>>>>> >>>>>> So far, I can say I am very concerned about failure scenarios and >>>>>> potential race conditions around the NFS cache. However, I am only a >>>>>> quarter of the way through the code so my concerns may be resolved by >>>>>> the end of the process. >>>>>> >>>>>> I am also concerned about the correctness S3 implementation. Why did >>>>>> you choose to directly access the S3 HTTP API rather using the client >>>>>> library? >>>>>> >>>>>> Thanks, >>>>>> -John >>>>>> >>>>>> On May 22, 2013, at 5:25 PM, Edison Su <edison...@citrix.com> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Chip Childers [mailto:chip.child...@sungard.com] >>>>>>>> Sent: Wednesday, May 22, 2013 1:26 PM >>>>>>>> To: dev@cloudstack.apache.org >>>>>>>> Subject: Re: [MERGE]object_store branch into master >>>>>>>> >>>>>>>> On Wed, May 22, 2013 at 08:15:41PM +0000, Animesh Chaturvedi wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>> From: Chip Childers [mailto:chip.child...@sungard.com] >>>>>>>>>> Sent: Wednesday, May 22, 2013 12:08 PM >>>>>>>>>> To: dev@cloudstack.apache.org >>>>>>>>>> Subject: Re: [MERGE]object_store branch into master >>>>>>>>>> >>>>>>>>>> On Wed, May 22, 2013 at 07:00:51PM +0000, Animesh Chaturvedi >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>> From: John Burwell [mailto:jburw...@basho.com] >>>>>>>>>>>> Sent: Tuesday, May 21, 2013 8:51 AM >>>>>>>>>>>> To: dev@cloudstack.apache.org >>>>>>>>>>>> Subject: Re: [MERGE]object_store branch into master >>>>>>>>>>>> >>>>>>>>>>>> Edison, >>>>>>>>>>>> >>>>>>>>>>>> Thanks, I will start going through it today. Based on other >>>>>>>>>>>> $dayjob responsibilities, it may take me a couple of days. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> -John >>>>>>>>>>> [Animesh>] John we are just a few days away from 4.2 feature >>>>>>>>>>> freeze, can >>>>>>>>>> you provide your comments by Friday 5/24. I would like all >>>>>>>>>> feature >>>>>>>> threads >>>>>>>>>> to be resolved sooner so that we don't have last minute rush. >>>>>>>>>> >>>>>>>>>> I'm just going to comment on this, but not take it much >>>>>>>>>> further... >>>>>>>>>> this type of change is an "architectural" change. We had >>>>>>>>>> previously >>>>>>>>>> discussed (on several >>>>>>>>>> threads) that the appropriate time for this sort of thing to hit >>>>>>>>>> master was >>>>>>>>>> *early* in the release cycle. Any reason that that consensus >>>>>>>>>> doesn't apply here? >>>>>>>>> [Animesh>] Yes it is an architectural change and discussion on >>>>>>>>> this >>>>>>>>> started a >>>>>>>> few weeks back already, Min and Edison wanted to get it in sooner >>>>>>>> by >>>>>>>> 4/30 >>>>>>>> but it took longer than anticipated in preparing for merge and >>>>>>>> testing on >>>>>>>> feature branch. >>>>>>>> >>>>>>>> You're not following me I think. See this thread on the Javelin >>>>>>>> merge: >>>>>>>> >>>>>>>> http://markmail.org/message/e6peml5ddkqa6jp4 >>>>>>>> >>>>>>>> We have discussed that our preference is for architectural changes >>>>>>>> to >>>>>>>> hit >>>>>>>> master shortly after a feature branch is cut. Why are we not doing >>>>>>>> that here? >>>>>>> >>>>>>> This kind of refactor takes time, a lot of time. I think I worked on >>>>>>> the merge of primary storage refactor into master and bug fixes >>>>>>> during >>>>>>> >>>>>>> >>>>>>> March(http://comments.gmane.org/gmane.comp.apache.cloudstack.devel/14 >>>>>>> 46 >>>>>>> 9) >>>>>>> , then started to work on the secondary storage refactor in >>>>>>> April(http://markmail.org/message/cspb6xweeupfvpit). Min and I >>>>>>> finished >>>>>>> the coding at end of April, then tested for two weeks, send out the >>>>>>> merge request at middle of May. >>>>>>> With the refactor, the storage code will be much cleaner, and the >>>>>>> performance of S3 will be improved, and integration with other >>>>>>> storage >>>>>>> vendor will be much easier, and the quality is ok(33 bugs fired, >>>>>>> only >>>>>>> 5 >>>>>>> left: >>>>>>> >>>>>>> >>>>>>> https://issues.apache.org/jira/issues/?jql=text%20~%20%22Object_Store >>>>>>> _R >>>>>>> ef >>>>>>> actor%22). Anyway, it's up to the community to decide, merge it or >>>>>>> not, >>>>>>> we already tried our best to get it done ASAP. >>>>>>> >>>>> >>>> >>> >> >