On Tue, May 28, 2013 at 11:42:56AM -0400, John Burwell wrote: > All, > > I have gone a through a large chunk of this patch, and published my review > thus far (https://reviews.apache.org/r/11277/). TL;DR is that this patch > has a number of significant issues which can be summarized as follows: > > 1. While it appeas that the requirement of NFS for secondary storage has > largely been removed, it has basically been if blocked out instead of pushed > down as an detail choice in the physical layer. Rather than exploiting > polymorpish to vary behavior through a set of higher level abstracttions, the > orchestration performs instanceof NFSTO checks. The concern is that future > evolution of secondary storage layer will still have dependencies on NFS. > > 2. In some sceanrios, NFS is still a requirement for secondary storage. In > particular, Xen users will still have to maintain an "NFS cache". Given the > degradation of capability, I think it would be helpful to put a few more > eyeballs on the Xen implementation to determine if we could avoid using an > intermediate file system. > > 3. I have the following concerns regarding potential race conditions and > resource exhaustion in the NFS cache implementation. > > - The current random allocator algorithm does not account for the amount > space that will be required for the operation (i.e. checking to ensure that > the cache it picks has enough space to transfer to hold the object being > downloaded) nor does it reserve the space.Given the long (in compute time) > running nature of these of processes, the available space in a cache could be > exhausted by a number of concurrently transfering templates and/or snapshots. > By reserving space before the transfer, the allocator would be able to > account for both pending operations and the current contents of the cache. > - There appears no mechanism to age out contents of the cache. > Therefore, as implemented, it will grow unbounded. The only workaround for > this problem would be to have an NFS cache whose size equals that of the > object store. > - The mechanism lacks robust error handling or retry logic. In > particular, the behavior if/when a cache exhausts available space appears > non-deterministic. > - Generally, I see little consideration for alternative/exception flows. > For example, what happens if I attempt to use a template/iso/snapshot in > transit to the object store to/from the cache? Since these files can be very > large (multiple GBs), we have assume some transfer latency in all > interactions. > > 4. The Image Cache abstraction is too tightly coupled to the core > orchestration mechanism. I would recommend implementing it as a DataStore > proxy that is applied as necessary. For example, the current Xen hypervisor > implementation must interact with a file system. It seems most appropriate > that the Xen hypervisor implementation would apply the proxy to its secondary > storage DataStore instances. Therefore, the storage engine should only > provide the facility not attempt to determine when it is needed. > Additionally, a proxy model would greatly reduce the size and complexity of > the various classes (e.g. AncientDataMotionStrategy). > > 5. While I have only reviewed roughly 50-60% of the patch thus far, I see > little to no additional unit tests for the new code added. Integration tests > have been expanded, but many of the test simply execute the service methods > and do not verify the form of the operation results. There are also a > tremendous number of ignored exceptions that should likely fail these tests. > Therefore, we could have tests passing due to an ignored exception that > should be failing. > > While I recognize the tremendous effort that has been expended to implement > this capability and value of the feature, I see significant design and > operational issues that must be addressed before it can be accepted into > master. In my opinion, the object_store represents a good first step, but it > needs a few more review/refinement iterations before it will be ready for a > master merge. > > Thanks, > -John
IMO, at a minimum, John's comments above (and items noted in reviewboard) should be addressed / discussed before any merge occurs.