Ben, thanks a lot for this wrap-up, much appreciated!
2015-11-10 22:25 GMT+01:00 Ben Swartzlander <b...@swartzlander.org>:
> I wasn't going to write a wrap-up email this time around since so many
> people were able to attend in person, but enough people missed it that I
> changed my mind and decided to write down my own impressions of the
> sessions.
>
> Wednesday working session: Migration Improvements
> -------------------------------------------------
> During this session we covered the status of the migration feature so far
> (it's merged but experimental) and the existing gaps:
> 1) Will fail on shares on most backends with share servers
> 2) Controller node in the data path -- needs a data copy service
> 3) No implementations of optimized migration yet
> 4) Confusion around task_state vs state
> 5) Need to disable most existing operations during a migration
> 6) Possibly need to change the driver interface for getting mount info
>
> Basically there is a lot of work left to be done on migration, but we're
> happy with the direction it's going. If we can address the gaps we could
> make the APIs supported in Mitaka. We're eager to get to building the
> valuable APIs on top of migration, but we can't do that until migration
> itself is solid.
>
> I also suggested that migration might benefit from an API change to allow
> a 2-phase migration, which would let the user (the admin in this case)
> control when the final cutover happens instead of letting it happen by
> surprise.
>
> Wednesday working session: Access Allow/Deny Driver Interface
> -------------------------------------------------------------
> During this session I proposed a new driver interface for allowing/denying
> access to shares which is a single "sync access" API that the manager would
> call and pass all of the rules to.
The main benefits of the change would be:
> 1) More reliable cleanup of errors
> 2) Support for atomically updating multiple rules
> 3) Simpler/more efficient implementation on some backends
>
> Most vendors agreed that the new interface would be superior, and that it
> would be simpler and more efficient than the existing one.
>
> There were some who were unsure, and one vendor specifically said an access
> sync would be inefficient compared to the current allow/deny semantics. We
> need to see if we can provide enough information in the new interface to
> let them be more efficient (such as providing the new rules AND the
> diff against the old rules).
>
> It was also pointed out that error reporting would be different using this
> new interface, because errors applying rules couldn't be associated with
> the specific rule that caused them. We need a solution to that problem.
>
> Thursday fishbowl session: Share Replication
> --------------------------------------------
> There was a demo of the POC code from NetApp and general education on the
> new design. Since the idea was new to some and the implementation was new
> to everyone, there was not a lot of feedback.
>
> We did discuss a few issues, such as whether multiple shares should be
> allowed in a single AZ.
>
> We agreed that this replication feature will be exposed to the same kind
> of race conditions that exist in active/active HA, so there is additional
> pressure to solve the distributed locking problem. Fortunately the
> community seems to be converging on a solution to that problem -- the tooz
> library.
>
> We agreed that support for replication in a first party driver is
> essential for the feature to be accepted -- otherwise developers who don't
> have proprietary storage systems would be unable to develop/test the
> feature.
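Coming back to the access sync session for a moment: the "new rules AND the diff against the old rules" idea could be computed by the share manager before calling the driver. Here's a minimal sketch of that diff; the function name and the rule shape (type/to/level keys) are hypothetical, not Manila's actual driver API:

```python
# Sketch: compute the diff between old and new access rules so a
# single "sync access" driver call can also receive incremental
# changes. The rule dict layout here is illustrative only.

def diff_access_rules(old_rules, new_rules):
    """Return (to_add, to_remove) between two access rule lists."""
    old = {(r["type"], r["to"], r["level"]) for r in old_rules}
    new = {(r["type"], r["to"], r["level"]) for r in new_rules}
    keys = ("type", "to", "level")
    to_add = [dict(zip(keys, r)) for r in sorted(new - old)]
    to_remove = [dict(zip(keys, r)) for r in sorted(old - new)]
    return to_add, to_remove
```

With both the full rule set and the diff available, a backend that is efficient with allow/deny semantics can apply just the incremental changes, while others can replace the whole rule set atomically.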
>
> Thursday fishbowl session: Alternative Snapshot Semantics
> ---------------------------------------------------------
> During this session I proposed 2 new things you can do with snapshots:
> 1) Revert a share to a snapshot
> 2) Export snapshots directly as readonly shares
>
> For reverting snapshots, we agreed that the operation should preserve all
> existing snapshots. If a backend is unable to revert without deleting
> snapshots, it should not advertise the capability.
>
> For mounting snapshots, it was pointed out that we need to define the
> access rules for the share. I proposed simply inheriting the rules from the
> parent share with rw rules squashed to ro. That approach has downsides
> though, because it links the access on the snapshot to the access on the
> share (which may not be desired) and also forces us to pass a list of
> snapshots into the access calls so the driver can update snapshot access
> when updating share access.
>
> Sage proposed creating a new concept of a readonly share and simply
> overloading the existing create-share-from-snapshot logic with a -readonly
> flag, which gives us the semantics we want with much less complexity. The
> downside to that approach is that we will have to add code to handle
> readonly shares.
>
> There was an additional proposal to allow the create/delete snapshot calls
> without any other snapshot-related calls, because some backends have
> in-band snapshot semantics. This is probably unnecessary because every
> backend that has snapshots is likely to support at least one of the
> proposed semantics, so we don't need another mode.
>
> Thursday working session: Export Location Metadata
> --------------------------------------------------
> In this session we discussed the idea Jason proposed back at the winter
> midcycle meetup to allow drivers to tag export locations with various types
> of metadata which is meaningful to clients of Manila. There were several
> proposed use cases discussed.
>
> The main thing we agreed on was that the metadata itself seems like a good
> thing as long as the metadata keys and values are standardized. We didn't
> like the possibility of vendor-defined or admin-defined metadata appearing
> in the export location API.
>
> One use case discussed was the idea of preferred/non-preferred export
> locations. This use case makes sense and nobody was against it.
>
> Another use case discussed was "readonly" export locations, which might
> allow certain drivers to use different export locations for writing and
> reading. There was debate about how much sense this made.
>
> The third discussed use case was Jason's original suggestion: locality
> information on different export locations to enable clients to determine
> which export location is closer to them. We were generally opposed to this
> idea because it's not clear how locality information would be expressed. We
> didn't like admin-defined values, and the only standard thing we have is
> AZs, which are already exposed through a different mechanism.
>
> Some time was spent in this session discussing the forthcoming Ceph driver
> and its special requirements. We discussed the possibility of dual-protocol
> access to shares (Ceph+NFS in this case). Dual protocol access was
> previously rejected (NFS+CIFS) due to concerns about interoperability. We
> still need to decide if we want to allow Ceph+NFS as a special case, based
> on the idea that Ceph shares would always support NFS and there's unlikely
> to ever be a second Ceph driver. If we allow this, then it would make sense
> to expose the share protocol as a metadata value.
>
> Lastly we discussed the Ceph key-sharing requirement, where people want to
> use Manila to discover the Ceph secret. That would require adding some new
> metadata, but on the access rules, not on the export locations.
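On the preferred/non-preferred use case above: from a client's point of view, picking a mount path given metadata-tagged export locations could look roughly like this. A minimal sketch; the data layout and the "preferred" key are assumptions, not the settled API:

```python
# Sketch: choose a mount path from export locations carrying
# standardized metadata. Only the "preferred" key discussed in the
# session is modeled; the dict layout is hypothetical.

def choose_export_location(export_locations):
    """Prefer locations whose metadata marks them as preferred."""
    preferred = [el for el in export_locations
                 if el.get("metadata", {}).get("preferred") is True]
    candidates = preferred or export_locations
    return candidates[0]["path"] if candidates else None
```

A client falls back to any available location when none is marked preferred, which is why nobody objected to this use case: it degrades gracefully.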
>
> Thursday working session: Interactions Between New Features
> -----------------------------------------------------------
> In this session we considered the possible interactions between share
> migration, consistency groups, and share replication (the 3 new
> experimental features).
>
> We quickly concluded that in the current code, bad things are likely to
> happen if you use 2 of these features at the same time, so as a top
> priority we must prevent that and return a sensible error message instead
> of allowing undefined behavior.
>
> We spent the rest of the session discussing how the features should
> interact, and concluded that enabling the behaviors we want requires
> significant new code for all of the pairings.
>
> Migration+CGs: requires the concept of a CG instance, or some way of
> tracking which share instances make up the original CG and the migrated CG.
> Alternatively, requires the ability to disband and reconstruct CGs. A
> blueprint with an actual design is needed.
>
> Migration+Replication: can probably be implemented by simply migrating the
> primary (active) replica to the destination and re-replicating from there.
> This requires significant new code in the migration paths though, because
> they'll need to rebuild the replication topology on the destination side.
> Also, for safety the migration should not complete until the destination
> side is fully replicated, to avoid the chance of a failure immediately
> after migration causing a loss of access. There may be opportunities for
> optimized versions of the above, especially when cross-AZ bandwidth is
> limited and the migration is within an AZ. More thought is needed, and a
> blueprint should spell out the design.
>
> Replication+CGs: it doesn't make a lot of sense to replicate individual
> shares from a CG -- more likely users will want to replicate the whole CG.
> This is an assumption though and we have no supporting data.
Either way,
> replication at the granularity of a CG would require more logic to schedule
> the replica CGs before scheduling the share replicas. This is likely to be
> significant new code and will need a blueprint.
>
> Replication+CGs+Migration: this was proposed as a joke, but it's a serious
> concern. The above designs should consider what happens if we have a
> replicated CG and we wish to migrate it. If the above designs are done
> carefully we should get correct behavior here for free.
>
> Friday contributor meetup
> -------------------------
> On Friday we quickly reviewed the above topics for those that missed
> earlier sessions, then we launched into the laundry list of topics that we
> weren't able to fit into design sessions/working sessions.
>
> QoS: Zhongjun proposed a QoS implementation similar to Cinder's. After a
> brief discussion, there was no agreement that Cinder's model was a good
> one to follow, as QoS was introduced to Cinder before some other later
> enhancements such as standardized extra specs. We're inclined to use
> standardized extra specs instead of QoS policies in Manila. We still need
> to agree on which QoS-related extra specs we should standardize. There are
> 2 criteria for standardizing a QoS-related extra spec: (1) it should
> mean the same thing across vendors, e.g. max-bytes-per-second, and (2) it
> should be something that's widely implemented. I expect we'll see lots of
> vendor-specific QoS-related extra specs, and we need to make sure it's
> possible to mix vendors in the same share type and assign a QoS policy
> that's equivalent across both.
>
> Minimum required features: The main open question about minimum features
> was access control types for each protocol. We agreed that for CIFS,
> user-based access control is required, and IP-based access control is not.
> Furthermore, support for read-only access rules is required.
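To illustrate the standardized-QoS-extra-specs idea above: because a standardized spec means the same thing on every backend, one share type can span vendors by filtering on the common capability. A minimal sketch; the spec name and capability layout are hypothetical, not an agreed standard:

```python
# Sketch: pick the backends whose reported capabilities satisfy the
# QoS extra specs of a share type. "qos:max_bytes_per_second" is an
# example of a spec that means the same thing across vendors; the
# names and data shapes here are illustrative only.

def backends_satisfying_qos(share_type_specs, backends):
    """Return names of backends meeting every numeric QoS spec."""
    matches = []
    for name, caps in backends.items():
        if all(k in caps and caps[k] >= v
               for k, v in share_type_specs.items()):
            matches.append(name)
    return matches
```

The point of criterion (1) is exactly this: the comparison only makes sense if both vendors report the capability in the same units with the same meaning.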
>
> Improving gate performance/stability: We briefly discussed the gate issues
> that plagued us during Liberty and our plan to address them, which is to
> add more first party drivers that don't depend on quickly-evolving
> projects. The big offender during Liberty was Neutron, although Nova and
> Cinder have both bitten us in the past. To be clear, the existing Generic
> driver is not going away, and will still be QA'd, but we would rather not
> make it the gating driver.
>
> Manila HA: We briefly discussed tooz and agreed that we will use it in
> Manila to address race conditions that exist in active-active HA scenarios,
> as well as race conditions introduced by the replication code.
>
> Multiple share servers: There was some confusion about what a share server
> is. Some assumed that it would require multiple Manila "share servers" to
> implement a highly available shared storage system. In reality, the "share
> server" in Manila is a collection of resources which can include an
> arbitrary amount of underlying physical or virtual resources -- enough to
> implement a highly available solution with a single share server.
>
> Rolling upgrades: We punted on this again. There was a discussion about
> how slow the progress in Cinder appears to be, and we don't want to get
> trapped in limbo for multiple cycles. However, if the work will unavoidably
> take multiple cycles then we need to know that so we can plan accordingly.
> Support for rolling upgrades is still viewed as desirable, but the team is
> worried about the apparent implementation cost.
>
> More mount automation: We covered what was done during Liberty (2
> approaches) and briefly discussed the new approach Sage suggested. The
> nova-metadata approach that Clinton proposed in Vancouver, which was not
> accepted by the Nova team, will probably get a warmer reception based on
> the additional integration Sage is proposing. We should re-propose it and
> continue with our existing plans.
There was also broad support for Sage's
> NFS-over-vsock proposal and Nova share-attach semantics.
>
> Key-based authentication: We went into more detail on John's Ceph-driver
> requirements, and why he thinks it makes sense to communicate secrets
> from the Ceph backend to the end user through Manila. We didn't really
> reach a decision, but nobody was strongly against the change John proposed.
> I think we're still interested in alternative proposals if anyone has one.
>
> Make all deleted columns booleans: We discussed soft deletes and the fact
> that they're not implemented consistently in Manila today. We also
> questioned the value of soft deletes and the reasoning for why we use them.
> Some believed there was a performance benefit, and others suggested that it
> had more to do with preservation of history for auditing and
> troubleshooting. Depending on the real motivation, this proposal may need
> to be scrapped or modified.
>
> Replication 2.0: We discussed the remaining work for the share replication
> feature before it could be accepted. There were 2 main issues: support for
> share servers, and a first party driver implementation of the feature.
> There was some dispute about the value of a first party implementation, but
> the strongest argument in favor was the need for community members who
> don't have access to proprietary hardware to be able to maintain the
> replication code.
>
> Functional tests for network plugins: We discussed the fact that the
> existing network plugins aren't covered by tests in the gate. For
> nova-network, we decided that was acceptable since that functionality is
> deprecated. For the neutron network plugin and the standalone network
> plugin, we need additional test jobs that exercise them.
>
> Capability lists: The addition of standardized extra specs pointed out a
> gap, which is that with some extra specs a backend should be able to
> advertise both the positive and negative capability.
In the past vendors
> have achieved that with gross pairs of vendor-specific extra specs (e.g.
> netapp_dedupe and netapp_no_dedupe). We agreed it would make more sense for
> the backend to simply advertise dedupe=[False,True]. Changes to the filter
> scheduler are needed to allow capability lists, so Clinton volunteered to
> implement those changes.
>
> Client microversions: There wasn't much to discuss on the topic of
> microversions. The client patches are in progress. Cinder seems to be
> headed down a similar path. We are happy with the feature so far.
>
> Fix delete operations: We discussed what the force-delete APIs should and
> shouldn't do. It was agreed that force-delete means "remove it from the
> Manila DB no matter what, and make a best effort attempt to delete it off
> the backend". Some changes are needed to support that semantic. We also
> discussed the common problem of "stuck" shares and how to clean them up. We
> agreed that admins should typically use the reset-state API and retry an
> ordinary delete. The force-delete approach can leave garbage behind on the
> backend. The reset-state and retry-delete approach should never leave
> garbage behind, so it's safer to use.
>
> Remove all extensions: We discussed the current effort to move extensions
> into core. This work is mostly done and not much discussion was needed.
>
> Removing task state: We agreed to remove the task-state column introduced
> by migration and use ordinary states for migration.
>
> Interaction with Nova attach file system API: We went into more detail on
> Sage's Nova file-system-attach proposal and concluded that it should "just
> work" without changes from Manila.
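The capability-list idea mentioned above (dedupe=[False,True] instead of paired vendor specs) amounts to a small change in how the filter scheduler matches a requested extra spec against an advertised capability. A minimal sketch of that matching, not Manila's actual filter code:

```python
# Sketch: scheduler-side matching where a backend may advertise a
# capability either as a scalar (dedupe=True) or as a list of all
# modes it supports (dedupe=[False, True]). The helper is
# illustrative; names are hypothetical.

def capability_matches(requested, advertised):
    """True if the requested value is among the advertised value(s)."""
    if isinstance(advertised, list):
        return requested in advertised
    return requested == advertised
```

With this rule, a share type requesting dedupe=True matches a backend advertising dedupe=[False, True], and the gross netapp_dedupe/netapp_no_dedupe style spec pairs become unnecessary.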
>
>
> -Ben Swartzlander
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Dr. Silvan Kaiser
Quobyte GmbH
Hardenbergplatz 2, 10623 Berlin - Germany
+49-30-814 591 800 - www.quobyte.com
Amtsgericht Berlin-Charlottenburg, HRB 149012B
Management board: Dr. Felix Hupfeld, Dr. Björn Kolbeck, Dr. Jan Stender