On Wed, Mar 20, 2013 at 02:38:35PM -0400, Luiz Capitulino wrote: > On Wed, 20 Mar 2013 13:14:21 -0500 > mdroth <mdr...@linux.vnet.ibm.com> wrote: > > > > > > > > > + handle = s->pstate.fd_counter++; > > > > > > > > + if (s->pstate.fd_counter < 0) { > > > > > > > > + s->pstate.fd_counter = 0; > > > > > > > > + } > > > > > > > > > > > > > > Is this handling the overflow case? Can't fd 0 be in use already? > > > > > > > > > > > > It could, but it's very unlikely that an overflow/counter reset > > > > > > would > > > > > > result in issuing still-in-use handle, since guest-file-open would > > > > > > need > > > > > > to be called 2^63 times in the meantime. > > > > > > > > > > Agreed, but as you do check for that case and as the right fix is > > > > > simple > > > > > and I think it's worth it. I can send a patch myself. > > > > > > > > > > > > > A patch to switch to tracking a list of issued handles in the keystore, > > > > or to return an error on overflow? > > > > > > To return an error on overflow. Minor, but if we do handle it let's do it > > > right. Or, we could just add an assert like: > > > > > > assert(s->pstate.fd_counter >= 0); > > > > Ah, well, I'm not sure I understand then. You mean dropping: > > > > if (s->pstate.fd_counter < 0) { > > s->pstate.fd_counter = 0; > > } > > > > And replacing it with an error or assertion? > > Yes, because I had understood you meant that this is very unlikely to be > triggered because we'd need guest-file-open to be called 2^63. This is > what I agreed above, although thinking more about it there's also the > possibility of a malicious client doing this on purpose. > > But now I see that what you really meant is that it's unlikely for fd 0 > to be in use after 2^63 guest-file-open calls. Am I right? If yes, then
Yup, that's the scenario I was referring to. > I disagree because there's no way to guarantee when a certain fd will be > in use or not, unless we allow fds to be returned. Yes, it's a real scenario that can indeed occur under "normal" usage, but in my head it's similar to assumptions we make about clocks not overflowing within a timeframe that anyone cares about in terms of severity. To quantify it more: Given RPC latency there's really no way to run up the fd counter faster than once per nanosecond (more like millisecond atm), so you'd have to keep a handle open and constantly called guest-file-open for a period of 292 years before a duplicate handle gets issued. > > > If so, the overflow is actually expected: once we dish out handle MAX_INT64, > > we should restart at 0. I initially made fd_counter a uint64_t so > > overflow/reset would happen more naturally, but since we issue handles as > > int64_t this would've caused other complications. > > > > Something like this might be more clear about the intent though: > > > > handle = s->pstate.fd_counter; > > if (s->pstate.fd_counter == MAX_INT64) { > > s->pstate.fd_counter = 0; > > } else { > > s->pstate.fd_counter++; > > } > > I disagree about restarting to zero as I have explained above. You seem to > not like returning an error, is it because we'll make guest-file-open > useless after the limit is reached? Yes. But, on second thought, given the above I guess I don't mind returning an error as a failsafe for now. Although I think it should be an assert() with a FIXME:, since it really should be fixed before anyone ever manages to trigger it. > > Let's review our options: > > 1. When fd_count reaches MAX_INT64 we reset it to zero > > Pros: simple and guest-file-open always work > Cons: fd 0 might be in use by a client > > 2. When fd_count reaches MAX_INT64 we return an error > > Pros: simple and we fix 'cons' from item 1 > Cons: guest-file-open will have a usage count limit > > 3. Allow fds to be returned by clients on guest-file-close and do 2 on top > > Pros: solve problems discussed in items 1 and 2 > Cons: not trivial and the usage limit problem from item 2 can still > happen if the client ends up not calling guest-file-close > (although I do think we'll reach the OS limit here) > > Do you see other options? Am I overcomplicating? > No, I think that about sums it up. Doing 3, paired with a limit on the number of outstanding FDs as in 2 is the full solution. The only complication that is that the higher the limit we impose, the more I/O and look-up overhead will be incured for every guest-file-open/guest-file-close, because those operations will become O(fd_limit). So if we do it we'll need to impose a reasonable limit like 16k outstanding FDs at a time or something (maybe even that's too much, but I'm thinking an extra 64k read/write with every fopen()/fclose() isn't that big a deal). But to complicate things somewhat more: This whole reason for this keystore thing is to patch up the existing interfaces so they continue functioning without clients needing to update. So if we're considering something that requires imposing a fairly small limit on the number of outstanding handles, they'd need to update their implementations eventually anyway to be correct. So I wonder if, rather than pursuing option 3, we just introduce an interface that does what we really want and returns handles as UUIDs, then mark the existing interfaces as deprecated (and then remove them within the next 300 years so our assert never gets hit :)