Ben Pfaff <b...@ovn.org> writes:

> On Thu, Aug 11, 2016 at 11:33:59AM -0400, Aaron Conole wrote:
>> Ben Pfaff <b...@ovn.org> writes:
>> 
>> > On Mon, Aug 01, 2016 at 08:24:40AM -0400, Aaron Conole wrote:
>> >> Aaron Conole <acon...@redhat.com> writes:
>> >> 
>> >> > Currently, when using Open vSwitch with DPDK and qemu guests, the
>> >> > recommended
>> >> > method for joining the guests is via the dpdkvhostuser interface. This
>> >> > interface uses Unix Domain sockets to communicate. When these sockets 
>> >> > are
>> >> > created, they inherit the permissions and ownership from the
>> >> > vswitchd process.
>> >> > This can lead to an undesirable state where the QEMU process
>> >> > cannot use the
>> >> > socket file until manual intervention is performed (via `chown`
>> >> > and/or `chmod`
>> >> > calls).
>> >> >
>> >> > This patchset gives the ability to set the permissions and
>> >> > ownership of all
>> >> > dpdkvhostuser sockets from the database, avoiding the manual 
>> >> > intervention
>> >> > required to connect QEMU and OVS via DPDK.
>> >> >
>> >> > The first patch adds chmod and chown calls to lib, with unit tests. The
>> >> > second patch hooks those calls into the netdev_dpdk_vhost_user_construct
>> >> > function, after the socket is created.
>> >> >
>> >> > Changes from v2:
>> >> > * Added a new 2nd patch to series for chmod/chown on already
>> >> > opened files.
>> >> >   There exist known implementations for other systems, including
>> >> > FreeBSD, but
>> >> >   only linux is implemented.  ENOTSUP is set when these calls fail
>> >> > on non-linux
>> >> >   systems.
>> >> 
>> >> Ping.  I was hoping to have this applied before the 2.6 branch was cut;
>> >> is that possible?
>> >
>> > The code in patch 2, which goes through every fd in /proc to find the
>> > socket it needs because DPDK will not tell it the fd and will not set
>> > the needed permissions, is appalling.  It may be the biggest kluge I've
>> > seen this year.
>> 
>> Ouch.  I'm sure that's an accomplishment I don't want associated with my
>> name.  Do you take issue with the design (I agree it's awful to need
>> anything like this)?  Did the implementation really smell that bad?
>
> It's not about the code (and it's certainly not ad hominem).  It's about
> the whole technique.  The idea of opening a socket, then scanning the
> whole fd table by reading through the file system to determine the
> number of the fd that was opened, is appalling.
>
>> > This would be slightly less appalling if it used fstat() and
>> > getsockname() instead of /proc.  Still not great.
>> 
>> I disagree - patch 2/3 wouldn't exist.   [...]
>
> The approach I was talking about was to scan the fd table using a loop
> over the valid fds, calling fstat() on each to find out whether it's a
> socket then getsockname() to find out whether it's the right socket.
>
> Come to think of it, I don't know what happens if you try to
> getsockname() on a socket whose name is longer than whatever fits in
> sockaddr_un.  Dammit.
>
>> [...]  That goes to the last sentence
>> in this.  Both of those calls will work for a descriptor (and notice
>> that patch 1/3 uses them).  The issue I have is trying to go from a
>> pathname to a descriptor (which is not a standardized interface for a
>> plethora of reasons).  If I had the descriptor, though, patch 2/3 would
>> never have been submitted (because it would be unneeded - I do
>> fchmod/fchown in patch 1/3).
>
> Yes, right.  I understand the problem, I just don't understand why it
> exists.  The DPDK maintainers can solve it instantly in 5 minutes of
> work.
>
>> > This definitely needs comments to explain why it exists and what it's
>> > doing.
>
> Thanks for the explanation.  In this case, I meant that the *code* needs
> to have comments, since it had none.  It took quite a while from looking
> at the code to figure out what was going on, and longer to figure out
> why.  Comments should make this obvious to the reader.
>
> [...]
>
>> > This could be properly fixed in DPDK by adding a 1-line helper function
>> > that returns the fd for a vhost-user server.
>> 
>> +1
>> I agree strongly.  However, Christian Ehrhardt and myself have each
>> submitted patches[5][6] to try and make this kind of thing non-odius.
>> These patches have not been accepted.
>
> Neither of those is a simple "return server->listen_fd;" function.  Any
> chance that would be accepted?  It would solve the problem just fine.

I don't know, but I'll outline my plans, and if you agree, then I'll
put them into action.

First, I'll drop patch 2/3.

Second, I'll take your suggestion below, and implement it in the
netdev-dpdk area (as part of "patch 3/3", which will become 2/2).

I'll submit the above by mid-day Friday, tested.

Third, I'll cook up the DPDK server socket fd patch, and try to get that
part of a future DPDK release.  If accepted, then when OvS ports to that
release, I will submit a patch removing your suggestion, and using the
file descriptor instead (or whatever becomes of it).

If this sounds acceptable, I'll start in on it ASAP.

>> If you think that patch 2/3 is that maloderous that it cannot be
>> accepted, then there will always probably be this kind of disjointed
>> file-system permissions gap for vhost-user server mode interfaces.
>> Unless you have a better suggestion - I'm all ears then :)
>
> I do have one suggestion.  Something like this:
>
> for (int i = 0; i < 10; i++) {
>     /* Find the first available file descriptor, then close it. */
>     int fd = socket(...);
>     close(fd);
>
>     /* Create server socket. */
>     server = dpdk_create_server_socket(filename, ...);
>
>     /* Check that fd is the new server socket fd.
>      * This is normally true unless some other thread created a fd
>      * concurrently. */
>     struct stat s;
>     struct sockaddr_un sun;
>     socklen_t sun_len = sizeof sun;
>     if (!stat(fd, &s) && S_ISSOCK(s.st_mode) &&
>         getsockname(fd, &sun, &sun_len) && !strcmp(filename, sun.sun_path)) {
>         /* Yay, we found the fd. */
>         return success;
>     }
>
>     /* Race.  Try again. */
>     dpdk_close_server_socket(server);
> }

Thanks, Ben.  I am not opposed to going with this approach.
Unfortunately, there are no truly good solutions I see here at
the moment.

-Aaron
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to