Le 27/10/2015 04:09, Laurent Vivier a écrit : > > > Le 26/10/2015 15:40, Peter Maydell a écrit : >> On 6 October 2015 at 18:11, Laurent Vivier <laur...@vivier.eu> wrote: >>> This is obsolete, but if we want to use dhcp with some distros (like debian >>> ppc 8.2 jessie), we need it. >>> >>> At the bind level, we are not able to know the socket type so we try to >>> guess it by analyzing the name. We manage only the case "ethX", >>> "ethX" in spk_device is similar to set htons(0x6574) in sll_protocol in the >>> normal case, and as this protocol does not exist, it's ok. >>> >>> SOCK_PACKET uses network endian to encode protocol in socket() >>> >>> in PACKET(7) : >>> protocol is the IEEE 802.3 protocol >>> number in network order. See the <linux/if_ether.h> include file for a >>> list of allowed protocols. When protocol is set to htons(ETH_P_ALL) >>> then all protocols are received. All incoming packets of that protocol >>> type will be passed to the packet socket before they are passed to the >>> protocols implemented in the kernel. >>> >>> Signed-off-by: Laurent Vivier <laur...@vivier.eu> >>> --- >>> This patch is a remix of an old patch sent in 2012: >>> https://patchwork.ozlabs.org/patch/208892/ >>> >>> linux-user/syscall.c | 33 +++++++++++++++++++++++++++------ >>> 1 file changed, 27 insertions(+), 6 deletions(-) >>> >>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c >>> index 64be431..71cc1e2 100644 >>> --- a/linux-user/syscall.c >>> +++ b/linux-user/syscall.c >>> @@ -111,6 +111,7 @@ int __clone2(int (*fn)(void *), void *child_stack_base, >>> #include <linux/route.h> >>> #include <linux/filter.h> >>> #include <linux/blkpg.h> >>> +#include <linux/if_packet.h> >>> #include "linux_loop.h" >>> #include "uname.h" >>> >>> @@ -1198,11 +1199,20 @@ static inline abi_long >>> target_to_host_sockaddr(struct sockaddr *addr, >>> memcpy(addr, target_saddr, len); >>> addr->sa_family = sa_family; >>> if (sa_family == AF_PACKET) { >>> - struct target_sockaddr_ll *lladdr; >>> + /* Manage an obsolete case : >>> + * if socket type is SOCK_PACKET, bind by name otherwise by index >>> + * but we are not able to know socket type, so check if the name >>> + * is usable... >>> + * see linux/net/packet/af_packet.c: packet_bind_spkt() >>> + */ >>> + if (strncmp((char *)((struct sockaddr_pkt *)addr)->spkt_device, >>> + "eth", 3) != 0) { >>> + struct target_sockaddr_ll *lladdr; >> >> This confuses me. The packet(7) manpage suggests there are two flavours >> of packet socket: >> (1) legacy AF_INET + SOCK_PACKET >> (2) new style AF_PACKET + SOCK_RAW / SOCK_DGRAM >> >> but this comment suggests it's trying to handle AF_PACKET + SOCK_PACKET ? > > In fact, I've started not from the man page, but from a non working dhcp > client, originally with a m68k target and etch-m68k distro, and I've met > again this problem on a ppc target and jessie distro. > >> >> If AF_PACKET was introduced as the new way of doing things, it's not >> clear why it would be the family type used in the legacy approach's >> sockaddr_pkt (though there seems to be code around that does this). >> I suspect that 2.0 kernels just didn't check af_family here at all. > > by digging into the code, I have found: > > $ apt-get source isc-dhcp-client > $ vi isc-dhcp-4.3.1/common/lpf.c > > ... > 72 int if_register_lpf (info) > 73 struct interface_info *info; > 74 { > ... > 79 if ((sock = socket(PF_PACKET, SOCK_PACKET, > 80 htons((short)ETH_P_ALL))) < 0) { > ... > > So we can see socket() is used with domain PF_PACKET and type SOCK_PACKET. > > Next, the interface name is put into the sa_data of sockaddr, and bind() > is used with AF_PACKET: > > 94 /* Bind to the interface name */ > 95 memset (&sa, 0, sizeof sa); > 96 sa.sa_family = AF_PACKET; > 97 strncpy (sa.sa_data, (const char *)info -> ifp, sizeof > sa.sa_data); > 98 if (bind (sock, &sa, sizeof sa)) { > > ifp is initialized from a list of all discovered interfaces in > common/discover.c: > ... > 238 int > 239 begin_iface_scan(struct iface_conf_list *ifaces) { > ... > 283 if (ioctl(ifaces->sock, SIOCGLIFCONF, &ifaces->conf) < 0) { > ... > 918 void > 919 discover_interfaces(int state) { > ... > 924 struct interface_info *tmp; > ... > 939 if (!begin_iface_scan(&ifaces)) { > ... > 955 while (next_iface(&info, &err, &ifaces)) { > 956 > 957 /* See if we've seen an interface that matches this > one. */ > 958 for (tmp = interfaces; tmp; tmp = tmp->next) { > 959 if (!strcmp(tmp->name, info.name)) > 960 break; > 961 } > > ... > 987 strcpy(tmp->name, info.name); > ... > 1050 } > ... > 1063 for (tmp = interfaces ; tmp != NULL ; tmp = tmp->next) { > 1064 if (tmp->ifp == NULL) { > 1065 struct ifreq *tif; > 1066 > 1067 tif = (struct ifreq *)dmalloc(sizeof(struct > ifreq), > 1068 MDL); > 1069 if (tif == NULL) > 1070 log_fatal("no space for ifp mockup."); > 1071 strcpy(tif->ifr_name, tmp->name); > 1072 tmp->ifp = tif; > 1073 } > 1074 } > >> >> The code in the kernel's packet_recvmsg fills in the spkt_family >> field with the netdevice's type field, which is an ARPHRD_* constant >> as far as I can tell (I could well be wrong here). > > kernel 4.3.0-rc3, net/packet/af_packet.c: > > 2961 > 2962 static int packet_bind_spkt(struct socket *sock, struct sockaddr > *uaddr, > 2963 int addr_len) > 2964 { > 2965 struct sock *sk = sock->sk; > 2966 char name[15]; > 2967 struct net_device *dev; > 2968 int err = -ENODEV; > 2969 > 2970 /* > 2971 * Check legality > 2972 */ > 2973 > 2974 if (addr_len != sizeof(struct sockaddr)) > 2975 return -EINVAL; > 2976 strlcpy(name, uaddr->sa_data, sizeof(name)); > 2977 > 2978 dev = dev_get_by_name(sock_net(sk), name); > 2979 if (dev) > 2980 err = packet_do_bind(sk, dev, pkt_sk(sk)->num); > 2981 return err; > 2982 } > ... > 4246 static const struct proto_ops packet_ops_spkt = { > 4247 .family = PF_PACKET, > ... > 4250 .bind = packet_bind_spkt, > ... > 3022 > 3023 static int packet_create(struct net *net, struct socket *sock, > int protocol, > 3024 int kern) > ... > 3045 if (sock->type == SOCK_PACKET) > 3046 sock->ops = &packet_ops_spkt; > ... > >> Odd to have code in target_to_host_sockaddr and not in >> host_to_target_sockaddr. > > In the case of host_to_target_sockaddr(), there is no "if (sa_family == > AF_PACKET) {" as it was in target_to_host_sockaddr(), it's why I didn't > add it (and I don't like to add code I don't test). > >> >>> - lladdr = (struct target_sockaddr_ll *)addr; >>> - lladdr->sll_ifindex = tswap32(lladdr->sll_ifindex); >>> - lladdr->sll_hatype = tswap16(lladdr->sll_hatype); >>> + lladdr = (struct target_sockaddr_ll *)addr; >>> + lladdr->sll_ifindex = tswap32(lladdr->sll_ifindex); >>> + lladdr->sll_hatype = tswap16(lladdr->sll_hatype); >>> + } >>> } >>> unlock_user(target_saddr, target_addr, 0); >>> >>> @@ -2509,7 +2519,12 @@ static abi_long do_socketcall(int num, abi_ulong >>> vptr) >>> /* now when we have the args, actually handle the call */ >>> switch (num) { >>> case SOCKOP_socket: /* domain, type, protocol */ >>> - return do_socket(a[0], a[1], a[2]); >>> + if (a[0] == AF_PACKET || >>> + a[1] == TARGET_SOCK_PACKET) { >>> + return do_socket(a[0], a[1], tswap16(a[2])); >>> + } else { >>> + return do_socket(a[0], a[1], a[2]); >>> + } >>> case SOCKOP_bind: /* sockfd, addr, addrlen */ >>> return do_bind(a[0], a[1], a[2]); >>> case SOCKOP_connect: /* sockfd, addr, addrlen */ >>> @@ -7500,7 +7515,13 @@ abi_long do_syscall(void *cpu_env, int num, abi_long >>> arg1, >>> #endif >>> #ifdef TARGET_NR_socket >>> case TARGET_NR_socket: >>> - ret = do_socket(arg1, arg2, arg3); >>> + if (arg1 == AF_PACKET || >>> + arg2 == TARGET_SOCK_PACKET) { >>> + /* in this case, socket() needs a network endian short */ >>> + ret = do_socket(arg1, arg2, tswap16(arg3)); >>> + } else { >>> + ret = do_socket(arg1, arg2, arg3); >>> + } >> >> This doesn't make sense to me. The argument to socket() >> is passed in via a register; so if the guest code correctly >> passes the protocol value as htons(whatever) then that will >> be the value in arg3 and we do not need to swap anything. >> >> I see we had this discussion about the previous version of the >> patch too... and my argument that we don't need the tswap16 >> in the socketcall code path either still makes sense to me. > > I agree with you, I think I have confused socketcall() that passes > parameters by memory, and socket() that passes parameters by registers. > I will remove this part and resend a patch.
And for the socketcall part, we need the tswap16(): for instance, int a = htons(0x0003); On a LE host: a = 0x00000300 On a BE host: a = 0x00000003 If the guest is BE, it will put in memory: 0x00 0x00 0x00 0x03 Then a LE host, will read: int b = 0x03000000 but get_user_ual() in do_socketcall() will byte-swap it and put 0x00000003 in a[2]. so without the byte-swap, we call do_socket(..., 0x0003), whereas the syscall is waiting for htons(0x0003) -> 0x0300 as we are on LE host. I'm sorry, I can't explain that better... >> >>> fd_trans_unregister(ret); >>> break; >>> #endif >>> -- >>> 2.4.3 >> >> thanks >> -- PMM >> > > Thank you for your comments, > Laurent >