David Miller <da...@davemloft.net> writes: > From: Andrew Morton <a...@linux-foundation.org> > Date: Tue, 4 Mar 2014 13:30:04 -0800 > >> On Fri, 28 Feb 2014 20:50:19 -0800 ebied...@xmission.com (Eric W. Biederman) >> wrote: >> >>> >>> Modify audit_send_reply to directly use a non-blocking send and >>> to return an error on failure (if anyone cares). >>> >>> Modify audit_list_rules_send to use audit_send_reply and give up >>> if we can not send a packet. >>> >>> Merge audit_list_rules into iaudit_list_rules_send as the code >>> is now sufficiently simple to not justify to callers. >>> >>> Kill audit_send_list, audit_send_reply_thread because using >>> a separate thread for replies is not needed when sending >>> packets syncrhonously. >> >> Nothing much seems to be happening here?
Well you picked up the patch that fixes the worst of the bugs that I was complaining about. Beyond that I don't know what makes sense. >> In an earlier email you said "While reading through 3.14-rc1 I found a >> pretty siginficant mishandling of network namespaces in the recent >> audit changes." Were those recent audit changes a post-3.14 thing? And >> what is the user-visible effect of the mishandling? > > I took a quick look at this patch yesterday, and my only suspicion is > that threads are created currently by this code purely to cons up a > blockable context for the netlink code. > > Perhaps it wants to avoid any netlink packet drops from being possible. > > I'm not so sure. Looking at the history (see below) the stated reason from David Woodhouse is to prevent the removal of packets from the packet queues when we have reached our rcvbuf limit. The reasoning doesn't make a lick of sense to me and I was hoping the folks dealing with audit would comment. If we really want the ability to always appened to the queue of skb's is to just have a version of netlink_send_skb that ignores the queued limits. Of course an evil program then could force the generation of enough audit records to DOS the kernel, but we seem to be in that situation now. Shrug. > Anyways, some investigation into perhaps figuring out the intentions of > the original implementation would be nice. I had looked briefly and missed a few things. Here is the complete history in all of it's confusion. In brief. The code came in using non-blocking netlink writes. David Woodhouse made the writes blocking. Al Viro, and then Eric Paris patch the code to deal with deadlocks cause by the blocking writes. commit 4527a30f157fca45c102ea49a2fb34a4502264eb Author: akpm <akpm> Date: Mon Apr 12 20:29:12 2004 +0000 [PATCH] Light-weight Auditing Framework From: Rik Faith <fa...@redhat.com> This patch provides a low-overhead system-call auditing framework for Linux that is usable by LSM components (e.g., SELinux). This is an update of the patch discussed in this thread: http://marc.theaimsgroup.com/?t=107815888100001&r=1&w=2 In brief, it provides for netlink-based logging of audit records that have been generated in other parts of the kernel (e.g., SELinux) as well as the ability to audit system calls, either independently (using simple filtering) or as a compliment to the audit record that another part of the kernel generated. The main goals were to provide system call auditing with 1) as low overhead as possible, and 2) without duplicating functionality that is already provided by SELinux (and/or other security infrastructures). This framework will work "stand-alone", but is not designed to provide, e.g., CAPP functionality without another security component in place. This updated patch includes changes from feedback I have received, including the ability to compile without CONFIG_NET (and better use of tabs, so use -w if you diff against the older patch). Please see http://people.redhat.com/faith/audit/ for an early example user-space client (auditd-0.4.tar.gz) and instructions on how to try it. My future intentions at the kernel level include improving filtering (e.g., syscall personality/exit codes) and syscall support for more architectures. First, though, I'm going to work on documentation, a (real) audit daemon, and patches for other user-space tools so that people can play with the framework and understand how it can be used with and without SELinux. Update: Light-weight Auditing Framework receive filter fixes From: Rik Faith <fa...@redhat.com> Since audit_receive_filter() is only called with audit_netlink_sem held, it cannot race with either audit_del_rule() or audit_add_rule(), so the list_for_each_entry_rcu()s may be replaced by list_for_each_entry()s, and the rcu_read_{un,}lock()s removed. A fix for this is part of the attached patch. Other features of the attached patch are: 1) generalized the ability to test for inequality 2) added syscall exit status reporting and testing 3) added ability to report and test first 4 syscall arguments (this adds a large amount of flexibility for little cost; not implemented or tested on ppc64) 4) added ability to report and test personality User-space demo program enhanced for new fields and inequality testing: http://people.redhat.com/faith/audit/auditd-0.5.tar.gz BKrev: 407afc18IAH0yJVxEhi5ka-sbTOn8g diff --git a/kernel/audit.c b/kernel/audit.c new file mode 100644 index 000000000000..765822b03b91 --- /dev/null +++ b/kernel/audit.c [snip] +void audit_send_reply(int pid, int seq, int type, int done, int multi, + void *payload, int size) +{ + struct sk_buff *skb; + struct nlmsghdr *nlh; + int len = NLMSG_SPACE(size); + void *data; + int flags = multi ? NLM_F_MULTI : 0; + int t = done ? NLMSG_DONE : type; + + skb = alloc_skb(len, GFP_KERNEL); + if (!skb) + goto nlmsg_failure; + + nlh = NLMSG_PUT(skb, pid, seq, t, len - sizeof(*nlh)); + nlh->nlmsg_flags = flags; + data = NLMSG_DATA(nlh); + memcpy(data, payload, size); + netlink_unicast(audit_sock, skb, pid, MSG_DONTWAIT); + return; + +nlmsg_failure: /* Used by NLMSG_PUT */ + if (skb) + kfree_skb(skb); +} commit b7d1125817c9a46cc46f57db89d9c195e7af22f8 Author: David Woodhouse <dw...@shinybook.infradead.org> Date: Thu May 19 10:56:58 2005 +0100 AUDIT: Send netlink messages from a separate kernel thread netlink_unicast() will attempt to reallocate and will free messages if the socket's rcvbuf limit is reached unless we give it an infinite timeout. So do that, from a kernel thread which is dedicated to spewing stuff up the netlink socket. Signed-off-by: David Woodhouse <dw...@infradead.org> @@ -293,13 +319,16 @@ void audit_send_reply(int pid, int seq, int type, int done, int multi, skb = alloc_skb(len, GFP_KERNEL); if (!skb) - goto nlmsg_failure; + return; - nlh = NLMSG_PUT(skb, pid, seq, t, len - sizeof(*nlh)); + nlh = NLMSG_PUT(skb, pid, seq, t, size); nlh->nlmsg_flags = flags; data = NLMSG_DATA(nlh); memcpy(data, payload, size); - netlink_unicast(audit_sock, skb, pid, MSG_DONTWAIT); + + /* Ignore failure. It'll only happen if the sender goes away, + because our timeout is set to infinite. */ + netlink_unicast(audit_sock, skb, pid, 0); return; nlmsg_failure: /* Used by NLMSG_PUT */ commit 9044e6bca5a4a575d3c068dfccb5651a2d6a13bc Author: Al Viro <v...@zeniv.linux.org.uk> Date: Mon May 22 01:09:24 2006 -0400 [PATCH] fix deadlocks in AUDIT_LIST/AUDIT_LIST_RULES We should not send a pile of replies while holding audit_netlink_mutex since we hold the same mutex when we receive commands. As the result, we can get blocked while sending and sit there holding the mutex while auditctl is unable to send the next command and get around to receiving what we'd sent. Solution: create skb and put them into a queue instead of sending; once we are done, send what we've got on the list. The former can be done synchronously while we are handling AUDIT_LIST or AUDIT_LIST_RULES; we are holding audit_netlink_mutex at that point. The latter is done asynchronously and without messing with audit_netlink_mutex. Signed-off-by: Al Viro <v...@zeniv.linux.org.uk> commit f09ac9db2aafe36fde9ebd63c8c5d776f6e7bd41 Author: Eric Paris <epa...@redhat.com> Date: Fri Apr 18 10:11:04 2008 -0400 Audit: stop deadlock from signals under load A deadlock is possible between kauditd and auditd under load if auditd receives a signal. When auditd receives a signal it sends a netlink message to the kernel asking for information about the sender of the signal. In that same context the audit system will attempt to send a netlink message back to the userspace auditd. If kauditd has already filled the socket buffer (see netlink_attachskb()) auditd will now put itself to sleep waiting for room to send the message. Since auditd is responsible for draining that socket we have a deadlock. The fix, since the response from the kernel does not need to be synchronous is to send the signal information back to auditd in a separate thread. And thus auditd can continue to drain the audit queue normally. Signed-off-by: Eric Paris <epa...@redhat.com> Signed-off-by: Al Viro <v...@zeniv.linux.org.uk> Eric -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/