Re: [PATCH] update to the latest libedit version and remove libreadline deps

2010-11-05 Thread Jilles Tjoelker
On Fri, Nov 05, 2010 at 04:32:56PM +0100, Baptiste Daroussin wrote:
> I've updated libedit to the latest version available in the netbsd cvs.
> UTF8 support is disabled for now has it seems to be experimental and segfault.
> I also patch and tested all the sources that used to be linked against
> libreadline so that it now uses libedit making libreadline unused (I
> guess, perhaps I have missed some)

> beware that there are collision between libreadline and libedit
> (/usr/include/readline/readline.h) is provided by both of them.

> You can find the patch against current here:
> http://people.freebsd.org/~bapt/update-libedit.patch

This patch reverts various local enhancements to libedit, such as:
* r212191 more expected behaviour of ed-delete-next-char
* r212235  key support
* r209217,r209219,r209224 completion improvements for sh(1)
* built-in "\033[1~" and "\033[4~" support (not sure how useful this is)
* .An macros in man pages

If the csh-style history expansion gets in, a "set -o histexpand" for sh
could be useful. I think this is not important enough to add all the
code for, but if we have the code anyway why not allow using it. On the
other hand, if we can replace it with stubs and is very simplistic,
perhaps it is better to stub it out and not change sh.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: libkvm: consumers of kvm_getprocs for non-live kernels?

2010-11-10 Thread Jilles Tjoelker
On Wed, Nov 10, 2010 at 09:41:52PM +0100, Ulrich Spörlein wrote:
> I have this cleanup of libkvm sitting in my tree and it needs a little
> bit of testing, especially the function kvm_proclist, which is only
> called from kvm_deadprocs which is only called from kvm_getprocs when kd
> is not ALIVE.

> The only consumer in our tree that I can make out is *probably* kgdb, as
> ps(1), top(1), w(1), pkill(1), fstat(1), systat(1), pmcstat(8) and
> bsnmpd don't really work on coredumps

> But, the kgdb file gnu/usr.bin/binutils/gdb/kvm-fbsd.c, where
> kvm_getprocs is probably called on a dead kernel is not even used during
> build!

> So I guess I'm staring at dead code here, any kvm people around that can
> clue me in?

It is a while ago that I last used this, but ps and fstat definitely
worked on crashdumps, to some extent. /usr/sbin/crashinfo uses this.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Removing PATH=...%builtin... from sh

2010-12-25 Thread Jilles Tjoelker
Most ash derivatives have an undocumented feature where the presence of
an entry "%builtin" in $PATH will cause builtins to be checked at that
point of the PATH search, rather than before looking at any directories
as documented in the man page (very old versions do document this
feature).

I would like to remove this feature from sh, as it complicates the code,
may violate expectations (for example, /usr/bin/alias is very close to a
forkbomb with PATH=/usr/bin:%builtin, only /usr/bin/builtin not being
another link saves it) and appears to be unused (all the %builtin google
code search finds is in some sort of ash source code).

The feature also has a bug in that it does not work properly if there is
a PATH assignment before 'type', 'command -v' or 'command -V' and
%builtin differs between the set and temporary paths. (This worked in
8.x for 'command'; 'type' ignores PATH assignment in 8.x.)

It is true that dash ignores %builtin for the special builtins and the
"regular builtins" that are found before PATH, using it only for the
other builtins. I could do the same but I think it is not useful enough.

See also
http://lists.freebsd.org/pipermail/svn-src-head/2010-November/022613.html

Which builtins this is about:
POSIX special builtins:
  : . break continue eval exec exit export readonly return set shift
  times trap unset
POSIX commands found before PATH (regular builtins):
  alias bg cd command false fc fg getopts jobs kill pwd read true
  umask unalias wait
Other POSIX utilities that must be builtins:
  hash type ulimit
Other POSIX utilities:
  [ echo printf test
Non-POSIX utilities that must be builtins:
  builtin bind chdir jobid local setvar
Undocumented builtins:
  exp let wordexp

And for completeness one command that should really be a builtin or
otherwise magic:
Missing POSIX commands found before PATH (regular builtins):
  newgrp

-- 
Jilles Tjoelker
Index: bin/sh/exec.c
===
--- bin/sh/exec.c	(revision 216690)
+++ bin/sh/exec.c	(working copy)
@@ -92,7 +92,6 @@
 
 
 static struct tblentry *cmdtable[CMDTABLESIZE];
-static int builtinloc = -1;		/* index in path of %builtin, or -1 */
 int exerrno = 0;			/* Last exec error */
 
 
@@ -245,8 +244,7 @@
 	}
 	while ((name = *argptr) != NULL) {
 		if ((cmdp = cmdlookup(name, 0)) != NULL
-		 && (cmdp->cmdtype == CMDNORMAL
-		 || (cmdp->cmdtype == CMDBUILTIN && builtinloc >= 0)))
+		 && cmdp->cmdtype == CMDNORMAL)
 			delete_cmd_entry();
 		find_command(name, &entry, DO_ERR, pathval());
 		if (verbose) {
@@ -337,8 +335,8 @@
 			goto success;
 	}
 
-	/* If %builtin not in path, check for builtin next */
-	if (builtinloc < 0 && (i = find_builtin(name, &spec)) >= 0) {
+	/* Check for builtin next */
+	if ((i = find_builtin(name, &spec)) >= 0) {
 		INTOFF;
 		cmdp = cmdlookup(name, 1);
 		if (cmdp->cmdtype == CMDFUNCTION)
@@ -354,7 +352,7 @@
 	prev = -1;		/* where to start */
 	if (cmdp) {		/* doing a rehash */
 		if (cmdp->cmdtype == CMDBUILTIN)
-			prev = builtinloc;
+			prev = -1;
 		else
 			prev = cmdp->param.index;
 	}
@@ -366,19 +364,7 @@
 		stunalloc(fullname);
 		idx++;
 		if (pathopt) {
-			if (prefix("builtin", pathopt)) {
-if ((i = find_builtin(name, &spec)) < 0)
-	goto loop;
-INTOFF;
-cmdp = cmdlookup(name, 1);
-if (cmdp->cmdtype == CMDFUNCTION)
-	cmdp = &loc_cmd;
-cmdp->cmdtype = CMDBUILTIN;
-cmdp->param.index = i;
-cmdp->special = spec;
-INTON;
-goto success;
-			} else if (prefix("func", pathopt)) {
+			if (prefix("func", pathopt)) {
 /* handled below */
 			} else {
 goto loop;	/* ignore unimplemented options */
@@ -485,8 +471,7 @@
 
 	for (pp = cmdtable ; pp < &cmdtable[CMDTABLESIZE] ; pp++) {
 		for (cmdp = *pp ; cmdp ; cmdp = cmdp->next) {
-			if (cmdp->cmdtype == CMDNORMAL
-			 || (cmdp->cmdtype == CMDBUILTIN && builtinloc >= 0))
+			if (cmdp->cmdtype == CMDNORMAL)
 cmdp->rehash = 1;
 		}
 	}
@@ -506,13 +491,11 @@
 	const char *old, *new;
 	int idx;
 	int firstchange;
-	int bltin;
 
 	old = pathval();
 	new = newval;
 	firstchange = ;	/* assume no change */
 	idx = 0;
-	bltin = -1;
 	for (;;) {
 		if (*old != *new) {
 			firstchange = idx;
@@ -523,19 +506,12 @@
 		}
 		if (*new == '\0')
 			break;
-		if (*new == '%' && bltin < 0 && prefix("builtin", new + 1))
-			bltin = idx;
 		if (*new == ':') {
 			idx++;
 		}
 		new++, old++;
 	}
-	if (builtinloc < 0 && bltin >= 0)
-		builtinloc = bltin;		/* zap builtins */
-	if (builtinloc >= 0 && bltin < 0)
-		firstchange = 0;
 	clearcmdentry(firstchange);
-	builtinloc = bltin;
 }
 
 
@@ -556,9 +532,7 @@
 		pp = tblp;
 		while ((cmdp = *pp) != NULL) {
 			if (

Re: man 3 getopt char * const argv[] - is const wrong ?

2011-02-13 Thread Jilles Tjoelker
On Sun, Feb 13, 2011 at 01:59:38PM +0100, Matthias Andree wrote:
> Am So, 13.02.2011, 13:20 schrieb Julian H. Stacey:
> > Hi Hackers
> > Ref.: man 3 getopt
> > int getopt(int argc, char * const argv[], const char *optstring);
> > Ref.: K&R 2nd Ed P.211 last indent, 2nd sentence
> > The purpose of const is to announce [objects] that may be
> > placed in read-only memory, and perhaps to increas[e] opportunities for
> > optimization

const only rarely allows optimization, mostly because a declaration like
const foo *p only says that the foo may not be modified via p, not that
it may not be modified at all.

Another important purpose of const is to allow the programmer to
document what is not changed by a function, and have the compiler check
this documentation.

> > optstring is obviously const,
> > but I don't see that argv can calim to be const ?

> The const basically states that the argv[] elements (i. e. the char *
> pointers) are const[ant], or, read another way, that getopt promises not
> to mess with the *pointers* but is free to modify the actual strings
> pointed to (with the usual constraints of not assuming they can be
> extended, for instance) - and I don't see the problem in that.

getopt() should not modify the strings themselves either, but there are
at least two reasons why the type is not const char *const argv[]:

1. Elements of argv are written into char *optarg.

2. Many programs declare main's second argument as char *argv[] which
   cannot be converted to const char *const [], other than via a cast.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Basic UTF-8 support for sh(1)

2011-02-25 Thread Jilles Tjoelker
Here is a patch that adds basic UTF-8 support to sh(1). This is enabled
if the locale is set appropriately.

Features:
* ${#var} counts codepoints. (Really, bytes with (b & 0xc0) != 0x80.)
* ?, [...] patterns match codepoints instead of bytes. They do not match
  invalid sequences. This is so that ${var#?} removes the first
  codepoint, not the first byte. However, * continues to match any
  string and an invalid sequence matches an identical invalid sequence.
  (This differs from fnmatch(3).)

Internal:
* CTL* bytes are moved to bytes that cannot occur in UTF-8 so that
  mbrtowc(3) can be used directly. The new locations do occur in
  iso-8859-* encodings.

Limitations:
* Only UTF-8 support is added, not any other multibyte encodings. I do
  not want to bloat up sh with mbrtowc(3) and similar everywhere.
* Invalid sequences may not be handled as desired. It seems aborting on
  invalid UTF-8 sequences would break things, so they are let through.
  This also avoids bloating the code up with checking everywhere.
* There is no special treatment for combining characters, accented
  letters may match ? or ?? or even more depending on normalization
  form. This matches other code in FreeBSD and is usually good enough
  because normalization forms that use as few codepoints as possible
  tend to be used.
* IFS remains byte-based as in ksh93 (but unlike bash and zsh).
* Our version of libedit does not support UTF-8 so sh will still be
  rather unpleasant to use interactively with characters not in
  us-ascii.

Is this useful and worth the (small) bloat?

A somewhat related feature is support for \u and \U
sequences in $'...' (this will be added to POSIX, see
http://austingroupbugs.net/view.php?id=249 and I plan to add it to sh).
Ideally, these are converted using iconv(3) but as long as it is not
unconditionally available in base or if it is not supposed to be used,
the codepoints can be encoded in UTF-8 for UTF-8 locales, leaving other
locales with question marks.

-- 
Jilles Tjoelker
Index: parser.h
===
--- parser.h	(revision 218371)
+++ parser.h	(working copy)
@@ -34,16 +34,16 @@
  */
 
 /* control characters in argument strings */
-#define CTLESC '\201'
-#define CTLVAR '\202'
-#define CTLENDVAR '\203'
-#define CTLBACKQ '\204'
+#define CTLESC '\300'
+#define CTLVAR '\301'
+#define CTLENDVAR '\371'
+#define CTLBACKQ '\372'
 #define CTLQUOTE 01		/* ored with CTLBACKQ code if in quotes */
 /*	CTLBACKQ | CTLQUOTE == '\205' */
-#define	CTLARI	'\206'
-#define	CTLENDARI '\207'
-#define	CTLQUOTEMARK '\210'
-#define	CTLQUOTEEND '\211' /* only for ${v+-...} */
+#define	CTLARI	'\374'
+#define	CTLENDARI '\375'
+#define	CTLQUOTEMARK '\376'
+#define	CTLQUOTEEND '\377' /* only for ${v+-...} */
 
 /* variable substitution byte (follows CTLVAR) */
 #define VSTYPE		0x0f	/* type of variable substitution */
Index: sh.1
===
--- sh.1	(revision 218467)
+++ sh.1	(working copy)
@@ -2510,4 +2510,7 @@ was originally written by
 .Sh BUGS
 The
 .Nm
-utility does not recognize multibyte characters.
+utility does not recognize multibyte characters other than UTF-8.
+The line editing library
+.Xr editline 3
+does not recognize multibyte characters.
Index: expand.c
===
--- expand.c	(revision 218371)
+++ expand.c	(working copy)
@@ -52,6 +52,7 @@ __FBSDID("$FreeBSD$");
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Routines to expand arguments to commands.  We have to deal with
@@ -111,16 +112,16 @@ static void addfname(char *);
 static struct strlist *expsort(struct strlist *);
 static struct strlist *msort(struct strlist *, int);
 static char *cvtnum(int, char *);
-static int collate_range_cmp(int, int);
+static int collate_range_cmp(wchar_t, wchar_t);
 
 static int
-collate_range_cmp(int c1, int c2)
+collate_range_cmp(wchar_t c1, wchar_t c2)
 {
-	static char s1[2], s2[2];
+	static wchar_t s1[2], s2[2];
 
 	s1[0] = c1;
 	s2[0] = c2;
-	return (strcoll(s1, s2));
+	return (wcscoll(s1, s2));
 }
 
 /*
@@ -665,6 +666,7 @@ evalvar(char *p, int flag)
 	int special;
 	int startloc;
 	int varlen;
+	int varlenb;
 	int easy;
 	int quotes = flag & (EXP_FULL | EXP_CASE | EXP_REDIR);
 
@@ -712,8 +714,15 @@ again: /* jump here after setting a variable with
 		if (special) {
 			varvalue(var, varflags & VSQUOTE, subtype, flag);
 			if (subtype == VSLENGTH) {
-varlen = expdest - stackblock() - startloc;
-STADJUST(-varlen, expdest);
+varlenb = expdest - stackblock() - startloc;
+varlen = varlenb;
+if (localeisutf8) {
+	val = stackblock() + startloc;
+	for (;val != expdest; val++)
+		if ((*val & 0xC0) == 0x80)
+	

Re: [LIBC] Modfied Version of sscanf

2011-05-01 Thread Jilles Tjoelker
On Sat, Apr 30, 2011 at 06:44:43PM +0200, Martin Möller wrote:
> This is my first email to this list, so hello to all members.
> The current version of sscanf, stops when a whitespace characters occurs in
> a string
> when the "%s" (string) type is used.

> The following code:

> char name [20], value [20];
> sscanf ("Test 2->Test 3", "%s->%s", name, value);
> printf ("%s->%s\n", name, value);

> outputs total garbage on my FreeBSD-7.0-RELEASE #0 amd64.
> Is there already a way to do this or should we release a new version of
> sscanf, e.g. called sscanfWS.

> This modified version would output: Test 2->Test 3.

I think you should use functions like memchr(), strchr() and strtok_r()
rather than sscanf().

For one, your code has undefined behaviour if the name or the value
exceed 19 bytes. If the input is untrusted, as your follow-up seems to
indicate, this undefined behaviour likely manifests in allowing an
attacker to execute code of his own choosing. Even if you avoid the
buffer overflow using a format string like "%19s->%19s" it is still not
very good as you may not get an error if the string is too long. Silent
truncation might invalidate security checks done elsewhere and can lead
to hard-to-diagnose bugs.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Does FreeBSD have replacement for posix_fadvice() or fcntl(F_RDADVISE)?

2011-05-12 Thread Jilles Tjoelker
On Thu, May 12, 2011 at 11:38:12AM +0400, Lev Serebryakov wrote:
>Does FreeBSD have some custom call, which can be used where Linux
>  programs uses posix_fadvice() and DARWIN ones fcntl(F_RDADVISE)?

>It is like madvise(2) but for file descriptors.

An effect like POSIX_FADV_SEQUENTIAL can be obtained with the
F_READAHEAD or F_RDAHEAD fcntl(2) requests. The implementation is
divided between the generic VFS layer and various filesystems (cd9660,
ext2fs, msdosfs, nfs, xfs and ufs appear to use the information to some
degree).

Setting readahead to 0 with the fcntl requests has little effect: it
resets the state of the sequential access detection heuristic but does
not disable it.

The O_DIRECT open(2) flag has an effect similar to what
POSIX_FADV_NOREUSE should do, except that it changes semantics by adding
alignment restrictions.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: [RFC] rcexecr: rcorder in parallel

2011-06-04 Thread Jilles Tjoelker
On Sat, Jun 04, 2011 at 12:45:37PM +0100, Julien Laffaye wrote:
> On Sat, Jun 4, 2011 at 10:10 AM, Buganini  wrote:
> > https://github.com/buganini/rcexecr

> > Currently it is able to determine the exec/wait order

> > There are something I haven't digged in deeply in the "self
> > modification" part.

> > patches/ideas are welcome.

> Thanks for doing that!

Yes.

> You should use kqueue(2) instead of waitpid(2) so that you can
> efficiently monitor a pool of processes.
> See pwait(1) for an example.

Hmm, I don't think kqueue() should be used here. Its main advantage is
that it works regardless of parent-child relationships, but that
advantage is not relevant here. On the other hand, waitpid() is still
necessary to get rid of the zombies. Furthermore, waitpid() is standard
while kqueue() is not, and I think non-standard interfaces should only
be used if they provide a real benefit above standard interfaces.

The current approach with waitpid() for specific processes should be
good enough for a proof of concept. It will keep zombies longer than
necessary, particularly for things that are not explicitly depended on.
To avoid this, use waitpid(-1, ...) and maintain more tracking for
processes that have already terminated.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: int64_t and printf

2011-06-05 Thread Jilles Tjoelker
On Sun, Jun 05, 2011 at 02:31:27PM -0400, Sean C. Farley wrote:
> On Sun, 5 Jun 2011, Ben Laurie wrote:
> > So, for example int64_t has no printf modifier I am aware of. Likewise 
> > its many friends.

> > In the past I've handled this by having a define somewhere along the 
> > lines of...

> > #if 
> > # define INT_64_T_FMT "%ld"
> > #else
> > # define INT_64_T_FMT "%lld"
> > #endif

> > but I have no idea where to put such a thing in FreeBSD. Opinions? 
> > Also, I guess I'd really need to do a modifier rather than a format, 
> > for full generality.

> You need to include inttypes.h, which includes machine/_inttypes.h. 
> This will provide the appropriate macro which in this case is PRId64.

The macros from  are certainly valid, but the style in most
of FreeBSD prefers casting to a type such as intmax_t or uintmax_t for
which a constant format string is available (%jd/%ju). In the particular
case of int64_t, it would seem that long long is better than intmax_t,
as it is possibly shorter but still guaranteed large enough by C99, but
there are objections against this that I do not understand.

Also, note the format strings for size_t, %zu, and for ptrdiff_t, %td.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


tr A-Z a-z in locales other than C

2011-06-06 Thread Jilles Tjoelker
t;);
-   for (cnt = 0; cnt < NCHARS_SB; cnt++)
-   if (charcoll((const void *)&cnt, (const void *)&(s->lastch)) >= 
0 &&
-   charcoll((const void *)&cnt, (const void *)&stopval) <= 0)
-   *p++ = cnt;
-   *p = OOBCH;
-   n = p - s->set;
-
-   s->cnt = 0;
-   s->state = SET;
-   if (n > 1)
-   mergesort(s->set, n, sizeof(*(s->set)), charcoll);
+   s->cnt = stopval - s->lastch + 1;
+   s->state = RANGE;
+   --s->lastch;
return (1);
 }
 

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: tr A-Z a-z in locales other than C

2011-06-07 Thread Jilles Tjoelker
On Tue, Jun 07, 2011 at 04:24:43AM +0400, Andrey Chernov wrote:
> On Tue, Jun 07, 2011 at 12:41:05AM +0200, Jilles Tjoelker wrote:

> > There is a related issue with ranges in regular expressions, glob and
> > fnmatch (likewise unspecified by POSIX outside the POSIX locale), but
> > this is less likely to cause problems.

> You care about ports, but suggested change is americano-centrism which 
> kills tr usage for national language documents due to impossibility to 
> specify whole national alphabet easily, just by two letters.

Hmm, so that's with translation to a constant, or with the -d and/or -s
options. In such cases, there may be a range for all letters with
collation order, but not with codeset order (mainly if "all letters"
includes letters with diacritical marks).

In FreeBSD, upper case sorts before lower case, so cases can be
distinguished this way but all letters may require two ranges. In most
other operating systems the cases go together so a single range is
sufficient, but cases cannot be distinguished. Making such things work
on multiple operating systems requires careful testing.

> Moreover, having differently treated regex ranges in tr vs other places 
> you mention will produce additional chaos.

I think this is already inconsistent because some programs do not enable
locale or use different locale code.

With UTF-8 or other multibyte character sets, this is even more so
because functions like isalpha work very poorly by definition and there
is no collation support for such character sets in FreeBSD.

> Back to the ports: it is not hard to run _any_ port's make or configure 
> with LANG=C directly by the ports Mk system to eliminate that problem.

True, but some ports install scripts with problematic tr calls.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: tr A-Z a-z in locales other than C

2011-06-07 Thread Jilles Tjoelker
On Wed, Jun 08, 2011 at 09:56:39AM +1200, Atom Smasher wrote:
> the man page makes it clear...

>   Translate the contents of file1 to upper-case.

> tr "[:lower:]" "[:upper:]" < file1

>   (This should be preferred over the traditional UNIX idiom of ``tr a-z
>   A-Z'', since it works correctly in all locales.)

> for any other uses, either build the port with locale specified as "C" as 
> mentioned, or patch the port so:
>   tr '[a-z]' '[A-Z]'
>   becomes:
>   env LC_ALL=C tr '[a-z]' '[A-Z]'

> the only change that would be appropriate to the tr utility would be a 
> command-line option to select a locale... something like:
>   tr -l C '[a-z]' '[A-Z]'

> i don't think anyone would object to that, but it would still require 
> patching some ports under some locales...

That new option would provide zero benefit. If things are going to be
patched anyway then patch them to be standards compliant.

> maybe another option would be modifying tr to recognize other [new] 
> environment variables... TR_LANG, TR_LC_ALL, TR_LC_CTYPE and 
> TR_LC_COLLATE. done that way, things could be set in /etc/make.conf (or 
> sys.mk), not need any patching, and not interfere with other uses of 
> locale.

That would be rather ugly.

If  tr a-z A-Z  is supposed to be deceiving in some locales, then let it
remain so unconditionally.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


[PATCH] [RFC] sh(1) vfork support

2011-06-13 Thread Jilles Tjoelker
   ps->pid = pid;
+   ps->status = -1;
+   ps->cmd = nullstr;
+   jp->foreground = 1;
+#if JOBS
+   setcurjob(jp);
+#endif
+   }
+   INTON;
+   TRACE(("In parent shell:  child = %d\n", (int)pid));
+   return pid;
+}
+
+
 /*
  * Wait for job to finish.
  *

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: invalid argument in select() when peer socket is in FD_SET

2011-07-31 Thread Jilles Tjoelker
On Sun, Jul 31, 2011 at 04:20:08PM +0200, Christoph P.U. Kukulies wrote:
> I posted this on freebsd-questions also but maybe the expert density 
> isn't that high as here in "hackers".
> Since I think it may be a design or implementation issue in FreeBSDs' 
> select(), I'm posting it here as well,
> hoping to get an experts' answer.

> I have written a small server to test TCP/IP roundtrip times of the 
> packets in a proprietary protocol and while
> compiling and running this server on different platforms (Windows 
> 7/cygwin, UbuntuLinux, FreeBSD 8.0 Release), I found
> that the server produces an error when the listening socket (on which 
> the accept() is performed) is also
> member of the select() fd_set.

> On the other platforms the program works without error, just under 
> FreeBSD I'm getting this "invalid argument" error.

> Comments appreciated (despite comments about the error checking logic

> [snip]
>  tv.tv_sec = 0;
>  tv.tv_usec = 500;   /* 5 seconds */
> [snip]
>  n = select(nfds, &readfds,
> (fd_set *) NULL, /* not interested in write */
> (fd_set *) NULL, /* ...or exceptions */
> &tv);/* timeout */

The number of microseconds in a struct timeval must be nonnegative and
less than one million (likewise, the number of nanoseconds in a struct
timespec must be nonnegative and less than 1000 million).

FreeBSD checks this strictly in most functions.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: eliminating a syscall on accept()+ioctl() combo

2011-08-02 Thread Jilles Tjoelker
On Mon, Aug 01, 2011 at 08:11:04AM +0200, Vlad Galu wrote:
> On Jul 31, 2011, at 9:59 PM, Bernard van Gastel wrote:
> > I want to reduce the number of syscalls for my networking
> > application. The app handles incoming connections with the
> > 'accept()' system call. Is there a way to specify to accept() that
> > the newly created file descriptors should be non-blocking (FIONBIO)?
> > This will avoid an ioctl() after the accept(). Thanks!

> You can make your listening socket non-blocking. Newly created file
> descriptors will inherit that property. However, that will require you
> to select()/poll()/kqueue() for that descriptor as well, instead of
> simply blocking in accept().

This is documented FreeBSD behaviour and common across BSDs, but is not
portable. POSIX leaves it unspecified what the non-blocking state of the
new socket is and in fact Linux always makes the new socket blocking
(unless you request non-blocking using their new accept4() call).

Because this portability issue can be very subtle, I suggest not blindly
relying on it.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


fdopendir() wrongly closes passed fd on error, union mess

2011-08-09 Thread Jilles Tjoelker
While trying to use openat()/fdopendir()/fstatat() to improve
performance of sh(1) pathname generation, I noticed that fdopendir()
closes the passed file descriptor if it fails (such as when stable/8
silently ignores O_DIRECTORY even though it is defined in the header
file).

The below patch should fix this problem. I have changed the ugly
DTF_REWIND code so it moves the reopened directory to the same file
descriptor number, so as to minimize change there while ensuring the
correct fd is closed.

A later possibility is to fix the DTF_REWIND problem by reading union
directories from a new descriptor obtained via
  fd2 = openat(fd, ".", O_RDONLY | O_DIRECTORY | O_CLOEXEC);
and doing this unconditionally, retiring the DTF_REWIND flag.

Reading the kernel code, it appears that calling getdirentries() on an
open file description in a filesystem mounted with MNT_UNION (mount -o
union) may change that open file description irreversibly to one
pointing to the lower layer. This must not happen to the descriptor
passed to fdopendir() or the descriptor returned via dirfd() since
fchdir() and *at functions may not work properly with such a changed
descriptor.

Note: unionfs does not appear to mangle the open file description like
MNT_UNION does, but it does need the duplicate removal code like
MNT_UNION.

Index: lib/libc/gen/opendir.c
===
--- lib/libc/gen/opendir.c  (revision 224739)
+++ lib/libc/gen/opendir.c  (working copy)
@@ -75,6 +75,8 @@ __opendir2(const char *name, int flags)
 {
int fd;
struct stat statb;
+   DIR *dir;
+   int saved_errno;
 
/*
 * stat() before _open() because opening of special files may be
@@ -89,7 +91,13 @@ __opendir2(const char *name, int flags)
if ((fd = _open(name, O_RDONLY | O_NONBLOCK | O_DIRECTORY)) == -1)
return (NULL);
 
-   return __opendir_common(fd, name, flags);
+   dir = __opendir_common(fd, name, flags);
+   if (dir == NULL) {
+   saved_errno = errno;
+   _close(fd);
+   errno = saved_errno;
+   }
+   return (dir);
 }
 
 static int
@@ -110,6 +118,7 @@ __opendir_common(int fd, const char *name, int fla
int incr;
int saved_errno;
int unionstack;
+   int fd2;
struct stat statb;
 
dirp = NULL;
@@ -199,14 +208,15 @@ __opendir_common(int fd, const char *name, int fla
 * which has also been read -- see fts.c.
 */
if (flags & DTF_REWIND) {
-   (void)_close(fd);
-   if ((fd = _open(name, O_RDONLY | O_DIRECTORY)) == -1) {
+   if ((fd2 = _open(name, O_RDONLY | O_DIRECTORY)) == -1) {
saved_errno = errno;
free(buf);
free(dirp);
errno = saved_errno;
return (NULL);
}
+   (void)_dup2(fd2, fd);
+   _close(fd2);
}
 
/*
@@ -309,7 +319,6 @@ __opendir_common(int fd, const char *name, int fla
 fail:
saved_errno = errno;
free(dirp);
-   (void)_close(fd);
errno = saved_errno;
return (NULL);
 }

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Mixing Asynchronous Network I/O and POSIX Threads

2011-09-18 Thread Jilles Tjoelker
On Sun, Sep 18, 2011 at 11:32:08AM -0400, Richard Yao wrote:
> I wrote a program for Linux that uses Asynchronous Network I/O and
> POSIX Threads. It uses a mix of gettid(), fcntl() and F_SETOWN to
> specify which thread handles each connection's SIGIO interrupts.
> gettid() is Linux-specific and I would prefer to do this in a way that
> also works with FreeBSD. Is that possible?

In FreeBSD, you can only F_SETOWN processes or process groups, not
threads, so a direct port is probably not possible. Another Linux
feature not in FreeBSD is F_SETSIG (to send a realtime signal instead of
SIGIO and provide siginfo_t information).

My recommendation is not to use SIGIO and SIGURG at all. If you use
signal handlers, your program is very likely to suffer from race
condition problems unless you unmask the signal only at well-defined
safe points. If you use sigtimedwait() or similar, select()/poll() can
provide similar functionality. Alternatively, if you have many
descriptors to watch, use non-portable facilities like BSD kqueue, Linux
epoll, Solaris ports or portable wrappers around them. Realtime signals
are nasty for this -- the signal queue has a finite size and when it
overflows you need a "resync" step that is expensive both in CPU and
programmer time.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Re[2]: Sharing device driver between kernel and user space

2011-09-23 Thread Jilles Tjoelker
On Wed, Sep 21, 2011 at 04:23:28PM +0200, Ivan Voras wrote:
> On 21 September 2011 16:09, geoffrey levand  wrote:
> > Sure i can use the synchronization primitives, the problem is that
> > the response to a request sent to PS3 VUART port is not available
> > immediately, and i have to disallow kernel access to the PS3 VUART
> > while i'm waiting for the response in user space. I send request
> > with write syscall from user space and wait for response with read
> > syscall. In the period of time between sending request and receiving
> > response i could receive some other packets from VUART port, e.g.
> > some kind of event notification,  i have to skip them. But kernel
> > should not interfer until i get my response.  So i would need to
> > lock out the kernel during this time. I think i found a good
> > solution for this problem, just use a IOCTL which tells kernel
> > device driver to stop processing kernel requests and events,
> > something like SET_USER_MODE.  After that i can use it in user
> > space.

Perhaps it is better to lock out kernel activity (and further opens)
while userland has the device open.

In any case, a close should probably release the device (possibly after
some sort of cleanup), so that a crash of the userland program does not
necessitate a reboot before the device can be used again.

> Have you read sema(9)?

sema(9) is best avoided in new code. It is not used much in existing
code, and it would be useful to be able to remove some day in order to
reduce redundancy in the provided synchronization primitives.

Instead, see mutex(9) and condvar(9).

The implementation of sema(9) is not very optimized, using a mutex and a
condition variable. (This is unlike the userland sem_post(3) and friends
which only require a single atomic op in the uncontested case, except
for named semaphores in 8.x and earlier.)

> Or if returning EBUSY is acceptable when the resource is in use by
> $whatever, maybe you just need a boolean variable.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Does anyone use nscd?

2011-10-08 Thread Jilles Tjoelker
On Wed, Oct 05, 2011 at 03:54:00PM -0700, Artem Belevich wrote:
> 2011/10/5 Dag-Erling Sm?rgrav :
> > Michael Bushkov  writes:
> >> 2. Consequences of the aforementioned problem can probably be
> >> corrected by using _setsockopt(..., SO_NOSIGPIPE) in
> >> __open_cached_connection() in nscachedcli.c

> > That sounds like a workaround rather than a fix...

> Not necessarily. Using SO_NOSIGPIPE is a valid option when someone
> wants to see read/write on a closed socket fail and return -1 with
> errno=EPIPE.

> Quick grep in libc shows that resolver code in
> lib/libc/resolv/res_send.c also sets SO_NOSIGPIPE for exactly that
> reason.

Disabling SIGPIPE is good anyway because a crashing/dying nscd should
not cause applications to terminate. However, if EPIPE/SIGPIPE happens
in normal operation, that is still a bug that should be fixed.

By the way, SO_NOSIGPIPE is not in POSIX.1-2008 while the MSG_NOSIGNAL
flag to send() is. It may be better to replace the write() call with
send() with the MSG_NOSIGNAL flag and drop the setsockopt().

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: xlocale patch

2011-11-20 Thread Jilles Tjoelker
On Wed, Sep 21, 2011 at 08:28:54PM +0100, David Chisnall wrote:
> And here's an updated version of the patch.  I've fixed some other
> bugs, including where wcstod() and wcstodl() in trunk return the wrong
> value for any input string starting with spaces, wchar.h's violation
> of POSIX by not declaring FILE, and a few C++ incompatibilities in
> other headers (e.g. putchar being a macro, which breaks things like
> std::putchar(foo)).  All of my libcxxrt and libc++ changes have now
> been pushed upstream, so this should now be repeatable.  
> 
> The libunwind port still has an irritating bug in the header, where
> the extern "C" {} block ends with a semicolon, which causes it to be
> rejected in any C++ program, but with that fixed you can compile
> libcxxrt (I used cmake .. -DCMAKE_CXX_FLAGS="-I/usr/local/include
> -nostdlib -g" - hopefully I'll work out how to make CMake add these
> flags automatically soon...).  Libc++ should build out of the box with
> cmake.

I think it is unfortunate that  depends on #include order; we
should be moving away from such ordering dependencies, not adding more.
POSIX does not have such dependencies so if they add all these new _l
functions, they will most likely do it differently.

POSIX.1-2008 has some of the _l functions but adds them to the plain
header; for example isalnum_l() is in  like isalnum(),
nl_langinfo_l() is in  like nl_langinfo() and strcasecmp_l()
is in  like strcasecmp(). This appears more sensible to me.
The  header can then be an empty file.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: cron(8) mis-feature with @reboot long after system startup

2011-11-26 Thread Jilles Tjoelker
On Sat, Nov 26, 2011 at 02:58:46PM -0500, APseudoUtopia wrote:
> On Fri, Nov 25, 2011 at 4:36 PM, Christian Kastner  wrote:
> > On 2011-11-25 08:02, Jason Hellenthal wrote:
> >> So with that said... is there a way we could actually make this run
> >> @reboot only ?

> > Debian's cron[0] and Fedora's cronie[1] have solved this by touching a
> > file on first startup and running @reboot only when this file does not
> > yet exist.

> I like this idea, however it has a major caveat: Assuming the shutdown
> scripts remove said file (and the boot scripts create said file), what
> happens in the event that the disk was umount'ed uncleanly? For
> example, a power failure (I know, that's what UPSs are for, but lets
> ignore that for a second). If the system is configured to
> automatically boot after a power failure, the @reboot cron script wont
> run (since the said file still exists...).

The file can be stored in /var/run, which is cleared at boot.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: "ps -e" without procfs(5)

2011-12-04 Thread Jilles Tjoelker
On Sat, Oct 29, 2011 at 01:32:39PM +0300, Mikolaj Golub wrote:
> [KERN_PROC_AUXV requires just p_cansee()]

If we are ever going to do ASLR, the AUXV information tells an attacker
where the stack, executable and RTLD are located, which defeats much of
the point of randomizing the addresses in the first place.

Given that the AUXV information seems to be used by debuggers only
anyway, I think it would be good to move it to p_candebug() now.

The full virtual memory maps (KERN_PROC_VMMAP, procstat -v) are already
under p_candebug().

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: "ps -e" without procfs(5)

2011-12-04 Thread Jilles Tjoelker
On Sun, Dec 04, 2011 at 10:58:10PM +0200, Mikolaj Golub wrote:
>  RNMW> Agreed. In general, my view is that p_cansee() should be used for very
>  RNMW> few of our process inspection APIs. I like your example of ASLR
>  RNMW> especially, as it illustrates how debugging information can aid even
>  RNMW> local attacks (i.e., user vs. setuid binary).

> What do you think about recently added kern.proc.ps_strings, which
> returns location of ps_strings structure? It uses p_cansee() too. The
> location is the same for all processes of the same ABI, so this does
> not look like sensitive information, on the other hand it also seems
> to be used by debuggers only.

With stack ASLR, the address will not be the same for every process of
the same ABI and will be sensitive information. Therefore I think this
should be locked down too.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: intent of tab-completion in /bin/sh in 9.0

2012-01-18 Thread Jilles Tjoelker
On Wed, Jan 18, 2012 at 11:00:44AM -0500, Matthew Story wrote:
> Just noticed that tab-completion in /bin/sh has been added in 9.0
> (verified that it is not there in 8.0, dunno if it's there in 8.2,
> could probably go digging to figure it out).  In addition to the
> command history via : (which is present in 8.0) FreeBSD sh
> is now actually a pretty usable interactive shell.  I also noticed
> that the following bit has been removed from the sh(1):

>  This version has many features which make it appear similar in some
> respects to the Korn shell, but it is not a Korn shell clone like pdksh.

That sentence is an implicit comparison to the Bourne shell. When it was
written, many other Un*x variants had a version of the Bourne shell as
/bin/sh, and it was not taken for granted to have things like $(...),
$((...)) and ${...#...} in a /bin/sh. Nowadays, this is often taken for
granted; the last OS to have a Bourne shell as /bin/sh is probably
Solaris 10 (11 has ksh93).

On the contrary, our /bin/sh is minimalistic compared to many other
shells used in that role, like bash, pdksh, mksh and ksh93. It (the 9.0
version) has only slightly more features than dash or NetBSD's sh, and
dash has instead some other features.

> Just wondering if the general direction here is attempting to provide a
> minimal POSIX shell, that is useful enough interactively to become the
> default root shell (supplanting csh)?  Or if there is just a general trend
> towards adopting more of the ksh feature-set.

POSIX itself has gradually adopted ksh features, so seeing more of them
in future is not unlikely. Most of the new language features in 9.0 are
either from POSIX.1-2008 or on the roadmap for a new version of POSIX
(in collaboration with other shell authors).

Adding other ksh features is not very likely.

It is certainly possible to use /bin/sh as root's shell, but the
distributed master.passwd entry will probably continue to use /bin/csh
for a long time.

Some plans for sh in 10.0 are in this mailing list post:
http://lists.freebsd.org/pipermail/freebsd-arch/2011-December/011976.html

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: intent of tab-completion in /bin/sh in 9.0

2012-01-19 Thread Jilles Tjoelker
On Wed, Jan 18, 2012 at 08:46:18PM -0500, Matthew Story wrote:
> On Wed, Jan 18, 2012 at 5:16 PM, Jilles Tjoelker  wrote:
> > POSIX itself has gradually adopted ksh features, so seeing more of them
> > in future is not unlikely. Most of the new language features in 9.0 are
> > either from POSIX.1-2008 or on the roadmap for a new version of POSIX
> > (in collaboration with other shell authors).

> Tab completion is a welcome addition, I was unaware that this had been (or
> is slated to be) added to the POSIX specification.  This makes far more
> sense than my proposed explanations.  Thanks for the clarification.

Tab completion is not in POSIX and not planned to be (as far as I know),
although there is some fairly ugly pathname completion functionality
required in 'set -o vi' mode (we do not implement it).

The reason is more like that I noticed that NetBSD had it and found
someone willing to port it and add some small features (escaping special
characters in what is inserted).

In using /bin/sh as login shell on virtual machines (to avoid the need
to install something else from ports), I have found the filename
completion to be remarkably useful.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


sh(1) vfork patch, with benchmarks

2012-01-25 Thread Jilles Tjoelker
Here is a new version of the vfork patch. The concept is the same,
trying to use vfork in simple common cases. Compared to
http://lists.freebsd.org/pipermail/freebsd-hackers/2011-June/035618.html
I have extended vfork use to some command substitutions and allowed
setting an environment variable SH_DISABLE_VFORK to disable all vfork
use.

The test machine is a 4-core Phenom II X4 in 32-bit mode with 4GB RAM,
stable/8 with newer sh. I have run tests with a dummy environment
variable SH_DISABLE_VFORL=1 (y) and with SH_DISABLE_VFORK=1 (n).

A microbenchmark
sh -c 'x=0; while [ $x -lt 1 ]; do /bin/kill -0 $$; x=$(($x + 1)); done'
is much faster:

x micro-vfork-timings1-n
+ micro-vfork-timings1-y
+--+
|+ |
|+ x   |
| + ++  x  xx  |
|++ ++  xx xx x|
| |_A||___AM_| |
+--+
N   Min   MaxMedian   AvgStddev
x   9  3.86  4.05 4 3.963   0.076648549
+   9  2.52   2.6  2.58 2.572   0.033082389
Difference at 95.0% confidence
-1.39111 +/- 0.0589948
-35.0995% +/- 1.48851%
(Student's t, pooled s = 0.0590315)

A make -j4 buildkernel is about 0.5% faster:

x buildkernel-vfork-timings-n
+ buildkernel-vfork-timings-y
+--+
|   +  x + |
|+  + +* + + +x*+xx+   x   xx x +   x  x x  *+  *  x+ x|
||__|M_AMA|___||
+--+
N   Min   MaxMedian   AvgStddev
x  17 435.3443.65 438.8 439.03588  2.378805
+  17 432.8442.86436.76 437.06412   3.20108
Difference at 95.0% confidence
-1.97176 +/- 1.97034
-0.449112% +/- 0.448789%
(Student's t, pooled s = 2.82007)

In both cases, the difference comes mainly from the system time, but the
user time is also a bit lower (statistically significant).

In a virtual machine with 10-current (default kernel config + capsicum
and procdesc) and the patch, the microbenchmark is similarly much
faster:

x micro-vfork-timings-n
+ micro-vfork-timings-y
+--+
|+ +   |
|+++ x |
|xx x  |
|   xx |
|+   x  xxx|
||_A|   |_A_|  |
+--+
N   Min   MaxMedian   AvgStddev
x  18 15.14 15.85 15.5715.5550.17088868
+  18  9.79 10.14  9.92 9.9127778   0.096820365
Difference at 95.0% confidence
-5.64222 +/- 0.0940703
-36.2727% +/- 0.604759%
(Student's t, pooled s = 0.138883)

Ian Lepore has been kind enough to try an earlier version of this patch
on some sort of ARM board and reports an improvement in boot time from
54 to 51 seconds, and a large difference in microbenchmarks.

commit f55a350fa9c3792e10f93160a93d016a7bfdd630
Author: Jilles Tjoelker 
Date:   Mon May 30 00:31:45 2011 +0200

sh: Use vfork in a few common cases.

This uses vfork() for simple commands and command substitutions containing a
single simple command, invoking an external program under certain conditions
(no redirections or variable assignments, non-interactive shell, no job
control).

The use of vfork() can be disabled by setting a variable named
SH_DISABLE_VFORK.

diff --git a/bin/sh/eval.c b/bin/sh/eval.c
index a5f0aff..2d90921 100644
--- a/bin/sh/eval.c
+++ b/bin/sh/eval.c
@@ -921,6 +921,15 @@ evalcommand(union node *cmd, int flags, struct backcmd 
*backcmd)
if (pipe(pip) < 0)
error("Pipe call failed: %s", strerror(errno));
}
+   if (cmdentry.cmdtype == CMDNORMAL &am

Re: sh(1) vfork patch, with benchmarks

2012-01-25 Thread Jilles Tjoelker
On Wed, Jan 25, 2012 at 11:54:46PM +0100, Jilles Tjoelker wrote:
> [snip]
> x micro-vfork-timings1-n
> + micro-vfork-timings1-y
> +--+
> |+
>  |
> |+ x  
>  |
> | + ++  x  xx 
>  |
> |++ ++  xx xx 
> x|
> | |_A|
> |___AM_| |
> +--+
> N   Min   MaxMedian   AvgStddev
> x   9  3.86  4.05 4 3.963   0.076648549
> +   9  2.52   2.6  2.58 2.572   0.033082389
> Difference at 95.0% confidence
> -1.39111 +/- 0.0589948
> -35.0995% +/- 1.48851%
> (Student's t, pooled s = 0.0590315)

> [snip]

I forgot to mention, the numbers are time in seconds (measured with
/usr/bin/time).

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: sem(4) lockup in python?

2012-02-05 Thread Jilles Tjoelker
On Sun, Feb 05, 2012 at 01:32:42PM -0500, Daniel Eischen wrote:
> On Sun, 5 Feb 2012, Ivan Voras wrote:
> > On 5 February 2012 11:44, Garrett Cooper  wrote:

> >>    'make MAKE_JOBS_NUMBER=1' is the workground used right now..

> > David Xu suggested that it is a bug in Python - it doesn't set
> > process-shared attribute when it calls sem_init(), but i've tried
> > patching it (replacing the port patchfile file the one I've attached)
> > and I still get the hang.

> I don't understand how process shared semaphores can work.  Perhaps
> I'm dumb and ignorant, but a sem_id_t is an allocated struct.   The
> actual kernel sem_id is inside the struct.  Isn't this the same
> reason pthread_mutex_t and pthread_cond_t cannot be process-shared?

That's how the old implementation works. It does not support
process-shared semaphores although they may happen to work in some
specific cases.

However, in 9.0, sem_t works differently and contains the actual lock
word directly, so that process-shared semaphores work. The
implementation is in lib/libc/gen/sem_new.c. The pshared flag to
sem_init() is not a no-op because it tells the kernel to allow for use
from multiple processes.

Note that the old implementation is still present as well, for
compatibility with old binaries.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: xargs short-circuit

2012-02-14 Thread Jilles Tjoelker
On Tue, Feb 14, 2012 at 01:34:49PM -0500, Matthew Story wrote:
> After reading the man-page, and browsing around the internet for a minute,
> I was just wondering if there is an option in (any) xargs to short-circuit
> on first failure of [utility [arguments]].

> e.g.

> $ jot - 1 10 | xargs -e -n1 sh -c 'echo "$*"; echo exit 1' worker || echo $?
> 1
> 1

> such that any non-0 exit code in a child process would cause xargs to stop
> processing.  seems like this would be a nice feature to have.

As per xargs(1), you can do this by having the command exit on a signal
or with a value of 255.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: a sysctl for process binary osreldate

2012-03-17 Thread Jilles Tjoelker
On Sat, Mar 17, 2012 at 09:30:05PM +0200, Mikolaj Golub wrote:
> I added osrel output to procstat -b option:

> kopusha:~% procstat -b 2975
>   PID COMMOSREL PATH
>  2975 emacs 101 /usr/local/bin/emacs-23.3

> Would this be ok or someone see a better way?

Hmm, this means that procstat is not supposed to be used from scripts as
it is apparently OK to change its output format like this?

In some ways, querying via ps would be better for scripts since it
allows things like
  ps -p PID -o KEYWORD=
which do not need additional parsing except that many of the newer
things in procstat do not have ps keywords.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: mmap segmentation fault

2012-04-12 Thread Jilles Tjoelker
On Fri, Apr 13, 2012 at 12:25:32AM +0530, Maninya M wrote:
> I want to allocate memory at a specified address location 'a' of size 'b'.
> I wrote code below to do it, but I'm getting a seg fault. How can I solve
> this?
> How can I get the allocated memory at the required address?

> int main()
> {
> unsigned int *addr,*newaddr;
> unsigned long a=134516736,a1;
> unsigned long b=3895296;
> unsigned long flags =6;
> a1=(a&0x);
> printf("%x\n",(void *)a);
> newaddr=(unsigned int *)mmap((void *)a,b,6,MAP_ANONYMOUS|MAP_FIXED,-1,0);

> if(newaddr==MAP_FAILED)
> printf("mmap failed");
> else
> printf("sucess %x",newaddr);
> return 0;
> }

> Output is
> 8049000
> Segmentation fault

If this is i386, you're mapping onto the executable itself. If you
really want to map something there, you will have to move your code
somewhere else or manipulate your executable to contain a suitable
memory area at the required address.

Try, for example,
  procstat -v $$

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: [RFC] last(1) with security.bsd.see_other_uids support

2012-06-05 Thread Jilles Tjoelker
On Sun, Jun 03, 2012 at 08:42:04PM -0500, Bryan Drewery wrote:
> I've written up a patch to add some privacy to last(1) while still
> giving non-privileged users access to their own login history.

> This is still a work in progress. I am reaching out to make sure my
> approach is proper and to get some input on code sharing. My goal is to
> add this support to w(1) and who(1) as well. FWIW I have been running a
> similar patch on my own shared-hosting systems (pre-utmpx) for a few years.

> Changes:

> * Added utmp group
> * All utmpx files are 660 root:utmp

The utmp group was previously (on other systems) used to allow
applications write access to utmp without making them setuid root
(making them setgid utmp instead).

> * last(1) runs setgid(utmp) and drops this as soon as the utmpx files
> are opened.
> * Users in the wheel or utmp group can see all entries
> * IFF security.bsd.see_other_uids=0: users only see their own entries,
> as well as shutdown/boot/init times.
> * If security.bsd.see_other_uids=1, all entries are always shown, as it
> does now.

> Justifications:

> Why the changes? This makes sense for shared hosting environments where
> jails are not practical. A user should be able to see their own login
> history, to see if someone has been accessing their account, but not to
> see the IPs of other users. The intention is *not* to disallow them to
> see that other users of the system. Obviously they can just cat
> /etc/passwd. This is just about IP privacy.

> Why the setgid? Allow reading the entries, but disallow directly opening
> the utx files. I've seen some shared hosts incorrectly chmod 0
> /usr/bin/last, but still leave the utmp files wide open for reading!

> Why the utmp group? It's consistent with other systems (OpenBSD, Linux),
> and allows giving a user access to see all entries, without granting
> them wheel or allowing a non-privileged user to run as setgid(wheel). It
> also helps mitigates security concerns by using a specific group only
> having extra privilege to utmpx files.

> I originally had not planned for security.bsd.see_other_uids, but
> considering POLA and consistency, it makes sense.

This requires every utmpx access to go through a setgid binary,
regardless of the value of security.bsd.see_other_uids. If called by a
user that is not root or in the utmp group, getutxent(3) and related
APIs become worthless.

While POSIX permits this (security restriction denying all visibility of
utmpx), this is not what applications expect. For example, tcsh and zsh
offer a "watch" feature that reports on logins and logouts by calling
utmpx APIs.

To avoid this, the utmpx APIs could communicate with a privileged daemon
if the files are not readable. The daemon can check the identity of the
caller via getpeereid(3). (Unfortunately, even if getpeereid() is
bypassed and LOCAL_PEERCRED called directly, only 16 groups can be
queried. Therefore the daemon cannot check the process credential for
the groups but will have to check the group database for the user.)

Also, the attack surface of such a daemon may be smaller than that of a
setuid/setgid program.

Alternatively, the daemon could be a setgid program that is spawned by
the utmpx APIs when needed.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: [RFC] last(1) with security.bsd.see_other_uids support

2012-06-06 Thread Jilles Tjoelker
On Wed, Jun 06, 2012 at 01:20:12PM +0200, Pawel Jakub Dawidek wrote:
> On Tue, Jun 05, 2012 at 11:31:01PM +0200, Jilles Tjoelker wrote:
> > Also, the attack surface of such a daemon may be smaller than that of a
> > setuid/setgid program.

> Really? I don't see that. With current patch and setgid to utmp the
> process can only read some files that don't even contain very sensitive
> data (like passwords).

> Any privileged daemon is much bigger threat. Also, do we really want a
> daemon running all the time just to be able to parse utx files?

The daemon would run with non-root privileges just sufficient to read
the utmpx files.

If we have a good way to start it, the attack surface is limited to what
you can do with its socket and this can be cut down tightly.

On the other hand, an attacker can control various process attributes of
a setgid program such as the output file, a subset of signals, rlimits
and a subset of environment variables. For example, last, w and who have
some degree of locale support (time/date formats). Also, in this
particular case, dropping privileges does not help much since the utmpx
file descriptor is almost as valuable as the group credentials.

I agree that leaving a daemon running for this is ugly.

> > Alternatively, the daemon could be a setgid program that is spawned by
> > the utmpx APIs when needed.

> Still seems a bit too far for my taste. Spawning a daemon somewhere from
> within library doesn't sound like a good idea to me... At least until we
> have something like launchd that can start such services on demand.

The suggested approach is used by old implementations of grantpt(). If
the kernel does not set up the ownership of a new pseudo terminal
properly, grantpt() can invoke a setuid root binary, for example
/usr/libexec/pt_chown.

Similarly, the utmpx APIs might invoke a setgid helper if they cannot
read the files themselves. Communication would be over a pipe.

This has the downside of having a setgid program at all but things like
locale support are handled by the calling (unprivileged) application and
not by the setgid program.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


system() using vfork() or posix_spawn()

2012-07-30 Thread Jilles Tjoelker
People sometimes use system() from large address spaces where it would
improve performance greatly to use vfork() instead of fork().

A simple approach is to change fork() to vfork(), although I have not
tried this. It seems safe enough to use sigaction and sigprocmask system
calls in the vforked process.

Alternatively, we can have posix_spawn() do the vfork() with signal
changes. This avoids possible whining from compilers and static
analyzers about using vfork() in system.c. However, I do not like the
tricky code for signals and that it adds lines of code.

This is lightly tested.

Index: lib/libc/stdlib/system.c
===
--- lib/libc/stdlib/system.c(revision 238371)
+++ lib/libc/stdlib/system.c(working copy)
@@ -42,16 +42,21 @@
 #include 
 #include 
 #include 
+#include 
 #include "un-namespace.h"
 #include "libc_private.h"
 
+extern char **environ;
+
 int
 __system(const char *command)
 {
pid_t pid, savedpid;
-   int pstat;
+   int error, pstat;
struct sigaction ign, intact, quitact;
-   sigset_t newsigblock, oldsigblock;
+   sigset_t newsigblock, oldsigblock, defmask;
+   const char *argv[4];
+   posix_spawnattr_t attr;
 
if (!command)   /* just checking... */
return(1);
@@ -65,28 +70,36 @@
ign.sa_flags = 0;
(void)_sigaction(SIGINT, &ign, &intact);
(void)_sigaction(SIGQUIT, &ign, &quitact);
+   (void)sigemptyset(&defmask);
+   if ((intact.sa_flags & SA_SIGINFO) != 0 ||
+   intact.sa_handler != SIG_IGN)
+   (void)sigaddset(&defmask, SIGINT);
+   if ((quitact.sa_flags & SA_SIGINFO) != 0 ||
+   quitact.sa_handler != SIG_IGN)
+   (void)sigaddset(&defmask, SIGQUIT);
(void)sigemptyset(&newsigblock);
(void)sigaddset(&newsigblock, SIGCHLD);
(void)_sigprocmask(SIG_BLOCK, &newsigblock, &oldsigblock);
-   switch(pid = fork()) {
-   case -1:/* error */
-   break;
-   case 0: /* child */
-   /*
-* Restore original signal dispositions and exec the command.
-*/
-   (void)_sigaction(SIGINT, &intact, NULL);
-   (void)_sigaction(SIGQUIT,  &quitact, NULL);
-   (void)_sigprocmask(SIG_SETMASK, &oldsigblock, NULL);
-   execl(_PATH_BSHELL, "sh", "-c", command, (char *)NULL);
-   _exit(127);
-   default:/* parent */
+   argv[0] = "sh";
+   argv[1] = "-c";
+   argv[2] = command;
+   argv[3] = NULL;
+   if ((error = posix_spawnattr_init(&attr)) != 0 ||
+   (error = posix_spawnattr_setsigmask(&attr, &oldsigblock)) != 0 ||
+   (error = posix_spawnattr_setsigdefault(&attr, &defmask)) != 0 ||
+   (error = posix_spawnattr_setflags(&attr,
+   POSIX_SPAWN_SETSIGDEF | POSIX_SPAWN_SETSIGMASK)) != 0 ||
+   (error = posix_spawn(&pid, _PATH_BSHELL, NULL, &attr,
+   __DECONST(char **, argv), environ)) != 0) {
+   pid = -1;
+   errno = error;
+   } else {
savedpid = pid;
do {
pid = _wait4(savedpid, &pstat, 0, (struct rusage *)0);
} while (pid == -1 && errno == EINTR);
-   break;
}
+   posix_spawnattr_destroy(&attr);
    (void)_sigaction(SIGINT, &intact, NULL);
(void)_sigaction(SIGQUIT,  &quitact, NULL);
(void)_sigprocmask(SIG_SETMASK, &oldsigblock, NULL);

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: system() using vfork() or posix_spawn()

2012-08-05 Thread Jilles Tjoelker
On Mon, Jul 30, 2012 at 01:53:03PM +0300, Konstantin Belousov wrote:
> On Mon, Jul 30, 2012 at 12:24:08PM +0200, Jilles Tjoelker wrote:
> > People sometimes use system() from large address spaces where it would
> > improve performance greatly to use vfork() instead of fork().

> > A simple approach is to change fork() to vfork(), although I have not
> > tried this. It seems safe enough to use sigaction and sigprocmask system
> > calls in the vforked process.

> > Alternatively, we can have posix_spawn() do the vfork() with signal
> > changes. This avoids possible whining from compilers and static
> > analyzers about using vfork() in system.c. However, I do not like the
> > tricky code for signals and that it adds lines of code.

> > This is lightly tested.

> It is interesting to note that for some time our vfork(2) no longer
> stops the whole forked process (parent), only the forking thread is
> waiting for the child exit or exec. I am not sure is this point
> important for system(3), but determined code can notice the difference
> from the fork->vfork switch.

Neither fork nor vfork call thread_single(SINGLE_BOUNDARY), so this is
not a difference.

Thread singling may be noticeable from a failing execve() (but only in
the process doing execve()) and in the rare case of rfork() without
RFPROC.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: system() using vfork() or posix_spawn() and libthr

2012-08-09 Thread Jilles Tjoelker
On Mon, Aug 06, 2012 at 11:25:35AM +0300, Konstantin Belousov wrote:
> On Sun, Aug 05, 2012 at 11:54:32PM +0200, Jilles Tjoelker wrote:
> > On Mon, Jul 30, 2012 at 01:53:03PM +0300, Konstantin Belousov wrote:
> > > On Mon, Jul 30, 2012 at 12:24:08PM +0200, Jilles Tjoelker wrote:
> > > > People sometimes use system() from large address spaces where it would
> > > > improve performance greatly to use vfork() instead of fork().

> > > > A simple approach is to change fork() to vfork(), although I have not
> > > > tried this. It seems safe enough to use sigaction and sigprocmask system
> > > > calls in the vforked process.

> > > > Alternatively, we can have posix_spawn() do the vfork() with signal
> > > > changes. This avoids possible whining from compilers and static
> > > > analyzers about using vfork() in system.c. However, I do not like the
> > > > tricky code for signals and that it adds lines of code.

> > > > This is lightly tested.

> > > It is interesting to note that for some time our vfork(2) no longer
> > > stops the whole forked process (parent), only the forking thread is
> > > waiting for the child exit or exec. I am not sure is this point
> > > important for system(3), but determined code can notice the difference
> > > from the fork->vfork switch.

> > Neither fork nor vfork call thread_single(SINGLE_BOUNDARY), so this is
> > not a difference.
> It is the difference, because vforked child shares parent address space.

> > Thread singling may be noticeable from a failing execve() (but only in
> > the process doing execve()) and in the rare case of rfork() without
> > RFPROC.

> No, other running threads in parent affect vforked child till exec or exit.
> In fact, I would classify this as bug, but not a serious one.

There are some ugly ways this parallel execution is depended on. If the
vforked child calls sigaction() while another thread is also in
sigaction() for that signal, the vforked child needs to wait for the
other thread to release the lock.

This uses a per-process lock to synchronize threads in different
processes, which may not work properly.

If the vforked child is killed (such as by SIGKILL) while holding the
lock, the parent is not killed but its _thr_sigact is damaged.

These problems could be avoided in libthr by skipping the lock in
_sigaction() if a signal action is being set to SIG_DFL or SIG_IGN and
the old action is not queried. In those cases, _thr_sigact is not
touched so no lock is required. This change also helps applications,
provided they call sigaction() and not signal().

Alternatively, posix_spawn() and system() could use the sigaction system
call directly, bypassing libthr (if present). However, this will not
help applications that call vfork() and sigaction() themselves (such as
a shell that wants to implement ... & using vfork()).

posix_spawn() likely still needs some adjustment so that having it reset
all signals (sigfillset()) to the default action will not cause it to
[EINVAL] because libthr does not allow changing SIGTHR's disposition.

Index: lib/libthr/thread/thr_sig.c
===
--- lib/libthr/thread/thr_sig.c (revision 238970)
+++ lib/libthr/thread/thr_sig.c (working copy)
@@ -519,8 +519,16 @@
return (-1);
}
 
-   if (act)
+   if (act) {
newact = *act;
+   /*
+* Short-circuit cases where we do not touch _thr_sigact.
+* This allows performing these safely in a vforked child.
+*/
+   if ((newact.sa_handler == SIG_DFL ||
+   newact.sa_handler == SIG_IGN) && oact == NULL)
+   return (__sys_sigaction(sig, &newact, NULL));
+   }
 
__sys_sigprocmask(SIG_SETMASK, &_thr_maskset, &oldset);
_thr_rwl_wrlock(&_thr_sigact[sig-1].lock);

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: system() using vfork() or posix_spawn() and libthr

2012-08-11 Thread Jilles Tjoelker
On Fri, Aug 10, 2012 at 10:16:04AM +0800, David Xu wrote:
> On 2012/08/09 18:56, Jilles Tjoelker wrote:
> > On Mon, Aug 06, 2012 at 11:25:35AM +0300, Konstantin Belousov wrote:
> >> On Sun, Aug 05, 2012 at 11:54:32PM +0200, Jilles Tjoelker wrote:
> >>> On Mon, Jul 30, 2012 at 01:53:03PM +0300, Konstantin Belousov wrote:
> >>>> On Mon, Jul 30, 2012 at 12:24:08PM +0200, Jilles Tjoelker wrote:
> >>>>> People sometimes use system() from large address spaces where it would
> >>>>> improve performance greatly to use vfork() instead of fork().

> >>>>> A simple approach is to change fork() to vfork(), although I have not
> >>>>> tried this. It seems safe enough to use sigaction and sigprocmask system
> >>>>> calls in the vforked process.

> >>>>> Alternatively, we can have posix_spawn() do the vfork() with signal
> >>>>> changes. This avoids possible whining from compilers and static
> >>>>> analyzers about using vfork() in system.c. However, I do not like the
> >>>>> tricky code for signals and that it adds lines of code.

> >>>>> This is lightly tested.

> >>>> It is interesting to note that for some time our vfork(2) no longer
> >>>> stops the whole forked process (parent), only the forking thread is
> >>>> waiting for the child exit or exec. I am not sure is this point
> >>>> important for system(3), but determined code can notice the difference
> >>>> from the fork->vfork switch.

> >>> Neither fork nor vfork call thread_single(SINGLE_BOUNDARY), so this is
> >>> not a difference.
> >> It is the difference, because vforked child shares parent address space.

> >>> Thread singling may be noticeable from a failing execve() (but only in
> >>> the process doing execve()) and in the rare case of rfork() without
> >>> RFPROC.

> >> No, other running threads in parent affect vforked child till exec or exit.
> >> In fact, I would classify this as bug, but not a serious one.

> > There are some ugly ways this parallel execution is depended on. If the
> > vforked child calls sigaction() while another thread is also in
> > sigaction() for that signal, the vforked child needs to wait for the
> > other thread to release the lock.

> > This uses a per-process lock to synchronize threads in different
> > processes, which may not work properly.

> > If the vforked child is killed (such as by SIGKILL) while holding the
> > lock, the parent is not killed but its _thr_sigact is damaged.

> > These problems could be avoided in libthr by skipping the lock in
> > _sigaction() if a signal action is being set to SIG_DFL or SIG_IGN and
> > the old action is not queried. In those cases, _thr_sigact is not
> > touched so no lock is required. This change also helps applications,
> > provided they call sigaction() and not signal().

> > Alternatively, posix_spawn() and system() could use the sigaction system
> > call directly, bypassing libthr (if present). However, this will not
> > help applications that call vfork() and sigaction() themselves (such as
> > a shell that wants to implement ...&  using vfork()).

> > posix_spawn() likely still needs some adjustment so that having it reset
> > all signals (sigfillset()) to the default action will not cause it to
> > [EINVAL] because libthr does not allow changing SIGTHR's disposition.

> > Index: lib/libthr/thread/thr_sig.c
> > ===
> > --- lib/libthr/thread/thr_sig.c (revision 238970)
> > +++ lib/libthr/thread/thr_sig.c (working copy)
> > @@ -519,8 +519,16 @@
> > return (-1);
> > }
> >
> > -   if (act)
> > +   if (act) {
> > newact = *act;
> > +   /*
> > +* Short-circuit cases where we do not touch _thr_sigact.
> > +* This allows performing these safely in a vforked child.
> > +*/
> > +   if ((newact.sa_handler == SIG_DFL ||
> > +   newact.sa_handler == SIG_IGN)&&  oact == NULL)
> > +   return (__sys_sigaction(sig,&newact, NULL));
> > +   }
> >
> > __sys_sigprocmask(SIG_SETMASK,&_thr_maskset,&oldset);
> > _thr_rwl_wrlock(&_thr_sigact[sig-1].lock);
> >

> Your patch is better than nothing, I don't object.
> The problem is visble to you, but there is also invisible user - rtld.
> If a symbol is never used in parent proce

Re: system() using vfork() or posix_spawn() and libthr

2012-08-14 Thread Jilles Tjoelker
On Tue, Aug 14, 2012 at 11:15:06PM +0800, David Xu wrote:
> But in real word, pthread atfork handlers are not async-signal safe,
> they mostly do mutex locking and unlocking to keep consistent state,
> mutex is not async-signal safe.
> The malloc prefork and postfork handlers happen to work because I have
> some hacking code in library for malloc locks. Otherwise, you even
> can not use fork() in signal handler.

This problem was also reported to the Austin Group at
http://austingroupbugs.net/view.php?id=62

Atfork handlers are inherently async-signal-unsafe.

An interpretation was issued suggesting to remove fork() from the list
of async-signal-safe functions and add a new async-signal-safe function
_Fork() which does not call the atfork handlers.

This change will however not be in POSIX.1-2008 TC1 but only in the next
issue (SUSv5).

A slightly earlier report http://austingroupbugs.net/view.php?id=18 just
requested the _Fork() function because an existing application
deadlocked when calling fork() from a signal handler.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: system() using vfork() or posix_spawn() and libthr

2012-08-16 Thread Jilles Tjoelker
On Thu, Aug 16, 2012 at 02:44:26PM +0300, Konstantin Belousov wrote:
> My point is that the fact that fork() is called from dynamic context
> that was identified as the signal handler does not mean anything.
> It can be mis-identified for many reasons, which both me and you
> tried to partially enumerate above.

> The really important thing is the atomicity of the current context,
> e.g. synchronous SIGSEGV caused by a language runtime GC is very
> much safe place to call atfork handlers, since runtimes usually cause
> signal generations only at the safe place.

> I do not think that such approach as you described can be completed,
> the _Fork() seems the only robust way.

Agreed, that way (detecting signal handler) lies madness.

If need be, _Fork() is easily implemented and used.

> BTW, returning to Jilles proposal, can we call vfork() only for
> single-threaded parent ? I think it gives good boost for single-threaded
> case, and also eliminates the concerns of non-safety.

This would probably fix the safety issues but at a price. There is a
correlation between processes so large that they benefit greatly from
vfork and threaded processes.

On the other hand, I think direct calls to vfork() in applications are
risky and it may not be possible to support them safely in all
circumstances. However, if libc is calling vfork() such as via popen(),
system() or posix_spawn(), it should be possible even in a
multi-threaded process. In that case, the rtld and libthr problems can
be avoided by using aliases with hidden visibility for all functions the
vforked child needs to call (or any other method that prevents
interposition and hard-codes a displacement into libc.so).

There may still be a problem in programs that install signal handlers
because the set of async-signal-safe functions is larger than what can
be done in a vforked child. Userland can avoid this by masking affected
signals before calling vfork() and resetting them to SIG_DFL before
unmasking them. This will add many syscalls if the code does not know
which signals are affected (such as libc). Alternatively, the kernel
could map caught signals to the default action for processes with
P_PPWAIT (just like it does not stop such processes because of signals
or TTY job control).

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: system() using vfork() or posix_spawn() and libthr

2012-08-17 Thread Jilles Tjoelker
On Fri, Aug 17, 2012 at 03:43:12PM +0300, Konstantin Belousov wrote:
> On Fri, Aug 17, 2012 at 12:39:33AM +0200, Jilles Tjoelker wrote:
> > On Thu, Aug 16, 2012 at 02:44:26PM +0300, Konstantin Belousov wrote:
> > > BTW, returning to Jilles proposal, can we call vfork() only for
> > > single-threaded parent ? I think it gives good boost for single-threaded
> > > case, and also eliminates the concerns of non-safety.

> > This would probably fix the safety issues but at a price. There is a
> > correlation between processes so large that they benefit greatly from
> > vfork and threaded processes.
> Ok, so I will continue with my patch.

> > On the other hand, I think direct calls to vfork() in applications are
> > risky and it may not be possible to support them safely in all
> > circumstances. However, if libc is calling vfork() such as via popen(),
> > system() or posix_spawn(), it should be possible even in a
> > multi-threaded process. In that case, the rtld and libthr problems can
> > be avoided by using aliases with hidden visibility for all functions the
> > vforked child needs to call (or any other method that prevents
> > interposition and hard-codes a displacement into libc.so).

> I do not see how using any aliases could help there. Basically, if mt
> process is not single-threaded for vfork, you can have both some parent
> thread and child enter rtld. This is complete mess.

If libc calls a function with hidden visibility, this proceeds directly
(not via the PLT) and rtld is not involved.

Several ways to do this are in section 2.2.7 Avoid Using Exported
Symbols of Ulrich Drepper's dsohowto. One of them is something like

extern __typeof(next) next_int
__attribute((alias("next"), visibility("hidden")));

in the same source as the definition of the function

int next(void) { ...; }

As Drepper notes, the visibility attribute is not strictly required if a
version script keeps the symbol local but it might lead to better code.
At least on i386, though, the optimal direct near call instruction is
generated even without it. For example, _nsdispatch() calls
libc_dlopen() (kept local by the version script) without going through
the PLT (from the output of objdump -dS on the libc.so.7 in /usr/obj).

In the assembler syscall stubs using code from lib/libc/arch/SYS.h this
can be done by adding another .set (we currently have foo, _foo and
__sys_foo for a syscall foo; some syscalls have only _foo and
__sys_foo) such as __syshidden_foo. The ones that are actually used
then need prototypes (similar to the __sys_* ones in lib/libthr/?).

For some reason, the symbol .cerror (HIDENAME(cerror)) is also exported.
Either this should be kept local or a local uninterposable alias should
be added and used (as with the syscalls).

The function __error() (to get errno's address for the current thread)
is and must be called via the PLT (because libthr is separate).
Therefore, we should ensure it is called at least once before vfork so
calls in the child do not involve rtld. The implementations for the
various architectures use the TLS register (or memory location for ARM),
so they seem safe.

This should suffice to fix posix_spawn() but the _execvpe() used by
posix_spawnp also uses various string functions. If not all of these
have already been called earlier, this will not work. Making calls to
them not go through the PLT seems fairly hard, even though it would make
sense in general, so perhaps I should instead reimplement it such that
the parent does almost all of the work.

An alternative is to write the core of posix_spawn() in assembler using
system calls directly but I would rather avoid that :(

> > There may still be a problem in programs that install signal handlers
> > because the set of async-signal-safe functions is larger than what can
> > be done in a vforked child. Userland can avoid this by masking affected
> > signals before calling vfork() and resetting them to SIG_DFL before
> > unmasking them. This will add many syscalls if the code does not know
> > which signals are affected (such as libc). Alternatively, the kernel
> > could map caught signals to the default action for processes with
> > P_PPWAIT (just like it does not stop such processes because of signals
> > or TTY job control).

> If rtld does not work, then any library function call from a signal handler
> is problematic. My goal is to get a working rtld and possibly malloc.

> BTW, not quite related, it seems that the placement of openat,
> setcontext and swapcontext in the pthread.map is wrong. openat is
> at FBSD_1.1 in libc, and *context are at FBSD_1.0 version, while
> libthr exports them at 1.2. App or library gets linked to arbitrary
> version depending on whether libphread was specified at link time, and
> i

[patch] libc: Do not export .cerror

2012-08-24 Thread Jilles Tjoelker
E
-   jmp PIC_PLT(HIDENAME(cerror))
 END(__sys_getcontext)
 
.section .note.GNU-stack,"",%progbits
Index: lib/libc/i386/sys/cerror.S
===
--- lib/libc/i386/sys/cerror.S  (revision 239195)
+++ lib/libc/i386/sys/cerror.S  (working copy)
@@ -48,13 +48,14 @@
.globl  CNAME(__error)
.type   CNAME(__error),@function
 HIDENAME(cerror):
+#ifdef PIC
+   PIC_PROLOGUE
pushl   %eax
-#ifdef PIC
-   /* The caller must execute the PIC prologue before jumping to cerror. */
callPIC_PLT(CNAME(__error))
popl%ecx
PIC_EPILOGUE
 #else
+   pushl   %eax
callCNAME(__error)
popl%ecx
 #endif
Index: lib/libc/i386/sys/sbrk.S
===
--- lib/libc/i386/sys/sbrk.S(revision 239195)
+++ lib/libc/i386/sys/sbrk.S(working copy)
@@ -59,7 +59,7 @@
addl%eax,4(%esp)
mov $SYS_break,%eax
KERNCALL
-   jb  err
+   jb  HIDENAME(cerror)
PIC_PROLOGUE
movlPIC_GOT(HIDENAME(curbrk)),%edx
movl(%edx),%eax
@@ -67,9 +67,6 @@
PIC_EPILOGUE
 back:
ret
-err:
-   PIC_PROLOGUE
-   jmp PIC_PLT(HIDENAME(cerror))
 
 #else /* !PIC */
 
@@ -80,13 +77,11 @@
addl%eax,4(%esp)
mov $SYS_break,%eax
KERNCALL
-   jb  err
+   jb  HIDENAME(cerror)
movlHIDENAME(curbrk),%eax
addl%ecx,HIDENAME(curbrk)
 back:
ret
-err:
-   jmp HIDENAME(cerror)
 #endif /* PIC */
 END(sbrk)
 
Index: lib/libc/i386/sys/Ovfork.S
===
--- lib/libc/i386/sys/Ovfork.S  (revision 239195)
+++ lib/libc/i386/sys/Ovfork.S  (working copy)
@@ -50,8 +50,7 @@
jmp *%ecx
 1:
pushl   %ecx
-   PIC_PROLOGUE
-   jmp PIC_PLT(HIDENAME(cerror))
+   jmp HIDENAME(cerror)
 END(__sys_vfork)
 
.section .note.GNU-stack,"",%progbits
Index: lib/libc/i386/sys/ptrace.S
===
--- lib/libc/i386/sys/ptrace.S  (revision 239195)
+++ lib/libc/i386/sys/ptrace.S  (working copy)
@@ -50,11 +50,8 @@
 #endif
mov $SYS_ptrace,%eax
KERNCALL
-   jb  err
+   jb  HIDENAME(cerror)
ret
-err:
-   PIC_PROLOGUE
-   jmp PIC_PLT(HIDENAME(cerror))
 END(ptrace)
 
.section .note.GNU-stack,"",%progbits
Index: lib/libc/i386/sys/exect.S
===
--- lib/libc/i386/sys/exect.S   (revision 239195)
+++ lib/libc/i386/sys/exect.S   (working copy)
@@ -47,8 +47,7 @@
pushl   %edx
popf
KERNCALL
-   PIC_PROLOGUE
-   jmp PIC_PLT(HIDENAME(cerror))   /* exect(file, argv, env); */
+   jmp HIDENAME(cerror)/* exect(file, argv, env); */
 END(exect)
 
.section .note.GNU-stack,"",%progbits
Index: lib/libc/i386/sys/syscall.S
===
--- lib/libc/i386/sys/syscall.S (revision 239195)
+++ lib/libc/i386/sys/syscall.S (working copy)
@@ -45,11 +45,8 @@
KERNCALL
push%ecx/* need to push a word to keep stack frame intact
   upon return; the word must be the return address. */
-   jb  1f
+   jb  HIDENAME(cerror)
ret
-1:
-   PIC_PROLOGUE
-   jmp PIC_PLT(HIDENAME(cerror))
 END(syscall)
 
.section .note.GNU-stack,"",%progbits
-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: system() using vfork() or posix_spawn() and libthr

2012-08-26 Thread Jilles Tjoelker
On Sun, Aug 19, 2012 at 03:26:36PM +0300, Konstantin Belousov wrote:
> On Fri, Aug 17, 2012 at 06:08:47PM +0200, Jilles Tjoelker wrote:
> > On Fri, Aug 17, 2012 at 03:43:12PM +0300, Konstantin Belousov wrote:
> > > On Fri, Aug 17, 2012 at 12:39:33AM +0200, Jilles Tjoelker wrote:
> > > > On Thu, Aug 16, 2012 at 02:44:26PM +0300, Konstantin Belousov wrote:
> > > > > BTW, returning to Jilles proposal, can we call vfork() only for
> > > > > single-threaded parent ? I think it gives good boost for 
> > > > > single-threaded
> > > > > case, and also eliminates the concerns of non-safety.

> > > > This would probably fix the safety issues but at a price. There is a
> > > > correlation between processes so large that they benefit greatly from
> > > > vfork and threaded processes.
> > > Ok, so I will continue with my patch.

> > > > On the other hand, I think direct calls to vfork() in applications are
> > > > risky and it may not be possible to support them safely in all
> > > > circumstances. However, if libc is calling vfork() such as via popen(),
> > > > system() or posix_spawn(), it should be possible even in a
> > > > multi-threaded process. In that case, the rtld and libthr problems can
> > > > be avoided by using aliases with hidden visibility for all functions the
> > > > vforked child needs to call (or any other method that prevents
> > > > interposition and hard-codes a displacement into libc.so).

> > > I do not see how using any aliases could help there. Basically, if mt
> > > process is not single-threaded for vfork, you can have both some parent
> > > thread and child enter rtld. This is complete mess.

> > If libc calls a function with hidden visibility, this proceeds directly
> > (not via the PLT) and rtld is not involved.
> Oh, so you are only concerned with libc there ?

> If spending so much efforts on this issue, IMO it is pity to only fix
> libc and not fix vfork for all consumers.

True. If vfork can be fixed more generally without too much penalty, it
is better to do that. And it looks like that is possible, including
detecting unexpected kills, see below.

> > Several ways to do this are in section 2.2.7 Avoid Using Exported
> > Symbols of Ulrich Drepper's dsohowto. One of them is something like

> > extern __typeof(next) next_int
> > __attribute((alias("next"), visibility("hidden")));

> > in the same source as the definition of the function

> > int next(void) { ...; }

> > As Drepper notes, the visibility attribute is not strictly required if a
> > version script keeps the symbol local but it might lead to better code.
> > At least on i386, though, the optimal direct near call instruction is
> > generated even without it. For example, _nsdispatch() calls
> > libc_dlopen() (kept local by the version script) without going through
> > the PLT (from the output of objdump -dS on the libc.so.7 in /usr/obj).

> I am not confident, so this might be a hallucination, but 'optimization'
> in the recent gnu ld might rewrite  the call target to avoid PLT when
> symbol visibility is hidden.

If symbol visibility is hidden, rtld does not know about the symbol's
name; therefore, the only way ld can make it work is to make the call
instruction point directly at the called function, not to a PLT entry.

Different from what I wrote earlier, on i386, there is still a penalty
in omitting the visibility attribute: the compiler will generate code to
load %ebx which may not be necessary.

> > In the assembler syscall stubs using code from lib/libc/arch/SYS.h this
> > can be done by adding another .set (we currently have foo, _foo and
> > __sys_foo for a syscall foo; some syscalls have only _foo and
> > __sys_foo) such as __syshidden_foo. The ones that are actually used
> > then need prototypes (similar to the __sys_* ones in lib/libthr/?).

> > For some reason, the symbol .cerror (HIDENAME(cerror)) is also exported.
> > Either this should be kept local or a local uninterposable alias should
> > be added and used (as with the syscalls).

I sent a patch for .cerror (i386 only) in a new mailing list thread.

> > The function __error() (to get errno's address for the current thread)
> > is and must be called via the PLT (because libthr is separate).
> > Therefore, we should ensure it is called at least once before vfork so
> > calls in the child do not involve rtld. The implementations for the
> > various architectures use the TLS register (or memory location for ARM),
> > so they seem safe.

> > This should suffice to f

Re: [patch] libc: Do not export .cerror

2012-08-31 Thread Jilles Tjoelker
On Tue, Aug 28, 2012 at 02:03:22PM +0300, Konstantin Belousov wrote:
> On Sat, Aug 25, 2012 at 12:16:55AM +0200, Jilles Tjoelker wrote:
> > Not exporting .cerror causes it to be jumped to directly instead of via
> > the PLT.

> > The below patch is for i386 only and also takes advantage of .cerror's
> > new status by not saving and loading %ebx before jumping to it.
> > (Therefore, .cerror now saves and loads %ebx itself.) Where there was a
> > conditional jump to a jump to .cerror, the conditional jump has been
> > changed to jump to .cerror directly (many modern CPUs don't do static
> > prediction and in any case it is not much of a benefit anyway).

> Why do you need to save/restore the %ebx at all ? %ebx ==
> &__GLOBAL_OFFSET_TABLE__ is only needed when you access GOT, but .cerror
> only works with PLT, which is addressed using the instruction capable of
> relative addressing. The old .cerror does not need it as well, but it is
> just engraved in the function ABI.

On i386, a shared object's PLT entry needs %ebx set up to work properly.
This is because such a PLT entry needs to access the GOT to find the
address to jump to (the first instruction is jmp *d32(%ebx)).

An executable's PLT entry accesses the GOT via absolute addressing and
therefore does not need %ebx.

> > The patch decreases the size of libc.so.7 by a few kilobytes.

> > Similar changes could be made to other architectures, and there may be
> > more symbols that are exported but need not be.
> Sure, would you handle at least amd64 too ?

The below patch handles amd64.

I'm a bit annoyed that most of the syscall stubs are 17 bytes long now
and have the maximum 15 bytes of padding. This means that the patch
provides virtually no gain in code size.

Index: lib/libc/amd64/Symbol.map
===
--- lib/libc/amd64/Symbol.map   (revision 239865)
+++ lib/libc/amd64/Symbol.map   (working copy)
@@ -66,7 +66,6 @@
.curbrk;
.minbrk;
_brk;
-   .cerror;
_end;
__sys_vfork;
_vfork;
Index: lib/libc/amd64/SYS.h
===
--- lib/libc/amd64/SYS.h(revision 239865)
+++ lib/libc/amd64/SYS.h(working copy)
@@ -36,38 +36,20 @@
 #include 
 #include 
 
-#ifdef PIC
 #defineRSYSCALL(x) ENTRY(__CONCAT(__sys_,x));  
\
.weak CNAME(x); \
.set CNAME(x),CNAME(__CONCAT(__sys_,x));\
.weak CNAME(__CONCAT(_,x)); \
.set CNAME(__CONCAT(_,x)),CNAME(__CONCAT(__sys_,x)); \
-   mov __CONCAT($SYS_,x),%rax; KERNCALL; jb 2f; ret; \
-   2: movq PIC_GOT(HIDENAME(cerror)),%rcx; jmp *%rcx; \
+   mov __CONCAT($SYS_,x),%eax; KERNCALL;   \
+   jb HIDENAME(cerror); ret;   \
END(__CONCAT(__sys_,x))
 
 #definePSEUDO(x)   ENTRY(__CONCAT(__sys_,x));  
\
.weak CNAME(__CONCAT(_,x)); \
.set CNAME(__CONCAT(_,x)),CNAME(__CONCAT(__sys_,x)); \
-   mov __CONCAT($SYS_,x),%rax; KERNCALL; jb 2f; ret ; \
-   2: movq PIC_GOT(HIDENAME(cerror)),%rcx; jmp *%rcx; \
+   mov __CONCAT($SYS_,x),%eax; KERNCALL;   \
+   jb HIDENAME(cerror); ret;   \
END(__CONCAT(__sys_,x))
-#else
-#defineRSYSCALL(x) ENTRY(__CONCAT(__sys_,x));  
\
-   .weak CNAME(x); \
-   .set CNAME(x),CNAME(__CONCAT(__sys_,x));\
-   .weak CNAME(__CONCAT(_,x)); \
-   .set CNAME(__CONCAT(_,x)),CNAME(__CONCAT(__sys_,x)); \
-   mov __CONCAT($SYS_,x),%rax; KERNCALL; jb 2f; ret; \
-   2: jmp HIDENAME(cerror);\
-   END(__CONCAT(__sys_,x))
 
-#definePSEUDO(x)   ENTRY(__CONCAT(__sys_,x));  
\
-   .weak CNAME(__CONCAT(_,x)); \
-   .set CNAME(__CONCAT(_,x)),CNAME(__CONCAT(__sys_,x)); \
-   mov __CONCAT($SYS_,x),%rax; KERNCALL; jb 2f; ret; \
-   2: jmp HIDENAME(cerror);\
-   END(__CONCAT(__sys_,x))
-#endif
-
 #define KERNCALL   movq %rcx, %r10; syscall
Index: lib/libc/amd64/gen/rfork_thread.S
===
--- lib/libc/amd64/gen/rfork_thread.S   

Re: Change vfork() to posix_spawn()?

2012-09-14 Thread Jilles Tjoelker
On Fri, Sep 14, 2012 at 01:45:49PM +0200, Erik Cederstrand wrote:
> Den 14/09/2012 kl. 13.03 skrev Ivan Voras :
> > On 14/09/2012 09:49, Erik Cederstrand wrote:

> >> I'm looking through the Clang Analyzer scans on
> >> http://scan.freebsd.your.org/freebsd-head looking for false
> >> positives to report back to LLVM. There are quite a list of reports
> >> suggesting to change vfork() calls to posix_spawn(). Example from
> >> /bin/rpc:
> >> http://scan.freebsd.your.org/freebsd-head/bin.rcp/2012-09-12-amd64/report-nsOV80.html#EndPath

> >> I know nothing about this but I can see fork and posix_spawn have
> >> been discussed on this list previously. Is this a legitimate
> >> warning (in this case and in general in FreeBSD base)?

> > Currently (on 9-stable at least), posix_spawn() is implemented as a
> > wrapper around vfork(), so I doubt replacing one with the other
> > would do much.

vfork() returns twice, possibly reanimating variables from the dead.
Calling posix_spawn() limits this issue to the posix_spawn()
implementation only.

For example, in case of unwilling compiler developers, the optimization
level for that file might be lowered or more volatile keywords might be
added. I think it makes more sense to disable various optimizations in
the compiler automatically in functions that call vfork(), longjmp() and
similar functions, but I do not decide what compiler developers do.

> The analyzer added this warning in January. The release notes link to
> this explanation:
> https://www.securecoding.cert.org/confluence/display/seccode/POS33-C.+Do+not+use+vfork()

> I guess this is the important part:

> "Because of the implementation of the vfork() function, the parent
> process is suspended while the child process executes. If a user sends
> a signal to the child process, delaying its execution, the parent
> process (which is privileged) is also blocked. This means that an
> unprivileged process can cause a privileged process to halt, which is
> a privilege inversion resulting in a denial of service."

This problem only occurs if privileges are dropped between vfork and
exec, which is uncommon. If no privileges are dropped, the user can
affect the parent directly.

Furthermore, this exact problem does not happen in FreeBSD because child
processes between vfork and exec/exit are not affected by stop signals
(this is stronger than the vfork(2) man page documents).

However, related issues are still present. If there is a signal handler
that blocks for a long time (many functions which do this are
async-signal-safe) for a signal permitted by
security.bsd.conservative_signals, an unprivileged user will be able to
trigger it and delay the thread calling vfork(). A function may also be
async-signal-safe but not suitable for a vforked child (for example,
libthr makes many functions async-signal-safe by postponing signal
handlers which is not good enough if a vforked child is SIGKILL'ed).

An unprivileged user may also trigger priority inversion by lowering the
priority of the child process and consuming CPU time at a higher
priority.

Obviously, the child process should not lower its priority voluntarily
either.

These problems can be fixed in various ways. The direct priority
inversion problem can be fixed by using fork() instead in that case or
by adding a priority inheritance scheme in the kernel for vforked
children (but only for the static priority; the parent's dynamic
priority will increase because it is sleeping).

The privilege manipulation available via POSIX_SPAWN_RESETIDS seems safe
enough. Since it only affects the effective UID/GID, it does not affect
the ability to modify scheduling parameters (real UID) or to send
signals (real or saved UID). Since the seteuid() call itself will set
the issetugid flag to true if it changed anything, it does not affect
the ability to debug before exec.

More general privilege dropping frequently involves frameworks such as
PAM which are not async-signal-safe and certainly not vfork-safe.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: -lpthread vs -pthread: does -D_REENTRANT matter?

2012-10-14 Thread Jilles Tjoelker
On Mon, Oct 08, 2012 at 12:17:08PM -0400, Eitan Adler wrote:
> The only difference between -lpthread and -pthread that I could see is
> that the latter also sets -D_REENTRANT.
> However, I can't find any uses of _REENTRANT anywhere outside of a few
> utilities that seem to define it manually.

> Testing with various manually written pthread programs resulted in
> identical binaries, let alone identical results.

> Is there an actual difference between -pthread and -lpthread or is
> this just a historical artifact?

In some cases, -pthread also affects the compiler's code generation. On
some RISC architectures, compilers may try to avoid loads and stores of
less than 32 bits.

For example (untested):
  struct { int n; char a, b, c, d; } *p;
  p->a = p->b = p->c = 0;

The compiler might load p->d and then store the 32 bits containing a, b,
c and d at once. This causes a race condition if p->d is written
concurrently.

Because C99 does not specify threading, it allows these transformations.
In C11, they are forbidden. Passing -pthread disables them as well.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: -lpthread vs -pthread: does -D_REENTRANT matter?

2012-10-21 Thread Jilles Tjoelker
On Sun, Oct 14, 2012 at 04:55:07PM -0400, Eitan Adler wrote:
> On 14 October 2012 10:42, Jilles Tjoelker  wrote:

> > Because C99 does not specify threading, it allows these transformations.
> > In C11, they are forbidden. Passing -pthread disables them as well.

> Is the man page wrong or do I misunderstand?

>This option sets flags for both the preprocessor and linker.  It
>does not affect the thread safety of object code produced by the
>compiler or that of libraries supplied with it.

OK, that would explain why I could not find such things.

I seem to recall GCC must sometimes be instructed not to do
thread-unsafe transformations but I cannot find how to do so.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


[patch] rtld: fix fd leak with parallel dlopen and fork/exec

2012-11-04 Thread Jilles Tjoelker
, NULL, NULL,
(char *[]){ "sh", "-c", "procstat -f $$", NULL },
environ);
if (error != 0)
errc(1, error, "posix_spawnp");
do
wpid = waitpid(pid, &status, 0);
while (wpid == -1 && errno == EINTR);
if (wpid == -1)
err(1, "waitpid");
if (status != 0)
errx(1, "sh -c failed");
}
}

int
main(int argc, char *argv[])
{
pthread_t td;
int error;

if (argc != 2)
{
fprintf(stderr, "Usage: %s dso\n", argv[0]);
return 1;
}

error = pthread_create(&td, NULL, dlopen_thread, argv[1]);
if (error != 0)
errc(1, error, "pthread_create");

filestat_loop();

return 0;
}

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Incorrect use of posix_memalign() (was: Re: svn commit: r243405 - in stable/9: include lib/libc/stdlib)

2012-11-25 Thread Jilles Tjoelker
On Thu, Nov 22, 2012 at 03:19:53PM +, Ed Schouten wrote:
> Author: ed
> Date: Thu Nov 22 15:19:53 2012
> New Revision: 243405
> URL: http://svnweb.freebsd.org/changeset/base/243405

> Log:
>   MFC r229848:

> Add aligned_alloc(3).

> The C11 folks reinvented the wheel by introducing an aligned version of
> malloc(3) called aligned_alloc(3), instead of posix_memalign(3). Instead
> of returning the allocation by reference, it returns the address, just
> like malloc(3).

>   I'm MFCing this now, as it seems aligned_alloc(3) is needed to make the
>   new version of libc++ work, which was merged back to FreeBSD 9 in r243376.

The C11 committee knew about posix_memalign() and had several reasons
for creating a new function; see for example
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1397.htm .

In particular, posix_memalign() is a little annoying to use correctly,
often requiring a temporary variable of type void *. It is tempting to
do something like
  error = posix_memalign((void **)&some_ptr, aln, sz);
and some FreeBSD code does this, but it violates strict-aliasing. A
further mostly theoretical objection is that assumes that the
representation of some_ptr and void * are compatible which C does not
guarantee.

The problem can be fixed by adding the temporary pointer variable like
  void *tmp_ptr;
  error = posix_memalign(&tmp_ptr, aln, sz);
  some_ptr = tmp_ptr;
or by using aligned_alloc() instead of posix_memalign()
  some_ptr = aligned_alloc(aln, sz);
with error checking against some_ptr instead of error.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Fix overlinking in base aka import pkgconf

2012-12-16 Thread Jilles Tjoelker
On Sun, Dec 16, 2012 at 04:03:40PM +0200, Konstantin Belousov wrote:
> On Sun, Dec 16, 2012 at 02:01:00PM +0100, Jeremie Le Hen wrote:
> > I think this could be solved by an implicit linker script contained in
> > .so and .a files, pointing to the real libraries.

> > We have already the SHLIB_LDSCRIPT variable to accomplish this for .so
> > files.  It may be possible to do the same for .a files, though this
> > would require renaming the real archives to something like .a.0 (here
> > I'm just taking a similar scheme to the one used for DSO).

> > As a matter of fact, libprocstat.a would contain:

> >   GROUP ( /usr/lib/libprocstat.a.0 /usr/lib/libkvm.a /usr/lib/libutil.a )

> > Note that libkvm.a and libutil.a could be linker scripts as well.

> > Kib, do you see any problem to this proposition?

> Wouldn't you need to completely rewrite the handling of the .a files
> in the share/mk ? I somewhat dislike the mere thought that .a is not
> an archive any longer.

> Does it make sense from the overhead and complexity POV, for such small
> goal ?

For base this may not make much sense since there is not much need to
run an old base binary against newer base libraries. Apart from that the
only advantage is some form of philosophical correctness.

For ports this may be a good way to keep static linking working after
fixing overlinking caused by libtool's dependency_libs in .la files.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: use after free in grep?

2012-12-20 Thread Jilles Tjoelker
On Thu, Dec 20, 2012 at 01:19:07PM +0100, Dimitry Andric wrote:
> On 2012-12-20 08:13, Eitan Adler wrote:
> > in xrealloc_impl

> > 338   new_ptr = realloc(ptr, new_size);
> > 339   if (new_ptr != NULL)
> > 340 {
> > 341   hash_table_del(xmalloc_table, ptr);

> > ^^^ isn't this a use-after-free of ptr?

> Yes, realloc does not guarantee the realloc'd space will be at the same
> address, so it may free ptr at its discretion.

Even if you somehow know realloc() is not going to move the block, it is
still wrong to use any pointer not derived from its return value to
access the block. Comparing the old and the new pointers (normally or
with memcmp()) does not help; it has an indeterminate result.

See http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm

> Also, there is a memory leak if realloc() returns NULL.  This is a
> very usual mistake when using realloc(). :-)

No, this would be correct if a successful realloc() call did not make
the old pointer indeterminate. The hash table remains unchanged if
realloc() fails.

> Probably, the code should do the hash_table_del() before the realloc(),
> but I am not sure if hash_table_del() will already free ptr.

Yes, and add it back if realloc() fails.

A smarter internal interface to the hash table would avoid freeing and
reallocating hash table entries here (which might fail).

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Request for review, time_pps_fetch() enhancement

2013-02-09 Thread Jilles Tjoelker
On Wed, Feb 06, 2013 at 05:58:30PM +0200, Konstantin Belousov wrote:
> On Tue, Feb 05, 2013 at 09:41:38PM -0700, Ian Lepore wrote:
> > I'd like feedback on the attached patch, which adds support to our
> > time_pps_fetch() implementation for the blocking behaviors described in
> > section 3.4.3 of RFC 2783.  The existing implementation can only return
> > the most recently captured data without blocking.  These changes add the
> > ability to block (forever or with timeout) until a new event occurs.

> > Index: sys/kern/kern_tc.c
> > ===
> > --- sys/kern/kern_tc.c  (revision 246337)
> > +++ sys/kern/kern_tc.c  (working copy)
> > @@ -1446,6 +1446,50 @@
> >   * RFC 2783 PPS-API implementation.
> >   */
> >  
> > +static int
> > +pps_fetch(struct pps_fetch_args *fapi, struct pps_state *pps)
> > +{
> > [snip]
> > +   aseq = pps->ppsinfo.assert_sequence;
> > +   cseq = pps->ppsinfo.clear_sequence;
> > +   while (aseq == pps->ppsinfo.assert_sequence &&
> > +   cseq == pps->ppsinfo.clear_sequence) {
> Note that compilers are allowed to optimize these accesses even over
> the sequential point, which is the tsleep() call. Only accesses to
> volatile objects are forbidden to be rearranged.

> I suggest to add volatile casts to pps in the loop condition.

The memory pointed to by pps is global (other code may have a pointer to
it); therefore, the compiler must assume that the tsleep() call (which
invokes code in a different compilation unit) may modify it.

Because volatile does not make concurrent access by multiple threads
defined either, adding it here only seems to slow down the code
(potentially).

> > +   err = tsleep(pps, PCATCH, "ppsfch", timo);
> > +   if (err == EWOULDBLOCK && fapi->timeout.tv_sec == -1) {
> > +   continue;
> > +   } else if (err != 0) {
> > +   return (err);
> > +   }
> > +   }
> > +   }
-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


[patch] statfs does not detect -t nullfs -o union as a union mount

2013-02-27 Thread Jilles Tjoelker
While testing recent changes to opendir(), I noticed that fstatfs() does
not return the MNT_UNION flag for a -t nullfs -o union mount. As a
result, opendir()/readdir() return files that exist in both top and
bottom directories twice (at least . and ..). Other -o union mounts and
-t unionfs mounts work correctly in this regard.

The below patch passes through just the MNT_UNION flag of the nullfs
mount itself. Perhaps more flags should be passed through.

commit fce32a779af4eb977c9b96feb6e4f811d89f2881
Author: Jilles Tjoelker 
Date:   Sat Feb 23 22:22:39 2013 +0100

nullfs: Preserve the MNT_UNION flag of the nullfs mount itself.

This is needed so that opendir() can properly detect a union mount like
mount -t nullfs -o union dir1 dir2.

diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c
index 3724e0a..ff06f57 100644
--- a/sys/fs/nullfs/null_vfsops.c
+++ b/sys/fs/nullfs/null_vfsops.c
@@ -313,7 +313,7 @@ nullfs_statfs(mp, sbp)
 
/* now copy across the "interesting" information and fake the rest */
sbp->f_type = mstat.f_type;
-   sbp->f_flags = mstat.f_flags;
+   sbp->f_flags = (sbp->f_flags & MNT_UNION) | mstat.f_flags;
sbp->f_bsize = mstat.f_bsize;
sbp->f_iosize = mstat.f_iosize;
sbp->f_blocks = mstat.f_blocks;

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: [patch] statfs does not detect -t nullfs -o union as a union mount

2013-03-01 Thread Jilles Tjoelker
On Thu, Feb 28, 2013 at 12:21:44AM +0200, Konstantin Belousov wrote:
> On Wed, Feb 27, 2013 at 10:31:42PM +0100, Jilles Tjoelker wrote:
> > While testing recent changes to opendir(), I noticed that fstatfs() does
> > not return the MNT_UNION flag for a -t nullfs -o union mount. As a
> > result, opendir()/readdir() return files that exist in both top and
> > bottom directories twice (at least . and ..). Other -o union mounts and
> > -t unionfs mounts work correctly in this regard.

> > The below patch passes through just the MNT_UNION flag of the nullfs
> > mount itself. Perhaps more flags should be passed through.

> > commit fce32a779af4eb977c9b96feb6e4f811d89f2881
> > Author: Jilles Tjoelker 
> > Date:   Sat Feb 23 22:22:39 2013 +0100
> > 
> > nullfs: Preserve the MNT_UNION flag of the nullfs mount itself.
> > 
> > This is needed so that opendir() can properly detect a union mount like
> > mount -t nullfs -o union dir1 dir2.
> > 
> > diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c
> > index 3724e0a..ff06f57 100644
> > --- a/sys/fs/nullfs/null_vfsops.c
> > +++ b/sys/fs/nullfs/null_vfsops.c
> > @@ -313,7 +313,7 @@ nullfs_statfs(mp, sbp)
> >  
> > /* now copy across the "interesting" information and fake the rest */
> > sbp->f_type = mstat.f_type;
> > -   sbp->f_flags = mstat.f_flags;
> > +   sbp->f_flags = (sbp->f_flags & MNT_UNION) | mstat.f_flags;
> > sbp->f_bsize = mstat.f_bsize;
> > sbp->f_iosize = mstat.f_iosize;
> > sbp->f_blocks = mstat.f_blocks;

> Would it make sense to preserve more flags from the upper mount ?
> I see a use for MNT_NOEXEC as well, at least.

Yes, preserving MNT_NOEXEC will make -t nullfs -o noexec work better, in
particular rtld's check for libraries loaded via environment variables.

In the same way MNT_RDONLY, MNT_NOSUID and MNT_NOSYMFOLLOW could be
preserved.

On the other hand, MNT_ROOTFS should probably not be passed through from
the underlying filesystem.

This would give
sbp->f_flags = (sbp->f_flags & (MNT_RDONLY | MNT_NOEXEC | MNT_NOSUID |
MNT_UNION | MNT_NOSYMFOLLOW) | (mstat.f_flags & ~MNT_ROOTFS);

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


[patch] SOCK_CLOEXEC, SOCK_NONBLOCK and MSG_CMSG_CLOEXEC

2013-03-17 Thread Jilles Tjoelker
f
-   error = falloc(td, &fp, &fd, 0);
+   error = falloc(td, &fp, &fd, oflag);
if (error)
return (error);
/* An extra reference on `fp' has been held for us by falloc(). */
-   error = socreate(uap->domain, &so, uap->type, uap->protocol,
+   error = socreate(uap->domain, &so, type, uap->protocol,
td->td_ucred, td);
if (error) {
fdclose(td->td_proc->p_fd, fp, fd, td);
} else {
-   finit(fp, FREAD | FWRITE, DTYPE_SOCKET, so, &socketops);
+   finit(fp, FREAD | FWRITE | fflag, DTYPE_SOCKET, so, &socketops);
+   if ((fflag & FNONBLOCK) != 0)
+   (void) fo_ioctl(fp, FIONBIO, &fflag, td->td_ucred, td);
td->td_retval[0] = fd;
}
fdrop(fp, td);
@@ -648,9 +663,20 @@
struct filedesc *fdp = td->td_proc->p_fd;
struct file *fp1, *fp2;
struct socket *so1, *so2;
-   int fd, error;
+   int fd, error, oflag, fflag;
 
AUDIT_ARG_SOCKET(domain, type, protocol);
+
+   oflag = 0;
+   fflag = 0;
+   if ((type & SOCK_CLOEXEC) != 0) {
+   type &= ~SOCK_CLOEXEC;
+   oflag |= O_CLOEXEC;
+   }
+   if ((type & SOCK_NONBLOCK) != 0) {
+   type &= ~SOCK_NONBLOCK;
+   fflag |= FNONBLOCK;
+   }
 #ifdef MAC
/* We might want to have a separate check for socket pairs. */
error = mac_socket_check_create(td->td_ucred, domain, type,
@@ -665,12 +691,12 @@
if (error)
goto free1;
/* On success extra reference to `fp1' and 'fp2' is set by falloc. */
-   error = falloc(td, &fp1, &fd, 0);
+   error = falloc(td, &fp1, &fd, oflag);
if (error)
goto free2;
rsv[0] = fd;
fp1->f_data = so1;  /* so1 already has ref count */
-   error = falloc(td, &fp2, &fd, 0);
+   error = falloc(td, &fp2, &fd, oflag);
if (error)
goto free3;
fp2->f_data = so2;  /* so2 already has ref count */
@@ -686,8 +712,14 @@
 if (error)
goto free4;
}
-   finit(fp1, FREAD | FWRITE, DTYPE_SOCKET, fp1->f_data, &socketops);
-   finit(fp2, FREAD | FWRITE, DTYPE_SOCKET, fp2->f_data, &socketops);
+   finit(fp1, FREAD | FWRITE | fflag, DTYPE_SOCKET, fp1->f_data,
+   &socketops);
+   finit(fp2, FREAD | FWRITE | fflag, DTYPE_SOCKET, fp2->f_data,
+   &socketops);
+   if ((fflag & FNONBLOCK) != 0) {
+   (void) fo_ioctl(fp1, FIONBIO, &fflag, td->td_ucred, td);
+   (void) fo_ioctl(fp2, FIONBIO, &fflag, td->td_ucred, td);
+   }
fdrop(fp1, td);
fdrop(fp2, td);
return (0);

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: [patch] SOCK_CLOEXEC, SOCK_NONBLOCK and MSG_CMSG_CLOEXEC

2013-03-19 Thread Jilles Tjoelker
On Mon, Mar 18, 2013 at 10:59:57PM +0200, Konstantin Belousov wrote:
> On Sun, Mar 17, 2013 at 10:23:53PM +0100, Jilles Tjoelker wrote:
> > Here are some more modifications to allow creating file descriptors with
> > close-on-exec set. Like in linux/glibc, SOCK_CLOEXEC and SOCK_NONBLOCK
> > can be OR'ed in socket() and socketpair()'s type parameter, and
> > MSG_CMSG_CLOEXEC to recvmsg() makes file descriptors (SCM_RIGHTS)
> > atomically close-on-exec.

> > The numerical values for SOCK_CLOEXEC and SOCK_NONBLOCK are as in
> > NetBSD. MSG_CMSG_CLOEXEC is the first free bit for MSG_*.

> > I do not pass the SOCK_* flags to MAC because this may cause incorrect
> > failures and can be done later via fcntl() anyway. I expect audit to
> > cope with the new flags.

> > For MSG_CMSG_CLOEXEC, I had to change unp_externalize to take a flags
> > argument.

> This looks fine to me.

Thanks for the review.

> The only note I have, which is not directly related to your patch,
> is the recvmsg(2) behaviour when the undefined flag is passed.
> The syscall silently ignores the flags. I think this is quite wrong,
> and would cause interesting (security) implications if the program
> using the MSG_CMSG_CLOEXEC is run on older kernel which does not
> interpret the flag.

This is indeed unfortunate and it also affects open(2).

It may have originally been done for reasons of modularity; the
sys_open() and kern_open() need not know about the valid flag bits this
way and a device driver or filesystem can define new bits.

The effect of an older kernel ignoring MSG_CMSG_CLOEXEC is probably not
that bad, considering that running new userland on an old kernel is not
supported. Most compatibility shims for MSG_CMSG_CLOEXEC and similar
flags (such as the one in Wayland) simply contain the race condition, so
you lose anyway with an old kernel. It would be possible to use
closefrom() or an emulation of it or to lock fork() conditionally on the
necessary CLOEXEC flags not being available but it appears that people
do not go to that trouble.

> Might be, we should start returning EINVAL for unknown flag, despite
> SUSv4 not specifying the condition ?

Per POSIX it is valid to do this but it might cause a compatibility
problem if there are applications that pass garbage in flags.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: close(2) while accept(2) is blocked

2013-03-28 Thread Jilles Tjoelker
On Thu, Mar 28, 2013 at 06:54:31PM +0200, Andriy Gapon wrote:

> So, this started as a simple question, but the answer was quite
> unexpected to me.

> Let's say we have an opened and listen-ed socket and let's assume that
> we know that one thread is blocked in accept(2) and another thread is
> calling close(2). What is going to happen?

> Turns out that practically nothing.  For kernel the close call would
> be almost a nop.
> My understanding is this:
> - when socket is created, its reference count is 1
> - when accept(2) is called, fget in kernel increments the reference
>   count (kept in an associated struct file)
> - when close(2) is called, the reference count is decremented

> The reference count is still greater than zero, so fdrop does not call
> fo_close. That means that in the case of a socket soclose is not
> called.

> I am sure that the reference counting in this case is absolutely
> correct with respect to managing kernel side structures.

I agree this is expected and correct from the kernel point of view.

> But I am not that it is correct with respect to hiding the explicit
> close(2) call from other threads that may be waiting on the socket. In
> other words, I am not sure if fo_close is supposed to signify that
> there are no uses of a file, or that userland close-d the file.  Or
> perhaps these should be two different methods.

It would be possible to keep track of the file descriptor number but I
think it is not worth the large amount of extra code.

Keeping track of file descriptor number would be necessary to interrupt
waits after a close or dup2 on the file descriptor that was passed to
the blocking call, even if the object remains open on a different file
descriptor number or in a different process.

Also, most people would use the new functionality incorrectly anyway. A
close() on a file descriptor another thread is using is risky since it
is in most cases impossible to prove that the other thread is in fact
blocked on the file descriptor and not preempted right before making the
system call. In the latter case, the other thread might accept a
connection from a different socket created later. A dup2() of /dev/null
onto the file descriptor would be safer.

> Additional note is that shutdown(2) doesn't wake up the thread in
> accept(2) either.  At least that's true for unix domain sockets.
> Not sure if this is a bug.

I think it is a bug. It works properly for IPv4 TCP sockets.

The resulting error is [ECONNABORTED] for a blocking socket which likely
leads to infinite looping if the thread does not know about the
shutdown(2) (because that error normally means the accept should be
retried later). For a non-blocking socket the error is [EWOULDBLOCK]
which also leads to infinite looping and is certainly wrong because
select/poll do report the socket as readable. Both of these are in
kern_accept() in sys/kern/uipc_syscalls.c.

POSIX does not say which error code we should return here but these two
are almost certainly wrong (it is usable for waking up threads stuck in
accept() if those threads check a variable after every accept() failure
and do not rely on the exact value of errno). Linux returns [EINVAL] for
both blocking and non-blocking sockets, probably from the POSIX error
condition "The socket is not accepting connections." In our man page
that error condition is formulated "listen(2) has not been called on the
socket descriptor." which is clearly not the case.

Also, I think a non-blocking accept() should immediately fail with the
head->so_error if it is set, rather than returning [EWOULDBLOCK] until
another connection arrives. Likewise, filt_solisten() in
sys/kern/uipc_socket.c only returns true if there is a connection, not
if there was an error or shutdown() has been called. On the other hand,
sopoll_generic() looks correct.

Error reporting on non-blocking accept() might usefully be postponed
until there is a connection or the socket has been shut down, to reduce
context switches.

> But the summary seems to be is that currently it is not possible to
> break a thread out of accept(2) (at least without resorting to
> signals).

Pthread cancellation works better than raw signals for this use case. In
a proper implementation such as in FreeBSD 9.0 or newer and used
properly, it allows avoiding the resource leak that may happen when
calling longjmp() or pthread_exit() in a signal handler just after
accept() has created a new socket.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: close(2) while accept(2) is blocked

2013-04-05 Thread Jilles Tjoelker
On Thu, Apr 04, 2013 at 08:43:17PM +0300, Andriy Gapon wrote:
> on 01/04/2013 18:22 John Baldwin said the following:
> > I think you need to split the 'struct file' reference count into two
> > different counts similar to the how we have vref/vrele vs
> > vhold/vdrop for vnodes.  The fget for accept and probably most other
> > system calls should probably be equivalent to vhold, whereas things
> > like open/dup (and storing an fd in a cmsg) should be more like
> > vref.  close() should then be a vrele().

> This model makes perfect sense.
> Unfortunately, ENOTIME to work on this.

This looks like it can work but I don't know whether it's worth the
trouble.

> Meanwhile I am using the following patch specific to local domain
> sockets, accept(2) and shutdown(2).  Turns out that the problematic
> application does both shutdown(RDWR) and close(2), but that doesn't
> help on FreeBSD.
> BTW, this is the application:
> http://thread.gmane.org/gmane.os.freebsd.devel.office/1754
> The patch does help.

> Author: Andriy Gapon 
> Date:   Thu Mar 28 20:08:13 2013 +0200
> 
> [test!] uipc_shutdown: use soisdisconnected instead of socantsendmore
> 
> So that in addition to disabling sends we also wake up threads blocked
> in accept (on so_timeo in general).
> 
> diff --git a/sys/kern/uipc_usrreq.c b/sys/kern/uipc_usrreq.c
> index 9b60eab..e93d46c 100644
> --- a/sys/kern/uipc_usrreq.c
> +++ b/sys/kern/uipc_usrreq.c
> @@ -1074,7 +1074,7 @@ uipc_shutdown(struct socket *so)
> 
>   UNP_LINK_WLOCK();
>   UNP_PCB_LOCK(unp);
> - socantsendmore(so);
> + soisdisconnected(so);
>   unp_shutdown(unp);
>   UNP_PCB_UNLOCK(unp);
>   UNP_LINK_WUNLOCK();

I think this patch makes shutdown(SHUT_WR) on unix domain sockets act
like shutdown(SHUT_RDWR), i.e. receives are incorrectly denied.

The below patch also makes shutdown(SHUT_RDWR) abort a blocking accept
on a unix domain socket, but it should work for all domains:

Index: sys/kern/uipc_socket.c
===
--- sys/kern/uipc_socket.c  (revision 248873)
+++ sys/kern/uipc_socket.c  (working copy)
@@ -2428,9 +2428,11 @@ soshutdown(struct socket *so, int how)
sorflush(so);
if (how != SHUT_RD) {
error = (*pr->pr_usrreqs->pru_shutdown)(so);
+   wakeup(&so->so_timeo);
CURVNET_RESTORE();
return (error);
}
+   wakeup(&so->so_timeo);
CURVNET_RESTORE();
return (0);
 }

A blocking accept (and some other operations) is waiting on
&so->so_timeo. Once it wakes up, it will detect the SBS_CANTRCVMORE bit.

A spurious wakeup on so->so_timeo appears harmless (sleep retried)
except when lingering on close (SO_LINGER) so this should be fairly
safe.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Fwd: Where does FreeBSD tr -C differ from tr -c?

2013-04-07 Thread Jilles Tjoelker
On Sun, Apr 07, 2013 at 09:12:57PM +0200, Cedric Blancher wrote:
> Forwarding to freebsd-hackers@/freebsd-standa...@freebsd.org

> The question remain open and I need help. tr -C is implemented by
> FreeBSD tr -C but I can't find examples (or a testcase) where tr -c
> and tr -C differ.

Reading the rationale of POSIX, here is an example of a difference:

% printf 'a\200'|LC_ALL=en_US.US-ASCII tr -cd '\000-\177'|hd
  61|a|
0001
% printf 'a\200'|LC_ALL=en_US.US-ASCII tr -Cd '\000-\177'|hd 
  61 80 |a.|
0002

Because the bytes 128..255 are not characters in us-ascii, they cannot
be removed with -Cd, only with -cd.

Here is another difference (using LC_CTYPE=en_US.UTF-8, rest C):

% echo $'\U0001a000'|tr -cd '\U0001a000'|hd 
% echo $'\U0001a000'|tr -Cd '\U0001a000'|hd
  f0 9a 80 80   ||
0004

The cause is that iswrune(3) returns false for the unassigned code point
U+0001A000.

This may well contain bugs because Unicode adds new characters from time
to time and our tables seem to be updated very rarely.

POSIX also says things about collation order. You may not have detected
this because FreeBSD does not implement LC_COLLATE for multibyte locales
yet.

> PS: Who wrote tr -C and how can I contact the author?

You can read the Subversion logs but people may no longer be around.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: GSOC 2013 project " Kernel Size Reduction for Embedded System "

2013-04-08 Thread Jilles Tjoelker
On Mon, Apr 08, 2013 at 08:28:04PM +, Amit Rawat wrote:
> GSOC posted the list of selected organization for GSOC 2013 and I am
> highly happy that FreeBSD is among the selected organization.

> I am a third year student interested to work in the field of embedded
> system. I applied last year and the title of my project was " Kernel
> Size Reduction for Embedded System". The link to my last year proposal
> is
> https://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/amitrawat10/1#c8001

> But due to some flaws it doesn't get selected. I am looking to improve my
> proposal for this year and apply again. I explain some portion of my
> project pictorially on my blog

> https://amit10rawat.wordpress.com/2013/02/26/kernel-size-reduction-for-embedded-system/

> I am looking for suggestion and new ideas by which I can reduce the
> size of kernel.

It looks like the overlay idea could be implemented more simply by
taking advantage of the VM system: make part of the kernel code
pageable. Memory formerly occupied by rarely used kernel code can then
be used by userland applications. You will need some sort of backing
store where the kernel code can be read after booting; this is not
normally available.

However, almost no kernel code is safe in a situation where an
instruction fetch may fault. Reading or writing the secondary storage
can easily cause a deadlock. It causes the thread to sleep, which is not
allowed while holding a mutex. It would help if you could wire down
pieces which will need to be used in the near future from a place where
a fault is safe, but this may also be very slow even if the code is
already in memory.

Some other ideas for kernel size reduction:

* Find pieces of code that are required but seem big for what they do
  for you, and try to make them smaller. The proposal should list
  concrete parts.

* Find variables and functions that are required only during kernel
  initialization, place them in a special section and add this section
  to the free memory pool after kernel initialization.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: GSOC: Qt front-end for freebsd-update

2013-04-14 Thread Jilles Tjoelker
On Sun, Apr 14, 2013 at 02:11:44AM -0400, Justin Edward Muniz wrote:
>  I am excited for this year's Google Summer of Code, and I have a
> project in mind that I am working to propose.

>  I am a CS major and have experience with Qt, C++ and shell scripting.
> I have been developing on FreeBSD for several years, and I am looking to
> tackle developing a new Qt front-end for the freebsd-update command.

>  I am thinking a rather straight-forward user interface with more
> advanced options available (such as selecting a specific server). I am
> looking for constructive feedback, and also a developer interested in
> mentoring me this Summer.

If you want to work on freebsd-update, I think it will be more useful to
improve freebsd-update itself. Some issues are:

* It is not particularly robust in the face of errors (such as a full
  disk). From reading the documentation, you might get the impression
  that it is better than it actually is. The state for the rollback
  command is only set up after installation of the update so that
  command is not useful for backing out a partially installed update.

* The upgrade feature takes large amounts of time and network bandwidth.

* Less required user interaction would be nice. Think of upgrading many
  machines.

* freebsd-update and pkgng are two upgrade systems. Perhaps
  freebsd-upgrade should be scrapped altogether in favour of a
  pkgng-based solution.

If you want to write GUIs, perhaps a pkgng GUI is more interesting.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


[patch] accept4

2013-04-19 Thread Jilles Tjoelker
The accept4() function, compared to accept(), allows setting the new
file descriptor atomically close-on-exec and explicitly controlling the
non-blocking status on the new socket. (Note that the latter point means
that accept() is not equivalent to any form of accept4().)

The linuxulator's accept4 implementation leaves a race window where the
new file descriptor is not close-on-exec because it calls sys_accept().
This implementation leaves no such race window (by using falloc()
flags). The linuxulator could be fixed and simplified by using the new
code.

The comms/openobex port now compiles again without the
patch-lib_cloexec.h patch.

More information about API extensions to set new file descriptors
atomically close-on-exec is at
https://wiki.freebsd.org/AtomicCloseOnExec

I'm also working on pipe2() (using linuxulator work) and dup3() (patch
from Jukka A. Ukkonen).

commit 2e410d9260d12328b689b0938e09d09649b0460a
Author: Jilles Tjoelker 
Date:   Sat Mar 30 21:44:14 2013 +0100

Add accept4() system call.

diff --git a/lib/libc/sys/Makefile.inc b/lib/libc/sys/Makefile.inc
index fef0f3c..105f469 100644
--- a/lib/libc/sys/Makefile.inc
+++ b/lib/libc/sys/Makefile.inc
@@ -270,6 +270,7 @@ MAN+=   sctp_generic_recvmsg.2 \
wait.2 \
write.2
 
+MLINKS+=accept.2 accept4.2
 MLINKS+=access.2 eaccess.2 \
access.2 faccessat.2
 MLINKS+=brk.2 sbrk.2
diff --git a/lib/libc/sys/Symbol.map b/lib/libc/sys/Symbol.map
index 6faa0af..149fa41 100644
--- a/lib/libc/sys/Symbol.map
+++ b/lib/libc/sys/Symbol.map
@@ -378,6 +378,7 @@ FBSD_1.2 {
 };
 
 FBSD_1.3 {
+   accept4;
bindat;
cap_fcntls_get;
cap_fcntls_limit;
@@ -461,6 +462,8 @@ FBSDprivate_1.0 {
__sys_abort2;
_accept;
__sys_accept;
+   _accept4;
+   __sys_accept4;
_access;
__sys_access;
_acct;
diff --git a/lib/libc/sys/accept.2 b/lib/libc/sys/accept.2
index 978b948..e8bccaf 100644
--- a/lib/libc/sys/accept.2
+++ b/lib/libc/sys/accept.2
@@ -41,6 +41,8 @@
 .In sys/socket.h
 .Ft int
 .Fn accept "int s" "struct sockaddr * restrict addr" "socklen_t * restrict 
addrlen"
+.Ft int
+.Fn accept4 "int s" "struct sockaddr * restrict addr" "socklen_t * restrict 
addrlen" "int flags"
 .Sh DESCRIPTION
 The argument
 .Fa s
@@ -66,6 +68,26 @@ and
 signals from the original socket
 .Fa s .
 .Pp
+The
+.Fn accept4
+system call is similar,
+but the
+.Dv O_NONBLOCK
+property of the new socket is instead determined by the
+.Dv SOCK_NONBLOCK
+flag in the
+.Fa flags
+argument,
+the
+.Dv O_ASYNC
+property is cleared,
+the signal destination is cleared
+and the close-on-exec flag on the new file descriptor can be set via the
+.Dv SOCK_CLOEXEC
+flag in the
+.Fa flags
+argument.
+.Pp
 If no pending connections are
 present on the queue, and the original socket
 is not marked as non-blocking,
@@ -141,13 +163,15 @@ properties and the signal destination being inherited,
 but should set them explicitly using
 .Xr fcntl 2 .
 .Sh RETURN VALUES
-The call returns \-1 on error.
-If it succeeds, it returns a non-negative
+These calls return \-1 on error.
+If they succeed, they return a non-negative
 integer that is a descriptor for the accepted socket.
 .Sh ERRORS
 The
 .Fn accept
-system call will fail if:
+and
+.Fn accept4
+system calls will fail if:
 .Bl -tag -width Er
 .It Bq Er EBADF
 The descriptor is invalid.
@@ -180,6 +204,16 @@ are present to be accepted.
 A connection arrived, but it was closed while waiting
 on the listen queue.
 .El
+.Pp
+The
+.Fn accept4
+system call will also fail if:
+.Bl -tag -width Er
+.It Bq Er EINVAL
+The
+.Fa flags
+argument is invalid.
+.El
 .Sh SEE ALSO
 .Xr bind 2 ,
 .Xr connect 2 ,
@@ -194,3 +228,8 @@ The
 .Fn accept
 system call appeared in
 .Bx 4.2 .
+.Pp
+The
+.Fn accept4
+system call appeared in
+.Fx 10.0 .
diff --git a/lib/libthr/pthread.map b/lib/libthr/pthread.map
index 355edea..bbbd930e 100644
--- a/lib/libthr/pthread.map
+++ b/lib/libthr/pthread.map
@@ -181,6 +181,7 @@ FBSDprivate_1.0 {
___wait;
___waitpid;
__accept;
+   __accept4;
__aio_suspend;
__close;
__connect;
@@ -408,3 +409,7 @@ FBSD_1.2 {
setcontext;
swapcontext;
 };
+
+FBSD_1.3 {
+   accept4;
+};
diff --git a/lib/libthr/thread/thr_syscalls.c b/lib/libthr/thread/thr_syscalls.c
index 2327d74..7a08302 100644
--- a/lib/libthr/thread/thr_syscalls.c
+++ b/lib/libthr/thread/thr_syscalls.c
@@ -101,6 +101,7 @@ extern pid_t__waitpid(pid_t, int *, int);
 extern int __sys_aio_suspend(const struct aiocb * const[], int,
const struct timespec *);
 extern int __sys_accept(int, struct sockaddr *, socklen_t *);
+extern int __sys_accept4(int, struct sockaddr *, socklen_t *, int);
 extern int __sys_connect(int, const struct sockaddr *, socklen_t);
 extern int __sys_fsync(int);
 extern int __sys_msync(void

[patch] pipe2

2013-04-21 Thread Jilles Tjoelker
On Sat, Apr 20, 2013 at 12:48:39AM +0200, Jilles Tjoelker wrote:
> I'm also working on pipe2() (using linuxulator work) and dup3() (patch
> from Jukka A. Ukkonen).

This is an implementation of pipe2. As with the accept4 patch, make
sysent needs to be run in sys/kern and sys/compat/freebsd32.

As a bonus, new architectures might implement pipe(p) as pipe2(p, 0)
avoiding the need for assembler (but behaviour is different if the
pointer is invalid).

I will do a __FreeBSD_version bump once these are in.

commit 256ab750c5e70db85d87c884994eb13038b182a2
Author: Jilles Tjoelker 
Date:   Sat Mar 30 23:32:34 2013 +0100

Add pipe2() system call.

diff --git a/include/unistd.h b/include/unistd.h
index dabf178..9df0777 100644
--- a/include/unistd.h
+++ b/include/unistd.h
@@ -533,6 +533,7 @@ char*mktemp(char *);
 #endif
 int nfssvc(int, void *);
 int nlm_syscall(int, int, int, char **);
+int pipe2(int *, int);
 int profil(char *, size_t, vm_offset_t, int);
 int rcmd(char **, int, const char *, const char *, const char *, int *);
 int rcmd_af(char **, int, const char *,
diff --git a/lib/libc/sys/Makefile.inc b/lib/libc/sys/Makefile.inc
index 105f469..8e918bf 100644
--- a/lib/libc/sys/Makefile.inc
+++ b/lib/libc/sys/Makefile.inc
@@ -352,6 +352,7 @@ MLINKS+=pathconf.2 lpathconf.2
 MLINKS+=pdfork.2 pdgetpid.2\
pdfork.2 pdkill.2 \
pdfork.2 pdwait4.2
+MLINKS+=pipe.2 pipe2.2
 MLINKS+=read.2 pread.2 \
read.2 preadv.2 \
read.2 readv.2
diff --git a/lib/libc/sys/Symbol.map b/lib/libc/sys/Symbol.map
index 149fa41..24f4621 100644
--- a/lib/libc/sys/Symbol.map
+++ b/lib/libc/sys/Symbol.map
@@ -393,6 +393,7 @@ FBSD_1.3 {
ffclock_getcounter;
ffclock_getestimate;
ffclock_setestimate;
+   pipe2;
posix_fadvise;
wait6;
 };
diff --git a/lib/libc/sys/pipe.2 b/lib/libc/sys/pipe.2
index 92d137f..bb3db51 100644
--- a/lib/libc/sys/pipe.2
+++ b/lib/libc/sys/pipe.2
@@ -28,7 +28,7 @@
 .\" @(#)pipe.2 8.1 (Berkeley) 6/4/93
 .\" $FreeBSD$
 .\"
-.Dd January 30, 2006
+.Dd March 31, 2013
 .Dt PIPE 2
 .Os
 .Sh NAME
@@ -40,6 +40,8 @@
 .In unistd.h
 .Ft int
 .Fn pipe "int fildes[2]"
+.Ft int
+.Fn pipe2 "int fildes[2]" "int flags"
 .Sh DESCRIPTION
 The
 .Fn pipe
@@ -50,6 +52,29 @@ which is an object allowing
 bidirectional data flow,
 and allocates a pair of file descriptors.
 .Pp
+The
+.Fn pipe2
+system call allows control over the attributes of the file descriptors
+via the
+.Fa flags
+argument.
+Values for
+.Fa flags
+are constructed by a bitwise-inclusive OR of flags from the following
+list, defined in
+.In fcntl.h :
+.Bl -tag -width ".Dv O_NONBLOCK"
+.It Dv O_CLOEXEC
+Set the close-on-exec flag for the new file descriptors.
+.It Dv O_NONBLOCK
+Set the non-blocking flag for the ends of the pipe.
+.El
+.Pp
+If the
+.Fa flags
+argument is 0, the behavior is identical to a call to
+.Fn pipe .
+.Pp
 By convention, the first descriptor is normally used as the
 .Em read end
 of the pipe,
@@ -88,7 +113,9 @@ pipe in one direction.
 .Sh ERRORS
 The
 .Fn pipe
-system call will fail if:
+and
+.Fn pipe2
+system calls will fail if:
 .Bl -tag -width Er
 .It Bq Er EMFILE
 Too many descriptors are active.
@@ -97,6 +124,16 @@ The system file table is full.
 .It Bq Er ENOMEM
 Not enough kernel memory to establish a pipe.
 .El
+.Pp
+The
+.Fn pipe2
+system call will also fail if:
+.Bl -tag -width Er
+.It Bq Er EINVAL
+The
+.Fa flags
+argument is invalid.
+.El
 .Sh SEE ALSO
 .Xr sh 1 ,
 .Xr fork 2 ,
@@ -111,3 +148,8 @@ function appeared in
 .Pp
 Bidirectional pipes were first used on
 .At V.4 .
+.Pp
+The
+.Fn pipe2
+function appeared in
+.Fx 10.0 .
diff --git a/sys/compat/freebsd32/syscalls.master 
b/sys/compat/freebsd32/syscalls.master
index 2cbdf31..be245be 100644
--- a/sys/compat/freebsd32/syscalls.master
+++ b/sys/compat/freebsd32/syscalls.master
@@ -1026,3 +1026,4 @@
struct sockaddr * __restrict name, \
__socklen_t * __restrict anamelen, \
int flags); }
+542AUE_PIPENOPROTO { int pipe2(int *fildes, int flags); }
diff --git a/sys/kern/capabilities.conf b/sys/kern/capabilities.conf
index 0b64503..d2fa51c 100644
--- a/sys/kern/capabilities.conf
+++ b/sys/kern/capabilities.conf
@@ -490,6 +490,7 @@ pdkill
 ## Allow pipe(2).
 ##
 pipe
+pipe2
 
 ##
 ## Allow poll(2), which will be scoped by capability rights.
diff --git a/sys/kern/sys_pipe.c b/sys/kern/sys_pipe.c
index 90c3022..493fee5e 100644
--- a/sys/kern/sys_pipe.c
+++ b/sys/kern/sys_pipe.c
@@ -477,6 +477,24 @@ sys_pipe(struct thread *td, struct pipe_args *uap)
return (0);
 }
 
+int
+sys_pipe2(struct thread *td, struct pipe2_args *uap)
+{
+   int error, fildes[2];
+
+   if (uap->flags & ~(O_CLOEXEC | O_NONBLOCK))
+   return (EINVAL);
+   error = kern_pipe2(td, fildes, uap-

Re: [patch] pipe2

2013-04-22 Thread Jilles Tjoelker
On Mon, Apr 22, 2013 at 08:31:18PM +0300, Konstantin Belousov wrote:
> On Sun, Apr 21, 2013 at 11:02:43PM +0200, Jilles Tjoelker wrote:
> > On Sat, Apr 20, 2013 at 12:48:39AM +0200, Jilles Tjoelker wrote:
> > > I'm also working on pipe2() (using linuxulator work) and dup3() (patch
> > > from Jukka A. Ukkonen).

> > This is an implementation of pipe2. As with the accept4 patch, make
> > sysent needs to be run in sys/kern and sys/compat/freebsd32.

> > As a bonus, new architectures might implement pipe(p) as pipe2(p, 0)
> > avoiding the need for assembler (but behaviour is different if the
> > pointer is invalid).

> I do not see anything wrong with the patch.  That said, I prefer the
> pipe(2) approach of delegating the access to the usermode memory to
> usermode, which avoids the need of the calls to kern_close().

Delegating the access gives better semantics (a signal instead of an
[EFAULT] error) but I don't like adding architecture-specific assembler
wrappers like pipe.S. Also, different calling conventions than expected
from the C level prototype may make things harder to understand.

Perhaps the wrapper can be implemented in C with the below declarations
but it is quite likely that this is not fully portable.

struct pipereturn { register_t a, b; };
struct pipereturn __sys_pipe(void);

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Some improvements to rm(1)

2013-04-26 Thread Jilles Tjoelker
On Thu, Apr 25, 2013 at 10:56:10PM -0400, Eitan Adler wrote:
> On 25 April 2013 22:50, Brooks Davis  wrote:
> > On Thu, Apr 25, 2013 at 10:16:32PM -0400, Eitan Adler wrote:
> >> Anyone have thoughts on the following?

> >> commit 82c78ba923d8ce4a1bfbb309658c49021c8bb384
> >> Author: Eitan Adler 
> >> Date:   Thu Apr 25 22:14:49 2013 -0400

> >> Take some improvements from DragonFlyBSD:
> >>   - add const where appropriate
> >>   - add static where appropriate
> >>   - fix a whitespace issues
> >
> > The no-op changes look more correct to me.

> > I think the -x option seems a bit odd.  What is the use case?  At a
> > first thought, it seems to raise more questions than it resolves.

> It goes along with cp -x, find -x, and others.

> Quick example #1: You have /usr/ports /usr/ports/distfiles as
> different mount points it lets you wipe /usr/ports without wiping your
> distfile cache.

> Quick example #2: You have /usr/src/ null mounted in every user's
> /home/ and you want to wipe one home directory.

Hmm, isn't this already possible using  find -x DIR -delete  ?

There will be an error message 'Device busy' about attempting to delete
the mount point but this does not even affect the exit status and all
files not under the mount point are removed.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Some improvements to rm(1)

2013-04-26 Thread Jilles Tjoelker
On Fri, Apr 26, 2013 at 08:33:54AM -0500, Joshua Isom wrote:
> On 4/26/2013 7:23 AM, Eitan Adler wrote:
> > Yes, rm's functionality can be fully replicated by find.

> As well as anything using -R.

Emulating other -R things using find becomes quite slow when you don't
want to impose {PATH_MAX} limits or open up symlink-based race windows
because the only safe option is -execdir UTILITY {} \;. Any find command
based on -exec, -print or -print0 passes pathnames which are subject to
{PATH_MAX} limits and directories concurrently replaced with symlinks.

The construct -execdir ... {} + is unusably broken in older FreeBSD
versions and gives no advantage compared to -execdir ... {} \; in recent
-CURRENT.

With -L, this is not a new problem because symlinks are followed anyway
and the underlying code (fts(3)) always imposes the {PATH_MAX} limit in
that case.

The -delete primary is safe like -execdir.

I'm not entirely sure about this because the rm(1) patch is simple and
the new syntax is fairly clear.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: SystemV IPC. Segment info

2013-04-30 Thread Jilles Tjoelker
On Tue, Apr 30, 2013 at 07:49:40AM +0400, Vagner wrote:
> Tell me please, may I send this to PR, or this changes is not valid?

> > A few weeks ago, I ran into problem, which related to SystemV IPC.
> > More than 20 processes attached to a segment shared queue.
> > Process-initiator for create segment was killed, as process which
> > was accessed to segment last. Segment didn't free memory, but tagged
> > it as SHMSEG_REMOVED as the result. This is a reason of memory
> > overflow (memory assotiated as shm). Moreover, processes, which was
> > attached to this segment did't get a new data. I have one resolve. I
> > need to restarted all process, which still attached to segment. But
> > this reason have a problem. We haven't list of this processes at
> > system. Moreover, struct shmid_ds, which described segment, haven't
> > this info too.

> > This patch is a resolve of problem. It: 
> > - added a linked list of structures shmid_pi in struct shmid_ds. PID
> > (and last access time) recorded to this struct consistently. Memory
> > allocates with ident 'shminfo' for this list of struct shmid_pi.
> > - added syscall shminf for get all elements from list shmid_ds.
> > - added option [-P] in ipcs(1) for system call shminf.

I think it is strange to maintain pids in the VM system. This makes the
VM system more complex for little reason (because the information is
only needed for monitoring, not normal operation).

Perhaps it is possible to do what you want using procstat -v or a slight
extension to it.

Alternatively, you may find a different way to get rid of your stale
worker processes. For example, before invoking shmctl(IPC_RMID), set a
flag inside the memory segment that all workers should exit and activate
whatever mechanism you use for telling the workers that they need to do
something.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Managing userland data pointers in kqueue/kevent

2013-05-19 Thread Jilles Tjoelker
On Wed, May 15, 2013 at 01:34:58PM +0100, Paul LeoNerd Evans wrote:
> On Wed, 15 May 2013 13:29:59 +0100
> Paul "LeoNerd" Evans  wrote:

> > Is that not the exact thing I suggested?

> > The "extension to create register a kevent to catch these events" is
> > that you put the EV_DROPWATCH bit flag in the event at the time you
> > register it.

> > The "returned event [that] could have all the appropriate informaiton
> > for the event being dropped" is that you receive an event with
> > EV_DROPPED set on it. It being a real event includes of course the
> > udata pointer, so you can handle it.

> In fact, to requote the original PR I wrote[1] on the subject:

> ---

>   I propose the addition of a new flag applicable to any kevent watch
>   structure, documented thusly:

> The flags field can contain the following values:
> ..
> EV_DROPWATCH Requests that the kernel will send an EV_DROPPED event
>  on this watch when it has finished watching it for any
>  reason, including EV_DELETE, expiry because of
>  EV_ONESHOT, or because the filehandle was closed by
>  close(2).
> 
> EV_DROPPED   This flag is returned by the kernel if it is now about
>  to drop the watch. After this flag has been received,
>  no further events will occur on this watch.

>   This flag then makes it trivial to build a generic wrapper for kqueue
>   that can always manage its memory correctly.

>   a) at EV_ADD time, simply set flags |= EV_DROPWATCH

>   b) after an event has been processed that included the EV_DROPPED
>   flag, free() the pointer given in the udata field.

An important detail is missing: how do you avoid using up all kernel
memory on knotes if someone keeps adding new file descriptors with
EV_ADD | EV_DROPWATCH and closing the file descriptors again without
ever draining the kqueue?

This problem did not use to exist for file descriptor events before: the
number of such knotes was limited to the number of open file
descriptors. However, it does already exist for most of the other event
types. For example, pwait -v will return the exit status even if it was
suspended (^Z) while the process terminated and the parent reaped the
zombie. For EVFILT_TIMER, the worst effect is a denial of service of
EVFILT_TIMER on all other processes in the system. EVFILT_USER does not
appear to check anything and appears to allow arbitrary kernel memory
consumption.

The EVFILT_TIMER needs to keep its global limit and EVFILT_USER needs
something similar.

For the rest, call an event that is no longer associated to a kernel
object (e.g. EVFILT_READ whose file descriptor is closed, EVFILT_PROC
whose process has terminated and been reaped by the parent or EVFILT_AIO
whose I/O request is completed) "unbound". The number of events that are
not unbound is limited by existing limits on the other kernel objects. A
possible fix is to reject (such as with [ENOMEM]) adding new events when
there are too many unbound events in the queue already. The application
should then allow kevent() to return pending events first before it adds
new ones. If the kernel returns unbound events in preference to other
events, a kevent() call with nevents >= 2 * nchanges cannot result in a
net increase in the number of current and potential unbound events,
since it allows the kernel to return (and forget) as many unbound events
as it may add (nchanges entries are required for EV_ERROR leaving
nchanges for returning other events).

>   It is not required that these two flags have distinct values; since
>   one is userland->kernel and the other kernel->userland, they could for
>   neatness reuse the same bit field.

I think it would be consistent with other EV_* to use the same name and
value for both.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: find -delete broken, or just used improperly?

2013-05-20 Thread Jilles Tjoelker
On Mon, May 20, 2013 at 03:23:16PM -0400, Kurt Lidl wrote:
> OK, maybe I'm missing something obvious, but...

> find(1) says:

>  -delete
>  Delete found files and/or directories.  Always returns true.
>  This executes from the current working directory as find recurses
>  down the tree.  It will not attempt to delete a filename with a
>  ``/'' character in its pathname relative to ``.'' for security
>  reasons.  Depth-first traversal processing is implied by this
>  option.  Following symlinks is incompatible with this option.

> However, it fails even when the path is absolute:

> bhyve9# mkdir /tmp/foo
> bhyve9# find /tmp/foo -empty -delete
> find: -delete: /tmp/foo: relative path potentially not safe

> Shouldn't this work?

The "relative path" refers to a pathname that contains a slash other
than at the beginning or end and may therefore refer to somewhere else
if a directory is concurrently replaced by a symlink.

When -L is not specified and "." can be opened, the fts(3) code
underlying find(1) is careful to avoid following symlinks or being
dropped in different locations by moving the directory fts is currently
traversing. If a problematic concurrent modification is detected, fts
will not enter the directory or abort. Files found in the search are
returned via the current working directory and a pathname not containing
a slash.

For paranoia, find(1) verifies this when -delete is used. However, it is
too paranoid about the root of the traversal. It is already assumed that
the initial pathname does not refer to directories or symlinks that
might be replaced by untrusted users; otherwise, the whole traversal
would be unsafe. Therefore, it is not necessary to do the check for
fts_level == FTS_ROOTLEVEL.

The below patch allows deleting the pathname given to find itself:

Index: usr.bin/find/function.c
===
--- usr.bin/find/function.c (revision 250661)
+++ usr.bin/find/function.c (working copy)
@@ -442,7 +442,8 @@
errx(1, "-delete: forbidden when symlinks are followed");
 
/* Potentially unsafe - do not accept relative paths whatsoever */
-   if (strchr(entry->fts_accpath, '/') != NULL)
+   if (entry->fts_level > FTS_ROOTLEVEL &&
+   strchr(entry->fts_accpath, '/') != NULL)
errx(1, "-delete: %s: relative path potentially not safe",
entry->fts_accpath);
 

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: find -delete broken, or just used improperly?

2013-05-24 Thread Jilles Tjoelker
On Tue, May 21, 2013 at 11:06:39AM -0400, John Baldwin wrote:
> On Monday, May 20, 2013 5:47:31 pm Jilles Tjoelker wrote:
> > The below patch allows deleting the pathname given to find itself:

> > Index: usr.bin/find/function.c
> > ===
> > --- usr.bin/find/function.c (revision 250661)
> > +++ usr.bin/find/function.c (working copy)
> > @@ -442,7 +442,8 @@
> > errx(1, "-delete: forbidden when symlinks are followed");
> >  
> > /* Potentially unsafe - do not accept relative paths whatsoever */
> > -   if (strchr(entry->fts_accpath, '/') != NULL)
> > +   if (entry->fts_level > FTS_ROOTLEVEL &&
> > +   strchr(entry->fts_accpath, '/') != NULL)
> > errx(1, "-delete: %s: relative path potentially not safe",
> > entry->fts_accpath);

> I'm curious, how would you instruct a patched find to avoid deleteing
> the /tmp/foo directory (e.g. if you wanted this to be a job that
> pruned empty dirs from /tmp/foo but never pruned the directory
> itself).  Would -mindepth 1 do it?  (Just asking.  I have also found
> this message annoying but most of the jobs I have seen it on probably
> don't want to delete the root path, just descendants.)

-mindepth 1 works, as does cd /tmp/foo && find . -... (-delete silently
ignores . and ..).

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: /bin/sh => STDIN & functions, var scope messing

2013-05-30 Thread Jilles Tjoelker
On Tue, May 28, 2013 at 11:48:47AM +0200, Václav Zeman wrote:
> On 27 May 2013 21:58, Reid Linnemann wrote:
> > from SH(1)

> > "Note that unlike some other shells, sh executes each process in a pipe-
> >  line with more than one command in a subshell environment and as a
> > child
> >  of the sh process."

> > I'm taking this to mean that redirecting to sh_f has sh_f execute in
> > a subshell in which global_scope_var changes, but the original
> > shell's copy is uncahnged.
> Curious. Which of the two behaviours is POSIXly correct?

Both. As per XCU 2.12 Shell Execution Environment, each command in a
multi-command pipeline may or may not be executed in a subshell
environment.

Behaviour different from our sh is most often encountered in the various
versions of the real Korn shell (ksh88 and ksh93), which execute the
last command in a pipeline in the current shell environment.

If things like  jobs | cat  work, that can also be explained using this
rule.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Importing tradcpp (traditional (K&R-style) C macro preprocessor) into base?

2013-06-11 Thread Jilles Tjoelker
On Wed, Jun 12, 2013 at 12:23:44AM +0200, Matthias Andree wrote:
> Am 12.06.2013 00:11, schrieb Baptiste Daroussin:
> > I have been working in importing tradcpp (developped by David A.
> > Holland from NetBSD) into the ports tree, it is a traditional
> > (K&R-style) C macro preprocessor BSD licensed. I first worked on it
> > so that imake can work properly without gcc.

> > I discovered that some part of the base system still needs a
> > traditional preprocessor, like (calendar), what I propose it to
> > import tradcpp into the base system (not the version in port right
> > now but what will become version 0.2).

> > It mostly behave like gcpp, and I'm able to properly use calendar
> > along with tradcpp with this small patch:
> > http://people.freebsd.org/~bapt/tradcpp.diff

> > Any objections against me importing it?

> Shouldn't we fix calendar and imake so that they can use a modern cpp,
> instead of going back 25 years?  Or am I missing the point here?

A modern cpp is only suitable for preprocessing C code. It makes various
assumptions about syntax that make it work better for preprocessing
actual C code but worse for text that is not C. Calendar and imake use
the traditional cpp for preprocessing non-C text.

There is already a patch for removing the dependency on cpp in calendar.
I think that makes more sense than importing a traditional cpp (assuming
there are no other users in base that cannot be fixed).

The "fix" for imake is probably not to use it. Ports that need imake can
continue to use it if it depends on a traditional cpp port.

A further point of confusion is that things that want a "traditional
cpp" in fact want an equivalent of "gcpp -traditional". This is in fact
the point of tradcpp and why the old preprocessors do not work.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: make: evaluation of symbolic link with ../ fails

2007-11-06 Thread Jilles Tjoelker
On Tue, Nov 06, 2007 at 05:20:14PM +0100, Julian H. Stacey wrote:
> It seems to me that all except csh (including bourne shell !) are
> broken !! Amazing ! None of them cope properly actually following
> symbolic links, they all make false premise the /some_path/.. ==
> /some_path !

This may be a little confusing, but is indeed the defined POSIX
behaviour for pwd and cd.  $PWD is used to store the "logical" path
(potentially containing symlinks) that was cd'ed to.  You can avoid the
special behaviour with the -P (physical) option.  cd -P will not treat
symlinks specially and store a pathname not containing symlinks in $PWD;
pwd -P will ignore $PWD and show a pathname not containing symlinks.

A further thing to note is that csh does not have a pwd builtin (you are
supposed to use dirs or prompt substitutions apparently) and that
FreeBSD's /bin/pwd has the -P behaviour by default (you need -L to make
it use $PWD).

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: config(8) patch for review for src dir handling

2007-12-23 Thread Jilles Tjoelker
On Sun, Dec 23, 2007 at 10:40:39AM +0100, Dag-Erling Smørgrav wrote:
> Ed Maste <[EMAIL PROTECTED]> writes:
> > Right now config(8) calls realpath("../..", ... to find the src path
> > to write into the kernel Makefile.  I want to change this to use $PWD
> > with the last two path components removed, assuming it's the same dir
> > as ../.. .

> I'm worried that your patch assumes that $PWD is present and correct,
> for which there is no guarantee.  What happens if you use getcwd()
> instead of getenv("PWD")?

getcwd() does not use $PWD, it returns a pathname without symlinks.
So that would lead to the original behaviour.

A better way could be to use $PWD if it is set and an absolute logical
pathname referring to the current directory, as in src/bin/pwd/pwd.c .

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: /dev/null & zero inside chroot for make release

2010-05-13 Thread Jilles Tjoelker
On Thu, May 13, 2010 at 07:44:58PM +0200, Julian H. Stacey wrote:
> Problem with /dev/null & /dev/zero inside a chroot:
> I wanted to build a release from inside a chroot

> What sort of null & zero should be in chroot ?
> man mknod ... deprecated ...
> Should I be running a devfs (I'm not currently)
> Or a jail ? (I dont really want that level of encapsulation ).

Mount devfs in your chroot. Then use devfs(8) to limit what's visible in
that particular devfs, if you want.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: /dev/null & zero inside chroot for make release

2010-05-15 Thread Jilles Tjoelker
On Sat, May 15, 2010 at 09:12:04AM +0200, Tjado Mäcke wrote:
> Am 13.05.2010 19:44, schrieb Julian H. Stacey:
> > Hi Hackers,
> > Problem with /dev/null & /dev/zero inside a chroot:
> > I wanted to build a release from inside a chroot

> > What sort of null & zero should be in chroot ?
> > man mknod ... deprecated ...
> > Should I be running a devfs (I'm not currently)
> > Or a jail ? (I dont really want that level of encapsulation ).

> http://www.ijs.si/software/amavisd/README.chroot

> mknod dev/nullc  2 2   # FreeBSD

This is deprecated as of FreeBSD 5.0 and does not work at all as of
FreeBSD 6.0, see mknod(8). Only device nodes in devfs can be used to
access devices.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: libc symbol versioning difficulties with iconv integration

2010-06-04 Thread Jilles Tjoelker
On Fri, Jun 04, 2010 at 12:58:34PM +0200, Gabor Kovesdan wrote:
> I'm trying to integrate the result of my last SoC work to the base 
> system but I'm facing some difficulties with libc symbol versioning. I 
> placed the iconv code into an iconv subdirectory inside src/lib/libc and 
> I added a Makefile and a symbol map, just like another parts of libc do 
> but when I try to compile this stuff, I get this error in the linking phase:

> building shared library libc.so.7
> /usr/bin/ld: libc.so.7: undefined versioned symbol namefts_o...@fbsd_1.0
> /usr/bin/ld: failed to set dynamic section sizes: Bad value
> *** Error code 1

> I have no idea what's going wrong because I did everything exactly in 
> the same way as another components do. I don't know why does it break at 
> fts_open(), which is unrelated to iconv, not even used in the iconv 
> code. If I just unhook the iconv part fromt he build, everything goes 
> fine. Any ideas?

> Patch is here: http://kovesdan.org/patches/iconv-libc.diff

There is a  .include   in iconv/Makefile.inc, what happens
if you take that out?

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Help with some makefile hackery

2010-06-25 Thread Jilles Tjoelker
On Tue, Jun 22, 2010 at 11:18:43PM -0700, Patrick Mahan wrote:
> src-kern-tools:
>  cd src; ./-kernel-toolchain.sh 2>&1 | tee 

The pipeline will return the status of 'tee' which is almost always 0.
The exit status of the build script is ignored.

A simple fix is store the status in a file and refer to that afterwards.
Another fix uses a separate file descriptor to pass through the status,
like
  { st=$(
{
  { ./kernel-toolchain.sh 2>&1 3>&- 4>&-; echo $? >&3; } | tee logfile >&4
} 3>&1)
  } 4>&1
but this is fairly complicated.

The issue can be sidestepped entirely by sending the output to a file
only; a developer can use tail -f to follow the build if necessary.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: ar(1) format_decimal failure is fatal?

2010-08-29 Thread Jilles Tjoelker
On Sat, Aug 28, 2010 at 07:08:34PM -0400, Benjamin Kaduk wrote:
> I've been working on fixing the OpenAFS network filesystem client for 
> FreeBSD, and it's at the point where I want to use the lazy man's 
> filesystem stress test: buildworld.
> However, quite early in the process, I get the following error:
> >>> stage 1.1: legacy release compatibility shims
> [...]
> ===> tools/build (obj,includes,depend,all,install)
> [...]
> building static egacy library
> ar: fatal: Numeric user ID too large
> *** Error code 70

> This error appears to be coming from 
> lib/libarchive/archive_write_set_format_ar.c , which seems to only have 
> provisions for outputting a user ID in AR_uid_size = 6 columns.
> Now, AFS user IDs are not tied to unix IDs; there is a separate protection 
> server whose "pts" IDs are used for access control; these pts IDs are tied 
> to kerberos authentication.  In this case, I was authenticated as 
> daemon/freebuild.mit@athena.mit.edu, with pts ID 33554737, which needs 
> 8 columns to display.

> It looks like this macro was so defined in version 1.1 of that file, with 
> commit message "'ar' format support for libarchive, contributed by Kai 
> Wang.".  This doesn't make it terribly clear whether the 'ar' format 
> mandates this length, or if it is an implementation decision, so I get to 
> ask: what reasoning (if any) was behind this choice?  Would anything break 
> if it was bumped up to a larger size?  Are there other options for a 
> workaround in my AFS environment?

I wonder if the uid/gid fields are useful at all for ar archives. Ar
archives are usually not extracted, and when they are, the current
user's values seem good enough. The uid/gid also prevent exactly
reproducible builds (together with the timestamp).

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: tiny patch to prevent head from closing pipes

2010-08-29 Thread Jilles Tjoelker
On Sat, Aug 28, 2010 at 06:42:29PM +0400, Anonymous wrote:
> Alexander Best  writes:

> > i just had subversion complain about a broken pipe while piping its
> > output through awk straight to head [1]. i decided to add a switch
> > to head which will tell it to never close a pipe unless the input
> > has stopped [2].

> You can do same with sh(1), e.g.

>   $ svn log | (IFS=; while read li; do [ $((i+=1)) -le 10 ] && echo "$li"; 
> done)

> versus

>   $ svn log | (IFS=; while read li && [ $((i+=1)) -le 10 ]; do echo "$li"; 
> done)
>   ...
>   svn: Write error: Broken pipe

Even easier (and avoiding mangling through read) is
  svn log | { head; cat >/dev/null; }
Beware that the cat command will not necessarily get all of the input.
head may buffer reads and it is unable to push the extra data back onto
the pipe.

> But I think subversion should

I think Subversion should not print an error message for broken pipe on
its stdout.

> > there's probably a much more efficient way of discarding the input
> > without closing the pipe unless the input ceased. it's just a 5
> > minute hack in order to see if people find the idea useful or not.
> > ;)

> Can you give an example of usefulness that does not involve subversion?

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Why I can't trace linux process's childs with truss?

2010-09-10 Thread Jilles Tjoelker
On Fri, Sep 10, 2010 at 12:07:05PM -0700, Yuri wrote:
> I am trying to get the log of all system calls that skype makes with 
> truss -f /usr/local/share/skype/skype
> For some reason the resulting log only has the leading process calls and 
> nothing from it's 8 childs.
> Truss doesn't show any 'cloned' processes. Is this a bug in truss that 
> it doesn't follow 'cloned' processes?

> Is there any workaround or other way I can debug skype? strace doesn't 
> work on amd64.
> I am primarily interested why it can't read /dev/video0 device, created 
> by webcamd.

Try using ktrace instead of truss. You will need devel/linux_kdump from
ports to decode the resulting ktrace.out.

Alternatively, if you're familiar with dtrace, you could try that.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: sysrc -- a sysctl(8)-like utility for managing /etc/rc.conf et. al.

2010-10-12 Thread Jilles Tjoelker
On Sat, Oct 09, 2010 at 03:39:44PM -0700, Devin Teske wrote:
> On Oct 9, 2010, at 1:25 PM, Garrett Cooper wrote:
[...]
> The second usage (": function") aids in finding the function
> declaration among the usages. See, in Perl, I can simply search for
> "sub" preceding the function name. In C, I tend to split the return
> type from the function name and ensure that the function name always
> starts in column-1 so I can search for "^funcname" to go to the
> declaration opposed to the usages/references. In BASH, `function' is a
> valid keyword and you can say "function funcname ( ) BLOCK" but
> unfortunately in good ol' bourne shell, "function" is not an
> understood keyword, ... but really liking this keyword, I decided to
> make use of it in bourne shell by-way-of simply making it a
> non-executed-expression (preceded it with ":" and terminated it with
> ";").

I think function f() { ... } is a despicable bashism. In POSIX, an empty
pair of parentheses is an unambiguous indication of a function
definition; although FreeBSD sh also accepts () as a subshell containing
an empty command, there is no reason to use that.

In ksh, function f { ... } is different from f() { ... }, so there is a
reason to use it, and function f() { ... } does not exist.

> I originally had been programming in tests for '!' and 'in', but in
> POSIX bourne-shell, they aren't defined (though understood) in the
> keyword table (so type(1) balks in bourne-shell while csh and bash do
> respond to '!' and 'in' queries).

> Since you've pointed out command(1)... I now have a way of checking
> '!'. Though unfortunately, "command -v", like type(1), also does not
> like "in" (in bourne-shell at least).

The type builtin in the original Bourne shell does not know keywords.
The type builtin in FreeBSD sh knows them, but unlike most shells "in"
is not a keyword -- it is only recognized at the appropriate places in a
case or for command (this is probably a POSIX violation and may change
in the future).

> > "x$fmt" != x ? It seems like it could be simplified to:

> > if [ $# -gt 0 ]
> > then
> >local fmt=$1
> >shift 1
> >eprintf "$fmt\n" "$@"
> > fi
> 
> I never understood why people don't trust the tools they are using...
> 
> `[' is very very similar (if not identical) to test(1)
> 
> [ "..." ] is the same thing as [ -n "..." ] or test -n "..."
> [ ! "..." ] is the same things as [ -z "..." ] or test -z "..."
> 
> I'll never understand why people have to throw an extra letter in there and 
> then compare it to that letter.
> 
> If the variable expands to nothing, go ahead and let it. I've traced
> every possible expansion of variables when used in the following
> manner:

> [ "$VAR" ] ...

> and it never fails. If $VAR is anything but null, the entire
> expression will evaluate to true.

Right, except in very old implementations if VAR is -t, but those should
be extinct and are not worth coding for.

However, in some fairly recent implementations [ "$A" = "$B" ] does not
work as expected for all values of A and B. For example, FreeBSD 7.0
returns true for A='(' and B=')' because it applies the POSIX rules in
the wrong order (certainly, [ '(' "$X" ')' ] is true for most non-empty
values of X, but this does not apply if X is an operator).

> >> # depend $name [ $dependency ... ]
> >> #
> >> # Add a dependency. Die with error if dependency is not met.
> >> #
> >> : dependency checks performed after depend-function declaration
> >> : function ; depend ( ) # $name [ $dependency ... ]
> >> {
> >>local by="$1" arg
> >>shift 1
> >> 
> >>for arg in "$@"; do
> >>local d

> > Wouldn't it be better to declare this outside of the loop (I'm not
> > sure how optimal it is to place it inside the loop)?

> I'm assuming you mean the "local d" statement. There's no restriction
> that says you have to put your variable declarations at the beginning
> of a block (like in C -- even if only within a superficial block { in
> the middle of nowhere } ... like that).

> Meanwhile, in Perl, it's quite a difference to scope it to the loop
> rather than the block. So, it all depends on whichever _looks_ nicer
> to you ^_^

Although this works, I do not recommend it. It makes the impression that
the variable is scoped to the do block, which is not the case as there
is only function scope. Also, FreeBSD sh will register another local
variable record if you make the same variable local again (but it will
reset the variable to the correct value later).

> > I'd say that you have bigger fish to try if your shell lacks the
> > needed lexicon to parse built-ins like for, do, local, etc :)...

local might be worth checking for as the original Bourne shell and ksh93
don't know it.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Space character in rc.conf variable

2010-10-30 Thread Jilles Tjoelker
On Sat, Oct 30, 2010 at 01:11:51PM -0700, Doug Barton wrote:
> On 10/30/10 11:56, Harald Servat wrote:
> >My SSID has an space character and I don't know how to deal with
> >it. So my question is easy, how do I add an space in the SSID?
> >I've tried wrapping the SSID using \" , \' also adding \(space),
> >changing " for ' and others without success.

> >As an example my rc.conf entry looks like
> >  ifconfig_wlan0="ssid SSID WITH SPACE dhcp"
> >and after booting I see that ifconfig tried to connect to SSID instead to
> > "SSID WITH SPACE".

> >As I said, if I run ifconfig wlan0 ssid "SSID WITH SPACE" from
> >the command line, works fine.

> The best solution would be to change the SSID to one without spaces. :)

Well, as mentioned elsewhere in the thread write the SSID in hex. You
also need this if *, ? or [ occur in the SSID.

> If you can't or won't do that, one of these should work, please report 
> back which one does:

> ='ssid "ssid with space" DHCP'
> ='ssid \"ssid with space\" DHCP'
> ="ssid 'ssid with space' DHCP"
> ="ssid \'ssid with space\' DHCP"

This will not work, as the value is stored in a variable and then split
on IFS characters by expanding the variable outside quotes. There is no
escape mechanism - either the whole variable is split or none of it (if
in double-quotes).

One workaround would be to set IFS to something not containing ,
for example , and then delimiting fields with tabs instead
of spaces. Apart from the general ugliness all over the rc.confs
(including the defaults/ one), breakage would probably also happen in
rc.subr and network.subr.

Doug's suggestions point to something like
  eval "set -- ${ifconfig_FOO}"
  ifconfig $_if "$@"

This has potential security issues with arbitrary stuff from untrusted
sources (say, some sort of privilege separation where someone may
configure the network but not do arbitrary things) in /etc/rc.conf:
  ifconfig_wlan0='ssid $(chown evilhacker topsecretfile) dhcp'
is perfectly safe now, but will be insecure. (Pathname generation may
lead to information disclosure, but not executing arbitrary commands.)

Also, the syntax for the keywords such as DHCP is too flexible. This
currently does not do much harm other than stopping you from using
"DHCP" as an ssid or something, but is hard to remove from "$@" in the
above code. If they were restricted to appear at the start only, simple
comparisons of "$1" and shift would do the job, but removing from the
middle requires constructing a string with quoting characters and then
using eval on that string.

Array support in the shell could make this easier, but not much unless
the rc.conf syntax would be changed as well, like
  ifconfig_wlan0=(ssid "SSID WITH SPACE" dhcp)
but then changed some more such that "DHCP" can be used as SSID.

Array support would add a fair bit of code to sh, and even then it will
remain fairly limited and clumsy. For example, an array can only be
returned from a function by passing the name of an existing array to
place the result in and using eval, extending setvar in some strange way
or implementing another ksh93 extension, namerefs.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Pls sanity check my semtimedop(2) implementation

2008-07-18 Thread Jilles Tjoelker
On Sat, Jul 12, 2008 at 07:11:26PM -0400, Michael B Allen wrote:
> Below is a semtimedop(2) implementation that I'm using for FreeBSD. I
> was hoping someone could look it over and tell me if they think the
> implementation is sound.

> [snip semtimedop implementation that uses SIGALRM and relies on EINTR]

> The code seems to work ok but when stressing the FreeBSD build of my app
> I have managed to provoke errors related to concurrency (usually when a
> SIGALRM goes off). The Linux build works flawlessesly so I'm wondering
> about this one critical function that is different.

In your implementation, the SIGALRM signal may happen before you even
call semop(2). If so, most likely the semop(2) will hang arbitrarily
long.

A somewhat dirty fix is to put a nonzero value into it_interval. This
will ensure that if a signal is missed another one will be generated
quickly.

You might be able to fix this using setjmp/longjmp, but this is neither
pretty nor efficient.

Another dirty fix is to try non-blocking semop(2) several times with
sleeps in between.

> Do you think it would make any difference if I used
> ITIMER_VIRTUAL / SIGVTALRM instead of ITIMER_REAL / SIGALRM?

This does not fix the inherent problem.

Also, given your use of signals in the first place, your application
must be single threaded. This means the ITIMER_VIRTUAL timer does not
run while semop(2) is waiting.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: open(2) and O_NOATIME

2008-10-31 Thread Jilles Tjoelker
On Fri, Oct 31, 2008 at 02:48:42PM +0100, Paul Schenkeveld wrote:
> utimes(2) allows non-root users to (re)set atime provided they own the
> file or have write permission.  Having O_NOATIME follow the same rules
> would not break any assumed security any further than utimes(2) already
> does but greatfully benefit all kind of backup programs.

This is not entirely correct. utimes(2) with NULL timestamps (reset
atime and mtime to current time) is allowed to root, owner or with write
permission, but utimes(2) with given timestamps is only allowed to root
and owner. O_NOATIME seems equivalent to the latter, and in fact this is
the case in Linux (if someone else than root or the owner tries to open
a file with O_NOATIME, they get EPERM).

There's only a small detail missing: any utimes(2) call updates the
ctime, so you can see "something" happened to the file. Linux's
O_NOATIME does not update any times at all (this speeds up things).

Anyway, O_NOATIME (only for root/owner) seems a useful feature.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: How to tear down a geom mirror?

2009-03-06 Thread Jilles Tjoelker
On Thu, Mar 05, 2009 at 10:27:50PM -0800, Peter Steele wrote:
> I've created a USB boot disk that is used to clone itself onto the
> systems hard drives, setting up mirrored file systems in the process.
> The main difficulty I'm having is reimaging a system with an existing
> OS whose drives are already configured in a mirror. I want of course
> to destroy the mirror and create a complete new one, but I can't find
> the right process to accomplish this reliably. I don't want to make
> any assumptions about what mirrors might exist already and I
> definitely don't want to do "gmirror load" before I get a chance to
> destroy any existing mirrors. 

> What I am doing is to clean the drive using dd. For example, assume my
> target system has two drives ad1 and ad2. I issue the following
> commands: 

> dd if=/dev/zero of=/dev/ad1 bs=512 count=79 
> dd if=/dev/zero of=/dev/ad2 bs=512 count=79 

gmirror and various other geom modules store their metadata on the last
sector(s) of the drive, so you need to wipe that too.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: bin/113860: sh(1): shell is still running when using `sh -c'

2009-04-03 Thread Jilles Tjoelker
I think this can be improved.

Given that I've been digging in /bin/sh already...

Note first that sh already has some of this functionality:

% sh -c '{ echo a; sleep 10;}&'; sleep 1; ps T
a
  PID  TT  STAT  TIME COMMAND
94682  p9  Ss 0:00.07 zsh
94702  p9  S  0:00.00 sleep 10
94704  p9  R+ 0:00.00 ps T
%

This is the EV_EXIT flag to evaltree() and friends, in eval.c.

To make this work for '-c', evalstring() needs some flag like EV_EXIT,
and parsecmd() needs to tell evalstring() that the command it read is
the last (currently, parsecmd() only reports that there is no command
anymore; due to the stack-like memory management it is not really
possible to read ahead a command). Putting "{\n" and "\n}" around the
string could be an alternative for the latter, as any valid string would
consist of one (compound) command only.

The new mode for evalstring() would only be used for '-c' commands when
'-s' is not given.

Apart from bash, ksh93 and Solaris /usr/xpg4/bin/sh (which is basically
ksh88) also treat simple commands in '-c' this way. So I think the idea
is ok. I'm also slightly annoyed by seeing silly 'sh -c blah' processes
hanging around, and it is not always possible or desirable to add
'exec'.

On another note, the EV_EXIT mode is erroneously still used if a trap on
EXIT has been set (or, maybe, any trap at all; particularly if -T is in
effect). This means that such traps may not be executed. Most other
shells seem to do this right.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: bin/113860: sh(1): shell is still running when using `sh -c'

2009-04-12 Thread Jilles Tjoelker
On Fri, Apr 03, 2009 at 11:39:05PM +0200, Jilles Tjoelker wrote:
> I think this can be improved.

I have a patch now; it does not handle all cases but the effect on the
code is minimal. I decided to go with the readahead but only do it for
the first line.

To avoid problems with traps not being executed,
http://www.stack.nl/~jilles/unix/sh-forkiftrapped.patch is needed. This
fixes a bug in EV_EXIT handling, which would be triggered more often
with the change to -c. The patch is also needed for bin/74404.
Example: sh -c '(trap "echo trapped" EXIT; sleep 3)'

http://www.stack.nl/~jilles/unix/sh-minusc-exec.patch the change itself

-- 
Jilles Tjoelker
Don't skip forking for an external command if any traps are active.

Example:
  sh -c '(trap "echo trapped" EXIT; sleep 3)'
now correctly prints "trapped".

With this check, it is no longer necessary to check for -T
explicitly in that case.

diff --git a/eval.c b/eval.c
--- a/eval.c
+++ b/eval.c
@@ -730,7 +730,7 @@ evalcommand(union node *cmd, int flags, 
 	/* Fork off a child process if necessary. */
 	if (cmd->ncmd.backgnd
 	 || (cmdentry.cmdtype == CMDNORMAL
-	&& ((flags & EV_EXIT) == 0 || Tflag))
+	&& ((flags & EV_EXIT) == 0 || have_traps()))
 	 || ((flags & EV_BACKCMD) != 0
 	&& (cmdentry.cmdtype != CMDBUILTIN
 		 || cmdentry.u.index == CDCMD
diff --git a/trap.c b/trap.c
--- a/trap.c
+++ b/trap.c
@@ -222,6 +222,21 @@ clear_traps(void)
 
 
 /*
+ * Check if we have any traps enabled.
+ */
+int
+have_traps(void)
+{
+	char *volatile *tp;
+
+	for (tp = trap ; tp <= &trap[NSIG - 1] ; tp++) {
+		if (*tp && **tp)	/* trap not NULL or SIG_IGN */
+			return 1;
+	}
+	return 0;
+}
+
+/*
  * Set the signal handler for the specified signal.  The routine figures
  * out what it should be set to.
  */
diff --git a/trap.h b/trap.h
--- a/trap.h
+++ b/trap.h
@@ -39,6 +39,7 @@ extern volatile sig_atomic_t gotwinch;
 
 int trapcmd(int, char **);
 void clear_traps(void);
+int have_traps(void);
 void setsignal(int);
 void ignoresig(int);
 void onsig(int);
Avoid leaving unnecessary waiting shells in many forms of sh -c COMMAND.

This change only affects strings passed to -c, when the -s
option is not used.

The approach is to read the first line of commands, then the
second, and if it turns out there is no second line execute
the first line with EV_EXIT. Otherwise execute the first and
second line, then read and execute lines as long as they
exist.

A compound statement such as introduced by {, if, for or
while counts as a single line of commands for this.

Note that if the second line contains a syntactical error,
the first line will not be executed, different from
previously. (pdksh and zsh parse the entire -c string
before executing it.)

Example:
  sh -c 'ps lT'
No longer shows a shell process waiting for ps to finish.

diff --git a/eval.c b/eval.c
--- a/eval.c
+++ b/eval.c
@@ -175,6 +175,44 @@ evalstring(char *s)
 }
 
 
+/*
+ * Like evalstring(), but always exits. This is useful in avoiding shell
+ * processes just sitting around waiting for exit.
+ */
+
+void
+evalstring_exit(char *s)
+{
+	union node *n, *n2;
+	struct stackmark smark;
+
+	setstackmark(&smark);
+	setinputstring(s, 1);
+	do
+		n = parsecmd(0);
+	while (n == NULL);
+	if (n != NEOF) {
+		do
+			n2 = parsecmd(0);
+		while (n2 == NULL);
+		if (n2 == NEOF) {
+			evaltree(n, EV_EXIT);
+			/*NOTREACHED*/
+		}
+		evaltree(n, 0);
+		evaltree(n2, 0);
+		popstackmark(&smark);
+		while ((n = parsecmd(0)) != NEOF) {
+			if (n != NULL)
+evaltree(n, 0);
+			popstackmark(&smark);
+		}
+	}
+	popfile();
+	popstackmark(&smark);
+	exitshell(exitstatus);
+}
+
 
 /*
  * Evaluate a parse tree.  The value is left in the global variable
diff --git a/eval.h b/eval.h
--- a/eval.h
+++ b/eval.h
@@ -47,6 +47,7 @@ struct backcmd {		/* result of evalbackc
 
 int evalcmd(int, char **);
 void evalstring(char *);
+void evalstring_exit(char *);
 union node;	/* BLETCH for ansi C */
 void evaltree(union node *, int);
 void evalbackcmd(union node *, struct backcmd *);
diff --git a/main.c b/main.c
--- a/main.c
+++ b/main.c
@@ -178,7 +178,10 @@ state2:
 state3:
 	state = 4;
 	if (minusc) {
-		evalstring(minusc);
+		if (sflag)
+			evalstring(minusc);
+		else
+			evalstring_exit(minusc);
 	}
 	if (sflag || minusc == NULL) {
 state4:	/* XXX ??? - why isn't this before the "if" statement */
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"

Re: fdescfs brokenness

2009-05-08 Thread Jilles Tjoelker
On Fri, May 08, 2009 at 11:12:03PM +0300, Kostik Belousov wrote:
> On Thu, May 07, 2009 at 08:07:46PM -0700, Tim Kientzle wrote:
> > Colin Percival recently pointed out some issues
> > with tar and fdescfs.  Part of the problem
> > here is tar; I need to rethink some of the
> > traversal logic.

> > But fdescfs is really wonky:

> >  * This is a nit, but:  ls /dev/fd/18 should not
> >return EBADF; it should return ENOENT, just
> >like any other reference to a non-existent filename.
> >(Just because a filename reflects a file descriptor
> >does not mean it is a file descriptor.)
> This is a traditional behaviour for fdescfs. According to man page,
> open("dev/fd/N") shall be equivalent to fcntl(N, F_DUPFD, 0).
> Solaris behaviour is the same.

On open, yes, but stat behaves differently on a Solaris 10 machine here.
A valid but unallocated fd number will still stat as a character
device, like an allocated fd.

% ls -l /dev/fd/0 /dev/fd/999
crw-rw-rw-   1 root root 320,  0 May  9 00:06 /dev/fd/0
crw-rw-rw-   1 root root 320, 999 May  9 00:06 /dev/fd/999

By the way, both FreeBSD and Solaris also behave strangely if you try to
access fd numbers 1<<32 or higher.

Linux seems to behave strangely as well: the fds show up as symlinks,
some of which do not contain valid file names but can still be opened.
However, a command like
  { read x <&5; read y  >  * The fairly routine recursive directory walker
> >below gets hung in fdescfs.  It appears that
> >the two opendir() invocations active at the
> >same time interfere with each other.
> What you mean by "gets hung" ? In my limited testing, it works.
> Opendir creates a new directory in /dir/fd by the mere fact of opening
> the directory. So it walks into that dir, returning to step 1.

> >  * A similar chdir()-based version of the directory
> >walker below breaks badly; you can chdir() into
> >a directory under /dev/fd, but you can't chdir("..")
> >to get back out of it.  (This is the particular
> >problem that tar is running afoul of.)
> Not sure about this one. I think that fdescfs vnodes do not support
> lookup on anything not being root of the fdescfs.

> >  * Running "find /dev/fd" generates bogus ENOENT errors
> >because you can opendir() a directory inside of /dev/fd,
> >and read the entries, but you can't access those entries
> >because path searches don't work through fdescfs.
> Again, this may be a consequence of the previous issue.

> > I think the right solution here is to add a VOP_ACCESS
> > handler to fdescfs that bars all access to directory
> > nodes under /dev/fd.  Basically, if your program has a
> > directory open, that should be reflected as a directory
> > node that you can't do anything with.  The current implementation
> > allows you to chdir(), opendir(), etc, those directory
> > nodes, but the machinery to fully support those operations
> > is missing so they just screw things up.
> This would chomp the fdescfs functionality, IMHO. Why directory
> file descriptors should behave differently then any other file
> descriptor ?

Linux and Solaris do not have these problems because their /dev/fd does
not copy stat information from the underlying file, instead showing a
character device (Solaris) or a symlink (Linux). I think a character
device would fit best, because you can do little else with it than open
it. The open operation is also different from opening the underlying
file because it does not create a new open file description.

devfs's /dev/fd/0, /dev/fd/1 and /dev/fd/2 work like this as well: they
always show up as character devices no matter what the underlying file
is. When opened, they duplicate the respective fd just like the full
/dev/fd does. (These are located at the end of /sys/kern/kern_descrip.c.)

Apparently someone noticed earlier this could be a problem, because the
R and X mode bits are cleared from directories that show up in /dev/fd.
It does not come as a surprise to me that that hack does not work.

> I think that the actual solution for the walker problems is to
> ignore the synthetic filesystems altogether. The information is
> provided by sysctl vfs.conflist (note that the output is binary),
> see VFCF_* flags, esp. VFCF_SYNTHETIC. The flag is correctly
> set at least by procfs, devfs and fdescfs.

I think it should be possible to write a directory walker program using
only standard interfaces.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: pthread_mutexattr_getprioceiling error?

2009-06-13 Thread Jilles Tjoelker
On Sat, Jun 13, 2009 at 05:59:12PM +0400, Михаил Кипа wrote:
> Next little program:
> #include 
> #include  

> int main()
> {
> pthread_mutexattr_t t;
> if (pthread_mutexattr_init(&t)) return 1;
> int i;
> std::cout << pthread_mutexattr_getprioceiling(&t, &i) << std::endl;
> }

> always print 22. It means that pthread_mutexattr_getprioceiling always
> fails with EINVAL. Under Linux this example works fine, but under
> FreeBSD 7.2 it does`n work. Is it a bug in FreeBSD thread library or
> it ai my misunderstanding?

The priority ceiling is only meaningful with the PTHREAD_PRIO_PROTECT
protocol (pthread_mutexattr_setprotocol()). The FreeBSD threads library
returns EINVAL if you try to get/set the priority ceiling with another
protocol.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Regarding Signal IPC

2009-06-19 Thread Jilles Tjoelker
On Fri, Jun 19, 2009 at 09:03:28AM -0400, Daniel Eischen wrote:
> On Fri, 19 Jun 2009, vasanth raonaik wrote:
> > I want to print out the process ID of the process which is sending the
> > Signal.

> > Is it possible. if yes, can you please point me to any related documents.

> Though I have not tried this, there is an si_pid field
> (and other fields you might be interested in) in
> struct siginfo.  If you use a POSIX signal handler
> (see sigaction(2)), a pointer to a struct siginfo is
> the 2nd argument to your signal handler.

> See  for the definition of struct siginfo.

Meaningful siginfo is only given for signals sent by sigqueue(2), traps,
child processes (SIGCHLD) and sigevent structures (for example
mq_notify(2)).

If you want to handle signals synchronously, you can use
sigtimedwait(2). It is still necessary to call sigaction(2) with the
SA_SIGINFO flag, even though the signal handler will not be called
because the signal is blocked (a necessary step before using
sigwait-like calls).

This alternate mechanism may be particularly useful here because getting
the siginfo data out of the signal handler safely is a tricky business.

Although I haven't tried this, kqueue's SIGEV_SIGNAL may be helpful in
managing this from a single-threaded program. This is, however, not
portable.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: carriage return with stdout and stderr

2009-07-05 Thread Jilles Tjoelker
On Sun, Jul 05, 2009 at 01:42:01PM +0200, Alexander Best wrote:
> i'm running something similar to this pseudo-code in an app of mine:

> for (i=0 )
> fprintf(stdout,"TEXT %d\r", int);

> what's really strange is that if i print to stdout the output isn't very
> clean. the cursor jumps randomly within the output (being 1 line). if i print
> to stderr however the output looks really nice. the cursor says right at the
> front of the output all the time. just like in burncd e.g.

> what's causing this? because i'd rather print to stdout.

If you are writing to a terminal, stdout is line-buffered. This means
that output is flushed when the buffer is full or a '\n' is written. A
'\r' is not good enough. You can force a write using fflush(stdout).

stderr is always unbuffered, so everything is written immediately.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Problem in bin/sh stripping the * character through ${expansion%}

2009-09-11 Thread Jilles Tjoelker
On Fri, Aug 07, 2009 at 03:26:50AM +0400, Eygene Ryabinkin wrote:
> Thu, Aug 06, 2009 at 11:15:12AM -0700, Doug Barton wrote:
> > I came across this problem during a recent portmaster update. When
> > trying to strip off the * character using variable expansion in bin/sh
> > it doesn't work. Other "special" characters do work if they are
> > properly escaped.

> > The attached mini-script clearly shows the problem:

> > $ sh sh-strip-problem
> > var before stripping: foo\*
> > var after stripping: foo\*

> > var before stripping: foo\$
> > var after stripping: foo\

> According to the sh(1), it is not a problem.  Namely,
>  - \* being unquoted at all will produce a lone '*';
>  - '*' when treated as the smallest pattern, will result in a stripping
>of a zero-length string -- it is the smallest pattern in the case of
>'*' that matches anything.

That is indeed an explanation why it works that way, but I think it is
wrong. Generally, the shell command language avoids unnecessary levels
of quoting. In the POSIX spec, "Shell Command Language", note the part
about "${x#*}" (pattern) and ${x#"*"} (literal asterisk). Also compare
with  case $something in \*) echo asterisk;; esac  which matches a
literal asterisk.

Two PRs already exist for aspects of stripping: bin/57554 (double
quotes) and bin/117748 (trying to match pattern matching characters
literally).

> In order to strip the trailing star you should use
> -
> var=${var%[*]}
> -
> This gives you the pattern of '[*]' that is properly treated as the
> single star -- it's a weird way to escape the star in the patterns.

This is indeed a good workaround.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: sigwait - differences between Linux & FreeBSD

2009-10-09 Thread Jilles Tjoelker
On Thu, Oct 08, 2009 at 01:02:09PM +0300, Kostik Belousov wrote:
> On Thu, Oct 08, 2009 at 11:53:21AM +1100, Stephen Hocking wrote:
> > In my efforts to make the xrdp port more robust under FreeBSD, I have
> > discovered that sigwait (kind of an analogue to select(2), but for
> > signals rather than I/O) re-enables ignored signals in its list under
> > Linux, but not FreeBSD. The sesman daemon uses SIGCHLD to clean up
> > after a session has exited. Under Linux this works OK, under FreeSBD
> > it doesn't. I have worked around it in a very hackish manner (define a
> > dummy signal handler and enable it using signal, which means that the
> > sigwait call can then be unblocked by it), but am wondering if anyone
> > else has run across the same problem, and if so, if they fixed it in
> > an elegant manner. Also, does anyone know the correct semantics of
> > sigwait under this situation?

> ports@ is the wrong list to discuss the issue in the base system.

> Solaris 10 sigwait(2) manpage says the following:
> If sigwait() is called on an ignored signal, then the occurrence of the
> signal will be ignored, unless sigaction() changes the disposition.

> We have the same behaviour as Solaris, ingored signals are not queued or
> recorded regardeless of the presence of sigwaiting thread.

POSIX permits both behaviours here: a blocked and ignored signal may or
may not be discarded immediately on generation. Making this depend on
whether there is a sigwaiting thread seems broken, and I don't think
Linux does that.

I think your "very hackish" approach is correct: set up a dummy signal
handler after blocking the signal. 

Additionally, POSIX requires applications to set the SA_SIGINFO flag if
they want queuing. This applies even if the signals are blocked and
received using sigwaitinfo(2) or sigtimedwait(2). The SA_SIGINFO flag
can only be set by setting a handler using sigaction(2). (Note, this
does not mean that all signals are queued if SA_SIGINFO is set. It means
that signals may not be queued if SA_SIGINFO is not set.)

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: semaphores between processes

2009-10-22 Thread Jilles Tjoelker
On Thu, Oct 22, 2009 at 04:08:11PM -0400, Andrew Gallatin wrote:
> We then moved on to posix semaphores.  Using sem_wait/sem_post
> with the sem_t residing in a shared page seems to work on
> all 3 platforms.  However, the FreeBSD (7-stable) man page
> for sem_init(3) has this scary text regarding the pshared
> value:

>   The sem_init() function initializes the unnamed semaphore pointed 
> to by
>   sem to have the value value.  A non-zero value for pshared specifies a
>   shared semaphore that can be used by multiple processes, which this
>   implementation is not capable of.

> Is this text obsolete?  Or is my test just "getting lucky"?

They work, but only if the processes are related and do not exec and the
parent process initializes the semaphores before forking. This is
because sem_t is a pointer to a malloc'ed structure. For process-shared
semaphores this really only contains an identifier of the kernel
semaphore. This also means process-shared semaphores are slower than
in-process semaphores (libthr implements those using atomics and umtx so
that system calls are only needed if a thread needs to sleep or be
awakened).

This is documented in comments in the source code, but not in man pages
or other documentation.

> Is there recommended way to do this?

Apart from sysv semaphores, perhaps posix named semaphores (sem_open()
etc). These require a 'kldload sem' on older versions though.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: help needed to fix contrib/ee crash/exit when receiving SIGWINCH

2009-10-24 Thread Jilles Tjoelker
On Fri, Oct 23, 2009 at 07:56:35PM +0400, Eygene Ryabinkin wrote:
> Hmm, we can transform this code to the following one:
> -
> errno = 0;
> do {
>   in = wgetch(text_win);
> } while (errno == EINTR);
> if (in == -1)
>   exit(0);
> -
> This won't help with FreeBSD's ncurses, but may be other variants
> will feel much better with such a event loop variant.

That should be:
-
do
in = wgetch(text_win);
while (in == -1 && errno == EINTR);
if (in == -1)
exit(0);
-

errno should only be checked after failed function calls or for
functions where it is documented that errno should be used to check for
error conditions (example: readdir).

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: [patch] add pwait utility

2009-11-07 Thread Jilles Tjoelker
On Sat, Nov 07, 2009 at 03:01:36PM +0200, Kostik Belousov wrote:
> On Sat, Nov 07, 2009 at 11:28:32AM +0100, Gary Jennejohn wrote:
> > On Fri, 6 Nov 2009 23:24:46 +0100
> > Jilles Tjoelker  wrote:

> > > I propose adding a small new utility to /usr/bin: pwait. Similar to the
> > > Solaris utility of the same name, it waits for any process to terminate.

> > Why not /bin so it can be used before /usr is mounted?
> And it seems to make sense to add this functionality to pkill/pgrep
> binary, creating another hardlink to it.

Hmm, pwait's syntax is incompatible: it takes pids (pkill says: use
kill) and the -v option does something totally different.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: CFR: Exceedingly minor fixes to libc

2009-11-13 Thread Jilles Tjoelker
On Fri, Nov 13, 2009 at 12:18:49AM -0500, Garrett Wollman wrote:
> If you have a moment, please take a look at the following patch.  It
> contains some very minor fixes to various parts of libc which were
> found by the clang static analyzer.  They fall into a few categories:

> 1) Bug fixes in very rare situations (mostly error-handling code that
> has probably never been executed).

> 2) Dead store elimination.

> 3) Elimination of unused variables.  (Or in a few cases, making use of
> them.)

> Some minor style problems were also fixed in the vicinity.  There
> should be no functional changes except in very unusual conditions.

This looks mostly ok, with a few exceptions.

> [...]
> Index: gen/getpwent.c
> ===
> --- gen/getpwent.c(revision 199242)
> +++ gen/getpwent.c(working copy)
> @@ -257,22 +257,19 @@
> [...]
> @@ -1876,7 +1871,6 @@
>   syslog(LOG_ERR,
>"getpwent memory allocation failure");
>   *errnop = ENOMEM;
> - rv = NS_UNAVAIL;
>   break;
>   }
>   st->compat = COMPAT_MODE_NAME;

I think this change hides the wrongness in the handling of malloc errors
while leaving it as broken as it is.

> [...]
> Index: net/getservent.c
> ===
> --- net/getservent.c  (revision 199242)
> +++ net/getservent.c  (working copy)
> @@ -476,11 +476,11 @@
>   struct nis_state *st;
>   int rv;
>  
> - enum nss_lookup_type how;
>   char *name;
>   char *proto;
>   int port;
>  
> + enum nss_lookup_type how;
>   struct servent *serv;
>   char *buffer;
>   size_t bufsize;
> @@ -491,7 +491,8 @@
>  
>   name = NULL;
>   proto = NULL;
> - how = (enum nss_lookup_type)mdata;
> +
> + how = (intptr_t)mdata;
>   switch (how) {
>   case nss_lt_name:
>   name = va_arg(ap, char *);

In what way is this an improvement?

> Index: posix1e/acl_delete_entry.c
> ===
> --- posix1e/acl_delete_entry.c(revision 199242)
> +++ posix1e/acl_delete_entry.c(working copy)
> @@ -89,20 +89,20 @@
>   return (-1);
>   }
>  
> - if ((acl->ats_acl.acl_cnt < 1) ||
> - (acl->ats_acl.acl_cnt > ACL_MAX_ENTRIES)) {
> + if ((acl_int->acl_cnt < 1) ||
> + (acl_int->acl_cnt > ACL_MAX_ENTRIES)) {
>   errno = EINVAL;
>   return (-1);
>   }

If you are changing this code anyway, consider fixing a style bug
(extraneous parentheses) here.

> - for (i = 0; i < acl->ats_acl.acl_cnt;) {
> - if (_entry_matches(&(acl->ats_acl.acl_entry[i]), entry_d)) {
> + for (i = 0; i < acl_int->acl_cnt;) {
> + if (_entry_matches(&(acl_int->acl_entry[i]), entry_d)) {
>   /* ...shift the remaining entries... */
> - for (j = i; j < acl->ats_acl.acl_cnt - 1; ++j)
> - acl->ats_acl.acl_entry[j] =
> - acl->ats_acl.acl_entry[j+1];
> + for (j = i; j < acl_int->acl_cnt - 1; ++j)
> + acl_int->acl_entry[j] =
> + acl_int->acl_entry[j+1];
>   /* ...drop the count and zero the unused entry... */
> -     acl->ats_acl.acl_cnt--;
> - bzero(&acl->ats_acl.acl_entry[j],
> + acl_int->acl_cnt--;
> + bzero(&acl_int->acl_entry[j],
>   sizeof(struct acl_entry));
>   acl->ats_cur_entry = 0;
>   

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: Workaround for ntop as daemon, is it ok?

2009-11-27 Thread Jilles Tjoelker
On Fri, Nov 27, 2009 at 09:19:11AM +0100, Henner Morten Kruse wrote:
> I have just set up an ntop server based on 8.0-RELEASE.

>  FreeBSD ntop 8.0-RELEASE FreeBSD 8.0-RELEASE #0: Sat Nov 21 15:48:17 UTC 2009
>  r...@almeida.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  i386

> After installing ntop 1.3.10 and all dependencies from the ports ntop
> did work, but when running ntop as a daemon I got permanent and repeating
> warning messages:

> [warn] kevent: Bad file descriptor

> [...]

> I found out that ntop forks another thread for the daemon and kills the
> initial one. The problem with this behaviour is that the kqueue is
> started by the initial thread and the daemon thread doesn't use the
> same file descriptors. So the kqueue is lost.

That's a process, not a thread.

> [...]

> When I change the fork() in line 186 to rfork(RFPROC) everything works
> and I get no more warning messages and procstat reports an existing
> kqueue for the daemonized ntop:

> So my question to this is:
> Is my workaround ok or could this cause any problems? And what is the cause
> of these warnings? Is it a bug or incapability in the kqueue implementation
> or is it caused by bad code in ntop?

Because it refers to other file descriptors, a kqueue is only really
meaningful in the file descriptor table it was created in (this could be
avoided for events that do not refer to file descriptors, and I think
Solaris's "ports" system does that).

As described in the kqueue(2) man page, calling rfork(2) without RFFDG
will allow sharing the kqueue between the two processes.

As long as the parent process exits quickly, or no "tricks" with file
descriptor numbers are done, this is pretty safe.

Another fix is to create the kqueue in the child process.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


why does _PATH_STDPATH contain the current directory?

2009-12-23 Thread Jilles Tjoelker
/usr/include/paths.h has:
/* All standard utilities path. */
#define _PATH_STDPATH   "/usr/bin:/bin:/usr/sbin:/sbin:"

The current directory appears to have been added accidentally years ago.
Can I go ahead and take it out (the colon)?

The paths for rescue already do not have the current directory.

The main use for _PATH_STDPATH is to find all standard utilities, such
as for the POSIX-specified confstr(_CS_PATH), 'getconf PATH' and
'command -p'. Some utilities also use it instead of _PATH_DEFPATH for
root, but these values tend to be overridden by /etc/login.conf and/or
shell startup files.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: linprocfs Input/output error

2010-01-01 Thread Jilles Tjoelker
On Fri, Jan 01, 2010 at 06:45:33PM +0100, Fernando Apesteguía wrote:
> Hi all, I post here cause I didn't get any answers in freebsd-emulation.

> In 8.0-RELEASE-p1 if I try to execute this:

> cat /compat/linux/proc/cpuinfo > ~/cpuinfo.txt

> I get

> cat: /compat/linux/proc/cpuinfo: Input/output error

> truss shows the read system call returns ERR#5. It is the same with
> other files from linprocfs.

> cat /compat/linux/proc/[any_file] works fine

> What am I missing?

You are not missing anything, this is a bug. A while ago, cat(1) was
changed to use larger buffers when writing to regular files on machines
with sufficient RAM, usually a multiple of MAXPHYS. On the other hand,
pseudofs (the general framework for filesystems such as procfs,
linprocfs and linsysfs) in src/sys/fs/pseudofs/pseudofs_vnops.c
pfs_read() fails any read over MAXPHYS + 1 with EIO. This limit probably
has to do with the allocation of a buffer of that size using sbuf_new(9)
(and so, malloc(9)).

Some sort of limit seems appropriate, but MAXPHYS seems unrelated, and
if the request is too long it should perhaps just truncate it.

-- 
Jilles Tjoelker
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


  1   2   >