Dying jail

2016-10-26 Thread Eugene Grosbein
Hi!

I recently upgraded one of my servers, which was running 9.3-STABLE and hosts
a jail containing a 4.11-STABLE system.
The host was source-upgraded first to 10.3-STABLE and then to 11.0-STABLE,
and the jail configuration was migrated to /etc/jail.conf. The jail itself was
left intact.

"service jail start" started the jail successfully,
but "service jail restart" fails because the jail stays stuck in the "dying"
state for a long time:
"jls" shows no running jails and "jls -d" shows the dying jail.

How can I find out why it is stuck, and how can I forcibly kill it without
rebooting the host?

Eugene Grosbein
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: Dying jail

2016-10-26 Thread Matthew Seaman
On 10/26/16 09:09, Eugene Grosbein wrote:
> I recently upgraded one of my servers, which was running 9.3-STABLE and hosts
> a jail containing a 4.11-STABLE system.
> The host was source-upgraded first to 10.3-STABLE and then to 11.0-STABLE,
> and the jail configuration was migrated to /etc/jail.conf. The jail itself was
> left intact.
> 
> "service jail start" started the jail successfully,
> but "service jail restart" fails because the jail stays stuck in the "dying"
> state for a long time:
> "jls" shows no running jails and "jls -d" shows the dying jail.
> 
> How can I find out why it is stuck, and how can I forcibly kill it without
> rebooting the host?

I've seen this fairly frequently.  I think it may have something to do
with old network connections waiting to be cleaned up -- if you run
sockstat it's all the stuff that gets listed at the end with lots of
question marks.  BICBW.
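
A minimal sketch of that check, assuming the stock jls(8) and sockstat(1)
tools on the host. The helper names are made up for illustration, and whether
leftover sockets are really what keeps the jail in the dying state is only the
guess stated above:

```sh
#!/bin/sh
# Hypothetical helpers; only jls(8) and sockstat(1) themselves are real tools.

# List every jail, including dying ones, with its JID, name and dying flag.
list_jails_with_state() {
    jls -d jid name dying
}

# Show sockets that no longer have an owning process: sockstat prints "?"
# in the USER, COMMAND and PID columns for those entries.
orphaned_sockets() {
    sockstat | awk '$1 == "?" && $2 == "?" && $3 == "?"'
}
```

Run list_jails_with_state and orphaned_sockets as root on the host and compare
the addresses in the socket list with the dying jail's IP.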

One tip I've found is *not* to specify the JID number in jail.conf, and
just let the system allocate a new one as it feels necessary.  If you've
scripting that uses the JID to operate on a specific jail, it's easy to
substitute the jail name instead.
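
As a rough illustration of that substitution, a script can look up the current
JID from the name whenever it needs one. The helper name and the jail name
"www" are hypothetical; the sketch assumes jls(8) prints only the requested
parameter when one is named on the command line:

```sh
#!/bin/sh
# Hypothetical helper: print the current JID of the jail named in $1.
# "jls -j NAME jid" restricts the listing to that jail and to the jid parameter.
jid_of() {
    jls -j "$1" jid
}
```

A script would then call, for example, jid_of www instead of hard-coding a
number. Tools such as jexec(8) also accept the jail name directly.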

Cheers,

Matthew


Re: Dying jail

2016-10-26 Thread Eugene Grosbein
On 26.10.2016 15:45, Matthew Seaman wrote:
> On 10/26/16 09:09, Eugene Grosbein wrote:
>> I recently upgraded one of my servers, which was running 9.3-STABLE and hosts
>> a jail containing a 4.11-STABLE system.
>> The host was source-upgraded first to 10.3-STABLE and then to 11.0-STABLE,
>> and the jail configuration was migrated to /etc/jail.conf. The jail itself was
>> left intact.
>>
>> "service jail start" started the jail successfully,
>> but "service jail restart" fails because the jail stays stuck in the "dying"
>> state for a long time:
>> "jls" shows no running jails and "jls -d" shows the dying jail.
>>
>> How can I find out why it is stuck, and how can I forcibly kill it without
>> rebooting the host?
> 
> I've seen this fairly frequently.  I think it may have something to do
> with old network connections waiting to be cleaned up -- if you run
> sockstat it's all the stuff that gets listed at the end with lots of
> question marks.  BICBW.

My jail has a public IPv4 address distinct from the host's, and sockstat shows
no lines for the jail's IP.

> One tip I've found is *not* to specify the JID number in jail.conf, and
> just let the system allocate a new one as it feels necessary.  If you've
> scripting that uses the JID to operate on a specific jail, it's easy to
> substitute the jail name instead.

I do not specify a JID number in jail.conf.
OTOH, the jail's configuration section in jail.conf has a numeric name,
and that same number is automatically assigned as its JID for some unknown reason.




Re: Dying jail

2016-10-26 Thread Eugene Grosbein
On 26.10.2016 15:45, Matthew Seaman wrote:

> One tip I've found is *not* to specify the JID number in jail.conf, and
> just let the system allocate a new one as it feels necessary.  If you've
> scripting that uses the JID to operate on a specific jail, it's easy to
> substitute the jail name instead.

Thank you for the tip. I've renamed the section in /etc/jail.conf
so that its name starts with a non-numeric character, and "service jail start"
successfully started my jail, assigning it JID 1 this time. Now "jls -d" shows
two jails with the same IP address and path but distinct JIDs.



Re: Dying jail

2016-10-26 Thread Eugene Grosbein

On 26.10.2016 20:40, krad wrote:

> On a side note, there is no such thing as 9.3-STABLE, but there are 9-STABLE and
> 9.3-RELENG. The difference is that stable is a constantly moving thing, whereas
> releng is just security errata and bugfixes.


9-STABLE currently calls itself 9.3-STABLE:

# uname -r
9.3-STABLE

See also 
https://svnweb.freebsd.org/base/stable/9/sys/conf/newvers.sh?revision=268592&view=markup#l34
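
The linked newvers.sh is where that string comes from. A simplified sketch,
assuming the script assembles the release name from a revision variable and a
branch variable (the real script does considerably more than this):

```sh
#!/bin/sh
# Simplified illustration, not the actual newvers.sh: the release string
# reported by "uname -r" is built from a revision and a branch name.
REVISION="9.3"
BRANCH="STABLE"
RELEASE="${REVISION}-${BRANCH}"
echo "$RELEASE"
```

With those two values the result is the same "9.3-STABLE" that uname -r prints
above.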

Re: VT not showing cyrillic characters in text mode

2016-10-26 Thread Antony Uspensky

On Wed, 19 Oct 2016, a...@a-real.ru wrote:

> Hello everyone.
>
> I cannot figure out whether the vt console driver supports non-ANSI characters
> when started in text mode. The font in sys/dev/vt/vt_font_default.c seems to
> include Cyrillic glyphs, but either it is not being used in text mode or
> something else is broken, since only '?' is displayed instead of the proper
> symbols. LANG/LC_ALL doesn't affect anything. The problem occurs on
> 11.0-RELEASE and 10.3-RELEASE.
> I need text mode because vt performance in graphical mode is very
> poor on Hyper-V, and text mode looks like the only option.


vt cannot load fonts in text mode; see "man vt".
If you need loadable fonts in a text-mode console, use sc.
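
A small sketch for checking which driver is active, assuming the kern.vty
sysctl and loader tunable described in vt(4); the helper name is made up for
illustration:

```sh
#!/bin/sh
# Hypothetical helper: report which console driver the running kernel uses
# ("vt" or "sc"), as read from the kern.vty sysctl.
console_driver() {
    sysctl -n kern.vty
}

# To select sc at the next boot, the loader tunable would go in
# /boot/loader.conf as:
#   kern.vty=sc
```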


Re: stable/11 -r307797 on BPi-M3 (cortex-a7): xgcc's cc1 during lang/gcc6 build gets SIGSYS failures (/usr/ports -r424540)

2016-10-26 Thread Mark Millard
[A top post noting that user "ast" CSW's are involved and other details in the 
sequence leading up to the failure.]

Using "ktrace -i -t +fw" it looks like every repeat of the problem ends up with 
the following sort of sequence (a variation is shown later):

 34629 cc1  CALL  mmap(0,0x4000,0x3,0x1002,0x,0x1c,0,0)
 34629 cc1  RET   mmap 568225792/0x21de7000
 34629 cc1  PFLT  0x21de7000 VM_PROT_WRITE
 34629 cc1  PRET  KERN_SUCCESS
 34629 cc1  PFLT  0x21de8000 VM_PROT_WRITE
 34629 cc1  PRET  KERN_SUCCESS
 34629 cc1  PFLT  0x21de9000 VM_PROT_WRITE
 34629 cc1  PRET  KERN_SUCCESS
 34629 cc1  PFLT  0x21dea000 VM_PROT_WRITE
 34629 cc1  PRET  KERN_SUCCESS
 34629 cc1  PFLT  0x229e8000 VM_PROT_WRITE
 34629 cc1  PRET  KERN_SUCCESS
 34629 cc1  PFLT  0x229e9000 VM_PROT_WRITE
 34629 cc1  PRET  KERN_SUCCESS
 34629 cc1  PFLT  0x229ea000 VM_PROT_WRITE
 34629 cc1  PRET  KERN_SUCCESS
 34629 cc1  CSW   stop user "ast"
 34629 cc1  CSW   resume user "ast"
 34629 cc1  PFLT  0x229eb000 VM_PROT_WRITE
 34629 cc1  PRET  KERN_SUCCESS
 34629 cc1  PFLT  0x229ec000 VM_PROT_WRITE
 34629 cc1  PRET  KERN_SUCCESS
 34629 cc1  CALL  [-17504]
 34629 cc1  RET   [-17504] -1 errno 78 Function not implemented
 34629 cc1  PSIG  SIGSYS SIG_DFL code=SI_KERNEL
 34629 cc1  NAMI  "cc1.core"
 34630 as   CSW   stop kernel "piperd"
 34630 as   Events dropped.
 34630 as   RET   read 0
 34630 as   CALL  close(0)
 34630 as   RET   close 0
. . .

I'll note that for the source this was compiling I used gdb truss with run -feH 
-o truss.log and it reported:

(gdb) print t->cs.number
$5 = 580828064

FYI: 580828064 = 0x229EBBA0

where the truss segmentation fault was at line 385 of the following (sc==NULL 
in the context):

> 380   t->cs.name = sysdecode_syscallname(t->proc->abi->abi, t->cs.number);
> 381   if (t->cs.name == NULL)
> (gdb) 
> 382   fprintf(info->outfile, "-- UNKNOWN %s SYSCALL %d --\n",
> 383   t->proc->abi->type, t->cs.number);
> 384   
> 385   sc = get_syscall(t->cs.name, narg);
> 386   t->cs.nargs = sc->nargs;
> 387   assert(sc->nargs <= nitems(t->cs.s_args));
> 388   
> 389   t->cs.sc = sc;

The 229E matched the upper part of local PFLT activity around the user "ast" 
CSW's, including just before the bad call.

But the details do vary some based on the source file being compiled. For 
example here the user "ast" CSW's are just before the mmap but are still just 
after the 0x229ea000 PFLT:

 34698 cc1  PRET  KERN_SUCCESS
 34698 cc1  PFLT  0xbfbf2000 VM_PROT_WRITE
 34698 cc1  PRET  KERN_SUCCESS
 34698 cc1  PFLT  0x229e7000 VM_PROT_WRITE
 34698 cc1  PRET  KERN_SUCCESS
 34698 cc1  PFLT  0x229e8000 VM_PROT_WRITE
 34698 cc1  PRET  KERN_SUCCESS
 34698 cc1  PFLT  0x229e9000 VM_PROT_WRITE
 34698 cc1  PRET  KERN_SUCCESS
 34698 cc1  PFLT  0x229ea000 VM_PROT_WRITE
 34698 cc1  PRET  KERN_SUCCESS
 34698 cc1  CSW   stop user "ast"
 34698 cc1  CSW   resume user "ast"
 34698 cc1  CALL  mmap(0,0x4000,0x3,0x1002,0x,0,0,0)
 34698 cc1  RET   mmap 568225792/0x21de7000
 34698 cc1  PFLT  0x21de7000 VM_PROT_WRITE
 34698 cc1  PRET  KERN_SUCCESS
 34698 cc1  PFLT  0x21de8000 VM_PROT_WRITE
 34698 cc1  PRET  KERN_SUCCESS
 34698 cc1  PFLT  0x21de9000 VM_PROT_WRITE
 34698 cc1  PRET  KERN_SUCCESS
 34698 cc1  PFLT  0x21dea000 VM_PROT_WRITE
 34698 cc1  PRET  KERN_SUCCESS
 34698 cc1  PFLT  0x229eb000 VM_PROT_WRITE
 34698 cc1  PRET  KERN_SUCCESS
 34698 cc1  CALL  [-25840]
 34698 cc1  RET   [-25840] -1 errno 78 Function not implemented
 34698 cc1  PSIG  SIGSYS SIG_DFL code=SI_KERNEL
 34698 cc1  NAMI  "cc1.core"
 34699 as   CSW   stop kernel "piperd"
 34699 as   Events dropped.
 34699 as   RET   read 0
 34699 as   CALL  close(0)
 34699 as   RET   close 0

-25840 in 2's complement is: 0xF...F9B10

Here doing the gdb truss instead reports:

(gdb) print t->cs.number
$1 = 580819728

and 580819728 = 0x229E9B10

and the 229E part matches several PFLT's in the area, including just before the 
bad call as well as just before the user "ast"s. Between them are some PFLT's 
that do not match.

I would guess that the 229E in t->cs.number in truss is from the PFLT just 
before the failing syscall in each case.
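
The arithmetic behind the two comparisons can be rechecked with a short sketch.
The helper names are made up for illustration. The sketch only shows that, for
these two data points, the number ktrace printed equals the low 16 bits of the
value truss saw, read as a signed 16-bit integer; it does not establish why:

```sh
#!/bin/sh
# Print a decimal value in hexadecimal.
to_hex() {
    printf '0x%X\n' "$1"
}

# Interpret the low 16 bits of a value as a signed 16-bit integer.
low16_signed() {
    low=$(( $1 & 0xFFFF ))
    if [ "$low" -ge 32768 ]; then
        low=$(( low - 65536 ))
    fi
    echo "$low"
}

to_hex 580828064        # 0x229EBBA0
low16_signed 580828064  # -17504, the first ktrace CALL number
to_hex 580819728        # 0x229E9B10
low16_signed 580819728  # -25840, the second ktrace CALL number
```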

===
Mark Millard
markmi at dsl-only.net

On 2016-Oct-25, at 2:32 PM, Mark Millard wrote:

> [I'll be submitting some of the below information to bugzilla.]
> 
> While trying to build lang/gcc6 on a BPI-M3 (Cortex-A7, ALLWINNER) I got 
> "xgcc: internal compiler error: Bad system call (program cc1)", which means a 
> SIGSYS (signal 12) resulted.
> 
> [I will note that I've never seen this issue (so far) on the rpi2: This may be 
> KERNCONF=ALLWINNER specific. But I've not yet updated to -r307797 on the 
> rpi2. The BPI-M3 context 

Jenkins build became unstable: FreeBSD_stable_10 #444

2016-10-26 Thread jenkins-admin
https://jenkins.FreeBSD.org/job/FreeBSD_stable_10/444/