FreeBSD 9 crash/deadlock when dump(8)ing file system with journaling enabled.

2012-01-30 Thread Adam Strohl
A few days ago I discovered that running dump(8) on a file system which 
has journaled soft updates enabled causes the machine to become 
unresponsive.  This is using FreeBSD 9.0-RELEASE.  I can still interact with 
the system a little (typed characters echo back) but never get a response to 
any action.  It seems like there is an I/O issue (deadlock or overload), 
and CPU usage spikes as well.  Resetting the machine seems to be the 
only solution once this occurs (CTRL-C, CTRL-ALT-DEL, logging in again, 
etc. all fail).


Disclaimer: I have only tested this as a VM under ESXi 5.0 as I don't 
have access to any physical servers with 9.0 on them that I can test 
with (I will soon though).


As a side note this is the same issue reported here:
http://forums.freebsd.org/showthread.php?t=25787

For now I've turned off journaling (soft updates seem fine) and that 
works around the issue.


Let me know if I can provide more details etc!

--

Adam Strohl
A-Team Systems

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: FreeBSD 9 crash/deadlock when dump(8)ing file system with journaling enabled.

2012-02-03 Thread Adam Strohl

On 2/3/2012 15:45, Marcel Bonnet wrote:

Hi, what would be the best choice?

1. turn off SUJ and dump using the -L flag and then turn it on again (but I'm
afraid to destroy the system this way - is this possible?)


Yes, this is what I have done for my 9.0 servers.  Unfortunately you 
need to be in single user mode if you need to change /'s setting.   This 
is because you cannot change this option while a file system is mounted 
read+write.  Rebooting into single user mode is the easiest way to do 
this, as it leaves you with / mounted read only and everything else 
dismounted for you.


Then I ran:

tunefs -j disable /

and did the same for each volume that had journaling enabled.

As a side note, remove the .sujournal file in the root of each volume 
afterwards as it just takes up space.
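
The procedure above could be scripted roughly as follows. This is only a sketch: the run helper, the DRY_RUN variable and the function names are made up for illustration, and it assumes it is started from single-user mode as described. By default it only prints what it would do.

```sh
#!/bin/sh
# Sketch only: disable soft-updates journaling (SUJ) on the given file systems.
# Assumes single-user mode (/ read-only, everything else unmounted).
# DRY_RUN, run, disable_suj and remove_journal_file are illustrative names.

DRY_RUN="${DRY_RUN:-1}"    # 1 = print commands only; 0 = really run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

disable_suj() {
    # $1 = mount point or device, e.g. / or /var
    run tunefs -j disable "$1"
}

remove_journal_file() {
    # Only works once the file system is mounted read-write again.
    # ${1%/} strips a trailing slash so "/" becomes "/.sujournal".
    run rm "${1%/}/.sujournal"
}

# Disable journaling on every file system named on the command line.
for fs in "$@"; do
    disable_suj "$fs"
done
```

Called as "sh script.sh / /var /usr" it prints one tunefs line per volume; setting DRY_RUN=0 makes it actually run them.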


After doing this I have been using dump(8) nightly via cron(8) for about 
a week now under 9.0 without issue.  Running it on 11 servers currently.



2. dump the unmounted partitions with a live CD, for example

Sorry for top posting, mobile mail client problem.

On 30/01/2012 12:56, "Ivan Voras" wrote:

On 30/01/2012 13:06, Jeremy Chadwick wrote:


For now I've turned off journaling (soft updates see...

It's a known bug: SU+J currently deadlocks when used with UFS snapshots.






Re: FreeBSD 9 crash/deadlock when dump(8)ing file system with journaling enabled.

2012-02-03 Thread Adam Strohl

On 2/3/2012 17:15, Adam Strohl wrote:
After doing this I have been using dump(8) nightly via cron(8) for 
about a week now under 9.0 without issue.  Running it on 11 servers 
currently.




P.S.
To be clear, I am using dump with -L (snapshots) without issue.
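
For anyone following along, a snapshot dump of the kind discussed here might be invoked as sketched below. The destination directory and file naming are assumptions for illustration, not the actual cron job mentioned above; -L is the snapshot flag this thread is about.

```sh
#!/bin/sh
# Sketch only: build a level-0 snapshot dump command for one file system.
# The output directory and naming scheme are assumptions for illustration.

dump_command() {
    # $1 = file system to dump, $2 = directory for the dump file
    fs="$1"
    dest="$2"
    # Turn "/" into "root" and "/usr/home" into "usr_home" for the file name.
    name=$(echo "$fs" | sed -e 's|^/*||' -e 's|/|_|g')
    [ -n "$name" ] || name="root"
    # -0 full dump, -L snapshot of the live file system, -a auto-size,
    # -u record the dump in /etc/dumpdates, -f output file
    echo "dump -0 -L -a -u -f $dest/$name.dump $fs"
}

# Print the commands rather than running them.
dump_command / /backup
dump_command /usr/home /backup
```

The function only prints the command line, so it is safe to try anywhere; piping its output to sh from cron(8) would be one way to use it.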



FreeBSD 9: Group quotas increase but don't decrease automatically

2012-02-03 Thread Adam Strohl
I'm running FreeBSD 9 on a number of systems and finally decided to take 
advantage of the quota system to enforce limits on my users.


No real issues setting it all up aside from finding that 
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/quotas.html 
needs to be updated.   The new /etc/rc.conf entry is quota_enable="YES" 
not enable_quotas="YES" as it says (assuming it used to be this in 
8.x?).  I'll file a PR for this shortly.


I did however run into a more serious issue (I think):

A group or user's allocation as reported by repquota(8) will increase 
with new/growing files; however, when a file is deleted or chgrp'ed out of 
the quota's group, the amount of space reported by repquota(8) does not 
decrease.  I have verified that the system does not register the freed 
space by going over the soft limit, being denied writes, then deleting 
files.  Even if I delete enough files to drop below the soft quota limit, 
I am not able to add files as I am still "over quota".  So it does 
not appear to be a reporting issue; the system really doesn't realize the 
usage has gone down.


Interestingly the inode counts do decrease automatically/"instantly" as 
I would expect.


Running quotacheck(8) fixes the issue and updates the allocation counts, 
but does not magically fix auto-updating, so needs to be done 
periodically which can be a bit intensive depending on file count.


I see this on all FreeBSD 9 machines with quotas turned on.

For now I have a cron script which tries to guess (based on changing 
inode counts, etc) if it should run quotacheck, and does so if needed 
(to avoid just blindly running it periodically).
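
A rough idea of how such a check could work is sketched below. This is not the cron script mentioned above, just a minimal illustration: snapshot_changed is a made-up helper that compares the current report with the one saved on the previous run, and which report to feed it (the whole repquota output, or only the inode columns) is left open.

```sh
#!/bin/sh
# Sketch only: decide whether anything changed since the last run.
# snapshot_changed and the state file location are illustrative assumptions.

snapshot_changed() {
    # Reads a report on stdin.  Returns 0 if it differs from the copy saved
    # in $1 (or if no copy exists yet) and saves the new report there;
    # returns 1 if nothing changed.
    state_file="$1"
    new_file="$state_file.new"
    cat > "$new_file"
    if [ -f "$state_file" ] && cmp -s "$state_file" "$new_file"; then
        rm -f "$new_file"
        return 1
    fi
    mv "$new_file" "$state_file"
    return 0
}

# Possible use from cron (file system and state path are examples):
#   repquota -g /home | snapshot_changed /var/db/quota.snapshot \
#       && quotacheck -g /home
```

Since inode counts do update on delete, the saved report changes whenever files come or go, which is what triggers the quotacheck(8) run.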


Anyone else run into this?  Am I missing something?  Known issue?  Let 
me know if anyone wants more info, etc.   I can also paste the work 
around "smart" cron script if anyone is interested (and I'm not missing 
something silly :P).



--

Adam Strohl
A-Team Systems
http://ateamsystems.com/



Re: FreeBSD 9: Group quotas increase but don't decrease automatically

2012-02-06 Thread Adam Strohl

On 2/3/2012 21:48, Konstantin Belousov wrote:

This is a bug in +J code (even if you do not use +J). Do you have
softupdates enabled on the volume ? If yes, try the following patch.

diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
index 5b4b6b9..ed2db79 100644
--- a/sys/ufs/ffs/ffs_softdep.c
+++ b/sys/ufs/ffs/ffs_softdep.c
@@ -43,6 +43,7 @@
 __FBSDID("$FreeBSD$");
 
 #include "opt_ffs.h"
+#include "opt_quota.h"
 #include "opt_ddb.h"
 
 /*
@@ -6428,7 +6429,7 @@ softdep_setup_freeblocks(ip, length, flags)
 	}
 #ifdef QUOTA
 	/* Reference the quotas in case the block count is wrong in the end. */
-	quotaref(vp, freeblks->fb_quota);
+	quotaref(ITOV(ip), freeblks->fb_quota);
 	(void) chkdq(ip, -datablocks, NOCRED, 0);
 #endif
 	freeblks->fb_chkcnt = -datablocks;


Bingo, this fixes the issue for me!  Testing it out on one machine now 
and will push it out to the others gradually ... thanks!



Re: FreeBSD9 and the sheer number of problem reports

2012-02-23 Thread Adam Strohl


On 2/24/2012 1:39, Kurt Buff wrote:

On Thu, Feb 23, 2012 at 10:25, Damien Fleuriot  wrote:


Now, I find the number of problem reports regarding 9.0-RELEASE alarming
and I'm growing more and more fearful towards it.

In the current state of things, I have *absolutely* no wish to run it in
production :(

I'd love to hear feedback.

Feedback: If you're worried, wait until you aren't.


Thorough testing ahead of time will either make you confident or give 
you the option to report issues which affect you directly (and help 
improve FreeBSD).


I say that having run into one issue with 9.0 (and reported it).  I am 
still using and deploying more 9.0 servers into production.  My .02.



Re: flowtable usable or not

2012-03-03 Thread Adam Strohl

On 3/3/2012 22:32, H wrote:

then you tell us today that ports is the best ever happened to you


It definitely is for me, and is a major reason why I love FreeBSD.  
Yum/RPM/etc. are not without their own issues, and are definitely neither 
foolproof nor 100% reliable in my experience.




Re: Request for flowtable testers and actionable feedback RE: flowtable usable or not

2012-03-05 Thread Adam Strohl

On 3/5/2012 15:00, Daniel Kalchev wrote:

I happen to share the opinion and the experience of Mark Linimon in situations 
like this and yes, I do believe you have been rude here. For no reason 
whatsoever.


I agree.  This "H" person has been hijacking threads over the last week 
or so, and all of the messages I've seen from them boil down to trolling.


This is in contrast to the patient, well thought out replies from the 
rest of the list.


I'm at a loss as to what "H's" endgame is, but it probably has more to 
do with writing poorly executed metaphors than it does with helping 
FreeBSD or its users (whom he/she implies they represent).



Time Clock Stops in FreeBSD 9.0 guest running under ESXi 5.0

2012-03-10 Thread Adam Strohl
I've now seen this on two different VMs on two different ESXi servers 
(Xeon based hosts but different hardware otherwise and at different 
facilities):


Everything runs fine for weeks then (seemingly) suddenly/randomly the 
clock STOPS.  In the first case I saw a jump backwards of about 15 
minutes (and then a 'freeze' of the clock).  The second time just 'time 
standing still' with no backwards jump.  Logging accuracy is of course 
questionable given the nature of the issue, but nothing really jumps out 
(ie; I don't see NTPd adjusting the time just before this happens or 
anything like that).


Naturally the clock stopping causes major issues, but the machine does 
technically stay running.  My open sessions respond, but anything that 
relies on time moving forward hangs.  I can't even gracefully reboot it 
because shutdown/etc all rely on time moving forward (heh).


So I'm not sure if this is a VMWare/ESXi issue or a FreeBSD issue, or 
some kind of interaction between the two.   I manage lots of VMWare 
based FreeBSD VMs, but these are the only ESXi 5.0 servers and the only 
FreeBSD 9.0 VMs.  I have never seen anything quite like this before, and 
last night, as I mentioned above, it happened for the second time on a 
different VM + ESXi server combo so I no longer think it's a fluke.  
I've looked for other reports of this in both VMWare and 
FreeBSD contexts and haven't found anything.


What is interesting is that the 2 servers that have shown this issue 
perform similar tasks, which are different from the other VMs which have 
not shown this issue (yet).  This is 2 VMs out of a dozen VMs spread 
over two ESXi servers on different coasts.  This might be a coincidence 
but seems suspicious. These two VMs run these services (whereas the 
other VMs don't):


- BIND
- CouchDB
- MySQL
- NFS server
- Dovecot 2.x

I would also say that these two VMs probably are the most active, have 
the most RAM and consume the most CPU because of what they do (vs. the 
others).


I have disabled NTPd since I am running the OpenVM Tools (which I 
believe should be keeping the time in sync with the ESXi host, which 
itself uses NTP).  My only guess is that there is some kind of collision 
where NTPd and OpenVMTools were adjusting the time at the same time.  
I'm playing the waiting game now to see what this brings (again, though, I 
am running NTPd and OpenVMTools on all the other VMs, which have yet to 
show this issue).


Anyone seen anything like this?  Ring any bells?

--

Adam Strohl
A-Team Systems
http://ateamsystems.com/



Re: Time Clock Stops in FreeBSD 9.0 guest running under ESXi 5.0

2012-03-10 Thread Adam Strohl

On 3/10/2012 17:10, Bjoern A. Zeeb wrote:

On 10. Mar 2012, at 08:07 , Adam Strohl wrote:


I've now seen this on two different VMs on two different ESXi servers (Xeon 
based hosts but different hardware otherwise and at different facilities):

Everything runs fine for weeks then (seemingly) suddenly/randomly the clock 
STOPS.


Apart from the ntp vs. openvm-tools thing, do you have an idea what "for weeks" 
 means in more detail?  Can you check based on last/daily mails/.. how many days it was 
since last reboot to a) see if it's close to a integer wrap-around or b) to give anyone 
who wants to reproduce this maybe a clue on how long they'll have to wait?  For that 
matter, is it a stock 9.0 or your own kernel?  What other modules are loaded?


Uptime was 31 days on the first incident / server (occurred 5 days ago)
Uptime was 4 days on the second incident / server (occurred last night)

One additional unique factor I just thought of: the two problem VMs have 
4 cores allocated to them inside ESXi, while the rest have 2 cores.


Kernel config is a copy of GENERIC (amd64) with the following lines 
added to the bottom.  All the VMs use this same kernel which I compiled 
once and then installed via NFS on the rest:


# -- Add Support for nicer console
#
options VESA
options SC_PIXEL_MODE

# -- IPFW support
#
options IPFIREWALL
options IPFIREWALL_VERBOSE
options IPFIREWALL_VERBOSE_LIMIT=10
options IPDIVERT
options IPFIREWALL_FORWARD


Re: Time Clock Stops in FreeBSD 9.0 guest running under ESXi 5.0

2012-03-18 Thread Adam Strohl

On 3/12/2012 0:01, Ian Lepore wrote:
> It seems unlikely to me that ntpd and the vm tools would be fighting in
> a way that caused this symptom.  The way ntpd affects timing is to step
> the clock (which gets logged), or to numerically steer the kernel's
> timekeeping routines.  The steering is clamped at 500 ppm; to make the
> clock appear to stop it would have to steer at 1e6 ppm.  I've always
> assumed that VM guest services daemons that handle timekeeping use the
> same ntp_adjtime() interface to the kernel timekeeping that ntpd itself
> uses, so the same steering limits would apply.

An excellent point.

>
> If it happens again, interesting data might be found in the output of:
>
>     sysctl kern.timecounter
>     sysctl kern.eventtimer
>     vmstat -i
>     ntpdc -c kerninfo
>

Will do, I know there was nothing in dmesg, I will definitely check all 
of this though if/when it happens again.  I just brought up another ESXi 
5.0 host with FreeBSD 9.0 VMs (created from dump/restore from the 
existing ones), so there is an increased chance of me seeing this 
hopefully and getting to the bottom of it.  Or it never happens again :P
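
To have this ready for the next occurrence, the four commands Ian lists could be wrapped in a small script like the sketch below. The function name and the output path are assumptions. Note that if the clock really is stopped, anything that waits on time may hang, so the sketch just runs each command once and moves on.

```sh
#!/bin/sh
# Sketch only: capture the suggested diagnostics into one file.
# collect_clock_diag and the output location are illustrative assumptions.

collect_clock_diag() {
    # $1 = file to write the collected output to
    out="$1"
    : > "$out"
    for cmd in "sysctl kern.timecounter" "sysctl kern.eventtimer" \
               "vmstat -i" "ntpdc -c kerninfo"; do
        echo "== $cmd" >> "$out"
        # Word splitting of $cmd is intentional here.
        $cmd >> "$out" 2>&1 || echo "(command failed or unavailable)" >> "$out"
    done
}

# Example:
#   collect_clock_diag /var/tmp/clock-diag.txt
```

Each section of the file starts with a "== command" line, so output from a frozen VM can be compared with output from a healthy one.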



On 3/19/2012 1:36, Steve Wills wrote:

I've experienced something similar once or twice with ESXi 5.0. The
second time it happened, I found that kern.timecounter.tc.HPET.counter
stopped changing. I was told on IRC that this indicated a "hardware"
problem, which I took to indicate a possible bug in ESXi. I haven't
upgraded to ESXi 5.0 Update 1 yet to see if that changes anything.
Rebooting of course fixed it, it has been a while since this happened
and it hasn't happened again since so I haven't pursued it. Just another
data point, hope it helps.


Thanks for the info!  I didn't realize there was an update out already 
for 5.0 (I don't see it on VMWare's site).
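
Steve's observation suggests a quick check for a stuck counter, sketched below. The OID is the one he names; the function name is made up. The sketch deliberately avoids sleep(1) between samples, on the assumption that sleep may never return once the clock has stopped, and relies on the HPET counter normally ticking fast enough that two back-to-back reads differ.

```sh
#!/bin/sh
# Sketch only: report whether a hardware counter sysctl has stopped moving.
# counter_stalled is an illustrative name.

counter_stalled() {
    # $1 = sysctl OID holding a free-running counter
    # Returns 0 if two consecutive reads are identical (stalled),
    # 1 if the value moved, 2 if the OID could not be read.
    first=$(sysctl -n "$1") || return 2
    second=$(sysctl -n "$1") || return 2
    [ "$first" = "$second" ]
}

# Example:
#   counter_stalled kern.timecounter.tc.HPET.counter && echo "HPET looks stuck"
```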



FreeBSD 9 "gptboot: invalid backup GPT header" error (boots fine though)

2012-04-30 Thread Adam Strohl
I've been deploying FreeBSD 9 without issue on a number of 
near-identical servers for a client, but have run into an interesting 
annoyance when I hit the two DB servers.


These DB servers have an LSI 3ware 9750-8i (running a 6 disk RAID 10 in 
a single 3TB virtual volume) which puts them apart from the other two 
servers in this cluster (which don't show either issue I am about to 
discuss).  Otherwise the hardware is identical (Dual Xeon E5620s, 16GB 
RAM).  I've also never seen this before on other physical (or VM) 
FreeBSD 9 instances and I've probably done 50+ FreeBSD 9 VM and physical 
installs at this point (and run through the installer process probably 
over 150 times :P).


Before I get into the GPT error, I want to mention this in case it's 
relevant:


I found I had to partition the disks via the shell (gpart create, gpart 
add, etc.) during install or the kernel would fail to re-mount the 
root disk after booting into the new OS.  If I used the default layout, 
or the partition GUI at all (i.e. 'manual mode'), the new OS wouldn't 
remount root on boot.


I could manually specify the proper root device (i.e. ufs:/dev/da0p3) and 
continue booting without issue, so this is an installer thing.  I'm 
sure I could have fixed this in /boot/loader.conf or similar but wanted 
to figure out what was breaking (now I know it's something the 
installer is doing since it doesn't happen when I partition manually).  So I 
kept reinstalling with different settings and ultimately found shell-based 
manual partitioning worked fine.


However, I see the following error right before BTX comes up (and did 
previously when using the installer's partition GUI):


gptboot: invalid backup GPT header

The machine boots fine, so I'm not stuck, but it is an annoyance for 
an A-type sysadmin like myself.  Even if it's superficial I dislike 
setting up a client's machine to generate "errors" on boot, especially 
without an explanation or understanding behind it.  I also obviously 
wanted to raise the issue here in case there is actually a rare problem 
or this is a symptom of one.


I could find nothing that related specifically to this issue, so I was 
wondering if anyone else had seen this or had thoughts.


My suspicion is that maybe the large size of the volume (3TB or 2.7TB 
formatted) makes it too large for the boot loader to "address all of", 
so it can't get to the end of the disk where the backup GPT header is 
to validate it.


Or maybe the RAID adapter is doing something weird at the end of the 
disk.  This seems unlikely since it presents the RAID as a single volume 
so I'd assume it would hide any tagging or RAID meta data from the OS' 
virtual volume though.


That's about all I can think of.

Selected dmesg output:
LSI 3ware device driver for SAS/SATA storage controllers, version: 10.80.00.003
tws0:  port 0x1000-0x10ff mem 0xb194-0xb1943fff,0xb190-0xb193 irq 32 at device 0.0 on pci4
tws0: Using legacy INTx
tws0: Controller details: Model 9750-8i, 8 Phys, Firmware FH9X 5.12.00.007, BIOS BE9X 5.11.00.006


da0 at tws0 bus 0 scbus0 target 0 lun 0
da0:  Fixed Direct Access SCSI-5 device
da0: 6000.000MB/s transfers
da0: 2860992MB (5859311616 512 byte sectors: 255H 63S/T 364725C)
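
To put numbers on the "too large to address" suspicion, the sector count from the da0 line can be run through a bit of shell arithmetic, as in the sketch below. The backup GPT header sits in the last sector of the disk; whether any layer here (boot loader, controller BIOS) is actually limited to 32-bit sector numbers is purely an assumption, and the function names are made up.

```sh
#!/bin/sh
# Sketch only: where the backup GPT header sits on da0, using the dmesg numbers.
# That some layer is limited to 32-bit sector numbers is an assumption.

sectors=5859311616     # "5859311616 512 byte sectors" from dmesg
sector_size=512

capacity_bytes() { echo $(( $1 * $2 )); }
backup_header_lba() { echo $(( $1 - 1 )); }      # last sector of the disk
beyond_32bit() { [ $(( $1 > 4294967295 )) -eq 1 ]; }

lba=$(backup_header_lba "$sectors")
echo "capacity: $(capacity_bytes "$sectors" "$sector_size") bytes"
echo "backup GPT header at LBA $lba"
if beyond_32bit "$lba"; then
    echo "LBA $lba does not fit in 32 bits"
fi
```

The capacity works out to 2999967547392 bytes and the backup header to LBA 5859311615, which is past the largest 32-bit sector number (4294967295), so the numbers are at least consistent with the theory.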


Let me know if anyone wants to see anything else, has seen this, or has any 
theories!


--

Adam Strohl
A-Team Systems
http://ateamsystems.com/



Re: FreeBSD 9 "gptboot: invalid backup GPT header" error (boots fine though)

2012-05-02 Thread Adam Strohl

Thanks Andrey,

I've just recompiled /boot/gptboot after updating gpt.c and installed it 
via:


gpart bootcode -p /boot/gptboot -i 1 da0

I still see "gptboot: invalid backup GPT header" on boot (but it does 
still boot).


On 5/2/2012 12:58, Andrey V. Elsukov wrote:

On 30.04.2012 23:14, Adam Strohl wrote:

da0 at tws0 bus 0 scbus0 target 0 lun 0
da0:  Fixed Direct Access SCSI-5 device
da0: 6000.000MB/s transfers
da0: 2860992MB (5859311616 512 byte sectors: 255H 63S/T 364725C)


Let me know anyone wants to see anything else/has seen this/has any theories!


Can you try patch from the r234693, update and reinstall gptboot, does it help?
http://svnweb.freebsd.org/base?view=revision&revision=234693




Re: FreeBSD 9 "gptboot: invalid backup GPT header" error (boots fine though)

2012-05-02 Thread Adam Strohl


On 5/2/2012 20:46, Mark Saad wrote:
Did you try to repair the header?  I saw a similar issue on boxes 
that were upgraded from 7-STABLE to 9-STABLE, and recovering made the 
warning go away.  I may be way off here but just my 2 cents.


% gpart recover da0 


Good thought, but no dice:

$ gpart recover da0
da0 recovering is not needed


Re: FreeBSD 9 "gptboot: invalid backup GPT header" error (boots fine though)

2012-05-09 Thread Adam Strohl

On 5/2/2012 23:08, Andrey V. Elsukov wrote:

On 02.05.2012 17:53, Adam Strohl wrote:

% gpart recover da0


Good thought, but no dice:

$ gpart recover da0
da0 recovering is not needed


I have already seen several reports about gptboot's complaints on 3ware
controllers, but don't know what the problem is.
The only guess is that the controller incorrectly handles BIOS requests
when gptboot tries to read the GPT header from the end of a large virtual disk.



Thanks for your input on this Andrey.  Just to clarify I am assuming 
that "da0 recovering is not needed" means that gpart has no problem 
reading and verifying the backup GPT header?


(which is why it's probably the BIOS for the RAID controller, as the GPT 
is actually intact)
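
One way to cross-check this from the running system is sketched below. As far as I know, gpart show tags a table with [CORRUPT] when the kernel found the primary or backup GPT damaged, so the absence of that tag would agree with "recovering is not needed". The helper name is made up.

```sh
#!/bin/sh
# Sketch only: look for the [CORRUPT] tag in `gpart show` output.
# gpt_marked_corrupt is an illustrative name.

gpt_marked_corrupt() {
    # Reads `gpart show <disk>` output on stdin; returns 0 if the
    # table is flagged as corrupt, 1 otherwise.
    grep -q '\[CORRUPT\]'
}

# Example:
#   if gpart show da0 | gpt_marked_corrupt; then
#       echo "kernel considers the GPT on da0 damaged"
#   else
#       echo "kernel sees both GPT copies as valid"
#   fi
```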



Re: 9.0-RELEASE amd64: No Boot on VMWare Workstation

2012-05-14 Thread Adam Strohl

On 5/14/2012 22:18, Larry Rosenman wrote:

Is there a known issue with 9.0-RELEASE on amd64 VMWare Workstation?


Since nobody has chimed in I felt I should:

I use FreeBSD 9.0 routinely under VMWare Workstation without issue (my 
current VMWare Workstation version is 8.0.0 which is slightly out of 
date, 8.0.3 is available).



I tried to build a new VM on my new Lenovo W520 Laptop (Windows 7
Pro/64-Bit, 16G ram) and it gets to the Beastie menu, and times out, then
dies.

Any ideas?

What can I provide?


The version of VMWare Workstation you are running.


And, is there an issue with 9 in general, or could I install 8.3 and then
source update it to 9 or 10?


I build test VMs for 9 and do test upgrades from 6.x, 7.x and 8.x to 9 
using VMWare Workstation on my desktop and laptop without issue.



Re: 9.0-RELEASE amd64: No Boot on VMWare Workstation

2012-05-15 Thread Adam Strohl

On 5/15/2012 21:49, Larry Rosenman wrote:

This is VMWare Workstation 8.0.3 booting off the release ISO.

Ideas?


Is this the installer that doesn't boot or is it the OS after you've 
installed?


If it's the former you might just have a bad ISO download. Have you 
verified the checksum of the ISO?
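
Checking the download could look something like the sketch below. It assumes the expected SHA-256 value is taken from the checksum file published alongside the release images; the function names are made up, and the tool used depends on the host (sha256 on FreeBSD, sha256sum on most Linux systems).

```sh
#!/bin/sh
# Sketch only: compare a file's SHA-256 digest with an expected value.
# file_sha256 and checksum_matches are illustrative names.

file_sha256() {
    # Prints the hex digest of $1 using whichever tool is installed.
    if command -v sha256 >/dev/null 2>&1; then
        sha256 -q "$1"                        # FreeBSD
    elif command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'     # GNU coreutils
    else
        echo "no SHA-256 tool found" >&2
        return 1
    fi
}

checksum_matches() {
    # $1 = file, $2 = expected hex digest
    # Returns 0 on match, 1 on mismatch, 2 if no digest could be computed.
    actual=$(file_sha256 "$1") || return 2
    [ "$actual" = "$2" ]
}

# Example (file name and digest are placeholders):
#   checksum_matches release.iso "<digest from the checksum file>" \
#       && echo "ISO is intact"
```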



Re: 9.0-RELEASE amd64: No Boot on VMWare Workstation

2012-05-16 Thread Adam Strohl

On 5/16/2012 8:12, Larry Rosenman wrote:

Ok, I'm just impatient.  I let it sit, and it eventually came up.

Would it be possible for the next 9.x release to set hw.memtest.tests="0"
when we discover we're under a hypervisor to avoid doing the tests? (or
default it to 0 in the installer kernel?)?



FWIW this seems odd/unique to your setup.

I see no such delay under any VMWare product, though I have not yet 
upgraded to Workstation 8.0.3 from 8.0.0.
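
For an already installed VM, the tunable Larry mentions could presumably be set by hand in /boot/loader.conf. The sketch below adds it only if it is not there yet; the helper name is made up, and whether skipping the memory test is a good idea is a separate question.

```sh
#!/bin/sh
# Sketch only: add a tunable to a loader.conf-style file if it is missing.
# set_loader_tunable is an illustrative name.

set_loader_tunable() {
    # $1 = loader.conf path, $2 = tunable name, $3 = value
    conf="$1"
    if [ -f "$conf" ] && grep -qF "$2=" "$conf"; then
        return 0        # already present; leave the existing setting alone
    fi
    printf '%s="%s"\n' "$2" "$3" >> "$conf"
}

# Example:
#   set_loader_tunable /boot/loader.conf hw.memtest.tests 0
```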



Re: 9.0-RELEASE amd64: No Boot on VMWare Workstation

2012-05-16 Thread Adam Strohl

On 5/16/2012 22:05, Larry Rosenman wrote:

I believe this is due to the 8G of memory I put on it. (I like to build
big VMs.)

It's directly proportional to the size of the VM.



Ahh!  Yeah I rarely build a VM with more than a gig or two here in the 
office (ie; where I use Workstation).



Re: Why Are You Using FreeBSD?

2012-05-30 Thread Adam Strohl

On 5/31/2012 1:20, David Chisnall wrote:

I am currently looking at updating some of our advocacy material (which 
advertises exciting new features like SMP support), and before I do I'd like to 
get a better feel for why the rest of you are using FreeBSD.  If you had to 
list the three things you most like about FreeBSD, which would you pick?  Are 
they the same as when you first started using it?


1. High performance with security and stability focus -- truly makes it 
the ideal server platform

2. The ports system (and supporting tools like portupgrade, portaudit, etc)
3. The OS "makes sense" (as Chris N. mentioned).  The file system 
layout, tools, etc are consistent.


There is so much other stuff too.  Like PF and CARP, ZFS and more ...  a 
kick-ass combo of features and very server-focused.


As a professional admin, FreeBSD is a pleasure to work with day in and 
day out.  I've never heard admins of "other" OSes say that :P




Re: Why Are You Using FreeBSD?

2012-05-31 Thread Adam Strohl

On 5/31/2012 21:22, Damien Fleuriot wrote:

On 5/31/12 4:01 PM, Jim Ohlstein wrote:

To add others, in no particular order:

Ease of upgrade. While some have noted that binary upgrades are easier
on Debian, it's far and away superior, IMMHO, to have a locally compiled
system. Many Linux distros have no upgrade path short of a wipe and
re-install.


Far superior, check, FAR MORE TIME CONSUMING, check as well !


This brings up another point: Repair is always possible with FreeBSD.

You can back out all packages or types of packages easily (and 
re-compile or reinstall them if needed).  You can recompile/reinstall 
the OS if needed (somewhere else too and copy it over).  Or just copy 
pieces from a live cd or restore tarball.  And it's pretty 
straightforward to do even for a non-admin person.


You can even restore over a live running system with tar, which I do 
occasionally when cloning machines or restoring them with dump/restore.  
Very slick.



Re: Why Are You Using FreeBSD?

2012-05-31 Thread Adam Strohl


On 5/31/2012 21:47, Damien Fleuriot wrote:

Regarding packages, I've never really explored it, would you detail a bit ?


Well, I really mean the resulting pkg info from a port.  A good example 
is PHP, sometimes you have to say "everyone out of the pool" because of 
an upgrade:


cd /var/db/pkg && PKGS=`ls | egrep "^(php|pear|pecl)"`; \
  for PKG in $PKGS; do echo " $PKG"; pkg_delete "$PKG"; done


I run that a few times until it stops picking things up; then it's a 
few commands to re-install PHP and its extensions (because of the 
extensions roll-up port).
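
The "run it a few times" part can be folded into the script itself, as in the sketch below. It uses the same pkg_delete and /var/db/pkg layout as the one-liner; the function names, the PKG_DB variable and the limit of five passes are assumptions for illustration.

```sh
#!/bin/sh
# Sketch only: repeat the PHP package removal until nothing matches.
# PKG_DB, matching_pkgs, purge_php_pkgs and the pass limit are illustrative.

PKG_DB="${PKG_DB:-/var/db/pkg}"

matching_pkgs() {
    # Lists installed packages whose names start with php, pear or pecl.
    ls "$PKG_DB" | grep -E '^(php|pear|pecl)'
}

purge_php_pkgs() {
    pass=0
    while [ "$pass" -lt 5 ]; do
        # grep exits non-zero when nothing matches, which ends the loop.
        pkgs=$(matching_pkgs) || break
        for pkg in $pkgs; do
            echo " $pkg"
            # May fail while other packages still depend on this one;
            # a later pass picks it up.
            pkg_delete "$pkg" || true
        done
        pass=$((pass + 1))
    done
}

# Example:
#   purge_php_pkgs
```

The pass limit is only there so a package that can never be removed does not keep the loop going forever.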


You can of course script it further, which is part of why I like FreeBSD 
so much.



Re: Why Are You Using FreeBSD?

2012-06-01 Thread Adam Strohl

On 6/1/2012 17:19, Katinka wrote:
There's a nice discussion going on, over at Phoronix. 

For some reason, they don't seem to like us very much. 


Lots of the comments remind me about Linux vs. Windows in the late 90s, 
and taken with a grain of salt are fairly amusing because of how 
ignorant a lot of them are.


I found this particularly fitting comment at the very end:

"If you'd ask me for the biggest difference between Linux and BSD users: 
We know all about Linux - They know nothing about BSD. "


Which is sad really, their lives could be so much easier if only they 
knew how much better it could be ;D  (My opinion of course, I'm sure 
lots of people think Windows Server administration is easier than any 
UNIX -- just not on this list).  To each their own, and arguing about it 
is counter-productive.


I do think that forum post underscores the need for advocacy though -- 
we need to get the message out as to why FreeBSD is better than any OS 
in a lot of applications (which is different than arguing it out on 
Linux forums).  We need them to try it out and expose them to the things 
that make it great so they see it first hand.  Because it is clear most 
of these posters are very ignorant about FreeBSD -- that's really "our" 
collective fault.


Trolls and fanbois aside there is probably a huge number of Linux admins 
out there who just use it "because that is what they use" .. in the same 
way that Windows admins in the 90s hadn't really heard of Linux and 
feared it because they didn't understand it.


My 2 cents + attempt at keeping this thread constructive.  I think 
I'm going to go sign up for the FreeBSD-advocacy list now ...



Re: Why Are You Using FreeBSD?

2012-06-01 Thread Adam Strohl

On 6/1/2012 18:03, Jason Leschnik wrote:

I may be totally incorrect with my above ideas, but it's what i would
like to see from FreeBSD *again*... This is the reason in the first
place most people used FreeBSD, stability/scalability/performance are
the hallmarks of FreeBSD. If we have these hard hitting numbers
released frequently it gives the dev team a good indication of how
changes reflect on performance.


This is a good point and the kind of stuff that would make, for 
example, a great Slashdot post once finished.


Of course there would be arguments but I think it would be good 
exposure.  It certainly would be nice to have a place to point to for these 
things vs. just saying "its more better and stabler", too.  And if it's 
not, at least it's acknowledged so it can be fixed.




Re: Why Are You NOT Using FreeBSD ?

2012-06-03 Thread Adam Strohl

On 6/3/2012 10:09, Mark Linimon wrote:

On Sun, Jun 03, 2012 at 01:43:43AM +0200, Fritz Wuehler wrote:

So there could be lots of overlap and just looking at the two numbers
you posted doesn't really tell the whole story.

No, I agree that it doesn't.  I was just trying to add an aside, and
point out that the task would not be trivial.

Since I'm heavily invested in FreeBSD ports I think I need to step back
and let other folks comment in this thread.


I manage and support a little over 50 FreeBSD servers (VMware, Xen and
native) and feel that the port system, on the whole, is excellent.  It's
easily one of the best features of FreeBSD.  Portaudit reports
issues and I can plan and upgrade them as needed.  Portupgrade works
great 99% of the time and when it doesn't it has the good sense to roll
back what it's done.  If there is any question as to what it should do it
errors and tells me, which is exactly what I want it to do.


I've been a FreeBSD user for about 18 years and supported it 
professionally for about 10.  In this thread I've read a few posts that 
contain blanket statements like "ports are broken" and "never work".  I'm
at a loss as to how to respond to this as it is completely counter to my
experience.   I wish I could see what they were talking about and figure 
out what happened so I could understand what caused them to make such a 
statement.  It's like they're talking about a different OS than the one 
I know.


I've written a simple script to run portaudit and pop up a dialog with
check boxes that then kicks off portupgrade for the selected ports which
have issues.  99% of the time it's that simple.  This is what I want in
a server environment.  I do not want things auto-updating (a.k.a. auto
breaking) or making decisions about supporting libraries behind my back.
PHP is a good and common example why: an upgrade can and does break
web sites that ran fine before.  Updates need to be managed in a
process which is outside the scope of the OS (because it's a server, not a
desktop).  FreeBSD has all these great tools for managing the mechanical
action of updating and imposes minimal process, which is perfect because
I have my own process.  And if things get mucked up (which mostly isn't
the ports system's fault when it does happen), it's easy to back out and
re-do if needed.
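
For anyone wondering what such a wrapper could look like, here is a
minimal sketch in plain sh.  It is not the actual script described above.
It assumes "portaudit -a" prints one "Affected package: name-version"
line per finding and that dialog(1) and portupgrade are installed; the
function names are made up for illustration.

```sh
#!/bin/sh
# Sketch only: audit installed ports, let the admin tick what to upgrade.

# Read "portaudit -a" style output on stdin, print one package per line.
# Assumption: findings appear as "Affected package: name-version".
affected_packages() {
    sed -n 's/^Affected package: //p' | sort -u
}

# Show a dialog(1) checklist of the given packages and print the selection.
choose_packages() {
    items=""
    for pkg in "$@"; do
        items="$items $pkg vulnerable off"
    done
    # dialog writes the chosen tags to stderr, so swap stderr and stdout.
    # $items is left unquoted on purpose so it splits into separate arguments.
    dialog --checklist "Ports with known issues" 20 70 12 $items 3>&1 1>&2 2>&3
}

# Audit, ask, then upgrade only what was ticked.  Nothing happens unattended.
audit_and_upgrade() {
    pkgs=$(portaudit -a | affected_packages)
    if [ -z "$pkgs" ]; then
        echo "No known issues."
        return 0
    fi
    chosen=$(choose_packages $pkgs) || return 1     # cancel means do nothing
    chosen=$(printf '%s' "$chosen" | tr -d '"')      # dialog may quote tags
    if [ -n "$chosen" ]; then
        portupgrade $chosen
    fi
}

# Usage (as root, on a box with the tools installed):
#   audit_and_upgrade
```

The point of the design is that the tooling only does the mechanical
part; the decision about what to upgrade, and when, stays with the admin.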


After reading this thread I am wondering if I should clean the update 
dialog script up and submit to the ports tree.  It seems like people 
think the port update process is harder than it is because it lacks a
Windows Update-like dialog, which is essentially what this is akin to
(and there might be a port which does this already, too .. anyone?).
All the hard stuff has been done by the FreeBSD team, all I did was put 
a bash/dialog script on it.


I very rarely run into ports that don't build on supported versions of 
FreeBSD (ie; ones that haven't reached EoL).  I have a number of 
customers with a few 6.2 boxes [which I can't wait to upgrade] and still 
almost everything builds without tinkering.


All of this is in the scope of servers though (web, DB, application, 
etc) and not on the desktop.  I haven't used a FreeBSD desktop since 
probably 4.x, and while I don't begrudge the work people are doing for 
the desktop experience it just doesn't apply to me nor is it why I love 
FreeBSD.   I won't say something like "you're running a server OS on 
your desktop and expecting it to be like a Mac".  What I will say is: I'm
getting from this thread that a lot of the complaints people have seem 
to be based around the desktop.  My guess is that this is a super 
minority of actual use (by server count).


BUT: I feel like people are judging how fit FreeBSD is for server
work by how easy or Mac/Windows-like (as many Linux distros try to
be) it is to update.  Not good ... but it makes sense from a
social/human perspective, and is probably another thing we should 
consider in terms of advocacy.


I'm interested in what people think about this, and yeah this should 
probably be in the advocacy list but it's not, so thhblt :P




Re: Why Are You NOT Using FreeBSD?

2012-06-03 Thread Adam Strohl

On 6/3/2012 11:14, Erich wrote:

What I really do not understand in this whole discussion is very simple. Is it 
just a few people who run into problems like this or is this simply ignored by 
the people who set the strategy for FreeBSD?

I have mentioned here for years that putting version numbers onto the port tree
would solve many of these problems. All I get as an answer is that it is not 
possible.

I think that this should be easily possible with the limitation that older 
versions do not have security fixes. Yes, but of what help is a security fix if 
there is no running port for the fix?


I feel like I'm missing something.  Why would you ever want to go back 
to an old version of the ports tree?  You're ignoring tons of security 
issues!


And if a port build is broken then the maintainer needs to fix it, that 
is the solution.


I must be missing something else here, it just seems like the underlying 
"need" for this is misguided (and dangerous from a security perspective).



Re: Why Are You NOT Using FreeBSD ?

2012-06-03 Thread Adam Strohl

On 6/3/2012 17:51, Mehmet Erol Sanliturk wrote:

Always I am stressing that to manage FreeBSD,  a fair amount of expertise
is required which I think this level may be reduced by improving the
FreeBSD management by transferring knowledge to its managing parts ( for
example : package management , repair of broken parts , installation steps
to reach a state like in very easily usable Linux distributions such as
Fedora , Mageia , Mandriva , and many others , etc. )


Yeah or a GUI to reduce the need for knowledge transfer.


You know what to do by your expertise gained over use , which such an
expertise is completely missing in a new comer , and even sometimes in very
highly experienced computer professionals because a different operating
system reduces them to a little experienced new starter .



I agree and your issue with USB sticks proves my point.  I've never 
tried to mount an NTFS USB stick and I'm OK with that.  But for you it 
is a big hassle (understandably so) and it has definitely negatively 
impacted your view of FreeBSD.



Compare the cost of a Linux or Windows and personal time , and make a
decision which one to choose .

Another point frequently mentioned is that FreeBSD is leaned toward servers
.
Only I want to say that , "Please , install a CentOS , Debian , or Windows
Server trial , and see how a server may be ..."


I manage Windows, CentOS and Debian (and RedHat and a few others) 
servers too.   I've found FreeBSD is more reliable on the whole and 
takes less time to maintain (which means less expensive for my clients). 
 This is one area where FreeBSD shines.  And when things do break it is 
possible to recover fairly easily.  That is another.


And yes, in terms of that initial learning curve my experience helps but
it's the OS that is doing the work here.  If I was more experienced with
Windows or Linux it wouldn't make them any easier to update either,
though.  So there is a point at which "knowing what to do" stops being
the limiting issue and it's just "ok well this is broken now and it can't
be cost-effectively fixed".  That crossover point is something that is
almost never reached with FreeBSD in my experience.


All of this is completely parallel and unrelated to your (or another 
person's) experience as a desktop user though.  What you see is "USB 
thumbdrives don't work" :)   So you decide to use another OS, and 
probably wouldn't advocate for FreeBSD if presented the chance in a 
server context because of that experience.  That is a shame in my book. 
(I know I'm putting words in your mouth but it's simply to illustrate my
thinking on how public perception is formed).




Re: Why Are You NOT Using FreeBSD?

2012-06-03 Thread Adam Strohl



On 6/3/2012 19:24, Erich wrote:

yes, you miss a very simple thing. Updated this morning your ports tree. Your 
client asks for something for Monday morning for which you need now a program 
which needs some kind of PNG but you did not install it.

Do you have a machine that is fast enough to upgrade all your ports and still 
finish what your client needs Monday morning?


All I'd need to do is compile and install libpng and then compile
the program.  There is no need to "upgrade all my ports".




The ports tree is not broken as such. Only the installation gets broken in some 
sense. Have a version number there would allow people to go back to the last 
known working ports tree, install the software - or whatever has to be done - 
with a working system.

Of course, the next step will be an upgrade. But only after the work which 
brings in the money is done.


I don't understand what you are saying here, sorry.  Or why you'd 
upgrade all your ports to install 1 new one.




You do not face this problem on Windows. You can run a 10 year old 'kernel' and 
still install modern software.


Not true at all.  Lots of Windows software requires minimum service pack 
and KB patch levels.




Erich



Re: Why Are You NOT Using FreeBSD ?

2012-06-08 Thread Adam Strohl

On 6/9/2012 3:34, Steve Franks wrote:

Every time libjpeg or
perl or python bumps the rev, I have to explain to my boss that I
won't be using my computer for 48 hours.


Why is this?  And why are you updating every time there is a rev bump?

It almost sounds like you're recompiling everything just for the heck of 
it, though I don't get how even that takes 48 hours.  Even make 
buildworld is done in multi-user mode and so you could use your 
workstation during the build.  And we're talking about ports here so ...


Just curious!

--
Adam Strohl
http://www.ateamsystems.com/


Re: su problem

2012-06-09 Thread Adam Strohl

On 6/9/2012 20:29, Sami Halabi wrote:

Hi,
/var/log/messages - no new logs


Sorry if this has been asked, anything in dmesg?


Re: su problem

2012-06-09 Thread Adam Strohl

On 6/9/2012 20:33, Sami Halabi wrote:

its the same as /var/log/messages


I assume you mean there is nothing there because it's not the same thing 
(yes dmesg stuff should get logged into syslog but your system obviously 
isn't working right so ...).


Past that I've been skimming this thread since you posted and I can't
think of anything here that would resolve this except that it might be
worth a try to have someone ctrl-alt-del it (requires no FreeBSD
knowledge, passwords, etc by the person doing it and should gracefully
reboot the server).  It's a total Hail Mary [pass] though [and probably
won't work].


It might lock you out entirely, too.

P.S.
Beyond this incident, obviously setting up a remote console is ideal.
IPMI is very worth it, but my guess is you'd have it set up if your MB
had it.  If you don't have an IPMI module and you happen to have another
box there, consider cross-patching their serial consoles to each other
so that if one goes down you can reach it over serial via the other one
(ie; server1's com1 to server2's com2, and server2's com1 to server1's
com2).  You need to set this up as root though, so no help now.


--
Adam Strohl
http://www.ateamsystems.com/


Re: Why Are You NOT Using FreeBSD ?

2012-06-09 Thread Adam Strohl

On 6/9/2012 14:50, O. Hartmann wrote:

Lucky man! We are "off" from some desktop services (like LibreOffice and
Firefox) for more than a week now!


Why did you update to begin with?  Bug/security fix?

--
Adam Strohl
http://www.ateamsystems.com/


Re: Why Are You NOT Using FreeBSD ?

2012-06-09 Thread Adam Strohl

On 6/9/2012 21:04, O. Hartmann wrote:


Well, this is a good question. Unfortunately, I did an update of the
ports tree and the PNG update rushed in. The information in UPDATING came
in a bit later, but since then several ports have been updated already -
and rendered some applications unusable.

The question "why" isn't applicable here. Sometimes ports need updates
or a port that is installed reels in another or even an update and this
triggers the avalanche of messes.



Fair enough, I just feel like people reporting "48 hours of not using 
their computer" are doing something extraordinarily weird and I'm just 
at a loss as to what they're doing and why.


I get the feeling people are updating their ports tree and then 
recompiling/reinstalling everything "just because" and then are 
complaining when one thing breaks (its the only thing I can think of).


--
Adam Strohl
http://www.ateamsystems.com/


Re: Why Are You NOT Using FreeBSD ?

2012-06-09 Thread Adam Strohl

On 6/9/2012 21:36, H wrote:

why is there an update, would be a little bit better


My point was why do you need the update, and why can't you wait until
it's been better vetted?  The porters do the best they can but can't
test everything.



but a real good question would be, why is there a not working/compiling
update released to the ports tree


Because it was just released and every combination of system 
configuration hasn't been tested, so there is some lag time before it 
stabilizes, especially with complicated software.


Therein lies the question -- why do you need to compile a port which
was just released?   Is it a security thing or is it "I want the latest" 
?  I'm just curious (and totally uninterested in how this ranks in your 
"worse question" list).


--
Adam Strohl
http://www.ateamsystems.com/


Re: Backups with 9-STABLE -- Options?

2012-06-10 Thread Adam Strohl

On 6/10/2012 3:08, Karl Denninger wrote:

With SU+J as the default filesystem, what options actually WORK now?

1. Dump "L" will NOT -- it doesn't hang any more but now just bitches
and refuses to run.  I suppose that beats a hang


Heh, yeah that is improved from what it did before ;D


2. Dump without "L" and take your chances?  What risks am I running by
doing this on a running system?


Depends on what is running and how it does file writes.  For example SQL 
DB storage engines are unlikely to do well (ie; the restore will be 
corrupted if there are changes during the process).  Something like 
CouchDB though which is "always consistent on disk" probably wouldn't care.


Past specific applications (or user activity) the inherent risk is
unpredictable usefulness of your backups.  Since you're doing backups as
a safeguard (and they are very likely your last hope if things really go
wrong) you don't want to find out that a key piece is corrupted or
missing entirely due to files moving around during the dump when you end
up needing it.



3.  Other?

Dump has been the canonical means of backing up... forever.  And it
still is claimed to be the canonical means in the documentation.

So what options do we have now that actually work -- is there now a new
"canonical" backup method that is recommended?


My solution is to turn off journals for any build.   Dump is a great 
tool (especially when scripted) and is very efficient.


And as neat as journals are, backups using dump with snapshots are way
more valuable and important in my book.
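
To make that concrete, here is a rough sketch (not a vetted backup
script) of checking for journaling before asking dump for a snapshot.
It assumes "tunefs -p" reports a "soft update journaling" line ending in
"enabled" or "disabled"; the function names and output path are
hypothetical.

```sh
#!/bin/sh
# Sketch only: refuse a snapshot dump up front when SU+J is on.

# Read "tunefs -p" output on stdin; succeed if journaling is reported enabled.
suj_enabled() {
    grep -q 'soft update journaling.*[[:space:]]enabled'
}

# Dump a live file system from a snapshot into a file.
snapshot_dump() {
    fs=$1
    out=$2
    # tunefs prints its report on stderr, hence the 2>&1
    if tunefs -p "$fs" 2>&1 | suj_enabled; then
        echo "SU+J is enabled on $fs, not attempting dump -L" >&2
        return 1
    fi
    # -0 full dump, -L snapshot of the live fs, -a skip tape length estimates,
    # -u record the dump in /etc/dumpdates, -f write to the named file
    dump -0Lauf "$out" "$fs"
}

# Usage (as root):
#   snapshot_dump / /backup/root.dump
```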


My .02.

--
Adam Strohl
http://www.ateamsystems.com/


Re: Backups with 9-STABLE -- Options?

2012-06-10 Thread Adam Strohl

On 6/10/2012 22:26, Karl Denninger wrote:

Well, backup with snapshots don't do well EITHER on a database unless
you can snapshot BOTH the dbms data store(s) and the transaction log
store(s) /*at the exact same instant*/.  If you cannot then you're
asking for trouble and are likely to get it.  But I've dealt with that
particular "gotcha" problem in a different way for the DBMS I use
(Postgresql)


You asked what would happen, not what was the best way to back up a SQL 
DB, but your point is valid.


Snapshots don't fix this issue entirely but drastically reduce the 
chance of a 100% broken backup.


SQL servers should be dumped out to disk (ie; mysqldump) to avoid this
or have a dedicated backup client (which means you're probably not using
dump anyway).
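
A minimal sketch of the "dump it out to disk first" approach for MySQL,
so that the file-level backup only ever sees a finished SQL file.  The
function name and target path are made up, and credentials are assumed
to come from the usual client config rather than the command line.

```sh
#!/bin/sh
# Sketch only: write a logical dump to a temp file, then move it into place.

sql_dump_to() {
    target=$1
    tmp="${target}.partial"
    # --single-transaction gives a consistent view of transactional
    # (InnoDB) tables without locking them for the whole dump
    if mysqldump --single-transaction --all-databases > "$tmp"; then
        mv "$tmp" "$target"     # only complete dumps get the final name
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage, before the nightly file system backup runs:
#   sql_dump_to /var/backups/mysql-all.sql
```

Writing to a ".partial" name first means a backup that runs mid-dump
picks up yesterday's complete file instead of half of today's.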



So basically what you're saying is that SU+J leaves you exposed to
having no real backup option that provides a rational guarantee of the
ability to restore the backup taken.


That's a bit of a gloss over what I said.  My point was that you
might end up missing something if it's changing at the time the backup
is taken.  It really depends on what specifically that server is doing.


There is also a consistency issue: using snapshots makes it so that
all the files make sense together, instead of the files getting more and
more recent as the end of the backup block approaches.


--
Adam Strohl
http://www.ateamsystems.com/


Re: IPv6 and CARP crashes boxes

2012-06-12 Thread Adam Strohl

On 6/12/2012 19:48, Pete French wrote:

I ran into some - aliases on a CARP interface did not seem
to work properly - but if you work around that then it appears
to work fine. We are using it in production with no problems.


I have noticed this issue (CARP + IPv4 aliases) with older (pre 9.x) 
versions of FreeBSD.


I maintain some legacy 6.2 servers and had to eventually add ifconfig 
statements inside rc.local to get the links to coalesce.  6.2 appears to 
ignore _alias directives entirely inside rc.conf, and has real issues 
if you add/delete aliases to a CARP interface while it's up (both peers
end up thinking they're MASTER).


In 9.x it all works as expected at least for IPv4 (rc.conf 
carp_alias entries, aliases, on the fly reconfiguring).
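
For reference, this is roughly the shape of a 9.x-style rc.conf setup
with a CARP interface plus one extra address.  Treat it as a sketch
under assumptions: the vhid, password and addresses are placeholders,
and it is not copied from any of the servers mentioned above.

```sh
# /etc/rc.conf fragment (9.x-style cloned carp interface); values are placeholders
cloned_interfaces="carp0"
ifconfig_carp0="vhid 1 pass examplepass 192.0.2.10/24"
# an additional address on the same CARP interface
ifconfig_carp0_alias0="inet 192.0.2.11 netmask 255.255.255.255"
```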


--
Adam Strohl
http://www.ateamsystems.com/


Re: IPv6 and CARP crashes boxes

2012-06-12 Thread Adam Strohl

On 6/12/2012 20:08, Pete French wrote:

I have noticed this issue (CARP + IPv4 aliases) with older (pre 9.x)
versions of FreeBSD.


Ah, just to be clear, the only problems I had with aliases were with IPv6 - it
always worked properly with IPv4. But I didn't try on anything pre 8.1!

-pete.


Doh, I caught this just as I hit send :P

--
Adam Strohl
http://www.ateamsystems.com/


Re: Recommendation for Hyervisor to host FreeBSD

2012-07-05 Thread Adam Strohl

On 7/5/2012 21:27, Rainer Duffner wrote:

They come (or came, last time I looked) with a lot of
run-time dependencies and even more at build-time.
And AFAIK, they don't offer the full functionality either.


There are a number of dependencies, but as far as I know it isn't missing
anything: memory driver, OS control (ie; shutdown), etc.


I manage dozens of FreeBSD VMs under ESXi 3.5, 4.x and 5.0 ... most of 
them using OpenVM tools (ie; the 9.x hosts), works great.



--
Adam Strohl
http://www.ateamsystems.com/




Re: SOLVED: Time Clock Stops in FreeBSD 9.0 guest running under ESXi 5.0

2012-08-03 Thread Adam Strohl
Just a heads up on the original issue, which is FreeBSD's timer/clock 
stopping under ESXi 5.0 and some later versions of VMware Workstation.


I've gotten a few direct messages that this thread ranks high on Google 
but people are missing the solution.  A few months ago I found this 
forum posting (I believe this was linked in this thread already) 
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2012-03/msg00201.html 



The long and short of it is that changing the kern.timecounter.hardware
sysctl value to ACPI-fast (or ACPI-safe if you're not running 9.x yet)
fixes the hanging issue so far for us.


To temporarily enable it under 9.x:
sysctl kern.timecounter.hardware=ACPI-fast

Pre 9.x (which doesn't have the ACPI-fast mode):
sysctl kern.timecounter.hardware=ACPI-safe

To make this persist across reboots and be enabled by default add this 
line to your /etc/sysctl.conf


Under 9.x:
kern.timecounter.hardware=ACPI-fast

Pre 9.x:
kern.timecounter.hardware=ACPI-safe
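
If you'd rather not hard-code which one to use, a small sketch like the
following picks whichever of the two the running kernel offers.  It
assumes kern.timecounter.choice lists the available counters as
"NAME(quality)" words; the function names are made up.

```sh
#!/bin/sh
# Sketch only: prefer ACPI-fast, fall back to ACPI-safe.

# $1 is the value of kern.timecounter.choice,
# e.g. "TSC(800) ACPI-fast(900) i8254(0) dummy(-1000000)"
pick_timecounter() {
    case " $1" in
        *" ACPI-fast("*) echo ACPI-fast ;;
        *" ACPI-safe("*) echo ACPI-safe ;;
        *) return 1 ;;      # neither offered, leave things alone
    esac
}

# Apply the choice to the running system (not persistent across reboots).
apply_timecounter() {
    tc=$(pick_timecounter "$(sysctl -n kern.timecounter.choice)") || return 1
    sysctl kern.timecounter.hardware="$tc"
}

# Usage (as root):
#   apply_timecounter
```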

Hope this helps anyone running across this issue.

--
Adam Strohl
http://www.ateamsystems.com/



Re: SOLVED: Time Clock Stops in FreeBSD 9.0 guest running under ESXi 5.0

2012-08-03 Thread Adam Strohl

Doh, correct URL for the forum post is:
http://forums.freebsd.org/showthread.php?t=31929&page=2


--
Adam Strohl
http://www.ateamsystems.com/


No buffer space available / tcp_inpcb value

2012-10-30 Thread Adam Strohl

Hey -STABLE,

I've got a client who we've set up a FreeBSD cluster for with about a
dozen servers, all behind two front end proxies/LBs/firewalls which
also act as NAT gateways for the internal servers.


On the active front end proxy we've started seeing "fatal: socket: No 
buffer space available" errors during high-peak times.   I can see in 
vmstat -z that this is what is getting denied:


ITEM         SIZE   LIMIT   USED   FREE         REQ     FAIL  SLEEP
tcp_inpcb:    392,  32770, 19398, 13372, 1449734621, 6312858,     0

We've got a lot of the other values bumped, and it appears to be this 
input limit that is getting hit.  There are no other non-zero FAILed 
counters except 64 and 128 buckets which I believe are normal.
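
For spotting these quickly, a small filter along these lines lists every
zone with a non-zero FAIL count.  It is only a sketch and assumes the
column order shown above (SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP) with
the zone name before the colon; the function name is made up.

```sh
#!/bin/sh
# Sketch only: print "zone failcount" for zones whose FAIL column is non-zero.

failed_zones() {
    awk -F: 'NF >= 2 {
        n = split($2, f, ",")       # comma-separated counters after the name
        fail = f[6] + 0             # sixth counter is FAIL
        if (n >= 6 && fail > 0) print $1, fail
    }'
}

# Usage:
#   vmstat -z | failed_zones
```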


I cannot seem to find the sysctl (or equiv) that controls this limit 
though, or even what it is.  Anyone know?


I'm obviously in need of this specific answer, but overall is there a
codex of vmstat -z's items that explains this that I have just not found
in my searches?  This isn't the first time I've had to dig into a value
like this to increase its limit, but this time I'm not turning anything up.


Any thoughts/ideas appreciated!

--
Adam Strohl
http://www.ateamsystems.com/



Re: No buffer space available / tcp_inpcb value

2012-10-30 Thread Adam Strohl

On 10/30/2012 23:05, Adrian Chadd wrote:

Check the output of 'netstat -mb', maybe you're also running out of mbufs?


There was nothing denied there that I can see:

35696/4039/39735 mbufs in use (current/cache/total)
2069/3797/5866/32768 mbuf clusters in use (current/cache/total/max)
2069/2077 mbuf+clusters out of packet secondary zone in use (current/cache)
4/3283/3287/16384 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/8192 9k jumbo clusters in use (current/cache/total/max)
0/0/0/4096 16k jumbo clusters in use (current/cache/total/max)
13078K/21735K/34813K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines




Adrian



--
Adam Strohl
http://www.ateamsystems.com/


Re: SU+J on 9.1-RC2 ISO

2012-11-02 Thread Adam Strohl

On 11/2/2012 23:47, Bas Smeelen wrote:

Hi

Why are journaled soft updates the default when installing a new system
from a 9.1-RC2 ISO?

I admit I did not pay too much attention when installing a new system
from an 9.1-RC2 ISO and found out when taking a snapshot with dump (dump
-0Lauf) to clone the system. Other systems (9-STABLE, 9.1-RC2 and
9.1-RC3) have been upgraded from 8.X-RELEASE and earlier, so there are
no journaled soft updates enabled, just soft updates, and well there
dump with snapshot works just fine.

Can SU+J be disabled for the 9.1-RELEASE or do you think this is not
going to be a problem for users of FreeBSD? I will have to boot these
two systems single user now to disable the soft updates journal, because
I use dump + restore on live systems, not a problem for me, it is just
an inconvenience.



I have to second this sentiment.  Unless the dump/snapshot issue has
been resolved the journal should be turned off by default.


It's a really nasty bug that causes an instant panic which is awful if 
the server is in production.  The fact that it happens when you're 
trying to exercise due diligence (ie; backups) is even worse.


-- my .02


Re: SU+J on 9.1-RC2 ISO

2012-11-02 Thread Adam Strohl

On 11/3/2012 0:13, Mike Jakubik wrote:

You can disable SU+J after installing, though it would be nice if the
installer gave you a choice.


This assumes that you know about this flaw, which most people do not.

I didn't until I discovered it by panic-ing a perfectly fine running 
server.  Getting burned by a known bug like this shouldn't be "SOP" for 
users of FreeBSD.


If anything it should be turned off by default, and people can turn it 
on if they want given the landmine it plants.  If they know how to turn 
it on they're much more likely to be aware of the issue.



--
Adam Strohl
http://www.ateamsystems.com/


Re: SU+J on 9.1-RC2 ISO

2012-11-04 Thread Adam Strohl

On 11/3/2012 1:31, Mateusz Guzik wrote:

Currently when you try to take a snapshot, the kernel checks whether SUJ
is enabled on specified mount-point, and if yes it returns EOPNOTSUPP.

See this commit (MFCed as r230725):
http://svnweb.freebsd.org/base?view=revision&revision=230250



Ahhh excellent to hear. I partition manually these days with 9.0-R
because most servers are either using gmirror, which I want set up before
the install, or a RAID card which means partitions need to be aligned to
the stripe boundaries.  So I just "newfs -U -L" and keep journaling off
and wouldn't have realized there is at least some mitigation that will
make it into 9.1-R.


I still stand by my feeling that it should not be on by default though, 
because it breaks snapshots and by extension dump -L which I consider to 
be a pretty awesome feature of FreeBSD.  If you have partitions with
it enabled it means booting up in single user to undo it, which is a
hassle for a server if it's in production (I realize that's a bit whiny :P).



--
Adam Strohl
http://www.ateamsystems.com/


Re: Why is SU+J undesirable on SSDs?

2012-11-04 Thread Adam Strohl

On 11/4/2012 5:32, Karl Denninger wrote:

It is utter insanity to enable, by default, filesystem options that
break _*the canonical backup solution*_ in the handbook ("dump", when
used with "-L", which it must be to dump a live filesystem SAFELY.)


Exactly.


--
Adam Strohl
http://www.ateamsystems.com/


shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

2013-06-19 Thread Adam Strohl

Hello -STABLE@,

So I've seen this situation, seemingly at random, on a number of 
physical 9.1 boxes as well as VMs for, I would say, 6-9 months at least. 
I finally have a physical box here that reproduces it consistently 
and that I can reboot easily (i.e., not a production/client server).


No matter what I do:

reboot
shutdown -p
shutdown -r

This specific server will stop at "All buffers synced" and not actually 
power down or reboot.  KB input seems to be ignored.  This server is a 
ZFS NAS (with GMIRROR for boot blocks) but the other boxes which show 
this are using GMIRRORs for root/swap/boot (no ZFS).


Here is what happens on the console: http://i.imgur.com/1H8JMyB.jpg

When I reset the server it appears that disks were not dismounted 
cleanly ... on this ZFS box it comes back quick because ZFS is good like 
that but on the other servers with GMIRROR roots rebuilding the GMIRROR 
and fscking at the same time is murder on the disk/performance until it 
finishes.


Another interesting thing is that this particular server runs slapd 
(OpenLDAP) which, when it comes back up, has a "corrupted" DB (easily 
fixed with db_recover, but still).  This might be because FS commits 
aren't happening at the end.   I can even manually stop slapd (service 
slapd stop) then run sync(8) (I assume this does something for ZFS too) 
and it still comes back as hosed if I reboot shortly after.  If I 
start/stop slapd it's fine.  So I feel like there is an FS/dismount 
thing going on here.


Additional information: I also have some boxes which will reboot (ie; 
they don't freeze like some do at the end) but they don't dismount 
cleanly either and have to rebuild both GMIRROR and fsck.  This might be 
a different issue, too.


Anyone have any thoughts?  Let me know if I can provide more details etc.

--
Adam Strohl
http://www.ateamsystems.com/


Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

2013-06-19 Thread Adam Strohl

On 6/19/2013 19:21, Jeremy Chadwick wrote:

On Wed, Jun 19, 2013 at 06:35:57PM +0700, Adam Strohl wrote:

Hello -STABLE@,

So I've seen this situation, seemingly at random, on a number of
physical 9.1 boxes as well as VMs for, I would say, 6-9 months at
least.  I finally have a physical box here that reproduces it
consistently and that I can reboot easily (i.e., not a production/client
server).

No matter what I do:

reboot
shutdown -p
shutdown -r

This specific server will stop at "All buffers synced" and not
actually power down or reboot.  KB input seems to be ignored.  This
server is a ZFS NAS (with GMIRROR for boot blocks) but the other
boxes which show this are using GMIRRORs for root/swap/boot (no
ZFS).

Here is what happens on the console: http://i.imgur.com/1H8JMyB.jpg

When I reset the server it appears that disks were not dismounted
cleanly ... on this ZFS box it comes back quick because ZFS is good
like that but on the other servers with GMIRROR roots rebuilding the
GMIRROR and fscking at the same time is murder on the
disk/performance until it finishes.


1. You mention "as well as VMs".  Anything under a "virtual machine" or
under a hypervisor is going to be very, very, **VERY** different than
bare metal.  So I hope the issues you're talking about above are on bare
metal -- I will assume so.


Nope, I sometimes see basically the same thing under the ESXi 5.0 hypervisor 
(and yes, the implications of something so broad worry me).  I just 
haven't been able to isolate it on one of those units that isn't 
critical.  Let's focus on this server for now though, per your suggestion 
below.




2. We need to know what version of "9.1" you're using, i.e. 9.1-RELEASE.
If you use stable/9 (RELENG_9) we need to see uname -a output (you can
hide the machine name if you want).


Sorry, this ZFS box is 9.1-R P4 (kernel built today):

FreeBSD ilos.dsn 9.1-RELEASE-p4 FreeBSD 9.1-RELEASE-p4 #6: Wed Jun 19 
15:31:12 ICT 2013 root@hostname:/usr/obj/usr/src/sys/ATEAMSYSTEMS  amd64




3. Can we please have dmesg from this machine?  The controller and some
other hardware details matter.


Sure take a look at the full log here: http://pastebin.com/k55gVVuU

This includes a boot, then a reboot as I describe (you can see it logs 
the All Buffers Synced, etc) then powering back on.




4. Does "sysctl hw.usb.no_shutdown_wait=1" help you?


Weirdly this allowed it to reboot on the first try (without needing to 
be reset), but not the second.  The "Starting background file system 
checks in 60 seconds" message appeared ... that only happens when 
something is dirty, right?


So the second try with just this I could ctrl alt del it and it 
responded .. kind of:

http://i.imgur.com/POAIaNg.jpg

Still had to reset it though.



5. Does "sysctl hw.acpi.handle_reboot=1" help you?


No change; it still responded to a ctrl alt del like above, but as 
before it still needs to be reset and comes back dirty.




6. Does "sysctl hw.acpi.disable_on_reboot=1" help you?


No change.  Same as above, ctrl alt del responds but needs a hard reset 
still.
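
For anyone repeating the three sysctl experiments above, a rough sketch of driving them one at a time.  The helper name apply_candidate and the RUN switch are my own invention; only the three sysctl names come from this thread.  By default it just prints the command so nothing changes by accident:

```sh
#!/bin/sh
# Candidate settings suggested in this thread; try one per reboot attempt.
CANDIDATES="hw.usb.no_shutdown_wait=1 hw.acpi.handle_reboot=1 hw.acpi.disable_on_reboot=1"

# apply_candidate N: print (or, with RUN=1, execute) the Nth sysctl command.
apply_candidate() {
    n=$1
    set -- $CANDIDATES
    [ "$n" -ge 1 ] && [ "$n" -le $# ] || { echo "no candidate $n" >&2; return 1; }
    shift $((n - 1))
    if [ "${RUN:-0}" = 1 ]; then
        sysctl "$1"
    else
        echo "sysctl $1"
    fi
}

# Usage: apply_candidate 1          (dry run, prints the command)
#        RUN=1 apply_candidate 1    (actually sets it, as root)
```

Testing one setting per reboot keeps it clear which change, if any, made the difference.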




7. If none of the above helps, can you please boot verbose mode and then
when the system "locks up" on "shutdown -r now" take a picture of the
VGA console?


Lots of debug on boot obviously but not much different on shutdown/hang:
http://i.imgur.com/SgzSsoP.jpg



8. Does the machine run moused(8) (check the process list please, do not
rely on rc.conf) ?


ps -auxww | grep moused reveals nothing running (which is how I have 
things set).
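
A side note on the check itself: a "ps | grep" pipeline can match its own grep process, so an exact match on the command-name column is a bit more reliable.  A small sketch, where has_proc is a made-up helper name and the input is assumed to be one command name per line, as "ps -axo comm" prints it:

```sh
#!/bin/sh
# has_proc NAME: succeed if a process whose command name is exactly NAME
# appears in a one-name-per-line process listing read from stdin.
has_proc() {
    grep -qxF -- "$1"
}

# Usage: ps -axo comm | has_proc moused && echo "moused is running"
```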





Another interesting thing is that this particular server runs slapd
(OpenLDAP) which, when it comes back up, has a "corrupted" DB
(easily fixed with db_recover, but still).  This might be because FS
commits aren't happening at the end.   I can even manually stop
slapd (service slapd stop) then run sync(8) (I assume this does
something for ZFS too) and it still comes back as hosed if I reboot
shortly after.  If I start/stop slapd it's fine.  So I feel like
there is an FS/dismount thing going on here.


sync(8) does not do what you think it does.  Please read (not skim) this
entire thread starting here:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/thread.html#16982
http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/016982.html


Grokking this now ..



Your problem is related to unclean shutdown; fix that and your issues go
away.


Yeah that is my feeling as well.




Additional information: I also have some boxes which will reboot
(ie; they don't freeze like some do at the end) but they don't
dismount cleanly either and have to rebuild both GMIRROR and fsck.
This might be a different issue, too.


Every issue needs to be handled/treated separately.


Sure, I just had run across some threads about that but will focus on 
this ZFS box (and see 

Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

2013-06-19 Thread Adam Strohl

On 6/19/2013 19:53, Adam Strohl wrote:

sync(8) does not do what you think it does.  Please read (not skim) this
entire thread starting here:

http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/thread.html#16982

http://lists.freebsd.org/pipermail/freebsd-fs/2013-April/016982.html


Grokking this now ..



Epic.  So basically "mount -u -o ro " is really what I (and probably 
everyone else) want, and the sync(8) man page needs a major overhaul + 
disclaimer (and possibly a recommendation to use "mount -u -o ro " 
instead).
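
Roughly what I have in mind, as a dry-run sketch.  The helper name remount_ro_cmds and the mount points in the usage line are placeholders, and it only prints the commands rather than running them.  As far as I know the remount is refused with "busy" on UFS while files are still open for writing, so services need to be stopped first:

```sh
#!/bin/sh
# remount_ro_cmds: read mount points on stdin, one per line, and print the
# read-only remount command for each one (nothing is executed).
remount_ro_cmds() {
    while IFS= read -r mp; do
        [ -n "$mp" ] && printf 'mount -u -o ro %s\n' "$mp"
    done
    return 0
}

# Usage: printf '/var\n/usr\n' | remount_ro_cmds
# Review the printed commands, then run them by hand as root.
```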



--
Adam Strohl
http://www.ateamsystems.com/


Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

2013-06-19 Thread Adam Strohl
   Mediasize: 131072 (128k)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e1
   State: ACTIVE
   Priority: 1
   Flags: NONE
   GenID: 0
   SyncID: 1
   ID: 1958473873
3. Name: ada2p1
   Mediasize: 131072 (128k)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e1
   State: ACTIVE
   Priority: 2
   Flags: NONE
   GenID: 0
   SyncID: 1
   ID: 1208229558
4. Name: ada3p1
   Mediasize: 131072 (128k)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e1
   State: ACTIVE
   Priority: 3
   Flags: NONE
   GenID: 0
   SyncID: 1
   ID: 3928010527
5. Name: ada4p1
   Mediasize: 131072 (128k)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e1
   State: ACTIVE
   Priority: 4
   Flags: NONE
   GenID: 0
   SyncID: 1
   ID: 442340132
6. Name: ada5p1
   Mediasize: 131072 (128k)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: NONE
   GenID: 0
   SyncID: 1
   ID: 1281187492


3. Any/all details of your gmirror setup or other things you can
think of when you set it up


The only thing is that we use GMIRROR on the partition level because we 
use GPT (which is clear from the gpart output I think).  I gmirror the 
boot partition only in this case as I use ZFS backed swap and ZFS root 
for this server.
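
Since the rebuild after an unclean shutdown is the painful part on the GMIRROR boxes, here is a small sketch for picking out anything that is not healthy from "gmirror list" output.  The helper name degraded_components is made up, and it assumes (from the listing above and from memory) that healthy components report "State: ACTIVE" and a healthy mirror reports "State: COMPLETE":

```sh
#!/bin/sh
# degraded_components: read `gmirror list` output on stdin and print the name
# of every mirror or component whose State is neither ACTIVE nor COMPLETE.
degraded_components() {
    awk '$1 == "Geom" && $2 == "name:" { name = $NF }
         $1 == "Name:" || $2 == "Name:" { name = $NF }
         $1 == "State:" && $2 != "ACTIVE" && $2 != "COMPLETE" { print name }'
}

# Usage: gmirror list | degraded_components
```

Empty output means every State line was ACTIVE or COMPLETE.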



4. Contents of /etc/fstab


>>>> cat /etc/fstab
# Device                Mountpoint      FStype  Options Dump    Pass#
# NOTE: ZFS root is not managed here
/dev/zvol/zroot/swap    none            swap    sw      0       0


5. Contents of /boot/loader.conf


>>>> cat /boot/loader.conf
geom_mirror_load="YES"
zfs_load="YES"
vfs.root.mountfrom="zfs:zroot"
aio_load="YES"
if_lagg_load="YES"



6. Contents of /etc/rc.conf


#  Don't run FS check and let apps start
#
fsck_y_enable="YES"
background_fsck="NO"

#  Power management enables SpeedStep and TurboBoost
#
powerd_enable="YES"
powerd_flags="-a hiadaptive"

#  Networking
#
hostname="hostname"
defaultrouter="xxx.xxx.xxx.3"
# -- LACP
ifconfig_em0="up"
ifconfig_em1="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport em0 laggport em1 xxx.xxx.xxx.212/24"

#  Services
#
sshd_enable="YES"
smartd_enable="YES"
samba_enable="YES"
zabbix_agentd_enable="YES"
zfs_enable="YES"
apcupsd_enable="YES"
slapd_enable="YES"
slapd_flags='-h "ldapi://%2fvar%2frun%2fopenldap%2fldapi/ ldap://xxx.xxx.xxx.212/ ldap://127.0.0.1/"'

slapd_sockets="/var/run/openldap/ldapi"

#  Time Stuff
#
ntpd_enable="YES"
ntpd_sync_on_start="YES"

#  Mail
#
postfix_enable="YES"
sendmail_enable="NO"
sendmail_submit_enable="NO"
sendmail_outbound_enable="NO"
sendmail_msp_queue_enable="NO"


7. Contents of /etc/sysctl.conf


kern.maxfiles=25600
kern.maxfilesperproc=16384
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=65536


8. Contents of /sys/amd64/conf/ATEAMSYSTEMS


See above




5. Does "sysctl hw.acpi.handle_reboot=1" help you?


No change; it still responded to a ctrl alt del like above, but as
before it still needs to be reset and comes back dirty.



6. Does "sysctl hw.acpi.disable_on_reboot=1" help you?


No change.  Same as above, ctrl alt del responds but needs a hard
reset still.


Okay, thank you.


7. If none of the above helps, can you please boot verbose mode and then
when the system "locks up" on "shutdown -r now" take a picture of the
VGA console?


Lots of debug on boot obviously but not much different on shutdown/hang:
http://i.imgur.com/SgzSsoP.jpg


It looks to me like the ACPI layer is still actively working at the time
"all buffers are synced", meaning the actual reboot phase itself never
happens.  This to me starts to smell of an ACPI problem, but I do not
have the skill set to debug this, and I'm also grasping at straws.
There are many things that happen during that phase of operation,
particularly the "USB shutdown" phase.


Yeah.  Originally I even had my UPS (APC) disconnected; the only USB 
device (via a port -- I realize there might be virtual ports on the 
motherboard) was a Dell keyboard.




But it all depends on your kernel config, which I've now asked for.


Yeah

--
Adam Strohl
http://www.ateamsystems.com/


Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

2013-06-19 Thread Adam Strohl

On 6/19/2013 21:21, Steven Hartland wrote:

You still need to test if stable/9 fixes your issue though, as otherwise
you don't know if the issue you're seeing has already been fixed, and if
it's the old known ZFS vfs hang on shutdown, it has.


Thanks Steve, understood, but that's probably not going to happen with 
this box.  I can reboot this thing but it's our NAS and not a test bed.  
This problem on this machine isn't a big deal because it's a server and 
not rebooted often (and easy to bring back).  I was more hoping it would 
let me easily test solutions to the issue, since the other servers 
showing the issue are in client production, bearing in mind that the VMs 
not using ZFS also show a similar/identical issue.  My gut says it 
appeared in/with 9.1 (we never saw this with 9.0 servers).  It is also 
possible this is a different issue from those other servers and VMs.


How far away is 9.2? ;-P

Depending on how things go with Jeremy I'll probably have to wait this 
out unless I can get a test machine or VM where I can reproduce the 
issue AND upgrade it to -STABLE (again assuming it's even the same issue).



Re: shutdown -r / shutdown -h / reboot all hang and don't cleanly dismount

2013-06-19 Thread Adam Strohl

On 6/19/2013 22:04, Jeremy Chadwick wrote:

On Wed, Jun 19, 2013 at 09:15:18PM +0700, Adam Strohl wrote:

On 6/19/2013 20:35, Jeremy Chadwick wrote:


I've snipped out portions which aren't relevant at this point in the
convo.  I'm trying to be terse as much as possible here (honest).

To recap for readers/mailing list:

- Adam sees the same behaviour on systems on bare metal, as well as
   FreeBSD guests running under VMware ESXi 5.0 hypervisor.  However,
   as I stated on the list just yesterday about "lock-ups on shutdown",
   every situation may be different and there is a well-established
   history of this problem on FreeBSD where each root cause (bugs)
   were completely different from one another.

- The system we're discussing at this point in the thread is on
   bare metal -- specifically an Asus P8B-X motherboard, with BIOS
   version 6103, driven entirely by on-board Intel AHCI (not BIOS-level
   RAID).

- Adam runs 9.1-RELEASE because of business needs pertaining to
   freebsd-update and binary updates.  (I ask more about this for
   benefits of readers below, however -- because this situation comes
   up a lot and I want to know what real-world admins do)



This is all correct.


Thanks.  I was mainly interested in the storage controller being used
(in this case ahci(4)) and the disks being used (notorious ST3000DM001,
known for excessively parking heads).


Yeah, it was not my first choice but then again ... RAIDZ-2 :)  The HD
supply chain here (Thailand) is weird considering how many are made
here (and yet can't be bought).  Smartd screams about them possibly
needing a firmware update (they don't according to Seagate).  Had no
issues aside from a failure a month or so ago (it's an HD ... it
happens).


Absolutely understood -- and FYI, in case you need backup, your thought
process/conclusion here is spot on (re: "it's a MHDD, failures happen").


Indeed :-D



Irrelevant to your shutdown problem: as for smartmontools bitching about
the firmware: no vendors disclose what actual changes go into their
drive firmware updates (vendors if you are reading this: I will have
your souls...), so I have to read a bunch of end-user forums where
nobody knows what they're talking about, and then of course find this
"highly educational" *cough* article from Adaptec:

http://ask.adaptec.com/app/answers/detail/a_id/17241/~/known-issues-with-seagate-barracuda-7200.14-desktop-drives



Yeah I agree .. I tried to firmware upgrade them when I was building the 
system but it said they didn't qualify when using the boot ISO.  I just 
checked the site and it says no firmware update available too when using 
their search by serial # tool.   At this point I'm leery about updating 
given that I've got data on it anyway.  I do occasionally (maybe once a 
week or two and they're in the same room as me/my office) hear one parking.


I see nothing wrong in smart though, no dmesg errors and have noticed no 
issues with the array and it bench tests at around 850 MB/sec.  Too bad 
10 Gbit equipment isn't cheaper.


Also when I bought the 6 for this array I got a 7th as a cold spare :P


The problem here is that there have been *so many* firmware bugs with
Seagate's drives in the past 2 years or so that it's impossible for me
to know which fixes what.  You buy what you buy because that's what you
buy, and that's cool -- but I avoid their stuff like the plague.


Yeah.  I'd prefer WD myself but this place is swimming in "green" and 
now "red" drives.  uhgl.


<< Snipping out the unrelated parts ... >>


Can you try removing VESA and SC_PIXEL_MODE please?  I know that
sounds crazy ("what on earth would that have to do with it?"), but
please try it.  I can explain the justification if need be -- I'm being
extra paranoid of something that got discovered here on -stable only a
few days ago.  It's a stretch, but I can see potential relevance.  I can
provide details/links later.


No change unfortunately.




4. Does "sysctl hw.usb.no_shutdown_wait=1" help you?


Weirdly this allowed it to reboot on the first try (without needing
to be reset), but not the second.


I'm not surprised.  Please re-try with stable/9; Hans has been constantly
working on the USB stack and fixing major bugs.


Got it but probably not going to go this route as it means no more
binary upgrades.  While I can reboot it, it is the office NAS here
and so 'testing out' -STABLE I think probably isn't going to happen.


I understand.  I have a question relating to this below.


Place background_fsck="no" in /etc/rc.conf.  If the machine does not
have a clean filesystem on boot-up, you'll know because the system will
immediately begin fsck (in the foreground actively).  You'll recognise
that output if it happens, trust me.


Preaching to the choir, we se