Re: [CentOS] weird load values

2007-12-06 Thread J. Potter


> As mentioned before, IO could give such strange results. I suggest launching
> dstat with logging to a file, and analyzing the file afterwards.


Thanks, much appreciated!

This has yielded some interesting data. Below I've included samples from
a few seconds before and after one of these events.


	system interrupts per second: Note the jump of roughly two orders of
magnitude, to almost 200,000 interrupts per second.

2907
6714
1371
194218
2456
2907

	network received: Note the network received ramps up over 5 seconds,
peaks at ~50x background, and ramps back down in about 3 seconds. The
peak is in the same sample as the interrupt spike above.

108784
389794
1070850
4843956
352226
353102
96392
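
For anyone wanting to pull a spike like this out of a longer dstat log, here is a minimal sketch. It assumes the samples were saved as CSV (I believe dstat's option for that is `--output FILE`); the function name `find_spikes`, the column number and the factor of 20 are illustrative choices, not anything dstat provides.

```shell
#!/bin/sh
# find_spikes COLUMN FACTOR: read comma-separated samples on stdin and print
# "line-number value" for every sample more than FACTOR times the previous one.
find_spikes() {
    awk -F, -v col="$1" -v factor="$2" '
        # Skip header or text rows: only purely numeric cells count as samples.
        $col !~ /^[0-9]+(\.[0-9]+)?$/ { next }
        seen && prev > 0 && $col + 0 > factor * prev { print NR, $col }
        { prev = $col + 0; seen = 1 }
    '
}

# Interrupt counts from the samples above, one per line (so column 1).
printf '%s\n' 2907 6714 1371 194218 2456 2907 | find_spikes 1 20
# prints: 4 194218
```

Against a real log, the column number has to match wherever the `int` field lands in the CSV, which depends on the dstat options used.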


Everything else looks sane -- there's enough RAM, nothing's being
swapped out, etc. This is on a private-network server that has a load
balancer in front of it, so if it's network related, it isn't random
misdirected traffic.


Has anyone seen this sort of behavior before? What was the cause? What  
should I do to figure out how to keep the load averages from flipping  
out of control?


(This isn't something as lame as a counter rolling over somewhere  
internal to the kernel, is it? Wouldn't think so, but thought to ask.  
Running 2.6.18-8.1.8.el5. We could reboot to run 2.6.18-8.1.15 if  
that'd be a potential fix.)


Thanks for any insight!

best,
Jeff




total cpu usage dsk/total       net/total         system
usr     sys     read   writ     recv     send     int     csw
10.5    3.25    0      409600   108784   72286    2907    20376
3.99    2.993   0      319488   389794   661170   6714    23941
0.25    0.25    0      720896   1070850  1189720  1371    16648
9.167   90.442  12288  1122304  4843956  38       194218  55433
56.931  16.832  0      1273856  352226   334506   2456    12844
46.25   20      0      454656   353102   384496   2907    20631
24.25   1.25    0      3260416  96392    72316    1342    17307
23.25   2.25    0      610304   91086    71194    1458    17584
10.973  1.496   0      0        84192    46276    1349    18135
0       0       0      94208    71892    33304    1220    16979
0.25    0.25    0      126976   71184    47576    1268    16973
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] Clustering MySQL

2007-12-11 Thread J. Potter



> ...  But I saw a presentation at the Boston MySQL Meetup.com
> group about how to do master-master in mysql 5.  We're about to
> implement this in the next few weeks.  ...


I've run into issues with crash recovery in master-master mode:

 - master A is at position X
 - master B, replicating from A, gets to position X
 - master A syncs to its filesystem that it's at position X

 - master A receives some inserts, and is now at position Y
 - master B, replicating from A, gets to position Y
 - master A crashes before the position gets synced to filesystem
 - master A gets rebooted, recovers from innodb log, but has itself  
only marked at position X
 - master B requests position Y from master A, but that position  
doesn't exist yet, so replication breaks.
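
The scenario above depends on master A's on-disk binary log lagging behind what B has already read. One commonly cited way to narrow that window (not something raised in this thread, so treat it as an assumption) is to flush the binary log and the InnoDB log at every commit with `sync_binlog=1` and `innodb_flush_log_at_trx_commit=1`, at a cost in write throughput. A minimal sketch that checks an option file for those two settings; the function name `check_durability` is made up for illustration.

```shell
#!/bin/sh
# check_durability FILE: report whether the two durability options are set to 1
# in a MySQL option file. Only plain "name = value" lines with underscores
# are understood; the dashed spelling (sync-binlog) is not handled.
check_durability() {
    for opt in sync_binlog innodb_flush_log_at_trx_commit; do
        val=$(sed -n "s/^[[:space:]]*$opt[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$1" | tail -n 1)
        if [ "$val" = "1" ]; then
            echo "$opt: ok"
        else
            echo "$opt: not set to 1 (found '${val:-nothing}')"
        fi
    done
}

# /etc/my.cnf is the stock CentOS location; adjust if yours differs.
if [ -r /etc/my.cnf ]; then check_durability /etc/my.cnf; fi
```

This only inspects the configuration. It does not repair a pair that has already diverged.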


Perhaps someone here knows the proper recovery procedure at this point?

best,
Jeff


Re: [CentOS] Clustering MySQL

2007-12-11 Thread J. Potter



>>  - master A is at position X
>>  - master B, replicating from A, gets to position X
>>  - master A syncs to its filesystem that it's at position X
>>
>>  - master A receives some inserts, and is now at position Y
>>  - master B, replicating from A, gets to position Y
>>  - master A crashes before the position gets synced to filesystem
>>  - master A gets rebooted, recovers from innodb log, but has itself
>> only marked at position X
>>  - master B requests position Y from master A, but that position
>> doesn't exist yet, so replication breaks.
>>
>> Perhaps someone here knows the proper recovery procedure at this point?

> If this were master-slave, I'd probably do an LVM Snapshot and get a
> fresh copy of the master db.  The same could be done for
> master-master.


I'm not sure this would work, since some data will have been inserted
on master B as well. I.e., with master-master, a one-way sync won't
work. The only recovery option that I can see is to destroy Master A,
and copy Master B -- either via an LVM snapshot or shutdown, sync,
startup -- to create a new Master A.  Maybe this is what you're
suggesting?


Is there a better way?

best,
Jeff


Re: [CentOS] Clustering MySQL

2007-12-12 Thread J Potter


> After all the discussions regarding MySQL-style clustering (multi-master
> etc), what about a "classic" HA cluster for MySQL? Since the
> OP mentioned high availability, wouldn't the simplest solution be
> failover clustering (ie. single master with failover, shared
> storage, fenced nodes etc) via CentOS CS?


We have this running in one setup, and it's been working (mostly) fine:

- master-master setup
- heartbeat creating a virtual IP
- all mysql clients use the virtual IP

So, effectively, it's a master-master setup where only 1 master is  
ever receiving traffic, and if that master fails, it'll automatically  
fail-over to the standby master.


The benefit of doing master-master in this scenario is that there's no  
real recovery process needed for restoring redundancy -- when the  
failed master comes back online, it catches up with the current  
master. (Make sure auto-fallback is off in heartbeat.)


The only problem I've seen is that a crashed node may not be able to  
replicate correctly, if its on-disk log position gets out of sync with  
what the other node has. It seems if this happens one has to do a real  
sync (lock tables, lvm snapshot, unlock tables, if you're willing to  
give up the storage needed for the lvm snapshot; or rsync, shutdown  
and re-rsync, startup).
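
A minimal sketch of the lock / snapshot / unlock variant. The read lock only lasts as long as the client session that took it, so the snapshot has to be created from inside that same session; as far as I know the mysql client's `\!` command runs a shell command for that purpose. The volume group, volume name, snapshot name and size below are placeholders, and `snapshot_sql` is a made-up helper that only prints the statements.

```shell
#!/bin/sh
# snapshot_sql VG LV SNAP SIZE: print the statements to feed to ONE mysql client
# session, so the read lock is still held while the LVM snapshot is created.
snapshot_sql() {
    cat <<EOF
FLUSH TABLES WITH READ LOCK;
SHOW MASTER STATUS;
\! lvcreate --snapshot --size $4 --name $3 /dev/$1/$2
UNLOCK TABLES;
EOF
}

# Dry run: show what would be executed.
snapshot_sql vg0 mysql mysql_snap 5G
# To run it for real (as root, on the node being copied):
#   snapshot_sql vg0 mysql mysql_snap 5G | mysql -u root -p
```

The `SHOW MASTER STATUS` output is worth saving, since the rebuilt node needs that log file and position to start replicating from the right place.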


best,
Jeff


[CentOS] RPM for perl-svn-notify?

2008-02-06 Thread J. Potter


Hi List,

Is it possible to get an rpm built and added into the plus or dag
repos for the perl module svn-notify? (Note: not the same as
svn-notify-mirror.)


I know it's been brought up before that perl's internal CPAN
build/install can cause serious conflicts with the rpm-based approach; if
there are other / better ways of doing this in a standard fashion,
please let me know.


Thanks,
Jeff


Re: [CentOS] RPM for perl-svn-notify?

2008-02-06 Thread J. Potter

> ... If I can't find an RPM for a Perl module on one of the third-party
> repositories, I usually use cpanflute2 to build an RPM, then
> install that.  That way RPM knows all about the module and can
> handle it appropriately. ...



Thanks, Jay!

Mostly there. For some reason, the resulting rpm places its files
under /var/tmp, instead of in their proper locations on the system:


rpm -ql perl-SVN-Notify
/usr/share/doc/perl-SVN-Notify-2.66
/usr/share/doc/perl-SVN-Notify-2.66/Changes
/usr/share/doc/perl-SVN-Notify-2.66/README
/var/tmp/perl-SVN-Notify-2.66-8-root/usr/bin/svnnotify
/var/tmp/perl-SVN-Notify-2.66-8-root/usr/lib/perl5/site_perl/5.8.8/SVN/Notify.pm
/var/tmp/perl-SVN-Notify-2.66-8-root/usr/lib/perl5/site_perl/5.8.8/SVN/Notify/Alternative.pm

...

Did I miss a setting somewhere?

-Jeff


On CentOS 5 x86_64:

	yum -y install perl-RPM-Specfile perl-IO-Zlib rpm-build perl-rpm-build-perl perl-Module-Build perl-HTML-Parser
	wget 'http://search.cpan.org/CPAN/authors/id/D/DW/DWHEELER/SVN-Notify-2.66.tar.gz'
	gunzip SVN-Notify-2.66.tar.gz
	cpanflute2 --name=SVN-Notify --version=2.66 SVN-Notify-2.66.tar --buildall
	rpm -Uvh perl-SVN-Notify-2.66-8.src.rpm


Re: [CentOS] RPM for perl-svn-notify?

2008-02-07 Thread J. Potter


Thanks, Jay! That did it.

For the record, here is what is needed to install SVN-Notify on CentOS 5:

   yum -y install perl-RPM-Specfile perl-IO-Zlib rpm-build perl-rpm-build-perl perl-Module-Build perl-HTML-Parser
   wget 'http://search.cpan.org/CPAN/authors/id/D/DW/DWHEELER/SVN-Notify-2.66.tar.gz'
   gunzip SVN-Notify-2.66.tar.gz
   cpanflute2 --name=SVN-Notify --version=2.66 SVN-Notify-2.66.tar
   rpm -Uvh perl-SVN-Notify-2.66-8.src.rpm
   vi /usr/src/redhat/SPECS/perl-SVN-Notify.spec   # remove options after 'make pure_install'
   rpmbuild -bb --target=noarch /usr/src/redhat/SPECS/perl-SVN-Notify.spec
   rpm -Uvh /usr/src/redhat/RPMS/noarch/perl-SVN-Notify-2.66-8.noarch.rpm


best,
Jeff



[CentOS] limit number of per-vhost or per-user cgi processes?

2008-02-14 Thread J. Potter


Hi List,

Is there a way to limit the number of cgi processes Apache's suExec  
will fork for a given vhost or given user?  (either solution is fine)


suExec doesn't honor the /etc/security/limits.conf nproc value.
mod_throttle seems to be dead, and I can't figure out whether SELinux
might be able to manage this (although I would rather not flip SELinux
from permissive to enforcing).


What are others doing?

Thanks!

-Jeff


[CentOS] Is there a fix for the "sqlite cache needs updating" error?

2008-02-19 Thread J. Potter


I've been seeing the below message from yum whenever the repo has an  
update (CentOS 5):



/etc/cron.daily/yum.cron:

** Message: sqlite cache needs updating, reading in metadata


Googling a bit, it looks like others have seen this happen as well.   
The solutions, when I've found them, have been along the lines of  
"send output to /dev/null" or "edit this file", but nothing that feels  
like the "right" fix.


Is there a general solution to this? Or should we just sit tight for  
redhat bug #429689 to be fixed (5.2?).


-Jeff


Re: [CentOS] Is there a fix for the "sqlite cache needs updating" error?

2008-02-19 Thread J. Potter



>> ** Message: sqlite cache needs updating, reading in metadata
>> ... Is there a general solution to this? Or should we just sit
>> tight for redhat bug #429689 to be fixed (5.2?).

> this is not really a bug ...
> it is just verbose output that causes an e-mail to be sent.


It seems to be the default, though, and when managing 150+ servers,  
this ends up being a lot of email!


Is there a simple way to flip that message off?



> I do not see how this issue is at all related to RH bug 429689 ???


Typo; meant 429869; as Akemi suggested.

-Jeff


[CentOS] Newer MySQL in centos-plus?

2008-03-10 Thread J. Potter



Hi List,

I'm noticing that the CentOS-plus repo for 4.6 has MySQL 5.0.54 in it,  
but the CentOS 5.1 repo does not have a newer rpm, leaving the  
"newest" easily-available version as the vendor-provided mysql 5.0.22.


Is there a reason for this? We're wanting to try a newer MySQL under  
CentOS 5.1 -- do any of the standard repos include such an RPM?


Thanks!
-Jeff


[CentOS] how does one remove bond1?

2007-10-20 Thread J. Potter


Hi List,

We're using bonding to create bond0 with 2 NICs, and noticing that  
CentOS 5 2.6.18-8.1.14.el5 (and presumably older) creates bond1 as well.


I'd like to remove bond1 from the system, so that our monitoring  
scripts don't pick it up, except for those machines that actually do  
have a bond1.


So... how does one remove bond1?

Thanks! Misc info below of configuration and what doesn't work.

best,
Jeff


% cat /etc/modprobe.conf | grep bond
alias bond0 bonding
options bond0 max_bonds=2 miimon=100 mode=1

% find /etc/ -iname "*bond1*" | wc -l
0

% grep -r bond1 /etc/ 2> /dev/null | wc -l
0

% cat /sys/class/net/bonding_masters
bond0 bond1

% echo "bond0" > /sys/class/net/bonding_masters
-bash: echo: write error: Operation not permitted

% id
uid=0(root) gid=0(root)  ...



Re: [CentOS] how does one remove bond1?

2007-10-22 Thread J. Potter



>> So... how does one remove bond1?

> Shouldn't this do the trick?
> rm /etc/sysconfig/network-scripts/ifcfg-bond1


One would hope... but that file doesn't exist. In fact, there is no
file under /etc whose name contains "bond1", nor any file under /etc
whose contents include "bond1".


???

best,
Jeff




Re: [CentOS] how does one remove bond1?

2007-10-22 Thread J. Potter



> Remove the max_bonds=2 from /etc/modprobe.conf


Yup, doing that removed the mystery "bond1" under /proc/net/bonding -  
thanks!
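
For machines where reloading the bonding module isn't convenient, a spare bond can also be dropped at runtime. As far as I know, the bonding driver's sysfs file expects a `+name` or `-name` command, which would explain why writing a bare name was rejected. A minimal dry-run sketch; `extra_bonds` is a made-up helper and `bond0` is assumed to be the only bond worth keeping.

```shell
#!/bin/sh
# extra_bonds MASTERS KEEP...: print every bond listed in MASTERS (the contents
# of bonding_masters) that is not in the KEEP list.
extra_bonds() {
    masters=$1
    shift
    for bond in $masters; do
        keep=no
        for wanted in "$@"; do
            if [ "$bond" = "$wanted" ]; then keep=yes; fi
        done
        if [ "$keep" = no ]; then echo "$bond"; fi
    done
    return 0
}

MASTERS_FILE=/sys/class/net/bonding_masters
if [ -r "$MASTERS_FILE" ]; then
    for extra in $(extra_bonds "$(cat "$MASTERS_FILE")" bond0); do
        # Dry run: print the command instead of executing it.
        echo "echo -$extra > $MASTERS_FILE"
    done
fi
```

A change made this way does not survive a module reload or reboot, so the modprobe.conf fix is still the one that sticks.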


best,
Jeff


[CentOS] weird load values

2007-12-05 Thread J. Potter


Hi List,

I'm stumped by this:

load average: 10.65, 594.71, 526.58

We're monitoring load every ~3 minutes. It'll be fine (i.e. something  
like load average: 2.14, 1.27, 1.03), and then in a single sample,  
jump to something like the above. This seems to happen once a week or  
so on a few different servers (all running in a similar application).  
I've never seen the 1 minute sample spike as high as the 5 or 15  
minute samples.


Seeing as that last value is a 15 minute period, well, it doesn't seem  
possible that one can have a 500+ 15 minute sample without having  
observed a spike in the 5 minute sample at least 5 minutes before.
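
To put a number on that: Linux updates each load average every five seconds as an exponentially damped average of the count of runnable and uninterruptible tasks, so a steady count n moves the figure from a baseline b toward n as b + (n - b)(1 - e^(-t/T)), with T the window in seconds. The sketch below is an illustration under that model, not kernel code; it inverts the formula to ask how many tasks would have to be counted for an entire 3-minute monitoring gap to lift the 15-minute figure from about 1 to 526.58.

```shell
#!/bin/sh
# runnable_needed TARGET WINDOW_SECONDS ELAPSED_SECONDS [BASELINE]
# Print the steady task count needed to move a load average with the given
# window from BASELINE (default 1) to TARGET within ELAPSED_SECONDS.
runnable_needed() {
    awk -v target="$1" -v window="$2" -v elapsed="$3" -v base="${4:-1}" 'BEGIN {
        gain = 1 - exp(-elapsed / window)   # fraction of the gap closed
        printf "%.0f\n", base + (target - base) / gain
    }'
}

# 15-minute window (900 s), 180 s between monitoring samples.
runnable_needed 526.58 900 180
```

The answer comes out in the thousands, and a count that large sustained for three minutes would drag the 1-minute figure almost all the way up with it, which is not what the sample shows.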


Also, there aren't 500+ processes on these systems -- it's typically  
around 100 total processes (ps auxw | wc -l). (Is there a way to see  
the total count of kernel-level threads?)
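
On the thread-count question, a minimal sketch: the fourth field of /proc/loadavg has the form running/total, where total is the number of kernel scheduling entities (processes and threads) that currently exist. `total_threads` is a made-up helper name.

```shell
#!/bin/sh
# total_threads LINE: extract the total from the "running/total" field of a
# /proc/loadavg line such as "0.92 200.91 371.30 1/312 4567".
total_threads() {
    echo "$1" | awk '{ split($4, parts, "/"); print parts[2] }'
}

if [ -r /proc/loadavg ]; then
    total_threads "$(cat /proc/loadavg)"
fi
# Cross-check by listing one line per thread; this is heavier, and the count
# includes the header line:
#   ps -eLf | wc -l
```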


Thoughts?

best,
Jeff

Linux someHostName 2.6.18-8.1.8.el5 #1 SMP Tue Jul 10 06:39:17 EDT  
2007 x86_64 x86_64 x86_64 GNU/Linux


CentOS release 5 (Final)

 09:31:15 up 65 days, 17:45,  2 users,  load average: 0.92, 200.91,  
371.30




[CentOS] count of active tcp sockets?

2008-04-16 Thread J Potter


Hi List,

Is there an easy way to get a count of the number of active socket  
connections, or even better, number of socket connections in the  
time_wait state? (Something lightweight... under /proc/sys/net/ipv4/?  
I'd like to avoid the impact of listing out all the connections a-la  
netstat.)


Thanks!
-Jeff


Re: [CentOS] count of active tcp sockets?

2008-04-16 Thread J Potter



> netstat -an|grep TIME_WAIT|wc  ?


I need to avoid anything that lists out all the connections -- the  
above would take too long if there are tens of thousands of connections.


I'm hoping there's a proc entry that has a summary count of the  
current number of connections?
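
One candidate is /proc/net/sockstat, which carries summary counters rather than a per-connection listing. A minimal sketch, assuming its TCP line has the usual `TCP: inuse N orphan N tw N alloc N mem N` layout, where `tw` is the number of sockets in TIME_WAIT; `tcp_stat` is a made-up helper name.

```shell
#!/bin/sh
# tcp_stat FIELD [FILE]: print one counter from the "TCP:" line of sockstat,
# e.g. "inuse" or "tw" (sockets in TIME_WAIT).
tcp_stat() {
    awk -v want="$1" '$1 == "TCP:" {
        # After the label, the line is a sequence of name/value pairs.
        for (i = 2; i < NF; i += 2)
            if ($i == want) print $(i + 1)
    }' "${2:-/proc/net/sockstat}"
}

if [ -r /proc/net/sockstat ]; then
    echo "time_wait: $(tcp_stat tw)  in use: $(tcp_stat inuse)"
fi
```

Reading this file costs about the same no matter how many connections exist, which is the property asked for here.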


-Jeff



[CentOS] CentOS 5.2 - /usr/sbin/httpd: double free or corruption (!prev) error

2008-07-11 Thread J Potter


Hi List,

We've been seeing the following error on a CentOS 5.2 x86_64 /  
2.6.18-92.1.6.el5 install (from /var/log/httpd/error):


	*** glibc detected *** /usr/sbin/httpd: double free or corruption (!prev): 0x2ad8ebed2d80 ***
	[Thu Jul 10 19:12:19 2008] [notice] child pid 5261 exit signal Segmentation fault (11)
	[Thu Jul 10 19:12:48 2008] [notice] child pid 12180 exit signal Segmentation fault (11)

...


The following packages are installed:
glibc-2.5-24
httpd-2.2.3-11.el5_1.centos.3
mysql-server-5.0.45-7.el5
mysql-5.0.45-7.el5
php-gd-5.1.6-20.el5
php-mysql-5.1.6-20.el5
php-common-5.1.6-20.el5
php-pdo-5.1.6-20.el5
php-cli-5.1.6-20.el5
php-5.1.6-20.el5
php-imap-5.1.6-20.el5


If we remove the php-pdo / php-mysql packages, then the crashes stop.  
But, we need those packages, so that's not a solution.


Any ideas as to what might be causing this?

We've tried the obvious things; the only thing that's not "vanilla"  
here are a few settings in /etc/php.ini (having to do with max  
execution time and max memory). The php code that's being run does  
have its share of php warn/notices:


PHP Notice: Undefined index: d in [snip] on line 1699
PHP Notice: Undefined index: verbose in [snip] on line 1819
PHP Warning: Invalid argument supplied for foreach() in [snip] on line 1695


So I would not be surprised if it is tickling something out of the  
ordinary, but it should still not be causing a segfault!


Any thoughts as to what's going on here, and how to fix it? I'm  
tempted to install php 5.2.6 from utterramblings repo, to see if that  
works around this.


Thanks!



Re: [CentOS] Virtual NICs (aliases like eth0:1) won't come up after reboot

2008-11-14 Thread J Potter


I found under CentOS 4 a few years ago that the OS would only bring up  
virtual interfaces starting with 0:0 and increasing sequentially -- if  
there was a gap, it would stop at that gap point.


I.e.:
Good: eth0, eth0:0, eth0:1, eth0:2...
Bad: eth0, eth0:0, eth0:2  (system would only start 0 and 0:0)
Bad: eth0, eth0:1, eth0:2  (system would only start 0)

This last case looks like yours -- in your config, after eth0, you
started with eth0:1.
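
A minimal sketch for spotting such a gap before rebooting. It takes the alias numbers configured for one device and reports the first missing one; `first_alias_gap` is a made-up helper, and the behavior it guards against is only what I observed on CentOS 4.

```shell
#!/bin/sh
# first_alias_gap N...: given the alias numbers configured for one device
# (the N in eth0:N), print the first missing number counting up from 0.
# Prints nothing when the sequence has no gap.
first_alias_gap() {
    max=-1
    for n in "$@"; do
        if [ "$n" -gt "$max" ]; then max=$n; fi
    done
    i=0
    while [ "$i" -le "$max" ]; do
        found=no
        for n in "$@"; do
            if [ "$n" -eq "$i" ]; then found=yes; fi
        done
        if [ "$found" = no ]; then echo "$i"; return 0; fi
        i=$((i + 1))
    done
    return 0
}

first_alias_gap 0 1 2   # sequential, prints nothing
first_alias_gap 0 2     # prints 1
first_alias_gap 1 2     # prints 0
```

The numbers can be taken from the ifcfg-eth0:N file names under /etc/sysconfig/network-scripts.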


-Jeff



Re: [CentOS] squid HA failover?

2009-02-06 Thread J Potter

Assuming server A with IP M, server B with IP N, and DNS entry X  
currently pointing at IP M:

1) Add heartbeat on servers A and B, with heartbeat managing a new IP  
address O (this is your virtual IP -- nothing to do with VRRP, that's  
for your routers to failover your gateway).
2) If you want active-active load sharing on servers A and B, install
pound on both server A and B, and in your pound config, point pound to
IP M and IP N (same pound config on both servers).
3) Update your DNS to point entry X to IP O.

If you want active-standby on your squids, then have both squids bind  
to 0.0.0.0 and you're done. The standby server will have squid  
listening to requests, but since standby won't have the VIP O, it'll  
just sit there. In this setup, heartbeat is only managing the VIP, but  
no services.

If you want active-active on your squids, then have squid on server A  
bind to only IP M, squid on server B bind to only IP N, and pound  
configured to bind to only IP O. Heartbeat will need to be configured  
to start pound on failover (since IP O will only exist on one box at a  
time, so pound can't bind to it unless the interface is up).

Make sure you test the case where squid (or pound in active-active) on  
the server running VIP O crashes.

-Jeff


Re: [CentOS] squid HA failover?

2009-02-06 Thread J Potter

> Yes, I normally want one server handling the full load to maximize the
> cache hits.  But the other one should be up and running.

So, active/standby. Easier config. Squid won't even be aware that  
heartbeat is running; just keep it running on both servers all the time.

See my install notes at bottom.


> And, because
> this is already in production as a mid-level cache working behind a
> loadbalancer, I would like to be able to keep answering on the M or N
> addresses, gradually reconfiguring the web servers in the farm to use
> address O instead of the loadbalancer VIP address.

Go for it. It'll work fine. You could get fancy and switch primary  
interface from M to O, and make M the VIP. Depends if you can accept  
the ~30 seconds of downtime and your tolerance for risk.


> I got the impression
> from some of the docs/tutorials that it was a bad idea to access the
> M/N addresses directly.

In your case, it's only "bad" if M/N go down.


> Or does that only apply to services where it is
> important to only have one instance alive at a time or where you are
> replicating data?

Depends on the service and replication setup. If you had master/slave  
MySQL and connected to the slave, you'd see amnesia on the master.  
(That setup wouldn't allow for fail-back though, so, probably wouldn't  
see it.) Things like drbd protect you from concurrent mounting.


> Even after converting all of the farm to use the new
> address, I'll still want to be able to monitor the backup server to be
> sure it is still healthy.

Yup. And you'll want to monitor the active node and force a failover if
the service fails. My config below doesn't take this into
consideration; maybe other list lurkers can correct it to be better.
The quick and dirty fix is for each node to check whether it is active
and, if it is and squid is not running, to run 'service heartbeat
restart' to fail over to the other node. (I.e. a once-a-minute cron job.)
Not as pretty as it should be.
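
A minimal sketch of that cron check. The VIP is the same placeholder used in the notes, and the way each condition is tested (`ip -o addr show` for the VIP, `pgrep -x squid` for the service) is an assumption to adapt, not something heartbeat provides.

```shell
#!/bin/sh
VIP="1.2.3.4"   # placeholder, same VIP as in the install notes

# Each check is its own function so it can be swapped for a site-specific test.
has_vip()       { /sbin/ip -o addr show 2>/dev/null | grep -qF " $VIP/"; }
squid_running() { pgrep -x squid >/dev/null 2>&1; }
failover()      { service heartbeat restart; }

# Fail over only when this node holds the VIP but squid is not running.
check_once() {
    if has_vip && ! squid_running; then
        failover
    fi
}

# Nothing happens unless the script is called with --run (e.g. from cron).
if [ "${1:-}" = "--run" ]; then check_once; fi
```

The standby node never holds the VIP, so the same script can be installed unchanged on both servers.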

best,
Jeff


Replace 1.2.3.4 with your VIP ip address, and a.example.com and  
b.example.com with your FQDN hostnames.

server A ("a.example.com"):
yum -y install heartbeat
chkconfig --add heartbeat
chkconfig --level 345 heartbeat on

echo 'a.example.com IPaddr::1.2.3.4' > /etc/ha.d/haresources
echo "node a.example.com" > /etc/ha.d/ha.cf
echo "node b.example.com" >> /etc/ha.d/ha.cf
echo "udpport 9000" >> /etc/ha.d/ha.cf
echo "bcast bond0" >> /etc/ha.d/ha.cf
echo "auto_failback off" >> /etc/ha.d/ha.cf
echo "logfile /var/log/ha-log" >> /etc/ha.d/ha.cf
echo "logfacility local0" >> /etc/ha.d/ha.cf
echo "auth 1" > /etc/ha.d/authkeys
echo "1 crc" >> /etc/ha.d/authkeys
chmod go-rwx /etc/ha.d/authkeys

server B ("b.example.com"):
yum -y install heartbeat
chkconfig --add heartbeat
chkconfig --level 345 heartbeat on

echo 'a.example.com IPaddr::1.2.3.4' > /etc/ha.d/haresources # yes, "a" again - that's the default host to run the service
echo "node a.example.com" > /etc/ha.d/ha.cf
echo "node b.example.com" >> /etc/ha.d/ha.cf
echo "udpport 9000" >> /etc/ha.d/ha.cf
echo "bcast bond0" >> /etc/ha.d/ha.cf
echo "auto_failback off" >> /etc/ha.d/ha.cf
echo "logfile /var/log/ha-log" >> /etc/ha.d/ha.cf
echo "logfacility local0" >> /etc/ha.d/ha.cf
echo "auth 1" > /etc/ha.d/authkeys
echo "1 crc" >> /etc/ha.d/authkeys
chmod go-rwx /etc/ha.d/authkeys

# This assumes:
# 1) your network is bond0, not eth0
# 2) you are on a private network where you don't care about security, otherwise see http://www.linux-ha.org/authkeys
# Make sure udpport isn't in use by any other instances; or, use mcast.

On server A:
service heartbeat start
# Then, check your log files (/var/log/ha-log and /var/log/messages).
# Ping the virtual IP.

On server B:
service heartbeat start
# check your log files

On server A:
service heartbeat restart

On server B:
ifconfig -a
# Check if the interface is now running on server B.

You can monitor current active node with arp -- the mac address will  
switch to match the physical interface that the VIP is running on.



Re: [CentOS] Problems with mysql multi-master after update.

2009-02-10 Thread J Potter

For what it's worth, I haven't seen this on any systems I manage when  
going from 5.0.22->5.0.45, which include permutations of master-slave  
and master-master.

Is there anything useful in /var/log/mysqld.log?


> after I updated from mysql-5.0.22 CentOS 5.0 to mysqld-5.0.45 in CentOS
> 5.2, mysql loses master-slave sync after one node reboots.
> I've noticed that the slave does not respect the information in
> master.info; instead, it tries to read the information from the master
> server file inc-index.index


Re: [CentOS] clustering and load balancing Apache

2009-02-11 Thread J Potter

Look at pound: http://www.apsis.ch/pound/

If you are concerned about traffic volume, you might consider running  
squid as a transparent proxy in front of pound. I.e.:

request -> squid -> pound -> apache

Where squid will return the response for everything marked as  
cacheable and still fresh; and pound will take care of load balancing  
to apache. (Pound can inspect/insert cookies to send visitors to the  
same back-end node on subsequent requests.) On some of our setups,  
squid responds to 98% of the requests coming in, and is able to
respond to an extremely high volume of requests. Other list
users might be able to provide good stats as to what sort of volume
they can support. (I'd be curious to hear what others have seen...)

For HA:
- 2 instances of squid, active/standby or active/active (i.e. two IP
addresses in DNS for the public hostname, and have each squid instance
pick up the other's address during a failure).
- 2 instances of pound, active/standby
- N instances of apache

Re: replication of content on your apache nodes, another poster  
suggested drbd. From my understanding, I do not think this is  
possible, since only one node can mount the drbd volume at a time. If  
you have shared data that needs to be seen across apache nodes, either  
stick it in SQL or mount an NFS volume across the nodes. (But then you  
have NFS in the picture, which might not be so good.)

If your apache code is constant, then have a master apache node and  
write a shell script that runs rsync to push code changes out to the  
other instances.
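
A minimal sketch of such a push script. The node names and document root are hypothetical, and by default it only prints the rsync commands it would run.

```shell
#!/bin/sh
# Hypothetical node names and document root; replace with your own.
NODES="web2.example.com web3.example.com"
DOCROOT="/var/www/html/"
# Prints the commands by default; set RSYNC=rsync to really push.
RSYNC=${RSYNC:-echo rsync}

# push_code: copy DOCROOT from this (master) node to every other node,
# deleting files on the targets that no longer exist on the master.
push_code() {
    for node in $NODES; do
        $RSYNC -az --delete "$DOCROOT" "$node:$DOCROOT" ||
            echo "push to $node failed" >&2
    done
}

push_code
```

The trailing slash on the source path matters to rsync: it copies the directory's contents rather than the directory itself.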

It's hard to get very specific about what's best for your setup
without knowing the specifics of things like the data sync needs on the
apache nodes, so take all of this with a grain of salt -- or as a
default starting place.

best,
Jeff


[CentOS] general protection rip?

2009-03-06 Thread J Potter

Hi List,

On one of our CentOS 5 (x86_64) servers, identical to a number of
other systems, I'm seeing some processes / services failing to run,
along with the following errors in /var/log/messages:

Mar  2 23:25:07 someHostname kernel: wrapper-linux-x[24448] general protection rip:805386e rsp:ffc20390 error:0
Mar  2 23:25:09 someHostname kernel: dsm_sa_datamgr3[5063] general protection rip:f7f86e7c rsp:f5aab30c error:0
Mar  3 09:57:54 someHostname kernel: dcecfg32[1993] general protection rip:b9c71a rsp:ffa20568 error:0
Mar  6 09:38:59 someHostname kernel: omreport[15107] general protection rip:af597a rsp:ffdccad0 error:0

I've Googled a bit, but don't see any clear consistent explanation.  
Does anyone have any pointers as to what is causing this? Faulty  
hardware? I ran memtester, but it didn't find any issue with the  
memory it was able to test.

Thanks,
Jeff


Re: [CentOS] [OT] Network switches

2009-03-26 Thread J Potter

> look at HP Procurves. That is what I use.
> You can get 2524's quite cheap on ebay.

We used these for years, and they were great, and super cheap on eBay.
HP support was fantastic as well. The 26xx series allows for "light"  
layer 3 routing; you may want to snag the 2626 or 2650 instead of the  
25xx series. I believe that HP has end-of-lifed these switches,  
though, so firmware updates for security bugs, etc, will, from what I  
understand, cease in a few years.

We upgraded to some Dell PowerConnect 6248s in the past year, so that  
we could use VRRP for (routing-enabled) switch failover. As with all  
Dell things, hammer them on the price and you can get it ~30% cheaper  
than listed.

-Jeff



Re: [CentOS] Two sets of Heartbeat HTTPD clusters on same subnet

2009-04-02 Thread J Potter

> I have successfully configured two machines to use heartbeat to cluster
> httpd. The two nodes are called etk-1 and etk-2. I am trying to
> configure another two machines to act as a separate cluster (on the
> same IP subnet). These two nodes are called radu-1 and radu-2.

We successfully do this with many pairs of HA nodes in the same  
subnet, using different UDP ports...

Under /etc/ha.d/ha.cf:
udpport 

Use a different authkey for each pair so as to avoid accidental snafus  
with mixing up nodes from different pairs.
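
A minimal sketch of generating per-pair configs along those lines. The ports, secrets and interface name are placeholders, `write_pair_config` is a made-up helper, and it writes into a scratch directory rather than /etc/ha.d.

```shell
#!/bin/sh
IFACE=${IFACE:-eth0}   # broadcast interface; an assumption, adjust as needed

# write_pair_config DIR PORT SECRET NODE_A NODE_B: write ha.cf and authkeys for
# one heartbeat pair into DIR, with the pair's own UDP port and shared secret.
write_pair_config() {
    dir=$1
    port=$2
    secret=$3
    {
        echo "node $4"
        echo "node $5"
        echo "udpport $port"
        echo "bcast $IFACE"
        echo "auto_failback off"
    } > "$dir/ha.cf"
    {
        echo "auth 1"
        echo "1 sha1 $secret"
    } > "$dir/authkeys"
    chmod 600 "$dir/authkeys"
}

# Dry run into a scratch directory, one subdirectory per pair.
out=$(mktemp -d)
mkdir "$out/etk" "$out/radu"
write_pair_config "$out/etk"  9001 "etk-secret-changeme"  etk-1  etk-2
write_pair_config "$out/radu" 9002 "radu-secret-changeme" radu-1 radu-2
cat "$out/radu/ha.cf"
```

With a distinct secret per pair, a node that ends up on the wrong port gets its packets rejected instead of quietly joining another cluster.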

-Jeff