I’ve enabled my Cfengine infrastructure to perform two-way data transfers.  My 
clients are configured to run cf-serverd, so my Master Policy Server (MPS) can 
log in and pull files off each machine.  To accomplish this, I execute two 
policies.


  *   Policy 1 extracts every client the MPS has seen from the lastseen 
database and dumps the list to a text file.
  *   Policy 2 reads that text file and instructs cf-agent on the MPS to loop 
through it, pulling down the contents of /var/cfengine/outgoing from each 
client machine.

Here’s policy 2, which executes on my MPS:

bundle agent grab_client_cfreport_output
{
vars:
        "host_array_size"
                int => readstringarray("host_array",
                        "/export/apps/cfengine-client-data/active_clients.txt",
                        "#[^\n]*", "[\n]", 999999999, 9999999);
        "real_machine_name"     slist   =>      getindices("host_array");

files:
        # This transfers all reporting data from the clients to the Master
        # Policy Server
        "/export/apps/cfengine-client-data/$(real_machine_name)"
                handle                          =>      "grab_client_data",
                copy_from                       =>      remote_copy("/var/cfengine/outgoing","$(real_machine_name)"),
                depth_search                    =>      recurse("inf"),
                action                          =>      immediate;
}
#########################################################
body copy_from remote_copy(sourcedir,sourceserver)
{
        source          =>      "$(sourcedir)";
        servers         =>      { "$(sourceserver)" };
        copy_backup     =>      "false";
        purge           =>      "false";
        trustkey        =>      "true";
        collapse_destination_dir        =>      "true";
        encrypt         =>      "true";
}




When this executes, cf-agent holds open a socket for each client it connects 
to.  It doesn’t close the socket when it moves on to the next machine.  The 
downside is that this master policy server has to reach out and grab data from 
900 clients, so I end up with a ton of open file descriptors, each one a 
socket to a client.

I’ve raised ulimit -n (open file descriptors) to 2048, but cf-agent still 
doesn’t seem very happy:
# ulimit -a
core file size        (blocks, -c) unlimited
data seg size         (kbytes, -d) unlimited
file size             (blocks, -f) unlimited
open files                    (-n) 2048
pipe size          (512 bytes, -p) 10
stack size            (kbytes, -s) 10240
cpu time             (seconds, -t) unlimited
max user processes            (-u) 16357
virtual memory        (kbytes, -v) unlimited
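
As a rough sanity check, and assuming cf-agent really does keep one socket open 
per client for the whole run, the descriptor limit has to cover the number of 
clients plus whatever cf-agent needs for its own files.  A minimal shell sketch 
of that check follows; the function name fd_budget_ok and the headroom figure 
are my own illustration, not anything Cfengine provides.

```shell
# fd_budget_ok CLIENT_LIST HEADROOM LIMIT
# Succeeds when (non-comment, non-blank lines in CLIENT_LIST) + HEADROOM <= LIMIT.
# Hypothetical helper for illustration only; not part of Cfengine.
fd_budget_ok() {
    # Count hosts: drop lines that are comments, then count non-blank lines.
    clients=$(grep -v '^[[:space:]]*#' "$1" | grep -c '[^[:space:]]')
    [ $((clients + $2)) -le "$3" ]
}

# Example invocation (HEADROOM of 64 is an arbitrary assumption):
#   fd_budget_ok /export/apps/cfengine-client-data/active_clients.txt 64 "$(ulimit -n)"
```

By this arithmetic, 900 clients should fit under a limit of 2048, so the limit 
alone doesn’t obviously explain the errors I’m seeing.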



For roughly the first 500 client transfers, cf-agent is happy.  Then I start 
hitting the messages below.  The file transfers still succeed, but the output 
looks nasty.


 -> Updated 
/export/apps/cfengine-client-data/ela4-cs44.prod/monitor_summary.html from 
source /var/cfengine/outgoing/reports/monitor_summary.html on ela4-cs44.prod
 -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/performance.html 
from source /var/cfengine/outgoing/reports/performance.html on ela4-cs44.prod
 -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/lastseen.html from 
source /var/cfengine/outgoing/reports/lastseen.html on ela4-cs44.prod
 -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/classes.html from 
source /var/cfengine/outgoing/reports/classes.html on ela4-cs44.prod
 -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/class_notes from 
source /var/cfengine/outgoing/reports/class_notes on ela4-cs44.prod
 -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/audit.html from 
source /var/cfengine/outgoing/reports/audit.html on ela4-cs44.prod
 -> Updated 
/export/apps/cfengine-client-data/ela4-be174.prod/monitor_summary.html from 
source /var/cfengine/outgoing/reports/monitor_summary.html on ela4-be174.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/performance.html 
from source /var/cfengine/outgoing/reports/performance.html on ela4-be174.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/lastseen.html 
from source /var/cfengine/outgoing/reports/lastseen.html on ela4-be174.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/classes.html from 
source /var/cfengine/outgoing/reports/classes.html on ela4-be174.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/class_notes from 
source /var/cfengine/outgoing/reports/class_notes on ela4-be174.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/audit.html from 
source /var/cfengine/outgoing/reports/audit.html on ela4-be174.prod
 -> Copying from 
ela4-be520.prod:/var/cfengine/outgoing/reports/monitor_summary.html
 -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/performance.html
 -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/lastseen.html
 -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/hashes.html
 -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/classes.html
 -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/class_notes
 -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/audit.html
 -> Copying from ela4-be520.prod:/var/cfengine/outgoing/cm.conf
Couldn't find a public key (/var/cfengine/ppkeys/root-ela4-be298.prod.pub) - 
use cf-key to get one
 !!! System error for fopen: "Too many open files"
 -> Trusting server identity, promise to accept key from 
ela4-be298.prod=172.17.135.198
 -> Updated 
/export/apps/cfengine-client-data/ela4-be298.prod/monitor_summary.html from 
source /var/cfengine/outgoing/reports/monitor_summary.html on ela4-be298.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be298.prod/performance.html 
from source /var/cfengine/outgoing/reports/performance.html on ela4-be298.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be298.prod/lastseen.html 
from source /var/cfengine/outgoing/reports/lastseen.html on ela4-be298.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be298.prod/classes.html from 
source /var/cfengine/outgoing/reports/classes.html on ela4-be298.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be298.prod/audit.html from 
source /var/cfengine/outgoing/reports/audit.html on ela4-be298.prod
Couldn't read file /var/cfengine/cfagent.ela4-41105-js01.prod.log for editing
 !!! System reports error for fopen: "Too many open files"
Couldn't find a public key (/var/cfengine/ppkeys/root-ela4-be420.prod.pub) - 
use cf-key to get one
 !!! System error for fopen: "Too many open files"
 -> Trusting server identity, promise to accept key from 
ela4-be420.prod=172.17.137.192
 -> Updated /export/apps/cfengine-client-data/ela4-be420.prod/performance.html 
from source /var/cfengine/outgoing/reports/performance.html on ela4-be420.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be420.prod/lastseen.html 
from source /var/cfengine/outgoing/reports/lastseen.html on ela4-be420.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be420.prod/classes.html from 
source /var/cfengine/outgoing/reports/classes.html on ela4-be420.prod
 -> Updated /export/apps/cfengine-client-data/ela4-be420.prod/audit.html from 
source /var/cfengine/outgoing/reports/audit.html on ela4-be420.prod
Couldn't read file /var/cfengine/cfagent.ela4-41105-js01.prod.log for editing
 !!! System reports error for fopen: "Too many open files"
Couldn't find a public key (/var/cfengine/ppkeys/root-ela4-ss143.prod.pub) - 
use cf-key to get one


Running pfiles on cf-agent, here are the open sockets I see.  They aren’t 
released after each client transfer, so they pile up in cf-agent.


$ pfiles 5610
5610:   /var/cfengine/bin/cf-agent -I -K
  Current rlimit: 2048 file descriptors
   0: S_IFCHR mode:0620 dev:286,0 ino:12582918 uid:3378 gid:7 rdev:24,1
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:1
   1: S_IFCHR mode:0620 dev:286,0 ino:12582918 uid:3378 gid:7 rdev:24,1
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:1
   2: S_IFCHR mode:0620 dev:286,0 ino:12582918 uid:3378 gid:7 rdev:24,1
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /devices/pseudo/pts@0:1
   3: S_IFDOOR mode:0444 dev:295,0 ino:56 uid:0 gid:0 size:0
      O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[215]
      /var/run/name_service_door
   4: S_IFREG mode:0644 dev:30,131 ino:8466 uid:0 gid:1 size:16384
      O_RDWR|O_CREAT|O_LARGEFILE FD_CLOEXEC
      /var/cfengine/cf_Audit.db
....
........
...
 758: S_IFSOCK mode:0666 dev:293,0 ino:2553 uid:0 gid:0 size:0
      O_RDWR
        SOCK_STREAM
        SO_SNDBUF(49152),SO_RCVBUF(49640),IP_NEXTHOP(232.193.0.0)
        sockname: AF_INET 172.17.130.245  port: 41914
        peername: AF_INET 172.17.137.174  port: 5308
 759: S_IFSOCK mode:0666 dev:293,0 ino:17601 uid:0 gid:0 size:0
      O_RDWR
        SOCK_STREAM
        SO_SNDBUF(49152),SO_RCVBUF(49640),IP_NEXTHOP(232.193.0.0)
        sockname: AF_INET 172.17.130.245  port: 41915
        peername: AF_INET 172.17.137.73  port: 5308
 760: S_IFSOCK mode:0666 dev:293,0 ino:44388 uid:0 gid:0 size:0
      O_RDWR
        SOCK_STREAM
        SO_SNDBUF(49152),SO_RCVBUF(49640),IP_NEXTHOP(232.193.0.0)
        sockname: AF_INET 172.17.130.245  port: 41916
        peername: AF_INET 172.17.138.159  port: 5308
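
To put a number on the pile-up, the socket entries in the pfiles output can be 
counted.  This is only a sketch: the function names are made up for 
illustration, and it assumes the output format shown above (an S_IFSOCK line 
per socket, followed by a "peername: ... port: N" line).

```shell
# Count socket descriptors in pfiles output read from stdin.
count_sockets() {
    grep -c 'S_IFSOCK'
}

# Count sockets whose peer port equals $1 (5308 is the port seen in the
# peername lines above), reading pfiles output from stdin.
count_peers_on_port() {
    awk -v port="$1" '/peername:/ && $NF == port { n++ } END { print n + 0 }'
}

# Example invocations (5610 is the cf-agent PID from the listing above):
#   pfiles 5610 | count_sockets
#   pfiles 5610 | count_peers_on_port 5308
```

Running this repeatedly during a cf-agent run should show whether the count 
only ever grows as more clients are contacted.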



Is there a way to instruct cf-agent to close the socket when the copy_from is 
complete?  Or does anyone have a better approach for what I am trying to 
accomplish?

Thanks
Mike