Mike,

cf-agent holds the socket open because of connection caching, which
seems quite reasonable for ordinary agent tasks. You have just put it
into a situation where caching does harm. In an ideal model, cf-agent
would detect the maximum number of file descriptors available and
enforce cache expiration upon reaching some high watermark. Since such
a model would take too much effort to implement, a more
straightforward solution is to file a feature request for some sort
of keepalive => "false"; option in body copy_from.
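
To make the "ideal model" a bit more concrete, here is a rough sketch
in Python of a connection cache that derives its high watermark from
the file descriptor limit and closes the least recently used
connection once the watermark is reached. This is not how cf-agent is
implemented; the names fd_high_watermark and ConnectionCache, the 0.8
fraction, and the 1024 fallback are all assumptions made for
illustration.

```python
import resource
from collections import OrderedDict


def fd_high_watermark(fraction=0.8):
    """Derive a cap on cached connections from the soft RLIMIT_NOFILE.

    The fraction (an assumption of this sketch) leaves headroom for the
    descriptors the process needs for logs, databases and key files.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        soft = 1024  # arbitrary fallback when the limit is unlimited
    return max(1, int(soft * fraction))


class ConnectionCache:
    """Cache of open connections, evicting the least recently used one."""

    def __init__(self, connect, limit=None):
        # connect is any callable taking a host name and returning an
        # object with a close() method (hypothetical interface).
        self._connect = connect
        self._limit = limit if limit is not None else fd_high_watermark()
        self._conns = OrderedDict()

    def get(self, host):
        conn = self._conns.get(host)
        if conn is not None:
            # Cache hit: mark this connection as most recently used.
            self._conns.move_to_end(host)
            return conn
        # Cache miss: expire old connections until there is room.
        while len(self._conns) >= self._limit:
            _old_host, old_conn = self._conns.popitem(last=False)
            old_conn.close()
        conn = self._connect(host)
        self._conns[host] = conn
        return conn

    def close_all(self):
        while self._conns:
            _host, conn = self._conns.popitem()
            conn.close()

    def __len__(self):
        return len(self._conns)
```

With a cache like this, looping over 900 hosts would never hold more
than the watermark's worth of sockets at once, while a run that talks
to only a few servers would still benefit from connection reuse.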

2011/1/19 Mike Svoboda <msvob...@linkedin.com>:
> I’ve enabled my Cfengine infrastructure to perform 2 way data transfers.  My
> clients are configured to run cf-serverd, so my Master Policy Server can
> login to pull some files off of each machine.  To accomplish this, I execute
> two policies.
>
> policy 1 extracts all the clients the MPS has seen from the lastseen
> database and dumps this info to a text file.
> policy 2 reads the text file, and instructs cf-agent on the MPS to loop
> through and pull down data from each “client machine” from
> /var/cfengine/outgoing.
>
> Here’s policy 2 which executes on my MPS.
>
> bundle agent grab_client_cfreport_output
> {
> vars:
>         "host_array_size"
>                 int => readstringarray("host_array",
> "/export/apps/cfengine-client-data/active_clients.txt","#[^\n]*","[\n]",999999999,9999999);
>         "real_machine_name"     slist   =>      getindices("host_array");
>
> files:
>         # This transfers all reporting data from the clients to the
>         # Master Policy Server
>         "/export/apps/cfengine-client-data/$(real_machine_name)"
>                 handle                          =>      "grab_client_data",
>                 copy_from                       =>
>      remote_copy("/var/cfengine/outgoing","$(real_machine_name)"),
>                 depth_search                    =>      recurse("inf"),
>                 action                          =>      immediate;
> }
> #########################################################
> body copy_from remote_copy(sourcedir,sourceserver)
> {
>         source          =>      "$(sourcedir)";
>         servers         =>      { "$(sourceserver)" };
>         copy_backup     =>      "false";
>         purge           =>      "false";
>         trustkey        =>      "true";
>         collapse_destination_dir        =>      "true";
>         encrypt         =>      "true";
> }
>
>
>
>
> When this executes, cf-agent holds open a socket for each client it connects
> to.  It doesn’t close the socket when it moves onto the next machine.  The
> downside of this, is that this master policy server has to reach out and
> grab data from 900 clients, which means I end up with a TON of open file
> descriptors with socket information.
>
> I’ve raised ulimit -n (open file descriptors) to 2048, but cf-agent doesn’t
> seem very happy.
> # ulimit -a
> core file size        (blocks, -c) unlimited
> data seg size         (kbytes, -d) unlimited
> file size             (blocks, -f) unlimited
> open files                    (-n) 2048
> pipe size          (512 bytes, -p) 10
> stack size            (kbytes, -s) 10240
> cpu time             (seconds, -t) unlimited
> max user processes            (-u) 16357
> virtual memory        (kbytes, -v) unlimited
>
>
>
> Through about 500 client transfers, cf-agent is happy.    Then I start
> hitting these messages below.  The file transfer still succeeds, but it
> looks nasty.
>
>
>  -> Updated
> /export/apps/cfengine-client-data/ela4-cs44.prod/monitor_summary.html from
> source /var/cfengine/outgoing/reports/monitor_summary.html on ela4-cs44.prod
>  -> Updated
> /export/apps/cfengine-client-data/ela4-cs44.prod/performance.html from
> source /var/cfengine/outgoing/reports/performance.html on ela4-cs44.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/lastseen.html
> from source /var/cfengine/outgoing/reports/lastseen.html on ela4-cs44.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/classes.html
> from source /var/cfengine/outgoing/reports/classes.html on ela4-cs44.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/class_notes
> from source /var/cfengine/outgoing/reports/class_notes on ela4-cs44.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/audit.html from
> source /var/cfengine/outgoing/reports/audit.html on ela4-cs44.prod
>  -> Updated
> /export/apps/cfengine-client-data/ela4-be174.prod/monitor_summary.html from
> source /var/cfengine/outgoing/reports/monitor_summary.html on
> ela4-be174.prod
>  -> Updated
> /export/apps/cfengine-client-data/ela4-be174.prod/performance.html from
> source /var/cfengine/outgoing/reports/performance.html on ela4-be174.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/lastseen.html
> from source /var/cfengine/outgoing/reports/lastseen.html on ela4-be174.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/classes.html
> from source /var/cfengine/outgoing/reports/classes.html on ela4-be174.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/class_notes
> from source /var/cfengine/outgoing/reports/class_notes on ela4-be174.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/audit.html
> from source /var/cfengine/outgoing/reports/audit.html on ela4-be174.prod
>  -> Copying from
> ela4-be520.prod:/var/cfengine/outgoing/reports/monitor_summary.html
>  -> Copying from
> ela4-be520.prod:/var/cfengine/outgoing/reports/performance.html
>  -> Copying from
> ela4-be520.prod:/var/cfengine/outgoing/reports/lastseen.html
>  -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/hashes.html
>  -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/classes.html
>  -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/class_notes
>  -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/audit.html
>  -> Copying from ela4-be520.prod:/var/cfengine/outgoing/cm.conf
> Couldn't find a public key (/var/cfengine/ppkeys/root-ela4-be298.prod.pub) -
> use cf-key to get one
>  !!! System error for fopen: "Too many open files"
>  -> Trusting server identity, promise to accept key from
> ela4-be298.prod=172.17.135.198
>  -> Updated
> /export/apps/cfengine-client-data/ela4-be298.prod/monitor_summary.html from
> source /var/cfengine/outgoing/reports/monitor_summary.html on
> ela4-be298.prod
>  -> Updated
> /export/apps/cfengine-client-data/ela4-be298.prod/performance.html from
> source /var/cfengine/outgoing/reports/performance.html on ela4-be298.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-be298.prod/lastseen.html
> from source /var/cfengine/outgoing/reports/lastseen.html on ela4-be298.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-be298.prod/classes.html
> from source /var/cfengine/outgoing/reports/classes.html on ela4-be298.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-be298.prod/audit.html
> from source /var/cfengine/outgoing/reports/audit.html on ela4-be298.prod
> Couldn't read file /var/cfengine/cfagent.ela4-41105-js01.prod.log for
> editing
>  !!! System reports error for fopen: "Too many open files"
> Couldn't find a public key (/var/cfengine/ppkeys/root-ela4-be420.prod.pub) -
> use cf-key to get one
>  !!! System error for fopen: "Too many open files"
>  -> Trusting server identity, promise to accept key from
> ela4-be420.prod=172.17.137.192
>  -> Updated
> /export/apps/cfengine-client-data/ela4-be420.prod/performance.html from
> source /var/cfengine/outgoing/reports/performance.html on ela4-be420.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-be420.prod/lastseen.html
> from source /var/cfengine/outgoing/reports/lastseen.html on ela4-be420.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-be420.prod/classes.html
> from source /var/cfengine/outgoing/reports/classes.html on ela4-be420.prod
>  -> Updated /export/apps/cfengine-client-data/ela4-be420.prod/audit.html
> from source /var/cfengine/outgoing/reports/audit.html on ela4-be420.prod
> Couldn't read file /var/cfengine/cfagent.ela4-41105-js01.prod.log for
> editing
>  !!! System reports error for fopen: "Too many open files"
> Couldn't find a public key (/var/cfengine/ppkeys/root-ela4-ss143.prod.pub) -
> use cf-key to get one
>
>
> Running a pfiles on cf-agent, here’s all the open sockets I see.  They
> aren’t being released after every client transfer, so it piles up in
> cf-agent.
>
>
> $ pfiles 5610
> 5610:   /var/cfengine/bin/cf-agent -I -K
>   Current rlimit: 2048 file descriptors
>    0: S_IFCHR mode:0620 dev:286,0 ino:12582918 uid:3378 gid:7 rdev:24,1
>       O_RDWR|O_NOCTTY|O_LARGEFILE
>       /devices/pseudo/pts@0:1
>    1: S_IFCHR mode:0620 dev:286,0 ino:12582918 uid:3378 gid:7 rdev:24,1
>       O_RDWR|O_NOCTTY|O_LARGEFILE
>       /devices/pseudo/pts@0:1
>    2: S_IFCHR mode:0620 dev:286,0 ino:12582918 uid:3378 gid:7 rdev:24,1
>       O_RDWR|O_NOCTTY|O_LARGEFILE
>       /devices/pseudo/pts@0:1
>    3: S_IFDOOR mode:0444 dev:295,0 ino:56 uid:0 gid:0 size:0
>       O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[215]
>       /var/run/name_service_door
>    4: S_IFREG mode:0644 dev:30,131 ino:8466 uid:0 gid:1 size:16384
>       O_RDWR|O_CREAT|O_LARGEFILE FD_CLOEXEC
>       /var/cfengine/cf_Audit.db
> ....
> ........
> ...
>  758: S_IFSOCK mode:0666 dev:293,0 ino:2553 uid:0 gid:0 size:0
>       O_RDWR
>         SOCK_STREAM
>         SO_SNDBUF(49152),SO_RCVBUF(49640),IP_NEXTHOP(232.193.0.0)
>         sockname: AF_INET 172.17.130.245  port: 41914
>         peername: AF_INET 172.17.137.174  port: 5308
>  759: S_IFSOCK mode:0666 dev:293,0 ino:17601 uid:0 gid:0 size:0
>       O_RDWR
>         SOCK_STREAM
>         SO_SNDBUF(49152),SO_RCVBUF(49640),IP_NEXTHOP(232.193.0.0)
>         sockname: AF_INET 172.17.130.245  port: 41915
>         peername: AF_INET 172.17.137.73  port: 5308
>  760: S_IFSOCK mode:0666 dev:293,0 ino:44388 uid:0 gid:0 size:0
>       O_RDWR
>         SOCK_STREAM
>         SO_SNDBUF(49152),SO_RCVBUF(49640),IP_NEXTHOP(232.193.0.0)
>         sockname: AF_INET 172.17.130.245  port: 41916
>         peername: AF_INET 172.17.138.159  port: 5308
>
>
>
> Anyways, is there a way to instruct cf-agent to close the socket when the
> copy_from is complete, or does anyone else have a better approach for what I
> am trying to accomplish?
>
> Thanks
> Mike
> _______________________________________________
> Help-cfengine mailing list
> Help-cfengine@cfengine.org
> https://cfengine.org/mailman/listinfo/help-cfengine
>
>



-- 
SY, Seva Gluschenko.
