The feature request sounds good to me!  But that still leaves the problem I
am trying to solve right now.

The only workaround I can think of would be to split my "client list" up
into 4 or 5 separate files, and then execute the copy_from against one file
per run, selected by a time-based class?
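
A minimal, untested sketch of that workaround, assuming the client list has
already been split into four hypothetical files (active_clients.1.txt through
active_clients.4.txt) and that cf-agent on the MPS runs at least once in each
quarter of the hour.  The Q1..Q4 time classes pick one file per run, so each
cf-agent process only connects to about a quarter of the clients before it
exits and releases its sockets.  The remote_copy body is the one from my
original policy:

```cf3
bundle agent grab_client_cfreport_output
{
vars:
    # Hypothetical file names: one slice of the client list per quarter hour.
    Q1::
        "client_list" string => "/export/apps/cfengine-client-data/active_clients.1.txt";
    Q2::
        "client_list" string => "/export/apps/cfengine-client-data/active_clients.2.txt";
    Q3::
        "client_list" string => "/export/apps/cfengine-client-data/active_clients.3.txt";
    Q4::
        "client_list" string => "/export/apps/cfengine-client-data/active_clients.4.txt";

    any::
        # Same parsing as before, but only for the slice selected above.
        "host_array_size"
            int => readstringarray("host_array", "$(client_list)",
                                   "#[^\n]*", "[\n]", 999999999, 9999999);
        "real_machine_name" slist => getindices("host_array");

files:
        "/export/apps/cfengine-client-data/$(real_machine_name)"
            handle       => "grab_client_data",
            copy_from    => remote_copy("/var/cfengine/outgoing", "$(real_machine_name)"),
            depth_search => recurse("inf"),
            action       => immediate;
}
```

With 900 clients that is roughly 225 connections per run, which stays well
under the 2048 descriptor limit, at the cost of each client only being
polled once an hour.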

Or does anyone else have a more elegant solution for grabbing data from
hundreds of clients?  This seems to be a scalability problem someone must
have hit before.




On 1/19/11 10:59 AM, "Seva Gluschenko" <seva.glusche...@gmail.com> wrote:

> Mike,
> 
> cf-agent holds the socket because of connection caching, which seems
> quite reasonable for ordinary agent tasks. You've just put it into
> a condition where caching does harm. If we were discussing the ideal
> model, cf-agent would detect the maximum number of file descriptors
> available and enforce cache expiration upon reaching some high watermark.
> Since such a model would require too much effort to implement, a more
> straightforward solution is to create a feature request for some sort
> of keepalive => "false"; option in body copy_from.
> 
> 2011/1/19 Mike Svoboda <msvob...@linkedin.com>:
>> I've enabled my Cfengine infrastructure to perform 2 way data transfers.  My
>> clients are configured to run cf-serverd, so my Master Policy Server can
>> log in to pull some files off of each machine.  To accomplish this, I execute
>> two policies.
>> 
>> policy 1 extracts all the clients the MPS has seen from the lastseen
>> database and dumps this info to a text file.
>> policy 2 reads the text file, and instructs cf-agent on the MPS to loop
>> through and pull down data from each "client machine" from
>> /var/cfengine/outgoing.
>> 
>> Here's policy 2 which executes on my MPS.
>> 
>> bundle agent grab_client_cfreport_output
>> {
>> vars:
>>         "host_array_size"
>>                 int => readstringarray("host_array",
>> "/export/apps/cfengine-client-data/active_clients.txt","#[^\n]*","[\n]",
>> 999999999,9999999);
>>         "real_machine_name"     slist   =>      getindices("host_array");
>> 
>> files:
>>         # This transfers all reporting data from the clients to the
>>         # Master Policy Server
>>         "/export/apps/cfengine-client-data/$(real_machine_name)"
>>                 handle                          =>      "grab_client_data",
>>                 copy_from                       =>
>>      remote_copy("/var/cfengine/outgoing","$(real_machine_name)"),
>>                 depth_search                    =>      recurse("inf"),
>>                 action                          =>      immediate;
>> }
>> #########################################################
>> body copy_from remote_copy(sourcedir,sourceserver)
>> {
>>         source          =>      "$(sourcedir)";
>>         servers         =>      { "$(sourceserver)" };
>>         copy_backup     =>      "false";
>>         purge           =>      "false";
>>         trustkey        =>      "true";
>>         collapse_destination_dir        =>      "true";
>>         encrypt         =>      "true";
>> }
>> 
>> 
>> 
>> 
>> When this executes, cf-agent holds open a socket for each client it connects
>> to.  It doesn't close the socket when it moves on to the next machine.  The
>> downside is that this master policy server has to reach out and grab data
>> from 900 clients, which means I end up with a TON of open file descriptors
>> holding socket information.
>> 
>> I've raised ulimit -n (open file descriptors) to 2048, but cf-agent doesn't
>> seem very happy.
>> # ulimit -a
>> core file size        (blocks, -c) unlimited
>> data seg size         (kbytes, -d) unlimited
>> file size             (blocks, -f) unlimited
>> open files                    (-n) 2048
>> pipe size          (512 bytes, -p) 10
>> stack size            (kbytes, -s) 10240
>> cpu time             (seconds, -t) unlimited
>> max user processes            (-u) 16357
>> virtual memory        (kbytes, -v) unlimited
>> 
>> 
>> 
>> Through about 500 client transfers, cf-agent is happy.    Then I start
>> hitting these messages below.  The file transfer still succeeds, but it
>> looks nasty.
>> 
>> 
>>  -> Updated
>> /export/apps/cfengine-client-data/ela4-cs44.prod/monitor_summary.html from
>> source /var/cfengine/outgoing/reports/monitor_summary.html on ela4-cs44.prod
>>  -> Updated
>> /export/apps/cfengine-client-data/ela4-cs44.prod/performance.html from
>> source /var/cfengine/outgoing/reports/performance.html on ela4-cs44.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/lastseen.html
>> from source /var/cfengine/outgoing/reports/lastseen.html on ela4-cs44.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/classes.html
>> from source /var/cfengine/outgoing/reports/classes.html on ela4-cs44.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/class_notes
>> from source /var/cfengine/outgoing/reports/class_notes on ela4-cs44.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-cs44.prod/audit.html from
>> source /var/cfengine/outgoing/reports/audit.html on ela4-cs44.prod
>>  -> Updated
>> /export/apps/cfengine-client-data/ela4-be174.prod/monitor_summary.html from
>> source /var/cfengine/outgoing/reports/monitor_summary.html on
>> ela4-be174.prod
>>  -> Updated
>> /export/apps/cfengine-client-data/ela4-be174.prod/performance.html from
>> source /var/cfengine/outgoing/reports/performance.html on ela4-be174.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/lastseen.html
>> from source /var/cfengine/outgoing/reports/lastseen.html on ela4-be174.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/classes.html
>> from source /var/cfengine/outgoing/reports/classes.html on ela4-be174.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/class_notes
>> from source /var/cfengine/outgoing/reports/class_notes on ela4-be174.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-be174.prod/audit.html
>> from source /var/cfengine/outgoing/reports/audit.html on ela4-be174.prod
>>  -> Copying from
>> ela4-be520.prod:/var/cfengine/outgoing/reports/monitor_summary.html
>>  -> Copying from
>> ela4-be520.prod:/var/cfengine/outgoing/reports/performance.html
>>  -> Copying from
>> ela4-be520.prod:/var/cfengine/outgoing/reports/lastseen.html
>>  -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/hashes.html
>>  -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/classes.html
>>  -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/class_notes
>>  -> Copying from ela4-be520.prod:/var/cfengine/outgoing/reports/audit.html
>>  -> Copying from ela4-be520.prod:/var/cfengine/outgoing/cm.conf
>> Couldn't find a public key (/var/cfengine/ppkeys/root-ela4-be298.prod.pub) -
>> use cf-key to get one
>>  !!! System error for fopen: "Too many open files"
>>  -> Trusting server identity, promise to accept key from
>> ela4-be298.prod=172.17.135.198
>>  -> Updated
>> /export/apps/cfengine-client-data/ela4-be298.prod/monitor_summary.html from
>> source /var/cfengine/outgoing/reports/monitor_summary.html on
>> ela4-be298.prod
>>  -> Updated
>> /export/apps/cfengine-client-data/ela4-be298.prod/performance.html from
>> source /var/cfengine/outgoing/reports/performance.html on ela4-be298.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-be298.prod/lastseen.html
>> from source /var/cfengine/outgoing/reports/lastseen.html on ela4-be298.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-be298.prod/classes.html
>> from source /var/cfengine/outgoing/reports/classes.html on ela4-be298.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-be298.prod/audit.html
>> from source /var/cfengine/outgoing/reports/audit.html on ela4-be298.prod
>> Couldn't read file /var/cfengine/cfagent.ela4-41105-js01.prod.log for
>> editing
>>  !!! System reports error for fopen: "Too many open files"
>> Couldn't find a public key (/var/cfengine/ppkeys/root-ela4-be420.prod.pub) -
>> use cf-key to get one
>>  !!! System error for fopen: "Too many open files"
>>  -> Trusting server identity, promise to accept key from
>> ela4-be420.prod=172.17.137.192
>>  -> Updated
>> /export/apps/cfengine-client-data/ela4-be420.prod/performance.html from
>> source /var/cfengine/outgoing/reports/performance.html on ela4-be420.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-be420.prod/lastseen.html
>> from source /var/cfengine/outgoing/reports/lastseen.html on ela4-be420.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-be420.prod/classes.html
>> from source /var/cfengine/outgoing/reports/classes.html on ela4-be420.prod
>>  -> Updated /export/apps/cfengine-client-data/ela4-be420.prod/audit.html
>> from source /var/cfengine/outgoing/reports/audit.html on ela4-be420.prod
>> Couldn't read file /var/cfengine/cfagent.ela4-41105-js01.prod.log for
>> editing
>>  !!! System reports error for fopen: "Too many open files"
>> Couldn't find a public key (/var/cfengine/ppkeys/root-ela4-ss143.prod.pub) -
>> use cf-key to get one
>> 
>> 
>> Running pfiles on cf-agent, here are all the open sockets I see.  They
>> aren't being released after every client transfer, so they pile up in
>> cf-agent.
>> 
>> 
>> $ pfiles 5610
>> 5610:   /var/cfengine/bin/cf-agent -I -K
>>   Current rlimit: 2048 file descriptors
>>    0: S_IFCHR mode:0620 dev:286,0 ino:12582918 uid:3378 gid:7 rdev:24,1
>>       O_RDWR|O_NOCTTY|O_LARGEFILE
>>       /devices/pseudo/pts@0:1
>>    1: S_IFCHR mode:0620 dev:286,0 ino:12582918 uid:3378 gid:7 rdev:24,1
>>       O_RDWR|O_NOCTTY|O_LARGEFILE
>>       /devices/pseudo/pts@0:1
>>    2: S_IFCHR mode:0620 dev:286,0 ino:12582918 uid:3378 gid:7 rdev:24,1
>>       O_RDWR|O_NOCTTY|O_LARGEFILE
>>       /devices/pseudo/pts@0:1
>>    3: S_IFDOOR mode:0444 dev:295,0 ino:56 uid:0 gid:0 size:0
>>       O_RDONLY|O_LARGEFILE FD_CLOEXEC  door to nscd[215]
>>       /var/run/name_service_door
>>    4: S_IFREG mode:0644 dev:30,131 ino:8466 uid:0 gid:1 size:16384
>>       O_RDWR|O_CREAT|O_LARGEFILE FD_CLOEXEC
>>       /var/cfengine/cf_Audit.db
>> ....
>> ........
>> ...
>>  758: S_IFSOCK mode:0666 dev:293,0 ino:2553 uid:0 gid:0 size:0
>>       O_RDWR
>>         SOCK_STREAM
>>         SO_SNDBUF(49152),SO_RCVBUF(49640),IP_NEXTHOP(232.193.0.0)
>>         sockname: AF_INET 172.17.130.245  port: 41914
>>         peername: AF_INET 172.17.137.174  port: 5308
>>  759: S_IFSOCK mode:0666 dev:293,0 ino:17601 uid:0 gid:0 size:0
>>       O_RDWR
>>         SOCK_STREAM
>>         SO_SNDBUF(49152),SO_RCVBUF(49640),IP_NEXTHOP(232.193.0.0)
>>         sockname: AF_INET 172.17.130.245  port: 41915
>>         peername: AF_INET 172.17.137.73  port: 5308
>>  760: S_IFSOCK mode:0666 dev:293,0 ino:44388 uid:0 gid:0 size:0
>>       O_RDWR
>>         SOCK_STREAM
>>         SO_SNDBUF(49152),SO_RCVBUF(49640),IP_NEXTHOP(232.193.0.0)
>>         sockname: AF_INET 172.17.130.245  port: 41916
>>         peername: AF_INET 172.17.138.159  port: 5308
>> 
>> 
>> 
>> Anyways, is there a way to instruct cf-agent to close the socket when the
>> copy_from is complete, or does anyone else have a better approach for what I
>> am trying to accomplish?
>> 
>> Thanks
>> Mike
>> _______________________________________________
>> Help-cfengine mailing list
>> Help-cfengine@cfengine.org
>> https://cfengine.org/mailman/listinfo/help-cfengine
>> 
>> 
> 
> 
