I am going to look into this soon, I need to setup a working test environment first. Perhaps you can create a new bug and provide initial sample config files: http://bugzilla.adiscon.com/buglist.cgi?product=rsyslog&list_id=190
best regards, Andre > -----Original Message----- > From: [email protected] [mailto:rsyslog- > [email protected]] On Behalf Of David Lang > Sent: Thursday, August 01, 2013 9:13 AM > To: rsyslog-users > Subject: Re: [rsyslog] Message loss with relp and queues > > FYI, Rainer is on vacation for several weeks. I'm answering the questions that > I can, but this is one that I can't touch. > > Andre is the person from adiscon who's watching the list, so I've been > keeping track of threads like this (so they don't get lost) and hoping that he > can step in. > > David Lang > > On Fri, 26 Jul 2013, David Anderson wrote: > > > Sorry to "bump" this, but I'm wondering if I'm just not setting > > something in my configuration properly that is causing this message > > loss, or if this is a bug. Can anyone replicate or comment? > > > > Thanks, > > -- > > Dave > > > > On 7/19/13 4:43 PM, David Anderson wrote: > >> Hello, > >> > >> I'm using central logging using RELP+TLS with disk assisted queues on > >> the client. When I: > >> 1) stop rsyslog on the server (or just drop all packets at the > >> firewall to simulate network loss) > >> 2) send a few messages to syslog with "logger" on the client (rsyslog > >> notices the relp server is down and puts the messages in the queue) > >> 3) then restart rsyslog on the client > >> > >> the first message (sometimes I've noticed two) sent to logger while > >> the server was unreachable is lost and does not get sent to the central > server. > >> If I *do not* restart rsyslog on the client while the central server > >> is unreachable, the message is not lost. This happens with both > >> LinkedList and Disk queues. > >> > >> It's not uncommon that the client might be restarted during a time > >> when it has already lost the network connection to the central > >> server. Am I missing a config that might fix this? > >> > >> Here are my example logs showing the message loss: > >> > >> I start off with both central server and rsyslog client running > >> normally with RELP working fine: > >> ============================================ > >> root@rsyslog-test:/etc# logger upnow > >> root@rsyslog-test:/etc# service rsyslog restart rsyslog stop/waiting > >> rsyslog start/running, process 23909 root@rsyslog-test:/etc# logger > >> upnow2 > >> > >> Works fine. Central server receives both messages and sees the client > >> restart: > >> ============================================ > >> root@relp-server /etc # tail /var/log/relp_log > >> 2013-07-19T17:07:55.757080-04:00 rsyslog-test logger: upnow > >> 2013-07-19T17:08:13.439180-04:00 rsyslog-test rsyslogd: [origin > >> software="rsyslogd" swVersion="7.5.2" x-pid="23883" > >> x-info="http://www.rsyslog.com"] exiting on signal 15. > >> 2013-07-19T17:08:13.493691-04:00 rsyslog-test rsyslogd: [origin > >> software="rsyslogd" swVersion="7.5.2" x-pid="23909" > >> x-info="http://www.rsyslog.com"] start > >> 2013-07-19T17:08:15.894360-04:00 rsyslog-test logger: upnow2 > >> > >> Now I stop the central log server: > >> ============================================ > >> root@relp-server /etc # service rsyslog stop rsyslog stop/waiting > >> > >> Log some messages to syslog on the client with "logger" and stop > >> rsyslog on the client: > >> ============================================ > >> root@rsyslog-test:/etc# logger downnow1 root@rsyslog-test:/etc# > >> logger downnow2 root@rsyslog-test:/etc# logger downnow3 > >> root@rsyslog-test:/etc# service rsyslog stop rsyslog stop/waiting > >> > >> > >> Since I have all messages being logged locally on the client I can see that > >> my messages were sent to syslog and syslog detected the central server > was > >> down after I logged the 'downnow2' message: > >> ============================================ > >> root@rsyslog-test:/etc# tail /var/log/rsyslog-test > >> 2013-07-19T17:07:20.413878-04:00 rsyslog-test rsyslogd: [origin > >> software="rsyslogd" swVersion="7.5.2" x-pid="23883" > >> x-info="http://www.rsyslog.com"] start > >> 2013-07-19T17:07:55.757080-04:00 rsyslog-test logger: upnow > >> 2013-07-19T17:08:13.439180-04:00 rsyslog-test rsyslogd: [origin > >> software="rsyslogd" swVersion="7.5.2" x-pid="23883" > >> x-info="http://www.rsyslog.com"] exiting on signal 15. > >> 2013-07-19T17:08:13.493691-04:00 rsyslog-test rsyslogd: [origin > >> software="rsyslogd" swVersion="7.5.2" x-pid="23909" > >> x-info="http://www.rsyslog.com"] start > >> 2013-07-19T17:08:15.894360-04:00 rsyslog-test logger: upnow2 > >> 2013-07-19T17:08:31.045190-04:00 rsyslog-test logger: downnow1 > >> 2013-07-19T17:08:48.446359-04:00 rsyslog-test logger: downnow2 > >> 2013-07-19T17:08:48.446601-04:00 rsyslog-test rsyslogd-2353: > >> omrelp[10.10.5.1:20514]: error 'TLS record write failed [gnutls error -53: > >> Error in the push function.]', object 'conn to srvr 10.10.5.1:20514' - > >> action may not work as intended [try http://www.rsyslog.com/e/2353 ] > >> 2013-07-19T17:08:54.453092-04:00 rsyslog-test logger: downnow3 > >> 2013-07-19T17:09:13.302262-04:00 rsyslog-test rsyslogd: [origin > >> software="rsyslogd" swVersion="7.5.2" x-pid="23909" > >> x-info="http://www.rsyslog.com"] exiting on signal 15. > >> > >> > >> I can see the messages except the first 'downnow1' are in the disk queue > >> file on the client: > >> ============================================ > >> root@rsyslog-test:/etc# cat /var/spool/rsyslog/queue/relpy.00000001 > >> <Obj:1:msg:1: > >> +iProtocolVersion:2:1:0: > >> +iSeverity:2:1:3: > >> +iFacility:2:1:5: > >> +msgFlags:2:1:1: > >> +ttGenTime:2:10:1374268128: > >> +tRcvdAt:3:34:2:2013:7:19:17:8:48:446601:6:-:4:0: > >> +tTIMESTAMP:3:34:2:2013:7:19:17:8:48:446601:6:-:4:0: > >> +pszTAG:1:14:rsyslogd-2353:: > >> +pszRawMsg:1:215:omrelp[10.10.5.1:20514]: error 'TLS record write failed > >> [gnutls error -53: Error in the push function.]', object 'conn to srvr > >> 10.10.5.1:20514' - action may not work as intended [try > >> http://www.rsyslog.com/e/2353 ]: > >> +pszHOSTNAME:1:17:rsyslog-test: > >> +pszInputName:1:8:rsyslogd: > >> +pszRcvFrom:1:17:rsyslog-test: > >> +pszRcvFromIP:1:9:127.0.0.1: > >> +offMSG:2:1:0: > >> >End > >> . > >> <Obj:1:msg:1: > >> +iProtocolVersion:2:1:0: > >> +iSeverity:2:1:5: > >> +iFacility:2:1:1: > >> +msgFlags:2:1:4: > >> +ttGenTime:2:10:1374268134: > >> +tRcvdAt:3:34:2:2013:7:19:17:8:54:453092:6:-:4:0: > >> +tTIMESTAMP:3:34:2:2013:7:19:17:8:54:453092:6:-:4:0: > >> +pszTAG:1:7:logger:: > >> +pszRawMsg:1:36:<13>Jul 19 17:08:54 logger: downnow3: > >> +pszInputName:1:8:imuxsock: > >> +pszRcvFrom:1:17:rsyslog-test: > >> +pszRcvFromIP:1:9:127.0.0.1: > >> +offMSG:2:2:27: > >> >End > >> . > >> <Obj:1:msg:1: > >> +iProtocolVersion:2:1:0: > >> +iSeverity:2:1:6: > >> +iFacility:2:1:5: > >> +msgFlags:2:1:1: > >> +ttGenTime:2:10:1374268153: > >> +tRcvdAt:3:34:2:2013:7:19:17:9:13:302262:6:-:4:0: > >> +tTIMESTAMP:3:34:2:2013:7:19:17:9:13:302262:6:-:4:0: > >> +pszTAG:1:9:rsyslogd:: > >> +pszRawMsg:1:115: [origin software="rsyslogd" swVersion="7.5.2" > >> x-pid="23909" x-info="http://www.rsyslog.com"] exiting on signal 15.: > >> +pszHOSTNAME:1:17:rsyslog-test: > >> +pszInputName:1:8:rsyslogd: > >> +pszRcvFrom:1:17:rsyslog-test: > >> +pszRcvFromIP:1:9:127.0.0.1: > >> +offMSG:2:1:0: > >> >End > >> . > >> <Obj:1:msg:1: > >> +iProtocolVersion:2:1:0: > >> +iSeverity:2:1:5: > >> +iFacility:2:1:1: > >> +msgFlags:2:1:4: > >> +ttGenTime:2:10:1374268128: > >> +tRcvdAt:3:34:2:2013:7:19:17:8:48:446359:6:-:4:0: > >> +tTIMESTAMP:3:34:2:2013:7:19:17:8:48:446359:6:-:4:0: > >> +pszTAG:1:7:logger:: > >> +pszRawMsg:1:36:<13>Jul 19 17:08:48 logger: downnow2: > >> +pszInputName:1:8:imuxsock: > >> +pszRcvFrom:1:17:rsyslog-test: > >> +pszRcvFromIP:1:9:127.0.0.1: > >> +offMSG:2:2:27: > >> >End > >> . > >> > >> > >> Now I restart rsyslog on the central server and the client, and log a few > >> more messages on the client: > >> ============================================ > >> root@relp-server /etc # service rsyslog start > >> rsyslog start/running, process 331 > >> ============================================ > >> root@rsyslog-test:/etc# service rsyslog start > >> rsyslog start/running, process 23971 > >> root@rsyslog-test:/etc# logger upagain > >> root@rsyslog-test:/etc# logger upagain2 > >> root@rsyslog-test:/etc# logger upagain3 > >> > >> I check the central server logs and all messages except the lost > "downnow1" > >> message are successfully received: > >> ============================================ > >> root@relp-server /etc # tail -n 20 /var/log/relp_log > >> 2013-07-19T17:07:55.757080-04:00 rsyslog-test logger: upnow > >> 2013-07-19T17:08:13.439180-04:00 rsyslog-test rsyslogd: [origin > >> software="rsyslogd" swVersion="7.5.2" x-pid="23883" > >> x-info="http://www.rsyslog.com"] exiting on signal 15. > >> 2013-07-19T17:08:13.493691-04:00 rsyslog-test rsyslogd: [origin > >> software="rsyslogd" swVersion="7.5.2" x-pid="23909" > >> x-info="http://www.rsyslog.com"] start > >> 2013-07-19T17:08:15.894360-04:00 rsyslog-test logger: upnow2 > >> 2013-07-19T17:08:48.446601-04:00 rsyslog-test rsyslogd-2353: > >> omrelp[10.10.5.1:20514]: error 'TLS record write failed [gnutls error -53: > >> Error in the push function.]', object 'conn to srvr 10.10.5.1:20514' - > >> action may not work as intended [try http://www.rsyslog.com/e/2353 ] > >> 2013-07-19T17:08:54.453092-04:00 rsyslog-test logger: downnow3 > >> 2013-07-19T17:09:13.302262-04:00 rsyslog-test rsyslogd: [origin > >> software="rsyslogd" swVersion="7.5.2" x-pid="23909" > >> x-info="http://www.rsyslog.com"] exiting on signal 15. > >> 2013-07-19T17:08:48.446359-04:00 rsyslog-test logger: downnow2 > >> 2013-07-19T17:10:15.245473-04:00 rsyslog-test rsyslogd: [origin > >> software="rsyslogd" swVersion="7.5.2" x-pid="23971" > >> x-info="http://www.rsyslog.com"] start > >> 2013-07-19T17:10:35.210411-04:00 rsyslog-test logger: upagain > >> 2013-07-19T17:10:41.354935-04:00 rsyslog-test logger: upagain2 > >> 2013-07-19T17:10:48.602667-04:00 rsyslog-test logger: upagain3 > >> > >> > >> Here is the relp server config: > >> ============================================ > >> root@relp-server /etc # cat rsyslog.conf > >> $PrivDropToUser syslog > >> $PrivDropToGroup syslog > >> > >> $AbortOnUncleanConfig off > >> > >> $umask 0000 # Ensure umask is reset > >> $DirCreateMode 0700 > >> $FileCreateMode 0644 > >> $DirOwner syslog > >> $DirGroup syslog > >> > >> # DynaFile settings > >> $FileOwner syslog > >> $FileGroup syslog > >> $FailOnChownFailure on > >> $DynaFileCacheSize 100 # a cache of 100 files at most > >> > >> $MaxOpenFiles 200000 > >> > >> $MaxMessageSize 64k > >> > >> $MainMsgQueueSize 100000 > >> > >> $DropTrailingLFOnReception on > >> $Escape8BitCharactersOnReceive off > >> $EscapeControlCharactersOnReceive on > >> $RepeatedMsgReduction off # log every message > >> > >> $DropMsgsWithMaliciousDnsPTRRecords off > >> > >> module(load="imuxsock") > >> module(load="imrelp" ruleset="relp") > >> > >> input(type="imrelp" port="20514" tls="on" > >> tls.caCert="/etc/rsyslog.certs/ca.pem" > >> tls.myCert="/etc/rsyslog.certs/server.cert" > >> tls.myPrivKey="/etc/rsyslog.certs/server.key" > >> tls.authMode="name" > >> tls.permittedpeer=["rsyslog-test.mydomain.com"] > >> ) > >> > >> ruleset (name="relp") { > >> $RulesetCreateMainQueue on > >> action(type="omfile" file="/var/log/relp_log") > >> } > >> > >> $IncludeConfig /etc/rsyslog.d/ > >> > >> *.* /var/log/syslog > >> > >> > >> > >> And the client config: > >> ============================================ > >> root@rsyslog-test:/etc# cat rsyslog.conf > >> module(load="imuxsock") > >> module(load="omrelp") > >> module(load="imtcp") > >> > >> input(type="imtcp" port="514") > >> > >> $WorkDirectory /var/spool/rsyslog/queue # default location for work > (spool) > >> files > >> > >> $MaxMessageSize 64k > >> > >> $MainMsgQueueFileName mainq # set file name, also enables disk mode > >> $MainMsgQueueType LinkedList > >> $MainMsgQueueSaveOnShutDown on > >> $MainMsgQueueMaxDiskSpace 40g > >> $MainMsgQueueSize 8000000 > >> > >> $ActionSendResendLastMsgOnReconnect on > >> > >> action(type="omrelp" target="10.10.5.1" port="20514" tls="on" > >> tls.compression="on" > >> tls.caCert="/etc/rsyslog.certs/ca.pem" > >> tls.myCert="/etc/rsyslog.certs/rsyslog-test.cert" > >> tls.myPrivKey="/etc/rsyslog.certs/rsyslog-test.key" > >> tls.authmode="name" > >> tls.permittedpeer=["log.mydomain.com"] > >> queue.filename="relpy" > >> queue.maxdiskspace="10g" > >> queue.saveonshutdown="on" > >> queue.type="linkedlist" > >> queue.maxfilesize="20m" > >> action.resumeretrycount="-1" > >> ) > >> > >> *.* /var/log/rsyslog-test > >> > >> > >> -- > >> Dave > >> _______________________________________________ > >> rsyslog mailing list > >> http://lists.adiscon.net/mailman/listinfo/rsyslog > >> http://www.rsyslog.com/professional-services/ > >> What's up with rsyslog? Follow https://twitter.com/rgerhards > >> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad > of > >> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you > DON'T > >> LIKE THAT. > > > > _______________________________________________ > > rsyslog mailing list > > http://lists.adiscon.net/mailman/listinfo/rsyslog > > http://www.rsyslog.com/professional-services/ > > What's up with rsyslog? Follow https://twitter.com/rgerhards > > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of > > sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you > DON'T > > LIKE THAT. > > > _______________________________________________ > rsyslog mailing list > http://lists.adiscon.net/mailman/listinfo/rsyslog > http://www.rsyslog.com/professional-services/ > What's up with rsyslog? Follow https://twitter.com/rgerhards > NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of > sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you > DON'T LIKE THAT. _______________________________________________ rsyslog mailing list http://lists.adiscon.net/mailman/listinfo/rsyslog http://www.rsyslog.com/professional-services/ What's up with rsyslog? Follow https://twitter.com/rgerhards NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.

