Hello Rainer,

Could you check if this patch is correct and suitable for submission?

I came across an annoying bug using ommysql but I think it would be happening to any action with a queue that returns RS_RET_DATAFAIL -- or maybe only for transactional actions?

# rsyslog-fix-infinite-loop-on-action-datafail.patch

I believe this commit has introduced the issue:
59627f23bee26f3acec19d491d5884bcd1fb672e
core: fix message loss on target unavailibility during shutdown
closes https://github.com/rsyslog/rsyslog/issues/2760

Elements of configuration to reproduce the bug are attached (not the easiest because it is using ommysql to demonstrate):

# rsyslog.conf
# rsyslog-db.sql

To trigger the RS_RET_DATAFAIL, I do an INSERT with a value that do not fit the column size (it returns mysql error code 1406).

echo "<1>test: hop $((++i))" > /dev/udp/127.0.0.1/5514

With MariaDB strict mode it produces:

rsyslogd: ommysql: db error (1406): Data too long for column 'service' at row 1 [v8.2104.0]

In ommysql this returns RS_RET_DATAFAIL.

rsyslogd: action 'out-sql-test' (module 'ommysql') message lost, could not be processed. Check for additional error messages before this one. [v8.2104.0 try https://www.rsyslog.com/e/2218 ]

Now generate a second RS_RET_DATAFAIL.

echo "<1>test: hop $((++i))" > /dev/udp/127.0.0.1/5514

Function processMsgMain() called from processBatchMain() returns RS_RET_DATAFAIL and then starts an infinite loop, the message is never removed from the queue.

First time:

4820.707506050:nagiosevents-comment queue:Reg/w0: queue.c: DeleteProcessedBatch: etry 0 state 3 4820.707525035:nagiosevents-comment queue:Reg/w0: queue.c: DeleteProcessedBatch: we deleted 1 objects and enqueued 0 objects 4820.707538508:nagiosevents-comment queue:Reg/w0: queue.c: rger: deleteBatchFromQStore, nElem 1 4820.707552043:nagiosevents-comment queue:Reg/w0: queue.c: doDeleteBatch: delete batch from store, new sizes: log 0, phys 0

Second time, loop, message is always re-enqueued:

4829.519359069:nagiosevents-comment queue:Reg/w0: queue.c: DeleteProcessedBatch: etry 0 state 0 4829.519382642:nagiosevents-comment queue:Reg/w0: queue.c: DeleteProcessedBatch: we deleted 0 objects and enqueued 1 objects 4829.519393587:nagiosevents-comment queue:Reg/w0: queue.c: rger: deleteBatchFromQStore, nElem 1 4829.519411018:nagiosevents-comment queue:Reg/w0: queue.c: doDeleteBatch: delete batch from store, new sizes: log 1, phys 1

Kind regards,
Julien

## debug

$DebugFile /tmp/rsyslog-dev/data/rsyslog.debug
$DebugLevel 1

## globals

$maxopenfiles 10000
global(umask="0027")
global(maxmessagesize="32768")
global(shutdown.enable.ctlc="on")
global(net.enabledns="off")
global(parser.permitslashinprogramname="on")
global(workdirectory="/tmp/rsyslog-dev/spool")

## modules

module(load="builtin:omfile" Template="RSYSLOG_TraditionalFileFormat")
module(load="imudp")
module(load="ommysql")

## main queue

#main_queue(
#    queue.type="LinkedList"
#    queue.size="5000"
#    queue.timeoutenqueue="0"
#    queue.dequeuebatchsize="200"
#    queue.workerthreads="1"
#    queue.workerthreadminimummessages="1000"
#
#    # disk assisted queue
#    queue.maxfilesize="5m"
#    queue.maxdiskspace="100m"
#    queue.saveonshutdown="on"
#    queue.filename="main"
#)

## inputs

input(
    type="imudp"
    address="0.0.0.0"
    port="5514"
    name="in-net-udp-514"
    rcvbufsize="16777216"
)

## templates

template (name="tpl-sql-test" type="list" option.sql="on") {
    constant(value="INSERT INTO test ")
    constant(value="(ts, host, service, message) ")
    constant(value="VALUES ('")
    property(name="timegenerated" dateformat="rfc3339" position.from="1" 
position.to="10")
    constant(value=" ")
    property(name="timegenerated" dateformat="rfc3339" position.from="12" 
position.to="19")
    constant(value="', '")
    property(name="$!e.host")
    constant(value="', '")
    property(name="$!e.service")
    constant(value="', '")
    property(name="$!e.message")
    constant(value="')")
}

## rules

if ($rawmsg contains "hop") then {
    set $!e.host = "ZeHost";
    set $!e.service = "ZeService012345678901234567890";
    set $!e.message = $msg;

    action (
        type="ommysql"
        template="tpl-sql-test"
        name="out-sql-test"
        server="172.28.42.7"
        db="dbtest"
        uid="root"
        pwd="password"

        queue.size="10000"
        queue.type="LinkedList"
        #queue.spooldirectory="/tmp/rsyslog-dev/spool"
        #queue.filename="out-sql-test"
        #queue.saveonshutdown="on"
        #queue.maxdiskspace="100m"
        #queue.timeoutenqueue="0"
    )
}

Attachment: rsyslog-db.sql
Description: application/sql

diff --git a/action.c b/action.c
index 17a74a74c..fec1f9ff1 100644
--- a/action.c
+++ b/action.c
@@ -1694,6 +1694,7 @@ processBatchMain(void *__restrict__ const pVoid,
 			if(   localRet == RS_RET_OK
 			   || localRet == RS_RET_DEFER_COMMIT
 			   || localRet == RS_RET_ACTION_FAILED
+			   || localRet == RS_RET_DATAFAIL
 			   || localRet == RS_RET_PREVIOUS_COMMITTED ) {
 				batchSetElemState(pBatch, i, BATCH_STATE_COMM);
 				DBGPRINTF("processBatchMain: i %d, COMM state set\n", i);

_______________________________________________
rsyslog mailing list
https://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of 
sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE 
THAT.

Reply via email to