Yes, that was in the first message. I have since reimplemented everything, and it is working without any issues now. That was not the problem, by the way: it was not even there at the time of the first message; this is actually the reimplementation I did on the rsyslog side. As I said, it works without any issues now. But thanks.

On 21/09/2023 21:09, Joan Sala wrote:
(Disclaimer: I have not read all this thread in depth.)

This flag, confirmMessages="on", sounds suspicious. With this flag, omprog waits for the script to confirm each received log line (if I recall correctly), but your script (looking at your first message) doesn't seem to do this (to be sure, we would need to see the Python part). This could cause the queue to stall...
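
For illustration, here is a minimal sketch of the confirmation handshake as I understand it from the omprog documentation: the program writes a line containing OK to stdout once at startup and once after every message it has handled, and any other line counts as a failure. This is not the script from this thread (we have not seen it); `serve` and `handle` are hypothetical names, and the in-memory streams only stand in for the pipes rsyslog would provide.

```python
import io


def serve(lines, out, handle):
    """Process log lines one at a time, confirming each one on `out`.

    `lines` is any iterable of text lines (normally sys.stdin) and `out`
    a writable text stream (normally sys.stdout). `handle` is a
    hypothetical per-message callback standing in for the real work.
    """
    # Startup confirmation: tells the caller the program is ready.
    out.write("OK\n")
    out.flush()
    for line in lines:
        message = line.rstrip("\n")
        try:
            handle(message)
            status = "OK"
        except Exception as exc:
            # Any single line other than "OK" reports a failure.
            status = "Error: " + str(exc).replace("\n", " ")
        out.write(status + "\n")
        out.flush()  # an unflushed confirmation never reaches the caller


# Demo with in-memory streams instead of real pipes.
demo_out = io.StringIO()
serve(["first\n", "second\n"], demo_out, handle=lambda msg: None)
print(demo_out.getvalue())
```

In a real script the call would be `serve(sys.stdin, sys.stdout, handle=...)`. The explicit flush after every status line matters: a script that buffers or never writes its confirmations will look unresponsive to the caller.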



On Thu, Sep 21, 2023, 11:26 TG Servers via rsyslog <[email protected]> wrote:

    I don't think we are talking about the same thing anymore.
    I have told you several times that journald is not involved in this.
    You cannot find a single line of these logs in journald, so I am not
    using journald as a queue, because I am not using journald at all for
    this process.

    "you did not configure rsyslog to use a separate queue for the logs
    from this socket..."?
    I told you that I implemented a separate queue... you tell me I didn't

    What are you talking about? The implementation I sent in my original
    message? Yes, at that point I didn't.
    But I have written several mails in the meantime, saying that I
    completely reimplemented this with a dedicated socket and a dedicated
    queue.

    The socket does nothing other than receive messages with the tag
    "app", and these messages never touch journald.

    This is a separate queue, isn't it?

    if $programname == "app" then {
        action(type="omprog"
               name="app_log"
               binary="/usr/local/script/app_log.sh"
               template="app"
               confirmMessages="on"
               confirmTimeout="5000"
               queue.type="LinkedList"
               queue.size="10000"
               closeTimeout="10000"
               queue.workerThreads="2"
               action.resumeInterval="5"
               killUnresponsive="on"
               )
    }
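
As a rough back-of-the-envelope check on the action above, the sketch below estimates how quickly such a queue can drain and how long it takes to fill. The assumptions are loud ones: each of the two worker threads is taken to process one message at a time, each message is taken to cost about half a second (the figure mentioned elsewhere in this thread), and the arrival rates of 3 and 24 messages per second are made up for illustration. The function names are hypothetical.

```python
def drain_rate(workers, seconds_per_message):
    """Messages per second the action can process, assuming every worker
    handles one message at a time at a fixed per-message cost."""
    return workers / seconds_per_message


def seconds_until_full(queue_size, arrival_rate, service_rate):
    """Time for an initially empty queue to fill, or None if it never does."""
    if arrival_rate <= service_rate:
        return None  # the consumer keeps up, the queue does not grow
    return queue_size / (arrival_rate - service_rate)


rate = drain_rate(workers=2, seconds_per_message=0.5)
print(rate)                                                          # 4.0
print(seconds_until_full(10000, arrival_rate=3, service_rate=rate))  # None
print(seconds_until_full(10000, arrival_rate=24, service_rate=rate)) # 500.0
```

Under these assumptions the queue absorbs short bursts comfortably, but a sustained rate above roughly four messages per second would eventually fill it, whatever its size.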

    Also, the fact that nginx had no access to the socket after an rsyslog
    restart had nothing to do with a full queue; the socket was simply not
    listening. Since the socket is now handled by systemd, this is not a
    problem anymore either.


    On 21/09/2023 10:55, David Lang wrote:
    > if you are sending logs to journald and having journald send logs to
    > syslog, you are using journald as a queue for the delivery
    >
    > when you were delivering directly to rsyslog, what was probably
    > happening (we don't know because you never enabled impstats to see)
    > is that the logs were arriving, but because your script takes so long
    > to process each log message, the queue was filling up, and when the
    > queue is full, rsyslog cannot accept another message, and that
    > results in the error that you are reporting.
    >
    > you did not configure rsyslog to use a separate queue for the logs
    > from this socket, so as they arrived they got added to the main queue
    > along with all other logs.
    >
    > rsyslog has options to tell it to throw away logs when it's too busy
    > (and can even prioritize which ones it throws away). you can also
    > configure it to write logs to disk when the memory queue gets too
    > full. But eventually you will run out of disk space if you keep
    > getting logs faster than you can process them.
    >
    > David Lang
    >
    >
    >
    > On Thu, 21 Sep 2023, TG Servers wrote:
    >
    >>  I did not get a single message from you, David, regarding that,
    >> which confused me quite a bit, as Rainer had already mentioned you
    >> before. Now I know why:
    >> 450 4.7.25 Client host rejected: cannot find your hostname,
    >> [66.167.xxx.xxx]; from=<[email protected]> to=<[email protected]>
    >> proto=ESMTP helo=<mail.lang.hm>
    >> I have made an exception now.
    >>
    >> But to the point: sadly, I don't understand what the two of you are
    >> telling me now. This has nothing to do with journald. It just did
    >> not work when the socket was created by rsyslog. Whether this is/was
    >> an rsyslog or an nginx problem does not matter to me in the end, as
    >> it had to be fixed.
    >>
    >> I am using a dedicated socket, completely separate from sysSock, and
    >> a dedicated queue. sysSock is not involved, and nginx does not log a
    >> single line to journald, so how could journald act as a queue here,
    >> or be negatively affected, when it does not receive a single message
    >> from the process involved? It can't throw away something it does not
    >> have.
    >> This socket is used only by and for this process and nothing else.
    >> I am also unlikely to run out of queue space, because this process
    >> is not under full load 24/7. That might happen under an attack, but
    >> otherwise I don't see it happening.
    >> If I am seeing things wrong, I would be happy to have it made clear
    >> to me, because as of now I do not see the problem.
    >>
    >> Thanks,
    >> Thomas
    >>
    >>
    >> On 21/09/2023 08:34, Rainer Gerhards wrote:
    >>> I guess it works because journal always throws messages away if it
    >>> cannot deliver them quickly. Like a very short timeout+drop queue
    >>> config in rsyslog.
    >>>
    >>> Rainer
    >>>
    >>> Sent from phone, thus brief.
    >>>
    >>> David Lang <[email protected]> schrieb am Do., 21. Sept. 2023, 08:23:
    >>>
    >>>     now you have journald acting as a queue, so all messages from
    >>>     journald will end up delayed when your script cannot keep up.
    >>>     You haven't solved the problem of the slow script, you've just
    >>>     added another layer of buffer to fill up before you notice.
    >>>
    >>>     with rsyslog you can set the queue size to whatever you want,
    >>>     and you can spill logs to disk when your queue fills up.
    >>>
    >>>     but no matter what you do, if you have something that is
    >>>     processing logs slower than they are being generated, eventually
    >>>     you will run out of queue space (in memory or on disk) and have
    >>>     to stop accepting new messages, or start throwing away messages
    >>>     you haven't processed yet
    >>>
    >>>     David Lang
    >>>
    >>>     On Thu, 21 Sep 2023, TG Servers via rsyslog wrote:
    >>>
    >>>     > the only way I was able to fix this was to use a dedicated
    >>>     > socket created via systemd and passed via systemd to rsyslog
    >>>     > since then it is working without any issues.
    >>>     > although I implemented a queue, too, this did not fix the
    >>>     > problem as long as the socket was handled by rsyslog itself
    >>>     > so this is "fixed" from my point of view, I know for the
    >>>     > future now
    >>>     >
    >>>     > On 18/09/2023 21:53, TG Servers via rsyslog wrote:
    >>>     >> I don't know what this is... I implemented a complete queue
    >>>     >> solution, and it still occasionally happens when there is
    >>>     >> only a single request in sight, and that one then gets a 111.
    >>>     >> There is nothing in the nginx debug log and no error to be
    >>>     >> seen in the rsyslog log.
    >>>     >> But one thing I noticed: after a restart, the first log
    >>>     >> message always, reproducibly, gets a 111.
    >>>     >> The socket is neither connected nor listening; only after the
    >>>     >> first request is logged (or not logged, which shows up as a
    >>>     >> 111 in nginx) is the socket connected and listening. So
    >>>     >> restarting rsyslog via systemd does not connect to or listen
    >>>     >> on the socket.
    >>>     >> the rsyslog debug log just tells us this :
    >>>     >> 6289.088037540:main thread    : imuxsock.c: imuxsock: Opened UNIX socket '/run/logmat' (fd 6).
    >>>     >>
    >>>     >> [root@xxx rsyslog.d]# systemctl restart rsyslog
    >>>     >> [root@xxx rsyslog.d]# ss -x | grep logmat
    >>>     >> [root@xxx rsyslog.d]# lsof /run/logmat
    >>>     >> COMMAND      PID USER   FD TYPE             DEVICE SIZE/OFF     NODE NAME
    >>>     >> rsyslogd 2097140 root    6u unix 0x0000000000000000      0t0 25300317 /run/logmat type=DGRAM (UNCONNECTED)
    >>>     >>
    >>>     >> make a request from browser or curl
    >>>     >>
    >>>     >> [root@xxx rsyslog.d]# lsof /run/logmat
    >>>     >> COMMAND      PID USER   FD TYPE             DEVICE SIZE/OFF     NODE NAME
    >>>     >> rsyslogd 2097140 root    6u unix 0x0000000000000000      0t0 25300317 /run/logmat type=DGRAM (CONNECTED)
    >>>     >> [root@xxx rsyslog.d]# ss -x | grep logmat
    >>>     >> u_dgr ESTAB 0 0 /run/logmat 25300317            * 0
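
To make the "111" concrete: assuming it is errno 111, which is ECONNREFUSED on Linux, the sketch below shows one way a sender to a Unix datagram socket can hit that error, namely when the socket file exists on disk but no process has a socket bound to it. This only illustrates the error code; it is not a diagnosis of what rsyslog or nginx did in this thread. The path and the helper `send_datagram` are hypothetical, and the behaviour shown is what I would expect on Linux.

```python
import errno
import os
import socket
import tempfile


def send_datagram(path, payload):
    """Send one datagram to a Unix socket path.

    Returns None on success or the errno value on failure.
    """
    client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        client.sendto(payload, path)
        return None
    except OSError as exc:
        return exc.errno
    finally:
        client.close()


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo.sock")  # hypothetical, not /run/logmat

# While a receiver is bound to the path, the datagram is delivered.
receiver = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
receiver.bind(path)
result_bound = send_datagram(path, b"hello")
received = receiver.recv(1024)

# After the receiver closes, the socket file is still on disk, but
# nothing is bound to it, so the next send is refused.
receiver.close()
result_stale = send_datagram(path, b"hello again")

print(result_bound, received, errno.errorcode.get(result_stale))

os.unlink(path)
os.rmdir(workdir)
```

With systemd socket activation the listening socket is created and held by systemd rather than by the daemon, so the path always has something bound to it across a daemon restart, which fits the observation that the problem went away once systemd handled the socket.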
    >>>     >>
    >>>     >> On 18/09/2023 16:34, TG Servers via rsyslog wrote:
    >>>     >>> I just wanted to add that in a further message as it came
    >>>     >>> to my mind. you were faster...
    >>>     >>> the script is definitely "slow", this is what I know for
    >>>     >>> sure, as it does quite a lot of processing/analytics in the
    >>>     >>> background, so even if you trigger it from the command line
    >>>     >>> it can take half a sec or so....
    >>>     >>> I can't change that, it needs to do what it does, I didn't
    >>>     >>> write it
    >>>     >>> though it can handle manual fast F5 triggers in the browser
    >>>     >>> without issue and then it 111s when there are 2 requests
    >>>     >>> incoming...
    >>>     >>> I thought rsyslog might handle that just well via the
    >>>     >>> queue...
    >>>     >>> but then this might eventually really be the issue, and if
    >>>     >>> it is, is there anything to mitigate this from rsyslog side
    >>>     >>> (in terms of own queue for that socket or something in that
    >>>     >>> direction)?
    >>>     >>> ok, will enable impstats, too when I switch back
    >>>     >>>
    >>>     >>> Thanks,
    >>>     >>> Tom
    >>>     >>>
    >>>     >>> On 18/09/2023 16:17, Rainer Gerhards wrote:
    >>>     >>>>> so far not a single 111 today, I will let this run until
    >>>     >>>>> late evening, and if there is still no 111 I will put back
    >>>     >>>>> the python script in order
    >>>     >>>>> because right now there are 2 possibilities, I moved the
    >>>     >>>>> socket as said, and I skipped the script and just appended
    >>>     >>>>> the message to a file
    >>>     >>>>> if either of the 2 things is responsible in the end I
    >>>     >>>>> won't understand it either :)
    >>>     >>>> I don't know what the script does. But if it is slow, it
    >>>     >>>> may push back to the main queue, making rsyslog
    >>>     >>>> unresponsive.
    >>>     >>>>
    >>>     >>>> This is David's concern. Tomorrow, if you re-enable, you
    >>>     >>>> should also enable impstats as David suggested.
    >>>     >>>>
    >>>     >>>> Rainer
    >>>     >>>
    >>>     >>> _______________________________________________
    >>>     >>> rsyslog mailing list
    >>>     >>> https://lists.adiscon.net/mailman/listinfo/rsyslog
    >>>     >>> http://www.rsyslog.com/professional-services/
    >>>     >>> What's up with rsyslog? Follow https://twitter.com/rgerhards
    >>>     >>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED
    >>>     >>> by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE
    >>>     >>> and DO NOT POST if you DON'T LIKE THAT.
    >>>     >>
    >>>     >
    >>>
    >>
    >>


