On Thu, 2 Nov 2023, computerquip-work wrote:

This is a bit unorganized of a take so I'm going to apologize ahead of time. 
These are the things I could think of off the top of my head.

1. Documentation is unclear and doesn't take itself seriously.

What I mean by this is that it states things that you can't take at face value. For example, in your overview, you state that `[2]`, or legacy syntax, is discouraged but the documentation says `this is the format best used to express basic things`. People take that comment seriously and I've seen a lot of mixing and matching of both formats, where it then ends up as two things in the same configuration file that get expressed in two different ways. As a result, you can't just know RainerScript or legacy syntax, you have to understand both if you want to read a configuration file. Even the sample configuration often used as a default doesn't use RainerScript, it uses the legacy syntax. https://github.com/rsyslog/rsyslog/blob/master/sample.conf

Yes, this is true. RainerScript is a recent addition because attempts to graft more functionality into the old syslog syntax got so ugly that even the rsyslog developers were having trouble reading configs and understanding what they do.

Initially there was talk about phasing out the old syntax, but to maintain backwards compatibility (avoid breaking existing configs) we decided to maintain support for both.

You give a solid overview that matches how I view the legacy vs RainerScript situation... but also, while RainerScript is more verbose, it's incredibly confusing to mix and match several syntaxes. It is not clear to me at all what's "recommended" anymore, and rsyslog itself (both as a community and a product) seems unclear on the topic.

we have to support both to avoid breaking existing configs. the recommendation is to use whichever is clearest to the team maintaining the config, but if you need multiple lines to configure something in the legacy format, you are probably better off using the new format.
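
As a rough illustration of that rule of thumb, here is a sketch of one forwarding rule written both ways. The host name and queue file name are made up for the example; the directives and parameters are the usual queue and retry settings.

```
# legacy: each $-directive modifies the NEXT action, so the order matters
$ActionQueueType LinkedList
$ActionQueueFileName fwdq
$ActionResumeRetryCount -1
*.* @@logs.example.com:514

# RainerScript: the same settings travel with the action itself
action(type="omfwd" target="logs.example.com" port="514" protocol="tcp"
       queue.type="LinkedList" queue.filename="fwdq"
       action.resumeRetryCount="-1")
```

A one-line rule such as `mail.info /var/log/mail.log` is arguably still clearest in the legacy form.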

2. Variables and their use are a mess.

I'm still not sure how to express variables in RainerScript. For examples that 
are used in the documentation:
* `property(name="$!usr!msgnum")`
* `constant(outname="@version" value="1" format="jsonf")` (Actually isn't a 
variable at all)
* `set $!usr!tpl2!dataflow = field($msg, 58, 2);`
* `property(name="$!")`
* `set $.tnow = $$now-unixtimestamp`

Where am I supposed to look in the documentation to interpret these? There is some explanation [here](https://www.rsyslog.com/doc/master/rainerscript/variable_property_types.html). But notice that it's not comprehensive. It doesn't mention all of the formats above at all. I'm basically on my own for anything not documented for the examples above. I've ended up using `$.` for most everything since I don't have any idea why I'd use `$!`, and I still to this day have no clue what `$$` means (the best I can figure is that the actual variable name is `$now-unixtimestamp` and it's just stuck like that). There's no mention of scoping (or lack thereof), and there's no real mention of how to set your own variables, only that you can do it.

3. Templates are split into different formats.
Similar to `1`, templates have several different ways to express themselves and 
it's not clear why you'd use one over the other. For the most part, I've just 
used the more expressive version with explicit `constant`, `property`, etc. in 
a list. There are a couple of instances where I couldn't figure out how to 
express that in a list so I did use string.

These are both the legacy of how things were added to rsyslog (along with the implementation details), and can't be cleaned up without breaking backwards compatibility. Yes, in retrospect it's bad and ugly and should have been done differently back in the really early days, but we don't see a way to get out of it. I can give you an explanation of what is and why it got this way. I'd appreciate any suggestions on how we can better document this (as I said before, the people who wrote the documentation are too close to the code).

initially there were 'message properties' such as timestamp and hostname.
then system properties were added such as $myhostname
https://www.rsyslog.com/doc/v7-stable/configuration/properties.html

these were referenced in templates as
$template foo, "this uses a variable %timestamp% or %$myhostname%"
when rainerscript was added, they were referenced as $timestamp and $$myhostname in an if statement.
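
A small sketch of the two reference styles side by side; the file path is only an example. The system property's name already starts with `$`, which is why it ends up as `$$` in an expression:

```
# legacy template: properties are wrapped in %...%
$template foo,"this uses a variable %timestamp% or %$myhostname%\n"

# RainerScript expression: a message property gets one leading $,
# a system property keeps its own $ as well, hence $$myhostname
if $hostname == $$myhostname then {
    action(type="omfile" file="/var/log/local-only.log" template="foo")
}
```

Read the same way, `$$now-unixtimestamp` is the system property `$now-unixtimestamp` referenced from an expression.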

RFC-5424 was written to standardize syslog formats better than the prior RFC-3164, and it included an ability to add structured data to log messages. Pretty much nobody used it. A few years later, the various logging projects got together to try and define a standard for structuring logs in messages. The only part of it that survived was the idea to encode messages as JSON in the body of the message, and then have the logging systems parse the messages with ! as a reserved character so:
{'a': 'foo', 'b': {'c': 'bar', 'd':'baz'}}
would let you use
$! (returning "{'a': 'foo', 'b': {'c': 'bar', 'd':'baz'}}")
$!a (returning 'foo')
$!b (returning "{'c': 'bar', 'd':'baz'}")
$!b!c (returning 'bar')
This is when user definable variables were added to rsyslog (initially just as the result of a message modification module parsing messages, but then the set/unset statements were added allowing manipulation of variables in the config)
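
To make that concrete, here is a minimal sketch, assuming the message arrives with the `@cee:` cookie that mmjsonparse looks for by default. The template name and output path are invented for the example:

```
# assumed incoming message body:
#   @cee: {"a": "foo", "b": {"c": "bar", "d": "baz"}}
module(load="mmjsonparse")
action(type="mmjsonparse")

# set/unset change the same tree from the config
set $!b!e = "added by the config";
unset $!a;

# writes the whole tree, then a single leaf taken from it
template(name="tree_demo" type="list") {
    property(name="$!")
    constant(value=" c=")
    property(name="$!b!c")
    constant(value="\n")
}
action(type="omfile" file="/var/log/tree-demo.log" template="tree_demo")
```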

I am responsible for us adding the $. namespace, so that we have a place to put variables that we don't want included when we refer to $!. These are things like variables you use in conditions, values you will use in file path templates, etc. Other than the fact that message modification modules that parse messages default to populating $!, there is no technical difference in how $! and $. variables can be used; they are simply two different namespaces. (Sometimes $. is referred to as 'local' variables, reflecting the history of using it for internal processing, while $! is historically used for things that will end up in an outbound message.)
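
A sketch of how the two namespaces are typically split; the variable names, the `dc1` value, and the paths are assumptions made for the example:

```
# $. : working values that should not show up when $! is written out
set $.logdir = "/var/log/hosts/" & $hostname;

# $! : data meant to end up in the outbound message
set $!usr!site = "dc1";

# file name template built from the $. variable
template(name="per_host_file" type="string" string="%$.logdir%/messages.log")

# output template that emits the whole $! tree
template(name="json_out" type="list") {
    property(name="$!")
    constant(value="\n")
}

action(type="omfile" dynaFile="per_host_file" template="json_out")
```

Because `$.logdir` lives outside `$!`, it drives the file name without appearing in the JSON that gets written.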

If you log a message using the RSYSLOG_DebugFormat you will see these variable namespaces down at the bottom of the message block.
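
For instance, a temporary debugging action along these lines should do it, in either syntax (the file names are only examples):

```
# RainerScript
action(type="omfile" file="/var/log/rsyslog-debug.log" template="RSYSLOG_DebugFormat")

# legacy equivalent
*.* /var/log/rsyslog-debug-legacy.log;RSYSLOG_DebugFormat
```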

$/ was added at the same time as $. so that there is a way to set a variable that will persist past the processing of a single message. These aren't used much, and the locking needed to make them reasonably reliable is costly, so they are something to avoid if you can.
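
A rough sketch of the kind of thing a persistent variable allows. The prefix is written `$/` in the rsyslog documentation; the variable name and the marker strings are made up, and this is an illustration rather than a recommended pattern:

```
# remember across messages whether a maintenance window is open
if $msg contains "maintenance start" then {
    set $/maintenance = "on";
} else if $msg contains "maintenance end" then {
    set $/maintenance = "off";
}

# later messages can test the value that an earlier message set
if $/maintenance == "on" then stop
```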

the simple template definition doesn't work well when complex escaping is needed, when things need to be formatted into JSON structures, etc., so new ways of defining a template were added. I'm not sure the new string format should have been added (it's just more syntactic sugar around the old way of defining templates), but that was in the days when a break with the existing config format was being considered.

personally, I almost always use the legacy format for template definitions.
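
For comparison, here is a sketch of one simple template written in all three forms (the template names are arbitrary):

```
# legacy
$template tpl_legacy,"%timestamp% %hostname% %msg%\n"

# string type: the same format string inside a template() object
template(name="tpl_string" type="string" string="%timestamp% %hostname% %msg%\n")

# list type: every piece spelled out
template(name="tpl_list" type="list") {
    property(name="timestamp")
    constant(value=" ")
    property(name="hostname")
    constant(value=" ")
    property(name="msg")
    constant(value="\n")
}
```

The list form only starts to pay for its verbosity once per-field options such as `format="jsonf"` are needed.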

Not doing a break with the old config ended up being a significant advantage, it is what allowed the distros to switch from sysklogd (which wasn't being maintained) to rsyslog with minimal disruption. If we had made that change require the new syntax, I think odds are good that syslog-ng would have been selected and rsyslog may have faded away (syslog-ng has now gone the freemium route where you have to pay to get the full feature set)

the documentation for all of this was mostly written one page at a time as things changed, grafting the pages into the existing documentation


Now that I have given you the 'what is' and the history behind it, do you have suggestions for how we can update the documentation to better show and explain this? The docs tend to be a very dry reference material structure, but it may be that we need to give this history somewhere in there to explain the 'why' around this.

And if you can suggest changes that we can make to make things more consistent, please do (but keep in mind that for backwards compatibility, we aren't going to be able to remove support for the existing stuff).


4. Text is displayed in a not-friendly manner.
Some parts of the online documentation require you to scroll over a ridiculous 
amount to actually read it: https://i.imgur.com/Ujl289L.png

do you mean horizontal scrolling? we thought we had fixed this

6. The index is too empty.

Not sure what's up with the index but there's basically nothing in there. No reference to `global()`, `input()`, or various other keywords and terms that would be very useful. For example, if I want to see how the `contains` expression works, I'd imagine I could go to the index to find a page related to it.

good point, thanks
I have been tripped up myself looking for global() a time or two

7. There is no search function.
The search function for the site doesn't appear to pertain to the documentation unless I'm misunderstanding. If I want to search for the expression `contains` or `global`, there's no way to do so. Even if I search for something very specific such as `RuleSetCreateMainQueue`, I get no useful results.

this is actually designed to be packaged and shipped with your distro. But I agree that it would be good to add a specific search the docs capability (I mostly use google and look for hits on rsyslog.com but I know enough of what I'm looking for to find it)

I think it would also be fantastic if it was possible to get sponsorship for the doc site and eliminate the advertising there (I don't know how much Adiscon gets from those ads, so I don't know how much sponsorship money would be needed to eliminate them)

For a practical example, let's say I see `$Ruleset RSYSLOG_DefaultRuleset` and I want to figure out what exactly that does. Where do I even begin? This *looks* like legacy but if I look over in [Legacy Configuration Directives](https://www.rsyslog.com/doc/master/configuration/index_directives.html), there's no mention of it. There's no mention of it on the [conversion page](https://www.rsyslog.com/doc/master/configuration/converting_to_new_format.html). I see documentation for rulesets over in [basic structure](https://www.rsyslog.com/doc/master/configuration/basic_structure.html) but still no mention of $Ruleset although it *does* mention RSYSLOG_DefaultRuleset. Search doesn't work so I can't do that. It's not listed in the index. At the bottom of the Table of Contents, there's a page named [Multiple Rulesets in rsyslog](https://www.rsyslog.com/doc/master/concepts/multi_ruleset.html) where it lists what it does and what that particular ruleset means but I have to know to look there.

I think the example is on the ridiculous side because I think most people should be able to assume that $Ruleset just changes the current ruleset. But there are parts in the example that should have worked, such as search or index, that failed. `$Ruleset` *is* legacy syntax but there's nowhere it's listed as such. If you apply this to other things you might find in an older configuration like `$RuleSetCreateMainQueue`, each time you have to search through the documentation is a different path in the maze to finally get to where you need to be.

that's a good example, and it perfectly shows the problem we have. rulesets weren't initially in rsyslog. when they were added, the concepts page was written to explain them, but the rest of the documentation wasn't significantly changed (other than to add the 'call' capability and the ability to tie a ruleset to an input). years later, when the page on legacy statements was added, that one was missed.

Rainer, is there a relatively easy way to search the code for legacy type statements to make sure they are all documented on the legacy config page?

David Lang