Anton,

with this mail, I am still addressing just one question, I think the major one: why backward compatibility. By addressing only this, I hope to save some time. I would like to get some feedback from the WG on whether the proposed format (with non-XML brackets) is generally acceptable. I think this is one major question, because if the answer were no, all other work done would be lost.
So if someone does not like this format, please speak up now. I would really appreciate this. I'll wait until Monday before I do any major edit. If I get no objection by then, I will assume the format is generally accepted and we "just" need to look at the "minor" issues of the draft.

Now to backwards compatibility: Of course, we are creating a new standard for new syslog implementations. However, I am *very* sure that existing RFC 3164 compliant and NON-compliant implementations will be out there for many years to come. Obviously, all real-world syslogds will need to support those older clients, or they will not receive market acceptance. So my goal is to specify a new standard but leave a window open so that a newer syslogd can still, in a standard way, support the older protocols. In fact, I think the older emitters can gain quite a lot from this, as the first relay is permitted to change this.

However, a cleaner solution may be to assign new IP ports to syslog-protocol, saying that it is actually *not* syslog. But I fear this, too, would cause some loss of acceptance. And as I do not see real evil in how it is specified, I would like to stick with this approach ... until a good argument against it comes up.

(To address a specific issue: the charset will be specified via a low-footprint -international structured data element; this is why I left it open in the traditional sense. And, yes, some wording needs to be changed, too.)

I appreciate all comments.

Rainer

> -----Original Message-----
> From: Anton Okmianski [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, January 21, 2004 8:48 PM
> To: Rainer Gerhards; [EMAIL PROTECTED]
> Subject: RE: syslog-protocol-01 posted & comments
>
> Rainer:
>
> Good draft! I have a laundry list of suggestions/questions below. Feel free to respond to them in different emails at your convenience. The issues are in more or less section order.
>
> 1. First, I have to say I don't understand the backward compatibility aspects.
> An RFC 3164 compliant syslog collector or relay is supposed to accept ANY message, as stated in section 4.2 of that RFC. Can you explain in practical terms what backward compatibility we are trying to achieve? I think it is an important question we need to settle, as this affects the whole draft.
>
> I assume the new draft RFC will provide hard requirements for syslog, in contrast to the informational nature of RFC 3164. As such, this RFC cannot be fully backward compatible with RFC 3164, which allowed free-form messages. If we make various selected RFC 3164 aspects required in the draft (like the old timestamp), we are essentially putting a stamp of approval on a previously informational recommendation and making it a requirement instead of obsoleting it. How are we ever going to obsolete the old format this way?
>
> 2. Section 2 refers to "machines" and "devices", which is misleading. I think we need to talk about "applications". After all, a sender and a collector can both be on the same machine.
>
> 3. Section 4. HOSTNAME. I think "." and "-" characters are allowed in an FQDN (except no trailing "-") per RFC 1123. Also, the limit of 64 characters is inappropriate. It should be 255 per the same RFC.
>
> 4. Section 4. The time-secfrac field should be specified as 1*4DIGIT. This is the only number of digits that would be allowed given the 32 character limit you specified for the TIMESTAMP field. This just makes it more explicit and actually removes the need to specify the length of the TIMESTAMP field.
>
> 5. Section 4. MSG. I think the character set specified here is not consistent with specifying that UTF-8 is supported. A UTF-8 character can consist of multiple bytes, and each byte can be any 8-bit value. Also, you refer to "PRINTABLE" in the comment, which is not defined anywhere.
>
> 6. Section 4.1. PRI field.
> First, I support Albert's proposal of a new format which increases the number of facilities and provides a format that is easier to handle. I just don't know why stop at 999 facilities and not allow say 2bln (signed 32-bit). An alternative (less optimal for performance) is defining a structured content parameter "facility" or "channel" and assuming new syslog collectors/relays will use it in configurations.
>
> 7. Section 4.1. PRI field. I think naming facilities 16-23 "local" is misleading. In fact, remote logging uses those almost exclusively. So, how are they local? I'd call them "custom facility 1,2,..." or something like that.
>
> 8. Section 4.1. Note 1. I think here and in many other places in this draft RFC we should avoid using language such as "...have been seen...", etc. This is not intended to be an informational RFC like 3164. I think it would be more appropriate to be talking about what SHOULD or MUST happen instead of what has been seen to happen.
>
> 9. Section 4.2. Here and thereafter you use the term "visible (printing) characters". Although you clarify everywhere the specific character range, I think this term is imprecise. A Chinese character encoded in UTF-8 will be visible if you have the right viewer and not visible if you don't. Maybe you should refer to "non-control characters" instead.
>
> 10. Section 4.2 Last 3 sentences. Again you mention "has usually been seen". Do we actually want to recommend the use of one IP or the other or at least the consistent use of one?
>
> 11. Section 4.2.1. In the note, you mention "single syntax". In fact, use of second fractions is optional. Yes, technically it is one ABNF syntax. But then so is the RFC 3339 which you claim provides "multiple syntaxes".
>
> 12. Section 4.2.1. My feeling is we should not support the old timestamp format in this RFC.
> If some collector wants to support it, it can be both RFC 3164 and RFC.new compatible, right? Why give more prominence to the old legacy timestamp, which we know is bad?
>
> 13. Section 4.2.1. The bullet point talking about time-secfrac should mention that performance considerations are another condition for the recommendation, not just availability of clock accuracy.
>
> 14. Section 4.2.2. Again, I am not convinced that supporting the legacy of just the hostname instead of the FQDN is a good enough reason. We may still want just the hostname option for local logging, though.
>
> 15. Section 4.2.2. Do we want to make a recommendation as to which is preferred: hostname-only or IP?
>
> 16. Section 4.2.2. Where we mention IPv6 RFC 2373, we should mention specifically the section on "Textual representation" of that RFC - section 2.
>
> 17. Section 4.2.3. We never say what the purpose of the TAG field is, nor give any guidance on what should be put there. This field of the syslog specification is, to me, very strange. I understand the legacy, but see my concerns about backward compatibility. The fact that no spaces are allowed is not optimal. Recommending a trailing ":" can only mislead casual observers into believing it is used as a separator character, while it is not. Then, what's the purpose of this recommendation?
>
> 18. Section 4.2.3. We never explain what the difference is between the static and dynamic portions of TAG. The last sentence talks about the use of a "consistent tag value", but I don't understand what it means. Consistent between what and what? This needs clarification.
>
> 19. Section 4.3. The phrase "traditionally and most frequently used" should be replaced with SHOULD, MUST or RECOMMENDED, I think.
>
> 20. Section 4.3. The last two paragraphs are talking about some "code sets". I think if we are talking about UTF-8, we are talking about *one* code set -- UNICODE -- and one encoding -- UTF-8.
> I thought UNICODE and UTF-8 obsoleted all that code set business, or am I wrong?
>
> 21. Section 4.4. TRAILER. This says that some receivers may require a trailer. Aren't we supposed to specify here what a compatible receiver is allowed to require and what not? Why are we allowing this? I think nobody should require a trailer and we should drop this from the format.
>
> 22. Section 4.5. Sentence "..locally defined facility (local4)...". Again, I am confused by the term "locally-defined".
>
> 23. Section 5 & beyond. Why is there a need to specify structured data *anywhere* within the message? I thought we would designate a special field like TAG for the structured data. This way we won't need a special sequence to identify it. Also, I think allowing it everywhere gives too much unnecessary freedom. It makes it harder to evolve the protocol later.
>
> 24. Section 5.1. Like with the MSG, I think the character set of a parameter value is any non-control character, with some characters being escaped. We are supporting UTF-8 within the parameter values, right?
>
> 25. Section 5.1. I think the fact that each structured data item which has a different IANA dictionary needs to be in a different block is somewhat cumbersome and limiting. For example, if I want to put the msgid parameter in all of my messages regardless of use of fragmentation, then when I do use fragmentation, do I have to put this parameter twice?
>
> 26. Section 5.1. I think the dictionary identifier can be made into just another key-value parameter. This would be more consistent with providing a general mechanism for key-value pairs and the idea of using [] brackets to group related tags.
>
> 27. Section 5.1. Can the SD-ID be optional for experimental parameters? This way I don't have to put "x-cisco" in front of all tags. I don't see any value in this. We can just assume experimental tags.
> If a given vendor needs to identify its tags, it can do this with its own parameters like "vendor", "product", "version", or whatever else the vendor wants. Vendor tags are for vendor use only, right? A general syslog collector won't use them anyway, correct?
>
> 28. Section 5.1. I would also consider the following approach, which eliminates dictionaries. If we only need a parameter namespace so we can avoid conflicts between current & future syslog RFCs and vendor parameters, then we can just define some prefix for current and future syslog protocol parameters. For example "sys.msgno", "sys.fragcount", etc. Then, we will control the tags in this namespace using IANA or RFCs. If some vendor wants to re-use the "sys.msgno" tag because the definition of the tag suits them for a different use case, then they don't need to duplicate it.
>
> 29. Section 5.1. I think we should require a space character after each structured data block closing bracket. This will make it more readable while eliminating the ambiguity as to whether or not the space is part of the message. Even your examples will look nicer. I think we can make the space optional between two structured blocks of data.
>
> 30. Section 6. Paragraph 3 calls for not using fragmentation when the message can fit in a single message. I think, in general, we assume the use of fragmentation *only* for splitting long messages. We had some discussion on this a long time ago, but I don't remember the conclusion. The other use case for fragmentation (or, better named, multi-part messages in this case) is when the message is inherently multi-line. For example, a stack trace:
>
> LockConflictException
>     at com.cisco.csrc.db.LockTable.obtainUpdateLock(LockTable.java:199)
>     at com.cisco.csrc.db.indexes.OidIndex.obtainUpdateLock(OidIndex.java:448)
>     at com.cisco.csrc.db.PObjectImpl.obtainUpdateLock(PObjectImpl.java:1184)
>
> How can I send such a message with the current draft?
> I would likely have to come up with some new parameters. I think this needs to be standardized. The distinction here is that the original message is not a single line. Rather, the original message is a multi-part message with each part being a separate line.
>
> To handle the above, we need to differentiate the case when the message does not need to be assembled.
>
> 31. Section 6. Again, we had a discussion on this before... It would be useful if message parts could be sent before the total length of the message is known. We have one message in our system which is about 2000 lines long. It dumps all kinds of properties on crash. It would be nice if I could send parts of this message without knowing the total message part count. Otherwise, I would need to assemble the whole message before sending it. This can be problematic if I am crashing due to an out of memory condition, for example. To address this, we simply need to state that the recount parameter is optional in all fragments except for the last message. This will designate the end of the fragmented message.
>
> 32. Section 6.2. The above suggestions would mean that you can't sign the whole message, only parts. You suggest that signing all parts is not as safe as signing the whole message. Why? We know exactly the message to which each part belongs and this information is signed, right?
>
> 33. General. What do we do with non-conforming messages? Do we want to recommend that collectors/relay agents fire some diagnostic message which embeds the offending message?
>
> 34. Do we want to introduce more standard parameters? Good candidates are "facility" and "severity". Yes, this will duplicate information, but we can make them optional.
> At least this will overcome the problem of syslog servers only storing the message and not the PRI field, which leads to not knowing what facility or severity the message had if you store multiple facilities/severities in the same log file.
>
> I did not review section 7 and beyond yet. It seems a lot of it is identical to the old RFC.
>
> Thanks,
> Anton.
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Rainer Gerhards
> > Sent: Wednesday, January 21, 2004 3:53 AM
> > To: [EMAIL PROTECTED]
> > Subject: syslog-protocol-01 posted & comments
> >
> > Hi WG,
> >
> > as you may have seen, the draft editor has posted protocol-01:
> >
> > http://www.ietf.org/internet-drafts/draft-ietf-syslog-protocol-01.txt
> >
> > First things first: this was a "quick" edit (while not as quick as I hoped... ;)). My main objective was to get out some text as quickly as possible. There are probably some typos and some other minor inconsistencies. Also, some of the descriptions may not be as good as they should be. As the format issue was quite controversial, I tried to save a little bit of work by providing ONE POSSIBLE text to handle it. But further discussion may go in a different direction. So I tried to make it as understandable as possible while not putting total finishing efforts into it. Once we have decided the final direction, I will either revamp totally or polish the current text.
> >
> > To the content:
> >
> > #1
> > As said on the list, I used Anton's non-XML proposal. Weighing all the arguments received, I really think we do not actually need XML in syslog, even though syslog is no longer just for human review but also for automated processes (this in answer to David's question).
> >
> > After finishing the text, I am far more convinced that the simple tagging approach is not only sufficient for transport, but is also a good solution for syslog in general. In the long term, it can also be used to define payload dictionaries, which may be very useful (should we manage to do this;)).
> >
> > Regarding integrating this into XML-based systems, I think a mapping between what I have now described and XML is fairly easy, at least as long as you assume that the message is originally generated by a syslog device and then brought over to some XML system.
> >
> > #2
> > I would also like to draw your attention to the fragmentation that I have now described. There are some specific implications in regard to syslog-sign. I would appreciate it if those deeply involved in -sign could cross-check that this format could actually do the job for -sign.
> >
> > #3
> > The section on fragmentation is currently missing a recommendation to use a reliable transport. This will be moved in once we have a general consensus.
> >
> > I appreciate comments on these points as well as on -protocol-01 in general. I am prepared to do a quick re-edit.
> >
> > Thanks,
> > Rainer