On Fri, Aug 15, 2014 at 9:47 AM, AW <debian.list.trac...@1024bits.com> wrote:
> On Fri, 15 Aug 2014 09:11:19 +0900
> Joel Rees <joel.r...@gmail.com> wrote:
>
> > When you're grep- or sed-searching a textual log file, you don't care
> > whether all the log entries fit any particular relation or structure
> > definition, and you don't have to think sideways to search on the
> > keywords buried in the text of the actual log entry.
>
> Of course you think sideways...
>
> Step 1. Choose a log to view

Mixed logs. What then?
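Text handles that case with no ceremony at all. Something like this
(the patterns and paths here are invented for the example, of course):

    # one pass over several logs at once; grep prefixes each match
    # with the name of the file it came from
    grep -i 'refused connect' /var/log/syslog /var/log/auth.log

    # rotated, compressed logs too, in the same breath
    zgrep -i 'refused connect' /var/log/auth.log*

No schema, no import step, no prior agreement between the daemons
about columns.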
> Step 2. Decide which time frame you want to view.

Maybe I don't want to limit the search to a particular time frame,
especially when I'm trying to debug a problem that has been slowly
corrupting the logging database for I don't know how long.

> Step 3. Decide which column is important to you.

What columns? Who defined those columns? Why do I have to do a
database design for all the unforeseeable sets of conditions I will
want to log (many of them not errors or even warnings), with all the
information I want to log about them, before I can start coding and
debugging the application, so that I can find out what I want to log?

And, again, what happens when a watchdog daemon can't get a socket
(heaven forbid, a port) to the error-logging daemon and wants to log
that fact? Now we're back to log files, and we might just as well have
stuck with them in the first place.

And if management wants the logs in a database, dump them to a
database after you can scan through them to get an idea of any
specific columns you want to define other than the free-form text
bucket at the end. But keep the logs in files and generate the
database from the files (there's a sketch of that further down).
Otherwise, you're going to be stuck trying to log the fact that you
can't log because your database function is down or not yet up, and
that's going to happen a lot more often than trying to log the fact
that your file system is so corrupt you can't write the logs.

> These are all relational searches.

You can design them, after the fact, as relational searches. And if
your design is good, it will catch a lot of similar searches. But you
still have to write down the queries if you want to use them again,
just as you have to write down the more complex grep queries if you
want to use them again.

> The fact that you decide as a human does
> not make the data non-relational.

Actually, the mathematician in me says yes, it does. No mathematical
model truly captures anything from the real world.

> It should be very clear that log data are
> strongly relational.

Only if there is a large text bucket at the end of most records.

> They conform to all the ideas regarding relational data,
> and you follow relational logic to retrieve the pared-down snippet
> of data you wish to view.

Only after you have had time to go back, analyze a few months or years
of logs, and design a database that fits.

> As far as keywords go, which column in an apache log shows the
> referrer?

You don't know, unless you can see my httpd configuration files, or
unless I happen not to have customized the log formats very much.
(And, yes, I sometimes heavily customize the apache logs to emphasize
stuff that needs to be seen in a specific application while debugging
a specific problem. Then I change the format again when I'm done,
because leaving it that way clutters the logs. And I leave the format
sitting in the configuration files in a comment, in case I need to do
that again. You would have me design a new database and make the logs
discontinuous to do the same thing.)

> Which one shows the date? Aren't these precisely keyword searches?

That depends on whether normalization makes them keywords. (See what I
said above.)

> In fact, awk with grep usage is very similar to a database 'select'
> statement...

Uhm, yeah. The early relational databases were little more than
constrained plain text, with numeric indexes written as ASCII text,
searched with awk, sed, and grep. Then they started adding specialty
search functions, and then they started writing the indexes in binary.
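The parallel is easy to make concrete. Roughly (assuming the stock
Apache "common" log format, where field 7 is the request path and
field 9 is the status code, and an invented file name):

    # more or less: SELECT ip, path FROM access_log WHERE status = 404;
    awk '$9 == 404 { print $1, $7 }' access.log

    # more or less: SELECT * FROM syslog WHERE message LIKE '%error%';
    grep -i 'error' /var/log/syslog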
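And generating the database from the files, when management does want
one, is the sketch I promised above. Rough, and assuming sqlite3 is
installed and the same common log format (the table and file names are
made up for the example):

    # pull three "columns" out of the text log...
    awk 'BEGIN { OFS = "|" } { print $1, $9, $7 }' access.log > hits.psv

    # ...and load them; the text logs remain the source of truth
    sqlite3 weblog.db <<'EOF'
    CREATE TABLE IF NOT EXISTS hits (ip TEXT, status TEXT, path TEXT);
    .separator |
    .import hits.psv hits
    EOF

If the database ever gets into a bad state, you throw it away and
regenerate it from the files. Try that the other way around.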
Databases are a constrained use of text. Binary indexing and binary
blob fields are just optimizations.

> except the user must already know what the column headers are,

What headers? You don't need headers in text logs. You want a date?
You search for a date. If you don't seem to be finding one, look at
the log, see what the dates there look like, and then you know what
the grep command should look like.

On the other hand, if you need to see some log entry where you wrote
out that the number of pink elephant toy queries seems to be greater
than the number of Pooh-Bear towel queries, and the managers think
that is meaningful because it probably means the customers this week
have been from a particular neighborhood, and they want to adjust the
signs and in-store sales accordingly, what columns in your log
database tell you that? And again, what columns do you look at when
the whole system dies before it can get up far enough to write to the
log database?

> as that
> information is not available as it would be in an sql database...

But if you need the tabular data, you can parse the text logs and
write it, as in the sketch above. And your parser can tell you when a
table's definition needs to be changed. And you can add new tables for
management. And your logs are unaffected; they remain usable for your
system purposes and for whatever new things management dreams up.

Normal data has more to do with how you look at the data than with
how you store it.

--
Joel Rees

Be careful where you see conspiracy.
Look first in your own heart.