On 8/15/14, Joel Rees <joel.r...@gmail.com> wrote:
> On Fri, Aug 15, 2014 at 9:47 AM, AW <debian.list.trac...@1024bits.com>
> wrote:
>> On Fri, 15 Aug 2014 09:11:19 +0900
>> Joel Rees <joel.r...@gmail.com> wrote:
>>
>> > When you're grep- or sed-searching a textual log file, you don't care
>> > whether all the log entries fit any particular relation or structure
>> > definition, and you don't have to think sideways to search on the
>> > keywords buried in the text of the actual log entry.
>>
>> Of course you think sideways...
>> Step 1. Choose a log to view
>
> Mixed logs. What then?
I think we are meant to create an SQL view then, so we cmd line
administrator types can still "grep" the whole sql db with a single
command.

>> Step 2. Decide which time frame you want to view.
>
> Maybe I don't want to limit to a particular time frame, especially
> when I'm trying to debug a problem which has been slowly corrupting
> the logging database for I don't know how long.

You're right, not possible with SQL.

>> Step 3. Decide which column is important to you.
>
> What columns? Who defined those columns? Why do I have to do a
> database design on all the unforeseeable sets of conditions that I
> will want to log, many not errors or even warnings, with all the
> information I want to log about them, before I can start coding and
> debugging the application so that I can find out what I want to log?

Perhaps "error code" and "message" would cover all cases? Don't know
that's possible with SQL. But anyway, I admit, that might still be
more complicated than is worth poking a stick at, when grep works just
fine...

> And, again, what happens when a watchdog daemon can't get a socket
> (heaven forbid a port) to the error logging daemon and wants to log
> that fact? Now we're back to log files and we might just as well have
> stuck with them in the first place.

You should probably have failover databases, with a watchdog system
monitoring th... oh, that's what you're asking?

> And if management wants them in a database, dump them to a database
> after you can scan through them to get an idea of any specific columns
> you want to define other than the free-form text bucket at the end.
> But keep the logs in files and generate the database from the files,
> otherwise, you're going to be stuck trying to log the fact that you
> can't log because your database function is down or not yet up, and
> that's going to happen a lot more often than trying to log the fact
> that your file system is so corrupt you can't write the logs.

Dunno about that. Perhaps a NoSQL database?

>> These are all relational searches.
>
> You can design them, after the fact, as relational searches. And if
> your design is good, it will catch a lot of similar searches. But you
> still have to write down the queries if you want to use them again,
> just like you have to write down the more complex grep queries if you
> want to use them again.

So we're up to 1-1, text file and sql?

>> The fact that you decide as a human does
>> not make the data non-relational.
>
> Actually, the mathematician in me says, yes it does. No
> mathematical model truly captures anything from the
> real world.

Ahh, the truth, the absolute truth. Now we're on solid ground,
unshakable ground :)

>> It should be very clear that log data are
>> strongly relational.

Especially, if I might add, just briefly add though, since I don't
want to take up too much time here just commenting in any sort of
unnecessary sort of way, so I do hope you understand. Whoops, got lost
there. Let's try again. Especially, if I might add, those binary blobs
that systemd caters for.

> Only if there is a large text bucket at the end of most records.

We can't do a time-sequenced third relation joining errors and words
in the error? (Time-sequenced so we can still reconstruct the error
message, of course.)

>> They conform to all the ideas regarding relational data,

Consistent data types, consistent field widths, limited set of data
types, non-free-form textual data, no repeating fields (alright, this
last one probably applies)?
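Something like this is what I'm picturing - a sketch only, with the
table, column and view names all invented for this email rather than
taken from any actual logging package:

  -- Hypothetical log tables: typed, consistent columns for the
  -- relational bits, plus the free-form text bucket at the end.
  CREATE TABLE syslog (
      logged_at TIMESTAMP   NOT NULL,  -- consistent data type
      host      VARCHAR(64) NOT NULL,  -- consistent field width
      severity  SMALLINT    NOT NULL,  -- limited set of values (0-7)
      message   TEXT                   -- the free-form bucket
  );

  CREATE TABLE apache_log (
      logged_at TIMESTAMP   NOT NULL,
      vhost     VARCHAR(64) NOT NULL,
      status    SMALLINT    NOT NULL,
      request   TEXT
  );

  -- The "mixed logs" view from the top of this mail: one place to
  -- "grep" everything with a single query.
  CREATE VIEW all_logs AS
      SELECT logged_at, host, severity, message FROM syslog
      UNION ALL
      SELECT logged_at, vhost, status, request FROM apache_log;

Granted, picking those widths and severity codes up front is exactly
the design-before-you-know problem you keep pointing at.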
>> and you follow relational logic to retrieve the pared
>> down snippet of data you wish to view.

What, like the message reconstruction technique to de-normalise a
fully normalized message?

> Only after you have had time to go back, analyze a few months or years
> of logs, and design a database that fits.

And then new software, new systems come along, and the efficient
non-free-form-text-bucket data structure changes yet again?

>> As far as keywords go, which column in an apache log shows the
>> referrer?
>
> You don't know unless you can see my httpd configuration files, unless
> I happen not to have customized the error logs very much. (And, yes, I
> sometimes heavily customize the apache logs to emphasize stuff that
> needs to be seen in a specific application while debugging a specific

Perfect use for normalized tables and reports! You could even run them
through colorize (I use grep) to add color to your text reports!

> problem. Then I change the format again when I'm done,

Excellent, a perfect example for views: just create a view, and when
you're done with it, tear it down. Simple.

> because leaving
> it that way clutters the logs.

Even more reasons for sql!

> And I leave the format sitting in the
> configuration files in a comment, in case I need to do that again.

Well, you might prefer stored procedures rather than views then. Or
not; I don't think views take any significant overhead, at least if
you drop any indexes on your view... you'd have to consult someone
more sql knowledgeable than I am to be sure, sorry.

> You would have me design a new database and make the logs
> discontinuous to do the same thing.

For a *much* more ordered (and frankly, searchable) result!

>> Which one shows the date? Aren't these precisely keyword searches?
>
> Depends on whether normalization makes them keywords.
> (See what I said above.)

But that's the point: at least normalizing into columns - is that even
a normal form? Maybe they need to call that "zero normal form"
(although that would imply no normality; I mean primary, that's it,
primary normal form: separate each unique piece of information/data
into its own column).

>> In fact, awk with grep usage is very similar to a database 'select'
>> statement...
>
> Uhm, yeah, the early relational databases were little more than
> constrained plaintext, numeric indexes written as ASCII text, and
> searched with awk, sed, and grep. Then they started adding specialty
> search functions and then they started writing the indexes in binary.
>
> Databases are a constrained use of text. Binary indexing and binary
> blob fields are just optimizations.

Perfect! We're all on the same page here, that's for sure!

>> except the user must already know what the column headers are,
>
> What headers?
>
> You don't need headers in text logs. You want a date? You search for a
> date. Don't seem to be successful finding a date, look at the log and
> you'll see the dates that are there, and then you know what the grep
> command should look like.

True, you don't grep for a "Date" column, you simply grep for
something resembling a date of interest. And I think modern sql dbs
provide for "regex" style sql searching anyway, since that's efficient
for the user.
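For instance, PostgreSQL's ~ operator does a POSIX regex match (MySQL
spells it REGEXP), so against the made-up all_logs view from earlier a
grep-ish search would look something like:

  -- Roughly "grep -E 'error|fail'" across all the logs at once:
  SELECT logged_at, host, message
  FROM   all_logs
  WHERE  message ~ 'error|fail'
  ORDER  BY logged_at;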
> On the other hand, if you need to see some log where you wrote out
> that the number of pink elephant toy queries seems to be greater than
> the number of Pooh-Bear towel queries, and the managers think that is
> meaningful because it probably means the customers this week have been
> from a particular neighborhood, and they want to adjust the signs and
> in-store sales accordingly, what columns in your log database tell you
> that?

That would probably be information from the end-user (customer or
employee) facing system, and in its own tables or database.

> And again, what columns do you look at when the whole system dies
> before it can get up far enough to write to the log database?

I think systemd has a solution for this one.

>> as that information is
>> not available as it would be in an sql database...
>
> But if you need the tabular data, you can parse the text logs and
> write it. And your parser can tell you when a table's definition needs
> to be changed. And you can add new tables for management. And your
> logs are unaffected, remain useable for your system purposes and for
> new things management dreams up.
>
> Normal data has more to do with how you look at the data than it has
> to do with how you store it.

Do you mean "normalized data has..."?

(P.S. below takes a stab at your "parse the text logs and write it"
point.)

Cheers
Zenaan
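P.S. Re "parse the text logs and write it": once something like awk
has chopped a log into tab-separated fields, loading it really is a
one-liner - in PostgreSQL terms at least, and again using the made-up
syslog table from earlier in this mail:

  -- Bulk-load pre-parsed, tab-separated log lines into the
  -- hypothetical syslog table (PostgreSQL's COPY; the file path is
  -- invented for the example):
  COPY syslog (logged_at, host, severity, message)
  FROM '/var/tmp/parsed-syslog.tsv';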