Yes, you are correct about 4.1 - you cannot run a search over something like
the past 24 hours that then continues in real time going forward (e.g. show a
graph with the last 24 hours of data which then updates in real time).  The
current plan is to add this behavior in 4.2.
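
For reference, that means a historical, time-bounded search such as the one
below and a real-time search are two separate things in 4.1 - a real-time
search only watches data arriving from now on.  (The sourcetype here is
illustrative; earliest/latest are the standard time modifiers.)

    sourcetype=syslog earliest=-24h latest=now | timechart count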

Rob

On Mon, Mar 1, 2010 at 11:00 AM, <da...@lang.hm> wrote:

> Rob,
>  one comment on the real-time search capability of Splunk 4: per recent
> conversations with Splunk, the real-time search is not going to be
> integrated with past data.
>
> in other words, you can say, 'do this search on data that arrives after
> now', but you cannot say 'do this search on data that arrived/arrives after
> 5 min ago'
>
> David Lang
>
>  On Mon, 1 Mar 2010, Rob Das wrote:
>
>  Date: Mon, 1 Mar 2010 10:26:38 -0800
>> From: Rob Das <rob...@gmail.com>
>> To: discuss@lopsa.org
>> Subject: [lopsa-discuss]  splunk alternatives
>>
>>
>> First, please forgive me if this email is overly long.
>>
>> Yes, SEC and Splunk are different in many ways - both useful in the right
>> context.  I have a few questions.  How much data per day are you talking
>> about?  Are you interested in looking at historical data and comparing it
>> against current data?  Do you need any sort of roll-ups or more advanced
>> aggregations/analytics on your data?  You may not now, but will you ever be
>> interested in gathering events that cannot be captured via syslog (extra
>> large, application, or multi-line events, for example)?  Do you want
>> different people to have access to different types of data?  Do you want
>> different roles of users to see different views?  Do you foresee that the
>> data volumes will grow over time?  Are your 20 users really concurrent, or
>> will they be searching randomly throughout the day?
>>
>> First of all, the new version of Splunk (version 4.1), which will be out
>> very soon, includes real-time support.  What this means is that searches
>> can optionally be executed at data-input time, as the data is acquired.
>> If events match a search as they come in, alerts can be triggered.
>> Furthermore, Splunk's dashboards, graphs, and tables will update in real
>> time as the data comes in, effectively providing a "heartbeat".
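>>
>> As a rough sketch, saving such a search as an alert could look like this
>> in savedsearches.conf (the stanza name, search, and thresholds are
>> illustrative, and the exact scheduling/real-time keys vary by version):
>>
>>   [critical-errors]
>>   search = index=main log_level=ERROR
>>   dispatch.earliest_time = rt
>>   enableSched = 1
>>   counttype = number of events
>>   relation = greater than
>>   quantity = 0
>>   actions = email
>>   action.email.to = ops@example.com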
>>
>> If you need to "find the needle in the haystack", you can't find a better
>> tool.
>>
>> Simple stuff like "tell me the top ten logins by IP address over the last
>> 24 hours or month" can't be done with SEC without writing code.  Splunk
>> handles this via its GUI, and graphs like this can be placed on dashboards
>> which update in real time.  Splunk can easily filter out data that you are
>> not interested in, or keep it for as long as you like - your choice.
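>>
>> For example, a search along these lines would drive such a dashboard
>> panel ("top" is a standard search command; the sourcetype and field name
>> are illustrative and depend on how your logins are logged):
>>
>>   sourcetype=sshd "Accepted password" earliest=-24h
>>   | top limit=10 src_ip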
>>
>> Splunk provides role-based access controls that can optionally filter data
>> at search time depending on who is allowed to see what.
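>>
>> For instance, a role's search-time filter can be set in authorize.conf
>> (the role name and filter here are made up):
>>
>>   [role_webteam]
>>   srchFilter = sourcetype=apache*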
>>
>> One of the most important concepts is that Splunk doesn't require or
>> impose any structure on the incoming data.  You can apply structure at
>> search time, which means that as data changes in your data center (because
>> new versions of software/hardware are installed, etc.), you will not need
>> to redo any regular expressions.
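>>
>> As an illustration, structure can be applied on the fly at search time
>> (the pattern and field name here are made up; "rex" and "stats" are
>> standard search commands):
>>
>>   sourcetype=myapp | rex "user=(?<user>\w+)" | stats count by user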
>>
>> Depending on daily data volumes, Splunk will run very well on
>> commodity-type hardware.  As your business grows, it can scale to handle
>> it (to terabytes/day).  If your daily volume doesn't exceed 500 MB/day,
>> you can use the free version of the software.
>>
>> SEC is a low-level tool written in Perl that requires you to create
>> regular expressions that match patterns in your data.  It also requires
>> quite a bit of scripting to make it work in many environments.  As things
>> change, you will need to update your regular expressions or things will
>> break.
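>>
>> For a sense of what that looks like, here is a minimal SEC rule sketch
>> that fires after five failed SSH logins within a minute (the log pattern
>> and notification script are illustrative):
>>
>>   type=SingleWithThreshold
>>   ptype=RegExp
>>   pattern=sshd\[\d+\]: Failed password for (\S+)
>>   desc=Repeated SSH login failures for $1
>>   action=shellcmd /usr/local/bin/notify.sh "ssh failures for $1"
>>   window=60
>>   thresh=5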
>>
>> SEC implements a state machine that operates over incoming data.  There
>> are many cool things you can do with it, but as David L says, it keeps all
>> of its state in memory.  Splunk does not currently implement a state
>> machine in the same way as SEC.  However, Splunk's search language, which
>> is extremely robust, can handle many of the same use cases - especially
>> with the introduction of real-time searching in version 4.1.
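>>
>> For instance, a correlation that SEC would track as rule state can often
>> be written as a search ("transaction" is a standard command; the
>> sourcetype and field name are illustrative):
>>
>>   sourcetype=sshd "Failed password"
>>   | transaction src_ip maxspan=5m
>>   | search eventcount>5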
>>
>> I have not taken a look at logsurfer, so I can't comment on it.  I'll
>> check it out.
>>
>> I am more than happy to field questions directly if you wish.
>>
>> Rob Das
>> r...@splunk.com
>> Co-founder / Chief Architect
>> Splunk, Inc.
>>
>>  Paul DiSciascio wrote:
>>>
>>>> I'm looking for a good way to share log files on a centralized syslog
>>>> server with about 10-20 people/developers who are familiar with the log
>>>> formats but not very much with unix tools.  They want an easy way to
>>>> dig through the logs and filter out junk they're not interested in, but
>>>> still have near-real-time visibility.  Obviously, splunk can do this,
>>>> but it's pricey, and their documentation seems to indicate that 20
>>>> concurrent users would be a lot to ask for without a lot of hardware.
>>>> I really only need an interface capable of some rudimentary filtering,
>>>> and if possible the ability to save those searches or filters.  Does
>>>> anyone have any suggestions short of writing this myself?
>>>>
>>> You might be interested in SEC (simple event correlator) for this
>>> purpose.  But if you just want a presentation interface, logsurfer might
>>> be more what you are looking for.  SEC is much more like splunk, while
>>> logsurfer is more of a realtime filtering monitor.
>>>
>>
>> I'm not sure what you have seen of splunk, but it and SEC have very little
>> in common.
>>
>> splunk allows for arbitrary search queries against your past log data (and
>> indexes it like crazy to make the search fairly efficient)
>>
>> SEC watches for patterns (or combinations of patterns) to appear in the
>> logs and generates alerts.
>>
>> splunk can simulate SEC's functionality by doing repeated queries against
>> the logs, but that's fairly inefficient.
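>>
>> as a sketch, that approach amounts to a cron job re-running a search every
>> few minutes via the CLI, where each run re-scans its whole window from
>> scratch (the query and schedule are illustrative):
>>
>>   */5 * * * * /opt/splunk/bin/splunk search 'error earliest=-5m'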
>>
>>
>> the answer to the original question depends a lot on the amount of data
>> that you are working with.
>>
>> If you can fit it all in ram on a machine, then there are a lot of things
>> that you can use to query it.  The problem comes when you can no longer
>> fit it in ram and have to go to disk; at that point you need an application
>> that does a lot of indexing (and/or spreads the load across multiple
>> machines, depending on how much data you have and how fast you want your
>> answers)
>>
>> you say that your users are not familiar with unix tools; are they
>> familiar with using SQL for queries?
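>>
>> for example, if the logs were fed into a database table, this is the kind
>> of query they would write (the table and column names are made up):
>>
>>   SELECT reported_time, host, message
>>   FROM syslog
>>   WHERE message LIKE '%timeout%'
>>     AND reported_time > NOW() - INTERVAL '1 day';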
>>
>> David Lang
>>
>


-- 
Rob Das
Splunk
_______________________________________________
Discuss mailing list
Discuss@lopsa.org
http://lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
 http://lopsa.org/
