Re: netflow in the core used for surveillance
Hi, all. Re: last week's thread on the Vice article - I can only speak for Kentik, and *we* don't resell or give 3rd party access to NetFlow data from our hundreds of customers. And never have.

But there is definitely interest out there. We do get approached about it periodically and always say no - mostly by commercial vendors and not (at least directly) by governmental bodies. Of course, our *customers* could in theory share their data via API key, or by using our outbound streaming firehose. But I've never talked to a customer who wanted to share their flow data with a 3rd party - usually quite the opposite.

The closest thing to this that our customers do ask about is aggregate community views, which people could contribute to in order to help themselves and the community. While we don't do this now, if and when we do it:

1) won't be with raw data,
2) will be opt-in only,
3) will be designed with customers and have open methodology, and
4) will likely be with synthetic test, BGP, device metrics, and other non-flow data to start.

Thanks,
Avi
LLDP via SNMP
Have had the question come up a few times, so I wanted to poll the community to see... For those who are monitoring LLDP, how have you found the SNMP MIB support for it on Juniper, Cisco, Brocade, Arista, and others? Wondering if you've needed to resort to CLI scraping or APIs to get the data?

Thanks,
Avi Freedman
CEO, Kentik
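For anyone who wants to poke at this themselves, below is a minimal sketch of walking the standard LLDP-MIB remote-systems table with pysnmp. The column OIDs (lldpRemSysName, lldpRemPortId under 1.0.8802.1.1.2.1.4.1) are as I recall them from the IEEE MIB, and the hostname/community string are placeholders - whether a given Juniper/Cisco/Brocade/Arista box actually populates the table is exactly the question being asked.

```python
# A minimal sketch (not production code) of walking the LLDP-MIB remote
# systems table with pysnmp. The column OIDs are assumed from the IEEE
# LLDP-MIB; verify them against your platform's MIB support.
from pysnmp.hlapi import (
    SnmpEngine, CommunityData, UdpTransportTarget, ContextData,
    ObjectType, ObjectIdentity, nextCmd,
)

LLDP_REM_SYS_NAME = '1.0.8802.1.1.2.1.4.1.1.9'   # lldpRemSysName (assumed column OID)
LLDP_REM_PORT_ID = '1.0.8802.1.1.2.1.4.1.1.7'    # lldpRemPortId (assumed column OID)

def walk(host, community, base_oid):
    """Yield (oid, value) pairs for every instance under base_oid."""
    for err_ind, err_stat, _, var_binds in nextCmd(
            SnmpEngine(),
            CommunityData(community, mpModel=1),                 # SNMPv2c
            UdpTransportTarget((host, 161), timeout=2, retries=1),
            ContextData(),
            ObjectType(ObjectIdentity(base_oid)),
            lexicographicMode=False):                            # stop at end of subtree
        if err_ind or err_stat:
            raise RuntimeError(str(err_ind) if err_ind else err_stat.prettyPrint())
        for oid, val in var_binds:
            yield str(oid), val.prettyPrint()

if __name__ == '__main__':
    # 'sw1.example.net' / 'public' are placeholders for a real device.
    for oid, neighbor in walk('sw1.example.net', 'public', LLDP_REM_SYS_NAME):
        print(oid, '->', neighbor)
```

If a walk like this comes back empty on a box that clearly has LLDP neighbors, that's typically when people end up falling back to CLI scraping or the vendor APIs.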
Re: Service Provider NetFlow Collectors
We do have a minimum for commercial service that's more like $1500/mo, but we are coming out with a free tier in Q1 with lower retention (among other deltas, but including full slice-and-dice flow analytics + BGP that it sounded like Erik might be looking for).

Feel free to ping me if anyone would like to help us test the free tier in January.

Thanks,
Avi Freedman
CEO, Kentik

> Doesn't Kentik cost like $2000 a month minimum?
>
> On Mon, Dec 31, 2018 at 11:57 AM Matthew Crocker wrote:
>
> > +1 Kentik as well, DDoS, RTBH, Netflow. Cloud based so I don't have to
> > worry about it.
> >
> > On 12/31/18, 11:37 AM, "NANOG on behalf of Bryan Holloway"
> > <nanog-boun...@nanog.org on behalf of br...@shout.net> wrote:
> >
> > > +1 Kentik ...
> > >
> > > We've been using their DDoS/RTBH mitigation with good success.
> > >
> > > On 12/31/18 3:52 AM, Eric Lindsjö wrote:
> > > > Hi,
> > > >
> > > > We use kentik and we're very happy. Works great, tons of new features
> > > > coming along all the time. Going to start looking into ddos detection
> > > > and mitigation soon.
> > > >
> > > > Would recommend.
> > > >
> > > > Kind regards,
> > > > Eric Lindsjö
> > > >
> > > > On 12/31/2018 04:29 AM, Erik Sundberg wrote:
> > > > > Hi Nanog….
> > > > >
> > > > > We are looking at replacing our Netflow collector. I am wondering what
> > > > > other service providers are using to collect netflow data off their
> > > > > Core and Edge Routers. Pros/Cons… What to watch out for, any info would
> > > > > help.
> > > > >
> > > > > We are mainly looking to analyze the netflow data. Bonus if it does
> > > > > ddos detection and mitigation.
> > > > >
> > > > > We are looking at:
> > > > >
> > > > > ManageEngine Netflow Analyzer
> > > > > PRTG
> > > > > Plixer – Scrutinizer
> > > > > PeakFlow
> > > > > Kentik
> > > > > Solarwinds NTA
> > > > >
> > > > > Thanks in advance…
> > > > >
> > > > > Erik
Re: vFlow :: IPFIX, sFlow and Netflow collector
lack of precision might not be an issue (they pretty much all use probabilistic data structures like HLLs to do count and topN). And MemSQL can operate in that mode as well, though I don't think that was how Mehrdad was showing it working with vFlow.

But again, you can't ever go 'back in time' for an ad hoc query with them, so it's probably more interesting as an augment and offloader for most uses where you'd normally think of storing many billions or a few trillion flows.

Happy flow-ing...

Avi Freedman
CEO, Kentik
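For anyone who hasn't played with the probabilistic approach: a count-min sketch plus a small heap is one common way to get approximate top-N talkers in bounded memory, while HLLs cover the distinct-count side. The sketch below is a generic illustration of the technique, not vFlow's or MemSQL's implementation.

```python
# Generic illustration: count-min sketch + heap for approximate top-N talkers.
import hashlib
import heapq

class CountMinSketch:
    def __init__(self, width=2048, depth=4):
        self.width, self.depth = width, depth
        self.rows = [[0] * width for _ in range(depth)]

    def _buckets(self, key):
        for i in range(self.depth):
            digest = hashlib.blake2b(key.encode(), digest_size=8, salt=bytes([i])).digest()
            yield i, int.from_bytes(digest, 'big') % self.width

    def add(self, key, count=1):
        for row, col in self._buckets(key):
            self.rows[row][col] += count

    def estimate(self, key):
        # Never under-counts; taking the min across rows bounds the over-count.
        return min(self.rows[row][col] for row, col in self._buckets(key))

def top_n(flow_records, n=10):
    """flow_records: iterable of (src_ip, byte_count); returns approximate top-N by bytes."""
    cms, heap, members = CountMinSketch(), [], set()
    for src, nbytes in flow_records:
        cms.add(src, nbytes)
        est = cms.estimate(src)
        if src in members:
            heap = [(est, k) if k == src else (v, k) for v, k in heap]
            heapq.heapify(heap)
        elif len(heap) < n:
            heapq.heappush(heap, (est, src))
            members.add(src)
        elif est > heap[0][0]:
            _, evicted = heapq.heapreplace(heap, (est, src))
            members.discard(evicted)
            members.add(src)
    return sorted(heap, reverse=True)

if __name__ == '__main__':
    flows = [('10.0.0.1', 1500), ('10.0.0.2', 64), ('10.0.0.1', 9000), ('10.0.0.3', 700)]
    print(top_n(flows, n=2))
```

The estimates only ever over-count, and width/depth trade memory for error, which is why the "lack of precision" is usually tolerable for top talkers even when it wouldn't be for billing.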
Re: vFlow :: IPFIX, sFlow and Netflow collector
> "NANOG" wrote on 05/16/2017 03:34:39 PM:
>
> Nice analysis of the current state of the art.

Thanks; of DIY store-all approaches, at least :) Commercial options are a different thread, and I'm conflicted so shouldn't try to summarize those...

> > And then, the biggest flow store I know of (1 or 2 carriers may want to argue
> > but I haven't seen theirs) is at DISA for DoD - a decade of un-sampled flow
> > coming from SiLK. All stored in hourly un-indexed files, essentially nothing
> > but CLI to access,
>
> FlowViewer provides a web GUI for invoking SiLK analysis tools. Provides
> textual and graphical analysis with the ability to track filtered subsets
> over time. Screenshots, etc.:
>
> https://sourceforge.net/projects/flowviewer/

Sorry, forgot about FlowViewer - I've never seen it in use and asked at a bunch of Flocons - but it looks updated more recently than I had thought.

On a related topic, I'd love to see NANOGers and general netops and perf-minded people go to Flocon (put on by CERT, and heavily but not exclusively SiLK- and security-focused). Cross-pollination of interests, tools, and techniques will help us all...

> Joe

Thanks,
Avi
Network nerd poker night 11/8 in Seattle
If there are any network+poker nerds in the Seattle area tomorrow, we have 5 seats left at a network nerd poker night I'm hosting tomorrow night. Attendees are from the cloud, content provider, hosting, infra services, travel, and SaaS analytics industries.

We'll have food, drinks, a training session, and will be running ~3 single-table No Limit Texas Hold'em tournaments. If there's time/interest afterwards I may also initiate anyone interested into the wonders of Pot Limit Omaha. Prizes will be Bose headsets, to avoid corporate gift issues with playing for or awarding $.

It's at the W Hotel in Bellevue, at 6pm tomorrow night. The focus is poker, socializing, and free-form network tech, business, and policy nerd discussions. Travel and gadget geeking allowed as well. Kentik is sponsoring the space, tables, and professional dealers, and we'll have a < 5 minute sponsor presentation.

RSVP / info @ https://www.greenvelope.com/viewer/?ActivityCode=.public:ab155c3532ca4bd5ad563ff222b6a338393435313037#details

If it overflows we'll cut off RSVPs at the URL and/or let people know by email. We're also going to organize another in Seattle in Feb and larger ones in NY and the Bay Area in Q1, so if you have interest or ideas for format or quick content topics we could cover, please let me know. One thing we're considering is adding a table for heads-up battles - participants to decide if they want to add peering as part of the stakes.

Thanks,
Avi
Re: oss netflow collector/trending/analysis
There's also SiLK from CMU. It's powerful but has a learning curve.

I also see pmacct being used both by some end networks and by some vendors as part of systems.

Avi

> Hey There,
>
> I was just wondering, for people who are doing netflow analysis with
> open source tools and who are doing at least 10k or more flows per
> second, what are you using?
>
> I know of three tool sets:
>
> - The classic osu flow-tools and the modern continuation/fork.
> - ntop
> - nfdump/nfsen
>
> Is there anything else I've missed? A few folks here really seem to like
> nfsen/nfdump.
>
> Thanks,
>
> Matt
Re: Preferring peers over customers [was: Do Not Complicate Routing
Forgive my potential lack of understanding; perhaps BGP behavior has changed or the way people use it has, but my understanding is -

Since BGP is used in almost all circumstances in a mode where only the best path to a prefix can be re-advertised, only one of the peer or customer paths can be used by a 3rd network, and if the peer path is used for a prefix for a customer, then a transit provider can't easily provide transit for that prefix back to the customer without serious routing shenanigans.

So isn't it in practice the case that if a provider prefers a peer to connect to a customer instead of the direct customer link, that:

1) The provider will lose the ability to bill for traffic delivered to that customer, and
2) The customer will lose redundancy of inbound path, and
3) The customer will almost certainly notice and have the chance to complain

I would expect that most cases of a provider (for a given prefix, which is almost always a caveat here) preferring a peer to get to a customer would be something the customer had some input into via communities, or by calling and bitching if the provider doesn't have a rich communities set or the ability to set them.

One thing one hears every so often (in cycles) is the pressure for emerging Tier1/2 aspirants not to peer with customers of larger potential peers who are also providers, to preserve the revenue models of said larger peers, but that's a different situation.

And - if applied to customers of customers, I'd think it'd revert to the cases above. Network X has customer Y and buys from provider C. If C prefers a peer to get to Y (this is all for a given prefix) and it wasn't due to policy expressed by X or Y via communities or request of provider C by X, then eventually someone's going to figure out that the backup path that presumably X and Y think is being paid for, isn't. Then the people that pay money will bitch and action should be taken.

Consistent announcements by a global network to its peers for the prefixes of a given customer are another level of wonkiness that customers can definitely influence by doing strange per-prefix communities settings, but that again is probably another topic, as it'd presumably be driven by the customer's actions, not the provider's traffic-engineering goals.

Or am I confused here on one, more, or all points? Certainly possible.

One thing I think everyone can agree on - academic models of the ways that people combine routers, money, fiber, contracts, and policy almost never catch up to the creativity, politics, policy, bugs, and stupidities that combine to make the Internet as wonderful as it is.

Avi

> On Sep 5, 2011, at 4:03, Randy Bush wrote:
>
> >> Because routing to peers as a policy instead of customer as a matter
> >> of policy, outside of corner cases make logical sence.
> >
> > welcome to the internet, it does not always make logical sense at first
> > glance.
> >
> > the myth in academia that customers are always preferred over peers
> > comes from about '96 when vaf complained to asp and me (and we moved it
> > to nanog for general discussion) that we were not announcing an
> > identical prefix list to him at east and west. the reason turned out to
> > be that, on one of the routers, a peer path was shorter in some cases,
> > so we had chosen it. we were perfectly happy with that but vaf was not,
> > and he ran the larger network so won the discussion.
>
> The "myth" comes from engineers at large networks saying it is so.
>
> We could also have a small miscommunication here.
> For example, if a customer were multi-homed to a peer, and the customer and peer
> were on the same router, and the customer had prepended a single time (making the
> AS path equal), by your original statement you would have sent traffic to the peer.
> Most people would find that silly. (And please do not point out customers and peers
> do not connect to the same router, this is a simple example for illustrative purposes.)
>
> However, the statement you make above says that you preferred the peer because
> "the path was shorter". You do not specify if that is IGP distance, AS path length,
> or some other metric, but it implies if the path were equal, you would prefer the
> customer - especially since the customer was preferred on the other coast. So there
> may be assumptions on one side or the other that are not clear which are causing confusion.
>
> Either way, this seems operationally relevant.
>
> I would like the large networks of the world to state whether they prefer their
> customer routes over peer routes, and how. For instance, does $NETWORK prefer
> customers only when the AS path is the same, or all the time no matter what?
>
> Let's leave out corner cases - e.g. if a customer asks you, via communities or
> otherwise, to do something different. This is a poll of default, vanilla configurations.
>
> Please send them to me, or the list, wi
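To make the policy question in this thread concrete, here is a toy model of the two behaviors being debated: relationship-based LOCAL_PREF (customer over peer) versus ranking purely on AS_PATH length. The preference values, ASNs, and the two-step decision are illustrative assumptions, not any particular network's configuration.

```python
# Toy model: relationship-based LOCAL_PREF vs. AS_PATH-length-only selection.
from dataclasses import dataclass

RELATIONSHIP_PREF = {'customer': 200, 'peer': 100, 'transit': 50}   # assumed policy

@dataclass
class Route:
    prefix: str
    as_path: tuple
    learned_from: str            # 'customer', 'peer', or 'transit'

    @property
    def local_pref(self):
        return RELATIONSHIP_PREF[self.learned_from]

def best_path(routes):
    # Simplified BGP decision: highest LOCAL_PREF first, then shortest AS_PATH.
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))

if __name__ == '__main__':
    # Customer prepends twice; the peer path to the same prefix is shorter.
    direct = Route('192.0.2.0/24', as_path=(64501, 64501, 64501), learned_from='customer')
    via_peer = Route('192.0.2.0/24', as_path=(64510, 64501), learned_from='peer')

    print('relationship-based policy picks:', best_path([direct, via_peer]).learned_from)

    # Flatten the policy so LOCAL_PREF ties, and AS_PATH length decides instead.
    RELATIONSHIP_PREF['customer'] = 100
    print('length-only policy picks:       ', best_path([direct, via_peer]).learned_from)
```

With the relationship-based policy the prepending customer still wins; flatten the preference and the shorter peer path takes the traffic - which is the scenario the thread is arguing about.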
Re: Cloudflare, and the 120Gbps DDOS "that almost broke the Internet"
An important question... I recall a peering panel at an ISPCON in 1996 when the then-current Peering Badguys, BBN, were represented by John, who listened to a ton of bitching for an hour about the unfairness of it all and said (paraphrasing)...

"I understand you all have your opinions and desires but I just want to point out one thing. It is now 1996, 2 years after the widespread adoption of the web, and in every city in the US there are at least two ISPs happily providing unlimited {dialup} access for under $20/mo. What do you think we'd have if it were run or regulated by the government?"

Luckily, many bureaucrats and politicians in our government do understand that. And so far The Community has been able to put pressure on international bodies, and other governments don't have the clout. Hopefully that remains the case for some time.

Avi

> In general, governments have avoided regulating various aspects of
> the Internet, in part because of lack of understanding and in part
> because the community keeps telling them that trying to regulate
> won't work because of its decentralized nature. As the Internet
> becomes increasingly important to each country's economy and its
> citizens, the status quo is not likely to continue.
>
> The real question is, when governments do decide to try and help
> "improve the Internet", who will they be listening to, and will
> the operator community have spoken with a clear enough voice in
> these matters on what actually would make for an improvement?
>
> FYI,
> /John
Re: DDOS, IDS, RTBH, and Rate limiting
> Netflow is stateful stuff, and just to run it on wirespeed, on hardware,
> you need to utilise significant part of TCAM,

Cisco ASRs and MXs with inline jflow can do hundreds of K flows/second without affecting packet forwarding.

> i am not talking that on some hardware it is just impossible to run it.
> So everything about netflow are built on assumption that hosting or ISP
> can run it. And based on some observations, majority of small/middle
> hosting providers are using minimal, just BGP capable L3 switch as core,
> and cheapest but reliable L2/L3 on aggregation, and both are capable in
> best case to run sampled sFlow.

Actually, sFlow from many vendors is pretty good (per your points about flow burstiness and delays), and is good enough for DDoS detection. Not for security forensics, or billing at 99.99% accuracy, but good enough for traffic visibility, peering analytics, and (d)DoS detection.

> So for a small hosting (up to 10G), i believe, FastNetMon is best
> solution. Faster, and no significant investments to equipment. Bigger
> hosting providers might reuse their existing servers, segment the
> network, and implement inexpensive monitoring on aggregation switches
> without any additional cost again.

It can be useful to have a 10G network monitoring box of course... And with the right setup you can run FastNetMon or other tools in addition to generating flow that can be of use for other purposes as well...

> Ah, and there is one more huge problem with netflow vs FastNetMon -
> netflow just by design cannot be adapted to run pattern matching, while
> it is trivial to patch FastNetMon for that, turning it to mini-IDS for
> free.

It's true, having a network tap can be useful for doing PCAP-y stuff. But taps can be difficult or at least time consuming for people to put in at scale. Even, we've seen, for folks with 10G networks. Often because they can get 90% of what they need for 4 different business purposes from just flow :)

> Best regards,
> Denys

Avi Freedman | Your flow has something to show you; can you see it?
CEO, CloudHelix | (avi at cloudhelix dot com) | my name one word on skype
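As a concrete illustration of the "sampled flow is good enough for DDoS detection" point: scale each sampled record by its sampling rate, aggregate per destination over a short window, and alarm on a pps threshold. The record format, sampling rate, and thresholds below are placeholder assumptions, not FastNetMon's (or anyone's) actual implementation.

```python
# Sketch: volumetric DDoS detection from sampled sFlow/NetFlow records.
from collections import defaultdict

PPS_THRESHOLD = 500_000      # flag a destination above ~500 Kpps (illustrative)
WINDOW_SECONDS = 60

def detect(samples, sample_rate=1000):
    """samples: iterable of dicts like {'dst': '198.51.100.7', 'packets': 1}
    collected over WINDOW_SECONDS; returns [(dst, estimated_pps), ...] over threshold."""
    est_packets = defaultdict(int)
    for s in samples:
        # Each sampled packet stands in for ~sample_rate real ones.
        est_packets[s['dst']] += s['packets'] * sample_rate
    alerts = []
    for dst, pkts in est_packets.items():
        pps = pkts / WINDOW_SECONDS
        if pps >= PPS_THRESHOLD:
            alerts.append((dst, int(pps)))
    return sorted(alerts, key=lambda a: -a[1])

if __name__ == '__main__':
    # 40,000 one-packet samples at 1:1000 over 60s ~= 666 Kpps toward one target.
    fake = [{'dst': '198.51.100.7', 'packets': 1} for _ in range(40_000)]
    fake += [{'dst': '203.0.113.9', 'packets': 1} for _ in range(500)]
    print(detect(fake))
```

Even at 1:1000 sampling, a volumetric attack shows up orders of magnitude above the noise floor, which is why sampling accuracy matters far less here than it would for billing.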
Re: DDOS, IDS, RTBH, and Rate limiting
> > On the contrary - SPAN nee port mirroring cuts into the
> > frames-per-second budget of linecards, as the traffic is in essence
> > being duplicated. It is not 'free', and it has a profound impact on
> > the switch's data-plane traffic forwarding capacity.
> >
> > Unlike NetFlow.
>
> In hosting case mirroring usually done for uplink port, but i have to
> agree, it might be a problem.

Have you seen any issues with SPANning? We usually advise something like a $1k Net Optics tap, or to be cheaper there are actually $50 fiber cables with 30/70 taps embedded (so two such, one for the RX tap and one for the TX tap).

Of course, that only grabs a single 10gig, whereas with SPAN you can potentially do more - but the issue we've seen across vendors is that if you try to send more traffic into a SPAN port than its size, bad things can happen. Head of line blocking, random congestion, and other strange failures. So you trade the potential for catastrophic SPAN-related network destabilization against the guaranteed downtime of bringing links down to tap them.

> "Major" expenses - tuning server according author recommendations, and
> writing shell script that will send to 4948 command to blackhole IP. For
> qualified sysadmin it is 2 hours of work, and $500 max as a "labor"
> cost. Thats it. What can be cheaper than $2000 in this case? I guess i
> wont get answer.

I think the issue is not with your providing the info about FastNetMon, its genesis, and what you see as the great use cases for it - more around the statements on flow as an unusable source of data for various purposes. Things seem to have died down around that though, which is good :)

> ---
> Best regards,
> Denys

Avi Freedman | Your flow has something to show you; can you see it?
CEO, CloudHelix | (avi at cloudhelix dot com) | my name one word on skype
Re: DDOS, IDS, RTBH, and Rate limiting
> > Cisco ASRs and MXs with inline jflow can do hundreds of K flows/second
> > without affecting packet forwarding.
>
> Yes, i agree, those are good for netflow, but when they already exist in
> network.
>
> Does it worth to buy ASR, if L3 switch already doing the job
> (BGP/ACL/rate-limit/routing)?

Not suggesting that anyone should change out their gear, though per my other message, I've seen SPAN make things go wonky on almost every vendor that ISPs use for switching.

> Well, if it is available, except hardware limitations, there is second
> obstacle, software licensing cost. On latest JunOS, for example on EX2200,
> you need to purchase license (EFL), and if am not wrong it is $3000 for
> 48port units.
>
> So if only sFlow feature is on stake, it worth to think, to purchase license,
> or to purchase server. Prices for JFlow license on MX, just for 5/10G is way
> above cost of very decent server.

I believe that smaller MXs can run it for free. Larger providers we've worked with often have magic cookies they can call in to get it enabled, but I understand you're talking about the smaller-provider (or at least ~10gig per POP across multiple POPs) case.

We see a lot of Brocade for switching in hosting providers, which makes sFlow easy, of course.

> > And with the right setup you can run FastNetMon or other tools in
> > addition to generating flow that can be of use for other purposes
> > as well...
>
> Technically there is ipt_NETFLOW, that can generate netflow on same box,
> for statistical/telemetry purposes. But i am not sure it is possible to
> run them together.

At fractional 10gig you can just open pcap on a 10gig interface on a Linux box getting a tap, of course. What we did was use Myricom cards and the myri_snf drivers, take from the single-consumer ring buffers into large in-RAM ring buffers, and make those ring buffers available via LD_PRELOAD or CLI tools to allow flow, snort, p0f, tcpdump, etc. to all be run at the same time at 10gig. The key for that is not going through the kernel IP stack, though.

> > But taps can be difficult or at least time consuming for people to
> > put in at scale. Even, we've seen, for folks with 10G networks.
> > Often because they can get 90% of what they need for 4 different
> > business purposes from just flow :)
>
> About scaling, i guess it depends on proper deployment strategy and
> sysadmins/developers capabilities. For example to deploy new ruleset
> for my pcap-based "homemade" analyser to 150 probes across the country -
> is just one click.

Sounds cool. You should write up that use case. Hopefully you've secured the metadata/command push channel well enough :)

> Best regards,
> Denys

Avi Freedman | Your flow has something to show you; can you see it?
CEO, CloudHelix | (avi at cloudhelix dot com) | my name one word on skype
Re: NetFlow - path from Routers to Collector
Looking at probably 100 networks' flow paths over the last year, I'd say 1 or 2 have OOB for flow. Maybe another 10-20 have interest in taking simpler time series data of top talkers over their OOB networks, but not the flow itself.

Agree w/ Roland that it can cause problems with telemetry if there are big network misconfigs. But for folks seeing DDoS, we implement rate-limiting of the flows/sec via local proxies to avoid overwhelming network capacity with the flow data...

Avi

> I think the key here is that Roland isn't often constrained by
> these financial considerations.
>
> I would respectfully disagree with Roland here and agree with
> Job, Niels, etc.
>
> A few networks have robust out of band networks, but most
> I've seen have an interesting mixture of things and inband is usually
> the best method.
>
> Those that do have "seperate" networks may actually be CoC
> services from another department in the same company riding the same
> P/PE devices (sometimes routers).
>
> I've seen oob networks on DSL, datacenter wifi or cable swaps
> through the fence to an adjacent rack.
>
> An oob network need not be high bandwidth enough to do netflow
> sampling, this is well regarded as a waste of money by many as the costs
> for these can often be orders of magnitude more compared to a pure-IP
> or internet service.
>
> I'll say this ranks up there with people who think
> MPLS VPN == Encryption. It's not unless you think a few byte
> label is going to confuse people.
>
> - Jared
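For anyone curious what such a local proxy looks like in its simplest form, here is a rough sketch: accept export datagrams on one UDP port, forward them to the real collector, and drop anything beyond a token-bucket budget so a flood of flow records can't swamp the path. Addresses, ports, and the budget are placeholders; note that a real deployment would also need to preserve or re-encode the original exporter address, which plain re-sending loses.

```python
# Sketch: UDP flow-export proxy with a simple token-bucket rate limit.
import socket
import time

LISTEN = ('0.0.0.0', 2055)              # routers export here
COLLECTOR = ('192.0.2.10', 2055)        # real collector (placeholder address)
MAX_DATAGRAMS_PER_SEC = 2000            # illustrative budget

def run_proxy():
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(LISTEN)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    tokens, last = MAX_DATAGRAMS_PER_SEC, time.monotonic()
    dropped = 0
    while True:
        data, _src = rx.recvfrom(65535)
        now = time.monotonic()
        # Refill the bucket proportionally to elapsed time, capped at the budget.
        tokens = min(MAX_DATAGRAMS_PER_SEC, tokens + (now - last) * MAX_DATAGRAMS_PER_SEC)
        last = now
        if tokens >= 1:
            tokens -= 1
            tx.sendto(data, COLLECTOR)
        else:
            dropped += 1   # in practice you'd export this counter, not just drop silently

if __name__ == '__main__':
    run_proxy()
```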
Re: NetFlow - path from Routers to Collector
(Said Roland:)

> Again, to clarify - I count VLANs/VRFs as being sufficiently out-of-band
> to handle flow telemetry on a reasonable basis without mixing it in with
> customer traffic.
>
> That changes the ratio.
>
> I agree with you, Avi, and others that a dedicated OOB network *just for
> flow telemetry* doesn't make economic sense in most (any?) scenarios.
>
> What I'm saying is that it oughtn't to be mixed in with customer
> data-plane traffic. Ideally, all management-plane traffic would
> traverse a separate physical infrastructure. Since we don't live in an
> ideal world, virtual separation is generally Good Enough.

We see well under 20% doing logical separation, but definitely folks doing it...

For the definition of OOB as "separate routers and switches", we don't see anyone really sending flow over that kind of OOB network.

> ---
> Roland Dobbins

Avi Freedman
CEO, Kentik
avi at kentik dot com
Re: NetFlow - path from Routers to Collector
(Jared wrote):

> Most people I've seen have little data or insight into their
> networks, or don't have the level that they would desire as
> tools are expensive or impossible to justify due to capital costs.
> Tossing in a recurring opex cost of DC XC fee + transport + XC fee +
> redundant aggregation often doesn't have the ROI you infer here.
>
> I've put together some models in this area. It seems to me the
> DC/real estate companies involved could make a lot (more) money by
> offering an OOB service that is 10Mb/s flat-rate for the same as an XC
> fee and compete with their customers.

Equinix does have a very aggressively priced 10Mb/s flat-rate OOB (single IP only, but that's not that hard to work around) for essentially XC pricing. It's been stable, but not something you'd rely on for 100% packet delivery to some other point on the Internet (so more for reaching a per-POP OOB than for making a coherent OOB network with a bunch of monitoring running 24x7). Still, it's a good value for what it is.

> - Jared

Avi Freedman
CEO, Kentik
avi at kentik dot com
Re: NetFlow - path from Routers to Collector
Agreed, we are as well :) VLAN, VRF, whatever.

+ optimal tweaks include a local flow proxy that can also rate limit / re-sample, and send top-K talkers over 'true' OOB.

Avi Freedman
CEO, Kentik
avi at kentik dot com

> On 2 Sep 2015, at 7:27, Avi Freedman wrote:
>
> > We see well under 20% doing logical separation but definitely folks
> > doing it...
>
> ~20% matches our subjective observations, as well.
>
> We're doing our best to increase that number.
>
> ---
> Roland Dobbins
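A sketch of the "top-K talkers over 'true' OOB" part of that tweak: assume the proxy (or any collector library) has already decoded flow records into dicts, aggregate a per-interval top-K by bytes, and ship a small JSON summary rather than raw flow over the constrained path. The field names and K are assumptions.

```python
# Sketch: reduce decoded flow records to a compact top-K summary for a
# low-bandwidth OOB path.
import json
from collections import Counter

def summarize(decoded_flows, k=20):
    """decoded_flows: iterable of dicts like {'src': ..., 'dst': ..., 'bytes': N}."""
    talkers = Counter()
    total = 0
    for f in decoded_flows:
        talkers[(f['src'], f['dst'])] += f['bytes']
        total += f['bytes']
    top = [{'src': s, 'dst': d, 'bytes': b} for (s, d), b in talkers.most_common(k)]
    return json.dumps({'total_bytes': total, 'top_talkers': top})

if __name__ == '__main__':
    flows = [
        {'src': '10.0.0.1', 'dst': '192.0.2.5', 'bytes': 1_200_000},
        {'src': '10.0.0.2', 'dst': '192.0.2.5', 'bytes': 800},
        {'src': '10.0.0.1', 'dst': '192.0.2.5', 'bytes': 2_400_000},
    ]
    print(summarize(flows, k=2))
```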
Re: EyeBall View
> All,
>
> I had an idea to create a product where we would have a host on every EyeBall
> network. Customers could then connect to these hosts and check connectivity
> back to their network. For instance you may want to see what the speed is
> like from CableVision in central NJ to your network in South Florida or the
> latency etc. Before I go large scale I wanted to know how much demand there was for
> such a service.
>
> Regards,
>
> Dovid

Another approach to take is to enable monitoring of your infrastructure, and then do active tests on top to web servers and other end points. Passive instrumentation gives you the even bigger advantage of giving you insight into issues actually affecting your users' traffic.

Just did a talk about this at NANOG 65:
https://www.nanog.org/sites/default/files/monday_general_freedman_flow.pdf

If you set up a tap or SPAN and grab a box with Intel (or many other kinds of) NICs, you can use PF_RING and nprobe to monitor at 100gig+ speeds.

For nprobe in particular as an "agent", some of the extended/augmented data you can get via NetFlow includes (see http://www.ntop.org/wp-content/uploads/2013/03/nProbe_UserGuide.pdf):

[NFv9 57595][IPFIX 35632.123] %CLIENT_NW_DELAY_MS      Network latency client <-> nprobe (msec)
[NFv9 57596][IPFIX 35632.124] %SERVER_NW_DELAY_MS      Network latency nprobe <-> server (residual msec)
[NFv9 57597][IPFIX 35632.125] %APPL_LATENCY_MS         Application latency (msec)
[NFv9 57581][IPFIX 35632.109] %RETRANSMITTED_IN_PKTS   Number of retransmitted TCP flow packets (src->dst)
[NFv9 57582][IPFIX 35632.110] %RETRANSMITTED_OUT_PKTS  Number of retransmitted TCP flow packets (dst->src)
[NFv9 57583][IPFIX 35632.111] %OOORDER_IN_PKTS         Number of out of order TCP flow packets (dst->src)
[NFv9 57584][IPFIX 35632.112] %OOORDER_OUT_PKTS        Number of out of order TCP flow packets (dst->src)
[NFv9 57585][IPFIX 35632.113] %UNTUNNELED_PROTOCOL     Untunneled IP protocol byte

The NANOG PPT shows an example of some of the slicing and dicing you can then do (focused around retransmitted TCP packets, which is what most of our customers are interested in focusing on as a simple proxy metric for 'network performance'). Not soliciting flames on what the magic metrics should be - store them all and use the ones that best correlate for you :)

Luca/ntop are actively working on nprobe, so I'm sure you could get him to add throughput and other metrics as well.

Also - the same approach should work with Cisco AVC on ASRs, though it's something we're just starting to test and it may only work with specific sets of filters (vs. a blanket apply to 40gig of traffic through an ASR). Definitely curious if anyone in the NANOG community has tried AVC? Or any other switch/router-layer performance instrumentation?

We've been interested in putting an agent on some of the Linux white box switches, but the Broadcom chips in the current gens don't allow 'flow sampling' - getting all headers or none for a flow, for a % of flows matching a profile. And that's needed to do retransmit/OOO/latency tracking (vs. just seeing samples of packets across flows). Again, pointers to switches that have that capability and can run *nix apps would be appreciated :)

Avi Freedman
CEO, Kentik
avi at kentik dot com
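As a small illustration of the "retransmits as a simple proxy metric for network performance" idea: given flow records that carry the nprobe-style retransmit counters listed above, a per-destination retransmit percentage is a one-pass aggregation. The dict keys mirror the field names above, but the record format itself is an assumption (it depends on your exporter and collector).

```python
# Sketch: per-destination TCP retransmit percentage from flow records that
# carry nprobe-style counters.
from collections import defaultdict

def retransmit_pct_by_dst(flows):
    sent = defaultdict(int)
    retrans = defaultdict(int)
    for f in flows:
        dst = f['IPV4_DST_ADDR']
        sent[dst] += f.get('OUT_PKTS', 0)
        retrans[dst] += f.get('RETRANSMITTED_OUT_PKTS', 0)
    return {
        dst: round(100.0 * retrans[dst] / sent[dst], 2)
        for dst in sent if sent[dst]
    }

if __name__ == '__main__':
    flows = [
        {'IPV4_DST_ADDR': '198.51.100.20', 'OUT_PKTS': 1000, 'RETRANSMITTED_OUT_PKTS': 37},
        {'IPV4_DST_ADDR': '198.51.100.20', 'OUT_PKTS': 500,  'RETRANSMITTED_OUT_PKTS': 3},
        {'IPV4_DST_ADDR': '203.0.113.4',   'OUT_PKTS': 2000, 'RETRANSMITTED_OUT_PKTS': 1},
    ]
    print(retransmit_pct_by_dst(flows))
```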
Re: sFlow vs netFlow/IPFIX
Re: limits - for Cisco/Juniper it's in the low hundreds of thousands of flows/sec per chipset/linecard for 1:1 NetFlow/IPFIX, I think.

Then of course, as has been mentioned, you'll need to be able to send it and receive it somewhere - and store + query it.

Avi Freedman
CEO, Kentik

> On 28 February 2016 at 23:40, Nick Hilliard wrote:
>
> Around here they are currently voting on a law that will require unsampled
> 1:1 netflow on all data in an ISP network with more than 100 users. Then
> store that data for 1 year, so the police and other parties can request a
> copy (with a warrant but you are never allowed to tell anyone that they
> came for the data and the judges will never say no).
>
> My routers can apparently actually do 1:1 netflow and the documentation
> does not state any limits on that. So maybe I am lucky?
>
> To the original question: in this country sFlow only is apparently about to
> become illegal.
>
> Regards,
>
> Baldur
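To put the "send, receive, and store it" point in rough numbers, here is the back-of-envelope arithmetic. The flow rate and record size below are assumptions for illustration only.

```python
# Back-of-envelope: export bandwidth and raw storage for 1:1 flow export.
FLOWS_PER_SEC = 300_000          # "low hundreds of thousands" per linecard (assumed)
BYTES_PER_RECORD = 60            # rough NetFlow v9/IPFIX record + overhead guess

export_bps = FLOWS_PER_SEC * BYTES_PER_RECORD * 8
per_day_bytes = FLOWS_PER_SEC * BYTES_PER_RECORD * 86_400

print(f"export stream : {export_bps / 1e6:.0f} Mbps")
print(f"raw per day   : {per_day_bytes / 1e12:.1f} TB (before indexes/replication)")
```

That's roughly 144 Mbps of export traffic and ~1.6 TB/day of raw records per such linecard, before you've indexed or replicated anything.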
Re: sFlow vs netFlow/IPFIX
> This maybe outside the scope of this list but I was wondering if anybody had
> advice or lessons learned on the whole sFlow vs netFlow debate. We are
> looking at using it for billing and influencing our sdn flows. It seems like
> everything I have found is biased (articles by companies who have commercial
> offerings for the "better" protocol)
>
> Todd Crane

Most vendors that take "flow" take both, so there tends not to be THAT much religion unless you talk to someone who hates being flooded with 1:1 flow, or debugging broken (usually NetFlow) implementations. In our experience, they both basically work for ops use cases nowadays, for major vendors of routers, and most switches.

sFlow gives faster feedback and is more accurate (adding things up and multiplying by sample rates comes closer to SNMP counter data) than most NetFlow/IPFIX implementations. How much varies from slightly to extreme (if you're using Catalysts for NetFlow/IPFIX).

My thesis overall re: why sFlow 'just works' a bit better is that it's just so much easier to implement sFlow because there's no need to track flows (hash table or whatever data structure you need). Just grab samples of headers, parse, fill structs, and send.

That said, things are hugely less sucky than 10 or even 5 years ago in the NetFlow world, and for the right vendor and box and software it's possible to get NetFlow/IPFIX essentially as accurate. And as has been noted, at least in theory some boxes that do tens to hundreds of gigabits (or low terabits) of traffic support 1:1, which you could in theory do with sFlow as a transport, but I haven't seen a switch or router that does that.

Re: 1:1 flow - the boxes supporting that are generally not the biggest purchasable from Cisco or Juniper, but are used as the big-boy backbone and border routers by a good number of multi-terabit networks, and even some multi-tens-of-terabit networks.

Good luck in your flow journeys.

Avi Freedman
CEO, Kentik
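On the "adding things up and multiplying by sample rates" point, this is what the arithmetic looks like, along with the usual rule of thumb for how accurate the result is. The 196 * sqrt(1/n) percent-error bound (at roughly 95% confidence, where n is the number of samples for the traffic class) is the standard packet-sampling estimate from the sFlow sampling-theory write-ups; treat it as a rule of thumb, not a guarantee.

```python
# Sketch: scaling sampled counters and estimating sampling error.
import math

def scale_up(sampled_bytes, sample_rate):
    """Estimate the true byte count from bytes seen in 1:sample_rate samples."""
    return sampled_bytes * sample_rate

def pct_error(num_samples):
    """Approximate worst-case % error at ~95% confidence for a class with num_samples samples."""
    return 196.0 * math.sqrt(1.0 / num_samples) if num_samples else float('inf')

if __name__ == '__main__':
    # e.g. 10,000 samples averaging ~1,000 bytes each, taken at 1:2000 sampling
    print(f"estimated bytes: {scale_up(10_000 * 1_000, 2_000):,}")
    print(f"error bound    : +/- {pct_error(10_000):.1f}%")
```

With 10,000 samples the bound is about +/- 2%, which is plenty for visibility and peering analytics, and part of why billing at 99.99% accuracy is the harder case.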
Re: [Paper] B4: Experience with a Globally-Deployed Software Defined
No, people never use *flow controllers* for anything.

People have been doing SDN since before Google was around. OK, so it was horrible expect scripts, but it worked.

Avi

> Unpossible. I heard that no one really uses sdn for anything.
>
> :)
>
> T
Re: [Paper] B4: Experience with a Globally-Deployed Software Defined
> On Sat, Aug 17, 2013 at 2:32 PM, Avi Freedman wrote:
>
> > No, people never use *flow controllers* for anything.
> >
> > People have been doing SDN since before Google was around.
> > OK, so it was horrible expect scripts but it worked.
>
> Not really.

Note I am talking about flow controllers in my first point. (And I was trying to be funny to match Todd's tone, though I guess it's dangerous to try to copy the master.)

Re: flow controllers - the idea of centralized decision makers doing something (typically per flow) has been proposed, in my experience, by those with little operational experience or those with extraordinarily constrained topologies, types of traffic, and usually external filtering to constrain the types of traffic one could face.

Because... there have been no proposals that I have seen (or that those who are at the Major Vendors who follow it more closely tell me about when I ask a few times/year) to actually deal with the every-packet-is-a-flow problem we saw first with 7206VXRs and that remains a real possibility for Internet-connected networks. Distributing flow controllers and making them hierarchical doesn't seem to help in the architectures that I've seen proposed.

So it seems to be of use only for very tiny networks or for very constrained and filtered or non-Internet-connected topologies. I'd be interested to be shown otherwise.

> Automatic reconfiguration of routers is not what a software-defined network is.
>
> It is one of the things (but not all of the things) that SDN provides.
>
> A software defined network is one where the forwarding behavior can be
> completely defined in software running outside of the devices that perform
> the forwarding.

That said, I wince every time someone starts talking about (not suggesting you are here, but many do) making the routing engineer or designer into a box that sits on the bottom of or beside the network. Those who have experience and/or run larger infrastructure usually say words like "of course we have to worry about feedback loops", but many don't.

I think innovation is great, but I don't think there are that many shops that are better off writing their own control plane (centralized, distributed, whatever) right now.

It's worth remembering that Google is a software company. They are far ahead in software-defined everything.

> You can write expect scripts all day; but you cannot turn your basic switch
> into a Load balancer or stateful firewall with one.
> Or decide in real time exactly which destination Ethernet ports a packet
> coming in a certain port is going to touch, without having structured
> VLANs and static MAC tables on the switches ahead of time.
>
> Changing device configurations with expect scripts is a limited tiny subset
> of what SDN is.

True, but the number of production environments that are going to be more stable or scalable by having people build their own control logic is pretty small in my experience. And being able to debug and reach out to a community of operators with a common set of experience of what to do, what not to do, and how to debug is extraordinarily valuable for production networks.

When I look at most of the non-Google big guys, SDN means pushing the vendors for better control plane instrumentation and ability to program (but more on the instrumentation side, as that's where the gaps have been), and potentially to get some cross-provider way of doing the above.
+ having merchant silicon one can get/use for cheap, typically for more constrained topologies, doing pretty dumb switching and/or routing stuff.

> -JH

Where I see the delta a lot, given the customer conversations I have, is in the magic provisioning of cloud network infrastructures.

New school SDN is that everything is a tunnel, magic software maps things, commercial providers doing this uniformly have to aggressively rate-limit their clients, and performance for content delivery is limited because the hypervisors must be bridging and can't do PCI passthrough or SR-IOV.

Old school SDN (not really that old school) is API-based provisioning of network devices with vendor support (let's say Juniper) to do filtering, VLANs, and shaping and tunnels where needed.

It'll definitely be interesting to see where things go over the next few years. I know tens of companies who have run away from cloud providers with new(er)-school SDN-ish infrastructures for the simplicity of just having some high performance dedicated machines/hypervisors with dead simple switching infrastructure.

Anyway, innovation is great, but I just see few companies with the understanding to go build their own control plane software to connect to the Internet with. And those vendors who do build it will get borged by one of the routing/switching vendors and things will become product features, differentiated by providers, most likely. (Though I hope not.)

Avi
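For a sense of scale on the every-packet-is-a-flow problem mentioned above, here is the back-of-envelope arithmetic. The controller setup rate and flow-table size are illustrative assumptions, not any particular product's specs.

```python
# Back-of-envelope: worst-case new-flow rate vs. per-flow controller capacity.
LINK_BPS = 10e9
MIN_FRAME_BITS = (64 + 20) * 8        # 64B frame + preamble/IFG on the wire
CONTROLLER_SETUPS_PER_SEC = 50_000    # generous assumed flow-setup capacity
FLOW_TABLE_ENTRIES = 1_000_000        # assumed hardware flow table size

new_flows_per_sec = LINK_BPS / MIN_FRAME_BITS            # ~14.9M pps, every packet a "new flow"
shortfall = new_flows_per_sec / CONTROLLER_SETUPS_PER_SEC
seconds_to_fill_table = FLOW_TABLE_ENTRIES / new_flows_per_sec

print(f"worst-case new flows/sec : {new_flows_per_sec / 1e6:.1f} M")
print(f"controller shortfall     : {shortfall:.0f}x over capacity")
print(f"flow table fills in      : {seconds_to_fill_table * 1000:.0f} ms")
```

A single line-rate 10G stream of minimum-size spoofed packets is roughly 300x what the assumed controller can set up and fills the assumed table in well under 100 ms, which is the scaling problem being described.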
Re: community real-time BGP hijack notification service (fwd)
Hi, Arnaud.

The design is to only watch the origin ASN, not the other ASNs in the path. Support for doing something with the transit portion of the AS_PATH will be added, probably a very simple "alert if X is in there" or "alert if Y is not in there".

As others have said, it's imperfect, so ideas are welcome, but the goal here is to try to keep it useful but simple.

Thanks,
Avi

> Date: Fri, 12 Sep 2008 14:18:58 +0200
> From: Arnaud de Prelle <[EMAIL PROTECTED]>
> To: Gadi Evron <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED]
> Subject: Re: community real-time BGP hijack notification service
>
> Hello Gadi,
>
> Gadi Evron wrote:
> > Hi, WatchMy.Net is a new community service to alert you when your prefix
> > has been hijacked, in real-time.
>
> Very good initiative. You can count on me as one of your users.
>
> Note that apparently it doesn't seem to be working as expected yet.
> Indeed I already received two false alerts:
>
> 1.
> Subject:
> watchmy.net BGP Alert - seeing {91.198.99.0/24, 6450 3737 701 702 43751}
>
> Body:
> Hello, we are seeing 91.198.99.0/24 being advertised with aspath 6450
> 3737 701 702 43751.
>
> We are alerting you because of the rule you set that is watching for
> prefixes that match or are more specific than 91.198.99.0/24, and are
> originated with any origin AS other than one of 702,6661,8220
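For reference, here is a minimal sketch of the alerting rule as described: watch announcements that match or are more specific than a configured prefix, and alert when the origin ASN (last ASN in the AS_PATH) is not in the allowed set, with the simple "alert if X is (not) in the path" transit check bolted on. The watch list below mirrors the example in the quoted alert, but this is an illustration of the rule, not the service's actual code.

```python
# Sketch: origin-ASN watch with optional required-transit check.
import ipaddress

WATCHES = [
    # (watched prefix, allowed origin ASNs, ASN that must appear in the path or None)
    (ipaddress.ip_network('91.198.99.0/24'), {702, 6661, 8220}, None),
]

def check_announcement(prefix, as_path, watches=WATCHES):
    """prefix: str like '91.198.99.0/25'; as_path: list of ints, origin ASN last."""
    net = ipaddress.ip_network(prefix)
    origin = as_path[-1]
    alerts = []
    for watched, allowed_origins, required_transit in watches:
        if net.version == watched.version and net.subnet_of(watched):
            if origin not in allowed_origins:
                alerts.append(f"{prefix} originated by AS{origin}, expected one of {sorted(allowed_origins)}")
            if required_transit is not None and required_transit not in as_path:
                alerts.append(f"{prefix} path {as_path} missing required AS{required_transit}")
    return alerts

if __name__ == '__main__':
    # The announcement from the quoted alert: origin 43751 is not in the allowed set.
    print(check_announcement('91.198.99.0/24', [6450, 3737, 701, 702, 43751]))
```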
Re: community real-time BGP hijack notification service
> Nathan wrote:
>
> It is trivially easy for an attacker to falsify the origin AS. If 'they' are
> not doing it already, then I'm quite surprised.
> This isn't really a good thing to alarm on, in my opinion. Or, maybe it is, but
> there should be big bold text explaining that it's not reliable as it's
> trivially easy to falsify.

Yep, true. However, there's the case that someone's just typo'd you, which has happened to, near, around, and by me more frequently than an actual jackification. There was the time I fumble-fingered some net99 space and Karl Denninger started tracking me down to threaten lawsuits (before, I might add, asking me to log into the offending device and change the config).

Anyway, the other case is where there shouldn't be a more specific, and you still win. Otherwise, yes, origin AS can be forged, but the transit part is even messier, I think.

> My best idea is looking at the AS_PATH for changes, and alerting whenever that
> happens. You'd obviously get a different path whenever there is churn in the
> network though. I'm sure there's a way to do this, and I suspect having BGP
> feeds from many many places is the most reliable way for it to happen, I just
> haven't figured out why yet.

As you point out, the Internet is a really noisy and messy place. Just doing the "different than usual" is something I resisted here because there's so much hidden partial transit that doesn't normally expose. More BGP feeds might just amplify that behavior, though the idea is to get more feeds in.

> This seems like a service that Renesys etc. could/should (or maybe do?) offer,
> they seem well placed with all their BGP feeds..

Not sure who else offers it; it seemed reasonable to do and see if it's useful. Gadi told me there was no free real-time alerting out there, but I didn't really look into it. Certainly if anyone wants to see the dynamics, who has advertised what now and in the deep dark past, etc., Renesys would be the place to go as far as I know.

> Nathan Ward

Avi
Re: community real-time BGP hijack notification service
> Nathan wrote:
>
> My best quick hack solution so far is to fire off a traceroute and make sure
> that the traceroute gets ICMP TTL expire messages from IP addresses that are in
> prefixes originated from all the ASes in the ASPATH.
> Still forgeable, but a bit more difficult.. still far from perfect though.

An interesting idea, although I think the false positive rate would be very high with all of the filtering (and mismatch between traceroute and BGP topologies) that exists out there.

It'd be interesting for someone to try and see how well it works though. (Any researchers hanging out on NANOG want to try a weekend project...)

> Nathan Ward

Avi
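For anyone tempted by the weekend project, the comparison logic itself is small. The sketch below takes the hop IPs a traceroute returned and a prefix-to-origin-ASN table as inputs (in practice you'd feed these from something like scamper/mtr and a RouteViews/RIS-derived dataset - both are assumed inputs here) and reports which ASNs in the advertised AS_PATH were actually seen on the forwarding path. As noted above, expect false positives wherever ICMP is filtered or the data plane diverges from BGP.

```python
# Sketch: check traceroute hop IPs against the ASNs in an advertised AS_PATH.
import ipaddress

def origin_asn_for_ip(ip, prefix_to_asn):
    """Longest-match lookup of ip in a {network: origin_asn} table; None if no match."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, asn in prefix_to_asn.items():
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best[1] if best else None

def path_consistency(hop_ips, advertised_as_path, prefix_to_asn):
    """Return (asns_seen_on_traceroute, asns_in_announcement_never_seen)."""
    seen = {origin_asn_for_ip(hop, prefix_to_asn) for hop in hop_ips}
    seen.discard(None)
    missing = [asn for asn in advertised_as_path if asn not in seen]
    return seen, missing

if __name__ == '__main__':
    table = {                                    # toy prefix -> origin ASN data
        ipaddress.ip_network('198.51.100.0/24'): 64501,
        ipaddress.ip_network('203.0.113.0/24'): 64510,
    }
    hops = ['203.0.113.1', '198.51.100.9']       # pretend traceroute responses
    print(path_consistency(hops, [64500, 64510, 64501], table))
```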
Re: community real-time BGP hijack notification service
Hi Erik -

There's a great button about Usenet - "Reading Usenet is like drinking from a firehose; Posting to Usenet is like shouting from a mountaintop; Archiving Usenet is like saving used toilet tissue."

BGP may be somewhat more important and useful, and the results consumable in the short-term, but for long-term archiving I think it devolves to being more interesting to researchers and other ubernerds who can use the libraries and the very valuable data store and service that RIPE provides (which is appreciated)!

I was thinking more of the medium-term "what's normal" that goes back beyond whatever's in the routing table this second, probably for a few weeks to months max in most cases.

And I think for actual diagnosis what's needed is a great tool to ask network and business questions of historic BGP data. That's the context in which I mention Renesys tools + data.

So I'd say to help the networkers of the world, it's probably more about tools than history.

Thanks,
Avi

> RIS provides data in a searchable MySQL database for three months.
>
> All we've ever collected is kept in a raw data format. This archive
> starts in 1999, and we maintain a library to read the data.
> This data is free to use for any purpose and we will not remove any of
> our raw data as it gets older.
>
> We are also carefully looking into whether we should reduce or increase
> the amount of data in our MySQL database - as that's easy to search for
> our users.
>
> However, any increase obviously comes with increased resource usage - so
> this is something that requires careful thinking and planning.
> Another option is to store aggregated info on older data, instead of
> keeping every update that ever occured.
>
> But, this is just an idea that crosses our minds from time to time - I'm
> not making promises on what we will implement :)
>
> Of course, any ideas on how much more history would help you, are very
> welcome.
>
> cheers,
> --
> Erik Romijn
> RIPE NCC software engineer
Re: community real-time BGP hijack notification service
Hmm, I'm trying to figure out the application here. You have single prefixes originated or originate-able by more than 5 or 6 ASes?

I see - is it that you have, say, a /16 with 13 potential ASes that might be seen as originating more specifics inside that /16? Hadn't considered that; we were envisioning that those specifics would be set up as separate alerts.

It's easy enough to extend the # of ASNs that can be listed, however. That'll be done this weekend.

Thanks,
Avi

> Looks interesting, but it only takes a fairly short list of ASNs for a
> prefix. For our big CIDR blocks, we have WAY too many ASNs to enter them
> all, so it's pretty useless for me. I need to be able to enter at very
> least a dozen ASes and I suspect many folks have a LOT more than that.
>
> For now, I'll enter some shorter pieces from the block, but I'm most
> concerned with the pieces that are not currently assigned, so are
> available for hijack. I have added the larger, unassigned blocks. I'll
> start adding assigned bits and pieces as well as unassigned pieces, but
> being able to put all valid origin ASes in the list for the full blocks
> would be a lot nicer.
>
> R. Kevin Oberman, Network Engineer