> On Jul 27, 2021, at 17:20, Vimal <j.vi...@gmail.com> wrote:
>
> Hi all, great replies. :) Let me clarify my initial question, and then
> respond one by one:
>
> My intention is to run a web-crawling service on a public cloud. This service
> is geographically distributed, and therefore will run in multiple regions
> around the world inside AWS... this means there will be multiple AWS VPCs,
> each with their own NAT gateway, and traffic destined to websites that we
> crawl will appear to come from this NAT gateway's IP address.
>
> The reason I want a predictable IP is to communicate this IP to website
> owners so they can allow access from these IPs into their networks. I chose
> IP as an example; it can also be a subnet, but what I don't want to provide
> is a list of 100 different IP addresses without any predictability.
>
> I understand that this is not perfect, and would frankly not be my preferred
> approach to solve the problem.... but we've had requests of this nature from
> websites to create an allowlist of a limited number of predictable IPs so it
> doesn't trip their IDSs/other systems they might have... so we're trying to
> see how well it would work in practice. For the moment, let's set aside the
> issue as to whether AWS will even let me advertise the same IP on all my VPC
> NAT gateways, and just look at whether it's technically feasible. My gut
> feeling is that this wouldn't work well in practice, but I wanted to ask the
> experts here...
>
> Also, pointers on what the best practices for solving this issue are most
> welcome, so I can reference those who ask for IP addresses to this discussion
> and follow recommendations here.
>
> Onto the responses:
>
> @o...@delong.com and @wo...@pch.net athomp...@merlin.mb.ca
> > Because there’s no good/reliable way to get the replies back to the correct
> > initiating host.
> >
> > When my clients make connections outbound to anycast addresses, the
> > destination is more-or-less stable, and the replies come back to the
> > client's unique IP, so anycast works in that direction. The guarantees are
> > not present in the reverse direction.
>
> Yes, this makes sense as the destination can be anywhere around the world,
> and that routing is asymmetric as others mentioned. However, if the
> destination service is "close" (in the routing metric sense) to the
> initiating host, anycast return IP ought to work well, right? I understand
> this is a very important caveat and impractical to implement correctly in the
> real world.
>
> > We use our IGP (IS-IS) for our Anycast services. We find it to be very
> > basic, and as such, very predictable.
>
> This is interesting... I wonder whether Anycast will still have some failure
> modes and break TCP connections if routing (configuration) were to change? I
> checked the PDF linked by Bill Woodcock... while the methodology is the same
> from 20y ago, would the data still be the same (order of magnitude)? :)
>
> https://www.pch.net/resources/Tutorials/anycast/Anycast-v10.pdf (p38)
> "Limited operational data shows underlying instability to be on
> the order of one flow per ten thousand per hour of duration."
>
> @dan...@corbe.net, @m...@netfire.net,
> > Unless you’re twisting knobs, egress traffic should already exit your
> > network at the closest possible egress point to its origin. Is your
> > intention to carry the traffic for longer than that?
> No, but I hope my intention is more clear in this email. It's to have a
> predictable egress IP to simplify firewall rules.
>
> thanks all!!
>
>
> On Tue, Jul 27, 2021 at 12:25 PM Adam Thompson <athomp...@merlin.mb.ca> wrote:
> Without any sarcasm: to make it harder to block.
> If, say, Google, always crawled your site from 8.8.1.2 (random made-up
> example) then you would see a not-insignificant number of hosts and networks
> null-routing that IP.
> I have no idea why someone would do so, but I've seen
> it done many times. Mostly by people who don't understand how un-special
> they are on the internet. Also it would trigger IDS/IPS systems all over the
> place, having gobs and gobs of connections coming from a single IP.
>
> That's setting aside the technical issues involved; routing is often
> asymmetric, i.e. the return packet takes a different path than the inbound
> packet. So it would, as Owen implied, be nearly impossible to ensure the
> reply packets got back to the correct TCP stack. As an example, I'm
> multi-homed and use path-prepending, so if a packet claiming to be from
> 8.8.8.8 arrived on one of my commercial links, I would send the reply out the
> cheapest link, which in my case is a flat-rate R&E network (that has a path
> to Google), thus ensuring the reply does not get to the originating anycast
> node.
>
> When my clients make connections outbound to anycast addresses, the
> destination is more-or-less stable, and the replies come back to the client's
> unique IP, so anycast works in that direction. The guarantees are not
> present in the reverse direction.
>
> The logical extremity of this is that it would be nearly impossible for two
> anycast addresses to establish a TCP connection to each other. (In general.
> There will be lots of local cases where it does happen to work, by
> coincidence.)
>
> You'll find that even anycast nodes do not make connections outbound using
> their anycast address, pretty much for these reasons.
>
> -Adam
>
> Adam Thompson
> Consultant, Infrastructure Services
> 100 - 135 Innovation Drive
> Winnipeg, MB, R3T 6A8
> (204) 977-6824 or 1-800-430-6404 (MB only)
> athomp...@merlin.mb.ca
> www.merlin.mb.ca
>
> From: NANOG <nanog-bounces+athompson=merlin.mb...@nanog.org> on behalf of
> Vimal <j.vi...@gmail.com>
> Sent: July 27, 2021 12:54
> To: nanog@nanog.org <nanog@nanog.org>
> Subject: Anycast but for egress
>
> (Unsure if this is the right forum to ask this question, but here goes:)
>
> From what I understand, IP Anycast can be used to steer traffic into a server
> that's close to the client.
>
> I am curious if anyone here has/encountered a setup where they use anycast IP
> on their gateways... to have a predictable egress IP for their traffic,
> regardless of where they are located?
>
> For example, a search engine crawler could in principle have the same IP
> advertised all over the world, but it looks like they don't... I wonder why?
>
> --
> Vimal
>
>
>
> --
> Vimal

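To make the return-path problem described in the quoted message concrete, here is a toy model in Python. It is only a sketch: the node names, site names, and routing costs are all made up, and real BGP path selection is far more involved than picking a minimum cost. The one assumption that matters is that each website's network independently picks its own "nearest" announcement of the shared prefix, regardless of which crawler node actually sent the SYN.

```python
# Toy model of anycast used as a SOURCE address.
# Every name and number below is hypothetical and only for illustration.

# Cost, from each website's network, to reach each crawler node that
# announces the same shared (anycast) prefix.
RETURN_COST = {
    "site-eu": {"crawler-us": 90, "crawler-eu": 10, "crawler-ap": 150},
    "site-us": {"crawler-us": 15, "crawler-eu": 80, "crawler-ap": 120},
    "site-multihomed": {"crawler-us": 40, "crawler-eu": 35, "crawler-ap": 60},
}


def reply_lands_at(website):
    """Return the crawler node that the website's network routes replies to."""
    costs = RETURN_COST[website]
    return min(costs, key=costs.get)


def handshake_completes(origin_node, website):
    """A TCP handshake only works if the SYN-ACK reaches the node that sent the SYN."""
    return reply_lands_at(website) == origin_node


def success_rate():
    """Count (origin node, website) pairs whose replies return to the origin."""
    pairs = [(node, site) for site in RETURN_COST for node in RETURN_COST[site]]
    ok = sum(handshake_completes(node, site) for node, site in pairs)
    return ok, len(pairs)


if __name__ == "__main__":
    ok, total = success_rate()
    print(f"{ok} of {total} node/site pairs complete a handshake")
```

In this model a connection only works when the crawler node that opens it happens to be the one the website's network considers closest, which is the "works by coincidence" case mentioned in the quoted text.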
If I were in your shoes, I’d pick a VPS provider that assigns external,
globally routable IPs to its customers, such as Linode, Vultr, or
DigitalOcean. You may be able to do something on AWS with Elastic IPs, but I
don’t know enough about Amazon’s infrastructure to give you a qualified
answer.
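If each region ends up with a fixed public address, the list handed to website owners can be kept short by collapsing adjacent addresses into CIDR blocks. The sketch below uses Python's standard `ipaddress` module. The region names and addresses are hypothetical (taken from the RFC 5737 documentation ranges), and it assumes the provider lets you hold static, ideally contiguous, addresses, which is not something I can confirm for any particular provider.

```python
import ipaddress

# Hypothetical static egress IPs per region (RFC 5737 documentation ranges).
REGION_EGRESS_IPS = {
    "us-east": ["203.0.113.8", "203.0.113.9"],
    "eu-west": ["203.0.113.10", "203.0.113.11"],
    "ap-south": ["198.51.100.20"],
}


def build_allowlist(region_ips):
    """Collapse all regional egress IPs into the smallest set of CIDR blocks."""
    networks = [
        ipaddress.ip_network(ip)  # a bare address becomes a /32 network
        for ips in region_ips.values()
        for ip in ips
    ]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


if __name__ == "__main__":
    for cidr in build_allowlist(REGION_EGRESS_IPS):
        print(cidr)
```

With the sample data, the four adjacent 203.0.113.x addresses collapse into a single /30, so the published allowlist has two entries instead of five.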