Hello,

On Wed, Oct 4, 2017 at 12:28 AM, Walter Dnes <waltd...@waltdnes.org> wrote:
>   I have some doubts about massive "hosts" files for adblocking.  I
> downloaded one that listed 13,148 sites.  I fed them through a script
> that called "host" for each entry, and saved the output to a text file.
> The result was 1,059 addresses.  Note that some adservers have multiple
> IP address entries for the same name.  A back-of-the-envelope analysis
> is that close to 95% of the entries in the large host file are invalid,
> and return "not found: 3(NXDOMAIN)".
>
>   I'm not here to trash the people compiling the lists; the problem is
> that hosts files are the wrong tool for the job.  Advertisers know about
> hosts files and deliberately generate random subdomain names with short
> lifetimes to invalidate the hosts files.  Every week the sites are
> probably mostly renamed.  Further analysis of the 1,059 addresses show
> 810 unique entries, i.e. 249 duplicates.  It gets even better.  44
> addresses show up in 52.84.146.xxx; I should probably block the entire
> /24 with one entry.  There are multiple similar occurrences, which could
> be aggregated into small CIDRs.  So the number of blocking rules is
> greatly reduced.
>
>   I'm not a deep networking expert.  My question is whether I'm better
> off adding iptables reject/drop rules or "reject routes", e.g...
>
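The CIDR aggregation you describe can be sketched with Python's ipaddress
module. This is only an illustration; the sample addresses and the
per-/24 threshold below are made up:

```python
import ipaddress
from collections import Counter

# Hypothetical resolved adserver addresses (placeholders, not real data)
addrs = ["52.84.146.10", "52.84.146.77", "52.84.146.200", "198.51.100.7"]

# Count how many addresses fall into each /24
by_net = Counter(ipaddress.ip_network(a + "/24", strict=False) for a in addrs)

rules = []
for net, count in by_net.items():
    if count >= 3:  # arbitrary threshold: many hits -> block the whole /24
        rules.append(str(net))
    else:           # otherwise block only the individual hosts
        rules.extend(a for a in addrs if ipaddress.ip_address(a) in net)

print(rules)  # ['52.84.146.0/24', '198.51.100.7']
```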

If you want to filter connections based on IP, then use iptables or
the newer alternative, nftables. nftables is generally faster and more
flexible to configure (named sets, maps, and a single syntax covering
IPv4 and IPv6).
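As a rough sketch of what that could look like with nftables (the table,
set, and chain names, and the example CIDR, are all illustrative), a set
with interval support lets one element cover an aggregated block:

```shell
# Create a table and a set that accepts CIDR intervals
nft add table inet adblock
nft add set inet adblock blocked '{ type ipv4_addr; flags interval; }'
nft add element inet adblock blocked '{ 52.84.146.0/24 }'

# Drop outbound traffic to any address in the set
nft add chain inet adblock out '{ type filter hook output priority 0; }'
nft add rule inet adblock out ip daddr @blocked drop
```

Updating the list later is then just adding or deleting set elements,
without touching the rules themselves.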

I suggest reading the Wikipedia page before the official documentation:
https://en.wikipedia.org/wiki/Nftables.

If you want to block advertisements, you should use a content-aware
system that is integrated into the browser and maintained by many
people at once. You should also consider blocking JavaScript.

Cheers,
     R0b0t1
