System-Wide Domain Blacklisting

Feb. 5, 2024 [hardening] [privacy-security] [guides] [libre] [technology]

Many of the same block lists used by ad blocker extensions can also be applied globally through your hosts file, which maps every listed domain to a non-routable address. You may sometimes see this referred to as “blackholing” DNS. If a program looks up cookielaw.org, and cookielaw.org is in this hosts file, the name resolves to 0.0.0.0 and the connection attempt fails immediately, preventing the program from reaching the actual destination.
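
To make the mechanism concrete, here is a minimal sketch of checking whether a name is blackholed in a hosts-format file. The function name is_blackholed is hypothetical, and it assumes entries of the form "0.0.0.0 name", which is what the script later in this guide produces.

```shell
#!/bin/bash
# Hypothetical helper: succeed if DOMAIN has a 0.0.0.0 entry in a hosts-format file.
# Usage: is_blackholed DOMAIN [HOSTS_FILE]   (HOSTS_FILE defaults to /etc/hosts)
is_blackholed() {
    local domain=$1 hosts_file=${2:-/etc/hosts}
    awk -v d="$domain" '
        { sub(/#.*/, "") }              # ignore comments
        $1 == "0.0.0.0" {
            for (i = 2; i <= NF; i++)   # a hosts line may carry several names
                if ($i == d) found = 1
        }
        END { exit !found }             # exit status 0 only if an entry was seen
    ' "$hosts_file"
}
```

For example, "is_blackholed cookielaw.org && echo blocked" would print "blocked" once the domain is in /etc/hosts. On a live system, "getent hosts cookielaw.org" should likewise return the 0.0.0.0 address, assuming the hosts file is consulted first.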

The concept of blackholing has been popularized among newbie privacy communities by the likes of Pi-hole. But I think that Pi-hole misses the mark. First of all, we don’t want to rely on a system outside of the host, not just because that introduces yet another device which needs to be rigorously secured, but also because you may take your computer, especially a laptop, to another network somewhere else. And what happens when you hop on to a system-wide VPN connection? Additionally, our Tor-wrapped DNS solution detailed in Hardened DNS will bypass any such device sitting on the LAN attempting to mediate DNS.

While blackholing can be accomplished simply by pasting a domain list into hosts by hand, I have substantially reworked a script to automate the process and to allow several different lists to be seamlessly combined. Create a cron or anacron job for the script to run, and make the file executable (chmod +x) once it is saved, since cron skips non-executable files:

sudo vi /etc/cron.daily/hosts-block

And populate it with the following:

#!/bin/bash

#Automated script for maintaining a malware blocking hosts file
#Originally created by user SteveRiley https://www.kubuntuforums.net/showthread.php/56419-Script-to-automate-building-an-adblocking-hosts-file?s=e56f4375b9ded5ca30e26346a06d71f3
#Adapted and extended to only accept lists over https, add working directory, automatically apply to hosts, add configurable list categories, generalize beyond just ad blocking, add support for lists already pointing to 0.0.0.0, and prevent overwriting hosts with empty list (such as network issue)

if [ "$(whoami)" != "root" ]; then
    echo "Aborting: Must be run as root or via sudo."
    exit 1
fi

# If this is our first run, save a copy of the system's original hosts file and set to read-only for safety
if [ ! -f /var/local/hosts-blocking/hosts-system ]; then
	echo "Saving copy of system's original hosts file..."
	mkdir -p /var/local/hosts-blocking
	cp /etc/hosts /var/local/hosts-blocking/hosts-system
	chmod 444 /var/local/hosts-blocking/hosts-system
fi

# Perform work in temporary files
temphosts1=$(mktemp)
temphosts2=$(mktemp)

# Configurable blocklist sources
block_lists=(\
#Block advertisements
"https://pgl.yoyo.org/adservers/serverlist.php?hostformat=hosts&showintro=0&mimetype=plaintext" \
"https://adaway.org/hosts.txt" \
"https://hostsfile.mine.nu/hosts0.txt" \
"https://raw.githubusercontent.com/jdlingyu/ad-wars/master/hosts" \

#Block malware
"https://www.malwaredomainlist.com/hostslist/hosts.txt" \
"https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Risk/hosts" \

#Block crypto miners
"https://raw.githubusercontent.com/hoshsadiq/adblock-nocoin-list/master/hosts.txt" \
"https://raw.githubusercontent.com/anudeepND/blacklist/master/CoinMiner.txt" \
"https://zerodot1.gitlab.io/CoinBlockerLists/hosts" \
"https://zerodot1.gitlab.io/CoinBlockerLists/hosts_browser" \

#Block spam
"https://raw.githubusercontent.com/FadeMind/hosts.extras/master/add.Spam/hosts" \

#Block trackers
"https://hostfiles.frogeye.fr/multiparty-trackers-hosts.txt" \
"https://hostfiles.frogeye.fr/firstparty-trackers-hosts.txt" \

#Block clickjackers & bad referers
"https://raw.githubusercontent.com/mitchellkrogza/Badd-Boyz-Hosts/master/hosts" \

#Block Facebook
"https://raw.githubusercontent.com/jmdugan/blocklists/master/corporations/facebook/all" \

#Block Google
#"https://raw.githubusercontent.com/jmdugan/blocklists/master/corporations/google/non_localized" \
#"https://raw.githubusercontent.com/jmdugan/blocklists/master/corporations/google/localized" \

#Block Huawei
"https://raw.githubusercontent.com/PikaMikaTuu/huawei-block-list/master/huawei-block-host.txt" \

#Block NSA known domains
"https://pastebin.com/raw/tNBM1j19" \
#STATIC BACKUP OF LAST KNOWN LIST BEFORE CHEF-KOCH TAKE DOWN

#Monolithic lists to block spyware, ads, scams, spam, shock sites, popups, trackers, etc.
#"https://someonewhocares.org/hosts/zero/hosts" \
)

# Obtain various hosts files and merge into one
echo "Downloading blocklist files..."
successful_lists=0
for list in "${block_lists[@]}"; do
	# Fetch each list over Tor; count it only if wget reports success
	if torsocks wget --https-only --no-cookies -nv -O - "$list" >> "$temphosts1"; then
		((++successful_lists))
	fi
done

# Proceed only if temphosts1 is non-empty
if [ -s "$temphosts1" ]; then
	# Do some work on the file:
	# 1. Remove MS-DOS carriage returns
	# 2. Replace 0.0.0.0 with 127.0.0.1 to handle lists that already point to 0.0.0.0
	# 3. Delete all lines that don't begin with 127.0.0.1
	# 4. Delete any lines containing the word localhost because we'll obtain that from the original hosts file
	# 5. Replace 127.0.0.1 with 0.0.0.0 because then we don't have to wait for the resolver to fail
	# 6. Scrunch extraneous spaces separating address from name into a single tab
	# 7. Delete any comments on lines
	# 8. Clean up leftover trailing blanks
	# Pass all this through sort with the unique flag to remove duplicates and save the result
	echo "Parsing, cleaning, de-duplicating, sorting..."
	sed -e 's/\r$//' -e 's/^0\.0\.0\.0/127.0.0.1/' -e '/^127\.0\.0\.1[[:blank:]]/!d' -e '/localhost/d' -e 's/^127\.0\.0\.1/0.0.0.0/' -e 's/[[:blank:]]\+/\t/' -e 's/#.*$//' -e 's/[[:blank:]]*$//' < "$temphosts1" | sort -u > "$temphosts2"

	# Combine system hosts with blocks
	echo "Merging with original system hosts..."
	printf '\n# General malware blocking hosts generated from %s out of %s lists on %s\n' "$successful_lists" "${#block_lists[@]}" "$(date)" | cat /var/local/hosts-blocking/hosts-system - "$temphosts2" > /var/local/hosts-blocking/hosts-block

	# Apply final blocklist to system hosts file
	cp /var/local/hosts-blocking/hosts-block /etc/hosts

	# Clean up temp files and tell the user what was done
	echo "Cleaning up..."
	rm -f "$temphosts1" "$temphosts2"
	echo "Done. The blocking hosts file has been applied to /etc/hosts."
	echo
	echo "You can re-apply it manually at any time with this command:"
	echo " sudo cp /var/local/hosts-blocking/hosts-block /etc/hosts"
	echo
	echo "You can always restore your original hosts file with this command:"
	echo " sudo cp /var/local/hosts-blocking/hosts-system /etc/hosts"
	echo "so don't delete that file! (It's saved read-only for your protection.)"
	echo
	exit 0
else
	# Prevent existing blocklists from being overwritten with empty list
	echo "Aborting: No blocklist content has been retrieved into the working file."
	rm -f "$temphosts1" "$temphosts2"
	exit 1
fi
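
To see what the cleaning stage does to a downloaded list, the same sequence of steps can be tried on a few sample lines. This is a sketch that mirrors the script's numbered steps rather than a copy of it: the function name clean_hosts and the sample domains are made up, the dots in the addresses are escaped, and GNU sed is assumed.

```shell
#!/bin/bash
# Hypothetical stand-alone version of the script's cleaning stage (GNU sed assumed).
# Reads raw hosts-format lines on stdin, writes cleaned, de-duplicated entries to stdout.
# Steps, in order: strip CRs, normalize 0.0.0.0 to 127.0.0.1, drop everything else,
# drop localhost lines, switch back to 0.0.0.0, squeeze the separator into one tab,
# strip comments, strip trailing blanks.
clean_hosts() {
    sed -e 's/\r$//' \
        -e 's/^0\.0\.0\.0/127.0.0.1/' \
        -e '/^127\.0\.0\.1[[:blank:]]/!d' \
        -e '/localhost/d' \
        -e 's/^127\.0\.0\.1/0.0.0.0/' \
        -e 's/[[:blank:]]\+/\t/' \
        -e 's/#.*$//' \
        -e 's/[[:blank:]]*$//' | sort -u
}

# Sample input mixing both address styles, a comment, a duplicate and a localhost line
printf '%s\n' \
    '# an example list' \
    '127.0.0.1 localhost' \
    '127.0.0.1   ads.example.com # banner server' \
    '0.0.0.0 tracker.example.net' \
    '0.0.0.0 ads.example.com' | clean_hosts
```

Run as written, the sample collapses to two tab-separated entries, one for ads.example.com and one for tracker.example.net, both pointing at 0.0.0.0.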

The Configurable blocklist sources section can be adjusted to include lists which have been commented out. Simply remove the leading “#”. You may want to do this if you don’t plan on connecting to any Google services, for example. You may also find inspiration in the lists used by uBlock, uMatrix or other add-ons. Just make sure that the list is in hosts format with IPv4 entries pointing to 127.0.0.1 or 0.0.0.0, since the script discards every other line.
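
Before enabling a new source, it may help to check how much of it the script would actually keep. A rough sketch, assuming the candidate list has already been downloaded to a local file; the function name count_usable_entries is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: count the lines of a downloaded list that start with
# 127.0.0.1 or 0.0.0.0 followed by whitespace, i.e. the only lines the
# blocking script retains. A count of 0 suggests the list is not in hosts format.
count_usable_entries() {
    grep -cE '^(127\.0\.0\.1|0\.0\.0\.0)[[:blank:]]' "$1"
}
```

A list written in Adblock filter syntax (lines such as ||example.com^) would report 0 and contribute nothing to the final hosts file.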

If you want it applied immediately instead of waiting for the daily update job, run the script directly with root privileges:

sudo /etc/cron.daily/hosts-block
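
If the daily job never seems to fire, the file itself is worth checking. On Debian-style systems, run-parts only executes files in /etc/cron.daily that have the executable bit set and whose names consist solely of letters, digits, underscores and hyphens, so a name like hosts-block.sh would be skipped. A small sketch of that check, assuming those default naming rules; the function name cron_runnable is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: succeed if a file looks runnable by Debian-style run-parts.
# Assumption: default run-parts naming rules (letters, digits, underscore, hyphen only).
cron_runnable() {
    local file=$1 name
    name=$(basename -- "$file")
    [[ -x $file && $name =~ ^[A-Za-z0-9_-]+$ ]]
}
```

For example, "cron_runnable /etc/cron.daily/hosts-block || echo 'cron will skip this file'". If the executable bit is the problem, "sudo chmod 755 /etc/cron.daily/hosts-block" sets it.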

All of the lists will be updated daily over Tor. You can check the status of your hosts file by running:

grep -e General /etc/hosts

It should reveal whether any lists were skipped, which may indicate that a link is broken. For example:

# General malware blocking hosts generated from 16 out of 17 lists on Sat 26 Feb 2022 12:49:16 AM EST
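
The same check can be scripted, for instance to warn when any list was skipped. A sketch that assumes the header format written by the script above; the function name lists_skipped is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: print how many lists were skipped, based on the
# "generated from X out of Y lists" header line in a hosts file.
# Prints nothing if no such header is present.
lists_skipped() {
    local hosts_file=${1:-/etc/hosts}
    sed -n 's/^# General malware blocking hosts generated from \([0-9][0-9]*\) out of \([0-9][0-9]*\) lists.*/\1 \2/p' "$hosts_file" |
        tail -n 1 |
        awk '{ print $2 - $1 }'
}
```

With the example header above, lists_skipped would print 1, which is a hint to look through the source URLs for a dead link.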

As with earlier customizations, make sure that Firefox is set to respect your system domain resolution instead of Mozilla’s disgraceful Cloudflare honeypot. Now if your ad blockers fail for whatever reason, most malicious domains should still be blocked through this defense-in-depth strategy.
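
One way to verify the Firefox side is to look for the preference that turns off DNS over HTTPS. The sketch below assumes that DoH was disabled by setting network.trr.mode to 5 in a profile's user.js; the path to that file depends on your profile, and the function name doh_disabled is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a Firefox prefs file explicitly disables DoH
# via network.trr.mode = 5 (assumption: this is the pref used to turn it off).
# Usage: doh_disabled /path/to/profile/user.js
doh_disabled() {
    grep -qE '^[[:blank:]]*user_pref\("network\.trr\.mode",[[:blank:]]*5\);' "$1"
}
```

If the check fails, Firefox may still be sending lookups past the hosts file, and the blocks above would not apply to the browser.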