Migrating from FreeNAS to FreeBSD


I love FreeNAS. It's awesome, well built, and well supported. But as my needs increased, I wanted to use my FreeNAS box for more than the basics. In particular, I was moving towards a single host to run as a:

  1. Family NAS server
  2. Development server
  3. IRC client
  4. VM server
  5. Web server
  6. Email Server
  7. Git Server
  8. Home Firewall
  9. Home IPv6 gateway
  10. IPv6 VPN and Jump box

FreeNAS could easily do all of this. But I found myself using the device for everything but a NAS server. Also, as my FreeBSD experience was approaching proficiency, I wanted to jump in the deep end and manually configure a production system from scratch. So I thanked FreeNAS for its contribution, yanked out the USB disks and installed FreeBSD 11.1 on a separate USB disk.

During installation, I was careful not to touch the /dev/ada devices, as that would destroy my precious files. Instead, I installed to the second USB disk, /dev/da1, while the installation medium was /dev/da0. This was obviously a problem: at reboot, with the installation medium removed, the USB disk would become /dev/da0 and the kernel would panic upon not finding a /dev/da1. So I dropped to the terminal and mounted the zroot/ROOT/default volume, which is the / directory, to /tmp/root as follows.

zfs set mountpoint=/tmp/root zroot/ROOT/default
zfs mount zroot/ROOT/default

Then I edited /tmp/root/etc/fstab, changed /dev/da1p2 to /dev/da0p2, unmounted the volume, and reset the machine. FreeBSD booted without a glitch.
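The same fstab edit can be scripted. This is a minimal sketch, not what I actually ran: fix_fstab_device is a made-up helper name, and it only rewrites entries that begin with the old device path, keeping a .bak copy alongside.

```sh
#!/bin/sh
# Rewrite a device prefix at the start of fstab lines, keeping a backup.
# fix_fstab_device is a hypothetical helper, not a FreeBSD tool.
# Usage: fix_fstab_device FILE OLD_DEV NEW_DEV
fix_fstab_device() {
    file=$1
    old=$2
    new=$3
    cp "$file" "$file.bak" || return 1
    # '|' is the sed delimiter, so the slashes in device paths need no escaping.
    sed "s|^${old}|${new}|" "$file.bak" > "$file"
}

# Example, using the paths from this post:
#   fix_fstab_device /tmp/root/etc/fstab /dev/da1p2 /dev/da0p2
```

Writing through a backup file avoids relying on sed's in-place flag, which differs between BSD and GNU sed.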

As mentioned, I plan on using this system fairly heavily going forward so the 8 GB USB disk would definitely not be sufficient. FreeBSD has an amazing feature where it isolates the base system from any user-installed applications or configurations. Rather than using symlink magic, my strategy was to store all application data on my two 4TB NAS disks.

First things first, I imported the pool as follows:

zpool import -f tank

The -f flag was necessary because for whatever reason ZFS thought tank was currently in use. A quick zfs list revealed that FreeNAS had been mounting my disks to /tank. Unfortunately, FreeBSD does not use a /tank directory by default. Therefore, I moved each ZFS volume under a new /usr/local hierarchy. First, I created a ZFS volume for tank/usr/local as follows.

zfs create tank/usr/local

Then I renamed the old paths to map to my new intended directory structure, as follows:

zfs rename tank/old/path tank/usr/local/new/path
zfs set mountpoint=/usr/local/new/path tank/usr/local/new/path
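With more than a couple of volumes, the rename and mountpoint pair gets repetitive. Here is a rough sketch that only prints the commands for review. plan_move is a hypothetical helper, and the two dataset names in the list are placeholders, not my real layout.

```sh
#!/bin/sh
# Print (do not run) the rename and mountpoint commands for one volume.
# plan_move is a hypothetical helper; the names below are placeholders.
# Usage: plan_move POOL OLD_PATH NEW_PATH
plan_move() {
    pool=$1
    old=$2
    new=$3
    echo "zfs rename ${pool}/${old} ${pool}/usr/local/${new}"
    echo "zfs set mountpoint=/usr/local/${new} ${pool}/usr/local/${new}"
}

# One "old new" pair per line.
while read -r old new; do
    plan_move tank "$old" "$new"
done <<'EOF'
jails jail
media media
EOF
```

Once the printed commands look right, pipe the output to sh as root to apply them.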

This took a bit of time, but after completing these steps for all volumes, I ran:

zfs mount -a

With that, all ZFS shares were mounted as /usr/local subdirectories. All of my data was successfully migrated over without a single bit of data loss!

From here, I needed to re-create the jails. FreeNAS's excellent web-based jail GUI allows you to create jails with their own independent network stack. This feature is called VIMAGE and is useful to isolate network services from the host FreeBSD system. VIMAGE is pre-compiled into the FreeNAS kernel. It is on by default in FreeBSD 12.0, but in 11.x it is not and must be compiled in. To do this, you need to download and uncompress the src distribution, edit /usr/src/sys/amd64/conf/GENERIC and add in the following line:

options VIMAGE

Next, compile the kernel and install it as follows.

make -j 5 buildkernel
make installkernel

The -j 5 is because this machine is an i3 with 4 cores – feel free to adjust this depending on the number of cores you have.
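If you would rather not hard-code the job count, it can be derived from the CPU count. A small sketch follows; make_jobs is a made-up helper, and the cores-plus-one rule is just the convention I used above, not a requirement.

```sh
#!/bin/sh
# Suggest a make -j value: one more job than there are CPUs.
# make_jobs is a hypothetical helper name.
# Usage: make_jobs NCPU
make_jobs() {
    echo $(( $1 + 1 ))
}

# On FreeBSD the CPU count comes from sysctl(8):
#   cd /usr/src && make -j "$(make_jobs "$(sysctl -n hw.ncpu)")" buildkernel
```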

With a successful reboot, I was now ready to migrate the jails over. I did so by moving the zfs jails volume to /usr/local/jail, such that my IRC client jail was /usr/local/jail/irc. Now the complicated part: Configuring the jails!

Since a jail using VIMAGE has a completely separate network stack, by default it cannot communicate with anything outside of itself. To allow communication, you have to create an epair(4) pair and pass one side to the jail, as follows:

ifconfig epair create
ifconfig epair0a vnet JAILNAME

In this configuration epair0a would belong to the jail while epair0b would belong to the base FreeBSD host, such that they could communicate. But how to set up connectivity? I had a lot of options to have the jails connect outside, including:

  • Being on the same subnet (192.168.1.0/24)
  • Being on a separate VLAN from the rest of the network (might be the long-term plan)
  • Have a single VLAN, have legacy IPv4 addresses identifiably different for ease, but have a single IPv6 network. I opted for this for now. It's simple and works.

This means creating an if_bridge(4) and attaching both the network interface card, in my case an em(4) card, and epairXb to it. Any frame arriving at the bridge is relayed to the relevant epair(4). (Note: this is not a route.) I set my jail IP range as 192.168.100.0/24, just for organizational purposes. I also set the ISP router's IP subnet to be 192.168.0.0/16, otherwise it would drop packets from 192.168.100.0/24. I am using TunnelBroker for my IPv6 traffic, as Verizon Fios does not offer IPv6. (As an aside, this may be a good thing, since ISPs typically block ports, whereas TunnelBroker is completely unfiltered.) With that, boom, network connectivity!
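For reference, the manual bridge setup boils down to two ifconfig(8) calls. The sketch below only prints them. plan_bridge is a hypothetical helper, and em0 is an assumed interface name, so substitute your own NIC.

```sh
#!/bin/sh
# Print the commands that bridge a physical NIC with the host side of an epair.
# plan_bridge is a hypothetical helper; em0 is an assumed interface name.
# Usage: plan_bridge BRIDGE NIC EPAIR_HOST_SIDE
plan_bridge() {
    echo "ifconfig $1 create"
    echo "ifconfig $1 addm $2 addm $3 up"
}

plan_bridge bridge0 em0 epair0b
```

Pipe the output to sh as root to apply it. None of this survives a reboot, which is what the rest of this post deals with.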

But…I wanted something repeatable per reboot, in the event of a power failure or loss. This meant I needed to go a little further. And here’s the complicated part. It took me about 4 hours to properly configure /etc/jail.conf:


/* Template */
host.hostname = "${name}.my.domain.prefix";

$ip4_route      = "192.168.100.1";
$ip6_route      = "IPV6PREFIX::1";

vnet;
vnet.interface = "epair${if}b";

persist;
allow.mount;
mount.devfs;
allow.sysvipc;

exec.prestart =  "ifconfig epair${if} create up";
exec.prestart += "ifconfig epair${if}a up";
exec.prestart += "ifconfig bridge0 addm epair${if}a up";

#exec.start += "/sbin/ifconfig epair${if}b up";
exec.start += "/sbin/ifconfig epair${if}b inet  ${ip4_addr}/24 up";
exec.start += "/sbin/ifconfig epair${if}b inet6 ${ip6_addr} prefixlen 64 up";

exec.start += "/sbin/route -4 add default ${ip4_route}";
exec.start += "/sbin/route -6 add default ${ip6_route}";

exec.start += "/sbin/ifconfig epair${if}b down";
exec.start += "/sbin/ifconfig epair${if}b up";

exec.start += "/bin/sh /etc/rc";

exec.stop = "/bin/sh /etc/rc.shutdown";
exec.poststop = "ifconfig bridge0 deletem epair${if}a";
exec.poststop += "ifconfig epair${if}a destroy";

irc {
	path = /usr/local/jail/irc;
	$if = "0";
	$ip4_addr = "192.168.100.2";
	$ip6_addr = "IPV6PREFIX::2";
}

www {
	path = /usr/local/jail/www;
	$if = "1";
	$ip4_addr = "192.168.100.3";
	$ip6_addr = "IPV6PREFIX::3";
}

In short, upon initialization, this creates a new epair(4) as specified by $if, attaches it to the jail, assigns the relevant IPv4/IPv6 information, and starts the init scripts. Shutdown is a mere detachment from the bridge and destruction of the epair(4). I also needed to assign the legacy IPv4 address to my em(4) interface.
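The jail.conf above assumes bridge0 already exists when the jails start. One way to arrange that at boot is through /etc/rc.conf. The fragment below is a sketch rather than my exact file: em0 is an assumed interface name, and the address on the em(4) interface is left out because it depends on your network.

```sh
# /etc/rc.conf fragment (sketch; em0 is an assumed interface name)

# Create bridge0 at boot and attach the physical NIC to it.
cloned_interfaces="bridge0"
ifconfig_bridge0="addm em0 up"

# Start the jails defined in /etc/jail.conf at boot.
jail_enable="YES"
jail_list="irc www"
```

With this in place, a single jail can also be started by hand with service jail start irc.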

Finally, I added the following sysctl(8) settings to /etc/sysctl.conf:

net.inet.ip.forwarding=1
net.inet6.ip6.forwarding=1

I did a lot of testing, rebooting, restarting the jails, etc., and every time it worked. From the jails' perspective, they didn't even "know" they were migrated from one system to another. I wish I had tested whether a FreeNAS plugin would survive the migration, but I never used FreeNAS plugins anyway (what is this Plex I keep hearing about?).

Going forward, I plan to:

  • Place the jails on a properly separate VLAN to segment the network
  • Consider using pfSense running in bhyve(8) as the jails' firewall of choice
  • Look into vale(4) to replace if_bridge(4). But I can't find any documentation on it!
  • Figure out why TunnelBroker is failing on FreeBSD, but works just fine on my Linux Raspberry Pi – likely the fault of the ISP router.

My only regret: not installing HardenedBSD with LibreSSL.

Thoughts?

FreeBSD kernel Makefile variables SRCTOP and SYSDIR


I am currently writing a FreeBSD device driver and find myself lugging around the entire src. As you can imagine, this is quite large, especially if you are using any sort of version tracking system. So following the example here, I extracted out:

/usr/src/sys/modules/rtwn/
/usr/src/sys/dev/rtwn/

into

/home/user/src/rtwn/sys/modules/rtwn/
/home/user/src/rtwn/sys/dev/rtwn/

However, when I ran make(1) in /home/user/src/rtwn/sys/modules/rtwn, I received an error saying:

make: don't know how to make r92c_attach.c. Stop

This error message is extremely non-descriptive of the actual issue. After reviewing the aforementioned functioning Makefiles, I identified that the SRCTOP and SYSDIR were not set correctly.

SRCTOP is the equivalent of /usr/src. If your src directory differs from /usr/src, such as $HOME/src/freebsd12src, you would set SRCTOP to $HOME/src/freebsd12src/.

SYSDIR is similar. Ordinarily it would be /usr/src/sys, but now it might be $HOME/src/freebsd12src/sys/.

This can be resolved two ways:

  1. Command-line override. I am doing this:
    make VARIABLE="something"
    For me, that would be:
    make SRCTOP=$HOME/src/freebsd12src/ SYSDIR=$HOME/src/freebsd12src/sys/ -C sys/modules/rtwn load
  2. Permanent method: Edit the Makefile in question, in my case sys/modules/rtwn/Makefile.
    SRCTOP=/home/user/src/freebsd12src
    SYSDIR=/home/user/src/freebsd12src/sys
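To avoid retyping both variables, and to keep the two paths from drifting apart, the command-line override can be wrapped in a tiny script. This is only a sketch: module_make_cmd is a made-up helper name, and it prints the make(1) command line instead of running it.

```sh
#!/bin/sh
# Print the make(1) command line for building a module against a source
# tree that lives outside /usr/src. module_make_cmd is a hypothetical helper.
# Usage: module_make_cmd TREE MODULE_DIR TARGET...
module_make_cmd() {
    tree=${1%/}        # strip one trailing slash, if any
    moddir=$2
    shift 2
    echo "make SRCTOP=${tree} SYSDIR=${tree}/sys -C ${moddir} $*"
}

module_make_cmd "$HOME/src/freebsd12src" sys/modules/rtwn load
```

Deriving SYSDIR from the same tree argument means the two variables always point into the same source directory.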

And of course, you have to have at least one correct src directory in order to compile a kernel object. This is pretty simple, but it confused me for a while. Hope this helps! Keep writing that BSD code!

Linux kernel code vs FreeBSD kernel code


Linux driver code contains some serious garbage. I heard this refrain, but I did not realize how bad it was until I looked at it myself. Here is just one example.

Device drivers typically read static memory, commonly known as EEPROM or ROM, from the chip to identify version, hard-coded information, device capabilities, etc. These values are used throughout execution of the driver. Reading the ROM is among the first things done when the device is attached and powered on.

In the case of FreeBSD, after the kernel reads the ROM, it uses a pointer to a struct whose fields are laid out to match the ROM, and points it at the ROM blob data stored in memory. For example:

struct r88e_rom {
	uint8_t		reserved1[16];
	uint8_t		cck_tx_pwr[R88E_GROUP_2G];
	uint8_t		ht40_tx_pwr[R88E_GROUP_2G - 1];
	uint8_t		tx_pwr_diff;
	uint8_t		reserved2[156];
	uint8_t		channel_plan;
	uint8_t		crystalcap;
#define R88E_ROM_CRYSTALCAP_DEF		0x20

	uint8_t		thermal_meter;
	uint8_t		reserved3[6];
	uint8_t		rf_board_opt;
	uint8_t		rf_feature_opt;
	uint8_t		rf_bt_opt;
	uint8_t		version;
	uint8_t		customer_id;
	uint8_t		reserved4[3];
	uint8_t		rf_ant_opt;
	uint8_t		reserved5[6];
	uint16_t	vid;
	uint16_t	pid;
	uint8_t		usb_opt;
	uint8_t		reserved6[2];
	uint8_t		macaddr[IEEE80211_ADDR_LEN];
	uint8_t		reserved7[2];
	uint8_t		string[33];	/* "realtek 802.11n NIC" */
	uint8_t		reserved8[256];
} __packed;

_Static_assert(sizeof(struct r88e_rom) == R88E_EFUSE_MAP_LEN,
    "R88E_EFUSE_MAP_LEN must be equal to sizeof(struct r88e_rom)!");

Notice the assertion at the bottom, which ensures that the ROM struct’s size equals a pre-defined length. The code will fail to compile if this assertion is not valid. Later, the kernel will instantiate a struct pointer and point it to the ROM, stored in the variable buf, as follows:

struct r88e_rom *rom = (struct r88e_rom *)buf;

Now, rom->channel_plan is set to the correct value. Simple.

Unfortunately, this is not how the same code is written on Linux. As mentioned, the Linux driver also begins by reading the ROM blob and storing it in a variable called hwinfo. But rather than creating an equivalent struct pointer, the Linux code uses offset values of the ROM on an as-needed basis. For example, the driver reads the version as follows:

rtlefuse->eeprom_version = *(u16 *)&hwinfo[params[7]];

In this example, params[7] comes from a list of ROM offset values set in the previous calling function. (That alone made tracing difficult.) The rtlefuse->eeprom_version is now the same as FreeBSD's rom->version. This manual process repeats for every variable in the ROM.

That alone may be merely annoying and cost a negligible bit more CPU time, and it would not be a problem if it were all done in one place. But instead, the driver reads from the hwinfo blob on a seemingly as-needed basis during execution. And because these as-needed reads happen during normal operation, the driver reads in the same static value from hwinfo every time a simple WiFi function occurs, such as changing the channel.

Okay, but even that might not be too difficult…right? Here’s the real kicker.

Sometimes, the driver works by using incrementing offsets from the ROM blob. For example, consider read_power_value_fromprom (in drivers/net/wireless/realtek/rtlwifi/hw.c). It declares eeaddr as a u32 (uint32_t), then assigns it the offset value EEPROM_TX_PWR_INX. So far so good. But then, rather than using new offsets for every successive value, it increments the eeaddr value in multiple doubly-nested for-loops. Here is a simplified version of the code:

for (rfpath = 0; rfpath < MAX_RF_PATH; rfpath++) {
	/* 2.4G default value */
	for (group = 0; group < MAX_CHNL_GROUP_24G; group++) {
		pwrinfo24g->index_cck_base[rfpath][group] =
		    hwinfo[eeaddr++];
		if (pwrinfo24g->index_cck_base[rfpath][group] == 0xFF)
			pwrinfo24g->index_cck_base[rfpath][group] = 0x2D;
	}
}

Notice the line hwinfo[eeaddr++]! Merely reading in that variable changes the offset. It's the Heisenberg Uncertainty Principle equivalent of code. This is a cleaned-up version of the 188-line function. The actual function has 6 nested for-loops, some with if-statements, each incrementing the eeaddr parameter as they go along.

Why would anyone do it this way? You are needlessly using up the CPU, making the code difficult to follow, and repeatedly reading in static values, and any minor modification, re-ordering or re-structuring will essentially break the entire function.

And perhaps the worst offender is when, 20 functions deep, you are not even working with hwinfo anymore. You are working with a pointer to hwinfo that has been incremented God-knows-where, with its own offsets that are nearly impossible to track down.

In my efforts to port this driver to FreeBSD, I literally resorted to printing out the entire ROM, manually finding the memory, and backing into the equivalent offset. Other bizarre code: I have seen if-conditions that are impossible to reach, misplaced code that belongs in the previous function, and code that does part of a task while another function does the entire task, so the work is needlessly repeated.

How does this make it into the Linux Kernel?

To be fair, this does not appear to be the fault of Larry Finger, who maintains this driver. This is the fault of Realtek, for vomiting this terrible driver in the first place, providing absolutely zero documentation and refusing to respond to any contact attempts.

I hope my FreeBSD port is cleaner and more performant!

Wasteful Government Spending


A few weeks back my boss called us all in and said, “We have left-over money on this contract. I need to spend $1 million in the next two weeks.”
So we ordered a whole bunch of hardware and software – a $60,000 server, 2 smaller $6,000 servers, 4 NAS appliances, about 40 $15k monitors, lots of software, two racks, switches, firewalls, etc.
We set all this up, built out the infrastructure and VLANs, and configured everything.
The government just came in and said none of this was properly approved and it needs to be shut down.
That’s $1 million wasted.

My dad said the ATF was the most wasteful government agency he ever worked for. He said they once ordered hundreds of high-end radios (the MSRP was $2,000), warehoused them, never used them for 10 years, and then sold them on the open market.

This is why we are in debt. Programs are right, but spending is insane.