Updating mlx4en Firmware on FreeBSD

My testbed has Nvidia/Mellanox 10GbE network cards which are quite old, but sit well on the price (super cheap) to usability (they work great on FreeBSD and Linux) spectrum.

There is an issue on FreeBSD 14-CURRENT where kldload hangs when you load the kernel module for the card (mlx4en). If you hit Control-C the process will continue and the module will load properly. This is also an issue when you load the module using kld_list in rc.conf, and as my router machine can't be managed with serial yet I have no way to press Control-C when it is booting.

On the bug report I was asked if the firmware is up to date. It wasn't, and updating it was not fun.

You should follow the Mellanox instructions rather than my blog post to do a firmware update, but the Mellanox FreeBSD documentation (Linux) for the cards is from 2015 and this is the process that worked for me in 2021.

Mellanox have a tools package you can download, there is also a port called mstflint you can install:

# pkg install mstflint

The Mellanox tools need to know which card they are speaking to, you can find the card with pciconf once you have loaded the kernel module:

# pciconf -lv | grep mlx4
mlx4_core0@pci0:9:0:0:  class=0x020000 rev=0x00 hdr=0x00 vendor=0x15b3 device=0x1007 subvendor=0x15b3 subdevice=0x000c
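
If you want to script this step, the device selector that mstflint takes can be pulled out of the pciconf line. This is a minimal sketch in Python, assuming the `name@selector:` layout shown above; the function name `pci_selector` is made up for illustration.

```python
import re

# Matches the "driver@pciD:B:S:F:" prefix that pciconf -l prints for a device.
_SELECTOR = re.compile(r"^(?P<name>\w+)@(?P<selector>pci\d+:\d+:\d+:\d+):")

def pci_selector(pciconf_line, driver_prefix="mlx4"):
    """Return the PCI selector (e.g. 'pci0:9:0:0') if the line is for driver_prefix, else None."""
    m = _SELECTOR.match(pciconf_line.strip())
    if m and m.group("name").startswith(driver_prefix):
        return m.group("selector")
    return None

# The line from the pciconf output above.
line = ("mlx4_core0@pci0:9:0:0:  class=0x020000 rev=0x00 hdr=0x00 "
        "vendor=0x15b3 device=0x1007 subvendor=0x15b3 subdevice=0x000c")
print(pci_selector(line))
```

The result is the string you pass to mstflint with -d.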

Downloading the firmware for the card requires an OPN and a PSID from the card. You can use mstflint to get information about the card with the PCI address and the 'q' query command:

# mstflint -d pci0:9:0:0 q          
Image type:            FS2
FW Version:            2.40.5030
FW Release Date:       4.1.2017
Product Version:       02.40.50.30
Rom Info:              type=PXE version=3.4.746
Device ID:             4103
Description:           Node             Port1            Port2            Sys image
GUIDs:                 ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff 
MACs:                                       ec0d9ae13420     ec0d9ae13421
VSD:                   
PSID:                  MT_1200111023

This didn't give me an OPN so instead I looked at every entry on the download page until I found the correct PSID.
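
The query output is simple "Key: value" text, so the PSID and firmware version can be picked out with a few lines of Python. This is only a sketch: the helper names are hypothetical, the sample is trimmed from the output above, and 2.42.5000 is the version of the image flashed later in this post.

```python
def parse_mstflint_query(text):
    """Turn 'Key:   value' lines from `mstflint ... q` into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields

def fw_tuple(version):
    """'2.40.5030' -> (2, 40, 5030) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

# Trimmed copy of the query output shown above.
sample = """\
Image type:            FS2
FW Version:            2.40.5030
Rom Info:              type=PXE version=3.4.746
PSID:                  MT_1200111023
"""

info = parse_mstflint_query(sample)
print(info["PSID"])
# True when the card is older than the image we want to flash.
print(fw_tuple(info["FW Version"]) < fw_tuple("2.42.5000"))
```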

With the firmware downloaded and unzipped you can flash it using mstflint and the 'b' (burn) command:

# mstflint -d pci0:9:0:0 -i fw-ConnectX3Pro-rel-2_42_5000-MCX312B-XCC_Ax-FlexBoot-3.4.752.bin b

Current FW version on flash:  2.40.5030
New FW version:               2.42.5000

Burning FS2 FW image without signatures - OK  
Restoring signature                     - OK

FreeBSD/Ubuntu Dual-boot testbed using Desktop Hardware

Tony needed his computers back. He is a great friend and so he offered me some replacements. This means that we get to make version 2 of "The Bedroom by the bed testbed".

The first version of the testbed was built to answer questions of the form:

"How does stock Ubuntu compare to FreeBSD?"

Version 2 of the testbed continues this approach with an addition:

"How does stock Ubuntu compare to FreeBSD, and what if we try emulating a satellite network?"

Tony and I spoke for a while about my needs from a testbed. The machines he lent me before were part of a set: they were two boxes in a 6 node Bioinformatics cluster. He needed them back to start doing shake out tests of the cluster so he can start selling time on it (check out his excellent company).

While I have been using his machines all of the work so far has been developing experiment tooling. For me the time to replace the machines was actually quite fortunate, I have managed to get automation working, but I don't yet have any finished results. This means that I can move to new machines and apart from a short down time reconfiguring things there shouldn't be any disruption.

To replace the Opteron 6380 systems he offered me 2 Threadripper 1950X systems.

-left , -right

  • Threadripper 1950X
  • Asrock X399M Taichi motherboard
  • 32GB RAM @2666MHz ( -right has 64GB, -left will get more once it arrives in the mail)
  • SSD storage

The Opteron machines were server motherboards in massive whale sized cases, and they made whale sized sounds when they were running. The Threadrippers are in lovely little mini-ATX cases and when they are running they make a lovely little hum that easily vanishes into the background when I type on my cherry blue keyboard.

The smaller quieter form factor comes from the use of desktop hardware, sadly this also means the loss of lights out management.

When Tony listened to my needs he also offered me a third system (so I can answer the questions the addition raises), but I turned him down. For longer term plans I need to own machines and I am grudgingly happy to buy hardware to get experiments running.

I wanted to lean hard into using desktop hardware for non desktop tasks (my computers don't have an SLA to fulfil). With this goal in mind I specced out a Ryzen 5950X system that was a bit of a monster. I got cold feet at the price and decided to build a compatible system using a lower end Ryzen 7 processor. Gazing into the future I think this might be a safe bet: if I need more compute I should be able to pick up a 5950X on ebay for ~50% of the list price.

For 10GbE network I put a single port Mellanox Connect-X 3 Pro interface in each.

pokeitwithastick

  • Ryzen 3700x
  • Asrock X570 PRO4
  • 32GB RAM @2666MHz
  • NVME storage

From the 'big machine' spec I culled this down to less and slower RAM and a lower end processor. I think I should be able to push this rig up with more, faster RAM and a bigger processor, but I get the flexibility to try doing this on the cheap first.

The Ryzen system has a Mellanox Connect-X 3 Pro interface with Dual ports. It is going to be routing packets.

This machine is able to be a more moving target so I installed FreeBSD-14-CURRENT on it from a recent snapshot. The use of stock CURRENT is worth noting when you think about the performance of the Ryzen system compared to the others.

I only know 1 functional test that I really care about and that is building FreeBSD. I pulled the pairs of drives from the testbed v1 machines and transferred them over (the lack of 2.5" drive hot swap on the Threadrippers was annoying). Once I got the drives on the correct SATA cables the boxes came up and I was able to see how all three machines did:

host              processor            RAM     time (buildworld buildkernel)

freebsd-left      Opteron 6380         128GB   58:00
freebsd-left      Threadripper 1950X   32GB    30:45
freebsd-right     Threadripper 1950X   64GB    30:06
pokeitwithastick  Ryzen 3700X          32GB    33:09
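
To put the build times side by side, here is a small Python sketch that turns them into speedups over the Opteron. It assumes the times are minutes:seconds; the dictionary labels are just for illustration.

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string into seconds (assumed format)."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# Build times copied from the table above.
build_times = {
    "Opteron 6380": "58:00",
    "Threadripper 1950X (-left)": "30:45",
    "Threadripper 1950X (-right)": "30:06",
    "Ryzen 3700X": "33:09",
}

baseline = to_seconds(build_times["Opteron 6380"])
for name, mmss in build_times.items():
    # Speedup relative to the old Opteron machine.
    print(f"{name}: {baseline / to_seconds(mmss):.2f}x")
```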

Network Setup

With a third testbed machine the network diagram from before changes slightly. Rather than the interfaces of the two machines being connected back to back, they are now connected to pokeitwithastick which is acting as a router.

The network now looks like this:

              -left                pokeitwithastick        -right

                          10.0.10.x               10.0.20.x
           +-------------+         +-------------+         +-------------+
           |           .2|         |.1         .1|         |.2           |
           |             |         |             |         |             |
           |       mlxen0+<------->+mlxen0 mlxen1+<------->+mlxen0       |
           |             |         |             |         |             |
           |     igb0    |         |    igb0     |         |     igb0    |
           +------+------+         +-----+-------+         +------+------+
                  |                      |                        |
                  |               freebsd|192.168.100.50          |
                  ._______               |                    ____.
                          \________.     |    .______________/
freebsd 192.168.100.10             V     V    V              freebsd 192.168.100.20
linux   192.168.100.11           +--------------+             linux   192.168.100.21
                                 |    switch    |
                                 |   (openwrt)  |
                                 +--------------+
                                        ^
                                        |
                                        | freebsd 192.168.100.2
                                   +---------+
                                   | control |
                                   +---------+

This is a pretty standard dumbbell network and is a good setup for performance work when you need a bottleneck.

Remote Power On

There were two features of the previous hardware that I really liked. Both machines had serial ports, which gave me a last ditch management interface option if I completely hosed the network while I wasn't at home. And the machines supported lights out management with IPMI. In a sheer irony, even if the boxes hadn't had serial broken out on the motherboard, IPMI would have given me access.

I was using serial and IPMI to allow me to power on the machines remotely and control which Operating System they booted into. IPMI allowed power on, grub was configured to output to serial and video and that gave me boot control.

Tony doesn't have the same requirements as me for machines so while my Asrock X570 motherboard has a COM Port header, the Taichi motherboards don't.

Wake on LAN (WOL) is a poor replacement for IPMI power control. It is sort of famously badly implemented and it is sort of clear why. It is a packet with the target's MAC address repeated 16 times that turns on the system by magic. No matter your opinion of WOL, the Intel network interfaces on all three machines seem to be very good at booting with WOL when they get the packets.
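
For the curious, the magic packet is simple enough to build by hand: six 0xFF bytes followed by the target MAC address sixteen times. This is a minimal Python sketch, not what the tools in this post do internally. The function names are made up, and UDP port 9 is a conventional choice I am assuming (tools differ on the default port).

```python
import socket

def build_magic_packet(mac):
    """Six 0xFF bytes followed by the target MAC repeated 16 times (102 bytes)."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("expected a 6-byte MAC address")
    return b"\xff" * 6 + mac_bytes * 16

def send_magic_packet(mac, address="255.255.255.255", port=9):
    """Send the packet over UDP: broadcast by default, or to a single host."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Needed for the broadcast address, harmless for unicast.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (address, port))

# Build (but do not send) a packet for one of the MAC addresses in this post.
packet = build_magic_packet("a8:a1:59:95:77:ab")
print(len(packet))
```

Calling `send_magic_packet("a8:a1:59:95:77:ab")` would broadcast it, and passing `address="192.168.100.50"` would send it to a single host instead.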

WOL had to be configured in the BIOS before it could be used:

In the X570 PRO4 BIOS configure:
    Advanced->ACPI Configuration->PCIE Devices Power On
        "Allow the system to be waked up by a
        PCIE device and enable wake on LAN"

In the Taichi BIOS configure:
    Advanced->ACPI Configuration->PCIE Devices Power On
        "Allow the system to be waked up by a
        PCIE device and enable wake on LAN"

With the BIOS set up, remote power on requires using a WOL tool to send a magic packet. The control host is well placed on the network to do this with the wol command:

control $ wol a8:a1:59:95:87:60

After running the command I got nothing.

This is fine, I am a network engineer and a hacker(!), I can debug this sort of issue. Some time with tcpdump showed that I wasn't getting broadcast traffic through at all.

I tested this assertion by using a host directed WOL packet:

control $ wol a8:a1:59:95:77:ab -i 192.168.100.50

These packets appear on the host in tcpdump and after a power off are able to wake the machines up. Well, at first. If I waited a while the machine stopped responding to the WOL packet.

The diagram in the v1 network and the diagram I would have drawn for the v2 network at first was a lie. control is connected to the switch on the back of an OpenWRT router that is acting as a WiFi client to the network in the house. That switch is in turn connected to an unmanaged Netgear switch that all the rest of the network is connected to (IPMI and useful interfaces).

One of these switches was not forwarding broadcast traffic and it was only forwarding unicast traffic when the host was 'alive' enough. Not having IPMI anymore I was able to remove the Netgear switch, my needs now fit onto the four port switch on the OpenWRT router. Removing the Netgear switch didn't solve the problem and I can't remove the OpenWRT router so a different solution is required.

OpenWRT has a tool called etherwake to support Wake On LAN, I installed this on the router and I immediately got consistently working WOL:

root@OpenWrt:~# etherwake a8:a1:59:95:77:ab

I have to ssh to the router to run power on commands, but that is enough for the testbed to be useful now.

Controlling the booted OS

Having a serial interface to access the grub menu and select the booted OS was great. But I have to wipe away my tears and accept that this isn't possible with this hardware.

Grub supports something called grub-reboot. Normally grub tries very hard to not write anything to disk in normal operation. You can however configure grub to use a scratch space and remember which operating system was booted before (and maybe other things).

grub-reboot uses this mechanism from the Operating System to control which menu item grub uses as default when it boots. This is part of the grub environment and is documented here.

/etc/default/grub

# If you change this file, run 'update-grub' afterwards to update
# /boot/grub/grub.cfg.
# For full documentation of the options in this file, see:
#   info -f grub -n 'Simple configuration'

GRUB_DEFAULT=saved
GRUB_TIMEOUT_STYLE=menu
GRUB_TIMEOUT=10
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0,115200 console=tty0"
GRUB_CMDLINE_LINUX=""

# Uncomment to disable graphical terminal (grub-pc only)
GRUB_TERMINAL="console serial"
GRUB_SERIAL_COMMAND="serial --speed=115200"

To use grub-reboot you need to set the default menu entry to 'saved'. If you changed grub to enable grub-reboot make sure to update grub:

# sudo update-grub

With grub configured with a default menu entry of saved, we can tell grub to use a menu entry other than the first one for the next boot only.

# sudo grub-reboot 4
# sudo reboot

On these systems this allows me to boot from the FreeBSD menu entry (it is fifth in the list, and grub counts entries from 0). Without serial, to boot into FreeBSD I have to do a round trip into Linux, but this is a lot better than having to sit near the hot loud computers.
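
To double check which number to hand to grub-reboot, you can list the top level entries of the generated grub.cfg in order. This is a rough Python sketch resting on one assumption: that top level menuentry and submenu lines start at column 0 and nested ones are indented, which is how the generated file usually looks. The sample config is made up and much shorter than a real one, so its index differs from the 4 used above.

```python
import re

def top_level_entries(grub_cfg_text):
    """List titles of top-level menuentry/submenu lines in menu order."""
    pattern = re.compile(r"""^(?:menuentry|submenu)\s+(['"])(.*?)\1""")
    titles = []
    for line in grub_cfg_text.splitlines():
        m = pattern.match(line)  # indented (nested) entries do not match
        if m:
            titles.append(m.group(2))
    return titles

# Hypothetical, heavily trimmed grub.cfg for illustration only.
sample_cfg = """\
menuentry 'Ubuntu' --class ubuntu {
}
submenu 'Advanced options for Ubuntu' {
\tmenuentry 'Ubuntu, with Linux 5.4.0' {
\t}
}
menuentry "FreeBSD 13.0" {
}
"""

titles = top_level_entries(sample_cfg)
# Zero-based position, the form grub-reboot expects.
print(titles.index("FreeBSD 13.0"))
```

On a real system you would feed it the contents of /boot/grub/grub.cfg instead of the sample.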

Performance

The generation change in AMD hardware had a huge improvement in processing speed, going from 58 minutes for a FreeBSD build to 30 minutes is amazing. These machines do networking stuff so it is good to look at network benchmarks for a baseline.

On Ubuntu and FreeBSD, the Threadripper machines are able to saturate 10GbE with TCP and get about 6Gbit/s of UDP traffic. They can generate enough UDP to saturate the link so this is a huge step forward.

iperf3 benchmarks where the Ryzen system is the receiver manage half the traffic that the Threadripper systems do, capping out at about 4.5Gbit/s. The Ryzen can however send enough to saturate the link.

Base forwarding tests show no change in the throughput of the Threadripper systems. I had never considered that receive could be harder than transmit, but these baselines (and chatting to FreeBSD developers) seem to suggest that this isn't uncommon. For now this isn't a problem, but later I might need the Ryzen system to have more head room when running tests. If that happens I'll have a good reason to get a faster processor :D

FreeBSD/Ubuntu Dual Boot Homelab in The Bedroom by the bed testbed

Current events have meant that my work place is now my home office, frustratingly this is also where I sleep. On one hand this has resulted in a very short commute, but on the other hand it does mean that I am living in close quarters with the computers I use for experiments, on the third hand (where did that come from?) it means that I get to have a testbed in the room where I keep my bed.

Of course this raises the serious question, if I write tests from my bed, which bed is the testbed? Unfortunately I only did one year of philosophy and so others will have to offer answers to this grand question.

I am in the unusual (for me) situation of needing to (well getting to, I love perf) do network performance tests on real hardware. Thankfully my friend Tony was able to lend me two machines from his Bioinformatics cluster to play with for a couple of months.

This testbed exists to answer questions of the form:

"How does stock Ubuntu compare to FreeBSD?"

For these tests to be as fair as possible I need the hardware to be as identical as possible. Tony enabled this by giving me machines built out to the same spec. Annoyingly his target wasn't "push packets as fast as possible", but was instead "give a reasonable mix of a ton of storage and compute to look at DNA sequences". Anything weird in these computers is clearly his fault and we are very grateful to get to experience them.

A dmesg from one of the boxes is here , but roughly they are:

  • CPU: AMD Opteron(tm) Processor 6380 (2500.05-MHz K8-class CPU)
  • 128GB RAM
  • SuperMicro MNL-H8DGI6 motherboard
  • A pair of SSDs on a PCI-e SATA controller

On top of that I added a pair of dual port 10GbE interfaces I found 'lying' around the labs. One is an Intel X520 82599ES and the other is a Mellanox ConnectX-3 Pro. These interfaces don't match and have different performance characteristics. This is fine for setting up the experiments, but I am going to replace them with a matched pair of single interface 10Gb adapters for 'production' experiments.

My normal method for evaluating if a machine is fast is to build FreeBSD. They managed a buildworld buildkernel in a respectable 58 minutes.

Setup

                        -left                                 -right                        
                 +------------------+    10.0.x.x      +------------------+                
                 |                  |.10.2        .10.1|                  |                
     ipmi        |            mlxen0+<---------------->+ix0               |     ipmi          
192.168.100.173  |                  |                  |                  | 192.168.100.167
                 |            mlxen1+<---------------->+ix1               |                
                 |                  |                  |                  |                
                 |          igb0    |                  |     igb0         |                
                 +-----------+------+                  +------+-----------+                
                             |___                          ___|                            
      freebsd 192.168.100.10     \_______          _______/   freebsd 192.168.100.20       
       linux  192.168.100.11             V        V           linux   192.168.100.21
                                      +--------------+                                     
                                      |    switch    |                                     
                                      +--------------+                                     
                                             ^                                             
                                             |
                                             | freebsd 192.168.100.2                     
                                        +---------+                                        
                                        | control |                                        
                                        |  host   |                                        
                                        +---------+

The two boxes are named with the suffixes '-left' and '-right' with the running OS setting the prefix, so we have freebsd-left, freebsd-right, ubuntu-left and ubuntu-right.

The machines have 3 network interfaces on the motherboard: dual Gigabit Intel Ethernet and an interface for IPMI. On each, one interface and the IPMI are connected to a switch which is in turn connected to the switch in the wireless router that bridges to the WiFi in my house. This setup is a little complicated, but because there isn't ethernet run up to my bedroom WiFi is the only sensible way to connect to the Internet. I much preferred a bit of NAT weirdness to having to set up WiFi on the testbed machines.

I connected the serial ports on '-left' and '-right' to my control host, which is in the same switch domain as the hosts. I configured the SuperMicro motherboard to send the bios to the serial port.

I am really not using IPMI for all its abilities and instead it is a fancy remote power button I can press from the control host:

    [control] $ ipmitool -I lanplus -H 192.168.100.173 -U ADMIN -P ADMIN chassis power on   # power on -left
    [control] $ ipmitool -I lanplus -H 192.168.100.167 -U ADMIN -P ADMIN chassis power on   # power on -right

Serial Console

The serial ports are then connected to the control computer with an awesome two headed usb serial cable (I don't know what it is and would buy more if I did). The operating systems on -left and -right are configured to offer consoles over serial so I don't have to worry about breaking the network and locking myself out when I am far away.

On FreeBSD getting serial for loader and the system requires adding config to loader.conf and is documented in the FreeBSD handbook . Look at the bottom where it says "Setting a Faster Serial Port Speed" (I think the rest of the stuff on the page is out of date and rebuilding with a custom config is no longer required):

/boot/loader.conf:

    boot_multicons="YES"
    boot_serial="YES"
    comconsole_speed="115200"
    console="comconsole,vidconsole"

This configures loader to use both the video console and the serial console, tells it to use serial and sets the serial to the baud rate '115200' from the slow default of 9600. This baud rate matches between FreeBSD, Ubuntu and the BIOS so I don't have to reconfigure my serial terminal.

Getty (the thing that gives you login prompts) on FreeBSD is configured as 'onifconsole', so no further config is required. You can check this in /etc/ttys :

/etc/ttys:

    ...
    # The 'dialup' keyword identifies dialin lines to login, fingerd etc.
    ttyu0   "/usr/libexec/getty 3wire"      vt100   onifconsole secure
    ttyu1   "/usr/libexec/getty 3wire"      vt100   onifconsole secure
    ttyu2   "/usr/libexec/getty 3wire"      vt100   onifconsole secure
    ttyu3   "/usr/libexec/getty 3wire"      vt100   onifconsole secure

Getting serial for GRUB on Ubuntu (in 2021) requires adding settings to /etc/default/grub.d and rebuilding the config file with update-grub. This isn't really documented, but information can be found in the grub manual and in a selection of blogposts. For grub serial we need to add:

/etc/default/grub.d

    GRUB_TERMINAL="console serial"             
    GRUB_SERIAL_COMMAND="serial --speed=115200"

I am pretty sure the grub.d that ships with Ubuntu is out of date with the actual file. When I rebuilt, the menu timeout broke: it went from the default 10 seconds to the 0 seconds in the grub.d that I edited. I didn't care enough to file a bug report, this was a lot of faff.

To get console messages from the Linux kernel you need to change the flags passed to the kernel when it is booted. You can do this from grub.d too:

/etc/default/grub.d:

    GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0,115200 console=tty0"

This tells the Linux kernel to use ttyS0 (the first serial port) as a console, configures the baud rate to '115200' and tells the kernel to use tty0 as a console too. After a few messages the kernel will hand over to something else that will ignore the console config if you haven't also configured systemd to offer a console.

To configure systemd to use a console you need to create a services file and it will handle the magic for you. This is documented on the Ubuntu wiki, but everything there is wrong. Instead the systemd versions are available in blog posts you can find online. I found the best results following documentation for a different distro targeting the Raspberry Pi 3 .

    # systemctl enable serial-getty@ttyS0.service
    # systemctl start serial-getty@ttyS0.service

You should now have a working getty on serial, but I think you then need to kick something else, I could only get this to work by rebooting.

Booting

To do comparisons I need to be able to boot both Operating Systems and manage them remotely. Dual boot of some sort means that I can dig into differences on the two platforms quickly and get answers from the running systems.

Dual booting the machines turned out to be a lot harder than I expected. When I got the bios output working on serial I thought I was on to a winner, but that pesky SATA controller doesn't play well with the BIOS boot menu. Only the first drive in the SATA controller pair appears in the boot selector leaving me plumb out of luck.

Instead I dove into the Linux world. Knowing that grub knows how to boot FreeBSD I went with using grub to get a boot menu that I can control from the serial port. ( side note: I know that grub is a multiboot compatible boot loader, meaning that it will boot anything that matches that spec. I think it is also multiboot compatible and can be chained, i.e. grub can boot grub. If that is the case then FreeBSD's loader is also multiboot compatible and the FreeBSD kernel is probably too, can loader then boot grub? It will take a truly brave person to figure this particular puzzle out. )

After installing FreeBSD to the second SSD in the SATA controller, I got messages about a FreeBSD install being detected when I ran update-grub. I think these were just for fun though, I didn't get any new menu entries when I test rebooted. I installed FreeBSD by pulling the drive I installed Ubuntu to, booting a FreeBSD USB installer and installing to the only drive (thanks hot swap bay!).

Configuring grub to boot FreeBSD requires adding an entry to one of the extra config files. Internet searching suggested /etc/grub.d/40_custom which I filled out with:

/etc/grub.d/40_custom:

    #!/bin/sh
    exec tail -n +3 $0
    # This file provides an easy way to add custom menu entries.  Simply type the
    # menu entries you want to add after this comment.  Be careful not to change 
    # the 'exec tail' line above.

    menuentry "FreeBSD 13.0" {
        set root=(hd1)
        chainloader +1
    }

A Unix StackExchange Answer helped me figure out the rough grub commands to use and I tried them out on the grub command line (press 'c' from the menu).

The final grub config looks like this (notice that default grub is friendly and doesn't beep by default), with the above /etc/grub.d/40_custom:

/etc/default/grub:

    # If you change this file, run 'update-grub' afterwards to update            
    # /boot/grub/grub.cfg.                                                       
    # For full documentation of the options in this file, see:                   
    #   info -f grub -n 'Simple configuration'                                   

    GRUB_DEFAULT=0                                                               
    GRUB_TIMEOUT_STYLE=menu                                                      
    GRUB_TIMEOUT=-1                 # pause at bootloader menu                   
    GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
    GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0,115200 console=tty0"               
    GRUB_CMDLINE_LINUX=""                                                        

    # Uncomment to enable BadRAM filtering, modify to suit your needs            
    # This works with Linux (no patch required) and with any kernel that obtains 
    # the memory map information from GRUB (GNU Mach, kernel of FreeBSD ...)     
    #GRUB_BADRAM="0x01234567,0xfefefefe,0x89abcdef,0xefefefef"                   

    # Uncomment to disable graphical terminal (grub-pc only)                     
    #GRUB_TERMINAL=console                                                       
    GRUB_TERMINAL="console serial"                                               
    GRUB_SERIAL_COMMAND="serial --speed=115200"                                  

    # The resolution used on graphical terminal                                  
    # note that you can use only modes which your graphic card supports via VBE  
    # you can see them in real GRUB with the command `vbeinfo'                   
    #GRUB_GFXMODE=640x480                                                        

    # Uncomment if you don't want GRUB to pass "root=UUID=xxx" parameter to Linux
    #GRUB_DISABLE_LINUX_UUID=true                                                

    # Uncomment to disable generation of recovery mode menu entries              
    #GRUB_DISABLE_RECOVERY="true"                                                

    # Uncomment to get a beep at grub start                                      
    #GRUB_INIT_TUNE="480 440 1"

Baseline Measurements

Before running more enjoyable experiments it is a requirement to get baseline measurements for what the systems can do. I think I need a bit more of a test framework for network performance tests, I want to sample memory usage, CPU usage and get flame graphs for tests, but for starters it is good to get raw iperf3 numbers.

For each configuration, I ran forward and backward iperf3 tests with UDP and TCP. I let iperf3 run in its default 10 second measurement mode. For UDP I requested it try infinite bandwidth (-b 0).

For each case I ran iperf3 as a server on *-right and the client on *-left .

Remember for these that by default the client iperf3 process sends and the server receives, this is swapped with the -R flag.

freebsd-left -> freebsd-right (server)

    tcp             iperf3 -c 10.0.10.1
            [  5]   0.00-10.00  sec  6.58 GBytes  5.66 Gbits/sec    0             sender
            [  5]   0.00-10.00  sec  6.58 GBytes  5.65 Gbits/sec                  receiver

    tcp             iperf3 -c 10.0.10.1 -R
            [  5]   0.00-10.00  sec  8.34 GBytes  7.16 Gbits/sec  2485            sender
            [  5]   0.00-10.00  sec  8.34 GBytes  7.16 Gbits/sec                  receiver

    udp             iperf3 -c 10.0.10.1 -u -b 0

            [  5]   0.00-10.00  sec  3.63 GBytes  3.11 Gbits/sec  0.000 ms  0/2666610 (0%)  sender
            [  5]   0.00-10.00  sec  2.03 GBytes  1.74 Gbits/sec  0.006 ms  1173881/2666555 (44%)  receiver

    udp             iperf3 -c 10.0.10.1 -u -b 0 -R

            [  5]   0.00-10.00  sec  3.24 GBytes  2.79 Gbits/sec  0.000 ms  0/2384960 (0%)  sender
            [  5]   0.00-10.00  sec  1.91 GBytes  1.64 Gbits/sec  0.003 ms  977341/2384881 (41%)  receiver

We run baselines so we can understand what future measurements show. Care has to be taken that things are actually fair.

FreeBSD -> FreeBSD on the same hardware is a fair test, but it isn't what we have here. When we compare the forward and reverse modes for the iperf3 measurement we see that when freebsd-left is the sender for TCP we get a much lower throughput than when freebsd-right is the sender. My guess is that this is the difference between the offload engines in the Intel and Mellanox cards.

FreeBSD -> FreeBSD for UDP has interesting results. freebsd-left with the Mellanox card is able to sink more packets into the network than freebsd-right with Intel. Annoyingly these are opposite to the TCP results, where freebsd-right can send more.

This might already be highlighting an interesting place to dig, and it is where I would look next, IF I were comparing network interfaces.
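
The loss figures in the UDP summaries come straight from the lost/total datagram counts, so they are easy to recompute with more precision than iperf3 prints. This is a small Python sketch, assuming the summary line layout shown above; the function name `udp_loss` is made up.

```python
import re

# Matches the "lost/total (pct%)" tail of an iperf3 UDP summary line.
_UDP_TAIL = re.compile(r"(\d+)/(\d+)\s+\(([\d.]+)%\)")

def udp_loss(summary_line):
    """Return (lost, total, loss_percent) recomputed from a UDP summary line."""
    m = _UDP_TAIL.search(summary_line)
    if m is None:
        raise ValueError("no lost/total field found")
    lost, total = int(m.group(1)), int(m.group(2))
    return lost, total, 100.0 * lost / total

# Receiver line from the freebsd-left -> freebsd-right UDP test above.
receiver = ("[  5]   0.00-10.00  sec  2.03 GBytes  1.74 Gbits/sec  "
            "0.006 ms  1173881/2666555 (44%)  receiver")

lost, total, pct = udp_loss(receiver)
print(f"{pct:.1f}% lost, {total - lost} datagrams delivered")
```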

ubuntu-left -> ubuntu-right (server)

    tcp             iperf3 -c 10.0.10.1

    [  5]   0.00-10.00  sec  9.59 GBytes  8.24 Gbits/sec  823             sender
    [  5]   0.00-10.00  sec  9.59 GBytes  8.23 Gbits/sec                  receiver

    tcp             iperf3 -c 10.0.10.1 -R

    [  5]   0.00-10.00  sec  11.0 GBytes  9.41 Gbits/sec    0             sender
    [  5]   0.00-10.00  sec  11.0 GBytes  9.41 Gbits/sec                  receiver

    udp             iperf3 -c 10.0.10.1 -u -b 0

    [  5]   0.00-10.00  sec  2.09 GBytes  1.79 Gbits/sec  0.000 ms  0/1546210 (0%)  sender
    [  5]   0.00-10.00  sec  1.50 GBytes  1.29 Gbits/sec  0.006 ms  436284/1546104 (28%)  receiver

    udp             iperf3 -c 10.0.10.1 -u -b 0 -R

    [  5]   0.00-10.00  sec  2.08 GBytes  1.79 Gbits/sec  0.000 ms  0/1544000 (0%)  sender
    [  5]   0.00-10.00  sec  1.12 GBytes   965 Mbits/sec  0.015 ms  710878/1543876 (46%)  receiver

Next up Ubuntu -> Ubuntu. When looking at later measurements we need an idea of where changes come from. Isolating out variables is a good thing to do.

Again with the TCP tests we see a difference in performance between the two systems. For Ubuntu -> Ubuntu it is approximately 1.2 Gbit/s, whereas for FreeBSD it is around 1.5 Gbit/s, but FreeBSD has a lower baseline for comparison.
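As a quick check of the Ubuntu figure, here is a minimal sketch that takes the two sender rates from the Ubuntu -> Ubuntu summary lines above and works out the gap between the directions. The variable and function names are mine, made up for illustration.

```python
# Sender-side TCP rates in Gbit/s, copied from the Ubuntu -> Ubuntu results.
ubuntu_forward = 8.24   # iperf3 -c 10.0.10.1     (ubuntu-left sends)
ubuntu_reverse = 9.41   # iperf3 -c 10.0.10.1 -R  (ubuntu-right sends)


def direction_gap(forward_gbps, reverse_gbps):
    """Return the absolute difference between the two directions in Gbit/s."""
    return round(abs(reverse_gbps - forward_gbps), 2)


gap = direction_gap(ubuntu_forward, ubuntu_reverse)
print(f"Ubuntu -> Ubuntu TCP gap: {gap} Gbit/s")
```

The same function can be pointed at the FreeBSD -> FreeBSD sender rates to compare the two gaps side by side.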

Next up for UDP with Ubuntu -> Ubuntu we see something really weird. For both the ubuntu-left sender and the ubuntu-right sender the number of packets we try to send is much lower than for the FreeBSD hosts. This correlates with lower overall throughput in both tests, but the reverse test, with ubuntu-right sending, losing almost half the rate looks really weird.

I have a hunch that the lower sending rate is related to better pacing interactions between iperf3 and the Linux kernel. I have no idea why the received rate is so low for ubuntu-right.
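To make the loss figures easier to compare, here is a rough sketch that pulls the lost/total packet counts out of an iperf3 UDP summary line and recomputes the percentage. The regular expression and the udp_loss helper are my own illustration, written against the line layout shown in the results above rather than any documented format.

```python
import re

# Matches the "lost/total (percent%)" field of an iperf3 UDP summary line.
LOSS_RE = re.compile(r"(\d+)/(\d+) \(([\d.]+)%\)")


def udp_loss(summary_line):
    """Return (lost, total, percent) parsed from an iperf3 UDP summary line."""
    match = LOSS_RE.search(summary_line)
    if match is None:
        raise ValueError("no lost/total field found")
    lost, total = int(match.group(1)), int(match.group(2))
    return lost, total, 100.0 * lost / total


# Receiver lines from the Ubuntu -> Ubuntu UDP tests.
forward = "[  5]   0.00-10.00  sec  1.50 GBytes  1.29 Gbits/sec  0.006 ms  436284/1546104 (28%)  receiver"
reverse = "[  5]   0.00-10.00  sec  1.12 GBytes   965 Mbits/sec  0.015 ms  710878/1543876 (46%)  receiver"

for name, line in (("forward", forward), ("reverse", reverse)):
    lost, total, percent = udp_loss(line)
    print(f"{name}: {lost} of {total} packets lost ({percent:.1f}%)")
```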

freebsd-left -> ubuntu-right (server)

    tcp             iperf3 -c 10.0.10.1

            [  5]   0.00-10.00  sec  10.7 GBytes  9.19 Gbits/sec  1720             sender
            [  5]   0.00-10.22  sec  10.7 GBytes  8.98 Gbits/sec                  receiver


    tcp             iperf3 -c 10.0.10.1 -R

            [  5]   0.00-10.22  sec  8.75 GBytes  7.35 Gbits/sec  1464             sender
            [  5]   0.00-10.00  sec  8.75 GBytes  7.52 Gbits/sec                  receiver


    udp             iperf3 -c 10.0.10.1 -u -b 0

            [  5]   0.00-10.00  sec  3.63 GBytes  3.12 Gbits/sec  0.000 ms  0/2667830 (0%)  sender
            [  5]   0.00-10.04  sec  1.29 GBytes  1.11 Gbits/sec  0.011 ms  1717151/2667664 (64%)  receiver


    udp             iperf3 -c 10.0.10.1 -u -b 0 -R

            [  5]   0.00-10.04  sec  2.07 GBytes  1.77 Gbits/sec  0.000 ms  0/1524220 (0%)  sender
            [  5]   0.00-10.00  sec  1.90 GBytes  1.63 Gbits/sec  0.003 ms  125112/1524149 (8.2%)  receiver

Finally we get to run it all again with both operating systems in play. If everything was optimal (and we therefore had no work to do) there would be no difference in their performance, but we already know that this isn't true.

Running the tests with differing operating systems gives us an opportunity to see if the receiver side of the test has an impact (rather than just the sender). We can do this by pairing the faster side with the slower side.

For now though, I think the different network cards are introducing too much variation between the systems. The numbers differ here and that on its own is quite interesting, but there are too many possible explanations for why. I find the Ubuntu -> Ubuntu reverse test halving the rate very suspicious and want to run the tests again.

The variation in the network cards is actually too much for me. I think it is a red flag in the measurements and would only encourage stupid review comments. This annoyed me enough that I bought a pair of single port Mellanox ConnectX-3 EN PCIe 10GbE cards to evaluate before running any meaningful experiments.

These systems are up and running. Even with the questions that the baselines raised, they are functional enough to start developing the interesting parts of the experiments and writing enough automation glue to rule out me making mistakes. I can then return, rerun the automated experiments and get better numbers.
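As a sketch of what that automation glue might look like, the following builds the same four iperf3 invocations used above and reads the sender rate from iperf3's JSON output (-J). This is not the script I actually run. The helper names are hypothetical, and the JSON layout (an "end" object with "sum_sent" for TCP or "sum" for UDP) is an assumption worth checking against the installed iperf3 version.

```python
import json
import subprocess


def build_command(server, udp=False, reverse=False):
    """Build an iperf3 client command line that emits JSON."""
    cmd = ["iperf3", "-c", server, "-J"]
    if udp:
        cmd += ["-u", "-b", "0"]   # UDP with no bandwidth limit
    if reverse:
        cmd.append("-R")           # server sends, client receives
    return cmd


def sender_gbps(report):
    """Extract the sender rate in Gbit/s from a parsed iperf3 JSON report."""
    end = report["end"]
    # Assumed layout: TCP reports carry "sum_sent", UDP reports carry "sum".
    summary = end.get("sum_sent") or end["sum"]
    return summary["bits_per_second"] / 1e9


def run_once(server, udp=False, reverse=False):
    """Run one measurement and return the sender rate in Gbit/s."""
    result = subprocess.run(build_command(server, udp, reverse),
                            capture_output=True, text=True, check=True)
    return sender_gbps(json.loads(result.stdout))
```

Calling run_once("10.0.10.1", udp=True, reverse=True) would then correspond to the iperf3 -c 10.0.10.1 -u -b 0 -R runs above.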

Advanced Documentation Retrieval on FreeBSD

On Fri, Jul 16, 2021 at 04:23:20PM -0400, ░▒▓░░ ░▐░▒ wrote:
> Hi Tom.
> I just realized you've not had your inbox assaulted by our listener
> feedback. Is there a specific address you'd like that routed to?
> 
> Anyway, here is a choice one.
> 

*sigh*

░▒▓░░ I don't know what we should do about Michael. I spoke to a priest
about an exorcism and he said "I'm not going near that monster, not on your
life", which I thought was pretty alarmist for a priest.

What follows is a rough markdown draft of the article "Advanced Documentation
Retrieval on FreeBSD" as we discussed.

- Tom

-----------

Advanced Documentation Retrieval on FreeBSD

FreeBSD is renowned for its very high quality documentation. For many queries the man pages have a wealth of accurate and up to date documentation that is frequently a surprise to users of other operating systems. It is not uncommon to hear from new FreeBSD users that they have to relearn to try the man pages before searching on the web.

Beyond man pages FreeBSD also has very high quality documentation in the form of the FreeBSD handbook. Only talking about the FreeBSD Handbook actually cuts short the range of really high quality documentation that the FreeBSD project offers.

The FreeBSD Handbook covers installation and day to day usage of a FreeBSD system and it is kept reasonably up to date by the FreeBSD documentation team. The handbook is an amazing document that comes from a time before blog posts and wikis and it contains a mixture of official project direction and tutorial style walkthroughs on how to use different FreeBSD subsystems and third party software. Not everything is covered by the handbook and often only one way of using a piece of software is covered, but it is an excellent resource to get started using and configuring a FreeBSD system.

Deeper into FreeBSD there are several other 'books' that the FreeBSD project maintains. The full list of books is available from https://docs.freebsd.org/en/books/. It includes technical information about how the FreeBSD kernel works in the form of the Design and Implementation of the 4.4BSD Operating System and the Architecture handbook. There is also documentation on how to contribute to different parts of the operating system, such as the porters-handbook, the fdp-primer and the developers-handbook.

The FreeBSD project also hosts a wiki which contains less formal and in progress documentation, written by users and developers. The wiki can sometimes be much more like a temporary source of information, but it does contain valuable guides. It is the right place for pages such as the laptop compatibility matrix. The wiki is a unique FreeBSD project resource in that users are also able to have edit access.

What else is there?

Beyond the documentation the project provides there are outside sources of information on how to use and configure a FreeBSD system. Searching the web will bring up a lot of technical information in the form of blog posts and articles.

Searching the web is not the only way to get more information on FreeBSD systems. We can use external 'daemonised' resources by using the FreeBSD base system tool invoke. This tool is a little esoteric to use and sadly it is one of the excellent FreeBSD tools written by developers that haven't quite seen the light of day.

invoke ships in the FreeBSD source tree and is in src/tools/tools/invoke. invoke isn't built by default, but it is easy to build on a system using its Makefile:

# cd /usr/src/tools/tools/invoke
# make
# make install

invoke requires quite a lot of information to be useful. Annoyingly, the author ���������@ was in the process of writing a book on invoke when they went missing travelling in rural Romania. However, from reading the source we can see the list of information or 'principals' required to correctly invoke documentation.

Principals are information locators which are tied closely to the source of the information. Personal web pages of authors work well and are easy to obtain. More potent sources such as hand written notes or the author's blood work best, but we can substitute social media accounts to get a similar level of familiarity with the author.

XXX more on principals XXX

For our example I have collected the personal web page, blog and twitter account of the source we want to use. We can pass them to invoke as arguments or in a configuration file.

In addition to 'principals' the invoke tool needs to be run from a special environment. A comment in the source code describes this, but it took the author some trial and error to figure it out in practice.

/*
 * invoke must be run from either a larger or lesser circle. These
 * can be ancient such as the very high quality circle at Midmar,
 * however, if you are unable to travel a Ars Theurgia Goetia will
 * suffice. You must interface the machine running invoke via a 
 * galvanic isolator. The transformer in an Ethernet connector with
 * magnetics works great, if you only have an SBC you'll have to 
 * figure out something with transformers
 */

From this comment we can see that we need to create a substitute circle to use with invoke and connect it to our computer, but isolate it from the machine. If we fail to isolate properly we can damage the machine and likely destroy it.

We can create the substitute circle by using our preferred ethereal bonding fluid. Blood works very well, but collection can be legally tricky. Luckily we can use a vegan alternative in the form of beeted oat milk. If your local shops don't carry beeted oat milk you can make your own by mixing oat milk with beetroot juice during a full moon.

With the substitute bonding fluid we need to draw our summoning circle and holding triangle (you can see a good example here). In this example we are going to use an Ethernet interface with magnetics, so all you really need to do is to plug it into the circle.

With the interface connected to the circle we can check for state using ifconfig. We need to look for the ETHER flag in the list of options.

igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500                            
options=e507bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6,ETHER>
ether ac:1f:6b:46:9e:da                                                                               
inet 11.14.17.13 netmask 0xffffff00 broadcast 137.50.17.255 
inet6 fe80::ae1f:6bff:fe46:9eda%igb0 prefixlen 64 scopeid 0x1                                         
media: Ethernet autoselect (1000baseT <full-duplex>)                                                  
status: active                                                                                        
nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

Running Invoke

With our principals collected, and our link established to the circle we can run the invoke command as follows:

# invoke -p [principals] [-i interface] [question]

We pass in the principals we have gathered and tell invoke the interface name. If you hacked together a device you'll need to pass in the gpio controller and pin, using -d for the gpio controller and -P for the pin number.

# invoke -p [principals] -d /dev/gpioc0 -P 1 [question]

The final argument to invoke is a quoted string that contains the question.

If you are all set up then you can use invoke to summon your documentation author of choice. For this example I am using Michael W. Lucas:

# invoke -p https://mwl.io https://twitter.com/mwlauthor -i igb0 "How do I configure dummynet with weighted fair queues?"

If all goes well you should hear a (normally quite grumpy) disembodied voice bark the answer to your question back to you.

As great as this is for single questions, sometimes you want the author to hang around and help with protracted debugging sessions. We can achieve this with the manifest option to invoke:

# invoke -p https://mwl.io https://twitter.com/mwlauthor -i igb0 manifest

With this option you will get the author directly summoned into the triangle attached to your summoning circle. Care should be taken with manifestations: the circle and triangle must remain intact for the duration of the session. If you are using a laptop (which I recommend, and which you need to use with an existing circle) then make sure to watch the battery life carefully.

If you don't shut down the session properly then you risk getting the spirit of the author trapped in your machine and no one wants that.

Conclusion

FreeBSD has a great range of available documentation, but sometimes you hit the limits of information that is readily available online. Here we discussed the invoke command and the ways it can be used to get direct help from the authors of high quality documentation that are a little too public.

Blog more in 2020

In June I tried to write 4 blog posts and I enlisted help from some of my friends to do this. I managed to write 5 posts beyond the announcement that I would blog.

Of course it wasn't just me, I asked other people to blog to help me stay on track. The idea here was that seeing other people's blog posts would inspire and force me to keep going. This worked reasonably well. The pressure to write the blog posts was there, but publishing was harder. This ended up with me pushing several posts in the final few days of June.

The pressure didn't really show up either. I know that the others wrote blog posts, but they didn't tell me!

They were great sports to get involved and help me with this. You should look up their blogs and drop them into your RSS reader.

Because this sort of worked I think we should aim to keep doing this. Now 4 posts a month is a lot (maybe even too much) and so I thought that 8 more this year would be good. That is about 1.3333333... a month and seems entirely achievable.

I am going to try and blow this number out of the water, but even if I fail completely and only manage one or two more posts that will still be great.
