Realmode bhyve
I have been poking around bhyve, seeing what is up, and I came across this article about writing a Linux kvm driver from scratch. In the article is an example of a minimal program to run as a first test in the kvm driver:
; Output to port 0x3f8
mov dx, 0x3f8
; Store the address of the message in bx, so we can increment it
mov bx, message
loop:
; Load a byte from `bx` into the `al` register
mov al, [bx]
; Jump to the `hlt` instruction if we encountered the NUL terminator
cmp al, 0
je end
; Output to the serial port
out dx, al
; Increment `bx` by one byte to point to the next character
inc bx
jmp loop
end:
hlt
message:
db "Hello, KVM!", 0
That seems fun, a nice small example of getting some code running. I don't really want to write my own bhyve, I like the one we have, but it might be nice to try and get this running.
I assembled the example:
nasm -f bin nello.S -o nello
And looked around to see how to load a bios in bhyve. bhyve(8) has some examples at the end; it looks like the -l flag can be used to set a bootrom (bios) like so:
$ sudo bhyve -l bootrom,./nello nello
vm exit[0]
reason VMX
rip 0x000000000000fff0
inst_length 3
status 0
exit_reason 48 (EPT violation)
qualification 0x0000000000000784
inst_type 0
inst_error 0
Well that didn't work. I poked a bit in bhyve, but it wasn't clear what to do about an EPT violation. The examples also mentioned using /usr/local/share/uefi-firmware/BHYVE_UEFI_CODE.fd; I opted for the CSM version:
$ sudo bhyve -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI_CSM.fd hello
I had a poke around the CSM bootrom and while it is always fun to use hexdump, it really didn't help me understand what was wrong with my example assembly.
I tried with BHYVE_UEFI_CSM.fd and guess what I got:
vm exit[0]
reason VMX
rip 0x000000000000fff0
inst_length 3
status 0
exit_reason 48 (EPT violation)
qualification 0x0000000000000784
inst_type 0
inst_error 0
The same trap!
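For what it's worth, the qualification value in that trap can be decoded by hand. A minimal sketch, assuming the bit layout from the Intel SDM's description of EPT-violation exit qualifications; only bits 0-5 and 7 are decoded, and decode_ept_qual is a made-up helper name, not part of bhyve:

```shell
#!/bin/sh
# Hypothetical helper, not part of bhyve: decode the low bits of a VMX
# EPT-violation exit qualification (bit meanings assumed from the Intel SDM).
decode_ept_qual() {
    ept_q=$(($1))
    # Bits 0-2: what kind of access faulted.
    if [ $((ept_q & 0x1)) -ne 0 ]; then echo "access: data read"; fi
    if [ $((ept_q & 0x2)) -ne 0 ]; then echo "access: data write"; fi
    if [ $((ept_q & 0x4)) -ne 0 ]; then echo "access: instruction fetch"; fi
    # Bits 3-5: read/write/execute permissions the EPT granted for the page.
    ept_perm=""
    if [ $((ept_q & 0x08)) -ne 0 ]; then ept_perm="${ept_perm}r"; fi
    if [ $((ept_q & 0x10)) -ne 0 ]; then ept_perm="${ept_perm}w"; fi
    if [ $((ept_q & 0x20)) -ne 0 ]; then ept_perm="${ept_perm}x"; fi
    echo "ept permissions: ${ept_perm:-none}"
    # Bit 7: whether the guest linear address field is valid.
    if [ $((ept_q & 0x80)) -ne 0 ]; then echo "guest linear address: valid"; fi
    return 0
}

decode_ept_qual 0x784
```

Read that way, 0x784 is an instruction fetch from a page the EPT grants no permissions for, which would fit nothing being mapped where the CPU tried to start. That is a reading of the bits, not something confirmed inside bhyve.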
I think that means I need to figure out the minimal viable bhyve command that will run a known-good bootrom before I try running that example. The last example in bhyve(8) is:
Run a UEFI virtual machine with a VARS file to save EFI variables. Note
that bhyve will write guest modifications to the given VARS file. Be
sure to create a per-guest copy of the template VARS file from /usr.
bhyve -c 2 -m 4g -w -H \
-s 0,hostbridge \
-s 31,lpc -l com1,stdio \
-l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI_CODE.fd,BHYVE_UEFI_VARS.fd \
uefivm
-w ignores accesses to unimplemented MSRs and -H yields the host CPU when the guest executes HLT, no need for those. So I tried:
bhyve -s 0,hostbridge -s 31,lpc -l com1,stdio -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI_CSM.fd hello
And that worked:
Boot Failed. CDROM 0
Boot Failed. Harddisk 1
UEFI Interactive Shell v2.1
EDK II
UEFI v2.40 (BHYVE, 0x00010000)
Error. No mapping found
Press ESC in 1 seconds to skip startup.nsh or any other key to continue.
Now to try my bios:
$ sudo bhyve -s 31,lpc -l com1,stdio -l bootrom,./nello hello
bhyve: ROM size 65552 is not a multiple of the page size
Device emulation initialization error: No such file or directory
32 (the raw unpadded 16 bit program size) is also not a multiple of the page size. I padded out the example using TIMES 4096 - ($ - $$) db 0 from a bootsector nasm example. This did not succeed either.
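The size check is easy to reproduce before handing an image to bhyve. A small sketch, assuming a 4096-byte page size (true on amd64, but an assumption here); pad_to_page is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: how many bytes of padding a ROM image of the given
# size needs to reach the next multiple of the page size (4096 assumed).
pad_to_page() {
    pad_size=$(($1))
    pad_page=${2:-4096}
    pad_rem=$((pad_size % pad_page))
    if [ "$pad_rem" -eq 0 ]; then
        echo 0
    else
        echo $((pad_page - pad_rem))
    fi
}

pad_to_page 65552   # the size bhyve complained about: 64k plus 16 bytes
pad_to_page 32      # the raw unpadded program
pad_to_page 65536   # exactly 64k
```

This prints 4080, 4064 and 0. The 65552 in the error message is 65536 + 16, so that image had overshot 64k by sixteen bytes rather than falling short.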
Fine, whatever, I will use gdb to look at what is going on. bhyve supports the -G flag to integrate with gdb. I added -G wlocalhost:1234 to the bhyve command, asking bhyve to wait for gdb to attach and to listen on localhost port 1234.
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
warning: No executable has been specified and target does not support
determining executable automatically. Try using the "file" command.
0x000000000000fff0 in ?? ()
(gdb) x/32i 0x000000000000fff0
=> 0xfff0: add %al,(%rax)
0xfff2: add %al,(%rax)
...
--Type <RET> for more, q to quit, c to continue without paging--q
Quit
(gdb) x/32x 0x000000000000fff0
0xfff0: 0x00000000 0x00000000 0x00000000 0x00000000
0x10000: 0x00000000 0x00000000 0x00000000 0x00000000
...
0x10060: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) x/32x 0x0
0x0: 0x00000000 0x00000000 0x00000000 0x00000000
0x10: 0x00000000 0x00000000 0x00000000 0x00000000
0x20: 0x00000000 0x00000000 0x00000000 0x00000000
0x30: 0x00000000 0x00000000 0x00000000 0x00000000
Connecting and poking around shows the obvious places are all zeros (or sometimes all 1s).
gdb has a 'find' command for searching memory; our example is pretty distinctive so it should find it. It didn't work for me this time.
Stepping immediately just starts the program; for nello we are stopped with rip at 0x000000000000ffef.
0x000000000000ffef in ?? ()
(gdb) x/64x $rip
0xffef: 0x960000ff 0x00ffff00 0x00000200 0x46f00000
0xffff: 0x00000000 0x00000000 0x00000000 0x00000000
Disassembly time. FreeBSD's llvm-objdump doesn't have support for 16 bit x86 (fair), so I grabbed binutils and used a command like this:
x86_64-unknown-freebsd15.0-objdump -b binary -m i386 -D -Maddr16,data16 -Mintel nello
Working from objdump I tweaked some offsets to get bytes into the correct places with padding, but there wasn't an obvious clue what was up. I couldn't associate the memory I could read in gdb to anything from my binary.
$ hexdump -C nello
00000000 ba f8 03 bb 11 00 8a 07 3c 00 74 04 ee 43 eb f6 |........<.t..C..|
00000010 f4 48 65 6c 6c 6f 2c 20 62 68 79 76 65 21 00 90 |.Hello, bhyve!..|
00000020 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 |................|
*
0000fff0 e9 0d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00010000
I turned to qemu to see if that helped:
$ qemu-system-i386 -bios nello -S -s -nographic
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
warning: No executable has been specified and target does not support
determining executable automatically. Try using the "file" command.
0x0000fff0 in ?? ()
(gdb) x/32xb 0xffff0000
0xffff0000: 0xba 0xf8 0x03 0xbb 0x11 0x00 0x8a 0x07
0xffff0008: 0x3c 0x00 0x74 0x04 0xee 0x43 0xeb 0xf6
0xffff0010: 0xf4 0x48 0x65 0x6c 0x6c 0x6f 0x2c 0x20
0xffff0018: 0x62 0x68 0x79 0x76 0x65 0x21 0x00 0x90
(gdb) x/16xb 0xfffffff0
0xfffffff0: 0xe9 0x0d 0x00 0x00 0x00 0x00 0x00 0x00
0xfffffff8: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) c
Continuing.
That all looks good, it matches up with our hexdump of the bios example. If I hit ^C then we stop at 0x00000011.
^C
Program received signal SIGINT, Interrupt.
0x00000011 in ?? ()
If we recall that we are running in 16 bit mode in the last 64k of the address space and convert that offset into the memory dumps, the byte just before it (at 0xffff0010) is 0xf4, an x86 halt instruction.
"HLT causes the 80386 to stop execution. Following a halt, execution can
only be resumed by the receipt of an enabled interrupt or by a reset of
the computer."
- Programming the 80386
So we did what we wanted to and stopped, but qemu gave us no output. I think that has confirmed that the bios image is now correct if not functional. So either we are running fine in bhyve and just not getting output, or there is something else up.
In the example minimal Linux hypervisor they just did a straight printf for an IO vmexit. Let's catch the vmexit handlers in bhyve and see what is up:
diff --git a/usr.sbin/bhyve/amd64/vmexit.c b/usr.sbin/bhyve/amd64/vmexit.c
index e0b9aec2d17a..e1669c2b5051 100644
--- a/usr.sbin/bhyve/amd64/vmexit.c
+++ b/usr.sbin/bhyve/amd64/vmexit.c
@@ -72,6 +72,7 @@ vm_inject_fault(struct vcpu *vcpu, int vector, int errcode_valid,
static int
vmexit_inout(struct vmctx *ctx, struct vcpu *vcpu, struct vm_run *vmrun)
{
+fprintf(stderr, "%s:%d\n", __func__, __LINE__);
struct vm_exit *vme;
int error;
int bytes, port, in;
I reconfigured my test script to output serial to /dev/nmdm0A so I would get printfs from bhyve, but nothing.
Our assembly doesn't do what we think it does.
Adding port configuration from this SO post and the osdev wiki led my modified bhyve to print on calls to vmexit_inout.
$ sudo sh ./run.sh nello
outputting serial to /dev/nmdm0B
waiting for gdb
vmexit_inout:75
vmexit_inout:75
vmexit_inout:75
vmexit_inout:75
vmexit_inout:75
vmexit_inout:75
vmexit_inout:75
vmexit_inout:75
vmexit_inout:75
vmexit_inout:75
Those 10 vmexit_inout lines match up perfectly with the configuration and test example. This is an excellent debugging sign.
With an extreme amount of further faffing I discovered that the loop in the example I started from was not making it to the first print statement. I confirmed this by stripping away all of the configuration and just spitting out some characters explicitly.
In the hexdump nasm was loading the wrong address into bx, but even with the correct address in bx I got no output. As I only wanted to say hello from real mode, I'm done. Debugging segments (segments, not even once) in a pre-bios environment where you can't single step just isn't my idea of fun.
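One way to see why segments bite here: the architectural x86 reset state gives CS a base of 0xFFFF0000 but leaves DS with a base of 0, so a plain [bx] load goes through DS and does not look inside the ROM. A sketch of the arithmetic, assuming bhyve starts the vCPU in that same reset state; linear_addr is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: linear address for a segment base plus 16-bit offset.
# $1 = segment base, $2 = offset (truncated to 16 bits, as in real mode).
linear_addr() {
    printf '0x%08x\n' $(( ($1) + (($2) & 0xffff) ))
}

linear_addr 0xffff0000 0x0011   # the message as seen through CS
linear_addr 0x00000000 0x0011   # where a plain [bx] load looks, through DS
```

This prints 0xffff0011 and 0x00000011. If the assumption holds, mov al, [bx] with bx = 0x11 reads from low memory, which the gdb dumps above show as zeros, so the loop would see a NUL straight away and fall through to hlt. A cs segment override on the load (mov al, [cs:bx] in nasm syntax) is the usual way around that, untested here.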
The example I started from was run from the base address; by writing their own kvm driver they were able to configure the instruction pointer and segments to look sensible. Me, an idiot, decided to work with the brain melting x86 hardware as it is.
Most of my fighting here was because gdb connecting to the bhyve stub isn't able to read guest memory in the bios region. Neither qemu nor bhyve let me single step instructions, which just makes debugging here tedious.
OS Dev wiki is a great resource, but it is very annoying to have lots of "you shouldn't do this" everywhere when you step off their 'perfect path'. I just want to know what I need to know.
If you want to play with real mode in bhyve you can start from this minimal, working example:
; A 64k bios for bhyve which does nothing at all
bits 16
PORT equ 0x3f8
%macro outb 1
mov al, %1
out dx, al
%endmacro
start:
mov dx, PORT ; store the port
outb 0x0a ; print a message
outb 'b'
outb 'h'
outb 'y'
outb 'v'
outb 'e'
outb '!'
outb 0x0a
end:
hlt ; hang around
TIMES 0xFFF0 - ($ - $$) db 0 ; pad out to reset vector
; cpu is going to start from 0xFFF0, with CS set to 0xF000; basically we are
; going to start at 0xFFFFFFF0, with only 16 bytes to play with, but we can
; just jump to the start of the 64k segment reasonably easily.
jmp start
TIMES 0x10000 - ($ - $$) db 0 ; pad out to 64k
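The jump at the reset vector can be sanity-checked with a little arithmetic. A sketch, assuming nasm emits the 3-byte near form (e9 plus a 16-bit displacement, as in the e9 0d 00 bytes of the earlier hexdump); jmp_rel16_target is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: where does a 16-bit near jmp (opcode e9, rel16) land?
# $1 = offset of the jmp opcode, $2 = 16-bit displacement from the encoding.
jmp_rel16_target() {
    jmp_next=$(( ($1) + 3 ))    # IP after the 3-byte instruction
    # The displacement is added to that IP, and IP wraps at 64k.
    printf '0x%04x\n' $(( (jmp_next + ($2)) & 0xffff ))
}

jmp_rel16_target 0xfff0 0x000d   # e9 0d 00 at the reset vector
```

For 0xfff0 and 0x000d this prints 0x0000: the jump wraps around to offset 0 of the same 64k segment, which is where start lives.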
Hopefully that ending isn't too negative. I had a lot of fun doing this, I just don't want to do any more of it.