This repository has been archived by the owner on Apr 13, 2024. It is now read-only.

Boot via EFI on arm64 and x86_64 #199

Open
wants to merge 4 commits into
base: master
from

Conversation

nathanchance
Member

This is currently a WIP but I wanted to push what I had for initial review before I left for work.

@kees noticed that booting via EFI was broken on arm64 because of an ld.lld change: ClangBuiltLinux/linux#634

This will allow us to catch regressions like this in the future.

To work properly, the ovmf and qemu-efi-aarch64 packages need to be installed, which will be done in a separate pull request to the Docker image repo.

As it stands now, there are three distinct issues:

  1. x86_64 panics on init because /dev/sda is no longer available:
[   13.227390] VFS: Cannot open root device "sda" or unknown-block(0,0): error -6
[   13.227919] Please append a correct "root=" boot option; here are the available partitions:
[   13.229050] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[   13.230040] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc4+ #1
[   13.230565] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[   13.231331] Call Trace:
[   13.232041]  dump_stack+0xa3/0x10b
[   13.232465]  panic+0xfd/0x2ff
[   13.232734]  ? klist_next+0x84/0xb0
[   13.233104]  mount_block_root+0x12e/0x1bd
[   13.233462]  ? gen6_set_rps+0xa0/0x1d0
[   13.233819]  ? kernel_init+0x6/0x2d0
[   13.234134]  prepare_namespace+0x17c/0x181
[   13.234474]  kernel_init_freeable+0x195/0x1be
[   13.234865]  ? rest_init+0x1e0/0x1e0
[   13.235204]  kernel_init+0x6/0x2d0
[   13.235491]  ? rest_init+0x1e0/0x1e0
[   13.235819]  ret_from_fork+0x3a/0x50
[   13.237306] Kernel Offset: 0x26a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   13.238690] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---

Looks like -bios might be an option instead:

diff --git a/driver.sh b/driver.sh
index d86a718..33d381b 100755
--- a/driver.sh
+++ b/driver.sh
@@ -153,12 +153,8 @@ setup_variables() {
           qemu_cmdline=( -drive "file=images/x86_64/rootfs.ext4,format=raw,if=ide"
                          -append "console=ttyS0 root=/dev/sda" ) ;;
       esac
-      ovmf=/usr/share/OVMF
-      if [[ -f ${ovmf}/OVMF_CODE.fd && -f ${ovmf}/OVMF_VARS.fd ]]; then
-        cp ${ovmf}/OVMF_VARS.fd images/x86_64
-        qemu_cmdline+=( -drive "if=pflash,format=raw,readonly,file=${ovmf}/OVMF_CODE.fd"
-                        -drive "if=pflash,format=raw,file=images/x86_64/OVMF_VARS.fd" )
-      fi
+      ovmf=/usr/share/qemu/OVMF.fd
+      [[ -f ${ovmf} ]] && qemu_cmdline+=( -bios ${ovmf} )
       # Use KVM if the processor supports it (first part) and the KVM module is loaded (second part)
       [[ $(grep -c -E 'vmx|svm' /proc/cpuinfo) -gt 0 && $(lsmod 2>/dev/null | grep -c kvm) -gt 0 ]] && qemu_cmdline=( "${qemu_cmdline[@]}" -enable-kvm )
       image_name=bzImage
  2. arm64 panics on init because /dev/vda is no longer available:
VFS: Cannot open root device "vda" or unknown-block(0,0): error -6
Please append a correct "root=" boot option; here are the available partitions:
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.189+ #12
Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
Call trace:
[<ffffffc000089ecc>] dump_backtrace+0x0/0x144
[<ffffffc000089ec4>] show_stack+0x14/0x1c
[<ffffffc00037b2e8>] dump_stack+0xf8/0x148
[<ffffffc0000b69d4>] panic+0xd8/0x22c
[<ffffffc0009f8054>] mount_block_root+0x1d8/0x2b0
[<ffffffc0009f81d0>] mount_root+0xa4/0x19c
[<ffffffc0009f83e4>] prepare_namespace+0x11c/0x190
[<ffffffc0009f7d40>] kernel_init_freeable+0x1d4/0x1f8
[<ffffffc0007471fc>] kernel_init+0x10/0x1d4
[<ffffffc000085e50>] ret_from_fork+0x10/0x40
---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

Have not begun to triage this at all.

  3. kernel/common builds all appear to boot but hang when shutting down, so the VM doesn't exit cleanly:
[   11.292195] Run /init as init process
Starting syslogd: OK
Starting klogd: OK
Initializing random number generator... [   11.881486] random: dd: uninitialized urandom read (512 bytes read)
done.
Starting network: [   12.264558] ip (982) used greatest stack depth: 12320 bytes left
OK
Linux version 4.19.66-gd8837b869 (driver@clangbuiltlinux) (clang version 9.0.0-svn366197-1~exp1+0~20190716095603.167~1.gbp7d3830 (trunk)) #7 SMP PREEMPT Thu Jan 1 00:00:00 UTC 1970
Linux version 4.19.66-gd8837b869 (driver@clangbuiltlinux) (clang version 9.0.0-svn366197-1~exp1+0~20190716095603.167~1.gbp7d3830 (trunk)) #7 SMP PREEMPT Thu Jan 1 00:00:00 UTC 1970
Stopping network: OK
Saving random seed... [   12.888645] random: dd: uninitialized urandom read (512 bytes read)
done.
Stopping klogd: OK
Stopping syslogd: OK
umount: devtmpfs busy - remounted read-only
umount: can't unmount /: Invalid argument
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system poweroff
[   15.266490] reboot: System halted
root@097b229733d2:/ci2# echo $?
124
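
The `124` above is the status GNU `timeout` returns when the time limit expires, which suggests the guest halted but QEMU itself never exited. A minimal sketch of telling that case apart from other failures follows; `classify_status` is a hypothetical helper name, not something in driver.sh, and `sleep` stands in for the QEMU invocation:

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an exit status from `timeout` to a short label.
classify_status() {
  case "$1" in
    0)   echo "clean exit" ;;
    124) echo "timed out" ;;    # GNU timeout's status when the limit is hit
    *)   echo "failed ($1)" ;;
  esac
}

# Stand-in for the QEMU command: `sleep 3` outlives the 1 second limit.
timeout 1 sleep 3
rc=$?
classify_status "${rc}"
```

Separating the timeout case from a genuine boot failure makes it easier to see that the kernel reached "System halted" and only the VM exit is missing.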

I'll try to get this all cleaned up so we can actually run this on Travis and see everything a bit more clearly.

@nathanchance nathanchance added the WIP Work in progress label Aug 14, 2019
@nathanchance nathanchance requested a review from kees August 14, 2019 20:52
@nathanchance
Member Author

@kees thank you for all of those suggestions. I have made them, and everything appears to work for stable, mainline, and linux-next.

kernel/common is still a problem child, but I don’t know if it is worth trying to debug that because I don’t think Android cares about EFI support.

Presubmit: https://travis-ci.com/nathanchance/continuous-integration/builds/123430893

I am doing another run with kernel/common excluded: nathanchance@8e47262

I will clean everything up tonight and push for final review.

We need this to support booting up with EFI, otherwise arm64 on 4.4
panics because the block device cannot be found. This mirrors what is
done for arm32.

Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>

This allows booting up via EFI; without this, init panics because
/dev/sda is not found. The IDE block driver is seldom used, virtio is
better for our purposes.

Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
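
The effect of switching block drivers on the `root=` argument can be sketched as follows. This is an illustration under assumptions, not the code from the commit: `rootfs_args` and `ROOTFS_ARGS` are hypothetical names, and the exact flags used in the final driver.sh are not shown in this conversation. The IDE form mirrors the original `qemu_cmdline`; with `if=virtio` the disk shows up in the guest as /dev/vda instead of /dev/sda.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rootfs_args <image> <ide|virtio>
# Fills the global array ROOTFS_ARGS with a -drive/-append pair whose
# root= device matches the chosen bus.
rootfs_args() {
  local image=$1 bus=$2 root
  case "${bus}" in
    ide)    root=/dev/sda ;;   # as in the original x86_64 command line
    virtio) root=/dev/vda ;;   # virtio-blk disks are named vdX in the guest
    *)      return 1 ;;
  esac
  ROOTFS_ARGS=( -drive "file=${image},format=raw,if=${bus}"
                -append "console=ttyS0 root=${root}" )
}

rootfs_args images/x86_64/rootfs.ext4 virtio
printf '%s\n' "${ROOTFS_ARGS[@]}"
```

Keeping the bus and the `root=` device in one place avoids the mismatch that produces the "Cannot open root device" panic shown earlier.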

We need this to start booting via EFI.

Signed-off-by: Nathan Chancellor <[email protected]>

If the ovmf or qemu-efi-aarch64 packages are installed on Debian or
Ubuntu, use those files to boot up via EFI. This will help prevent
regressions like ClangBuiltLinux/linux#634.

This is disabled for kernel/common because the kernel does not shut down
cleanly; I don't think this is worth exploring because Android does not
care about EFI as far as I am aware.

I considered checking in the fd files so that other distributions could
take advantage of this but these files are over 50MB, which is too much
of a burden to force on everyone.

Presubmit: https://travis-ci.com/nathanchance/continuous-integration/builds/123450697

[skip ci]

Suggested-by: Kees Cook <[email protected]>
Signed-off-by: Nathan Chancellor <[email protected]>
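
The "use the firmware files if they are installed" logic described above could look roughly like this. It is a sketch modelled on the pflash lines in the earlier diff and on the arm64 command line later in this thread; `efi_args` and `EFI_ARGS` are hypothetical names, and the helper assumes the firmware ships as `<prefix>_CODE.fd` and `<prefix>_VARS.fd` in one directory (as /usr/share/OVMF and /usr/share/AAVMF do here).

```shell
#!/usr/bin/env bash
# Hypothetical helper: efi_args <firmware-dir> <prefix> <images-dir>
# On success, fills the global array EFI_ARGS with two pflash drives.
# On failure, EFI_ARGS is left empty so the caller can boot without EFI.
efi_args() {
  local fw_dir=$1 prefix=$2 images_dir=$3
  local code=${fw_dir}/${prefix}_CODE.fd
  local vars=${fw_dir}/${prefix}_VARS.fd
  EFI_ARGS=()
  # Both the code image and the variable store must be present.
  [[ -f ${code} && -f ${vars} ]] || return 1
  # The variable store is writable, so QEMU gets a private copy.
  cp "${vars}" "${images_dir}/" || return 1
  EFI_ARGS=( -drive "if=pflash,format=raw,readonly,file=${code}"
             -drive "if=pflash,format=raw,file=${images_dir}/${prefix}_VARS.fd" )
}

qemu_cmdline=( -machine virt )
if efi_args /usr/share/AAVMF AAVMF images/arm64; then
  qemu_cmdline+=( "${EFI_ARGS[@]}" )
fi
```

Because the helper fails quietly when the files are missing, distributions without these packages would simply keep booting the old way.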
@nathanchance nathanchance removed the WIP Work in progress label Aug 16, 2019
@nathanchance
Member Author

@nickdesaulniers
Member

cc @ardbiesheuvel for thoughts on EFI

@ardbiesheuvel

It is not obvious to me why the ide->virtio change is necessary, but if it allows us to do EFI boot testing, I'm all for it.

While you're at it, could you add virtio-rng-pci as well so we get a KASLR seed on arm64 kernels built with CONFIG_RANDOMIZE_BASE?

@nathanchance
Member Author

I added -device virtio-rng-pci locally, but the kernel panics. Is there something else that is needed?

+ timeout 2m unbuffer qemu-system-aarch64 -m 512m -cpu cortex-a57 -drive file=images/arm64/rootfs.ext4,format=raw,id=rootfs,if=none -device virtio-blk-device,drive=rootfs -device virtio-rng-pci -append 'console=ttyAMA0 earlycon root=/dev/vda' -drive if=pflash,format=raw,readonly,file=/usr/share/AAVMF/AAVMF_CODE.fd -drive if=pflash,format=raw,file=images/arm64/AAVMF_VARS.fd -display none -serial mon:stdio -kernel linux/arch/arm64/boot/Image.gz -machine virt
EFI stub: Booting Linux Kernel...
EFI stub: Generating empty DTB
EFI stub: Exiting boot services and installing virtual address map...
[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x411fd070]
[    0.000000] Linux version 5.3.0-rc5-gbb7ba8069-dirty (driver@clangbuiltlinux) (clang version 9.0.0-svn366197-1~exp1+0~20190716095603.167~1.gbp7d3830 (trunk)) #3 SMP PREEMPT Thu Jan 1 00:00:00 UTC 1970
[    0.000000] efi: Getting EFI parameters from FDT:
[    0.000000] efi: EFI v2.70 by EDK II
[    0.000000] efi:  SMBIOS 3.0=0x5bf40000  MEMATTR=0x5aabf018  ACPI 2.0=0x58560000  RNG=0x5bffca18  MEMRESERVE=0x5854f018 
[    0.000000] efi: seeding entropy pool
[    0.000000] cma: Reserved 32 MiB at 0x000000005d400000
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: RSDP 0x0000000058560000 000024 (v02 BOCHS )
[    0.000000] ACPI: XSDT 0x0000000058550000 00004C (v01 BOCHS  BXPCFACP 00000001      01000013)
[    0.000000] ACPI: FACP 0x0000000058510000 00010C (v05 BOCHS  BXPCFACP 00000001 BXPC 00000001)
[    0.000000] ACPI: DSDT 0x0000000058520000 00482C (v02 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: APIC 0x0000000058500000 0000A8 (v03 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
[    0.000000] ACPI: GTDT 0x00000000584F0000 000060 (v02 BOCHS  BXPCGTDT 00000001 BXPC 00000001)
[    0.000000] ACPI: MCFG 0x00000000584E0000 00003C (v01 BOCHS  BXPCMCFG 00000001 BXPC 00000001)
[    0.000000] ACPI: SPCR 0x00000000584D0000 000050 (v02 BOCHS  BXPCSPCR 00000001 BXPC 00000001)
[    0.000000] ACPI: SPCR: console: pl011,mmio,0x9000000,9600
[    0.000000] earlycon: pl11 at MMIO 0x0000000009000000 (options '9600')
[    0.000000] printk: bootconsole [pl11] enabled
[    0.000000] ACPI: NUMA: Failed to initialise from firmware
[    0.000000] NUMA: Faking a node at [mem 0x0000000040000000-0x000000005fffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x5feff740-0x5ff00fff]
[    0.000000] Zone ranges:
[    0.000000]   DMA32    [mem 0x0000000040000000-0x000000005fffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000040000000-0x000000005856ffff]
[    0.000000]   node   0: [mem 0x0000000058570000-0x000000005874ffff]
[    0.000000]   node   0: [mem 0x0000000058750000-0x000000005bc1ffff]
[    0.000000]   node   0: [mem 0x000000005bc20000-0x000000005bffffff]
[    0.000000]   node   0: [mem 0x000000005c000000-0x000000005fffffff]
[    0.000000] Zeroed struct page in unavailable ranges: 624 pages
[    0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x000000005fffffff]
[    0.000000] psci: probing for conduit method from ACPI.
[    0.000000] psci: PSCIv0.2 detected in firmware.
[    0.000000] psci: Using standard PSCI v0.2 function IDs
[    0.000000] psci: Trusted OS migration not required
[    0.000000] ACPI: SRAT not present
[    0.000000] percpu: Embedded 49 pages/cpu s161096 r8192 d31416 u200704
[    0.000000] Detected PIPT I-cache on CPU0
[    0.000000] CPU features: detected: ARM erratum 832075
[    0.000000] CPU features: detected: ARM erratum 834220
[    0.000000] CPU features: detected: EL2 vector hardening
[    0.000000] CPU features: kernel page table isolation forced ON by KASLR
[    0.000000] CPU features: detected: Kernel page table isolation (KPTI)
[    0.000000] ARM_SMCCC_ARCH_WORKAROUND_1 missing from firmware
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 129024
[    0.000000] Policy zone: DMA32
[    0.000000] Kernel command line: console=ttyAMA0 earlycon root=/dev/vda
[    0.000000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
[    0.000000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 425964K/524288K available (14012K kernel code, 2756K rwdata, 6880K rodata, 6144K init, 10912K bss, 65556K reserved, 32768K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] Running RCU self tests
[    0.000000] rcu: Preemptible hierarchical RCU implementation.
[    0.000000] rcu: 	RCU lockdep checking is enabled.
[    0.000000] rcu: 	RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=1.
[    0.000000] rcu: 	RCU debug extended QS entry/exit.
[    0.000000] 	Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] GICv2m: ACPI overriding V2M MSI_TYPER (base:80, num:64)
[    0.000000] GICv2m: range[mem 0x08020000-0x08020fff], SPI[80:143]
[    0.000000] random: get_random_bytes called from start_kernel+0x1ec/0x3c4 with crng_init=0
[    0.000000] arch_timer: cp15 timer(s) running at 62.50MHz (virt).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x1cd42e208c, max_idle_ns: 881590405314 ns
[    0.000052] sched_clock: 56 bits at 62MHz, resolution 16ns, wraps every 4398046511096ns
[    0.007182] Console: colour dummy device 80x25
[    0.007823] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[    0.008038] ... MAX_LOCKDEP_SUBCLASSES:  8
[    0.008166] ... MAX_LOCK_DEPTH:          48
[    0.008293] ... MAX_LOCKDEP_KEYS:        8192
[    0.008455] ... CLASSHASH_SIZE:          4096
[    0.008613] ... MAX_LOCKDEP_ENTRIES:     32768
[    0.008773] ... MAX_LOCKDEP_CHAINS:      65536
[    0.008928] ... CHAINHASH_SIZE:          32768
[    0.009093]  memory used by lock dependency info: 6237 kB
[    0.009274]  per task-struct memory footprint: 1920 bytes
[    0.009621] ------------------------
[    0.009761] | Locking API testsuite:
[    0.009890] ----------------------------------------------------------------------------
[    0.010144]                                  | spin |wlock |rlock |mutex | wsem | rsem |
[    0.010402]   --------------------------------------------------------------------------
[    0.010931]                      A-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.023074]                  A-B-B-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.034663]              A-B-B-C-C-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.046920]              A-B-C-A-B-C deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.059284]          A-B-B-C-C-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.072615]          A-B-C-D-B-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.085968]          A-B-C-D-B-C-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.099206]                     double unlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.109670]                   initialize held:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.119802]   --------------------------------------------------------------------------
[    0.120058]               recursive read-lock:             |  ok  |             |  ok  |
[    0.123260]            recursive read-lock #2:             |  ok  |             |  ok  |
[    0.126461]             mixed read-write-lock:             |  ok  |             |  ok  |
[    0.129640]             mixed write-read-lock:             |  ok  |             |  ok  |
[    0.132715]   mixed read-lock/lock-write ABBA:             |FAILED|             |  ok  |
[    0.135995]    mixed read-lock/lock-read ABBA:             |  ok  |             |  ok  |
[    0.139469]  mixed write-lock/lock-write ABBA:             |  ok  |             |  ok  |
[    0.142809]   --------------------------------------------------------------------------
[    0.143242]      hard-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
[    0.147981]      soft-irqs-on + irq-safe-A/12:  ok  |  ok  |  ok  |
[    0.152799]      hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
[    0.157489]      soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
[    0.162188]        sirq-safe-A => hirqs-on/12:  ok  |  ok  |  ok  |
[    0.166908]        sirq-safe-A => hirqs-on/21:  ok  |  ok  |  ok  |
[    0.171596]          hard-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
[    0.176279]          soft-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
[    0.180984]          hard-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
[    0.185649]          soft-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
[    0.190311]     hard-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
[    0.195365]     soft-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
[    0.200424]     hard-safe-A + unsafe-B #1/132:  ok  |  ok  |  ok  |
[    0.205491]     soft-safe-A + unsafe-B #1/132:  ok  |  ok  |  ok  |
[    0.210551]     hard-safe-A + unsafe-B #1/213:  ok  |  ok  |  ok  |
[    0.215590]     soft-safe-A + unsafe-B #1/213:  ok  |  ok  |  ok  |
[    0.220633]     hard-safe-A + unsafe-B #1/231:  ok  |  ok  |  ok  |
[    0.225663]     soft-safe-A + unsafe-B #1/231:  ok  |  ok  |  ok  |
[    0.230701]     hard-safe-A + unsafe-B #1/312:  ok  |  ok  |  ok  |
[    0.235493]     soft-safe-A + unsafe-B #1/312:  ok  |  ok  |  ok  |
[    0.240290]     hard-safe-A + unsafe-B #1/321:  ok  |  ok  |  ok  |
[    0.245320]     soft-safe-A + unsafe-B #1/321:  ok  |  ok  |  ok  |
[    0.250383]     hard-safe-A + unsafe-B #2/123:  ok  |  ok  |  ok  |
[    0.255617]     soft-safe-A + unsafe-B #2/123:  ok  |  ok  |  ok  |
[    0.260688]     hard-safe-A + unsafe-B #2/132:  ok  |  ok  |  ok  |
[    0.265799]     soft-safe-A + unsafe-B #2/132:  ok  |  ok  |  ok  |
[    0.270891]     hard-safe-A + unsafe-B #2/213:  ok  |  ok  |  ok  |
[    0.275985]     soft-safe-A + unsafe-B #2/213:  ok  |  ok  |  ok  |
[    0.281058]     hard-safe-A + unsafe-B #2/231:  ok  |  ok  |  ok  |
[    0.286170]     soft-safe-A + unsafe-B #2/231:  ok  |  ok  |  ok  |
[    0.291264]     hard-safe-A + unsafe-B #2/312:  ok  |  ok  |  ok  |
[    0.296381]     soft-safe-A + unsafe-B #2/312:  ok  |  ok  |  ok  |
[    0.301455]     hard-safe-A + unsafe-B #2/321:  ok  |  ok  |  ok  |
[    0.306527]     soft-safe-A + unsafe-B #2/321:  ok  |  ok  |  ok  |
[    0.311898]       hard-irq lock-inversion/123:  ok  |  ok  |  ok  |
[    0.317179]       soft-irq lock-inversion/123:  ok  |  ok  |  ok  |
[    0.322264]       hard-irq lock-inversion/132:  ok  |  ok  |  ok  |
[    0.327340]       soft-irq lock-inversion/132:  ok  |  ok  |  ok  |
[    0.332419]       hard-irq lock-inversion/213:  ok  |  ok  |  ok  |
[    0.337493]       soft-irq lock-inversion/213:  ok  |  ok  |  ok  |
[    0.342611]       hard-irq lock-inversion/231:  ok  |  ok  |  ok  |
[    0.347734]       soft-irq lock-inversion/231:  ok  |  ok  |  ok  |
[    0.352824]       hard-irq lock-inversion/312:  ok  |  ok  |  ok  |
[    0.357938]       soft-irq lock-inversion/312:  ok  |  ok  |  ok  |
[    0.363012]       hard-irq lock-inversion/321:  ok  |  ok  |  ok  |
[    0.368106]       soft-irq lock-inversion/321:  ok  |  ok  |  ok  |
[    0.373178]       hard-irq read-recursion/123:  ok  |
[    0.375032]       soft-irq read-recursion/123:  ok  |
[    0.376795]       hard-irq read-recursion/132:  ok  |
[    0.378563]       soft-irq read-recursion/132:  ok  |
[    0.380430]       hard-irq read-recursion/213:  ok  |
[    0.382199]       soft-irq read-recursion/213:  ok  |
[    0.383959]       hard-irq read-recursion/231:  ok  |
[    0.385832]       soft-irq read-recursion/231:  ok  |
[    0.387616]       hard-irq read-recursion/312:  ok  |
[    0.389385]       soft-irq read-recursion/312:  ok  |
[    0.391241]       hard-irq read-recursion/321:  ok  |
[    0.393003]       soft-irq read-recursion/321:  ok  |
[    0.394750]   --------------------------------------------------------------------------
[    0.394999]   | Wound/wait tests |
[    0.395241]   ---------------------
[    0.395378]                   ww api failures:  ok  |  ok  |  ok  |
[    0.401371]                ww contexts mixing:  ok  |  ok  |
[    0.404695]              finishing ww context:  ok  |  ok  |  ok  |  ok  |
[    0.411251]                locking mismatches:  ok  |  ok  |  ok  |
[    0.416117]                  EDEADLK handling:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
[    0.433222]            spinlock nest unlocked:  ok  |
[    0.434917]   -----------------------------------------------------
[    0.435104]                                  |block | try  |context|
[    0.435303]   -----------------------------------------------------
[    0.435540]                           context:  ok  |  ok  |  ok  |
[    0.440625]                               try:  ok  |  ok  |  ok  |
[    0.445465]                             block:  ok  |  ok  |  ok  |
[    0.450251]                          spinlock:  ok  |  ok  |  ok  |
[    0.455708] -------------------------------------------------------
[    0.455928] Good, all 261 testcases passed! |
[    0.456066] ---------------------------------
[    0.456998] ACPI: Core revision 20190703
[    0.460273] Calibrating delay loop (skipped), value calculated using timer frequency.. 125.00 BogoMIPS (lpj=250000)
[    0.460638] pid_max: default: 32768 minimum: 301
[    0.461923] LSM: Security Framework initializing
[    0.463153] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
[    0.463388] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
[    0.491191] ACPI PPTT: No PPTT table found, CPU and cache topology may be inaccurate
[    0.519025] ASID allocator initialised with 32768 entries
[    0.526299] rcu: Hierarchical SRCU implementation.
[    0.537598] Remapping and enabling EFI services.
[    0.547107] smp: Bringing up secondary CPUs ...
[    0.547386] smp: Brought up 1 node, 1 CPU
[    0.547597] SMP: Total of 1 processors activated.
[    0.547851] CPU features: detected: 32-bit EL0 Support
[    0.548076] CPU features: detected: CRC32 instructions
[    0.556481] CPU: All CPU(s) started at EL1
[    0.556790] alternatives: patching kernel code
[    0.566995] ------------[ cut here ]------------
[    0.567128] kernel BUG at arch/arm64/mm/mmu.c:155!
[    0.567450] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[    0.567772] Modules linked in:
[    0.568057] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc5-gbb7ba8069-dirty #3
[    0.568364] pstate: 60000005 (nZCv daif -PAN -UAO)
[    0.568529] pc : init_pte+0x168/0x198
[    0.568651] lr : init_pmd+0x298/0x328
[    0.568772] sp : ffff000010013c20
[    0.568896] x29: ffff000010013cb0 x28: ffffffce455c0000 
[    0.569070] x27: ffffffce455b0000 x26: 00000000455b0000 
[    0.569238] x25: ffff7dfffe638150 x24: ffffffce45600000 
[    0.569416] x23: ffffffce455fffff x22: ffffffce46000000 
[    0.569590] x21: 00e0000000000793 x20: 00f0000000000793 
[    0.569760] x19: 0000000000000002 x18: 00000000d2a81d91 
[    0.569924] x17: 0000000000000000 x16: ffffffce587f8000 
[    0.570101] x15: 0000000000000800 x14: 00e00000455b0793 
[    0.570266] x13: 00e80000455b0f13 x12: ffd7fffffffff77f 
[    0.570438] x11: 0040000000000001 x10: 0000000000010000 
[    0.570658] x9 : ffff5821256d5000 x8 : ffff7dfffe639d80 
[    0.570837] x7 : ffff58212474b5e0 x6 : 0000000000000002 
[    0.570997] x5 : 0000000000000000 x4 : 00e0000000000793 
[    0.571164] x3 : 00000000455b0000 x2 : ffffffce455c0000 
[    0.571322] x1 : ffffffce455b0000 x0 : ffff7dfffe638150 
[    0.571557] Call trace:
[    0.571683]  init_pte+0x168/0x198
[    0.571809]  alloc_init_pud+0x300/0x39c
[    0.571938]  __create_pgd_mapping+0x94/0xdc
[    0.572063]  update_mapping_prot+0x58/0xf4
[    0.572198]  mark_linear_text_alias_ro+0xec/0xf4
[    0.572354]  smp_cpus_done+0x38/0x40
[    0.572480]  smp_init+0x120/0x134
[    0.572594]  kernel_init_freeable+0x110/0x1a0
[    0.572720]  kernel_init+0x14/0x284
[    0.572838]  ret_from_fork+0x10/0x18
[    0.573153] Code: 3758008f ca0d01cd ea0c01bf 54fffd40 (d4210000) 
[    0.573699] ---[ end trace 40bbcccbfa45bc62 ]---
[    0.574081] BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:38
[    0.574326] in_atomic(): 1, irqs_disabled(): 128, pid: 1, name: swapper/0
[    0.574552] INFO: lockdep is turned off.
[    0.574692] irq event stamp: 2488
[    0.574822] hardirqs last  enabled at (2487): [<ffff58212474fa08>] _raw_spin_unlock_irq+0x2c/0x68
[    0.575104] hardirqs last disabled at (2488): [<ffff5821239b17c4>] do_debug_exception+0x58/0x1ec
[    0.575384] softirqs last  enabled at (2210): [<ffff582123a28018>] irq_exit+0x114/0x134
[    0.575630] softirqs last disabled at (2203): [<ffff582123a28018>] irq_exit+0x114/0x134
[    0.575947] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G      D           5.3.0-rc5-gbb7ba8069-dirty #3
[    0.576232] Call trace:
[    0.576320]  dump_backtrace+0x0/0x140
[    0.576450]  show_stack+0x14/0x1c
[    0.576558]  dump_stack+0xa8/0x104
[    0.576680]  ___might_sleep+0x1b0/0x1c0
[    0.576816]  __might_sleep+0x4c/0x80
[    0.576923]  exit_signals+0x30/0x3a8
[    0.577039]  do_exit+0x9c/0xa18
[    0.577158]  arm64_force_sig_fault+0x0/0x58
[    0.577308]  bug_handler+0x40/0x78
[    0.577423]  early_brk64+0x10/0x20
[    0.577547]  do_debug_exception+0x188/0x1ec
[    0.577683]  el1_dbg+0x18/0x8c
[    0.577783]  init_pte+0x168/0x198
[    0.577901]  alloc_init_pud+0x300/0x39c
[    0.578031]  __create_pgd_mapping+0x94/0xdc
[    0.578161]  update_mapping_prot+0x58/0xf4
[    0.578301]  mark_linear_text_alias_ro+0xec/0xf4
[    0.578457]  smp_cpus_done+0x38/0x40
[    0.578573]  smp_init+0x120/0x134
[    0.578675]  kernel_init_freeable+0x110/0x1a0
[    0.578824]  kernel_init+0x14/0x284
[    0.578940]  ret_from_fork+0x10/0x18
[    0.579560] note: swapper/0[1] exited with preempt_count 1
[    0.581196] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    0.581656] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

@ardbiesheuvel

OK, so the KASLR kernel is buggered when compiled with Clang. Are there any known issues in this area that we haven't fixed yet?

(You can pass 'nokaslr' on the kernel command line to double check that this is the issue, but it seems highly likely)
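
A quick way to run that check is to make `nokaslr` an optional suffix on the kernel command line. The sketch below reuses the `-append` string from the arm64 invocation above; `kernel_append` is a hypothetical helper, not part of driver.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the arm64 kernel command line, optionally
# with nokaslr appended to rule KASLR in or out as the cause of a crash.
kernel_append() {
  local append="console=ttyAMA0 earlycon root=/dev/vda"
  if [[ ${1:-} == nokaslr ]]; then
    append+=" nokaslr"
  fi
  printf '%s\n' "${append}"
}

# Keep the virtio-rng-pci device so the only variable is nokaslr.
qemu_cmdline=( -device virtio-rng-pci -append "$(kernel_append nokaslr)" )
printf '%s\n' "${qemu_cmdline[@]}"
```

If the same build boots with `nokaslr` and crashes without it, the randomized base is the likely trigger.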

@nathanchance
Member Author

Yes, nokaslr works.

I don't believe there are any outstanding issues. All of the KASLR ones that I was aware of were ld.lld related, but this is with ld.bfd. What's odd is that CONFIG_RANDOMIZE_BASE is enabled in the arm64 defconfig as of 5.3-rc1, and we have no issues booting with it when not trying to boot via EFI.

Where should we start debugging?

@ardbiesheuvel

ardbiesheuvel commented Aug 22, 2019

That is explained by the fact that the upstream arm64 kernel requires EFI boot for KASLR.
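
Given that, a CI script could check whether a build is even in a position to exercise KASLR before expecting coverage from a boot test. This is only a sketch of the idea: `config_enabled` and `kaslr_testable` are hypothetical helpers, and treating "both options built in" as the condition is an assumption of this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if <symbol> is built in (=y) in a kernel .config.
config_enabled() {
  local config=$1 symbol=$2
  grep -q "^CONFIG_${symbol}=y$" "${config}"
}

# Assumption for this sketch: KASLR is only exercised when the kernel has
# both RANDOMIZE_BASE and EFI enabled (and the VM is booted via EFI).
kaslr_testable() {
  config_enabled "$1" RANDOMIZE_BASE && config_enabled "$1" EFI
}
```

For example, `kaslr_testable path/to/.config && echo "KASLR can be exercised under EFI"` would print the message only when both options are set.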

@nickdesaulniers
Member

> That is explained by the fact that the upstream arm64 kernel requires EFI boot for KASLR.

cc @Ajs1984 @samitolvanen

Huh, I didn't think Pixel kernels booted with EFI, yet they definitely use KASLR. TBH, maybe they do use EFI and I'm wrong.

Shouldn't arch/arm64/Kconfig select EFI when RANDOMIZE_BASE is selected?

@nickdesaulniers
Member

nickdesaulniers commented Aug 22, 2019

Correction, most Pixels do use UEFI. Not sure how they get their KASLR seed though. I assume the bootloader is involved in collecting entropy, but we're out of my area of expertise.

4 participants