OSDev.org https://forum.osdev.org/ |
[AArch64 / Bare metal] Need help with CPU communication https://forum.osdev.org/viewtopic.php?f=1&t=33444 |
Author: | dublevsky [ Fri Jan 18, 2019 11:36 am ] |
Post subject: | [AArch64 / Bare metal] Need help with CPU communication |
Been trying to solve this for a week and decided to reach out for help. I have code running on bare metal (RPi 3 Model B+). Every CPU does some general initialization and then waits for CPU0 to zero out the BSS (and, in the future, do things like MMU setup). Once CPU0 has finished its own initialization, it is supposed to release the secondary CPUs, and then every CPU jumps into the kernel by calling kmain. kmain contains a very primitive waiting function (for now, just to check whether the other CPUs get there) and prints each CPU's id. The problem is that only CPU0 gets to kmain.

start.S
Code:
#include "asm/macros.h"
#include "arch/arch.h"
#include "board/spec.h"

.section .bss.stack
.align 8
.skip ARCH_STACK_SIZE * BOARD_NUM_CPUS
DATA(___stack_end)

.section .data
.align 8
DATA(cpu_barrier)
    .quad 1    // 8 bytes wide, because it is accessed with 64-bit ldr/str below

.section .text

cpuid .req x9

FUNCTION(_start)
    // ----------------------------------------
    // Initialization to carry out on every CPU
    // ----------------------------------------

    // Find out which CPU we are running at
    mrs cpuid, mpidr_el1
    and cpuid, cpuid, #0xff

    // Set up the stack
    adr x0, ___stack_end
    ldr x1, =ARCH_STACK_SIZE
    mul x1, x1, cpuid
    sub sp, x0, x1

    // -----------------------------------
    // Initialization to carry out on CPU0
    // -----------------------------------
    cbnz cpuid, .Lwait_for_primary_cpu

    // Zero out the bss section
    // Note: relies on ___bss and ___bss_end being 16 byte aligned
    adr x0, ___bss
    adr x1, ___bss_end
    sub x1, x1, x0
    cbz x1, .Lbss_init_done
.Lbss_init_loop:
    stp xzr, xzr, [x0], #16
    sub x1, x1, #16
    cbnz x1, .Lbss_init_loop
.Lbss_init_done:

    // Release secondary cpus
    adr x0, cpu_barrier
    str xzr, [x0]
    b .Lwait_for_primary_cpu_done

    // Wait for primary cpu
.Lwait_for_primary_cpu:
    adr x0, cpu_barrier
.Lwait_for_primary_cpu_loop:
    ldr x1, [x0]
    cbnz x1, .Lwait_for_primary_cpu_loop
.Lwait_for_primary_cpu_done:

    // Jump into the kernel
.Lkernel_entry:
    mov x0, cpuid
    bl kmain

.Lhang:
    wfe
    b .Lhang

asm/macros.h
Code:
#ifndef INCLUDE_ASM_MACROS_H
#define INCLUDE_ASM_MACROS_H

#define FUNCTION(x) .global x; .type x, STT_FUNC; x:
#define DATA(x) .global x; .type x, STT_OBJECT; x:
#define LOCALFUNCTION(x) .type x, STT_FUNC; x:
#define LOCALDATA(x) .type x, STT_OBJECT; x:

#endif /*INCLUDE_ASM_MACROS_H*/

kmain.c
Code:
#include <stdint.h>
#include "peripherals/mu.h"

static void wait(const uint64_t c)
{
    for (uint64_t i = 0; i < c; ++i) {
        __asm__ volatile("nop");
    }
}

void kmain(const uint64_t cpuid)
{
    if (cpuid) {
        wait(1000000 * cpuid);
    } else {
        mu_init(9600);
    }
    mu_putc((char)cpuid + '0');
}

link64.ld
Code:
ENTRY(_start)

SECTIONS
{
    . = 0x80000;
    .text : { *(.text) }
    .rodata : { *(.rodata) }
    .data : { *(.data) }
    .bss :
    {
        . = ALIGN(8);
        *(.bss.stack)
        . = ALIGN(16);
        ___bss = .;
        *(.bss)
        . = ALIGN(16);
        ___bss_end = .;
    }
}

Would love some help, because I'm going crazy at this point. |
Author: | Octocontrabass [ Fri Jan 18, 2019 1:48 pm ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
How are you telling all of the CPUs to jump to _start? |
Author: | zaval [ Fri Jan 18, 2019 1:52 pm ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
I honestly haven't touched multiprocessing yet, so I'd hardly be helpful, but seriously, looking at your code, I am wondering: why do you think the secondary CPUs are even running? Where is that visible? They won't run just because your bootstrap CPU writes 0 into some variable; you need to wake them up first! And it all comes down to how that's done on the RPi. With all that VC stuff... who knows. But I guess your secondary CPUs aren't running. The firmware starts on CPU0, your code takes over control on it too, and that's all. No secondary CPUs on the scene. Read up on secondary CPU bring-up for the RPi. |
Author: | dublevsky [ Fri Jan 18, 2019 1:57 pm ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
Octocontrabass wrote: How are you telling all of the CPUs to jump to _start?
zaval wrote: I honestly haven't touched multiprocessing yet, so I'd hardly be helpful, but seriously, looking at your code, I am wondering: why do you think the secondary CPUs are even running? Where is that visible? They won't run just because your bootstrap CPU writes 0 into some variable; you need to wake them up first! And it all comes down to how that's done on the RPi. With all that VC stuff... who knows. But I guess your secondary CPUs aren't running. The firmware starts on CPU0, your code takes over control on it too, and that's all. No secondary CPUs on the scene. Read up on secondary CPU bring-up for the RPi.
The RPi bootloader does what it needs to, and then every single core enters _start. |
Author: | Octocontrabass [ Fri Jan 18, 2019 2:18 pm ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
Which bootloader are you using that sends every CPU to _start? The official bootloaders only start one CPU and leave the others halted. |
Author: | dublevsky [ Fri Jan 18, 2019 2:23 pm ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
Octocontrabass wrote: Which bootloader are you using that sends every CPU to _start? The official bootloaders only start one CPU and leave the others halted.
I'm using the official RPi one. Pretty sure every single CPU is awake as this simple code works. |
Author: | nullplan [ Fri Jan 18, 2019 3:02 pm ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
Cache coherency problems? Is it possible the other cores never see the update to cpu_barrier? Do you need a barrier in that loop and at the point where you write the variable? |
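For illustration, a barrier of the kind asked about here could be sketched in C11 as below. This is only a sketch under stated assumptions: the names `cpu_barrier_flag`, `release_secondaries`, `wait_for_primary` and `barrier_is_open` are hypothetical and do not come from the thread, and the code assumes a toolchain that provides `<stdatomic.h>`. It shows where the ordering would go (a release store on the writer, an acquire load in the spin loop) and does not claim that missing barriers were the cause of the problem.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the thread's cpu_barrier: non-zero means "wait". */
_Atomic uint64_t cpu_barrier_flag = 1;

/* Called by the primary CPU once shared state (e.g. a zeroed BSS) is ready.
 * The release store orders all earlier writes before the flag update. */
void release_secondaries(void)
{
    atomic_store_explicit(&cpu_barrier_flag, 0, memory_order_release);
}

/* Called by secondary CPUs. The acquire load pairs with the release store,
 * so writes made before the release are visible once the loop exits. */
void wait_for_primary(void)
{
    while (atomic_load_explicit(&cpu_barrier_flag, memory_order_acquire) != 0) {
        /* spin until the primary CPU clears the flag */
    }
}

/* Helper for inspection: returns 1 once the flag has been cleared. */
int barrier_is_open(void)
{
    return atomic_load_explicit(&cpu_barrier_flag, memory_order_acquire) == 0;
}
```

On AArch64 a compiler would typically lower these to store-release and load-acquire instructions, which is one way to get the ordering without hand-written `dsb` instructions.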
Author: | dublevsky [ Fri Jan 18, 2019 3:13 pm ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
nullplan wrote: Cache coherency problems? Is it possible the other cores never see the update to cpu_barrier? Do you need a barrier in that loop and at the point where you write the variable?
Neither the MMU nor the I/D caches are enabled yet. |
Author: | Octocontrabass [ Fri Jan 18, 2019 3:24 pm ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
dublevsky wrote: That code uses "kernel_old=1" in config.txt to bypass the boot stub. You are not using "kernel_old=1" in your config.txt, so the firmware's default boot stub is running (or armstub8.bin from your SD card), and that boot stub is halting all but one of the CPUs. |
Author: | dublevsky [ Fri Jan 18, 2019 3:29 pm ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
Octocontrabass wrote: dublevsky wrote: That code uses "kernel_old=1" in config.txt to bypass the boot stub. You are not using "kernel_old=1" in your config.txt, so the firmware's default boot stub is running (or armstub8.bin from your SD card), and that boot stub is halting all but one of the CPUs.
You have a point. BRB, checking this out.
EDIT: It's midnight for me, but I checked some resources and Octocontrabass's answer seems to be right. I will work on this tomorrow and will reply with a full solution if it works. |
Author: | bzt [ Sat Jan 19, 2019 5:21 am ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
Hi,
Your code will be executed on all cores no matter what you do in config.txt. This is the case even if config.txt does not exist (recommended). The memory cache is wired per core, but you have one RAM. Therefore, if you change memory from one core, you need to refresh the cache in the other cores. To do that, either map the memory as non-cacheable, outer-shareable, or explicitly use a data barrier (dsb).
Cheers, bzt |
Author: | dublevsky [ Sat Jan 19, 2019 7:31 am ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
bzt wrote: Hi, Your code will be executed on all cores no matter what you do in config.txt. This is the case even if config.txt does not exist (recommended). The memory cache is wired per core, but you have one RAM. Therefore, if you change memory from one core, you need to refresh the cache in the other cores. To do that, either map the memory as non-cacheable, outer-shareable, or explicitly use a data barrier (dsb). Cheers, bzt
Hi, the I/D caches are not enabled yet. I'm currently working on Octocontrabass's answer. I revisited this answer on /r/asm and decided to google for 'raspberry pi cpu-release-addr', which led me to Device Tree Blobs. After decompiling bcm2710-rpi-3-b-plus.dtb back to .dts format and looking into it, there is indeed a cpu-release-addr parameter for every CPU. I'm currently writing a quick and dirty mailbox interface implementation to check this, so no progress yet.
EDIT: OK, I checked the code with kernel_old=1 and disable_commandline_tags=1 in config.txt and it still doesn't work, so I'm going to try using dsb and report back.
EDIT2: Wrapping every single load and store with 'dsb sy' didn't work either. |
Author: | dublevsky [ Sat Jan 19, 2019 9:10 am ] |
Post subject: | Re: [AArch64 / Bare metal] Need help with CPU communication |
OK. After being a complete idiot for ~1 week I finally got it working. Big thanks to Octocontrabass and /u/TNorthover.

Solution:
If you have no custom boot options in config.txt, the RPi bootloader loads your image at 0x8000 for kernel7.img (32-bit kernel) or 0x80000 for kernel8.img (64-bit kernel). The stubs used for loading in that case are armstub7.S and armstub8.S. As I'm writing a 64-bit kernel for AArch64, I looked into the boot process in armstub8.S.

After some minimal CPU initialization, CPU0 loads the address of the Device Tree Blob (Flattened Device Tree) into x0 and the kernel entry address (_start in my case) into x4, then jumps to that address. CPU[1:3], on the other hand, sit in a loop that consists of 2 steps: waiting for an event (WFE), then loading x4 from their respective 'barrier' slot and checking it for a non-zero value. The slot address is x5 + (x6 << 3), where:
x5 = address of spin_cpu0, basically the base address of the CPU 'barriers'; it equals 0xd8
x6 = CPU id

So once CPU0 writes the value 0x80000 (i.e. &_start) to 0xe0, 0xe8 and 0xf0 and then sends an event (SEV), CPU[1:3] wake up and jump to _start.
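The release step described above could be sketched in C roughly as follows. This is a minimal sketch, not the code actually used in the thread: the names `SPIN_TABLE_BASE`, `NUM_CPUS`, `spin_slot_addr` and `release_secondary_cpus` are hypothetical, the base address 0xd8 and the 8-byte slot stride are taken from the post, and the `dsb sy` before `sev` is an extra precaution assumed here rather than something the post states is required.

```c
#include <assert.h>
#include <stdint.h>

#define SPIN_TABLE_BASE 0xd8u  /* address of spin_cpu0, per the post above */
#define NUM_CPUS        4u     /* assumption: four cores */

/* Address of the slot polled by the given CPU: base + (cpu << 3). */
uintptr_t spin_slot_addr(unsigned cpu)
{
    assert(cpu < NUM_CPUS);
    return (uintptr_t)SPIN_TABLE_BASE + ((uintptr_t)cpu << 3);
}

/* Write `entry` into the slots of CPUs 1..3 and wake them up.
 * `table` points at the slot of CPU0; on the real board that would be
 * (volatile uint64_t *)SPIN_TABLE_BASE. */
void release_secondary_cpus(volatile uint64_t *table, uint64_t entry)
{
    for (unsigned cpu = 1; cpu < NUM_CPUS; ++cpu) {
        table[cpu] = entry;
    }
#if defined(__aarch64__)
    /* Complete the stores first, then wake the CPUs parked in WFE. */
    __asm__ volatile("dsb sy\n\tsev" ::: "memory");
#endif
}
```

On the board, CPU0 would call something like `release_secondary_cpus((volatile uint64_t *)SPIN_TABLE_BASE, (uint64_t)&_start)` after its own initialization. Passing the table as a pointer keeps the sketch testable on a host with an ordinary array.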
All times are UTC - 6 hours |