OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: QEMU hangs on M2 MacBook running Ventura
PostPosted: Fri Jun 02, 2023 4:32 am 
Joined: Sat Oct 23, 2004 11:00 pm
Posts: 154
Hi,

I installed QEMU 8.0.2 from Homebrew on an M2 MacBook running macOS Ventura 13.4.

My OS is a 32-bit protected-mode OS for x86, cross-compiled with i686-elf-gcc. I run it with qemu-system-x86_64.

The same setup works fine on WSL (Windows), but on macOS I traced the hang to QEMU encountering either of the FPU instructions "FILDLL" or "FDIVRP".

If I remove that floating-point math code, my OS boots fine. I confirmed that the assembly generated on WSL contains the same FILDLL and FDIVRP instructions, and it works fine there.

In QEMU's 'out_asm' debug log on macOS, the last few lines before the hang are a run of .quad lines (around 8, I think).

I have verified from the debug logs that there are no guest_errors entries and no interrupts.

Does anyone know what could be wrong here?

My FPU initialisation looks correct. I verified that the CPU has a built-in FPU, that EM is cleared, and that the MP and NE bits are set in CR0. SSE is also enabled.

This is my entire command line:
qemu-system-x86_64 \
-serial file:serial_debug.log \
-pflash ./OVMF.fd \
-m 512 \
-smp 1 \
-usb \
-d guest_errors,int,out_asm \
-drive if=none,id=usbbootdrive,file=$BOOT_DRIVE \
-drive if=none,id=usbdrive1,file=$UPANIX_HOME/USBImage/300MUSB_ehci.img \
-device usb-ehci,id=ehci \
-device usb-storage,bus=ehci.0,drive=usbdrive1 \
-device nec-usb-xhci,id=xhci \
-device usb-storage,bus=xhci.0,port=1,drive=usbbootdrive \
-device usb-hub,bus=xhci.0,port=3

Regards
Prajwal

_________________
complexity is the core of simplicity


 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Fri Jun 02, 2023 11:05 am 
Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
prajwal wrote:
The same setup works fine on WSL (Windows), but on macOS I traced the hang to QEMU encountering either of the FPU instructions "FILDLL" or "FDIVRP".

Did you run any other FPU instructions before these?

prajwal wrote:
I verified that the CPU has a built-in FPU, that EM is cleared, and that the MP and NE bits are set in CR0.

How about CR0.TS?
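
For reference, the CR0 bits mentioned so far can be checked with a small helper like the one below. This is only a sketch: the bit positions are the ones documented in the Intel SDM, but the function names are made up for illustration, and reading CR0 itself (a privileged move from CR0 inside the kernel) is assumed to happen elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CR0 bit masks (positions as documented in the Intel SDM). */
#define CR0_MP (UINT32_C(1) << 1) /* monitor coprocessor */
#define CR0_EM (UINT32_C(1) << 2) /* emulation: x87 instructions raise #NM when set */
#define CR0_TS (UINT32_C(1) << 3) /* task switched: x87 instructions raise #NM when set */
#define CR0_NE (UINT32_C(1) << 5) /* numeric error: native x87 error reporting */

static_assert(CR0_TS == 0x8, "TS is bit 3 of CR0");

/* Hypothetical helper: x87 instructions run without #NM only if EM and TS are both clear. */
static inline bool cr0_x87_usable(uint32_t cr0)
{
    return (cr0 & (CR0_EM | CR0_TS)) == 0;
}

/* Hypothetical helper: the setup described in the thread (EM clear, MP and NE set). */
static inline bool cr0_matches_thread_setup(uint32_t cr0)
{
    return (cr0 & CR0_EM) == 0 &&
           (cr0 & (CR0_MP | CR0_NE)) == (CR0_MP | CR0_NE);
}
```

A CR0 value can pass the second check and still fail the first, which is why TS is worth looking at separately.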


 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Fri Jun 02, 2023 5:16 pm 
Joined: Sat Oct 23, 2004 11:00 pm
Posts: 154
Yes, I ran other FPU instructions before this. I modified the code to do the same math using function-local variables instead of the values passed in as function parameters. In that case the compiler-generated instructions no longer included FILDLL and FDIVRP, but still included FSTPL and FLDL. (FYI: I compile with -O0, so no optimisation is applied.)
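
To make the description concrete, here is a guess at the general shape of such code. It is a hypothetical sketch, not the actual function from the OS: the name and parameters are invented. When built for i686 with x87 math, a conversion from a 64-bit integer parameter to double is typically done with a 64-bit FILD, and the division with one of the x87 divide-and-pop forms.

```c
#include <assert.h>

/* Hypothetical guest-side function; the name and signature are illustrative only. */
double ratio_sketch(long long total, long long count)
{
    /* Guard against an integer zero divisor before converting. */
    assert(count != 0);

    /* Each long long -> double conversion of a parameter is a candidate for a
       64-bit x87 integer load; the division is a candidate for FDIVP/FDIVRP. */
    return (double)total / (double)count;
}
```

The exact instructions depend on the compiler version and flags, so the generated assembly should be checked rather than assumed.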

 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Fri Jun 02, 2023 6:00 pm 
Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
Does the behavior change according to the operands to the FILD or FDIVRP instructions? Is there enough room in the x87 stack for FILD?

Can you turn on one-insn-per-tb and log out_asm where QEMU hangs?

Can you run QEMU itself under a debugger to see why it hangs?


 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Sat Jun 03, 2023 7:44 pm 
Joined: Sat Oct 23, 2004 11:00 pm
Posts: 154
I tried using LLDB but couldn't single-step beyond the first breakpoint at the main function.

In any case, I tried a couple of other things. I downloaded and built qemu-7.2.3 on my M2 Mac (Ventura 13.4) and used qemu-system-x86_64 from that build. It froze at the same point.

one-insn-per-tb is not supported in qemu-8.0.2, so I used the -singlestep option with -d out_asm,in_asm. This is the output just before QEMU froze:

Code:
IN:
0x0019cadf:  de f9                    fdivrp   %st(1)

OUT: [size=72]
  -- guest addr 0x0000000000000adf + tb prologue
0x10ca72200:  b85f0274  ldur     w20, [x19, #-0x10]
0x10ca72204:  7100029f  cmp      w20, #0
0x10ca72208:  540001cb  b.lt     #0x10ca72240
0x10ca7220c:  aa1303e0  mov      x0, x19
0x10ca72210:  52800021  movz     w1, #0x1
0x10ca72214:  9602645a  bl       #0x104b0b37c
0x10ca72218:  aa1303e0  mov      x0, x19
0x10ca7221c:  96026166  bl       #0x104b0a7b4
0x10ca72220:  b940d274  ldr      w20, [x19, #0xd0]
0x10ca72224:  7905e674  strh     w20, [x19, #0x2f2]
0x10ca72228:  f9404274  ldr      x20, [x19, #0x80]
0x10ca7222c:  f9017e74  str      x20, [x19, #0x2f8]
0x10ca72230:  91000a94  add      x20, x20, #2
0x10ca72234:  2a1403f4  mov      w20, w20
0x10ca72238:  f9004274  str      x20, [x19, #0x80]
0x10ca7223c:  16a3a77b  b        #0x10735c028
0x10ca72240:  70fff600  adr      x0, #0x10ca72103
0x10ca72244:  16a3a77a  b        #0x10735c02c


Any clue from the above code as to what could be going wrong?

 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Sat Jun 03, 2023 9:42 pm 
Joined: Sat Oct 23, 2004 11:00 pm
Posts: 154
In addition, if this information helps: my OS boots successfully (from USB) and runs on my real laptop, which has an x86_64 processor.

 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Sat Jun 03, 2023 10:57 pm 
Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
prajwal wrote:
I tried using LLDB but couldn't single-step beyond the first breakpoint at the main function.

Instead of setting a breakpoint, hang QEMU and send SIGINT as your breakpoint. That should let you see what QEMU is doing when it hangs.

Code:
0x0019cadf:  de f9                    fdivrp   %st(1)

In Intel syntax this is FDIVP, not FDIVRP.
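
The naming mismatch comes from the encoding. In the Intel manuals the two-byte form DE F8+i is FDIVP ST(i), ST(0) and DE F0+i is FDIVRP ST(i), ST(0), while GNU-style AT&T syntax prints these register forms with the two mnemonics swapped. A minimal sketch of the Intel-side classification follows; the enum and function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum x87_div_op { X87_NOT_DIVP, X87_FDIVP, X87_FDIVRP };

/* Classify a two-byte x87 instruction "DE xx" using the Intel-syntax names.
   On a match, *sti receives the i of the ST(i) destination register. */
static inline enum x87_div_op x87_classify_de(uint8_t opcode, uint8_t second,
                                              unsigned *sti)
{
    assert(sti != NULL);
    if (opcode != 0xDE) {
        return X87_NOT_DIVP;
    }
    if (second >= 0xF8) {            /* DE F8..FF: FDIVP ST(i), ST(0) */
        *sti = second - 0xF8u;
        return X87_FDIVP;
    }
    if (second >= 0xF0) {            /* DE F0..F7: FDIVRP ST(i), ST(0) */
        *sti = second - 0xF0u;
        return X87_FDIVRP;
    }
    return X87_NOT_DIVP;
}
```

For the bytes in the log (de f9) this yields FDIVP with ST(1) as the destination.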

Code:
0x10ca72200:  b85f0274  ldur     w20, [x19, #-0x10]
0x10ca72204:  7100029f  cmp      w20, #0
0x10ca72208:  540001cb  b.lt     #0x10ca72240

This is checking for some kind of exception. I'm not sure what, though.

Code:
0x10ca7220c:  aa1303e0  mov      x0, x19
0x10ca72210:  52800021  movz     w1, #0x1
0x10ca72214:  9602645a  bl       #0x104b0b37c

This is a call to helper_fdiv_STN_ST0().

Code:
0x10ca72218:  aa1303e0  mov      x0, x19
0x10ca7221c:  96026166  bl       #0x104b0a7b4

This is a call to helper_fpop().

Code:
0x10ca72220:  b940d274  ldr      w20, [x19, #0xd0]
0x10ca72224:  7905e674  strh     w20, [x19, #0x2f2]
0x10ca72228:  f9404274  ldr      x20, [x19, #0x80]
0x10ca7222c:  f9017e74  str      x20, [x19, #0x2f8]
0x10ca72230:  91000a94  add      x20, x20, #2
0x10ca72234:  2a1403f4  mov      w20, w20
0x10ca72238:  f9004274  str      x20, [x19, #0x80]

This is updating the FPU instruction pointer and adding 2 to EIP.
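
In C terms, that block does roughly the following. This is a sketch for reading the listing, not QEMU source: the struct and field names are hypothetical, and which offset holds which field (CS selector at #0xd0, EIP at #0x80, the FPU code-segment and instruction-pointer fields at #0x2f2 and #0x2f8) is inferred from the load/store pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of the CPU state this block touches. */
struct cpu_state_sketch {
    uint32_t cs_selector; /* assumed to be [x19, #0xd0]  */
    uint64_t eip;         /* assumed to be [x19, #0x80]  */
    uint16_t fpu_cs;      /* assumed to be [x19, #0x2f2] */
    uint64_t fpu_ip;      /* assumed to be [x19, #0x2f8] */
};

/* Record where the x87 instruction was, then step EIP past it. */
static inline void retire_x87_insn(struct cpu_state_sketch *s, unsigned insn_len)
{
    assert(s != NULL);
    s->fpu_cs = (uint16_t)s->cs_selector;   /* ldr w20 / strh w20           */
    s->fpu_ip = s->eip;                     /* ldr x20 / str x20            */
    s->eip = (uint32_t)(s->eip + insn_len); /* add, then "mov w20, w20"     */
                                            /* keeps only the low 32 bits   */
}
```

With EIP at 0x0019cadf and a two-byte instruction, EIP ends up at 0x0019cae1 and the saved FPU instruction pointer stays at 0x0019cadf.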

Code:
0x10ca7223c:  16a3a77b  b        #0x10735c028
0x10ca72240:  70fff600  adr      x0, #0x10ca72103
0x10ca72244:  16a3a77a  b        #0x10735c02c

This is exiting the translation block. There's a normal exit and an exception exit.

prajwal wrote:
Any clue from the above code as to what could be going wrong?

I don't see anything wrong in the generated code. I don't think the helper functions are doing anything crazy enough to cause a problem either, especially since they seem to work fine on other CPUs.

At this point you might have better luck coming up with the smallest program that replicates the problem and submitting a bug report to the QEMU developers.


 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Sun Jun 04, 2023 2:45 am 
Joined: Sat Oct 23, 2004 11:00 pm
Posts: 154
Thank you. I looked at the QEMU crash report in /Library/Logs/DiagnosticReports and found the thread stack dump below.

The line in QEMU where it was hanging is "host-utils.h:576", which is the call to "__builtin_addcll".

I then modified "include/qemu/compiler.h" to force "#define __has_builtin(x) 0" and recompiled qemu-7.2.3. The build emitted a number of warnings about redefining __has_builtin(x), but it completed successfully.

This time everything worked and the OS ran successfully! So at this point the issue points to __builtin_addcll().

I would appreciate it if the details shared so far help someone find the root cause, so that I don't have to rely on this hack/workaround.

Code:
static inline uint64_t uadd64_carry(uint64_t x, uint64_t y, bool *pcarry)
{
#if __has_builtin(__builtin_addcll)
    unsigned long long c = *pcarry;
    x = __builtin_addcll(x, y, c, &c); // This is the line with problem
    *pcarry = c & 1;
    return x;
#else
    bool c = *pcarry;
    /* This is clang's internal expansion of __builtin_addc. */
    c = uadd64_overflow(x, c, &x);
    c |= uadd64_overflow(x, y, &x);
    *pcarry = c;
    return x;
#endif
}
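
One way to narrow this down outside QEMU is to compare the builtin against a plain C add-with-carry on a few boundary values. The following is a minimal sketch under stated assumptions: the helper names are invented, the sample values are arbitrary, and the portable version is written from the definition of add-with-carry rather than copied from QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifndef __has_builtin
#define __has_builtin(x) 0 /* compilers without __has_builtin use the fallback */
#endif

/* Portable reference: returns x + y + carry-in, writes the carry-out back. */
static inline uint64_t addc_portable(uint64_t x, uint64_t y, bool *pcarry)
{
    assert(pcarry != NULL);
    uint64_t sum = x + (uint64_t)*pcarry;
    bool c = sum < x;              /* carry out of adding the carry-in */
    uint64_t res = sum + y;
    c |= res < sum;                /* carry out of adding y */
    *pcarry = c;
    return res;
}

/* Same operation through the compiler builtin, when the compiler has it. */
static inline uint64_t addc_builtin(uint64_t x, uint64_t y, bool *pcarry)
{
#if __has_builtin(__builtin_addcll)
    unsigned long long c = *pcarry;
    x = __builtin_addcll(x, y, c, &c);
    *pcarry = c & 1;
    return x;
#else
    return addc_portable(x, y, pcarry);
#endif
}

/* Counts the inputs on which the two versions disagree. */
static inline unsigned addc_count_mismatches(void)
{
    static const uint64_t samples[] = {
        0, 1, 2, UINT64_C(0x7fffffffffffffff), UINT64_C(0x8000000000000000),
        UINT64_MAX - 1, UINT64_MAX
    };
    const size_t n = sizeof(samples) / sizeof(samples[0]);
    unsigned mismatches = 0;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            for (int cin = 0; cin <= 1; cin++) {
                bool c_ref = cin, c_blt = cin;
                uint64_t r_ref = addc_portable(samples[i], samples[j], &c_ref);
                uint64_t r_blt = addc_builtin(samples[i], samples[j], &c_blt);
                if (r_ref != r_blt || c_ref != c_blt) {
                    mismatches++;
                }
            }
        }
    }
    return mismatches;
}
```

A non-zero count from addc_count_mismatches() on the affected machine would point at the compiler. A zero count does not clear it, since a miscompilation may depend on the optimisation level and on the code the builtin is inlined into.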


Code:
  Thread 0xecd2a    48 samples (1-48)    priority 31 (base 31)    cpu time 4.698s (16.4G cycles, 43.1G instructions, 0.38c/i)
  <process frontmost, thread QoS default (requested default), process unclamped, process received importance donation from WindowServer [350], IO tier 0>
  48  thread_start + 8 (libsystem_pthread.dylib + 7584) [0x18c196da0] 1-48
    48  _pthread_start + 148 (libsystem_pthread.dylib + 28584) [0x18c19bfa8] 1-48
      48  qemu_thread_start + 128 (qemu-thread-posix.c:505,9 in qemu-system-x86_64 + 5037844) [0x104d81f14] 1-48
        48  rr_cpu_thread_fn + 480 (tcg-accel-ops-rr.c:223,21 in qemu-system-x86_64 + 3606504) [0x104c247e8] 1-48
          48  tcg_cpus_exec + 44 (tcg-accel-ops.c:69,11 in qemu-system-x86_64 + 3603396) [0x104c23bc4] 1-48
            48  cpu_exec + 1764 (cpu-exec.c:1032,13 in qemu-system-x86_64 + 3469648) [0x104c03150] 1-48
              48  cpu_loop_exec_tb + 32 (cpu-exec.c:868,10 in qemu-system-x86_64 + 3469648) [0x104c03150] 1-48
                48  cpu_tb_exec + 148 (cpu-exec.c:438,11 in qemu-system-x86_64 + 3467428) [0x104c028a4] 1-48
                  48  ??? [0x10ca72218] 1-48
                    48  helper_fdiv_STN_ST0 + 76 (fpu_helper.c:578,10 in qemu-system-x86_64 + 2454472) [0x104b0b3c8] 1-48
                      48  helper_fdiv + 16 (fpu_helper.c:159,20 in qemu-system-x86_64 + 2454472) [0x104b0b3c8] 1-48
                        1   floatx80_div + 496 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389588) [0x104bef894] 1
                          1   parts128_div + 288 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389588) [0x104bef894] 1
                            1   frac128_div + 240 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389588) [0x104bef894] 1
                              1   add192 + 0 (softfloat-macros.h:459,14 in qemu-system-x86_64 + 3389588) [0x104bef894] 1
                                1   uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389588) [0x104bef894] (running) 1
                        1   floatx80_div + 512 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389604) [0x104bef8a4] 2
                          1   parts128_div + 304 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389604) [0x104bef8a4] 2
                            1   frac128_div + 256 (softfloat.c:1052,5 in qemu-system-x86_64 + 3389604) [0x104bef8a4] (running) 2
                        1   floatx80_div + 496 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389588) [0x104bef894] 3
                          1   parts128_div + 288 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389588) [0x104bef894] 3
                            1   frac128_div + 240 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389588) [0x104bef894] 3
                              1   add192 + 0 (softfloat-macros.h:459,14 in qemu-system-x86_64 + 3389588) [0x104bef894] 3
                                1   uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389588) [0x104bef894] (running) 3
                        1   floatx80_div + 512 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389604) [0x104bef8a4] 4
                          1   parts128_div + 304 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389604) [0x104bef8a4] 4
                            1   frac128_div + 256 (softfloat.c:1052,5 in qemu-system-x86_64 + 3389604) [0x104bef8a4] (running) 4
                        3   floatx80_div + 496 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389588) [0x104bef894] 5-7
                          3   parts128_div + 288 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389588) [0x104bef894] 5-7
                            3   frac128_div + 240 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389588) [0x104bef894] 5-7
                              3   add192 + 0 (softfloat-macros.h:459,14 in qemu-system-x86_64 + 3389588) [0x104bef894] 5-7
                                3   uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389588) [0x104bef894] (running) 5-7
                        1   floatx80_div + 512 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389604) [0x104bef8a4] 8
                          1   parts128_div + 304 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389604) [0x104bef8a4] 8
                            1   frac128_div + 256 (softfloat.c:1052,5 in qemu-system-x86_64 + 3389604) [0x104bef8a4] (running) 8
                        3   floatx80_div + 496 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389588) [0x104bef894] 9-11
                          3   parts128_div + 288 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389588) [0x104bef894] 9-11
                            3   frac128_div + 240 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389588) [0x104bef894] 9-11
                              3   add192 + 0 (softfloat-macros.h:459,14 in qemu-system-x86_64 + 3389588) [0x104bef894] 9-11
                                3   uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389588) [0x104bef894] (running) 9-11
                        1   floatx80_div + 500 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389592) [0x104bef898] 12
                          1   parts128_div + 292 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389592) [0x104bef898] 12
                            1   frac128_div + 244 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389592) [0x104bef898] 12
                              1   add192 + 4 (softfloat-macros.h:460,14 in qemu-system-x86_64 + 3389592) [0x104bef898] 12
                                1   uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389592) [0x104bef898] (running) 12
                        3   floatx80_div + 496 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389588) [0x104bef894] 13-15
                          3   parts128_div + 288 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389588) [0x104bef894] 13-15
                            3   frac128_div + 240 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389588) [0x104bef894] 13-15
                              3   add192 + 0 (softfloat-macros.h:459,14 in qemu-system-x86_64 + 3389588) [0x104bef894] 13-15
                                3   uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389588) [0x104bef894] (running) 13-15
                        1   floatx80_div + 500 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389592) [0x104bef898] 16
                          1   parts128_div + 292 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389592) [0x104bef898] 16
                            1   frac128_div + 244 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389592) [0x104bef898] 16
                              1   add192 + 4 (softfloat-macros.h:460,14 in qemu-system-x86_64 + 3389592) [0x104bef898] 16
                                1   uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389592) [0x104bef898] (running) 16
                        1   floatx80_div + 512 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389604) [0x104bef8a4] 17
                          1   parts128_div + 304 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389604) [0x104bef8a4] 17
                            1   frac128_div + 256 (softfloat.c:1052,5 in qemu-system-x86_64 + 3389604) [0x104bef8a4] (running) 17
                        11  floatx80_div + 496 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389588) [0x104bef894] 18-28
                          11  parts128_div + 288 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389588) [0x104bef894] 18-28
                            11  frac128_div + 240 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389588) [0x104bef894] 18-28
                              11  add192 + 0 (softfloat-macros.h:459,14 in qemu-system-x86_64 + 3389588) [0x104bef894] 18-28
                                11  uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389588) [0x104bef894] (running) 18-28
                        1   floatx80_div + 508 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389600) [0x104bef8a0] 29
                          1   parts128_div + 300 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389600) [0x104bef8a0] 29
                            1   frac128_div + 252 (softfloat.c:1053,11 in qemu-system-x86_64 + 3389600) [0x104bef8a0] (running) 29
                        2   floatx80_div + 496 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389588) [0x104bef894] 30-31
                          2   parts128_div + 288 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389588) [0x104bef894] 30-31
                            2   frac128_div + 240 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389588) [0x104bef894] 30-31
                              2   add192 + 0 (softfloat-macros.h:459,14 in qemu-system-x86_64 + 3389588) [0x104bef894] 30-31
                                2   uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389588) [0x104bef894] (running) 30-31
                        1   floatx80_div + 508 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389600) [0x104bef8a0] 32
                          1   parts128_div + 300 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389600) [0x104bef8a0] 32
                            1   frac128_div + 252 (softfloat.c:1053,11 in qemu-system-x86_64 + 3389600) [0x104bef8a0] (running) 32
                        1   floatx80_div + 500 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389592) [0x104bef898] 33
                          1   parts128_div + 292 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389592) [0x104bef898] 33
                            1   frac128_div + 244 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389592) [0x104bef898] 33
                              1   add192 + 4 (softfloat-macros.h:460,14 in qemu-system-x86_64 + 3389592) [0x104bef898] 33
                                1   uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389592) [0x104bef898] (running) 33
                        11  floatx80_div + 496 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389588) [0x104bef894] 34-44
                          11  parts128_div + 288 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389588) [0x104bef894] 34-44
                            11  frac128_div + 240 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389588) [0x104bef894] 34-44
                              11  add192 + 0 (softfloat-macros.h:459,14 in qemu-system-x86_64 + 3389588) [0x104bef894] 34-44
                                11  uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389588) [0x104bef894] (running) 34-44
                        1   floatx80_div + 508 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389600) [0x104bef8a0] 45
                          1   parts128_div + 300 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389600) [0x104bef8a0] 45
                            1   frac128_div + 252 (softfloat.c:1053,11 in qemu-system-x86_64 + 3389600) [0x104bef8a0] (running) 45
                        3   floatx80_div + 496 (softfloat.c:2560,10 in qemu-system-x86_64 + 3389588) [0x104bef894] 46-48
                          3   parts128_div + 288 (softfloat-parts.c.inc:605,28 in qemu-system-x86_64 + 3389588) [0x104bef894] 46-48
                            3   frac128_div + 240 (softfloat.c:1054,9 in qemu-system-x86_64 + 3389588) [0x104bef894] 46-48
                              3   add192 + 0 (softfloat-macros.h:459,14 in qemu-system-x86_64 + 3389588) [0x104bef894] 46-48
                                3   uadd64_carry + 0 (host-utils.h:576,9 in qemu-system-x86_64 + 3389588) [0x104bef894] (running) 46-48

 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Sun Jun 04, 2023 3:29 pm 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
prajwal wrote:
The line in qemu where it was hanging was "host-utils.h:576" - which is a call to function "__builtin_addcll"

What happens if you replace line 574 with "#if 0"? That should remove the call to __builtin_addcll without changing anything else.

If that fixes it, it narrows down the possibilities a bit: it could be QEMU's build scripts enabling unsupported instructions, it could be a bug in the compiler (Clang?), or it could be a bug in the CPU.
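To make the "#if 0" experiment concrete, here is a minimal sketch of an add-with-carry helper that has both a builtin path and a plain C fallback. It is not copied from QEMU: the name uadd64_carry_sketch and the FORCE_FALLBACK switch are made up for illustration, and the fallback body is just one straightforward way to write it.

```c
#include <stdbool.h>
#include <stdint.h>

#ifndef __has_builtin
#define __has_builtin(x) 0      /* compilers that lack __has_builtin */
#endif

/* Hypothetical switch: build with -DFORCE_FALLBACK=1 to mimic "#if 0". */
#ifndef FORCE_FALLBACK
#define FORCE_FALLBACK 0
#endif

/* Returns x + y + *pcarry (mod 2^64) and stores the carry-out in *pcarry. */
static inline uint64_t uadd64_carry_sketch(uint64_t x, uint64_t y, bool *pcarry)
{
#if !FORCE_FALLBACK && __has_builtin(__builtin_addcll)
    /* Builtin path, same shape as the code quoted in this thread. */
    unsigned long long c = *pcarry;
    x = __builtin_addcll(x, y, c, &c);
    *pcarry = c & 1;
    return x;
#else
    /* Fallback path: detect wraparound by comparing against an operand. */
    uint64_t partial = x + y;
    bool c1 = partial < x;                       /* carry out of x + y */
    uint64_t sum = partial + (uint64_t)*pcarry;
    bool c2 = sum < partial;                     /* carry out of adding carry-in */
    *pcarry = c1 || c2;                          /* both cannot be set at once */
    return sum;
#endif
}
```

Both paths should give the same results for every input, so if only the builtin path misbehaves, the difference has to come from how that path gets compiled or executed.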


 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Fri Jun 16, 2023 9:35 pm 

Joined: Sat Oct 23, 2004 11:00 pm
Posts: 154
I narrowed the problem to usub64_borrow()

The problem actually seems to be with the __builtin_subcll() call.

After adding some debug printf calls, I found that (in my case), when I start QEMU:

the first call to usub64_borrow is for x = 0, y = 0, carryin = 0, and the output is result = 0 and carryout = 0
the next call to usub64_borrow is for x = 7205759403792793600, y = 7205759402719051776, carryin = 0, and the output is result = 1073741824 and carryout = 1

Since x is greater than y and carryin is 0, the carryout must be 0 in this case, but it comes back as 1. If I disable that code with #if 0 so that the alternate implementation is used, it works fine.

Because of the incorrect carryout = 1, the calculation gets into an infinite loop, and that is why QEMU hangs.

I tried the above calculation in a sample C program, but it works fine there. The problem occurs only as part of QEMU execution.

PS: The weird thing I observed is that if I pass carryin as a hardcoded "0" to __builtin_subcll() instead of passing *pborrow, then it works as expected, i.e. the carryout comes back as 0
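For reference, the expected answer for the second call can be checked with a small builtin-free helper. This is only a sketch: usub64_borrow_ref is a made-up name, and it is not the alternate implementation from host-utils.h.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical reference helper: returns x - y - *pborrow (mod 2^64)
 * and stores the borrow-out in *pborrow, using only plain C arithmetic.
 */
static inline uint64_t usub64_borrow_ref(uint64_t x, uint64_t y, bool *pborrow)
{
    uint64_t borrow_in = *pborrow;
    bool b1 = x < y;                  /* borrow out of x - y */
    uint64_t partial = x - y;         /* wraps modulo 2^64 when x < y */
    bool b2 = partial < borrow_in;    /* borrow out of subtracting borrow-in */
    *pborrow = b1 || b2;
    return partial - borrow_in;
}
```

With x = 7205759403792793600, y = 7205759402719051776 and a borrow-in of 0, this gives 1073741824 with a borrow-out of 0, which is the result the builtin path should also produce.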

 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Fri Jun 16, 2023 10:25 pm 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
I think you'll have to either disassemble usub64_borrow() or step through it at the instruction level to see what's going on. That might be difficult if it's been inlined, but there are ways to track it down.
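One possible way to get a copy that is easy to find in a disassembly is to compile the same body as a separate, non-inlined function. The sketch below assumes a GCC or Clang style compiler (for __attribute__((noinline))); the name usub64_borrow_probe is made up, and the #else branch is only there so the file still builds where the builtin is missing.

```c
#include <stdbool.h>
#include <stdint.h>

#ifndef __has_builtin
#define __has_builtin(x) 0      /* compilers that lack __has_builtin */
#endif

/* Hypothetical probe: external linkage plus noinline keeps one real symbol. */
uint64_t usub64_borrow_probe(uint64_t x, uint64_t y, bool *pborrow);

__attribute__((noinline))
uint64_t usub64_borrow_probe(uint64_t x, uint64_t y, bool *pborrow)
{
#if __has_builtin(__builtin_subcll)
    /* Same statements as the builtin path quoted in this thread. */
    unsigned long long b = *pborrow;
    x = __builtin_subcll(x, y, b, &b);
    *pborrow = b & 1;
    return x;
#else
    /* Plain C stand-in for compilers without the builtin. */
    uint64_t borrow_in = *pborrow;
    bool b1 = x < y;
    uint64_t partial = x - y;
    *pborrow = b1 || (partial < borrow_in);
    return partial - borrow_in;
#endif
}
```

A probe like this only shows what the out-of-line copy looks like. Inlined copies inside the callers can still be compiled differently, so they would need to be checked separately.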


 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Sat Jun 17, 2023 8:04 am 

Joined: Sat Oct 23, 2004 11:00 pm
Posts: 154
I suspect there is some compiler optimisation happening with bool to uint64_t conversions, so I have below patch that makes it work properly. The patch was actually needed only for usub64_borrow(); for consistency, I modified uadd64_carry() as well.

Code:
diff --git a/include/qemu/host-utils.h b/include/qemu/host-utils.h
index 3ce62bf4a5..bc9955a3ad 100644
--- a/include/qemu/host-utils.h
+++ b/include/qemu/host-utils.h
@@ -571,8 +571,8 @@ static inline bool mulu128(uint64_t *plow, uint64_t *phigh, uint64_t factor)
 static inline uint64_t uadd64_carry(uint64_t x, uint64_t y, bool *pcarry)
 {
 #if __has_builtin(__builtin_addcll)
-    unsigned long long c = *pcarry;
-    x = __builtin_addcll(x, y, c, &c);
+    volatile uint64_t c = *pcarry;
+    x = __builtin_addcll(x, y, c, (uint64_t*)&c);
     *pcarry = c & 1;
     return x;
 #else
@@ -596,8 +596,8 @@ static inline uint64_t uadd64_carry(uint64_t x, uint64_t y, bool *pcarry)
 static inline uint64_t usub64_borrow(uint64_t x, uint64_t y, bool *pborrow)
 {
 #if __has_builtin(__builtin_subcll)
-    unsigned long long b = *pborrow;
-    x = __builtin_subcll(x, y, b, &b);
+    volatile uint64_t b = *pborrow;
+    x = __builtin_subcll(x, y, b, (uint64_t*)&b);
     *pborrow = b & 1;
     return x;
 #else

 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Sat Jun 17, 2023 3:11 pm 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
prajwal wrote:
I suspect there is some compiler optimisation happening with bool to uint64_t conversions

Have you examined the code at the instruction level to see for sure?

prajwal wrote:
I have below patch that makes it work properly.

That just hides the problem, it doesn't fix it.


 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Sat Jun 17, 2023 6:06 pm 

Joined: Sat Oct 23, 2004 11:00 pm
Posts: 154
I moved the inlined usub64_borrow() function from host-utils.h to fpu/softfloat.c as a non-static, non-inline function, and then reproduced the problem with and without my workaround.

Here's the disassembly output of the original (failing) code:
Code:
0000000000016f94 <_usub64_borrow>:
;     unsigned long long b = *pborrow;
   16f94: 48 00 40 39     ldrb   w8, [x2]
;     x = __builtin_subcll(x, y, b, &b);
   16f98: 09 00 01 eb     subs   x9, x0, x1
   16f9c: ea 27 9f 1a     cset   w10, lo
   16fa0: 20 01 08 eb     subs   x0, x9, x8
   16fa4: e8 27 9f 1a     cset   w8, lo
   16fa8: 48 01 08 2a     orr   w8, w10, w8
;     *pborrow = b & 1;
   16fac: 48 00 00 39     strb   w8, [x2]
;     return x;
   16fb0: c0 03 5f d6     ret


Here's the disassembly output of the modified (working) code:
Code:
; {
   17980: ff 43 00 d1     sub   sp, sp, #16
;     volatile unsigned long long b = *pborrow;
   17984: 48 00 40 39     ldrb   w8, [x2]
   17988: e8 07 00 f9     str   x8, [sp, #8]
;     x = __builtin_subcll(x, y, b, (uint64_t*)&b);
   1798c: e8 07 40 f9     ldr   x8, [sp, #8]
   17990: 09 00 01 eb     subs   x9, x0, x1
   17994: ea 27 9f 1a     cset   w10, lo
   17998: 20 01 08 eb     subs   x0, x9, x8
   1799c: e8 27 9f 1a     cset   w8, lo
   179a0: 48 01 08 2a     orr   w8, w10, w8
   179a4: e8 07 00 f9     str   x8, [sp, #8]
;     *pborrow = b & 1;
   179a8: e8 07 40 f9     ldr   x8, [sp, #8]
   179ac: 08 01 00 12     and   w8, w8, #0x1
   179b0: 48 00 00 39     strb   w8, [x2]
;     return x;
   179b4: ff 43 00 91     add   sp, sp, #16
   179b8: c0 03 5f d6     ret

 Post subject: Re: QEMU hangs on M2 MacBook running Ventura
PostPosted: Sat Jun 17, 2023 10:13 pm 

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
Those two functions do exactly the same thing. There's no difference.

Are you sure the failing version hasn't been inlined anywhere?
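To spell out why the two listings are equivalent, here is a rough C model of the shared instruction sequence (ldrb, subs, cset lo, subs, cset lo, orr, strb). It is a sketch, not compiler output: the function name is made up, and the variables are simply named after the registers in the disassembly.

```c
#include <stdint.h>

/*
 * Hypothetical C model of the AArch64 sequence shown in the disassembly.
 * x0 and x1 are the operands, x2 points at the one-byte borrow flag.
 */
static inline uint64_t usub64_borrow_a64_model(uint64_t x0, uint64_t x1, uint8_t *x2)
{
    uint64_t x8 = *x2;          /* ldrb w8, [x2]   : zero-extended borrow-in */
    uint64_t x9 = x0 - x1;      /* subs x9, x0, x1                           */
    uint32_t w10 = x0 < x1;     /* cset w10, lo    : 1 if unsigned x0 < x1   */
    uint64_t res = x9 - x8;     /* subs x0, x9, x8                           */
    uint32_t w8 = x9 < x8;      /* cset w8, lo     : 1 if unsigned x9 < x8   */
    w8 = w10 | w8;              /* orr  w8, w10, w8                          */
    *x2 = (uint8_t)w8;          /* strb w8, [x2]                             */
    return res;
}
```

The working listing only adds a round trip through the stack and an "and w8, w8, #0x1". Since each cset produces 0 or 1, their OR is already 0 or 1, so the extra mask cannot change the stored value. For the inputs reported earlier in the thread, this model returns 1073741824 with a borrow-out of 0.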

