My Haiku ARM (UEFI) port progress

which qemu version are you using?
we get alignment exception on qemu 6.1 and newer

(btw welcome back! :smiley: )

2 Likes

it’s qemu 6.2.0, but it works well for riscv64.

2 Likes

yes, this is an arm-specific thing

we do have an alignment problem in the current latest hrev.
qemu up to 6.0 does not implement alignment exceptions, so alignment is not enforced there and the system is able to boot.
qemu 6.1 and later (and real hardware) generate an alignment exception, so we get stuck at the 4th icon during bootup.
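
To make the failure mode above more concrete, here is a minimal C++ sketch (not Haiku code; all helper names are hypothetical). It checks whether an address is suitably aligned, and stores a 64-bit value through memcpy, which makes no alignment assumption. A direct store through a casted pointer, by contrast, lets the compiler pick an instruction that assumes alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if `pointer` is a multiple of `alignment`.
inline bool
is_aligned(const void* pointer, std::size_t alignment)
{
	assert(alignment != 0);
	return reinterpret_cast<std::uintptr_t>(pointer) % alignment == 0;
}

// Store a 64-bit value without assuming anything about the alignment of
// `destination`; the compiler has to emit code that is safe for any address.
inline void
store_u64_unaligned(void* destination, std::uint64_t value)
{
	std::memcpy(destination, &value, sizeof(value));
}

// Matching load, again without an alignment assumption.
inline std::uint64_t
load_u64_unaligned(const void* source)
{
	std::uint64_t value;
	std::memcpy(&value, source, sizeof(value));
	return value;
}
```

Whether memcpy or fixing the alignment of the data itself is the better choice depends on the structure involved; the sketch only illustrates the difference.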

the exception comes from bfs_read_dir in add-ons/kernel/file_systems/bfs/kernel_interface.cpp
anyways i’ve been looking into exception handling on the ARM so i’d like to add a proper printout with stack trace for this case.

3 Likes

I had fixed some issues with misaligned access for the sparc port in BFS here: https://review.haiku-os.org/c/haiku/+/2363

It’s strange that ARM needs even more fixes, do you know which structures are involved already?

6 Likes

it’s the dirent structure; as I remember, there’s an strd instruction when setting the d_ino field

I have a kludge like this, i.e. round up d_reclen to a 4-byte boundary:

                dirent->d_dev = volume->ID();
                dirent->d_ino = id;
                dirent->d_reclen = ROUNDUP(offsetof(struct dirent, d_name) + length + 1, 4);

                bufferSize -= dirent->d_reclen;
                dirent = (struct dirent*)((uint8*)dirent + dirent->d_reclen);
4 Likes

with the _PACKED attribute added to struct dirent, as below:

typedef struct dirent {
    dev_t           d_dev;      /* device */
    dev_t           d_pdev;     /* parent device (only for queries) */
    ino_t           d_ino;      /* inode number */
    ino_t           d_pino;     /* parent inode (only for queries) */
    unsigned short  d_reclen;   /* length of this record, not the name */
#if __GNUC__ == 2
    char            d_name[0];  /* name of the entry (null byte terminated) */
#else
    char            d_name[];   /* name of the entry (null byte terminated) */
#endif
} _PACKED dirent_t;

With this we reach the rocket icon again on ARM. However, I checked the Linux headers for reference, and they don’t use this attribute for struct dirent. Instead they have a separate struct dirent64:

struct dirent
  {
#ifndef __USE_FILE_OFFSET64
    __ino_t d_ino;
    __off_t d_off;
#else
    __ino64_t d_ino;
    __off64_t d_off;
#endif
    unsigned short int d_reclen;
    unsigned char d_type;
    char d_name[256];		/* We must not include limits.h! */
  };

#ifdef __USE_LARGEFILE64
struct dirent64
  {
    __ino64_t d_ino;
    __off64_t d_off;
    unsigned short int d_reclen;
    unsigned char d_type;
    char d_name[256];		/* We must not include limits.h! */
  };
#endif
2 Likes

Probably _PACKED makes the compiler use unaligned-safe instructions, while without it the compiler assumes the struct is properly aligned in memory. Really it is just a matter of rounding d_reclen up to an aligned value.
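
A minimal sketch of that rounding, under stated assumptions: DirEntrySketch is a hypothetical stand-in for struct dirent (same field order, fixed-width types assumed, a one-byte d_name placeholder), and round_up and aligned_record_length are illustrative names, not Haiku functions. It rounds to the alignment of the whole structure rather than to the fixed 4 bytes used in the kludge above, so the next record in the buffer starts on an address that suits every field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for struct dirent; not the real Haiku definition.
struct DirEntrySketch {
	std::int32_t	d_dev;
	std::int32_t	d_pdev;
	std::int64_t	d_ino;
	std::int64_t	d_pino;
	std::uint16_t	d_reclen;
	char			d_name[1];
};

// Round `value` up to the next multiple of `alignment`.
inline std::size_t
round_up(std::size_t value, std::size_t alignment)
{
	assert(alignment != 0);
	return (value + alignment - 1) / alignment * alignment;
}

// Record length for a name of `nameLength` bytes plus its terminating null
// byte, padded so that the following record starts on an aligned address
// (provided the buffer itself starts aligned).
inline std::size_t
aligned_record_length(std::size_t nameLength)
{
	return round_up(offsetof(DirEntrySketch, d_name) + nameLength + 1,
		alignof(DirEntrySketch));
}
```

A caller would advance through the buffer by this length, as the kludge does with d_reclen. The padding costs at most alignof(DirEntrySketch) - 1 bytes per entry.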

2 Likes

yep I’d rather align reclen instead of marking the whole structure as packed.

first attempt - which might be not entirely right (reverted):
https://review.haiku-os.org/c/haiku/+/5623

more cautious approach:
https://review.haiku-os.org/c/haiku/+/5631

3 Likes

After the rocket icon is reached, we enter the kernel debugger:

syscall(_kern_entry_ref_to_path,5)
returning 0
Exception: Page Fault
R00=00000000 R01=00000000 R02=00000000 R03=707315d8
R04=00000000 R05=70731b98 R06=707316c0 R07=70731930
R08=183e7ac8 R09=70731680 R10=70731988 R11=707316f0
R12=01591e84 SPs=8020ef8c LRs=0201b804 PC =0201b804
             SPu=707315c8 LRu=014fca34 CPSR=20000053
FAR: 0201b804, FSR: 0000000f, isUser: 0, isWrite: 0, isExec: 1, thread: worker
PANIC: PXN violation trying to execute user-mapped address 0x0201b804 from kernel mode

Welcome to Kernel Debugging Land...
Thread 835 "worker" running on CPU 0
iframe 0x8020ef40 (end = 0x8020ef8c)
stack trace for thread 0x343 "worker"
    kernel stack: 0x8020b000 to 0x8020f000
      user stack: 0x706f2000 to 0x70732000
frame            caller     <image>:function + offset
Exception: Page Fault
R00=801f74a0 R01=00000001 R02=80aa92b0 R03=00000001
R04=00000001 R05=8020ef40 R06=00000000 R07=00000000
R08=00000000 R09=80aa92b0 R10=80aa8e80 R11=00000001
R12=00000078 SPs=8020ed54 LRs=80157138 PC =801572cc
             SPu=707315c8 LRu=014fca34 CPSR=200000d3
FAR: 00000001, FSR: 00000005, isUser: 0, isWrite: 0, isExec: 0, thread: worker


I am not sure whether this is caused by the “worker” thread, which is used for kernel debugging.

3 Likes

I don’t understand entirely what’s happening but somehow the return address (or SPSR) gets screwed up when returning from the exception handler. (PULL_FRAME and/or PULL_FRAME_FROM_SVC_AND_EXIT)

Some experimental code - if I disable interrupts before pull_frame, I don’t get any PXN violations any more. I took the idea from FreeBSD exception handling code.
https://review.haiku-os.org/c/haiku/+/5633

reference code from FreeBSD: exception.S
the DO_AST macro does a bunch of things but as I understand it leaves interrupts disabled at the end.

There’s also a patch from Kallisti5 that enables proper stack frame tracing:
https://review.haiku-os.org/c/haiku/+/4873

2 Likes

That is not directly related.

Historically Linux had 32bit everything, including off_t and ino_t. That prevented files larger than 2GB and limited the number of files on disk.

Later on they introduced 64bit versions of these, but for backwards compatibility it is still possible to get the old ones, and it is possible in two ways:

  • Initially, they kept the old “dirent” the same as it was. They introduced a new “dirent64” and a lot of new functions that handled everything 64bit, in a feature named __USE_LARGEFILE64. You can guess what happened next: the new functions were nonstandard, most apps did not bother to use them, no progress was made except for specific apps which really had a need for large files or lots of files.
  • Then, they understood this was not going to work; and so they introduced another feature where the original dirent (and all related functions) would also switch to 64bit.

Now both structures are identical and they are kept for compatibility reasons.

Note that glibc is working on switching time_t to 64bit in a similarly confusing and complicated way. Probably as 2038 gets nearer they will, again, change their mind, and find a way to make time_t a 64bit value. I hope.

1 Like

After some debugging I found it’s still blocked in launch_daemon.

launch_daemon process(ARM):

launch_daemon: read file /boot/system/data/launch/system
  add job "x-vnd.haiku-registrar"
  add job "x-vnd.haiku-debug_server"
  add job "x-vnd.haiku-package_daemon"
  add job "x-vnd.haiku-systemlogger"
  add job "x-vnd.haiku-mount_server"
  add job "x-vnd.haiku-media_server"
    event: or [initial_volumes_mounted]
  add job "x-vnd.haiku-midi_server"
    event: or [demand]
  add job "x-vnd.haiku-net_server"
  add job "x-vnd.be-psrv"
    event: or [demand]
  add job "x-vnd.haiku-notification_server"
    event: or [demand]
  add job "x-vnd.haiku-power_daemon"
  add job "x-vnd.haiku-cddb_lookup"
    event: or [volume_mounted]
  add job "x-vnd.haiku-autologin"
Register external event 'initial_volumes_mounted': -2147483641
BRoster::_LaunchApp()BRoster::_LaunchApp()BRoster::_LaunchApp()  find app: No error (0)  
  build argv: No error (0)
  token: 0
  find app: No error (0) application/x-vnd.Haiku-mount_server 
  build argv: No error (0)
  token: 0
  find app: No error (0) application/x-vnd.Haiku-debug_server 
  build argv: No error (0)
  token: 0

THEN IT HANGS

launch_daemon process(RISCV64):

launch_daemon: read file /boot/system/data/launch/system
  add job "x-vnd.haiku-registrar"
  add job "x-vnd.haiku-debug_server"
  add job "x-vnd.haiku-package_daemon"
  add job "x-vnd.haiku-systemlogger"
  add job "x-vnd.haiku-mount_server"
  add job "x-vnd.haiku-media_server"
    event: or [initial_volumes_mounted]
  add job "x-vnd.haiku-midi_server"
    event: or [demand]
  add job "x-vnd.haiku-net_server"
  add job "x-vnd.be-psrv"
    event: or [demand]
  add job "x-vnd.haiku-notification_server"
    event: or [demand]
  add job "x-vnd.haiku-power_daemon"
  add job "x-vnd.haiku-cddb_lookup"
    event: or [volume_mounted]
  add job "x-vnd.haiku-autologin"
Register external event 'initial_volumes_mounted': -2147483641
BRoster::_LaunchApp()  find app: No error (0)  
  build argv: No error (0)
  token: 0
BRoster::_LaunchApp()BRoster::_LaunchApp()  find app: No error (0) application/x-vnd.Haiku-debug_server 
  build argv: No error (0)
  token: 0
  find app: No error (0) application/x-vnd.Haiku-mount_server 
  build argv: No error (0)
  token: 0
slab memory manager: created area 0xffffffc015001000 (2991)
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
x-vnd.haiku-autologin run BRoster::_LaunchApp()  find app: No error (0) application/x-vnd.Haiku-net_server 
  build argv: No error (0)
  token: 0
BRoster::InitMessengers()
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
x-vnd.haiku-debug_server run BRoster::_LaunchApp()  find app: No error (0) application/x-vnd.haiku-package_daemon 
  build argv: No error (0)
  token: 0
BRoster::InitMessengers()
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
x-vnd.haiku-mount_server run BRoster::_LaunchApp()  find app: No error (0) application/x-vnd.Haiku-powermanagement 
  build argv: No error (0)
  token: 0
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
x-vnd.haiku-net_server run BRoster::_LaunchApp()  find app: No error (0) application/x-vnd.haiku-registrar 
  build argv: No error (0)
  token: 0
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
x-vnd.haiku-package_daemon run BRoster::_LaunchApp()  find app: No error (0) application/x-vnd.Haiku-SystemLogger 
  build argv: No error (0)
  token: 0
BRoster::InitMessengers()
Last message repeated 2 times.
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
x-vnd.haiku-power_daemon run BRoster::InitMessengers()
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
  found roster port
BRoster::InitMessengers() done
x-vnd.haiku-registrar run   found roster port
BRoster::InitMessengers() done
  found roster port
BRoster::InitMessengers() done
  found roster port
BRoster::InitMessengers() done
  found roster port
BRoster::InitMessengers() done
  found roster port
BRoster::InitMessengers() done
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
x-vnd.haiku-systemlogger run BRoster::InitMessengers()
BRoster::InitMessengers()
BRoster::InitMessengers() done
  no (useful) reply from roster: error: 80001200: Bad port ID
  found roster port
BRoster::InitMessengers() done
  got reply from roster
Last message repeated 6 times.
REG: Failed to open shadow passwd DB file "/etc/shadow": No such file or directory

6 Likes

do you have any more specific error message or register/memory dump from qemu? (e.g. using “info registers” and “x” commands)

i’m not sure about our setjmp implementation.
for one thing, it doesn’t save the FPU context, and the jmp_buf definition and the way the regular registers are handled might also not be entirely right.

the current implementation sets _SETJMP_BUF_SZ to (15+2)*4 = 34 which is the number of ints so if I’m not mistaken we end up reserving 136 bytes for the jmp_buf structure.
sigsetjmp stores r0-14 and cpsr
siglongjmp restores r0, r2-r14

:face_with_raised_eyebrow:

edit: I’m trying to look at FreeBSD setjmp.S as reference code, based on that the following seems to be a good approach:
save registers r4-r14, FPU registers d8-d15, FPU status register fpscr
probably we can clobber r0-r3 as these are caller-saved
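
A quick size check for that register set, as a hedged sketch: the constant names are hypothetical, and the 34-int figure is simply the buffer size quoted earlier in this post. It counts r4-r14 as 11 words, d8-d15 as 8 registers of 64 bits each, and fpscr as one word. Any signal-related state (such as a saved signal mask) is not counted.

```cpp
#include <cstddef>
#include <cstdint>

// Registers proposed for saving (illustrative constants, not Haiku's).
constexpr std::size_t kCoreRegisterCount = 11;	// r4-r14
constexpr std::size_t kFpuRegisterCount = 8;	// d8-d15, 64 bits each
constexpr std::size_t kWordsPerFpuRegister
	= sizeof(std::uint64_t) / sizeof(std::uint32_t);
constexpr std::size_t kStatusWordCount = 1;	// fpscr

constexpr std::size_t kWordsNeeded = kCoreRegisterCount
	+ kFpuRegisterCount * kWordsPerFpuRegister + kStatusWordCount;
constexpr std::size_t kBytesNeeded = kWordsNeeded * sizeof(std::uint32_t);

// Buffer size as quoted in the post: 34 ints of 4 bytes each.
constexpr std::size_t kQuotedBufferWords = 34;
constexpr std::size_t kQuotedBufferBytes
	= kQuotedBufferWords * sizeof(std::uint32_t);

static_assert(kWordsNeeded == 28, "11 + 8 * 2 + 1 words");
static_assert(kBytesNeeded == 112, "28 words of 4 bytes");
static_assert(kQuotedBufferBytes == 136, "34 ints of 4 bytes");
static_assert(kWordsNeeded <= kQuotedBufferWords,
	"the proposed register set fits in the quoted buffer");
```

So, under these assumptions, the proposed set needs 28 words (112 bytes) and would fit in a 34-int buffer with 6 words to spare.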

FreeBSD also seems to store a bunch of signal related stuff in the jump buffer which is probably handled in a different way in Haiku but I’m not sure

edit 2: there might be something wrong with Thread->fault_handler_state
it’s supposed to be a jmp_buf but someone is misusing it for something else
- it was only a build issue

5 Likes

I have checked the thread scheduling: kernel threads run normally (so longjmp should be OK), but user threads do not.

2 Likes

with an alignment directive added (modelled on Linux’s arch/arm/kernel/entry-v7m.S), we get a few steps further:

	// jump to user mode entry point
	movs	pc, lr
FUNCTION_END(arch_return_to_userland)
    .data
    .align	10

now, we can reach registrar:

Register external event 'initial_volumes_mounted': -2147483641
BRoster::_LaunchApp()enter to userland thread "worker" (836)
arch_thread_enter_userspace: entry 0x1a70d2c, args 0x00d6bab8 0x1852bc48, ustack_top 0x71b0e000
BRoster::_LaunchApp()BRoster::_LaunchApp()  find app: No error (0) application/x-vnd.Haiku-mount_server 
  build argv: No error (0)
  token: 0
_user_load_image: argc = 1
load_image_internal: name '/boot/system/servers/mount_server', args = 0xcd47f2a8, argCount = 1
  find app: No error (0) application/x-vnd.Haiku-debug_server 
  build argv: No error (0)
  token: 0
_user_load_image: argc = 1
load_image_internal: name '/boot/system/servers/debug_server', args = 0xcd47f1e8, argCount = 1
  find app: No error (0)  
  build argv: No error (0)
  token: 0
_user_load_image: argc = 1
load_image_internal: name '/boot/system/bin/autologin', args = 0x80ca4800, argCount = 1
team_create_thread_start: entry thread 839
team_create_thread_start: loading elf binary '/boot/system/bin/autologin'
elf_load_user_image e_entry 0xd7f4, delta 0x1847000
team_create_thread_start: loaded elf. entry = 0x18547f4
enter to userland thread "autologin" (839)
arch_thread_enter_userspace: entry 0x18547f4, args 0x72bd7100 0x621cf000, ustack_top 0x72bd7000
team_create_thread_start: entry thread 838
team_create_thread_start: loading elf binary '/boot/system/servers/debug_server'
elf_load_user_image e_entry 0xd7f4, delta 0x4aa000
team_create_thread_start: loaded elf. entry = 0x4b77f4
enter to userland thread "debug_server" (838)
arch_thread_enter_userspace: entry 0x4b77f4, args 0x71ff2100 0x61864000, ustack_top 0x71ff2000
team_create_thread_start: entry thread 837
team_create_thread_start: loading elf binary '/boot/system/servers/mount_server'
elf_load_user_image e_entry 0xd7f4, delta 0x76c000
team_create_thread_start: loaded elf. entry = 0x7797f4
enter to userland thread "mount_server" (837)
arch_thread_enter_userspace: entry 0x7797f4, args 0x71294100 0x61ad5000, ustack_top 0x71294000
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
x-vnd.haiku-autologin run BRoster::_LaunchApp()  find app: No error (0) application/x-vnd.Haiku-net_server 
  build argv: No error (0)
  token: 0
_user_load_image: argc = 1
load_image_internal: name '/boot/system/servers/net_server', args = 0xcd47f2a8, argCount = 1
BRoster::InitMessengers()
team_create_thread_start: entry thread 840
team_create_thread_start: loading elf binary '/boot/system/servers/net_server'
elf_load_user_image e_entry 0xd7f4, delta 0x53a000
team_create_thread_start: loaded elf. entry = 0x5477f4
enter to userland thread "net_server" (840)
arch_thread_enter_userspace: entry 0x5477f4, args 0x71d92100 0x618da000, ustack_top 0x71d92000
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
x-vnd.haiku-debug_server run BRoster::_LaunchApp()  find app: No error (0) application/x-vnd.haiku-package_daemon 
  build argv: No error (0)
  token: 0
_user_load_image: argc = 1
load_image_internal: name '/boot/system/servers/package_daemon', args = 0x80979f10, argCount = 1
team_create_thread_start: entry thread 841
team_create_thread_start: loading elf binary '/boot/system/servers/package_daemon'
elf_load_user_image e_entry 0xd7f4, delta 0x14c8000
team_create_thread_start: loaded elf. entry = 0x14d57f4
enter to userland thread "package_daemon" (841)
arch_thread_enter_userspace: entry 0x14d57f4, args 0x727e6100 0x61fad000, ustack_top 0x727e6000
BRoster::InitMessengers()
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
x-vnd.haiku-mount_server run BRoster::_LaunchApp()  find app: No error (0) application/x-vnd.Haiku-powermanagement 
  build argv: No error (0)
  token: 0
_user_load_image: argc = 1
load_image_internal: name '/boot/system/servers/power_daemon', args = 0x80979f10, argCount = 1
team_create_thread_start: entry thread 842
team_create_thread_start: loading elf binary '/boot/system/servers/power_daemon'
BRoster::InitMessengers()
elf_load_user_image e_entry 0xd7f4, delta 0x7f4000
team_create_thread_start: loaded elf. entry = 0x8017f4
enter to userland thread "power_daemon" (842)
arch_thread_enter_userspace: entry 0x8017f4, args 0x7142a100 0x602a3000, ustack_top 0x7142a000
  load image: No error (0)
  set thread and team: No error (0)
  resume thread: No error (0)
BRoster::_LaunchApp() done: No error (0)
x-vnd.haiku-net_server run BRoster::_LaunchApp()  find app: No error (0) application/x-vnd.haiku-registrar 
  build argv: No error (0)
  token: 0
_user_load_image: argc = 1
load_image_internal: name '/boot/system/servers/registrar', args = 0x80979cd0, argCount = 1
team_create_thread_start: entry thread 843
team_create_thread_start: loading elf binary '/boot/system/servers/registrar'
elf_load_user_image e_entry 0xd7f4, delta 0x2118000
team_create_thread_start: loaded elf. entry = 0x21257f4
enter to userland thread "registrar" (843)
arch_thread_enter_userspace: entry 0x21257f4, args 0x733ef100 0x61c35000, ustack_top 0x733ef000
BRoster::InitMessengers()
17 Likes

After some more hacking and debugging, the rocket finally hung (all the processes started by launch_daemon stopped here). They are stuck in VMCache::WaitForPageEvents(vm_page* page, uint32 events, bool relock), called from fault_get_page while running “vm_soft_fault”. We will find the bug soon :wink:

18 Likes

@davidkaroly @korli @waddlesplash

VMCache::WaitForPageEvents adds a pointer to the local variable waiter to the shared fPageEventWaiters list and makes it the head of that list. Is this OK for the VM management?

void
VMCache::WaitForPageEvents(vm_page* page, uint32 events, bool relock)
{
	PageEventWaiter waiter;
	waiter.thread = thread_get_current_thread();
	waiter.next = fPageEventWaiters;
	waiter.page = page;
	waiter.events = events;

	fPageEventWaiters = &waiter;

	thread_prepare_to_block(waiter.thread, 0, THREAD_BLOCK_TYPE_OTHER, page);

	Unlock();
	thread_block();

	if (relock)
		Lock();
}

fPageEventWaiters is then processed as below:

/*!	Wakes up threads waiting for page events.
	\param page The page for which events occurred.
	\param events The mask of events that occurred.
*/
void
VMCache::_NotifyPageEvents(vm_page* page, uint32 events)
{
	PageEventWaiter** it = &fPageEventWaiters;
	while (PageEventWaiter* waiter = *it) {
		if (waiter->page == page && (waiter->events & events) != 0) {
			// remove from list and unblock
			*it = waiter->next;
			thread_unblock(waiter->thread, B_OK);
		} else
			it = &waiter->next;
	}
}
3 Likes

The thread will block until it is woken up, so the on-stack PageEventWaiter is valid for the duration of the wait.
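
The same pattern can be sketched in portable C++ as a rough analogue (not the kernel code): std::mutex and std::condition_variable stand in for the cache lock and thread_block()/thread_unblock(), and the class and member names are hypothetical. The point it illustrates is that the notifier unlinks the node before waking the waiter, so nothing refers to the stack frame once the waiting function returns.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical analogue of PageEventWaiter; `page` is just an ID here.
struct Waiter {
	Waiter*			next;
	int				page;
	std::uint32_t	events;
	bool			woken;
};

class PageEventListSketch {
public:
	void WaitForPageEvents(int page, std::uint32_t events)
	{
		std::unique_lock<std::mutex> lock(fLock);
		// The node lives on this thread's stack and becomes the list head.
		Waiter waiter{fWaiters, page, events, false};
		fWaiters = &waiter;
		// Blocks (releasing the lock) until a notifier has unlinked the
		// node and set `woken`; the frame stays alive for the whole wait.
		fCondition.wait(lock, [&waiter] { return waiter.woken; });
	}

	void NotifyPageEvents(int page, std::uint32_t events)
	{
		std::lock_guard<std::mutex> lock(fLock);
		Waiter** it = &fWaiters;
		while (Waiter* waiter = *it) {
			if (waiter->page == page && (waiter->events & events) != 0) {
				*it = waiter->next;		// unlink first ...
				waiter->woken = true;	// ... then allow the waiter to return
			} else
				it = &waiter->next;
		}
		fCondition.notify_all();
	}

	std::size_t CountWaiters()
	{
		std::lock_guard<std::mutex> lock(fLock);
		std::size_t count = 0;
		for (Waiter* waiter = fWaiters; waiter != nullptr; waiter = waiter->next)
			count++;
		return count;
	}

private:
	std::mutex				fLock;
	std::condition_variable	fCondition;
	Waiter*					fWaiters = nullptr;
};
```

In this sketch a waiter whose page or event mask does not match simply goes back to sleep after notify_all(), which mirrors how _NotifyPageEvents only unblocks matching waiters.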

2 Likes

can it be related to accessed/modified flags?

if so, this is not really a bug but rather an unimplemented part as we don’t have any accessed and modified flags yet.

1 Like