XFS file system testing

I am trying to test the current status of the XFS file system for Haiku, using xfs_shell.

I created a disk image containing an XFS filesystem.
I took help from the article "Working with filesystem and disk images".

  • First I created a 5 GB sparse file (the size is set by seek=5G) using $ dd if=/dev/zero of=fs.img count=0 bs=1 seek=5G

  • Then I added an XFS file system to fs.img using $ /sbin/mkfs.xfs fs.img

  • After that I mounted the image as described in the article above and added thousands of directories for testing.

I compiled xfs_shell and ran it. It compiled successfully, but when I used the ls command to list all the directories I ran into the following errors:

I think the problem lies in my XFS image file.
Any help on how to resolve it?

The way you created the image seems correct. Probably something is wrong or missing in our xfs code.

The error and debug messages are not very clear, so a good first step would be to add more debug messages to the xfs code, or to see whether extra tracing can be enabled at compile time (usually by defining a TRACE_xxx macro in the source files where you need tracing). Then it will be easier to see why our xfs code is not able to understand your test file.

Is there an example or source file where extra tracing is enabled?
I will then try to add it to the xfs code and check the error logs again.

You can enable tracing for XFS by adding a #define TRACE_XFS in src/add-ons/kernel/file_systems/xfs/Debug.h (in Haiku's main repository).

Then all the calls to the TRACE() macro in the XFS source code will produce logs in xfs_shell, making things easier to debug.
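For readers who have not seen this pattern, here is a minimal, self-contained sketch of how a compile-time trace switch of this kind usually works. It is not the contents of Haiku's Debug.h: the helper trace_printf, the log buffer gTraceLog and the function lookup_entry are all hypothetical names made up for the illustration.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Assumption: tracing is switched on by defining TRACE_XFS, as suggested
// above. Comment this line out and every TRACE() call compiles to nothing.
#define TRACE_XFS

// Hypothetical sink: keeps a copy of the trace output and echoes it to stderr.
static std::string gTraceLog;

static void trace_printf(const char* format, ...)
{
    char buffer[256];
    va_list args;
    va_start(args, format);
    std::vsnprintf(buffer, sizeof(buffer), format, args);
    va_end(args);
    gTraceLog += buffer;
    std::fputs(buffer, stderr);
}

#ifdef TRACE_XFS
#   define TRACE(...) trace_printf("xfs: " __VA_ARGS__)
#else
#   define TRACE(...) do { } while (false)
#endif

// Hypothetical file system function instrumented with a trace call.
static int lookup_entry(int index)
{
    TRACE("lookup_entry(index = %d)\n", index);
    return index * 2;
}
```

Because the format string is checked like a printf format, a TRACE() call that was never compiled before can start producing warnings or errors once the macro is enabled.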


I read the forum posts of the last GSoC student who worked on XFS.
He was able to read files using xfs_shell, so I tested with the images he shared in this post: GSoC 2021 Project: XFS support progress - #16 by island0x3f.

I was able to read the files in that image using xfs_shell. To do some more testing I added my own directories to that image and successfully tested 10k directories.

To find out why my image was giving errors, I checked the XFS version of both images.
My image uses XFS version 5, while the image I tested successfully uses XFS version 4.

This suggests that, in its current state, our code is not able to read XFS version 5 images.
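As an illustration of what such a version check looks at, the sketch below pulls the version number out of the first bytes of an image. It assumes my reading of the XFS on-disk layout: the superblock starts with the magic "XFSB", the big-endian 16-bit sb_versionnum field sits at byte offset 100, and the version is its low 4 bits. The function and constant names are made up for this example and are not taken from the Haiku sources.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed layout of the XFS superblock (see the lead-in above).
static const size_t kVersionOffset = 100;     // offset of sb_versionnum
static const uint16_t kVersionMask = 0x000f;  // low 4 bits hold the version

// Reads a big-endian 16-bit value.
static uint16_t read_be16(const uint8_t* data)
{
    return (uint16_t)((data[0] << 8) | data[1]);
}

// Returns the XFS version stored in the given superblock bytes,
// or -1 if the buffer is too small or does not start with "XFSB".
static int xfs_version_from_superblock(const uint8_t* block, size_t size)
{
    if (size < kVersionOffset + 2)
        return -1;
    if (block[0] != 'X' || block[1] != 'F' || block[2] != 'S'
        || block[3] != 'B') {
        return -1;
    }
    return read_be16(block + kVersionOffset) & kVersionMask;
}
```

To try it on a real image, read the first 512 bytes of fs.img into a buffer (for example with std::ifstream in binary mode) and pass that buffer to xfs_version_from_superblock.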


I think the idea was to implement the older versions first and then gradually add the new features. However, for now this should fail with a better message, like “xfs version 5 is not yet supported”.

Can we change the source to give a better error message?
For example, when I tried to run the mkdir command it printed “Command not supported yet”. I guess we could do the same for XFS version 5, though right now I don’t know how to do that.

Anyway, I will continue my testing on the XFS version 4 image.
I encountered a segmentation fault when running ls on a directory containing 20k directories.

It's very strange, though: at first I was able to read all 20k entries, but when I ran the ls command again it showed a segmentation fault after reading 6k directories.
Some large directories show the same "Segmentation fault: 11" error.

It looks like this error wasn’t resolved last year.
Is there any guide on how to fix this?

Yes.

For segmentation faults, you can use a debugger to investigate them, get a stack trace, and see where in the code it’s crashing.
Or just add a lot of TRACE() calls everywhere in the code to see which parts of the code are run, and where it’s failing.

Regarding tracing, I added #define TRACE_XFS in src/add-ons/kernel/file_systems/xfs/Debug.h (in Haiku’s main repository).
But when I used jam run there was a build failure.
One file failed to update, and the error message I got is:
[Screenshot 2022-04-13 at 4.23.52 PM: build error message]

There are 12 files in total.
When I remove #define TRACE_XFS, jam run compiles successfully.

You can look at the build logs in more detail. Probably some of the TRACE calls are incorrect and need to be fixed: wrong format strings, for example (the format strings work like the ones used for the standard C printf function).
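As a hedged illustration of the kind of mistake meant here (not the actual failing call, which is only visible in the build log), a typical format string bug is printing a 64-bit value with "%d". The function name format_inode is invented for this sketch.

```cpp
#include <cinttypes>
#include <cstdio>
#include <string>

// Formats a 64-bit inode number the same way a printf-style TRACE() would.
static std::string format_inode(uint64_t inode)
{
    char buffer[64];
    // Wrong: "%d" expects an int, so passing a uint64_t is a mismatch that
    // compilers report with -Wformat (an error when warnings are errors):
    //     std::snprintf(buffer, sizeof(buffer), "inode %d", inode);
    // Right: PRIu64 expands to the correct conversion for uint64_t.
    std::snprintf(buffer, sizeof(buffer), "inode %" PRIu64, inode);
    return buffer;
}
```

Such calls go unnoticed while tracing is disabled, because the macro then discards its arguments before the compiler ever checks them.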

I thought so…
The good thing is that only one file failed to build, so it's going to be easy to debug.
I will look at the build logs to find that file and the incorrect TRACE call.

I looked at the source code to get a better idea of how we can generate a better failure message, and found src/add-ons/kernel/file_systems/xfs/xfs.cpp (in Haiku's main repository).

Whenever we try to use an XFS image of any version other than 4, it shows an error because of the ASSERT call.

Since Linux now creates XFS images in version 5 by default, I added a new block there that specifically handles the version 5 case, and got a satisfactory error message.
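The idea can be sketched as follows. This is not the submitted patch and not the code in xfs.cpp: the enum, the function name check_version and the message texts are hypothetical, and the only assumption about XFS itself is that the version is stored in the low 4 bits of the superblock's version field.

```cpp
#include <cstdint>
#include <string>

// Assumption: the low 4 bits of the superblock version field hold the version.
static const uint16_t kVersionMask = 0x000f;

enum class VersionCheck { Supported, NotYetSupported, Invalid };

// Classifies the version and fills in a readable message for the
// unsupported cases, instead of stopping on an assertion.
static VersionCheck check_version(uint16_t versionNum, std::string& message)
{
    const int version = versionNum & kVersionMask;
    message.clear();

    if (version == 4)
        return VersionCheck::Supported;

    if (version == 5) {
        message = "XFS version 5 is not yet supported";
        return VersionCheck::NotYetSupported;
    }

    message = "unsupported XFS version " + std::to_string(version);
    return VersionCheck::Invalid;
}
```

Keeping version 5 as its own case means the user sees a clear message for the most common kind of image, while anything else is still rejected.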

What do you think about this, @PulkoMandy?

You can submit your fixes and changes on review.haiku-os.org (the TRACE fix and the code to show a better error for v5 images).

Then you can continue exploring the problems:

  • Either try more things with v4 images and try to identify the problems with reading these
  • Or see what would be needed for v5 support

Then you can set up a list of tasks/roadmap for things to work on next.

I submitted the patch!

For now I will try to debug the segmentation fault we encountered while testing and see if I can make some progress there.
If not, I will look at v5 support, or maybe the extended attributes feature.


I need help attaching a debugger to xfs_shell.
I am using the lldb debugger.

The steps I am following are:

  1. Run jam run xfs_shell as usual.

  2. Get the pid of xfs_shell using the pgrep -x xfs_shell command.

  3. Open a new terminal window and run lldb -p pid_of_xfs_shell.

Now lldb attaches itself to xfs_shell, but there is an initial SIGSTOP, I guess due to my local machine.
Anyway, when I issue the continue command, the process simply exits with status 0.

I tried to look for a solution to this but didn’t find any.

Ideally the process would have continued, and I would then have been able to trigger the segmentation fault as usual in xfs_shell and get a backtrace in lldb.

Command window for reference:

According to Tutorial — The LLDB Debugger it should be possible to do this:

  • First start lldb
  • Use the command process attach --name xfs_shell --waitfor

Then start xfs_shell using jam run.

This will allow lldb to attach to the process right as it is starting. Hopefully it will help.


Actually, this is almost the same as the process I followed, and unfortunately it gave me the same results.

Anyway, I installed GDB on my local machine, and with GDB I am actually able to continue the process. The problem now is that whenever a segmentation fault occurs, xfs_shell terminates the entire process.

So when I try to reproduce the segmentation fault for GDB to catch it, we can’t get any trace, because xfs_shell terminates and GDB reports

Inferior 1 (process ___) exited normally.

If we could get xfs_shell not to terminate, we could then get a backtrace.

I think I should ask about this on the mailing list to get help from other developers as well.

I’m not familiar with how lldb works. It’s strange that gdb does not see the segmentation fault.

An option is adding code in xfs_shell to “ignore” the SIGSTOP signal and not exit when receiving it.
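A small sketch for experimenting with this suggestion is shown below. One caveat to check first: POSIX does not allow SIGSTOP (or SIGKILL) to be ignored or caught, so the request is expected to fail for that signal, whereas other signals such as SIGSEGV can have their handling changed. The helper name try_ignore is made up for this example and is not part of xfs_shell.

```cpp
#include <signal.h>

// Asks the system to ignore the given signal.
// Returns true if the request was accepted, false if it was refused.
static bool try_ignore(int signalNumber)
{
    return signal(signalNumber, SIG_IGN) != SIG_ERR;
}
```

Calling try_ignore(SIGSTOP) should return false on a POSIX system, while try_ignore(SIGUSR1) should return true, so any change of this kind in xfs_shell would have to target a signal that can actually be handled.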

Another option is to start the process from inside lldb (using its run command), but I am not sure how easy that is, since you need to replicate what “jam run” does. Maybe other developers can share their tips indeed.

In my case I run these things inside Haiku, where I can attach the Haiku Debugger after the crash occurs, which is very convenient. I don’t know why this isn’t the default behavior in other systems.

The backtrace is usually found in the syslog, if you need it.

I checked and there isn’t any backtrace.