How to store file references efficiently/inode to file

(While I’m testing beta2 and reporting issues, trying to be a useful developer and aware that I should help finalize the release, I can’t help but prototype an idea I have been brooding over for ages now, going back to classic BeOS times…:smirk:)

I need to store references to files and directories in an efficient way in attributes (more context later, for now please don’t ask :smirk:).
For that, I want to store the inode, so I don’t have to rely on the path and can resolve the file as long as it stays on the same device.

I can easily get the inode of a file and store that, and I’m aware that I need the directory to go back again, but so far have been unable to resolve the file from the inode.
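For readers following along, the "easy" direction (file to inode) can be sketched with plain POSIX `stat()`, which is available on Haiku as well. `FileId`, `GetFileId`, `SameFile` and `FormatFileId` are hypothetical helper names made up for this illustration, not part of the Storage Kit.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/stat.h>
#include <sys/types.h>

// Hypothetical helper type: the (device, inode) pair that identifies a node.
struct FileId {
	dev_t device;
	ino_t inode;
};

// Fill `id` from the file at `path`. Returns false if stat() fails.
bool GetFileId(const char* path, FileId& id)
{
	assert(path != nullptr);
	struct stat st;
	if (stat(path, &st) != 0)
		return false;
	id.device = st.st_dev;
	id.inode = st.st_ino;
	return true;
}

// Two ids refer to the same node only if both device and inode match.
bool SameFile(const FileId& a, const FileId& b)
{
	return a.device == b.device && a.inode == b.inode;
}

// Format the id as "device:inode" text, e.g. to store it in a string attribute.
// Returns the value of snprintf().
int FormatFileId(const FileId& id, char* buffer, std::size_t size)
{
	return std::snprintf(buffer, size, "%lld:%lld",
		static_cast<long long>(id.device), static_cast<long long>(id.inode));
}
```

An id obtained this way should stay the same when the file is renamed or moved within the same volume. Filesystems may generally reuse an inode number after the file is deleted, so a stored id is only meaningful while the file exists.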

It’s been a long time since I’ve used the storage kit and friends, but I can’t quite wrap my head around the API regarding traversal between nodes, refs and stats.
It’s easy to go from high level to low level (file to inode), but not back again: for some reason, the Storage Kit only lets me construct objects from higher-level structures, and I can’t, for example, create an entry_ref from an inode.

I can get from inode to stat, but not to entry.
I’m sorry if this is a bit confusing, I don’t have the code at hand right now as my only HaikuOS capable laptop has issues.

So I’m kind of lost in low level land and seem to fall in the gaps of the API.
Any helping pointer (literally :smirk:) here?

Hmm, the public API only allows creating BDirectory objects from a node_ref, not BFile.
It’s not really a good idea anyway, as it depends on BFS, where inodes point to their parent directory…

Thanks @mmu_man, I was hoping for a more positive answer but it seems not possible, which is a bit strange since the API is only one way then…
Dependency on BFS is no problem for me however, as my idea would build on it anyway…
Do you see an alternative for storing file references in an efficient and location agnostic (i.e. stable and consistent) way then?

Maybe when it’s time for the application context to become clearer, it will be easier to say what you need. Are you saying you want to store the path to the file, in the file … in a location agnostic way? Maybe you could elaborate.

Sorry for the confusion and lack of context @donn. I need to store references from one file/dir to other files/dirs in an efficient way, so the full path would be too much bloat, and it would also break when the referenced files are moved.
However upon usage, I’d need to resolve the current location of the file again.
I want to use the power of BFS to do this, instead of having to depend on a database. It should be as HaikuOS native as possible.

As they said back in the Be days… “tell us why you need something, we’ll tell you why you don’t.” :smiley:

The API is indeed one-way, and so are the filesystem structures. If all you have is the inode, you have very fast access to the file contents, but you don’t have access to the directory containing that file. Indeed, if there are hardlinks there could be multiple directories pointing to that same file.
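The hard link point can be seen with a small POSIX sketch, assuming a filesystem that supports hard links (not every filesystem does). `MakeHardLink`, `SameNode` and `CountLinks` are hypothetical helper names used only for this illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/stat.h>
#include <unistd.h>

// Create a second directory entry `newPath` for the node behind `existing`.
bool MakeHardLink(const char* existing, const char* newPath)
{
	assert(existing != nullptr && newPath != nullptr);
	if (link(existing, newPath) != 0) {
		std::perror("link");
		return false;
	}
	return true;
}

// True if both paths resolve to the same (device, inode) pair.
bool SameNode(const char* a, const char* b)
{
	struct stat sa;
	struct stat sb;
	if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
		return false;
	return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

// Number of directory entries pointing at the node, or -1 on error.
long CountLinks(const char* path)
{
	struct stat st;
	if (stat(path, &st) != 0)
		return -1;
	return static_cast<long>(st.st_nlink);
}
```

After a successful `MakeHardLink("a", "b")`, both names report the same inode and a link count of 2. The inode alone therefore cannot say which name, or which directory, is "the" location of the file.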

This is documented at The Be Book - System Overview - The Storage Kit

From an inode you can get to one of the BNode subclasses: BFile, BDirectory or BSymlink. But you can’t get back to an entry_ref, because the entry_ref info does not come from the place on disk where the node is stored. It comes from the parent directory, and there is no way in the filesystem to get from a node to its parent directory.

Thanks for the excellent background information @PulkoMandy, I get this now.

But your last paragraph gives me hope - actually I don’t care about the directory of the referenced file, I only need the File itself, but found no suitable way to construct that from an inode (and device_no if necessary).
That’s why I was going down the rabbit hole with Entries, entry_refs and nodes from the start ;-) and for that I needed a directory.
If there’s another way, directly using BFile or superclasses, I’m all for it.

There is no way to do this for files. It can, however, be done for directories: if you have a node_ref pointing to a directory, you can create a BDirectory from it. This is because directories can access their parent (through the “..” directory), while files can’t.

So, the best you could get is storing a node_ref to the parent directory, and the name of the file inside that directory. And this is, in fact, exactly what an entry_ref does.

The reason for this is that while the inode number is sufficient to access the file data, allowing a file to be opened solely from it would prevent checking that the file is indeed accessible, taking into account the permissions of the file itself and its parent directories, chroots, and anything else that may prevent access to files inside or outside a given path.

In BeOS there is an _open_vn_ syscall, although it also has a path argument, so not sure how it worked, but Haiku doesn’t have that anyway.

It all depends what you want to do, is it for a few small files? Maybe you can tag them with an attribute and use a query…

Thanks again @PulkoMandy, I was not aware of the rationale regarding permissions (as we don’t have them in BeOS/HaikuOS anyway :smiling_imp:), but they could still be checked when trying to access that file. The API could still provide me a (read-only, if you will) reference to that file from the inode…

Using name+parent dir would easily break my use case using references between files when you rename or move them…

It could potentially be system-wide, so efficient storage and fast retrieval are essential.
I intended to do just that and use queries though, like you suggested :smirk:
So if I could just get a “pointer” to a file that resolves in a stable way, that would be awesome.

If performance is important and you want to potentially impact all files, another option that didn’t exist in BeOS is filesystem overlays. You could in theory add extra indices over existing filesystems, and even preempt queries before they reach the filesystem to resolve those you want directly.
They run in the kernel though, so you need extra care in the code, and a way to communicate with whatever userland stuff you need. Or maybe the existing userlandfs framework also supports running overlays?

Thanks that’s a really good pointer, I wanted to keep it simple but it seems this is the way to go then.
Will check this out…
Still would love to use BFS features as much as possible to keep it native and tightly integrated though, will see how to best combine this…

Well, you should really describe what you want with others, maybe on IRC, that would help figuring out which way to go.

Will definitely do that once I have a concrete proposal ready :smirk: you’re welcome to join the mission if you’re convinced then :relieved:

Filesystem overlays seem to do what I need, but I could hardly find any current information on them for Haiku @mmu_man. Especially the part about adding indices and rewriting queries sounds great. Do you have any links/docs, or should I search the code base?
I’ve only found 2 old articles about Ingo’s netfs and fs plugins for userlandfs.
What actually happened to netfs? Sounds interesting and I don’t like samba (bad performance and system hangs with parallel writes of big files regardless of config)

This bugged me until last week, when I finally had my eureka moment and realized I can just store the inodes of referenced files in an attribute (I don’t care about the file/entry semantics, I just need a quick and easy but stable and unique id that’s strongly tied to the file).
To retrieve the referenced file(s) later, I query for files with that attribute and id, which gives me back the full path of all referenced files 😌
So with this blocker out of my way, I can finally start my prototype and reveal some more context later :star_struck:
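As a rough illustration of the query side of this idea, the sketch below only builds the predicate string for such a lookup. The attribute name `MYAPP:ref_id` and the helper `BuildRefPredicate` are assumptions made up for this example; the post does not say which attribute name or type is actually used.

```cpp
#include <cassert>
#include <string>

// Build a query predicate of the form <attribute>==<id>.
// `attribute` is a hypothetical, application-chosen attribute name.
std::string BuildRefPredicate(const std::string& attribute, long long id)
{
	assert(!attribute.empty());
	return attribute + "==" + std::to_string(id);
}

// Example: BuildRefPredicate("MYAPP:ref_id", 123456) yields the text
// MYAPP:ref_id==123456, which on Haiku would presumably be passed to
// BQuery::SetPredicate() before calling Fetch().
```

For the query to return anything, the attribute needs an index on the volume, since BFS queries only work on indexed attributes.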
