Database as a file system

Androsio · July 15, 2024, 6:46am

I would like this thread to be for constructive conversation and not a heated fight.

I have been reading the book written by Dominic Giampaolo, former Be engineer.

During the development of the first version of BeOS (AT&T Hobbit CPU, 1992), a database synchronized file system (OFS) was used.

After the cancellation of the Hobbit CPU, Be engineers changed architecture (PowerPC 603, 1994), and designed a new file system, which had database capabilities (BFS).

Now comes the big question.

If OFS is a file system synchronized with a database, and BFS is a file system with database capabilities, could a database be created that does the file system functions?

I imagine it as the logical evolution of BFS.

What do you think?

nephele · July 15, 2024, 7:05am

IIRC one of the stated reasons was performance, but leaving that aside (after all, drives are much faster nowadays)

I think that yes, indeed this can be done. In the Haiku API we already make heavy use of find_directory, for example. For configuration files this could be improved further with a get_configuration_entry API, which would just give you the handle appropriate to your application’s configuration.

From the user’s point of view, not having a hierarchical file system could be quite nice. On mobile operating systems, not having to deal with this seems to be the norm, and many users do use that.
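A minimal sketch of what such a call might look like, written in Python for brevity. Only the name get_configuration_entry comes from the suggestion above. The signature, the SETTINGS_ROOT location and the file naming scheme are assumptions made for illustration, not an existing Haiku API.

```python
import os
import tempfile

# Hypothetical: stands in for whatever settings directory the system would
# choose (on Haiku, something find_directory() could return).
SETTINGS_ROOT = tempfile.mkdtemp(prefix="settings-")


def get_configuration_entry(app_signature, mode="a+"):
    """Return an open file handle for the application's configuration.

    The caller passes only its application signature and never builds
    or sees a path.
    """
    # Illustrative naming scheme: make the signature safe as a file name.
    safe_name = app_signature.replace("/", "_")
    return open(os.path.join(SETTINGS_ROOT, safe_name), mode)


# Write a setting, then read it back through the same entry point.
with get_configuration_entry("application/x-vnd.example-app") as handle:
    handle.write("window_width=640\n")

with get_configuration_entry("application/x-vnd.example-app", "r") as handle:
    stored = handle.read()
```

The point of the design is that where the settings live becomes an implementation detail the system is free to change.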

donn · July 15, 2024, 9:09am

I wish I could remember better what was going on with the early DR versions that came with the BeBox. There was a break in there - during the BeOS timeline - where the early releases had some DB aspects that by more production releases had gone out the window. All I really remember was that there was an annoying limitation on record length, which meant really large files needed some kind of indirection.

I don’t remember missing it a bit, but I’m not a DB guy.

phschafft · July 15, 2024, 12:20pm

I’m working on something like that for some time now.

I think the main problem is how users and software interact with it. If you want it both POSIX-y and a database you’ll fail. You can have both, but one way will always be the slow one. To really shine the way one interacts with it needs to be different from how users typically think of filesystems today.

I also found that the more you move towards some specific patterns (such as that files are not all the same but have a defined life cycle) you improve the situation drastically.

As for drive speeds: they became faster, but they also became bigger. I wouldn’t say it makes too much difference in the end. (specifically given that drive speed improvements are only a constant factor, while e.g. number of inodes is an argument to the complexity of operations.)

All that said, of the different “major” operating systems I have worked with so far, Haiku and Android are the two that are closest to what is needed on the OS side. My work is not specifically targeted at Haiku, but it is on my list of platforms kept in mind. And if Haiku is looking for an alternative filesystem, who knows what this could result in.

If you have interest I’m also in the Haiku channel on IRC. Also always happy to present some of my work in a little more presentation/workshop setting.

syd · July 15, 2024, 10:16pm

There have been a few DB OSs developed: Pick OS and spinoff OpenQM. The latest is DBOS - with all sorts of claims of efficiency, security, and challenging Linux. They usually end up being proprietary, in the cloud, and designed for business use (i.e., servers).

So a desktop DB OS would be cool, and a novel selling point.

phschafft · July 15, 2024, 10:41pm

That is part of why I’m currently working on it. Besides it being in my general research area, I have this toy project which is a very database-centred OS (but it seems unlike the ones you referenced). However, I don’t claim it to be better in any way, nor useful to others.

Back to topic:
It would still be interesting to see some of my work ending up in Haiku. After all doing some FS stuff for two systems might overall broaden the vision and result in better quality.

I also wondered after my last post in this thread if there is anyone actively working on a replacement filesystem. Haven’t found much here besides “we could/we should” posts from the past. If anyone is working on something, feel free to ping me.

SamuraiCrow · July 16, 2024, 5:35am

Glass Elevator or Current Betas?

More than half of software development is designing the algorithm. I’m not presently working on anything database related but I, for one, think that having BeFS upgraded to BeFS 2.0 or a more revolutionary-sounding HaikuFS could be a killer feature to have. It might even put to shame the Windows Vista feature called WinFS that never actually came to pass (as a BeOS thunder stealer).

If you’re willing to outline your ideas here, maybe this could be drafted into a “glass elevator” feature for adding when Haiku release 1 is complete. If you’re not willing to wait that long, I suspect that incorporating your database features will require some Haiku devs to make time to help, in other words replies to forum posts may be most of what you’ll get.

Probing Questions

If you could answer the following questions it will make decision-making easier.

Question 1: What kind of database does your database OS feature? Graph? Relational SQL? NoSQL? Some hybrid to rule them all?
Question 2: Can it be made backward compatible with BeFS, feature for feature? This would obviously smooth the transition.
Question 3: Would you need more Haiku documentation to answer question 2?

BlueSky · July 16, 2024, 8:34am

Don’t get me wrong, I don’t want to discredit any of the ideas presented here (I don’t know enough about either file system or database internals to do that anyway).

But this sounds to me like a solution looking for a problem, at least at the moment. The big question should be: what new features do we want for the user? After that we can decide if it can be built on top of our existing file system (maybe with some enhancements) or if we need/want a new FS design (possibly involving some kind of database).

This is all R2 stuff, of course.

PulkoMandy · July 16, 2024, 9:09am

I’m a bit confused by this. My understanding, from reading Be newsletters about this, is that the old filesystem was essentially a filesystem built on top of a database. So, exactly what you are looking for here.

It turned out that:

At the time, people were too used to working with files and directories, and using a database was a bit confusing. This may be different, if Be had started a few years earlier (before directories became such a common concept) or a few years later (iOS and Android both do not use directories so visibly in the user interface)
It is not really possible to build a good, fast filesystem this way. Some features work almost-but-not-quite as expected, or implementing them exactly right is a problem, especially if you have compatibility with some standards (such as POSIX) in mind
Organizing and navigating things mainly as a database is complicated. You have to write queries to find files. This is a bit opaque and not easily represented graphically as is the case with directory navigation
Filesystem corruption was harder to recover from (maybe more a problem of implementation than theoretical design)
In the end, you really only need very few database-like functions in your filesystem

That’s how Be ended up with BFS. A good filesystem with a minimal set of database-like functionalities, largely sufficient for what they really wanted to do with database-like features. Even today, I think we are far from reaching the limitations of what can be done with BFS queries. And, if you have more complex needs, you can use existing database software (sqlite or berkeleydb or one of their competitors for example).
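To make “a minimal set of database-like functionalities” concrete, here is a rough sketch in Python using SQLite. It does not reproduce the BFS on-disk format or its query syntax. It only illustrates the general idea of files that keep their paths but also carry indexed attributes that can be queried. All table, function and attribute names are made up for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT NOT NULL);
    CREATE TABLE attributes (
        file_id INTEGER NOT NULL REFERENCES files(id),
        name    TEXT NOT NULL,
        value   TEXT NOT NULL
    );
    -- The index is what lets a query avoid walking every directory.
    CREATE INDEX attr_by_name_value ON attributes (name, value);
""")


def add_file(path, **attrs):
    """Store a file path together with its attributes."""
    file_id = db.execute(
        "INSERT INTO files (path) VALUES (?)", (path,)
    ).lastrowid
    db.executemany(
        "INSERT INTO attributes (file_id, name, value) VALUES (?, ?, ?)",
        [(file_id, name, value) for name, value in attrs.items()],
    )
    return file_id


def query(name, value):
    """Return the paths of all files whose attribute `name` equals `value`."""
    rows = db.execute(
        "SELECT f.path FROM files f "
        "JOIN attributes a ON a.file_id = f.id "
        "WHERE a.name = ? AND a.value = ? ORDER BY f.path",
        (name, value),
    )
    return [path for (path,) in rows]


add_file("/music/a.ogg", artist="Alice", year="2001")
add_file("/music/b.ogg", artist="Bob", year="2001")
add_file("/mail/1", sender="Alice")

results = query("year", "2001")  # both music files, regardless of directory
```

The files still live in ordinary directories. The attribute index is an extra way in, not a replacement for paths.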

I agree with that. And I think we have a lot of things we could do with BFS queries, before we start hitting limitations that deserve re-thinking it. For now, BFS can stay as it is

phschafft · July 16, 2024, 9:25am

I think you’re right. As for me I’m just offering that we could have a look and maybe some common playground. My work is not about Haiku in the first place, so I personally don’t try to solve any problem here. Just offering that if something close to what I work on is actually needed, we could consider to build upon it.

Maybe that makes a little more sense. Besides that, maybe @Androsio could share a bit more of his vision with us, since he started the thread in the first place. Maybe he has some specific problem in mind.

phschafft · July 16, 2024, 9:58am

The OS uses our standard data model which is graph based. The filesystem and the OS are independent projects with the common surface of the data model which is developed in other projects.

Clearly this would require checking some of the details carefully. I’m a bit worried e.g. about permission models and similar. But the data structure is already used to describe POSIX files in real world applications.

That said there are some shifts in the way application should use it to actually get the most out of it. If applications use the filesystem as if it’s some space to store byte addressable random access blobs, then there is hardly a win. Any classic filesystem will do that better. At least with less overhead.

Haiku is a bit closer to this than other systems. But I think that there are a few key aspects still missing.

Hardly, but thanks for the offer. I have a copy of that design book on the current filesystem. It was actually part of my research for this project. Plus I had a good look at different APIs POSIX and extensions from different systems. And finally, those nice superheroes on IRC are always fast to point me to anything else I may need. (Thank you all for that!)

Androsio · July 20, 2024, 5:49am

Yes, that’s right. The difference is that “the problem” is there, it just doesn’t manifest itself now, but it will sometime in the next few years.

I was on the “Learn X in Y minutes” website, watching the Raku tutorial, when the first sentence I read inspired me: “Raku (formerly Perl 6) is a highly capable programming language with abundant features to make it the ideal language for the next 100 years”.

Think of it this way: “HaikuDB (formerly HaikuFS) is a highly capable and feature-rich file system to make it the ideal file system for the next 100 years.”

In the coming years we will face several challenges: the capacity of storage devices, volume management, data integrity, error handling, data recovery in case of failure…

It is no secret that in the coming years mechanical disks will disappear, and SSD drives will take their place, and I have had a very bad experience with SD and microSD cards on my first-generation Raspberry Pi: data gets corrupted very quickly.

Surely @PulkoMandy will tell me that operations are cached to minimize the use of SSD drives and extend their lifetime. Yes, I know that.

I come from a time when industrial PLC EEPROM memories had a life of 100,000 read/write cycles, and NAND-Flash memories 10,000 read/write cycles, so I know how chaotic everything can become with a simple upgrade. Obviously this is no longer the case, but it’s still just as much of a concern as the years go by.

I’m not proposing something that we should do already, it’s simply an idea that we should keep in mind, and that we can plan for with plenty of time. Obviously, such a drastic change is not going to be ready for R1, but possibly for R2. We might even get an award for technological innovation. Think about it.

PulkoMandy · July 20, 2024, 7:08am

Well, why put words in my mouth like this? I don’t even see how that would be helpful to that discussion. Making SSDs that last for a long time may be related, to some extent, to filesystem data structures, yes. But it is not so much related to what interface the filesystem provides to the userspace.

A replacement of the filesystem is already more or less planned for R2. But the goal is to solve more minor problems, such as supporting hardlinks and improving performance.

I think I like my operating system to be plain and a little boring. Not innovating to the point that all existing features are thrown away

But there’s something I’m not too sure about here: are we talking just about internal datastructures, where the filesystem would still work, mainly, as a filesystem? Or are we talking about making things more database-like for the user, and making filesystem paths a less essential feature of how a filesystem operates?

For the latter, I think current phones and tablets are already way ahead of us, and I must say I don’t really like the results. Just moving photos I took on my phone into separate directories for different things (a two click operation in Haiku) is a pain to do. As a result I have an endless list of pictures on my phone and it’s impossible to find anything. Is that an inevitable future? At least that’s a concrete usecase we can talk about. There are a few other ones, such as:

Managing a collection of music files (do people still even do these things? is it all streaming nowadays?)
Managing sourcecode for different projects, compiler include search paths, that kind of thing
Managing versions of a document you’re working on, having backups so you don’t lose it
Maybe network transparency: being able to find and open files from other machines in the local network, or even through the internet if people are sharing something

All of these are interesting use cases, and this is where I would approach this from. I wouldn’t start by talking about filesystem datastructures, that would come much later once we have identified what exactly we’re trying to do (and why the existing filesystems can’t do it). Otherwise, I don’t think the discussion can be very interesting, because if you don’t know exactly what the problem is, how can you decide if a solution is going to work?

BlueSky · July 20, 2024, 8:08am

What exactly is “the problem” ?

Raku/Perl6 is probably not an example you want to follow. I can’t judge it from a technical standpoint because I’m not familiar enough with it and don’t know enough about language design, but regarding adoption by users it was a disaster. Perl went from one of the most used scripting languages to being pretty much invisible by now.

nephele · July 20, 2024, 9:32am

Highly unlikely. For your common laptop, or desktop maybe. But everywhere else, no.

The thing is that certain technologies fill certain niches. If you just want to stream data over a network, then HDDs are an excellent choice: big capacity, easy to run in a RAID, long lifetime, low cost.

The same setup done with SSDs would likely cost you 10 times as much.

But there is a storage medium that is even slower and cheaper (per TB)… Tape! It is still used for archival storage for all sorts of things. So yes, HDDs will be seen less in your average desktop, but they won’t disappear.

PulkoMandy · July 20, 2024, 10:07am

That remains to be seen. If SSDs keep getting more reliable and end up cheaper per gigabyte of storage, they may eventually replace spinning disks at everything they do. Just like tape drives did eventually completely replace punched paper rolls and punchcards.

I think hard disks have not had a major update in technology in a while? As far as I know the last thing was perpendicular recording back in 2005. Since then, there has been research and experimentation on various other things (shingled recording, heating or microwaving the magnetic surface to improve densities), but did any of this end up being put in production? Or did I miss something else? (I am not following this very closely, I admit).

nephele · July 20, 2024, 10:11am

One recent improvement was increasing the density of recording by moving the write head in smaller increments than the write head is wide, basically having to overwrite several lines for one write operation. This gives 2TB per platter and as such quite high capacities. I have two of those disks and they work nicely. Compared to that I would have indeed paid 10 times as much for SSDs, if they were even available in that size. I suppose eventually SSDs might catch up to this price-wise and it would not matter. : )

nephele · July 20, 2024, 10:18am

I have a signed perl6 book : D, some things that language does are absolutely crazy (in a good way), for example how they manage to iterate over Unicode glyphs in constant time (i.e. what you as a user perceive as a glyph). Quite wild.

Their biggest mistake was calling it Perl 6. It just wasn’t perl 5+1, it is a completely different language. People who used perl5 would say “why would I need 6?” and people who don’t use perl 5 would say “eww perl” : D

Calling it raku is certainly much better from a marketing standpoint

Androsio · July 20, 2024, 10:37am

I think I’ve been on the forum long enough to know that the first thing you’re going to talk to me about is the performance of cache operations. Intuition, I guess.

The line separating the file system from the database is a fine one. Data stored on disk (or in NAND-flash memory) would still be treated in the same way as now: currently the operating system makes a call to the file system, and the file system returns the data in question; the data will still be sorted in blocks. The difference is in the i-node. Traditionally, the i-node is identified by an integer, and the directories are formed by the i-node and the identifying name. In the case of the database, the i-node would be replaced by tables, with their ID and possibly a primary-key. The tables would be related, and in this way the “database as a file system” would be formed. Answering your question, we would be talking about making things more like databases. From the programmer’s point of view, the API would have to be modified to include SQL style statements.
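As a rough illustration of that idea, here is a sketch in Python with SQLite. The i-node becomes a row in a table, directories become rows relating a parent i-node and a name to a child i-node, and the data stays in ordered blocks. The schema, the names and the tiny block size are all assumptions made for the example, not a proposed design for Haiku.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE inodes (
        id   INTEGER PRIMARY KEY,
        kind TEXT NOT NULL CHECK (kind IN ('dir', 'file'))
    );
    -- A directory is the set of rows sharing the same parent_id.
    CREATE TABLE entries (
        parent_id INTEGER NOT NULL REFERENCES inodes(id),
        name      TEXT NOT NULL,
        inode_id  INTEGER NOT NULL REFERENCES inodes(id),
        PRIMARY KEY (parent_id, name)
    );
    -- File contents are still stored as a sequence of blocks.
    CREATE TABLE blocks (
        inode_id INTEGER NOT NULL REFERENCES inodes(id),
        seq      INTEGER NOT NULL,
        data     BLOB NOT NULL,
        PRIMARY KEY (inode_id, seq)
    );
    INSERT INTO inodes (id, kind) VALUES (1, 'dir');  -- root directory
""")

ROOT = 1
BLOCK_SIZE = 4  # deliberately tiny so the example spans several blocks


def create(parent_id, name, kind):
    """Create a new i-node and link it into its parent directory."""
    inode_id = db.execute(
        "INSERT INTO inodes (kind) VALUES (?)", (kind,)
    ).lastrowid
    db.execute("INSERT INTO entries VALUES (?, ?, ?)",
               (parent_id, name, inode_id))
    return inode_id


def write_file(inode_id, payload):
    """Replace the file's contents, split into BLOCK_SIZE pieces."""
    db.execute("DELETE FROM blocks WHERE inode_id = ?", (inode_id,))
    for seq, start in enumerate(range(0, len(payload), BLOCK_SIZE)):
        db.execute("INSERT INTO blocks VALUES (?, ?, ?)",
                   (inode_id, seq, payload[start:start + BLOCK_SIZE]))


def resolve(path):
    """Walk the path one component at a time, one lookup per component."""
    inode_id = ROOT
    for name in filter(None, path.split("/")):
        row = db.execute(
            "SELECT inode_id FROM entries WHERE parent_id = ? AND name = ?",
            (inode_id, name),
        ).fetchone()
        if row is None:
            raise FileNotFoundError(path)
        inode_id = row[0]
    return inode_id


def read_file(path):
    """Concatenate the file's blocks in order."""
    rows = db.execute(
        "SELECT data FROM blocks WHERE inode_id = ? ORDER BY seq",
        (resolve(path),),
    )
    return b"".join(data for (data,) in rows)


home = create(ROOT, "home", "dir")
note = create(home, "note.txt", "file")
write_file(note, b"hello database")
content = read_file("/home/note.txt")
```

Note that path lookup is still there, just expressed as queries. Whether applications would see paths, SQL, or both is exactly the open question in this thread.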

At this point, I have to say that I am no expert in file systems, nor in databases. Some time ago I started to read a book on SQL fundamentals (ISO 2006) from McGraw Hill, which is the same one used at the University of Seville, for the database course, but university texts are extremely boring and dry, so I left it half-read. At some point I will pick it up again.

I understand your frustration; it’s the same for me.

I don’t know what Apple does with the iPhone, but in the case of Android, Google decided to use ext4 synchronized with SQLite3: all the installed applications are managed by SQLite3, the configurations, permissions, associated data… It is the same point where Be was with its OFS, before designing BFS.

Although it is not the best example, I have to refer to the Windows Registry, where the whole system is configured in a table and records format, similar to a database.

Sounds good to me. I’ll keep thinking, and if I have something relevant to say, I’ll write it down.

memsom · July 20, 2024, 7:53pm

DR9 was the one I believe used the new BFS. There were versions of DR8 for Mac, so the OFS was still there late in the game.

OFS lacked any form of abstraction layer, so using foreign file systems was non trivial.

OFS had serious synchronisation issues. If the file system was not cleanly unmounted, the index was corrupted and needed to be rebuilt. There was an option to do that in the Boot menu - it was that common.
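The general failure mode can be sketched as follows, in Python. This is not a reconstruction of how OFS worked internally. It only assumes a design where the file records are authoritative, the index is derived from them, and a “clean” flag decides whether the index must be rebuilt at mount time. The class and method names are invented for the example.

```python
class Volume:
    def __init__(self):
        self.records = {}   # authoritative store: file id -> attributes
        self.index = {}     # derived: (attribute name, value) -> file ids
        self.clean = True   # False while mounted, True after unmount()

    def mount(self):
        # An unclean flag means the index may be out of step with records.
        if not self.clean:
            self.rebuild_index()
        self.clean = False

    def unmount(self):
        self.clean = True

    def add(self, file_id, attrs, crash_before_index=False):
        self.records[file_id] = dict(attrs)
        if crash_before_index:
            return  # simulate power loss between the two updates
        for item in attrs.items():
            self.index.setdefault(item, set()).add(file_id)

    def rebuild_index(self):
        # Full scan of every record: correct, but slow on a large volume.
        self.index = {}
        for file_id, attrs in self.records.items():
            for item in attrs.items():
                self.index.setdefault(item, set()).add(file_id)

    def find(self, name, value):
        return sorted(self.index.get((name, value), set()))


vol = Volume()
vol.mount()
vol.add(1, {"type": "text"})
vol.add(2, {"type": "text"}, crash_before_index=True)  # no unmount follows
stale = vol.find("type", "text")   # the index misses file 2
vol.mount()                        # unclean flag triggers the rebuild
fresh = vol.find("type", "text")   # both files are found again
```

Journaling the two updates together is the usual way to avoid the full rebuild.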