Idea: Clusterkit

cb88 · April 1, 2010, 2:12am

I was thinking about multiuser use cases and this popped into my head.

*BeOS/Haiku was and is designed to use SMP. Let’s extend that ideology and improve on it to make the Be engineers proud!
*What if the program being used runs too slowly even with SMP?!

My answer is that some apps can make use of processor resources available on other networked computers. The best way to implement that would be with a clusterkit, so all future Haiku applications can share the same interface.

The type of applications this would improve the most are those that operate on chunks of data for long periods (video encoding, raytracing, folding).

You may have noticed I just mentioned folding. Here is how I see it working.
Say I have 3 Haiku computers on my network and I install folding@home for Haiku … if it were implemented with a clustering kit, all my Haiku computers could be folding, but it would only have to be installed on one computer!

Imagine being at a tech show with 4 or 5 Haiku computers rendering a raytraced scene. A spectator volunteers their computer, boots up Haiku and attaches to the network, and automatically that computer starts raytracing as well. Imagine the impact that would have in people’s minds!

Some things that might be good to keep in mind:
*use LLVM to recompile code for cross-arch clustering and arch-specific optimizations; possibly also enable OpenCL once Gallium3D is available
*keep it simple: a packet-centric design, possibly with redundancy in calculations for error correction (folding)
*allow users to control the resources available to other users with a Deskbar applet or desktop replicant
*how to compute untrusted code
*make it automatic with zeroconf? (possibly a popup the first time someone else tries to access your computer’s resources)
*compatibility with other APIs for easy porting (shouldn’t be hard since most of them do the same thing anyway)
*Myrinet: less network overhead, worth looking into for possibly massive performance enhancements; I don’t know if it can operate simultaneously with TCP/IP
*make sure it doesn’t interfere with normal network operation
*possibly have a high-performance mode with a dedicated network
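
A minimal sketch of the packet-centric idea with redundant calculation, assuming nothing about any real Haiku API: the work is cut into packets, each packet is computed on several nodes, and the answer most nodes agree on wins. The function names and the lambda "nodes" are hypothetical stand-ins for networked machines.

```python
from collections import Counter

def split_into_packets(data, packet_size):
    """Cut a list of work items into fixed-size packets (the last may be shorter)."""
    return [data[i:i + packet_size] for i in range(0, len(data), packet_size)]

def majority_result(results):
    """Return the answer a strict majority of nodes agree on, or None if there is none."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count * 2 > len(results) else None

def process_with_redundancy(packet, nodes):
    """Run the same packet on every node and keep the majority answer."""
    return majority_result([node(packet) for node in nodes])

# Hypothetical nodes: two healthy ones and one that returns a corrupted sum.
good_node = lambda packet: sum(packet)
bad_node = lambda packet: sum(packet) + 1

packets = split_into_packets(list(range(10)), 4)
totals = [process_with_redundancy(p, [good_node, bad_node, good_node]) for p in packets]
```

The cost is that every packet is computed three times here, so redundancy only makes sense where a wrong answer is worse than a slow one.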

I would like to hear your point of view as well. Clearly I don’t know everything about how this would work, especially zeroconf; perhaps this isn’t a good use case for it? I would also like to mention the reason I want it to be automatic. I once attempted to run WRF-EMS, a weather simulation model, on a 40-CPU Pentium II cluster. After many days of fiddling, all we ever got was a run on the master node alone, which actually rendered the sim slower than realtime. That just shouldn’t happen, ever (this was with about 3 *nix-knowledgeable guys on hand and a weather guy). Who knows, we might see Haiku on the Weather Channel some day :-). Note that I am not developing this; I have loads of homework to work on :-).

tonestone57 · April 4, 2010, 1:21am

@cb88
Yes, and yet very few will take advantage of it. What percentage of Haiku or Linux users actually use clustering?

I understand the concept of distributed computing (clustering). I know it allows multiple computers to be used, through networking, as one to speed things up. But this is more for professionals (who work in graphic design or animation) or the scientific community; it is specialized towards specific groups.

Your average Haiku user will never make use of it. What good is clustering if 10% or less make use of it? I, and generally everybody else, would prefer to have full multimedia support: enhanced browsing, HD audio & video, Flash, video acceleration, 3D, etc…

How many users are doing lots of 3D animation or graphics? Or scientific computing? Clustering serves a very small market niche! Haiku should not become the next Linux, where it tries to do and be everything (bloated). Haiku should stick to giving the majority of users what they want and doing it very well.

If you really believe it is important then talk about it on the mailing list and you may get some developer feedback. You won’t convince me on the subject. I think clustering should stay with Linux which is best suited for it. We have a difference of opinion and we will not agree.

The123king · April 1, 2010, 8:29am

Very good suggestion, but Haiku is a desktop system, so this may not be a high priority. Although, it could be a very useful feature in corporate environments where there’s surplus processing power sat under everyone’s desks…

cb88 · April 1, 2010, 4:38pm

I don’t know about you, but most people have more than one PC… People in corporations can already benefit from clustering to some degree, but the idea here would be to bring it to the desktop and integrate it better than any other OS.

There would be no configuration; it would just work off the bat once the two PCs were on the network.

The123king · April 2, 2010, 8:50am

I’m no programmer, but I’m pretty sure it wouldn’t “just work off the bat”. There would be some form of configuration required in any case. I still don’t think it would be a particularly used feature in the home desktop environment. I for one only have 1 PC on at once in my house, and it’s sufficient for image and video editing on its own (and it cost me £350 last year).

Unless you have a house full of Pentium IIIs, I’m pretty sure this feature would only get significant usage in corporate environments, and then for linking workstations up, not necessarily servers.

ColinG · April 2, 2010, 6:39pm

Nice idea cb88, I like your way of thinking. It’s just like in the days when no one knew what end users would do with several processor units. You have a really forward-looking idea here. As a reference: for Linux there was a project called openMosix. It used an arbitrary number of PCs and combined them into a virtual SMP machine. Quite in the same manner as your proposal, so it would be doable, at least.

tonestone57 · April 2, 2010, 8:14pm

Cool concept but very limited use. Most average users do internet, email, messaging, documents, play music and videos. I, and many others, would prefer maximizing multimedia options: playing Flash, reading books/comics, watching HD content, 3D support (OpenGL games), etc., etc.

Clusters are meant for a certain type of user (they would have to be power users) and very specific types of applications. They mostly benefit those in corporate or scientific environments. Haiku’s target is general desktop users.

You won’t see this in R1 and I highly doubt in any future Releases either.

cb88 · April 3, 2010, 3:54am

BeOS certainly was targeted at non-regular users. Windows could get email and browse the web just as well as Be, perhaps even better in later years. BeOS was about advanced and innovative features and making using a computer better and easier, not just getting the job done… Windows and Linux can get the job done; it’s just what you have to go through to get there.

What would be wrong with making Haiku an unmatched OS for the scientific and power user community? BeOS was supposed to have been unmatched for power users… multimedia, multitasking, etc… distributed compilation (icecream & distcc, etc…)

Another point is that you could offload mundane chores to a second, perhaps older, headless PC so your primary desktop stays fast. I see no reason to tie this idea to any release schedule. It’s either a good idea or it is not… Personally I think it’s a good idea and should be considered for inclusion in the API planning of R2; in R1 it would probably be a kludge, in R2 it could be made elegant.

From what I understand, clustering consists of a head node and sub nodes, where either a binary is sent to the nodes to run or a packet is sent to be processed (e.g. Sun Grid Engine, Java based), or something along those lines.

Perhaps wherever an application wants something to be processed, either of two things could happen:
*a packet containing all the data and a C/C++ file for LLVM to compile at runtime is sent out, and once processed that packet communicates back its results
*or a C/C++ code file is sent out and a network filesystem is used to access the needed data.
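
A rough sketch of the first option, a self-contained packet carrying both source text and input data, with a matching result packet coming back. The JSON layout and every field name here are assumptions made up for illustration; nothing is compiled or executed, this only shows the packaging.

```python
import base64
import json

def encode_work_packet(job_id, source_code, payload):
    """Bundle the source text and the raw input bytes into one self-contained packet."""
    return json.dumps({
        "job_id": job_id,
        "source": source_code,
        "payload": base64.b64encode(payload).decode("ascii"),
    }).encode("utf-8")

def decode_work_packet(packet):
    """Inverse of encode_work_packet; returns (job_id, source_code, payload)."""
    fields = json.loads(packet.decode("utf-8"))
    return fields["job_id"], fields["source"], base64.b64decode(fields["payload"])

def encode_result_packet(job_id, result):
    """What a sub node would send back to the head node once it has finished."""
    return json.dumps({
        "job_id": job_id,
        "result": base64.b64encode(result).decode("ascii"),
    }).encode("utf-8")

# Hypothetical job: a trivial C function plus three bytes of input data.
packet = encode_work_packet(7, "int work(void) { return 0; }", b"\x00\x01\x02")
job_id, source, payload = decode_work_packet(packet)
```

Carrying the job id in both directions lets the head node match results to requests even when nodes finish out of order.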

Being a general desktop OS is boring, and once Haiku has 3D drivers, WebPositive and encrypted WiFi it will be pretty complete on that front (assuming you don’t mind KOffice) … I would think the devs would like to try something adventurous once in a while.

Another Edit:
If using LLVM, I think you could think of it as a threaded interpreter/JIT compiler running on multiple computers. I wonder if clang could be modified to do this…

http://llvm.org/cmds/lli.html is what I was looking for. The head node would compile the code into LLVM bytecode and the sub nodes would execute it with lli (LLVM interpreter?). It should run on any platform, or as a JIT compiler on supported platforms. I think it would probably be best for the code to execute with as few privileges as possible and only be able to write into a certain place. Once execution completes, the data in the specified place could be sent back to the head node (could be RAM, disk, network share, etc…).

@ColinG We had openMosix and all the other clustering libs on an RPM-based PII/PIII cluster (CentOS 5), but the application we really wanted to get going was a massive hackfest for us, as the installer was written in csh (there were jokes in it that seemed to indicate it was surprising to the developer we had made it that far). I think we could have done some cool things with POV-Ray (I learned a bit of POV-Ray, enough to generate a few animations), but we ran out of time in the semester and I moved to a 4-year uni (though I still drop by there a lot on the way home). I think openMosix or any other 3rd-party lib would be the wrong way to go unless it was BSD licensed and could be forked into Haiku.

Edit again!: I may be overthinking this a bit… so would it be possible to extend the Be API to pass messages between apps not only on the local machine but also on any machine on the network that happens to be listening? I think that could also be used as a basis, since code running on a separate machine could then communicate actively with the head node or any node it wants. Or do I misunderstand the concept of BMessages? It would also need to be as low latency as possible.
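
As an illustration of what passing such messages between machines might involve, here is a small sketch of length-prefixed message framing over a stream socket. It is only loosely modeled on the idea of a message with a `what` code and named fields; the JSON encoding is an assumption chosen for brevity and is not the real flattened BMessage format. A local socket pair stands in for a network connection.

```python
import json
import socket
import struct

def send_message(sock, what, fields):
    """Send one message as a 4-byte big-endian length followed by a JSON body."""
    body = json.dumps({"what": what, "fields": fields}).encode("utf-8")
    sock.sendall(struct.pack("!I", len(body)) + body)

def recv_exact(sock, count):
    """Read exactly count bytes; a stream socket may deliver them in pieces."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)

def recv_message(sock):
    """Receive one framed message and return (what, fields)."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    message = json.loads(recv_exact(sock, length).decode("utf-8"))
    return message["what"], message["fields"]

# A connected local socket pair stands in for head node <-> sub node.
head, node = socket.socketpair()
send_message(head, "WORK", {"frame": 12, "rows": [0, 63]})
what, fields = recv_message(node)
head.close()
node.close()
```

The length prefix matters because TCP is a byte stream: without it the receiver cannot tell where one message ends and the next begins.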

Apparently this has been done on BeOS before to some extent (it isn’t really secure at all for one thing) https://public.msli.com/lcs/jaf/SockHop/ perhaps the developer could be persuaded to donate the code to Haiku?

tonestone57 · April 3, 2010, 4:07am

Good idea, but not for Haiku, only for Linux. I prefer that Haiku focus on being a multimedia desktop OS with future support for x86-64 & ARM.

There are many more things with greater importance to finish first that will benefit lots (majority) of users.
http://dev.haiku-os.org/wiki/FutureHaikuFeatures

Very few would gain from clustering computers. The number of scientific and power users (utilizing multiple systems) is small compared to general desktop users who use one computer at a time. Clustering appeals to only a small number of people overall who would make use of it.

Haiku’s goal today is for attracting desktop users not the scientific and server communities. You can ask developers too and see what they have to say on the general mailing list. They can give better input and inform you where they stand on the clustering issue.

AlienSoldier · April 3, 2010, 4:23am

I am all for that kind of thing. A kit would make a lot of sense because there are a lot of things like security, latency and redundancy that would be needed.

I could see that integrated in Pulse.

Sure, there is no app that needs that currently, but I can see many that could be made to make use of this. And to those who think a desktop has no need for this, I will just point to the 2 best desktop computers ever: the STNG computer and HAL.

It’s not something for tomorrow, but it sure is something that could be done within 10 years. Stuff like that, and extending the Media Kit and BMessage to the network, is what Haiku will need soon or else the OS will really have a ’90s feeling.

cb88 · April 3, 2010, 8:59pm

@tonestone57 I clearly explained how clustering was relevant to Haiku for multimedia and power users alike.

Saying clustering should stay on Linux is also pretty lame. That’s akin to saying “oh, you have SMP; our OS is so fast we don’t need that.” Sure, your OS may be fast, but I am talking about better utilization of networked resources. Encoding things is slow on any OS, so just distribute the job, possibly making a 3 hour job into a 30 minute or one hour job, etc…
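
A back-of-the-envelope way to check figures like these is Amdahl’s law: only the part of the job that can be split across machines gets faster. The sketch below assumes a hypothetical encode where 90% of the time is parallelizable and ignores network transfer time entirely, so real results would be worse.

```python
def estimated_minutes(total_minutes, parallel_fraction, nodes):
    """Amdahl's law: the serial part stays fixed, the parallel part is divided by the node count."""
    serial = total_minutes * (1.0 - parallel_fraction)
    parallel = total_minutes * parallel_fraction / nodes
    return serial + parallel

# Assumed numbers: a 180 minute encode, 90% of which can be distributed.
for nodes in (1, 2, 4, 8):
    print(nodes, "nodes:", round(estimated_minutes(180, 0.9, nodes), 1), "minutes")
```

Under those assumptions four machines bring the job down to roughly an hour, while getting to 30 minutes would take well over a dozen machines, because the 18 minutes of serial work never shrinks.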

@tonestone57 it seems you have the wrong idea about Haiku. It isn’t just a multimedia or “web client” OS; it is a desktop OS, and clustering is relevant to the desktop. Not for many simple things, but it does enable lots of very cool things that aren’t just whiz-bang.

@AlienSoldier You mentioned STNG… What if you could have a persistent desktop that was tied to your login and not your computer? So if you log in simultaneously on your desktop and laptop on the same network, they are synced (like Synergy), and you could migrate programs from one computer to the other by dragging them across to the other screen. Perhaps if you logged off the desktop/laptop it could automatically migrate all your programs over to your laptop/desktop. When both are running it could load balance by rendering apps running on the laptop/desktop on the desktop/laptop remotely.

I believe that would be called process migration, and I think it is possible since Solaris and Linux can do VM migration, which is basically migrating a whole virtual environment onto a separate computer/processor. http://en.wikipedia.org/wiki/Process_migration

I think most of the talk about R2 has been about GUI changes or other similar changes … there is no reason to limit the discussion to those sorts of things. Think outside the box, you know.

cb88 · April 4, 2010, 2:54am

@tonestone57 if all you have is a negative word to say I’d rather not hear it. First off, most of what I have been talking about can be done in user space, so it does not add bloat if you don’t use it. If you go the openMosix route it could add a bit of bloat; I have no idea how much, but it couldn’t be much. I understand how Linux has become bloated, but you also have to consider the vast amount of things Linux supports. I mean, it supports practically everything you can imagine. I am not suggesting that at all.

You have made your opinion quite clear; no need to keep pushing this thread to the top of the bucket, which just gives it more visibility anyway.

Also, you seemed to completely gloss over my actual experience with clustering. It was total crap: hours of configuration for a simple installation. What I advocated was a plug & play cluster: just plug in PCs of any arch running Haiku and they would be recognized as nodes by Haiku. That is the problem with Linux: the software has to be configured separately. By adding a clusterkit you make clustering a function of the OS and not some hacked-on kludge, which is what it is on Linux. Clustering on Linux is not straightforward, and there are TONS of bugs in all the clustering suites I have come across. I think this is because they follow the Linux mentality of changing their implementations just to optimize a bit more, breaking everything in the process. Why do you not see more clusters? Because they are hard to set up and run, that is why!

Maybe my idea doesn’t fit perfectly with Haiku, but I think it fits pretty well in a lot of interesting ways, even for home users, whether you can see that or not.

BTW: I never mentioned 3d graphics or animation for the home user. I mentioned video encoding primarily but that is just an obvious case.

tonestone57 · April 4, 2010, 4:53am

>@tonestone57 if all you have is a negative word to say I’d rather not hear it.
Just because you strongly agree with something does not mean everyone else has to also. The only way to show disagreement is by pointing out faults (negatives). Or how else would you disagree with someone?

>You have made your opinion quite clear no need to keep pushing this thread to the top of the bucket which just gives it more visibility anyway.
Stop replying to me and I will stop pushing this thread to the top. I did not realize you were with the forum police here! People are allowed to disagree and express their opinions and comments openly and respectfully. I have provided my points and argument no matter that you disagree and dislike what I have to say.

>Also you seemed to completely gloss over my actual experience with clustering it was total crap
I have already told you twice to take this to the mailing list and get developers to hear you out and respond. What are you afraid of? Instead you continue to rant in the forum about how Haiku should support distributed computing because you know best. Maybe I am right or maybe wrong but you should get developers input to find out for sure where they stand.

>Why do you not see more clusters? Because they are hard to setup and run that is why!
No, because clusters are useless for majority of users as I pointed out two or so times. Only small, select group of individuals benefit. If many end users wanted to cluster, as you say, then developers would have worked harder to make clusters easier to install and use.

>Maybe my idea doesn’t fit perfectly with haiku but I think it fits pretty well in a lot of interesting ways,even for home users, if you can see that or not.
I strongly disagree - already said not for home users.

>BTW: I never mentioned 3d graphics or animation for the home user. I mentioned video encoding primarily but that is just an obvious case.
I have done video encoding many years back. Today’s computers are so fast that clustering would make little sense unless you were encoding many movies simultaneously. You really think people are encoding so many movies at the same time that they’ll need a cluster? Most will rip 2 or 3 movies max per week or download them off the internet. You really believe many end users will run 2, 3 or 4 computers at once just to encode a single movie faster? Or even have that many computers to use?

Distributed computing is a cool concept (from a geek perspective) but has very limited practical use for most computer users. It only makes sense for those strongly involved (as daily work) in:
3d graphics/design, cg animation, video/audio encoding, number crunching (usually science related project)
Clustering benefits from multiple or very large jobs and when time (deadline) is very important.

humdinger · April 4, 2010, 8:05am

I have no experience whatsoever with clustering and distributed computing. I don’t think it’s in heavy demand right now. As always, it’s all a question of are there developers interested enough to put in the work. For the last ten years the Haiku devs had barely enough manpower to get R1 on the road. If there’s a dev team that can propose and implement a ClusterKit that does not interfere with the regular Haiku goals (simple, fast, desktop-focused), I’m sure such an effort would be welcomed.

Maybe there’s a stepping stone towards clustering. With more and more people possessing a smartphone, a netbook, a home entertainment system, an older laptop and a kick-ass multicore desktop, transparently synchronizing everything would be a perfect goal for a desktop OS. Going from there to actually doing distributed work on all available computers would be the logical next step.

In any case, unless someone new steps up to work on it, I think the Haiku devs are spread too thin at this time.
And if you, cb88, are that new someone, I agree with tonestone to start discussing the issue on the dev mailing list.

Regards,
Humdinger

The123king · April 4, 2010, 8:47am

>what I advocated was a plug & play cluster just plug in the PCs of any arch running haiku and they would be recognized as nodes by Haiku

I’m no developer, but I’m sure that cross-architecture multi-processing is near impossible without a significant performance hit, unless there is some form of hardware assistance like in Loongson.

cb88 · April 4, 2010, 1:31pm

That’s where LLVM comes in… It’s already used in Gallium, for instance, to compile shaders for the GPU, if I understand correctly. In any case it would become less important if more than one computer on the cluster was of the same architecture, as only one would have to compile the code. It is also fairly likely that the CPU-intensive code resides in a small portion of the program and could be compiled quickly; if not, the chances are it would only be compiled once.
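
The “compile once per architecture” point can be sketched as a cache keyed by source and architecture. This is an assumption about how a head node might organize it, not a description of LLVM; `fake_compile` is a hypothetical stand-in for a real compiler invocation.

```python
import hashlib

class CompileCache:
    """Remembers compiled output per (source, architecture) pair."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn
        self._cache = {}
        self.compile_count = 0  # how many real compilations were needed

    def get(self, source, arch):
        key = (hashlib.sha256(source.encode("utf-8")).hexdigest(), arch)
        if key not in self._cache:
            self._cache[key] = self._compile_fn(source, arch)
            self.compile_count += 1
        return self._cache[key]

def fake_compile(source, arch):
    # Stand-in for a real compiler: just tags the source length with the arch.
    return f"{arch}:{len(source)}".encode("ascii")

cache = CompileCache(fake_compile)
node_archs = ["x86", "x86", "ppc", "x86", "arm"]  # hypothetical cluster
binaries = [cache.get("kernel source text", arch) for arch in node_archs]
```

With five nodes spread over three architectures only three compilations happen; the other two nodes reuse a cached result.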

Just because it was impossible 3 or 4 years ago with GCC doesn’t mean it is impossible now. As far as that goes, Java is used for clustering (and it may even make sense in some cases due to Java’s JIT compiler… clearly LLVM/C++ would be better than Java though).

I tried to make it clear, but apparently I have been misunderstood: the user should not have to know anything about clusters to use this… It wouldn’t be a “cluster” per se, as in a stack of computers; I understand very few home users will want that. Tweaking via a GUI would also be ideal for people who want to. If it couldn’t be automatic and practically a no-brainer, then I agree, leave it on Linux. I however do not believe that to be the case.

I just put this idea out there to see if someone else liked it. A few have, a few haven’t as much, though I found all your comments to be constructive in some way. I may try myself to get PVM running on Haiku since it is supposed to be very portable… I haven’t much free time, but it’s also a long time until any such feature would be expected, so no worries.

tonestone57 · April 4, 2010, 2:28pm

>I just put this idea out there to see if someone else liked it a few have a few haven’t

It was not your cluster idea that I was attacking or putting down; that is where you misunderstood me. Your idea may be sound and very good (regarding clustering). What I was questioning is the practical use of clustering (who would use it, who would benefit from it, what applications would make use of it, etc.). Clustering has very limited scope and application in the real world. That was my point. It only speeds up the areas I stated in my previous post.

In University, 10 years ago, I had a colleague who went on and on about distributed computing and how it would revolutionize computing and be the next big thing, etc., etc. He was excited about clustering. 10 years later and clustering still has not caught on with average (and even power) computer users. Back then processors were slow, movie encoding took many hours (ie: 5 to 7 hours), but today we have fast, powerful multi-core systems that can do things quickly. ie: you can encode a movie in less than 2 hours, run it in background in low priority or create a batch job and encode multiple movies overnight. Back then our computer lab was running SETI and it was (and is) exciting to know that you could utilize multiple computers by network to create one SUPER computer. The concept is cool but no real practical use for many users.

Your ideas relating to how clustering could work may have good merits but I see very little point for doing clustering on Haiku (only very few users would really use clustering).

A cool idea that is also very practical would be an update utility that utilized Rsync - it would compare (system) files based on date (or CRC) with remote server and if they did not match, would compute binary diff and binary patch local file to latest version. (ie: would provide very low data transmission, binary diffing, binary patching). But this is off-topic and better discussed in another thread.
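
The first step of such an updater, deciding which files differ from the server’s copies by checksum, might look like the sketch below. It uses CRC32 from Python’s `zlib` module; the file paths and contents are invented for the example, and the binary diff and patch steps are not shown.

```python
import zlib

def build_manifest(files):
    """Map each path to the CRC32 of its contents."""
    return {path: zlib.crc32(data) for path, data in files.items()}

def files_needing_update(local_manifest, remote_manifest):
    """Paths that are missing locally or whose checksum differs from the server's."""
    return sorted(path for path, crc in remote_manifest.items()
                  if local_manifest.get(path) != crc)

# Hypothetical file sets: the server has a newer kernel and one extra add-on.
local = build_manifest({"system/kernel": b"v1", "system/libroot.so": b"v1"})
remote = build_manifest({"system/kernel": b"v2", "system/libroot.so": b"v1",
                         "system/new_addon": b"v1"})
stale = files_needing_update(local, remote)
```

Only the manifest (paths and checksums) needs to cross the network to find out what changed, which is what keeps the data transmission low.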

cb88 · September 23, 2017, 4:57am

The best place to take cues for good design of clustering systems is probably Plan 9. It is written in a dialect of C but is still much more advanced than Linux in many respects, especially clustering, which is practically designed in…

AndrewZ · September 23, 2017, 10:53pm

Plan 9 is cool technology but pretty much dead. Yes, it would do distributed computing very easily. First you would have to write your application in Plan 9. Then it would only run on Plan 9. Plan 9 is not POSIX compliant. You cannot just port an application to Plan 9. It is not compatible.

There are frameworks like MPI, usable from C and C++, that would also provide cluster functionality. First you port MPI to Haiku. Then you write your application in MPI. Now your application is clustered on Haiku. Win! Get rich!

I wrote a SMP 3D rendering application for Haiku. It would use as many cores as you had. I wrote up an article on this and submitted it to the Haiku editorial board. I could not even get it published on haiku-os.org.

I thought about what it would take to create a distributed rendering version of this code. I spec’ed out the blueprints for it. Distributing across a single subnet would be easy. Distributing across the whole internet gets more complicated because you need a public command node. That’s more TCP/IP than I know.
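
One simple way to divide a render between machines on a subnet is to hand each node a contiguous band of scanlines. The sketch below is an assumption about how that split could be done, not the design described above; the node names are hypothetical.

```python
def assign_rows(image_height, node_names):
    """Split scanlines into contiguous bands, one per node, as evenly as possible."""
    if not node_names:
        raise ValueError("need at least one node")
    base, extra = divmod(image_height, len(node_names))
    bands, start = {}, 0
    for index, name in enumerate(node_names):
        rows = base + (1 if index < extra else 0)  # first 'extra' nodes get one more row
        bands[name] = range(start, start + rows)
        start += rows
    return bands

# Hypothetical machines on the local subnet rendering a 480-line image.
bands = assign_rows(480, ["desk", "laptop", "old-p3"])
```

An even split like this assumes equally fast nodes and equally expensive rows; handing out many small bands on request would cope better with a slow machine in the mix.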

Pete · September 24, 2017, 1:36am

Sorry, didn’t notice I was catching up on a 7-year-old topic, but I’ll leave my reply in place…

Just FYI, there’s been a cross-platform “BMessage” protocol around for a long time. “MUSCLE” (MUSCLE 9.35) provides a framework for handling messages structured similarly to BMessages on any OS, and across the net. It’s the basis of BeShare, and is in active use by Meyer Sound for their audio systems. [It’s open source.] I’m using it in my own ‘cluster’ (well… a private net of Haiku/BeOS, Linux, and Raspberry Pi machines) to provide me with a ‘reminder’ app that is always available: it runs on the Pi 24/7, and is accessed via MUSCLE messages from any of my machines. The code is in Python, because there’s a library for that. Normally messages go point-to-point, but there’s a ‘muscled’ server that can “reflect” incoming messages to everyone. (That’s how BeShare works.)

I suspect there are a few unrealized uses for the protocol…