CPU utilization and other questions: GPU acceleration?

First off, I’d like to say what a great OS you have here. I just got it booted on a nice modern machine. Wow, it’s so much better than Windows in so many regards of user interaction. It’s how an OS should be.

Color me impressed; this is a great open source OS.

Anyway, I noticed that a few of the demo apps show my CPU usage at 100% on a quad-core machine, but the Tracker CPU monitor disagrees. I also noticed that threads don’t seem to be scheduling correctly. I can turn off all but one core in the demo apps and it makes no difference to the execution of the OpenGL teapot demo.

I don’t know if this is a bug or an optimization issue.

The CPU is an AMD Phenom 9550. I am unsure why, but the total CPU utilization also seems rather low, around 15%, while the teapot demo shows less. It’s the same with the other graphics application; the name escapes me currently.

I have another question. Is the Haiku team considering implementing GPU acceleration? Not for graphics, but for general computing?

First of all, adding GPU acceleration for general computing is not something the core developers are working on. There’s nothing to stop anyone from adapting some CUDA libraries and working with them. But don’t expect to see this happen otherwise.

Regarding CPU utilization: Under Demos, bring up Pulse. This gives a true indication of your SMP utilization. The Mandelbrot demo is SMP aware, I think GLTeapot is also. Mandelbrot computes so quickly I couldn’t max out my Core2Duo.

I don’t think adding CUDA libraries is really a good idea; it’s too Nvidia-specific. But there is really no better time than now to bring GPU number crunching to an OS, especially given Haiku’s aggressive threading engine. If we operate the GPU in a non-host-controlled type of state (as is typical with most graphics applications), we can put large math work like transcoding video into a pipeline (given Haiku’s light RAM footprint, this is easier than with Windows) and have the CPU send over tasks when it has time. That creates a round robin: the OS and CPU feed data to RAM, the GPU accesses it while waiting on stuff to do and returns the results to RAM, and the CPU writes them to disk. This way the CPU is only piping info to the GPU (with an algorithm set) and picking up the results.

Maybe create a fork and put it in a holding pattern for those of us interested in working on it… But this has broader implications in terms of access to video cards. It would be the single enhancement that makes Haiku a giant killer, and because the code is open source and can be modified, we could get a lot of help with this.

I don’t have nearly the coding skills to implement this, but I am learning.

As to the CPU utilization, I was checking with Pulse. Is it possible that debugging code might be slowing it down? I also seem to be frame limited on GLTeapot. The max is around 300 fps regardless of how many cores I enable.

Either there is a bottleneck or the OpenGL has a frame limitation to save resources. I dunno.
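To make the round-robin idea above a bit more concrete, here is a minimal sketch in standard C++. It is not GPU code and not a Haiku API: a plain worker thread stands in for the GPU, a queue stands in for the shared RAM, and squaring a number stands in for the "large math work". The name `run_pipeline` is made up for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical sketch: the calling thread plays the CPU, the worker thread
// plays the GPU, and the queue plays the shared RAM between them.
std::vector<long> run_pipeline(const std::vector<int>& inputs)
{
    std::queue<int> pending;        // tasks waiting to be picked up
    std::vector<long> results;      // written by the worker, read after join()
    std::mutex lock;
    std::condition_variable ready;
    bool done = false;

    // Worker: sleeps until there is something to do, then processes it.
    std::thread worker([&] {
        for (;;) {
            std::unique_lock<std::mutex> guard(lock);
            ready.wait(guard, [&] { return done || !pending.empty(); });
            if (pending.empty())
                return;             // producer finished and queue is drained
            int task = pending.front();
            pending.pop();
            guard.unlock();         // do the math without holding the lock
            results.push_back(static_cast<long>(task) * task);  // placeholder work
        }
    });

    // Producer: feeds tasks whenever it has them.
    for (int task : inputs) {
        {
            std::lock_guard<std::mutex> guard(lock);
            pending.push(task);
        }
        ready.notify_one();
    }
    {
        std::lock_guard<std::mutex> guard(lock);
        done = true;                // no more tasks will arrive
    }
    ready.notify_one();
    worker.join();
    assert(pending.empty());
    return results;                 // safe to read: the worker has exited
}
```

With a single worker the results come back in submission order. A real GPU path would also need a driver and a way to move buffers to and from video memory, which this sketch ignores.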

I mention CUDA because it is a mature library with high performance. If there is another lib that also supports ATI, even better. At this point, this capability will be DIY.

Just so you know, the teapot is a simple image to render, so I would not expect it to max out a quad CPU. You can make it a little more CPU intensive by adding perspective, fog, and more teapots (under File).

Most of it has to do with the fact that CUDA allows for a few things that ATI doesn’t. I don’t really want to get into a semantic debate about the benefits either way. Both GPUs can do the same things; CUDA just makes it easier.

That said, we need a good kernel-level hardware driver to make that work.

I get over 300 fps on GLTeapot. About 330 I believe - Core i3. :slight_smile:

OpenGL uses a software renderer, so it is only good for a couple of simple demos and games; there is no hardware acceleration. I have not really checked, but either the renderer is not multi-threaded or it is limited to a certain % of CPU (e.g. 25% combined).

Use Pulse to enable/disable cores and Activity Monitor to look at CPU use. Also, add CPU combined which averages CPU use across multiple cores.

There are very few multi-threaded applications that max out your system. One is Handbrake (download it). Chart has a checkbox for 1 or 2 threads and can max out a dual-core system. 7zip and MediaConverter should also be multi-threaded, but I have not tested their CPU use. jam and make with the -j switch will max out your cores too (e.g. building Haiku on Haiku: jam -j4).

Only multi-threaded applications that use lots of computing power (calculations) will max out your cores!
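As a rough illustration of where a figure like "25% combined" can come from: if the combined value is a plain average of the per-core loads, then one fully busy core on a quad-core machine averages out to 25%. The function name below is made up for this sketch, and whether Activity Monitor computes it exactly this way is an assumption.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Average the per-core utilization values (in percent) into one figure.
double combined_cpu_percent(const std::vector<double>& per_core)
{
    assert(!per_core.empty());
    return std::accumulate(per_core.begin(), per_core.end(), 0.0)
           / per_core.size();
}
```

For example, `combined_cpu_percent({100.0, 0.0, 0.0, 0.0})` gives 25, while four busy cores give 100.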

No GPU acceleration for R1. Maybe for R2 or R3. OpenCL looks like a better choice than CUDA (more open and supports everyone, though only on somewhat newer graphics hardware; I think the same Nvidia hardware as for CUDA).

On a netbook with a dual core, it peaks at 22% normally, but the performance is okay… One thing I might add is if Haiku could take advantage of the GPU and the CPU as entirely separate components without the end user really knowing, sort of like how in OS X Apple® came up with Quartz™ Extreme. It was a lot more than ripple graphics and eye candy: underneath, the compositor would handle the graphics while the computer was available to perform CPU tasks separately, so as not to bother the processor. Or could we come up with a sort of ‘swap space’ between the two components? :slight_smile: Silly idea, but lower-end machines do this with hardware; maybe we could do it with software in this case. :slight_smile: One idea would be to spin this into the next kernel…

Tones and anyone else,

I modified the Haiku Mandelbrot demo to use all 4 CPUs. Please give it a run and report results:
http://haikuware.com/directory/view-details/multimedia/video/miscellaneous/mandelbrotsmp

Thanks,
Andrew

I will try it out when I get the chance. Good of you to do this Andrew. :slight_smile:

Instead of hard-coding it for 4 CPUs, you should find the number of cores in the system and use that value. Systems with i7s will show as 8 cores in Haiku.
Look here for an idea:
http://dev.haiku-os.org/browser/haiku/trunk/src/apps/aboutsystem/AboutSystem.cpp#L533

The important part is:
systemInfo.cpu_count
(which holds the number of cores in someone’s computer)

Hopefully you can use that so the CPU count is not hard-coded into the application. If a system comes with 32 cores, the application would scale properly by creating as many threads as there are cores, using the cpu_count value above.
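A minimal sketch of that idea follows. On Haiku the count would come from `get_system_info()` and the `cpu_count` field of `system_info`; here `std::thread::hardware_concurrency()` stands in for it so the example builds anywhere. The helper names are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: decide how many worker threads to start.
// Stand-in for reading system_info.cpu_count on Haiku.
unsigned worker_count()
{
    unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;   // 0 means "unknown", so fall back to one thread
}

// Start one thread per reported core and return how many actually ran.
unsigned run_one_thread_per_core()
{
    const unsigned n = worker_count();
    assert(n >= 1);
    std::atomic<unsigned> ran{0};
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < n; ++i)
        workers.emplace_back([&ran] { ran.fetch_add(1); });
    for (auto& t : workers)
        t.join();
    return ran.load();
}
```

Nothing here is tied to 4 cores: a 32-core machine would simply get 32 threads.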

I actually was just posting an idea on the CPU and GPU earlier… :slight_smile: and haven’t done anything to my netbook yet. :slight_smile: I did, however, take a look at the AboutSystem source.

Tones,

Thanks for the cpu_count reference, that’s useful. There are a number of things to explore before making the SMP automatic. My goal is to create an app other people can use as an example for N cores. I have been looking at SMP code examples. I looked at 3 programs and there were 3 completely different ways to set it up. MandelbrotSMP uses a quick and dirty technique. You replicate the code for each thread. Ugly but it works for now.
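For comparison with the replicated-code approach, here is a rough sketch of one worker function shared by every thread, with each thread given its own band of rows. It uses standard C++ threads rather than the Haiku kernel kit, and it is not the MandelbrotSMP code; the function names and the fixed view region are assumptions made for the example.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Escape-time iteration count for one point c, capped at max_iter.
int mandel_iterations(std::complex<double> c, int max_iter)
{
    std::complex<double> z(0.0, 0.0);
    int i = 0;
    while (i < max_iter && std::norm(z) <= 4.0) {
        z = z * z + c;
        ++i;
    }
    return i;
}

// The single worker function: renders rows [row_begin, row_end).
void render_rows(std::vector<int>& image, int width, int height,
                 int row_begin, int row_end, int max_iter)
{
    for (int y = row_begin; y < row_end; ++y) {
        for (int x = 0; x < width; ++x) {
            // Map the pixel onto the region [-2, 1] x [-1.5, 1.5].
            double re = -2.0 + 3.0 * x / width;
            double im = -1.5 + 3.0 * y / height;
            image[static_cast<std::size_t>(y) * width + x] =
                mandel_iterations({re, im}, max_iter);
        }
    }
}

// Split the image into horizontal bands, one band per thread.
std::vector<int> render(int width, int height, int max_iter, int threads)
{
    assert(width > 0 && height > 0 && threads > 0);
    std::vector<int> image(static_cast<std::size_t>(width) * height);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        int begin = height * t / threads;
        int end = height * (t + 1) / threads;
        pool.emplace_back(render_rows, std::ref(image), width, height,
                          begin, end, max_iter);
    }
    for (auto& th : pool)
        th.join();
    return image;
}
```

Each thread writes only its own rows, so no locking is needed, and the thread count is just a parameter instead of a copy of the code per core.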

Sounds terrific Andrew.

That is a good approach. Start easy and work your way up from there. Great to see you getting involved. I look forward to seeing some of your fixes for Mandelbrot in Haiku!!!

I was hoping cpu_count would help you out.

Which programs did you look at for SMP code so far?

Mandelbrot - quick and dirty approach, a copy of the same code for each thread
XaoS - 10 different C thread functions, overly complicated
Old BeOS newsletter - beautiful elegant C++ code but doesn’t work :-/

I looked at other code like GLTeapot and Haiku3D. They are multi-threaded but not SMP.

I have a really good demo in mind but it will take some work.

Definitely seeing higher CPU utilization now. Turning this into a functional computing advantage will be the real trick. At 8092 iterations my machine crunches this in about 10 seconds.