[ARTICLE] reverse-engineering AGX (Apple M1 GPU) drivers: the Impossible Bug

In high-end games the limiting factor is usually fragment processing rather than vertex processing. A pixel on screen is typically shaded 10 or more times during a frame. If you halve the render resolution in each dimension, you need 4x fewer shader invocations for fragment processing, which helps a lot. Reducing scene complexity affects vertex processing but not fragment processing (assuming deferred rendering, but that's pretty much standard nowadays), so you still need to run the same number of shader invocations for fragment processing, which does not help performance. You can also try using cheaper shaders, which improves performance slightly, but the cost of setting up the shader pipes to run your fragment shader bottlenecks you there, so the gain is small. Disabling certain screen-space effects does help (SSR, for example, is a greedy one), but nothing boosts you more than shading 25% or fewer of the pixels, skipping both the setup and the running costs. That's why these upscaling techniques are gaining popularity, true to the old game-developer wisdom "the fastest geometry to render is the one you don't render at all"… or in this case, "the fastest pixel to shade is the one you don't shade at all".
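The resolution arithmetic above can be sketched with a quick back-of-the-envelope model. The overdraw factor and the resolutions below are illustrative assumptions for the sake of the example, not measurements from any particular GPU:

```python
# Rough model of fragment-shading cost vs. render resolution.
# Assumes every covered pixel is shaded once per overlapping surface
# (the "overdraw" factor); 10x overdraw is the figure quoted above.

def fragment_invocations(width, height, overdraw=10):
    """Approximate fragment-shader invocations per frame."""
    return width * height * overdraw

native = fragment_invocations(3840, 2160)  # render at native 4K
half = fragment_invocations(1920, 1080)    # halve each dimension

# Halving each dimension quarters the pixel count, hence 4x less
# fragment work, regardless of the overdraw factor.
print(native // half)  # -> 4
```

Note that scene complexity never appears in this model: with deferred rendering, fragment cost depends only on the output resolution and the overdraw, which is exactly why shrinking the render target helps where simplifying geometry does not.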

That said, if you have a simple game, perhaps using forward rendering and only one or two post-processing effects, this won't gain you much. But if you have a heavy game that eats the most recent high-end GPU models for breakfast and you run it on a 4K monitor or larger, then you do gain (:smiley: okay… that's an extreme example, but it happens).

2 Likes

Unfortunately I'm not experienced enough to answer you directly, but I am very fascinated by these techniques (they are so sophisticated they seem like magic), and from the many papers I have seen from NVIDIA and others, I know that the use of AI and deep neural networks is not limited to reconstructing detailed frames at higher resolutions: they can also reconstruct entire "invented, imagined" pieces of the image and make them credible, because they use large databases of images and objects to learn how to do these reconstructions.
In other words, the AI reconstructs the frames with a certain kind of "artificial artistic eye". I can't tell you whether this is the case with MetalFX, probably not, but I am sure that Apple, and other actors like Apple, will very soon make use of these techniques. Take a look at the YouTube channel Two Minute Papers to see the potential that is already in the research and development phase, and I am sure that you too will start to think as I do.

As a point of comparison, I prefer to look at what offline rendering engines do: with AI denoisers they render frames in a fifth of the time or even less compared to rendering without them, and I am sure these are the methods used to collect great performance. The difference is that in video games these techniques must be used and optimized to achieve high performance in real time. I know with certainty that Apple, NVIDIA, and AMD all send their engineers to collaborate on the development of components compatible with their GPUs, not only for the prestige of having Blender exploit their techniques, but also as research and development. What I noticed years ago as avant-garde experiments is today starting to become the norm on their GPUs.

UPDATE:

A couple of months ago, Asahi Lina joined our team and took on the challenge of reverse engineering the M1 GPU hardware interface and writing a driver for it. In this short time, she has already built a prototype driver good enough to run real graphics applications and benchmarks, building on top of the existing Mesa work. The proof of concept uses m1n1 via a USB connection and runs the driver remotely, so it is bottlenecked by USB bandwidth, but she has also demonstrated that the GPU proper renders the GLMark2 phong-shaded bunny scene at over 1000 FPS at 1080p resolution. This fully open source stack passes 94% of the dEQP-GLES2 test suite. Not bad!

Source: M2 is here! July 2022 Release & Progress Report - Asahi Linux

1 Like