Progress report: I improved the architecture by introducing a reference-counted GPU buffer class (BReference<BufferObject>) and a GPU memory manager. The memory manager can allocate three types of memory: VRAM mappable to the CPU, VRAM not mappable to the CPU, and CPU memory mapped to the GPU (GTT). A GTT buffer is backed by a Haiku area.
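To illustrate the idea, here is a minimal sketch of a memory manager with three domains that hands out reference-counted buffers. It is not the actual driver code: `MemoryDomain`, `MemoryManager`, `SetRange` and `Alloc` are hypothetical names, `std::shared_ptr` stands in for Haiku's `BReference`, and the allocator is a simple bump allocator that never reuses freed ranges.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// The three memory types from the post (names are hypothetical).
enum class MemoryDomain : size_t { VramMappable = 0, VramUnmappable = 1, Gtt = 2 };

struct BufferObject {
    MemoryDomain domain;
    uint64_t gpuAddress;  // address as seen by the GPU
    uint64_t size;        // size in bytes, rounded up to whole pages
};

// Stand-in for BReference<BufferObject>: the buffer lives until the last reference is dropped.
using BufferRef = std::shared_ptr<BufferObject>;

class MemoryManager {
public:
    static constexpr uint64_t kPageSize = 0x1000;

    // Assigns the GPU address range [base, base + size) to a domain.
    void SetRange(MemoryDomain domain, uint64_t base, uint64_t size) {
        fRanges[static_cast<size_t>(domain)] = {base, base + size};
    }

    // Bump allocation: returns nullptr when the domain has no room left.
    // A real manager would also return the range to a free list on release.
    BufferRef Alloc(MemoryDomain domain, uint64_t size) {
        assert(size > 0);
        Range& range = fRanges[static_cast<size_t>(domain)];
        uint64_t rounded = (size + kPageSize - 1) & ~(kPageSize - 1);
        if (rounded > range.end - range.next)
            return nullptr;
        uint64_t address = range.next;
        range.next += rounded;
        return std::make_shared<BufferObject>(BufferObject{domain, address, rounded});
    }

private:
    struct Range {
        uint64_t next;  // next free address
        uint64_t end;   // one past the last usable address
    };
    std::array<Range, 3> fRanges{};  // zero-initialized: every domain starts empty
};
```

For example, after `SetRange(MemoryDomain::Gtt, 0x80000000, 0x10000)`, the first `Alloc(MemoryDomain::Gtt, 100)` returns a one-page buffer at GPU address 0x80000000, the same GPU address the log below reports for the GTT test buffer.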
I implemented a GART page table that maps CPU memory into the GPU GTT address range. I used the Haiku Poke driver to get the physical addresses of the area's pages and write them to the GART page table. So CPU memory mapping is implemented entirely in userland, without a special kernel driver. I tested that the GPU DMA engine can write to GTT memory.
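The mapping step can be pictured with the following sketch, which models the page table as a plain array in host memory. It is an illustration under assumptions, not the driver's code: `GartTable`, `Map` and `Translate` are hypothetical names, and `kPteValid` is a placeholder flag rather than the real hardware entry layout. On real hardware the table has to live in memory the GPU can read, and the physical addresses would come from the Poke driver as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint64_t kPageSize = 0x1000;
constexpr uint64_t kPteValid = 1;  // placeholder flag bit, NOT the real entry format

class GartTable {
public:
    GartTable(uint64_t gttBase, size_t pageCount)
        : fGttBase(gttBase), fEntries(pageCount, 0) {}

    // Maps consecutive GPU pages starting at gpuAddress to the given physical pages.
    bool Map(uint64_t gpuAddress, const std::vector<uint64_t>& physicalPages) {
        if (gpuAddress < fGttBase || (gpuAddress - fGttBase) % kPageSize != 0)
            return false;
        size_t first = static_cast<size_t>((gpuAddress - fGttBase) / kPageSize);
        if (first > fEntries.size() || physicalPages.size() > fEntries.size() - first)
            return false;
        for (uint64_t page : physicalPages) {
            if (page % kPageSize != 0)
                return false;  // physical pages must be page aligned
        }
        for (size_t i = 0; i < physicalPages.size(); i++)
            fEntries[first + i] = physicalPages[i] | kPteValid;
        return true;
    }

    // Models the lookup the GPU performs for an address inside the GTT range.
    bool Translate(uint64_t gpuAddress, uint64_t& physical) const {
        if (gpuAddress < fGttBase)
            return false;
        uint64_t offset = gpuAddress - fGttBase;
        uint64_t index = offset / kPageSize;
        if (index >= fEntries.size() || (fEntries[index] & kPteValid) == 0)
            return false;
        physical = (fEntries[index] & ~(kPageSize - 1)) + offset % kPageSize;
        return true;
    }

private:
    uint64_t fGttBase;               // first GPU address covered by the table
    std::vector<uint64_t> fEntries;  // one entry per 4 KiB page
};
```

With the first two pages from the log below (0x12e1b8000 and 0x12e1ba000) mapped at GPU address 0x80000000, a GPU access to 0x80001010 resolves to physical address 0x12e1ba010. Note that the physical pages need not be contiguous, which is the point of the table.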
I also implemented and tested indirect buffers (IB). Indirect buffers allow commands to be executed without copying them into the ring buffer. Instead, an "execute indirect buffer" command is written to the ring buffer with the indirect buffer's address and size as parameters. The Vulkan driver prepares and sends commands in indirect buffers.
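A rough sketch of the submission path follows. The packet layout here is invented for illustration: `kOpExecuteIb` and the four-dword format are assumptions, not the real Radeon packet encoding, and `Ring` is a hypothetical class. What it shows is that the ring receives only a small fixed-size packet, however large the indirect buffer is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcode; the real packet header format differs.
constexpr uint32_t kOpExecuteIb = 0xB0000000u;

class Ring {
public:
    explicit Ring(size_t dwordCount) : fData(dwordCount, 0), fWptr(0) {
        assert(dwordCount > 0);
    }

    // Writes one dword at the write pointer and advances it, wrapping at the end.
    void Write(uint32_t value) {
        fData[fWptr] = value;
        fWptr = (fWptr + 1) % fData.size();
    }

    // Queues an indirect buffer by reference: header, address (low, high), length.
    // The commands themselves stay in the indirect buffer in GPU-visible memory.
    void ExecuteIndirect(uint64_t ibGpuAddress, uint32_t ibDwords) {
        Write(kOpExecuteIb);
        Write(static_cast<uint32_t>(ibGpuAddress & 0xffffffffu));
        Write(static_cast<uint32_t>(ibGpuAddress >> 32));
        Write(ibDwords);
    }

    size_t Wptr() const { return fWptr; }
    uint32_t At(size_t index) const { return fData[index]; }

private:
    std::vector<uint32_t> fData;  // ring contents, in dwords
    size_t fWptr;                 // write pointer, in dwords
};
```

After filling the ring, a real driver would publish the new write pointer to the hardware register so that the engine starts fetching; that step is left out here.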
I am currently experiencing some instability: sometimes DMA write operations have no effect, and sometimes the DMA engine stops completely until reboot. I am probably doing something wrong or missing some initialization code.
The Linux driver loads firmware files during GPU initialization. I am not sure what exactly that firmware does, and I currently don't use it.
GTT buffer test. bufAdr is the address of a Haiku area that is mapped to the GPU by GART and written by the GPU DMA engine.
/dev/graphics/framebuffer
signature: framebuffer.accelerant
/dev/graphics/intel_extreme_000200
signature: intel_extreme.accelerant
/dev/graphics/radeon_hd_010000
signature: radeon_hd.accelerant
RADEON_GET_PRIVATE_DATA
gSharedInfo->frame_buffer_size: 0x40000
regs[DMA_CNTL]: 0x8210400
regs[DMA_RB_CNTL]: 0x1015
regs[DMA_IB_CNTL]: 0x1
regs[DMA_STATUS_REG]: 0x44c83d57({0, 1, 2, 4, 6, 8, 10, 11, 12, 13, 19, 22, 23, 26, 30})
regs[DMA_RB_RPTR_ADDR]: 0
rptr: 384
wptr: 384
disabling rings
init ring
regs[DMA_CNTL]: 0x8210400
regs[DMA_RB_CNTL]: 0x1015
regs[DMA_IB_CNTL]: 0x1
regs[DMA_STATUS_REG]: 0x44c83d57({0, 1, 2, 4, 6, 8, 10, 11, 12, 13, 19, 22, 23, 26, 30})
regs[DMA_RB_RPTR_ADDR]: 0
rptr: 0
wptr: 0
TestGart()
(1)
GartMap()
0: 0x12e1b8000
0x1000: 0x12e1ba000
0x2000: 0x12e1bb000
0x3000: 0x12e1bc000
0x4000: 0x12e1bd000
0x5000: 0x12e1be000
0x6000: 0x12e1bf000
0x7000: 0x12e1c0000
0x8000: 0x12e1c1000
0x9000: 0x12e1c2000
0xa000: 0x12e1c3000
0xb000: 0x12e1c4000
0xc000: 0x12e1c5000
0xd000: 0x12e1c6000
0xe000: 0x12e1c7000
0xf000: 0x12e1c8000
bufAdr: 0xa680c59000
buf->gpuPhysAdr: 0x80000000
(2)
(3)
(1) bufAdr[0]: -1
(1) bufAdr[1]: -1
(1) bufAdr[2]: -1
(1) bufAdr[3]: -1
(1) bufAdr[4]: -1
(1) bufAdr[5]: -1
(1) bufAdr[6]: -1
(1) bufAdr[7]: -1
[!] attempts expired when waiting fence
regs[DMA_STATUS_REG]: 0x44c83d57({0, 1, 2, 4, 6, 8, 10, 11, 12, 13, 19, 22, 23, 26, 30})
rptr: 192
wptr: 192
*fenceAdr: 1
fenceVal: 1
(2) bufAdr[0]: -1
(2) bufAdr[1]: -1
(2) bufAdr[2]: -1
(2) bufAdr[3]: -1
(2) bufAdr[4]: -1
(2) bufAdr[5]: -1
(2) bufAdr[6]: -1
(2) bufAdr[7]: -1
rptr: 192
wptr: 192
(1) bufAdr[8]: -1
(1) bufAdr[9]: -1
(1) bufAdr[10]: -1
(1) bufAdr[11]: -1
(1) bufAdr[12]: -1
(1) bufAdr[13]: -1
(1) bufAdr[14]: -1
(1) bufAdr[15]: -1
[!] attempts expired when waiting fence
regs[DMA_STATUS_REG]: 0x44c83d57({0, 1, 2, 4, 6, 8, 10, 11, 12, 13, 19, 22, 23, 26, 30})
rptr: 384
wptr: 384
*fenceAdr: 2
fenceVal: 2
(2) bufAdr[8]: -1
(2) bufAdr[9]: -1
(2) bufAdr[10]: -1
(2) bufAdr[11]: -1
(2) bufAdr[12]: -1
(2) bufAdr[13]: -1
(2) bufAdr[14]: -1
(2) bufAdr[15]: -1
rptr: 384
wptr: 384
(1) bufAdr[16]: -1
(1) bufAdr[17]: -1
(1) bufAdr[18]: -1
(1) bufAdr[19]: -1
(1) bufAdr[20]: -1
(1) bufAdr[21]: -1
(1) bufAdr[22]: -1
(1) bufAdr[23]: -1
(2) bufAdr[16]: 16
(2) bufAdr[17]: 17
(2) bufAdr[18]: 18
(2) bufAdr[19]: 19
(2) bufAdr[20]: 20
(2) bufAdr[21]: 21
(2) bufAdr[22]: 22
(2) bufAdr[23]: 23
rptr: 576
wptr: 576
(1) bufAdr[24]: -1
(1) bufAdr[25]: -1
(1) bufAdr[26]: -1
(1) bufAdr[27]: -1
(1) bufAdr[28]: -1
(1) bufAdr[29]: -1
(1) bufAdr[30]: -1
(1) bufAdr[31]: -1
(2) bufAdr[24]: 24
(2) bufAdr[25]: 25
(2) bufAdr[26]: 26
(2) bufAdr[27]: 27
(2) bufAdr[28]: 28
(2) bufAdr[29]: 29
(2) bufAdr[30]: 30
(2) bufAdr[31]: 31
rptr: 768
wptr: 768