Render timings

Welcome!

This is my first blog post, and I am going to talk about the timing of the rendering part of the handheld game console (I really need a better project name!). I am still optimizing it, so the exact timings might change as the software evolves.

The screenshot is a bit too small, but let's go through it and it should become clearer.

I am using a Saleae Logic analyzer to measure how much time the software spends in each part of the rendering, and I do this by toggling an LED output. The whole rendering process takes 1.45 ms, including copying the framebuffer to the LCD, which is the last part of the rendering phase. The copy takes approximately 1.23 ms. I am probably overclocking the transfer a bit, but the display seems to accept that. Note that the transfer from framebuffer to LCD is done using DMA2D in the background, freeing the CPU for its other tasks. During the transfer the CPU must not change the framebuffer, since that would corrupt the content and create visible artifacts, but it can do other things such as checking for button presses, handling USB requests, etc.
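The pin-toggling measurement described above can be sketched roughly like this. It is a minimal sketch that runs on a PC, not the console's firmware: `fake_gpio_out`, `PROBE_PIN_MASK`, `probe_high`, `probe_low`, `timed_stage` and `example_stage` are all hypothetical names, and a plain variable stands in for the memory-mapped GPIO register that real hardware would use.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a plain variable stands in for the memory-mapped GPIO output
   register so that the sketch can run on a PC. */
static volatile uint32_t fake_gpio_out = 0;

/* Assumption: the LED sits on bit 5; the real pin is not stated in the post. */
#define PROBE_PIN_MASK (1u << 5)

void probe_high(void) { fake_gpio_out |= PROBE_PIN_MASK; }
void probe_low(void)  { fake_gpio_out &= ~PROBE_PIN_MASK; }

typedef void (*stage_fn)(void);

/* Run one stage with the probe pin held high for its whole duration. The
   logic analyzer then shows the stage as a single pulse whose width is the
   execution time, plus the small cost of the two pin writes. */
void timed_stage(stage_fn stage)
{
    assert(stage != 0);
    probe_high();
    stage();
    probe_low();
}

/* Example stage that only records whether the pin was high while it ran. */
static int pin_was_high_during_stage = 0;

void example_stage(void)
{
    pin_was_high_during_stage = (fake_gpio_out & PROBE_PIN_MASK) != 0;
}
```

Calling `timed_stage(example_stage)` raises the pin, runs the stage and lowers the pin again. On real hardware each rendering stage would be wrapped the same way, giving one pulse per stage on the analyzer trace.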

So the first thing done when entering the render phase is to make sure the DMA2D is free and no copy is ongoing. After that check, the object list for the scene is iterated and each object is rendered to the framebuffer. The content of the object list is set up by the Lua script. In this case there are 8 objects to render. The initialization part of the Lua script looks like this:

rect1upper = Rectangle.new(0x0ff0, rect1_x, 0, width, 40) 
rect1lower = Rectangle.new(0x0ff0, rect1_x, 128-40, width, 40)
rect2upper = Rectangle.new(0xff00, rect2_x, 0, width, 40)
rect2lower = Rectangle.new(0xff00, rect2_x, 128-40, width, 40)
cir1 = Circle.new(0xff0, 50, 50, 20)
bird = Rectangle.new(0x0fff, 10, player_y, 20, 20)
level_text = Text.new(16, 0xffff, 10, 20, 'tjo')

Note that only 7 objects are added to the scene in the script. The background fill is done automatically, so there is no need to add it in the Lua script.
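The order of the render phase (wait for the previous copy, fill the background, walk the object list, start the copy to the LCD) could look roughly like the sketch below. This is not the actual firmware: `scene_object`, `lcd_copy_busy`, `fill_background`, `start_lcd_copy`, `count_draw` and `render_frame` are hypothetical names, and the stubs only count calls instead of touching hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: every scene object carries a draw callback. */
typedef struct scene_object {
    void (*draw)(const struct scene_object *self);
} scene_object;

/* Assumption: set while a framebuffer-to-LCD copy is in flight; on real
   hardware a transfer-complete interrupt would clear it. */
static volatile bool lcd_copy_busy = false;

/* Bookkeeping for this example only. */
static int background_fills = 0;
static int objects_drawn = 0;

/* Stubs standing in for the real drawing and transfer code. */
void fill_background(void) { background_fills++; }
void start_lcd_copy(void)  { lcd_copy_busy = true; }
void count_draw(const scene_object *self) { (void)self; objects_drawn++; }

void render_frame(const scene_object *const *objects, size_t count)
{
    assert(objects != NULL || count == 0);

    /* 1. The framebuffer must not change while it is being copied. */
    while (lcd_copy_busy) {
        /* wait for the previous transfer to finish */
    }

    /* 2. The background is filled automatically, it is not in the list. */
    fill_background();

    /* 3. Draw the objects in the order the script added them. */
    for (size_t i = 0; i < count; i++) {
        objects[i]->draw(objects[i]);
    }

    /* 4. Start the copy to the LCD; it then runs in the background. */
    start_lcd_copy();
}
```

With a list of 7 objects this performs 8 drawing steps in total, matching the count above: one background fill plus the 7 objects from the script.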

This is what is displayed on the screen:

Measuring the timing with the logic analyzer gives the following result:

  • Filling the background: 87.6 us
  • Drawing the first 4 rectangles: 8.9 us per rectangle
  • Drawing the circle: 53.4 us
  • Drawing the 5th rectangle: 7.9 us (this one is smaller than the first four)
  • Drawing the text string: 37.6 us
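As a cross-check, the stage times in the list can be added up and compared with the 1.45 ms total. The sketch below does the arithmetic in tenths of a microsecond so that integer sums stay exact. The constant names and `tenths_to_ms` are made up for this example; the numbers are the measurements from this post.

```c
#include <assert.h>

/* Measured stage times in tenths of a microsecond (87.6 us -> 876). */
enum {
    T_BACKGROUND = 876,
    T_RECT_LARGE = 89,     /* each of the first four rectangles */
    T_CIRCLE     = 534,
    T_RECT_SMALL = 79,     /* the 5th, smaller rectangle */
    T_TEXT       = 376,
    T_LCD_COPY   = 12300,  /* approx. 1.23 ms framebuffer-to-LCD copy */

    DRAW_TOTAL  = T_BACKGROUND + 4 * T_RECT_LARGE + T_CIRCLE
                + T_RECT_SMALL + T_TEXT,
    FRAME_TOTAL = DRAW_TOTAL + T_LCD_COPY
};

/* Compile-time checks of the sums. */
static_assert(DRAW_TOTAL == 2221, "drawing stages add up to 222.1 us");
static_assert(FRAME_TOTAL == 14521, "drawing plus copy is 1452.1 us");

/* Convert tenths of a microsecond to milliseconds. */
double tenths_to_ms(int tenths_us) { return tenths_us / 10000.0; }
```

The drawing stages add up to 222.1 us, and together with the copy of about 1230 us that gives roughly 1452 us, which agrees with the 1.45 ms measured for the whole rendering process.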

The rectangles are drawn using the DMA2D, which is why they are faster than e.g. the circle, where each pixel is rendered individually.
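To illustrate why per-pixel rendering costs CPU time, here is a minimal sketch of a circle drawn one pixel at a time. It rests on several assumptions that the post does not state: the circle is filled, pixels are 16 bits wide, and the framebuffer is 160 pixels wide (only the height of 128 can be guessed from the script). `framebuffer`, `FB_WIDTH`, `FB_HEIGHT` and `draw_filled_circle` are hypothetical names, and the real firmware may use a different algorithm.

```c
#include <assert.h>
#include <stdint.h>

#define FB_WIDTH  160  /* assumption: the width is not stated in the post */
#define FB_HEIGHT 128  /* assumption: the script positions objects against 128 */

static uint16_t framebuffer[FB_HEIGHT][FB_WIDTH];

/* Draw a filled circle by testing and writing every pixel individually.
   Returns the number of pixels written, i.e. the number of CPU stores. */
int draw_filled_circle(uint16_t color, int cx, int cy, int r)
{
    assert(r >= 0);
    int written = 0;

    for (int y = cy - r; y <= cy + r; y++) {
        for (int x = cx - r; x <= cx + r; x++) {
            int dx = x - cx;
            int dy = y - cy;

            if (dx * dx + dy * dy > r * r) {
                continue;  /* outside the circle */
            }
            if (x < 0 || x >= FB_WIDTH || y < 0 || y >= FB_HEIGHT) {
                continue;  /* outside the framebuffer */
            }
            framebuffer[y][x] = color;
            written++;
        }
    }
    return written;
}
```

For the circle in the script (centre 50, 50 and radius 20) this loop tests 41 x 41 candidate pixels and writes well over a thousand of them, every one handled by the CPU. A rectangle fill, by contrast, can be handed to the DMA2D as a single job.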
