The MCU I am using, the STM32H745IG, has a lot of memory, both in total size and in the number of different types.
To start, it has two independent flash banks. Each bank is 1 Mbyte and can store both code and data. The first flash bank (starting at address 0x08000000) is used by the Cortex-M7 and the second bank (starting at address 0x08100000) is dedicated to the Cortex-M4. In the current firmware (which is far from ready) I am using 110 Kbytes for the Cortex-M7 and 220 Kbytes for the Cortex-M4. Most of the M4 part is the Lua interpreter.
I have also attached an external single Quad-SPI flash memory to store games and game data. Graphics take a lot of space and the internal flash banks would not be enough, and the external flash also makes it easier to download new games to the unit without risking erasing the firmware. The Quad-SPI flash is 64 Mbytes, but in the next hardware version I will likely reduce it to 32 Mbytes and use a dual-flash solution to increase the speed even more.
The device also comes with a lot of RAM, 1 Mbyte in total, but you do not get one nice linear block. The RAM is divided into several blocks of different sizes and types, located on different internal buses, which means that not every memory is accessible from everywhere.
Let's start with the biggest one, the AXI SRAM. It is the single largest block of memory, 512 Kbytes, and is located on the AXI bus (hence the name). The AXI bus is 64 bits wide, which makes this one of the fastest memories in the device. I use this memory to store two framebuffers. One framebuffer is used for rendering graphics and the other for transferring to the LCD. The LCD is quite slow, so with two framebuffers the firmware can draw into one buffer while the other is read out and pushed to the LCD.
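The two-framebuffer scheme can be sketched roughly as below. This is only a minimal illustration, not the actual firmware: the panel geometry (320 x 240 pixels, 16 bits per pixel) and the names `fb_draw_buffer`, `fb_display_buffer` and `fb_swap` are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed panel geometry: 320 x 240 pixels, 16 bits per pixel (RGB565). */
#define FB_WIDTH  320u
#define FB_HEIGHT 240u

typedef uint16_t pixel_t;

/* On the target these buffers would be placed in AXI SRAM through a
 * linker section; here they are plain statics so the sketch builds anywhere. */
static pixel_t framebuffers[2][FB_WIDTH * FB_HEIGHT];

/* Both buffers together must fit in the 512 Kbytes of AXI SRAM. */
static_assert(sizeof framebuffers <= 512u * 1024u,
              "both framebuffers must fit in AXI SRAM");

/* Index of the buffer the renderer currently draws into. */
static unsigned draw_index = 0;

/* Buffer the renderer writes the next frame into. */
pixel_t *fb_draw_buffer(void)
{
    return framebuffers[draw_index];
}

/* Buffer that is currently being pushed to the LCD. */
const pixel_t *fb_display_buffer(void)
{
    return framebuffers[draw_index ^ 1u];
}

/* Swap roles. Call this only when the new frame is complete and the
 * LCD transfer of the other buffer has finished. */
void fb_swap(void)
{
    draw_index ^= 1u;
}
```

With these assumed dimensions each buffer is 150 Kbytes, so the pair takes 300 Kbytes of the 512 Kbytes available.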
The SRAM1 block is 128 Kbytes. This block is used for stack and heap data for the Cortex-M4.
The second SRAM, SRAM2, is currently not used, but I am planning to use it as a cache for bitmap data that is used a lot in the game. Reading from SRAM will be a lot quicker than reading from the external flash memory every time.
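Since the cache is only planned, the following is just one possible shape for it, not a description of the firmware. It is a deliberately simple sketch: bitmaps are copied into a fixed pool the size of SRAM2, and when the pool or the entry table is full everything is dropped. The bitmap `id`, the entry count and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_BYTES   (128u * 1024u) /* size of SRAM2 */
#define CACHE_ENTRIES 32u            /* assumed maximum number of cached bitmaps */

typedef struct {
    uint32_t id;     /* hypothetical bitmap identifier */
    uint32_t offset; /* start of the copy inside cache_pool */
    uint32_t size;   /* size of the copy in bytes */
} cache_entry_t;

/* On the target the pool would be placed in SRAM2 through a linker section. */
static uint8_t       cache_pool[CACHE_BYTES];
static cache_entry_t cache_entries[CACHE_ENTRIES];
static uint32_t      cache_count; /* entries in use */
static uint32_t      cache_used;  /* bytes in use */

/* Return the cached copy of a bitmap, or NULL if it is not cached. */
const uint8_t *bitmap_cache_find(uint32_t id)
{
    for (uint32_t i = 0; i < cache_count; i++) {
        if (cache_entries[i].id == id) {
            return &cache_pool[cache_entries[i].offset];
        }
    }
    return NULL;
}

/* Copy a bitmap from (slow) external flash into the cache and return
 * the cached copy. Returns NULL if the bitmap can never fit. */
const uint8_t *bitmap_cache_load(uint32_t id, const uint8_t *src, uint32_t size)
{
    assert(src != NULL);

    const uint8_t *hit = bitmap_cache_find(id);
    if (hit != NULL) {
        return hit;
    }
    if (size > CACHE_BYTES) {
        return NULL;
    }
    if (cache_count == CACHE_ENTRIES || size > CACHE_BYTES - cache_used) {
        /* Simplest possible eviction: drop everything and start over. */
        cache_count = 0;
        cache_used  = 0;
    }

    cache_entry_t *entry = &cache_entries[cache_count++];
    entry->id     = id;
    entry->offset = cache_used;
    entry->size   = size;
    memcpy(&cache_pool[cache_used], src, size);
    cache_used += size;
    return &cache_pool[entry->offset];
}
```

Note that dropping everything invalidates pointers handed out earlier, and the sketch does not align entries. A real cache would probably want both handled, for example with per-frame lookups and word-aligned offsets.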
SRAM3 is a shared memory and one of the smallest, 32 Kbytes. I use this to synchronize communication between the M4 and M7.
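One common way to pass messages through a shared block like this is a single-producer, single-consumer ring buffer. The sketch below shows the idea with C11 atomics. It is an assumption about how such a channel could look, not the scheme actually used here, and `mailbox_t`, `mailbox_send` and `mailbox_receive` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAILBOX_SLOTS 8u /* must be a power of two */

static_assert((MAILBOX_SLOTS & (MAILBOX_SLOTS - 1u)) == 0u,
              "MAILBOX_SLOTS must be a power of two");

/* One direction of communication, for example M4 -> M7.
 * On the target one instance would be placed in SRAM3. */
typedef struct {
    _Atomic uint32_t head; /* written only by the producer */
    _Atomic uint32_t tail; /* written only by the consumer */
    uint32_t slots[MAILBOX_SLOTS];
} mailbox_t;

/* Producer side: returns false if the mailbox is full. */
bool mailbox_send(mailbox_t *mb, uint32_t msg)
{
    uint32_t head = atomic_load_explicit(&mb->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&mb->tail, memory_order_acquire);
    if (head - tail == MAILBOX_SLOTS) {
        return false;
    }
    mb->slots[head % MAILBOX_SLOTS] = msg;
    /* Publish the slot only after its content has been written. */
    atomic_store_explicit(&mb->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side: returns false if the mailbox is empty. */
bool mailbox_receive(mailbox_t *mb, uint32_t *msg)
{
    uint32_t tail = atomic_load_explicit(&mb->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&mb->head, memory_order_acquire);
    if (head == tail) {
        return false;
    }
    *msg = mb->slots[tail % MAILBOX_SLOTS];
    /* Release the slot only after its content has been read. */
    atomic_store_explicit(&mb->tail, tail + 1u, memory_order_release);
    return true;
}
```

On the real chip the M7 has a data cache, so the shared region would also have to be configured as non-cacheable (or the cache maintained by hand) for the other core to see the writes. The STM32H7 also offers hardware semaphores, which are an alternative way to synchronize the two cores.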
The last SRAM, SRAM4, is also shared between the two cores. SRAM4 contains game-specific data such as the rendering pipeline.
Finally, we have the ITCM, which I do not use, and the DTCM, which holds the stack and heap for the M7. These two memories are extremely fast and tightly coupled to the M7.
|Memory||Size||Usage|
|FLASH1||1 Mbytes||Code and data for M7|
|FLASH2||1 Mbytes||Code and data for M4|
|SRAM1||128 Kbytes||Stack & heap for M4|
|SRAM2||128 Kbytes||Planned as cache for bitmap graphics|
|SRAM3||32 Kbytes||Shared memory M4 <-> M7 for communication|
|SRAM4||64 Kbytes||Shared memory M4 <-> M7 for game data|
|AXI SRAM||512 Kbytes||2 x Framebuffer|
|ITCM||64 Kbytes||Not currently used|
|DTCM||128 Kbytes||Stack & heap for M7|
|QUADSPI||64 Mbytes||Storing of game and game data|
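For reference, the same memory map can be written down as a small C table with a lookup helper. The base addresses and sizes are taken from the STM32H745 reference manual as I read it (note that the DTCM is 128 Kbytes there), and the `m4_access` flag reflects my understanding that the two TCM blocks are only reachable from the M7. The type and function names are made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KB(n) ((uint32_t)(n) * 1024u)

typedef struct {
    const char *name;
    uint32_t    base;      /* start address in the memory map */
    uint32_t    size;      /* size in bytes */
    bool        m4_access; /* false if only the Cortex-M7 can reach it */
} mem_region_t;

static const mem_region_t memory_map[] = {
    { "ITCM",     0x00000000u, KB(64),           false },
    { "FLASH1",   0x08000000u, KB(1024),         true  },
    { "FLASH2",   0x08100000u, KB(1024),         true  },
    { "DTCM",     0x20000000u, KB(128),          false },
    { "AXI SRAM", 0x24000000u, KB(512),          true  },
    { "SRAM1",    0x30000000u, KB(128),          true  },
    { "SRAM2",    0x30020000u, KB(128),          true  },
    { "SRAM3",    0x30040000u, KB(32),           true  },
    { "SRAM4",    0x38000000u, KB(64),           true  },
    /* External flash, when the QUADSPI runs in memory-mapped mode. */
    { "QUADSPI",  0x90000000u, KB(64u * 1024u),  true  },
};

static_assert(sizeof memory_map / sizeof memory_map[0] == 10,
              "one entry per row of the table");

/* Return the region containing addr, or NULL if addr is not in the table. */
const mem_region_t *region_for_address(uint32_t addr)
{
    for (size_t i = 0; i < sizeof memory_map / sizeof memory_map[0]; i++) {
        const mem_region_t *r = &memory_map[i];
        if (addr >= r->base && addr - r->base < r->size) {
            return r;
        }
    }
    return NULL;
}
```

A helper like this is mostly useful for debugging, for example to check in an assert that a buffer handed to the M4 does not live in one of the TCM blocks.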