You know, it would probably be better to understand what it is you are doing.
I imagine you have some font that assigns a monochrome 8x16 picture to each code point. So for capital A you would start with something like
Code:
--------
--------
--------
--####--
-#----#-
-#----#-
-#----#-
-#----#-
-######-
-#----#-
-#----#-
-#----#-
-#----#-
--------
--------
--------
and end at
Code:
static const uint8_t cap_a[16] = {0x00, 0x00, 0x00, 0x3c, 0x42, 0x42, 0x42, 0x42, 0x7e, 0x42, 0x42, 0x42, 0x42, 0x00, 0x00, 0x00};
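In case the step from the picture to the bytes is not obvious, here is a minimal sketch of it. The helper name pack_row is made up for illustration, and it assumes the leftmost character of each row ends up in the most significant bit, which is what the values above imply.

```c
#include <stdint.h>

/* Hypothetical helper: pack one 8-character row of the picture into a byte.
 * '#' is a set pixel, anything else is clear. The leftmost character
 * becomes the most significant bit (bit 7). */
static uint8_t pack_row(const char *row)
{
    uint8_t bits = 0;
    for (int j = 0; j < 8; j++) {
        bits = (uint8_t)(bits << 1);   /* make room for the next pixel */
        if (row[j] == '#')
            bits |= 1;
    }
    return bits;
}
```

Feeding it "--####--" gives 0x3c and "-#----#-" gives 0x42, matching the fourth and fifth entries of cap_a.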
Now what do you do with that? What do you actually want? You want to take eight pixels in a row and paint each one white if the corresponding bit in the font is set, else black (yes, yes, more generally, foreground and background color, respectively, but let us start simply for now). For 8bpp, 0xFF is a white pixel and 0x00 is a black one (assuming a palette that maps those two indices that way), but for 32bpp, it is 0xFFFFFFFF and 0x00000000 respectively. So on an 8bpp framebuffer, you want to turn each byte of font data into eight bytes of framebuffer data, thus turning each 8 bits into 64 bits. But on a 32bpp FB, you want to turn each byte into 32 bytes of FB data.
So, for a 32bpp FB, the easiest is probably to model the FB as a pitch x height array of 32-bit units, with the pitch counted in pixels (if your bootloader reports the pitch in bytes, divide it by four first). The nice thing about pointers into arrays is that they simultaneously act as spans of the array, so you could write it something like this:
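To make the size difference concrete, here is a rough sketch of the expansion of a single font byte for both depths. The function names are invented for this example, and the 8bpp version assumes 0xFF and 0x00 really are white and black in whatever palette is active.

```c
#include <stdint.h>

/* Hypothetical: one font byte becomes eight 8bpp pixels (8 bytes). */
static void expand_row_8bpp(uint8_t out[8], uint8_t font_row)
{
    for (int j = 0; j < 8; j++)
        out[j] = ((font_row >> (7 - j)) & 1) ? 0xFF : 0x00;
}

/* Hypothetical: one font byte becomes eight 32bpp pixels (32 bytes). */
static void expand_row_32bpp(uint32_t out[8], uint8_t font_row)
{
    for (int j = 0; j < 8; j++)
        out[j] = ((font_row >> (7 - j)) & 1) ? UINT32_C(0xFFFFFFFF)
                                             : UINT32_C(0x00000000);
}
```

The loop is the same in both cases. Only the element type of the output, and with it the number of bytes written, changes.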
Code:
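#include <stddef.h>
#include <stdint.h>

#define WHITE_PIXEL UINT32_C(0xFFFFFFFF)
#define BLACK_PIXEL UINT32_C(0x00000000)

/* fb_pitch is in 32-bit pixels, not bytes. */
void render_char_32bpp(uint32_t *fb, size_t pix_x, size_t pix_y, const uint8_t font[static 16], size_t fb_pitch)
{
    uint32_t *line = fb + pix_y * fb_pitch + pix_x;  /* top-left pixel of the glyph */
    for (size_t i = 0; i < 16; i++) {                /* 16 rows per glyph */
        for (size_t j = 0; j < 8; j++) {             /* bit 7 is the leftmost pixel */
            line[j] = ((font[i] >> (7 - j)) & 1) ? WHITE_PIXEL : BLACK_PIXEL;
        }
        line += fb_pitch;                            /* one scanline down */
    }
}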
Of course, that is nowhere near the end. For one, you could save the X and Y parameters by making the initial calculation of "line" external and handing that pointer over to the function to start with. That way, a possible string rendering function would be able to just keep track of that pointer itself, and would only have to advance one pointer between the characters. Also, the function should probably stop rendering when reaching FB width or height. As it stands, it must be called with all parameters firmly in bounds, or else it'll write all over some other memory. Then again, if the callers can guarantee that they will only ever call the function with correct parameters, that checking is not necessary.
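Both ideas together might look roughly like the following. This is only a sketch: the function names, the cols/rows clipping parameters, and the flat font_table layout (256 glyphs of 16 bytes each, indexed by character code) are all assumptions of mine, not anything your font or bootloader dictates.

```c
#include <stddef.h>
#include <stdint.h>

#define WHITE_PIXEL UINT32_C(0xFFFFFFFF)
#define BLACK_PIXEL UINT32_C(0x00000000)

/* Hypothetical variant: `line` already points at the glyph's top-left pixel.
 * `cols` and `rows` say how much room is left in the framebuffer, so
 * anything beyond that is simply not drawn. */
static void render_glyph_32bpp(uint32_t *line, size_t fb_pitch,
                               const uint8_t *font, size_t cols, size_t rows)
{
    if (cols > 8)
        cols = 8;
    if (rows > 16)
        rows = 16;
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++)
            line[j] = ((font[i] >> (7 - j)) & 1) ? WHITE_PIXEL : BLACK_PIXEL;
        line += fb_pitch;
    }
}

/* Hypothetical string renderer. `font_table` is assumed to hold 256 glyphs
 * of 16 bytes each. fb_pitch is in pixels. Text running past the right
 * edge is cut off, not wrapped. */
static void render_string_32bpp(uint32_t *fb, size_t fb_pitch,
                                size_t fb_width, size_t fb_height,
                                size_t pix_x, size_t pix_y,
                                const uint8_t *font_table, const char *s)
{
    if (pix_y >= fb_height)
        return;
    uint32_t *line = fb + pix_y * fb_pitch + pix_x;
    size_t rows = fb_height - pix_y;
    while (*s != '\0' && pix_x < fb_width) {
        const uint8_t *glyph = font_table + 16 * (size_t)(unsigned char)*s;
        render_glyph_32bpp(line, fb_pitch, glyph, fb_width - pix_x, rows);
        s++;
        pix_x += 8;   /* next character cell */
        line += 8;    /* the one pointer that has to advance */
    }
}
```

The glyph function no longer knows about X and Y at all. The string function does the bounds arithmetic once per character and otherwise just bumps the pointer by eight pixels.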
That routine above is the principal idea for all framebuffers; the only difference is how you model the FB. In a 32bpp FB, it might be an array of 32-bit units. In an 8bpp FB, you might use an array of 8-bit units. Only a minor change is needed to also give it foreground and background colors as parameters, rather than hardcoding white and black in these roles, and then you have your working version.
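For illustration, an 8bpp version with the colors passed in could look something like this. The name render_char_8bpp and the fg/bg parameters are my own choice, the pitch is assumed to be in pixels (which for 8bpp is the same as bytes), and it does no bounds checking.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8bpp variant: the framebuffer is modeled as an array of
 * bytes, and fg/bg are palette indices supplied by the caller. */
static void render_char_8bpp(uint8_t *fb, size_t pix_x, size_t pix_y,
                             const uint8_t *font, size_t fb_pitch,
                             uint8_t fg, uint8_t bg)
{
    uint8_t *line = fb + pix_y * fb_pitch + pix_x;
    for (size_t i = 0; i < 16; i++) {
        for (size_t j = 0; j < 8; j++)
            line[j] = ((font[i] >> (7 - j)) & 1) ? fg : bg;
        line += fb_pitch;
    }
}
```

Compared to the 32bpp routine, only the element type and the two color parameters differ. The loop structure is untouched.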
I suspect it might be fast enough that way already, because any firmware worth its salt is going to set the memory type for the framebuffer as write-combining. Thus it really doesn't matter much whether you access everything as 8-bit, 16-bit, or 32-bit units; the writes get combined into larger chunks (typically 64 bytes on x86) anyway.