Getting back into the swing of things over the festive period, I thought I'd have a dabble at writing my own rendering code, and then got a bit obsessed with the performance of VBOs (compiled sprites). I'll start off by posting some numbers.
=================================
glDrawElements(GL_TRIANGLES,...
30.7% - EAGLContext presentRenderbuffer
21.7% - SPDisplayObjectContainer
9.9% - SPQuad
5.8% - TiledMap
4.6% - SPImage
=================================
glDrawElements(GL_TRIANGLE_STRIP,...
33.6% - EAGLContext presentRenderbuffer
19.7% - SPDisplayObjectContainer
8.5% - SPQuad
5.3% - TiledMap
3.8% - SPImage
The GL_TRIANGLE_STRIP version used degenerate triangles to stitch the quads together: 6 indices per quad, with the first and last index of each quad duplicated, just like the compiled sprite does. I'm not sure whether I'm reading Instruments correctly, but the numbers would suggest that a triangle list is quicker to render than a strip that degenerates at every quad. Or does presentRenderbuffer block on vsync, so that a bigger number there is actually better?
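As a rough sketch of the two index layouts being compared (plain C, no GL calls): the function names, and the assumption that each quad's four vertices are stored consecutively in zig-zag strip order, are made up for illustration and are not taken from Sparrow's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Triangle list: 6 indices per quad, two independent triangles.
   Assumes quad q owns vertices 4q..4q+3 in zig-zag (strip) order.
   16-bit indices limit this to 16384 quads. */
static size_t build_list_indices(uint16_t *out, size_t quads) {
    assert(out != NULL);
    size_t n = 0;
    for (size_t q = 0; q < quads; ++q) {
        uint16_t b = (uint16_t)(q * 4);
        out[n++] = b;
        out[n++] = (uint16_t)(b + 1);
        out[n++] = (uint16_t)(b + 2);
        out[n++] = (uint16_t)(b + 2);
        out[n++] = (uint16_t)(b + 1);
        out[n++] = (uint16_t)(b + 3);
    }
    return n;
}

/* Degenerate strip: also 6 indices per quad, with the first and
   last index of every quad duplicated to stitch quads together. */
static size_t build_strip_indices(uint16_t *out, size_t quads) {
    assert(out != NULL);
    size_t n = 0;
    for (size_t q = 0; q < quads; ++q) {
        uint16_t b = (uint16_t)(q * 4);
        out[n++] = b;                  /* duplicated first index */
        out[n++] = b;
        out[n++] = (uint16_t)(b + 1);
        out[n++] = (uint16_t)(b + 2);
        out[n++] = (uint16_t)(b + 3);
        out[n++] = (uint16_t)(b + 3);  /* duplicated last index */
    }
    return n;
}

/* Walk a strip of n indices (n - 2 triangles) and return how many
   triangles use three distinct indices; the rest are counted in
   *degenerate. */
static size_t count_strip_triangles(const uint16_t *idx, size_t n,
                                    size_t *degenerate) {
    size_t real = 0, degen = 0;
    for (size_t i = 0; i + 2 < n; ++i) {
        uint16_t a = idx[i], b = idx[i + 1], c = idx[i + 2];
        if (a == b || b == c || a == c) {
            ++degen;
        } else {
            ++real;
        }
    }
    if (degenerate != NULL) {
        *degenerate = degen;
    }
    return real;
}
```

For Q quads both builders emit 6Q indices, so the index buffer is the same size either way. The difference is in what those indices describe: the list is exactly 2Q triangles, while the strip is 6Q - 2 triangles, of which 4Q - 2 are degenerate. Whether discarding those extra triangles accounts for the difference in the numbers above is something this sketch can't show.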
Does anyone actually make heavy use of compiled sprites and could test switching them over to triangle lists? It's only a 3-line change, but I'm interested in the results, unless someone has already profiled this. I'm guessing presentRenderbuffer does some immediate-mode work as well, since it seems like a lot just to kick off a render list, and (on a 4th-gen iPod at least) it's eating a third of the CPU cycles.
Anyway, I'd appreciate it if anyone could clarify or shed some light on the above, as I'm enjoying playing around with this at the moment. 🙂