Batches of hate

So I’ve been a busy little bee the last couple of days adding a couple of new features to Gorgon.

One of the things I’ve been wondering about is how I could possibly improve performance.  To understand, I should explain how Gorgon does its “thing”.  When you draw a sprite to the screen using the Draw() method, the actual data doesn’t go to the current render target (the screen, for our purposes) right away.  Instead, the vertices for that sprite are added to a dynamic vertex buffer.  If the next sprite you draw has the same texture and states as the previous one (which will most likely be the case if you perform batching properly), it just gets added to the dynamic buffer as well, and the process continues over and over until the end of the frame.  When the end of the frame is reached, the buffer is drawn to the screen and then ‘reset’, that is, all the data in it is overwritten by the next frame.

This is all well and good if you only use the same texture and render states (Smoothing, Blending, etc…).  But let’s say we have 3 sprites.  The first 2 sprites share a texture and the last uses a separate texture.  When the first two are drawn they get added to the vertex buffer.  When the 3rd sprite is drawn, the system detects a change in state (in this case it’s the texture), the buffer is flushed, and the process starts over with our third sprite.  As you can imagine, this can be very inefficient if the state changes often, but if you batch sensibly you’ll see excellent speeds.
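To make the flush-on-state-change idea a bit more concrete, here is a rough sketch in Python.  This is not Gorgon’s actual code: the Sprite and DynamicBatcher classes, their fields and the texture names are all made up for illustration, and the “buffer” is just a list.  It only simulates when a flush would happen for the 3 sprite example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sprite:
    # Hypothetical sprite: just a name plus the state that matters for batching.
    name: str
    texture: str
    smoothing: bool = False


class DynamicBatcher:
    """Illustrative stand-in for a renderer's dynamic vertex buffer (not Gorgon's API)."""

    def __init__(self):
        self.pending = []   # sprites currently sitting in the buffer
        self.state = None   # (texture, smoothing) shared by the pending sprites
        self.flushes = []   # one entry per simulated draw call

    def draw(self, sprite):
        state = (sprite.texture, sprite.smoothing)
        # A change in texture or render state forces the buffer out first.
        if self.pending and state != self.state:
            self.flush()
        self.state = state
        self.pending.append(sprite)

    def flush(self):
        if self.pending:
            self.flushes.append([s.name for s in self.pending])
            self.pending = []

    def end_frame(self):
        # Whatever is left in the buffer is drawn at the end of the frame.
        self.flush()


batcher = DynamicBatcher()
for sprite in [Sprite("a", "grass.png"), Sprite("b", "grass.png"), Sprite("c", "rock.png")]:
    batcher.draw(sprite)
batcher.end_frame()
print(batcher.flushes)  # [['a', 'b'], ['c']]
```

Two sprites sharing a texture go out in one flush and the third needs its own.  If you drew them in the order grass, rock, grass instead, the same sketch would flush three times, which is why draw order matters so much for this kind of batching.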

So I got to wondering… How can I use this batch methodology to display data even faster?  Dynamic vertex buffers are quick, but they’re snail-like compared to a static vertex buffer.  Of course, the caveat of the static buffer is that it’s just that:  static, and incredibly slow to fill over and over in a real-time situation.  However, I thought that if all we’re doing is displaying a group of sprites as a background or drawing lots of text, we wouldn’t need to make changes to the buffer at all, and we could take advantage of the static vertex buffer.

Thus, the Batch object was created.  Basically it batches your sprites/text sprites in such a way that the system can draw them a LOT faster.  More accurately, it acts like a snapshot of your sprites.  By calling the AddRenderable method you can attach any sprites that support batching (via the internal Renderable.GetVertices method) and it will store the retrieved vertex data in a static vertex buffer.  The system will even optimize the order of the vertices: it groups the vertices by texture, and the most commonly shared textures get rendered first.  Because it groups by texture, a single batch object is capable of storing and drawing multiple objects with differing textures (this incurs a small penalty, of course).  The setup is obviously not the fastest thing in the world, but once the buffers are built and running the results are impressive:
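Here’s a rough sketch of the grouping idea in Python.  Again, this is not the real Batch object: StaticBatch, add_renderable, refresh and the tie-breaking rule are assumptions made for illustration, and the vertex data is just a list of strings.  It shows vertices being snapshotted once, grouped by texture with the most common texture first, and then drawn as one range per texture.

```python
from collections import Counter


class StaticBatch:
    """Illustrative snapshot batch (hypothetical, not Gorgon's Batch class)."""

    def __init__(self):
        self.items = []    # (texture, vertices) in the order they were added
        self.buffer = []   # stand-in for the static vertex buffer
        self.ranges = []   # (texture, start, count): one draw call each

    def add_renderable(self, texture, vertices):
        # Snapshot the vertices now; later changes to the sprite are not seen.
        self.items.append((texture, list(vertices)))

    def refresh(self):
        # Rebuild the whole buffer (the slow part, done once up front).
        counts = Counter(texture for texture, _ in self.items)
        first_seen = {}
        for index, (texture, _) in enumerate(self.items):
            first_seen.setdefault(texture, index)
        # Most commonly used texture first; ties keep insertion order (an assumption).
        order = sorted(counts, key=lambda t: (-counts[t], first_seen[t]))

        self.buffer, self.ranges = [], []
        for texture in order:
            start = len(self.buffer)
            for item_texture, vertices in self.items:
                if item_texture == texture:
                    self.buffer.extend(vertices)
            self.ranges.append((texture, start, len(self.buffer) - start))

    def draw(self):
        # One simulated draw call per texture group, no per-sprite work.
        return [texture for texture, _, _ in self.ranges]


batch = StaticBatch()
batch.add_renderable("rock.png", ["r0"])
batch.add_renderable("grass.png", ["g0"])
batch.add_renderable("grass.png", ["g1"])
batch.refresh()
print(batch.ranges)  # [('grass.png', 0, 2), ('rock.png', 2, 1)]
```

The extra draw call per texture is the “small penalty” mentioned above; everything else is paid for once in refresh rather than on every frame.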

Left: The immediate mode (Dynamic Vertex Buffer) Draw function from TextSprite.
Right: The batched mode (Static Vertex Buffer) which uses the exact same TextSprite but draws it via the Batch object.

Notice the HUGE speed difference.  Naturally nothing comes for free and there are caveats to using this thing.  Namely, your batched sprites can’t be transformed individually after they’ve been drawn (unless you refresh the batch, and that is too slow for real-time framerates).  Any transformations affect the whole batch as a single entity, and the same goes for states like smoothing or blending.  So you’re wondering what use this is?  Well, as you can see, if you have a lot of text you can draw it very quickly.  Or, if you have a tile map where the tiles are static, you can basically blit (and transform) the entire map at very little cost.

Edit
I’ve posted a video on YouTube showing this stupid thing in action with over 100,000 sprites and a large block of text:

https://www.youtube.com/watch?v=-PRiXEZjiVs