Monday, October 21, 2013

Particle Systems using Stream-Out in DirectX 11 and SlimDX

Particle systems are a technique commonly used to simulate chaotic phenomena that are not easy to render using normal polygons.  Some common examples include fire, smoke, rain, snow, and sparks.  The particle system implementation that we are going to develop will be general enough to support many different effects.  We will be using the GPU’s StreamOut stage to update our particle systems, which means that all of the physics calculations and logic to update the particles will reside in our shader code.  By substituting different shaders, we can achieve different effects using the same base particle system implementation.

The code for this example was adapted from Chapter 20 of Frank Luna’s Introduction to 3D Game Programming with Direct3D 11.0, ported to C# and SlimDX.  The full source for the example can be found at my GitHub repository, under the ParticlesDemo project.

Below, you can see the results of adding two particle systems to our terrain demo.  At the center of the screen, we have a flame particle effect, along with a rain particle effect.


The Particle Vertex Structure

We will represent our particles as simple points.  In our geometry shader, we will then expand these points out into billboarded quads or lines, similar to the effect we used for billboarded trees in our BillBoard Demo.  In addition to an initial position vector, we will specify an initial velocity for the particle.  Based on these variables, the age of the particle, and a constant acceleration that we will define for each particle system, we can then derive the final position of the particle when we render it.  The particle systems that we will implement here will use two distinct classes of particles, which will be indicated by the Type member of the Particle structure.  Emitter particles (Type=0) will not actually be drawn, but will instead spawn flare particles (Type=1) periodically.  These flare particles are the particles that will actually be drawn to simulate the particle effect.  If we desired, we could create more types of particles to simulate more complicated effects.

public struct Particle {
    public Vector3 InitialPos;
    public Vector3 InitialVel;
    public Vector2 Size;
    public float Age;
    public uint Type;

    public static readonly int Stride = Marshal.SizeOf(typeof (Particle));
}
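Because the stride and the field order of this structure have to agree with the shader inputs, it can be worth checking the byte layout directly. The following is only a minimal sketch: Vec3, Vec2, ParticleSketch and ParticleLayoutCheck are hypothetical stand-ins so that the code compiles without SlimDX, and are not part of the demo's source.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-ins for SlimDX's Vector3/Vector2 (three and two floats).
[StructLayout(LayoutKind.Sequential)]
public struct Vec3 { public float X, Y, Z; }

[StructLayout(LayoutKind.Sequential)]
public struct Vec2 { public float X, Y; }

// Same field order as the Particle structure above.
[StructLayout(LayoutKind.Sequential)]
public struct ParticleSketch {
    public Vec3 InitialPos;
    public Vec3 InitialVel;
    public Vec2 Size;
    public float Age;
    public uint Type;
}

public static class ParticleLayoutCheck {
    // Byte offset of a named field within ParticleSketch.
    public static int OffsetOf(string field) {
        return (int)Marshal.OffsetOf(typeof(ParticleSketch), field);
    }

    // Size in bytes of one particle vertex.
    public static int Stride {
        get { return Marshal.SizeOf(typeof(ParticleSketch)); }
    }

    public static void Main() {
        foreach (var f in new[] { "InitialPos", "InitialVel", "Size", "Age", "Type" }) {
            Console.WriteLine(f + " at byte " + OffsetOf(f));
        }
        Console.WriteLine("Stride = " + Stride);
    }
}
```

With 4-byte floats and a 4-byte uint there is no padding, so the fields should land at bytes 0, 12, 24, 32 and 36, for a stride of 40 bytes.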

As with our other vertex structures, we will need to define corresponding InputElement[] and InputLayout objects to bind the C# vertex structures to our shader inputs. As always, these objects will be added to our InputLayoutDescriptions and InputLayouts static classes.

// InputLayoutDescriptions.cs
public static readonly InputElement[] Particle = {
    new InputElement("POSITION", 0, Format.R32G32B32_Float, 0, 0, InputClassification.PerVertexData, 0),
    new InputElement("VELOCITY", 0, Format.R32G32B32_Float, InputElement.AppendAligned, 0, InputClassification.PerVertexData, 0),
    new InputElement("SIZE", 0, Format.R32G32_Float, InputElement.AppendAligned, 0, InputClassification.PerVertexData, 0),
    new InputElement("AGE", 0, Format.R32_Float, InputElement.AppendAligned, 0, InputClassification.PerVertexData, 0),
    new InputElement("TYPE", 0, Format.R32_UInt, InputElement.AppendAligned, 0, InputClassification.PerVertexData, 0),
};

// InputLayouts.cs::InitAll()
try {
    var tech = Effects.FireFX;
    if (tech != null) {
        var passDesc = tech.StreamOutTech.GetPassByIndex(0).Description;
        Particle = new InputLayout(device, passDesc.Signature, InputLayoutDescriptions.Particle);
    }
} catch (Exception ex) {
    Console.WriteLine(ex.Message + ex.StackTrace);
    Particle = null;
}

The ParticleSystem Class

Our ParticleSystem class encapsulates the data necessary to draw and manage a single instance of a particle effect.  Contained within the ParticleSystem class is a reference to the ParticleEffect shader wrapper that will be used to update and render the particle system, along with the vertex buffers and values to control the shader effect.  This provides us a very simple interface to update and draw the particle system, as we will soon see.  Once again, because we are managing SlimDX buffers, we will subclass our DisposableClass base class and provide an appropriate Dispose() method to clean up these buffers.

public class ParticleSystem : DisposableClass {
    private bool _disposed;

    // Maximum number of particles that can be created
    private int _maxParticles;
    // on the first run, we need to use a different vertex buffer to initialize the system
    private bool _firstRun;

    // used as a seed to index into the random-value texture
    private float _gameTime;
    // The time since the last update of the system
    private float _timeStep;
    // How long the system has existed
    public float Age { get; private set; }

    // The camera eye position. Passed to the shader to align the billboarded lines/quads
    public Vector3 EyePosW { get; set; }
    // Used to set the position in world-space of the particle emitter
    public Vector3 EmitPosW { get; set; }
    // Used to set the initial direction of emitted particles, if the direction varies
    public Vector3 EmitDirW { get; set; }

    // The particles effect shader for this system
    private ParticleEffect _fx;

    // A vertex buffer containing the original emitter particles
    private Buffer _initVB;
    // vertex buffer to hold the particles to be drawn
    private Buffer _drawVB;
    // vertex buffer to receive the particles generated by the stream-out shader
    private Buffer _streamOutVB;

    // a texture array to contain the sprites to be applied to the drawn particles
    private ShaderResourceView _texArraySRV;
    // a texture containing random floats, used to supply the shader with random values
    private ShaderResourceView _randomTexSRV;

    public ParticleSystem() {
        _firstRun = true;
        EmitDirW = new Vector3(0, 1, 0);
    }

    protected override void Dispose(bool disposing) {
        if (!_disposed) {
            if (disposing) {
                Util.ReleaseCom(ref _initVB);
                Util.ReleaseCom(ref _drawVB);
                Util.ReleaseCom(ref _streamOutVB);
            }
            _disposed = true;
        }
        base.Dispose(disposing);
    }
}

After we have created a new ParticleSystem, we need to initialize it: we assign the ParticleEffect and the shader texture resources, and create the vertex buffers for the particle system. The maxParticles parameter here specifies an upper bound on the number of particles that can be live at one time in the system; this value is used to allocate space in the draw and stream-out vertex buffers, so it is important to be sure that the particle effect shader does not create more particles than this limit.

public void Init(Device device, ParticleEffect fx, ShaderResourceView texArraySRV, ShaderResourceView randomTexSRV, int maxParticles) {
    _maxParticles = maxParticles;
    _fx = fx;
    _texArraySRV = texArraySRV;
    _randomTexSRV = randomTexSRV;

    // create the vertex buffers for the particle system
    BuildVB(device);
}
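To pick a safe value for maxParticles, one option is to estimate how many particles a shader can have alive at once. The sketch below is an assumption-laden estimate rather than anything from the demo's source: ParticleBudget and EstimateMaxParticles are hypothetical names, and the bound assumes each emitter spawns at most one particle per emit interval and that every spawned particle is dropped once it exceeds a fixed lifetime.

```csharp
using System;

public static class ParticleBudget {
    // Conservative upper bound on live particles for a system whose emitters each
    // spawn at most one particle every emitInterval seconds, with each spawned
    // particle living for at most lifetime seconds.
    public static int EstimateMaxParticles(int emitters, double emitInterval, double lifetime) {
        if (emitters < 0) throw new ArgumentOutOfRangeException("emitters");
        if (emitInterval <= 0) throw new ArgumentOutOfRangeException("emitInterval");
        if (lifetime < 0) throw new ArgumentOutOfRangeException("lifetime");

        // Particles spawned within one lifetime window, plus one for the window edge.
        int livePerEmitter = (int)Math.Ceiling(lifetime / emitInterval) + 1;
        // Each emitter also occupies a slot in the buffer itself.
        return emitters * (livePerEmitter + 1);
    }

    public static void Main() {
        // Values taken from the Fire.fx stream-out shader later in this article:
        // at most one flare every 0.005 s, each living for at most 1 s.
        Console.WriteLine(EstimateMaxParticles(1, 0.005, 1.0));
    }
}
```

In practice the emit rate is also capped by the frame rate, since the stream-out shader runs once per frame, so the real particle count is usually well below this bound.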


Creating the vertex buffers for the ParticleSystem is relatively straightforward. The _initVB vertex buffer is created with a single emitter particle. The _drawVB and _streamOutVB buffers are not initialized with any data; rather we simply allocate space for up to _maxParticles particle vertices. These buffers will be populated by our shader effect’s stream-out technique. Note that we specify both BindFlags.VertexBuffer and BindFlags.StreamOutput for the _drawVB and _streamOutVB buffers; as you will see shortly when we show the Draw() method, these buffers will be ping-ponged between being input and output to the stream-out technique, so we need to create them with both bind flags.

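private void BuildVB(Device device) {
    // room for a single emitter particle; default usage so that the same
    // description can be reused for the stream-out buffers below
    var vbd = new BufferDescription(
        Particle.Stride,
        ResourceUsage.Default,
        BindFlags.VertexBuffer,
        CpuAccessFlags.None,
        ResourceOptionFlags.None,
        0);

    var p = new Particle {
        Age = 0,
        Type = 0
    };

    _initVB = new Buffer(device, new DataStream(new[]{p}, true, true), vbd);

    vbd.SizeInBytes = Particle.Stride*_maxParticles;
    vbd.BindFlags = BindFlags.VertexBuffer | BindFlags.StreamOutput;

    _drawVB = new Buffer(device, vbd);
    _streamOutVB = new Buffer(device, vbd);
}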

We will provide a method to reset the particle system. This method resets the Age member of the particle system and sets the _firstRun flag, which will force the particle system to be drawn using the _initVB buffer on the next Draw() call, effectively resetting the particle system to its initial emitter particle.

Updating the particle system simply sets the _timeStep variable and advances the Age counter.

public void Reset() {
    _firstRun = true;
    Age = 0;
}

public void Update(float dt, float gameTime) {
    _gameTime = gameTime;
    _timeStep = dt;

    Age += dt;
}

Drawing the ParticleSystem

To draw the ParticleSystem, we will use the following process:

  1. Bind the appropriate shader variables to our ParticleEffect shader.
  2. Bind the input vertex buffer and stream-out buffer for our stream-out technique. 
  3. Draw the particles using the stream-out technique.  This technique will only update the particles, creating new particles from the emitters and killing particles that are older than the maximum age.  The resulting particles are output to the stream-out buffer.
  4. Next, we disable the StreamOut stage of the GPU, in preparation for actually rendering the particles.  We then swap (or ping-pong) the _drawVB and _streamOutVB, so that we will be rendering the updated particles created by the stream-out technique, and so that on the next draw call, we will have the correct input particles for the stream-out technique.
  5. Next, we draw the particles using the particle effect’s DrawTech technique.  Because this buffer was populated by the StreamOut stage, we don’t know exactly how many vertices it contains; however, Direct3D maintains this count, so we can use the DrawAuto method to draw the entire contents of the vertex buffer.
public void Draw(DeviceContext dc, CameraBase camera) {
    var vp = camera.ViewProj;

    // set shader variables
    _fx.SetViewProj(vp);
    _fx.SetGameTime(_gameTime);
    _fx.SetTimeStep(_timeStep);
    _fx.SetEyePosW(EyePosW);
    _fx.SetEmitPosW(EmitPosW);
    _fx.SetEmitDirW(EmitDirW);
    _fx.SetTexArray(_texArraySRV);
    _fx.SetRandomTex(_randomTexSRV);

    dc.InputAssembler.InputLayout = InputLayouts.Particle;
    dc.InputAssembler.PrimitiveTopology = PrimitiveTopology.PointList;

    var stride = Particle.Stride;
    const int offset = 0;

    // bind the input vertex buffer for the stream-out technique
    // use the _initVB when _firstRun = true
    dc.InputAssembler.SetVertexBuffers(0, new VertexBufferBinding(_firstRun ? _initVB : _drawVB, stride, offset));
    // bind the stream-out vertex buffer
    dc.StreamOutput.SetTargets(new StreamOutputBufferBinding(_streamOutVB, offset));

    // draw the particles using the stream-out technique, which will update the particles positions
    // and output the resulting particles to the stream-out buffer
    var techDesc = _fx.StreamOutTech.Description;
    for (int p = 0; p < techDesc.PassCount; p++) {
        _fx.StreamOutTech.GetPassByIndex(p).Apply(dc);
        if (_firstRun) {
            dc.Draw(1, 0);
            _firstRun = false;
        } else {
            // the _drawVB buffer was populated by the Stream-out technique, so we don't
            // know how many vertices are contained within it. Direct3D keeps track of this
            // internally, however, and we can use DrawAuto to draw everything in the buffer.
            dc.DrawAuto();
        }
    }
    // Disable stream-out
    dc.StreamOutput.SetTargets(null);

    // ping-pong the stream-out and draw buffers, since we will now want to draw the vertices
    // populated into the buffer that was bound to stream-out
    var temp = _drawVB;
    _drawVB = _streamOutVB;
    _streamOutVB = temp;

    // draw the particles using the draw technique that will transform the points to lines/quads
    dc.InputAssembler.SetVertexBuffers(0, new VertexBufferBinding(_drawVB, stride, offset));
    techDesc = _fx.DrawTech.Description;
    for (var p = 0; p < techDesc.PassCount; p++) {
        _fx.DrawTech.GetPassByIndex(p).Apply(dc);
        dc.DrawAuto();
    }
}
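The ping-pong pattern itself is independent of Direct3D, and can be illustrated on the CPU. In this rough sketch, a List<T> stands in for each vertex buffer and a delegate stands in for the stream-out technique; PingPongBuffers and PingPongDemo are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// CPU-side illustration of ping-ponging two buffers; List<T> stands in for a vertex buffer.
public class PingPongBuffers<T> {
    // Holds the current elements: input to the next update and source for drawing.
    public List<T> Draw { get; private set; }
    // Receives the output of the update pass.
    public List<T> StreamOut { get; private set; }

    public PingPongBuffers(IEnumerable<T> initial) {
        Draw = new List<T>(initial);
        StreamOut = new List<T>();
    }

    // Runs one update pass (each input may yield zero or more outputs), then swaps the buffers.
    public void Update(Func<T, IEnumerable<T>> step) {
        StreamOut.Clear();
        foreach (var item in Draw) {
            StreamOut.AddRange(step(item));
        }
        var temp = Draw;
        Draw = StreamOut;
        StreamOut = temp;
    }
}

public static class PingPongDemo {
    public static void Main() {
        var buffers = new PingPongBuffers<int>(new[] { 1 });
        // Each element survives and spawns one new element per update.
        for (int frame = 0; frame < 3; frame++) {
            buffers.Update(x => new[] { x, x + 1 });
            Console.WriteLine("frame " + frame + ": " + buffers.Draw.Count + " elements");
        }
    }
}
```

After each update, the buffer that was just written becomes the one that is read, which is the same role swap that the Draw() method performs with _drawVB and _streamOutVB.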

ParticleEffect Class

All of our particle effect shaders will follow a common interface, so that we can use a single C# wrapper class for all of them.  There is nothing particularly novel about this wrapper class; it follows the same conventions we have used for our other shader effect wrapper classes, so I will present the code here without elaboration.

public class ParticleEffect : Effect {
    public readonly EffectTechnique StreamOutTech;
    public readonly EffectTechnique DrawTech;

    private readonly EffectMatrixVariable _viewProj;
    private readonly EffectScalarVariable _timeStep;
    private readonly EffectScalarVariable _gameTime;
    private readonly EffectVectorVariable _eyePosW;
    private readonly EffectVectorVariable _emitPosW;
    private readonly EffectVectorVariable _emitDirW;
    private readonly EffectResourceVariable _texArray;
    private readonly EffectResourceVariable _randomTex;

    public ParticleEffect(Device device, string filename) : base(device, filename) {
        StreamOutTech = FX.GetTechniqueByName("StreamOutTech");
        DrawTech = FX.GetTechniqueByName("DrawTech");

        _viewProj = FX.GetVariableByName("gViewProj").AsMatrix();
        _gameTime = FX.GetVariableByName("gGameTime").AsScalar();
        _timeStep = FX.GetVariableByName("gTimeStep").AsScalar();

        _eyePosW = FX.GetVariableByName("gEyePosW").AsVector();
        _emitPosW = FX.GetVariableByName("gEmitPosW").AsVector();
        _emitDirW = FX.GetVariableByName("gEmitDirW").AsVector();

        _texArray = FX.GetVariableByName("gTexArray").AsResource();
        _randomTex = FX.GetVariableByName("gRandomTex").AsResource();
    }

    public void SetViewProj(Matrix m) { _viewProj.SetMatrix(m); }
    public void SetGameTime(float f) { _gameTime.Set(f); }
    public void SetTimeStep(float f) { _timeStep.Set(f); }
    public void SetEyePosW(Vector3 v) { _eyePosW.Set(v); }
    public void SetEmitPosW(Vector3 v) { _emitPosW.Set(v); }
    public void SetEmitDirW(Vector3 v) { _emitDirW.Set(v); }
    public void SetTexArray(ShaderResourceView tex) { _texArray.SetResource(tex); }
    public void SetRandomTex(ShaderResourceView tex) { _randomTex.SetResource(tex); }
}


We will be creating two different particle effects, one to simulate fire, and one to simulate rain, so we will need to add two ParticleEffect instances to our static Effects class, with the appropriate shader files.

// Effects.cs::InitAll()
try {
    FireFX = new ParticleEffect(device, "FX/Fire.fxo");
} catch (Exception ex) {
    Console.WriteLine(ex.Message);
}
try {
    RainFX = new ParticleEffect(device, "FX/Rain.fxo");
} catch (Exception ex) {
    Console.WriteLine(ex.Message);
}

Particle System Shaders

All of the shader effects that we create using this particle system implementation will follow a common template, although the logic for spawning, updating and rendering the particles may be different.  Firstly, all of these shaders will have the shader variables that we have referenced in our ParticleEffect wrapper class:

cbuffer cbPerFrame
{
    float3 gEyePosW;

    // for when the emit position/direction is varying
    float3 gEmitPosW;
    float3 gEmitDirW;

    float gGameTime;
    float gTimeStep;
    float4x4 gViewProj;
};

// Array of textures for texturing the particles.
Texture2DArray gTexArray;

// Random texture used to generate random numbers in shaders.
Texture1D gRandomTex;

Next, we will generally have a constant buffer that contains effect-specific constants. Usually, we will at least have a vector specifying the constant acceleration used in the particle physics update calculations.  This will vary, depending on the particular particle system.  For effects that transform particles into billboard quads, we will also have a float2 array specifying the texture coordinates for the generated quad vertices.

cbuffer cbFixed
{
    // Net constant acceleration used to accelerate the particles.
    float3 gAccelW = {0.0f, 7.8f, 0.0f};

    // Texture coordinates used to stretch texture over quad
    // when we expand point particle into a quad.
    float2 gQuadTexC[4] =
    {
        float2(0.0f, 1.0f),
        float2(1.0f, 1.0f),
        float2(0.0f, 0.0f),
        float2(1.0f, 0.0f)
    };
};

Next, we will have some common sampler, depth/stencil and blend states. The samLinear sampler is a simple linear texture sampler, similar to those we have used previously; we will use this to sample both the random data texture and the diffuse texture for the particles. The DisableDepth DepthStencilState is used to disable writing to the depth/stencil buffer during our stream-out technique. The NoDepthWrites DepthStencilState is likewise used to prevent writing to the depth buffer when we are rendering particles with our draw technique. Lastly, most of our particle effects will be drawn using some type of alpha-blending; the AdditiveBlending state below is used in our Fire.fx shader to accumulate color where the particles are densest. Depending on the particle effect desired, other types of blending may be more appropriate.

SamplerState samLinear
{
    Filter = MIN_MAG_MIP_LINEAR;
    AddressU = WRAP;
    AddressV = WRAP;
};

DepthStencilState DisableDepth
{
    DepthEnable = FALSE;
    DepthWriteMask = ZERO;
};

DepthStencilState NoDepthWrites
{
    DepthEnable = TRUE;
    DepthWriteMask = ZERO;
};

BlendState AdditiveBlending
{
    AlphaToCoverageEnable = FALSE;
    BlendEnable[0] = TRUE;
    SrcBlend = SRC_ALPHA;
    DestBlend = ONE;
    BlendOp = ADD;
    SrcBlendAlpha = ZERO;
    DestBlendAlpha = ZERO;
    BlendOpAlpha = ADD;
    RenderTargetWriteMask[0] = 0x0F;
};
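To see why this blend state accumulates color where particles overlap, it may help to write out what SrcBlend = SRC_ALPHA, DestBlend = ONE and BlendOp = ADD compute for a single color channel. This is only a CPU-side sketch of that arithmetic; AdditiveBlendSketch is a hypothetical name, and the sketch ignores any clamping that the render target format may apply.

```csharp
using System;

public static class AdditiveBlendSketch {
    // One color channel of: result = src * srcAlpha + dest * 1
    public static float Blend(float src, float srcAlpha, float dest) {
        return src * srcAlpha + dest;
    }

    public static void Main() {
        float dest = 0.1f; // value already in the render target
        // Three overlapping particles, each contributing 0.5 at 50% opacity.
        for (int i = 0; i < 3; i++) {
            dest = Blend(0.5f, 0.5f, dest);
            Console.WriteLine(dest);
        }
    }
}
```

Each overlapping particle can only brighten the destination, never darken it, which suits glowing effects like fire; the order in which the particles are drawn also does not matter.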

We will also define some common functions for sampling the random texture and generating normalized and non-normalized vectors.

float3 RandUnitVec3(float offset)
{
    // Use game time plus offset to sample random texture.
    float u = (gGameTime + offset);

    // coordinates in [-1,1]
    float3 v = gRandomTex.SampleLevel(samLinear, u, 0).xyz;

    // project onto unit sphere
    return normalize(v);
}

float3 RandVec3(float offset)
{
    // Use game time plus offset to sample random texture.
    float u = (gGameTime + offset);

    // coordinates in [-1,1]
    float3 v = gRandomTex.SampleLevel(samLinear, u, 0).xyz;

    return v;
}
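Note that the texture coordinate here is simply the game time plus an offset, which will quickly exceed 1; this works because samLinear uses WRAP addressing, so the coordinate wraps around the texture. The sketch below emulates that lookup on the CPU for a single-channel texture. It is an approximation under stated assumptions (texel centers at (i + 0.5)/N and plain linear interpolation), not a bit-exact model of the GPU, and RandomTexture1D and SampleWrapLinear are hypothetical names.

```csharp
using System;

public static class RandomTexture1D {
    // Emulates sampling a 1D texture with linear filtering and WRAP addressing.
    public static float SampleWrapLinear(float[] texels, float u) {
        int n = texels.Length;
        // Texel centers sit at (i + 0.5) / n, so shift by half a texel.
        double x = u * (double)n - 0.5;
        double floor = Math.Floor(x);
        float frac = (float)(x - floor);
        int i0 = Mod((long)floor, n);
        int i1 = Mod((long)floor + 1, n);
        return texels[i0] * (1.0f - frac) + texels[i1] * frac;
    }

    // Modulo that always returns a value in [0, n), even for negative i.
    private static int Mod(long i, int n) {
        return (int)(((i % n) + n) % n);
    }

    public static void Main() {
        var texels = new[] { 0.0f, 1.0f, 2.0f, 3.0f };
        // 0.125 is the center of texel 0; 1.125 wraps around to the same texel.
        Console.WriteLine(SampleWrapLinear(texels, 0.125f));
        Console.WriteLine(SampleWrapLinear(texels, 1.125f));
    }
}
```

Because the coordinate advances with game time, successive frames read different texels, which is what makes the emitted particles look random from frame to frame.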

Finally, we define the HLSL counterpart for our Particle vertex structure, as well as some particle type constants.

#define PT_EMITTER 0
#define PT_FLARE 1

struct Particle
{
    float3 InitialPosW : POSITION;
    float3 InitialVelW : VELOCITY;
    float2 SizeW : SIZE;
    float Age : AGE;
    uint Type : TYPE;
};


Our Fire.fx shader generates a fireball-like effect.  The particles emitted will accelerate upwards from their starting position, and we will fade the opacity of each particle as it ages.  First, we will define our stream-out technique, which we will use to emit and update our particles.  Note that we need to use the special function ConstructGSWithSO() in order to create a geometry shader for stream-out.  The first parameter to this function is the geometry shader object created using the normal CompileShader() call, while the second is a string describing the semantics and format of the streamed-out vertices.  Note that to use stream-out only (i.e., not render to the backbuffer), we need to both set the pixel shader to null and disable the depth buffer, using the DisableDepth depth/stencil state.

GeometryShader gsStreamOut = ConstructGSWithSO(
    CompileShader( gs_4_0, StreamOutGS() ),
    "POSITION.xyz; VELOCITY.xyz; SIZE.xy; AGE.x; TYPE.x" );

technique11 StreamOutTech
{
    pass P0
    {
        SetVertexShader( CompileShader( vs_4_0, StreamOutVS() ) );
        SetGeometryShader( gsStreamOut );

        // disable pixel shader for stream-out only
        SetPixelShader( NULL );

        // we must also disable the depth buffer for stream-out only
        SetDepthStencilState( DisableDepth, 0 );
    }
}

The vertex shader for the fire stream-out technique is just a simple pass-through shader. The geometry shader takes as input a particle, and will output zero, one or two particles. First, the age of the particle is advanced by the gTimeStep variable, which, if you recall, should be our application frame-time. If the particle is not an emitter, we check that its age is less than 1 second; if the particle has expired, we drop it and do not output the particle. If the particle is an emitter, we check to see if we should emit a new flare particle, which we initialize with a random initial velocity. We always output the emitters, as otherwise the particle effect would eventually die, once all the created flare particles have expired.

Particle StreamOutVS(Particle vin)
{
    return vin;
}

// The stream-out GS is just responsible for emitting
// new particles and destroying old particles. The logic
// programmed here will generally vary from particle system
// to particle system, as the destroy/spawn rules will be
// different.
[maxvertexcount(2)]
void StreamOutGS(point Particle gin[1],
                 inout PointStream<Particle> ptStream)
{
    gin[0].Age += gTimeStep;

    if( gin[0].Type == PT_EMITTER )
    {
        // time to emit a new particle?
        if( gin[0].Age > 0.005f )
        {
            float3 vRandom = RandUnitVec3(0.0f);
            vRandom.x *= 0.5f;
            vRandom.z *= 0.5f;

            Particle p;
            p.InitialPosW = gEmitPosW.xyz;
            p.InitialVelW = 4.0f*vRandom;
            p.SizeW = float2(3.0f, 3.0f);
            p.Age = 0.0f;
            p.Type = PT_FLARE;

            ptStream.Append(p);

            // reset the time to emit
            gin[0].Age = 0.0f;
        }

        // always keep emitters
        ptStream.Append(gin[0]);
    }
    else
    {
        // Specify conditions to keep particle; this may vary from system to system.
        if( gin[0].Age <= 1.0f )
            ptStream.Append(gin[0]);
    }
}
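The spawn and destroy rules of this geometry shader can also be expressed on the CPU, which makes them easier to step through in a debugger. The following is a minimal sketch of the same rules using the constants from the shader (0.005 s emit interval, 1 s flare lifetime); SimParticle, ParticleType and FireStreamOut are hypothetical names, and position, velocity and size are left out for brevity.

```csharp
using System;
using System.Collections.Generic;

public enum ParticleType : uint { Emitter = 0, Flare = 1 }

public struct SimParticle {
    public float Age;
    public ParticleType Type;
}

public static class FireStreamOut {
    public const float EmitInterval = 0.005f; // seconds, as in the shader
    public const float MaxFlareAge = 1.0f;    // seconds, as in the shader

    // Mirrors the geometry shader: ages one input particle and returns the
    // zero, one or two particles that would be appended to the output stream.
    public static List<SimParticle> Step(SimParticle input, float timeStep) {
        var output = new List<SimParticle>(2);
        input.Age += timeStep;

        if (input.Type == ParticleType.Emitter) {
            if (input.Age > EmitInterval) {
                // time to emit a new flare
                output.Add(new SimParticle { Age = 0.0f, Type = ParticleType.Flare });
                input.Age = 0.0f; // reset the time to emit
            }
            output.Add(input); // always keep emitters
        } else if (input.Age <= MaxFlareAge) {
            output.Add(input); // keep flares that have not expired
        }
        return output;
    }

    public static void Main() {
        var emitter = new SimParticle { Age = 0.0f, Type = ParticleType.Emitter };
        // A 16 ms frame exceeds the emit interval, so this spawns one flare.
        Console.WriteLine(Step(emitter, 0.016f).Count + " particle(s) output");
    }
}
```

Since each call produces at most one new flare per emitter, the emit rate can never exceed one flare per frame, whatever the emit interval is.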

Our Fire.fx DrawTech renders the particles generated by the StreamOutTech, calculating the final particle positions, transforming the points into camera-oriented quads, and texturing the resulting quads with the fireball texture. Note that we need to set the AdditiveBlending blend state and the NoDepthWrites depth/stencil state for this effect.

technique11 DrawTech
{
    pass P0
    {
        SetVertexShader( CompileShader( vs_4_0, DrawVS() ) );
        SetGeometryShader( CompileShader( gs_4_0, DrawGS() ) );
        SetPixelShader( CompileShader( ps_4_0, DrawPS() ) );

        SetBlendState(AdditiveBlending, float4(0.0f, 0.0f, 0.0f, 0.0f), 0xffffffff);
        SetDepthStencilState( NoDepthWrites, 0 );
    }
}

The DrawTech vertex shader calculates the position of the particle as a function of the particle’s age, using simple physics. We also calculate an opacity factor to fade the particle out as it ages.

struct VertexOut
{
    float3 PosW : POSITION;
    float2 SizeW : SIZE;
    float4 Color : COLOR;
    uint Type : TYPE;
};

VertexOut DrawVS(Particle vin)
{
    VertexOut vout;

    float t = vin.Age;

    // constant acceleration equation
    vout.PosW = 0.5f*t*t*gAccelW + t*vin.InitialVelW + vin.InitialPosW;

    // fade color with time
    float opacity = 1.0f - smoothstep(0.0f, 1.0f, t/1.0f);
    vout.Color = float4(1.0f, 1.0f, 1.0f, opacity);

    vout.SizeW = vin.SizeW;
    vout.Type = vin.Type;

    return vout;
}
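The two calculations in this vertex shader are easy to check on the CPU. Here is a small sketch, for one axis, of the constant-acceleration equation and the smoothstep-based fade; ParticleMotion and its members are hypothetical names, and SmoothStep is a hand-written equivalent of the HLSL intrinsic rather than a library call.

```csharp
using System;

public static class ParticleMotion {
    // Position along one axis under constant acceleration:
    // p(t) = p0 + v0*t + 0.5*a*t^2
    public static float PositionAt(float p0, float v0, float a, float t) {
        return 0.5f * t * t * a + t * v0 + p0;
    }

    // CPU equivalent of HLSL smoothstep(min, max, x).
    public static float SmoothStep(float min, float max, float x) {
        float s = Math.Max(0.0f, Math.Min(1.0f, (x - min) / (max - min)));
        return s * s * (3.0f - 2.0f * s);
    }

    // Opacity as computed by the vertex shader: opaque at t = 0, transparent for t >= 1.
    public static float Opacity(float t) {
        return 1.0f - SmoothStep(0.0f, 1.0f, t / 1.0f);
    }

    public static void Main() {
        // Height of a flare launched straight up at 4 units/s with gAccelW.y = 7.8.
        for (float t = 0.0f; t <= 1.0f; t += 0.25f) {
            Console.WriteLine("t=" + t + " y=" + PositionAt(0.0f, 4.0f, 7.8f, t)
                + " opacity=" + Opacity(t));
        }
    }
}
```

Because the position is a pure function of the initial state and the age, the stream-out pass never has to rewrite positions; it only needs to advance Age.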

The DrawTech geometry shader expands the non-emitter particles into camera-facing quads, outputting the resulting vertices as a triangle strip. The DrawTech pixel shader then samples the effect diffuse texture, multiplying the sampled color by the opacity value computed by the vertex shader.

struct GeoOut
{
    float4 PosH : SV_Position;
    float4 Color : COLOR;
    float2 Tex : TEXCOORD;
};

// The draw GS just expands points into camera facing quads.
[maxvertexcount(4)]
void DrawGS(point VertexOut gin[1],
            inout TriangleStream<GeoOut> triStream)
{
    // do not draw emitter particles.
    if( gin[0].Type != PT_EMITTER )
    {
        // Compute world matrix so that billboard faces the camera.
        float3 look = normalize(gEyePosW.xyz - gin[0].PosW);
        float3 right = normalize(cross(float3(0,1,0), look));
        float3 up = cross(look, right);

        // Compute triangle strip vertices (quad) in world space.
        float halfWidth = 0.5f*gin[0].SizeW.x;
        float halfHeight = 0.5f*gin[0].SizeW.y;

        float4 v[4];
        v[0] = float4(gin[0].PosW + halfWidth*right - halfHeight*up, 1.0f);
        v[1] = float4(gin[0].PosW + halfWidth*right + halfHeight*up, 1.0f);
        v[2] = float4(gin[0].PosW - halfWidth*right - halfHeight*up, 1.0f);
        v[3] = float4(gin[0].PosW - halfWidth*right + halfHeight*up, 1.0f);

        // Transform quad vertices to homogeneous clip space and output
        // them as a triangle strip.
        GeoOut gout;
        for(int i = 0; i < 4; ++i)
        {
            gout.PosH = mul(v[i], gViewProj);
            gout.Tex = gQuadTexC[i];
            gout.Color = gin[0].Color;
            triStream.Append(gout);
        }
    }
}

float4 DrawPS(GeoOut pin) : SV_TARGET
{
    return gTexArray.Sample(samLinear, float3(pin.Tex, 0))*pin.Color;
}
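The billboard construction in the geometry shader can be reproduced on the CPU to see where the four corners end up. This sketch uses System.Numerics.Vector3 in place of the HLSL float3 type; Billboard and Corners are hypothetical names, not part of the demo.

```csharp
using System;
using System.Numerics;

public static class Billboard {
    // Computes the four world-space corners of a camera-facing quad,
    // in the same order the geometry shader emits them.
    public static Vector3[] Corners(Vector3 center, Vector3 eyePos, float width, float height) {
        // Basis vectors: look points from the particle to the eye.
        Vector3 look = Vector3.Normalize(eyePos - center);
        Vector3 right = Vector3.Normalize(Vector3.Cross(Vector3.UnitY, look));
        Vector3 up = Vector3.Cross(look, right);

        float halfWidth = 0.5f * width;
        float halfHeight = 0.5f * height;
        return new[] {
            center + halfWidth * right - halfHeight * up,
            center + halfWidth * right + halfHeight * up,
            center - halfWidth * right - halfHeight * up,
            center - halfWidth * right + halfHeight * up
        };
    }

    public static void Main() {
        // A 2x2 quad at the origin, viewed from 10 units back along -Z.
        foreach (var corner in Corners(Vector3.Zero, new Vector3(0, 0, -10), 2.0f, 2.0f)) {
            Console.WriteLine(corner);
        }
    }
}
```

One caveat of this construction is that it degenerates when the eye is directly above or below the particle, since the cross product of the world up vector and the look vector is then zero.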


This is getting a little long, so I won’t go over the Rain.fx particle shader in detail.  If you are interested, you can peruse the shader code in the GitHub repository.

Generating a Random Texture

The last piece of the puzzle that we have not yet discussed is generating the random texture that we will input to the shader effect.  HLSL does not have a random number generation function, so if we want to use random numbers in our shader code, we have to upload the random values ourselves.  The easiest way to do this is by creating a 1D texture and populating that texture with random pixel data.  We will add a function to our Util class, CreateRandomTexture1DSRV(), which will generate a texture containing 1024 random 4D vectors.

public static ShaderResourceView CreateRandomTexture1DSRV(Device device) {
    var randomValues = new List<Vector4>();
    for (int i = 0; i < 1024; i++) {
        randomValues.Add(new Vector4(MathF.Rand(-1.0f, 1.0f), MathF.Rand(-1.0f, 1.0f), MathF.Rand(-1.0f, 1.0f), MathF.Rand(-1.0f, 1.0f)));
    }
    var texDesc = new Texture1DDescription() {
        ArraySize = 1,
        BindFlags = BindFlags.ShaderResource,
        CpuAccessFlags = CpuAccessFlags.None,
        Format = Format.R32G32B32A32_Float,
        MipLevels = 1,
        OptionFlags = ResourceOptionFlags.None,
        Usage = ResourceUsage.Immutable,
        Width = 1024
    };
    var randTex = new Texture1D(device, texDesc, new DataStream(randomValues.ToArray(), false, false));

    var viewDesc = new ShaderResourceViewDescription() {
        Format = texDesc.Format,
        Dimension = ShaderResourceViewDimension.Texture1D,
        MipLevels = texDesc.MipLevels,
        MostDetailedMip = 0
    };
    var randTexSRV = new ShaderResourceView(device, randTex, viewDesc);
    ReleaseCom(ref randTex);
    return randTexSRV;
}


Particle effects. There is some weirdness with texture sampling on the particular GPU I am using, so some of the flame particles act oddly.


Next Time…

Next time, we’ll take a look at shadow mapping, which is a technique for generating dynamic shadows on arbitrary scene geometry.  This is a much more powerful, albeit complex, method of generating shadows than the simple planar shadows that we implemented earlier.


  1. This demo fails at the line:
    UnwalkableSRV = ShaderResourceView.FromFile(terrainRenderer._device, "textures/unwalkable.png");
    as that file is not included in the project.

  2. Yup, it's a little bit of code rot. It should be fixed now; thanks for finding it and letting me know.
    What I should have done for each of these tutorials is version the main engine project for each example