Richards Software Ramblings: Skinned Models in DirectX 11 with SlimDX and Assimp.Net

Sorry for the hiatus, I’ve been very busy with work and life the last couple weeks. Today, we’re going to look at loading meshes with skeletal animations in DirectX 11, using SlimDX and Assimp.Net in C#. This will probably be our most complicated example yet, so bear with me. This example is inspired by Chapter 25 of Frank Luna’s Introduction to 3D Game Programming with Direct3D 11.0, although with some heavy modifications. Mr. Luna’s code uses a custom animation format, which I found less than totally useful; realistically, we would want to be able to load skinned meshes exported in one of the commonly used 3D modeling formats. To facilitate this, we will again use the .NET port of the Assimp library, Assimp.Net. The code I am using to load and interpret the animation and bone data is heavily based on Scott Lee’s Animation Importer code, ported to C#. The full source for this example can be found on my GitHub repository, at https://github.com/ericrrichards/dx11.git under the SkinnedModels project. The meshes used in the example are taken from the example code for Carl Granberg’s Programming an RTS Game with Direct3D.

Skeletal animation is the standard way to animate 3D character models. Generally, a character model will be represented by two structures: the exterior vertex mesh, or skin, and a tree of control points specifying the joints or bones that make up the skeleton of the mesh. Each vertex in the skin is associated with one or more bones, along with a weight that determines how much influence the bone should have on the final position of the skin vertex. Each bone is represented by a transformation matrix specifying the translation, rotation and scale that determines the final position of the bone. The bones are defined in a hierarchy, so that each bone’s transformation is specified relative to its parent bone. Thus, given a standard bipedal skeleton, if we rotate the upper arm bone of the model, this rotation will propagate to the lower arm and hand bones of the model, analogously to how our actual joints and bones work.

Animations are defined by a series of keyframes, each of which specifies the transformation of each bone in the skeleton at a given time. To get the appropriate transformation at a given time t, we interpolate between the two keyframes that bracket t. Because of this, we will typically store the bone transformations in a decomposed form, specifying the translation, scale and rotation components separately, and build the transformation matrix at a given time from the interpolated components. A skinned model may contain many different animation sets; for instance, we’ll commonly have a walk animation, an attack animation, an idle animation, and a death animation.
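As a rough illustration of building a bone matrix from decomposed keyframes, here is a minimal sketch. It uses System.Numerics rather than SlimDX so that it compiles on its own, and the Keyframe type and KeyframeSketch.Interpolate() helper are hypothetical names invented for this example, not part of the project code.

```csharp
using System;
using System.Numerics;

// Hypothetical keyframe type for illustration only.
public struct Keyframe {
    public float Time;
    public Vector3 Translation;
    public Quaternion Rotation;
    public Vector3 Scale;
}

public static class KeyframeSketch {
    // Blends two keyframes that bracket time t and builds a row-vector
    // (DirectX-style) transformation matrix from the blended components.
    public static Matrix4x4 Interpolate(Keyframe a, Keyframe b, float t) {
        var span = b.Time - a.Time;
        // Guard against coincident keys so we never divide by zero.
        var factor = span > 0.0f ? (t - a.Time) / span : 0.0f;
        factor = Math.Max(0.0f, Math.Min(1.0f, factor));

        // Linear interpolation for translation and scale, spherical for rotation.
        var translation = Vector3.Lerp(a.Translation, b.Translation, factor);
        var scale = Vector3.Lerp(a.Scale, b.Scale, factor);
        var rotation = Quaternion.Slerp(a.Rotation, b.Rotation, factor);

        // Scale first, then rotate, then translate.
        return Matrix4x4.CreateScale(scale)
             * Matrix4x4.CreateFromQuaternion(rotation)
             * Matrix4x4.CreateTranslation(translation);
    }
}
```

Interpolating the components separately, rather than blending two finished matrices element by element, keeps the rotation part a valid rotation at every intermediate time.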

The process of loading an animated mesh can be summarized as follows:

1. Extract the bone hierarchy of the model skeleton.
2. Extract the animations from the model, along with all bone keyframes for each animation.
3. Extract the skin vertex data, along with the vertex bone indices and weights.
4. Extract the model materials and textures.

To draw the skinned model, we need to advance the animation to the correct frame, then pass the bone transforms to our vertex shader, where we will use each vertex’s bone indices and weights to transform the vertex position to the proper location.

SkinnedModel Class

Our SkinnedModel class will be our high-level container for an animated model. This class is very similar to our BasicModel class, with added functionality for animation. It would probably be possible to make our SkinnedModel class a subclass of BasicModel, but at present, I don’t have any need to do so. The declaration and member variables for the SkinnedModel class are:

public class SkinnedModel : DisposableClass {
    private MeshGeometry _modelMesh;
    public MeshGeometry ModelMesh { get { return _modelMesh; } }

    private readonly List<MeshGeometry.Subset> _subsets;
    public int SubsetCount { get { return _subsets.Count; } }

    private readonly List<PosNormalTexTanSkinned> _vertices;
    private readonly List<short> _indices;

    protected internal SceneAnimator Animator { get; private set; }

    public List<Material> Materials { get; private set; }
    public List<ShaderResourceView> DiffuseMapSRV { get; private set; }
    public List<ShaderResourceView> NormalMapSRV { get; private set; }

    public BoundingBox BoundingBox { get; private set; }
    private Vector3 _min;
    private Vector3 _max;

    private bool _disposed;
}

The only changes from our BasicModel declaration are a different vertex structure, which adds support for bone indices and weights, and the SceneAnimator, a new class we will implement to store and control the animation sets for the model. Our new vertex structure is defined as:

public struct PosNormalTexTanSkinned {
    public Vector3 Pos;
    public Vector3 Normal;
    public Vector2 Tex;
    public Vector4 Tan;
    public float Weight;
    public BonePalette BoneIndices;

    public static readonly int Stride = Marshal.SizeOf(typeof(PosNormalTexTanSkinned));

    public PosNormalTexTanSkinned(Vector3 pos, Vector3 norm, Vector2 uv, Vector3 tan, float weight, byte[] boneIndices) {
        Pos = pos;
        Normal = norm;
        Tex = uv;
        Tan = new Vector4(tan, 0);
        Weight = weight;
        BoneIndices = new BonePalette();
        for (int index = 0; index < boneIndices.Length; index++) {
            switch (index) {
                case 0:
                    BoneIndices.B0 = boneIndices[index];
                    break;
                case 1:
                    BoneIndices.B1 = boneIndices[index];
                    break;
                case 2:
                    BoneIndices.B2 = boneIndices[index];
                    break;
                case 3:
                    BoneIndices.B3 = boneIndices[index];
                    break;
            }
        }
                
    }
}
public struct BonePalette {
    public byte B0, B1, B2, B3;
}

The BonePalette portion is probably a little hacky, but I had a devil of a time getting a proper int array to pass into the shader code correctly. Note that this implementation limits us to 256 bones per model, since each bone index is stored in a single byte. That is well above the number of bones you would realistically use; Shader Model 4.0 allows a great deal more shader constants than previous versions, so you could use more bones in theory, but most game models should not have more than 256 bones. Also note that, for simplicity, we are only going to support two bone weights per vertex (the second weight will be calculated as 1.0f - weight). To support 4 bones per vertex, you would need to change the Weight member to a Vector3 and make the corresponding changes.
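To make the two-weight scheme concrete, here is a minimal CPU-side sketch of the blend that the vertex shader would perform for a position. It is an illustration under assumptions, not the shader from this project: it uses System.Numerics in place of SlimDX, and SkinningSketch.SkinPosition() is a hypothetical helper name.

```csharp
using System.Numerics;

public static class SkinningSketch {
    // Blends a position between two bones. Only the first weight is stored
    // per vertex; the second is implied as 1 - weight0.
    public static Vector3 SkinPosition(Vector3 position, float weight0,
                                       int bone0, int bone1,
                                       Matrix4x4[] boneTransforms) {
        var weight1 = 1.0f - weight0;
        // Transform the bind-pose position by each influencing bone...
        var p0 = Vector3.Transform(position, boneTransforms[bone0]);
        var p1 = Vector3.Transform(position, boneTransforms[bone1]);
        // ...and combine the results according to the weights.
        return p0 * weight0 + p1 * weight1;
    }
}
```

Because the two weights always sum to one, a vertex influenced by a single bone can simply store a weight of 1.0f and any valid index in the second slot.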

The InputLayout for this vertex structure needs to be added to our InputLayouts static class, as follows:

public static readonly InputElement[] PosNormalTexTanSkinned = {
    new InputElement("POSITION", 0, Format.R32G32B32_Float, 0, 0, InputClassification.PerVertexData, 0),
    new InputElement("NORMAL", 0, Format.R32G32B32_Float, InputElement.AppendAligned, 0, InputClassification.PerVertexData, 0), 
    new InputElement("TEXCOORD", 0, Format.R32G32_Float, InputElement.AppendAligned, 0, InputClassification.PerVertexData, 0),
    new InputElement("TANGENT", 0, Format.R32G32B32A32_Float, InputElement.AppendAligned, 0, InputClassification.PerVertexData,0 ),
    new InputElement("BLENDWEIGHT", 0, Format.R32_Float, InputElement.AppendAligned, 0, InputClassification.PerVertexData, 0),
    new InputElement("BLENDINDICES", 0, Format.R8G8B8A8_UInt, InputElement.AppendAligned, 0, InputClassification.PerVertexData, 0), 
};

var passDesc = tech.Light1SkinnedTech.GetPassByIndex(0).Description;
PosNormalTexTanSkinned = new InputLayout(device, passDesc.Signature, InputLayoutDescriptions.PosNormalTexTanSkinned);
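Since the input elements rely on AppendAligned, the byte offsets are derived from the declaration order, so the C# struct fields must be laid out in exactly the same order and sizes. The following sketch is one way to sanity-check that. The Float2, Float3, Float4 and Palette types are stand-ins declared only so the example compiles without SlimDX (assuming the SlimDX vectors are tightly packed 32-bit floats), and LayoutCheck is a hypothetical helper.

```csharp
using System;
using System.Runtime.InteropServices;

// Stand-ins for the SlimDX vector types, assumed to be packed floats.
[StructLayout(LayoutKind.Sequential)] public struct Float2 { public float X, Y; }
[StructLayout(LayoutKind.Sequential)] public struct Float3 { public float X, Y, Z; }
[StructLayout(LayoutKind.Sequential)] public struct Float4 { public float X, Y, Z, W; }
[StructLayout(LayoutKind.Sequential)] public struct Palette { public byte B0, B1, B2, B3; }

// Mirrors the field order of the skinned vertex structure.
[StructLayout(LayoutKind.Sequential)]
public struct SkinnedVertexLayout {
    public Float3 Pos;          // POSITION,     R32G32B32_Float
    public Float3 Normal;       // NORMAL,       R32G32B32_Float
    public Float2 Tex;          // TEXCOORD,     R32G32_Float
    public Float4 Tan;          // TANGENT,      R32G32B32A32_Float
    public float Weight;        // BLENDWEIGHT,  R32_Float
    public Palette BoneIndices; // BLENDINDICES, R8G8B8A8_UInt
}

public static class LayoutCheck {
    // Byte offset of a named field within the marshalled struct.
    public static int OffsetOf(string field) {
        return (int)Marshal.OffsetOf(typeof(SkinnedVertexLayout), field);
    }

    // Total size of one vertex in bytes.
    public static int Stride {
        get { return Marshal.SizeOf(typeof(SkinnedVertexLayout)); }
    }
}
```

With this layout the four bone index bytes occupy a single 4-byte slot after the weight, which is why the R8G8B8A8_UInt format lines up with the BonePalette struct.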

Loading the Model

Loading the model is very similar to our previous BasicModel loading code. The optional flipTexY parameter was added because the models I am using appear to have their UV texture coordinates flipped for some reason. We load the animation data first, by creating a SceneAnimator object and calling its Init() method, passing in the imported Assimp Scene object. We’ll get into this a little bit later. Again, we need to process each submesh of the object, and extract the subset information, vertices, indices, and material data. Finally, we initialize the ModelMesh data with the extracted subsets, vertices and indices, to create the necessary DirectX buffers.

public SkinnedModel(Device device, TextureManager texMgr, string filename, string texturePath, bool flipTexY = false) {
    // initialize collections
    _subsets = new List<MeshGeometry.Subset>();
    _vertices = new List<PosNormalTexTanSkinned>();
    _indices = new List<short>();
    DiffuseMapSRV = new List<ShaderResourceView>();
    NormalMapSRV = new List<ShaderResourceView>();
    Materials = new List<Material>();
            
    var importer = new AssimpImporter();
#if DEBUG
    importer.AttachLogStream(new ConsoleLogStream());
    importer.VerboseLoggingEnabled = true;
#endif
    var model = importer.ImportFile(filename, PostProcessSteps.GenerateSmoothNormals | PostProcessSteps.CalculateTangentSpace );
    
    // Load animation data
    Animator = new SceneAnimator();
    Animator.Init(model);

    // create our vertex-to-boneweights lookup
    var vertToBoneWeight = new Dictionary<uint, List<VertexWeight>>();
    // create bounding box extents
    _min = new Vector3(float.MaxValue);
    _max = new Vector3(float.MinValue);

    foreach (var mesh in model.Meshes) {
        ExtractBoneWeightsFromMesh(mesh, vertToBoneWeight);
        var subset = new MeshGeometry.Subset {
            VertexCount = mesh.VertexCount,
            VertexStart = _vertices.Count,
            FaceStart = _indices.Count / 3,
            FaceCount = mesh.FaceCount
        };
        _subsets.Add(subset);

        var verts = ExtractVertices(mesh, vertToBoneWeight, flipTexY);
        _vertices.AddRange(verts);
        // extract indices and shift them to the proper offset into the combined vertex buffer
        var indices = mesh.GetIndices().Select(i => (short)(i + (uint)subset.VertexStart)).ToList();
        _indices.AddRange(indices);

        // extract materials
        var mat = model.Materials[mesh.MaterialIndex];
        var material = mat.ToMaterial();
        Materials.Add(material);

        // extract material textures
        var diffusePath = mat.GetTexture(TextureType.Diffuse, 0).FilePath;
        if (!string.IsNullOrEmpty(diffusePath)) {
            DiffuseMapSRV.Add(texMgr.CreateTexture(Path.Combine(texturePath, diffusePath)));
        }
        var normalPath = mat.GetTexture(TextureType.Normals, 0).FilePath;
        if (!string.IsNullOrEmpty(normalPath)) {
            NormalMapSRV.Add(texMgr.CreateTexture(Path.Combine(texturePath, normalPath)));
        } else {
            // for models created without a normal map baked, we'll check for a texture with the same 
            // filename as the diffuse texture, and _nmap suffixed
            // this lets us add our own normal maps easily
            var normalExt = Path.GetExtension(diffusePath);
            normalPath = Path.GetFileNameWithoutExtension(diffusePath) + "_nmap" + normalExt;
            if (File.Exists(Path.Combine(texturePath, normalPath))) {
                NormalMapSRV.Add(texMgr.CreateTexture(Path.Combine(texturePath, normalPath)));
            }
        }
    }
    BoundingBox = new BoundingBox(_min, _max);
    _modelMesh = new MeshGeometry();
    _modelMesh.SetSubsetTable(_subsets);
    _modelMesh.SetVertices(device, _vertices);
    _modelMesh.SetIndices(device, _indices);
}

As we process each mesh, we need to create a mapping between the vertex indices and the bones with which each vertex is associated. Assimp loads this information as the reverse mapping; that is, each bone in the mesh contains a list of the vertices it influences, along with their weights. Our ExtractBoneWeightsFromMesh() function creates this lookup.

private void ExtractBoneWeightsFromMesh(Mesh mesh, IDictionary<uint, List<VertexWeight>> vertToBoneWeight) {
    foreach (var bone in mesh.Bones) {
        var boneIndex = Animator.GetBoneIndex(bone.Name);
        // bone weights are recorded per bone in assimp, with each bone containing a list of the vertices influenced by it
        // we really want the reverse mapping, i.e. lookup the vertexID and get the bone id and weight
        // We'll support up to 4 bones per vertex, so we need a list of weights for each vertex
        foreach (var weight in bone.VertexWeights) {
            if (vertToBoneWeight.ContainsKey(weight.VertexID)) {
                vertToBoneWeight[weight.VertexID].Add(new VertexWeight((uint) boneIndex, weight.Weight));
            } else {
                vertToBoneWeight[weight.VertexID] = new List<VertexWeight>(
                    new[] {new VertexWeight((uint) boneIndex, weight.Weight)}
                );
            }
        }
    }
}
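The same inversion can be shown in isolation, without any Assimp types. This is only a sketch: Influence and WeightLookupSketch.Invert() are hypothetical names, and the input is a plain list of per-bone (vertexId, weight) pairs standing in for the data that Assimp stores on each bone.

```csharp
using System.Collections.Generic;

// Hypothetical record of one bone's influence on a vertex.
public struct Influence {
    public int BoneIndex;
    public float Weight;
    public Influence(int boneIndex, float weight) {
        BoneIndex = boneIndex;
        Weight = weight;
    }
}

public static class WeightLookupSketch {
    // perBone[b] lists (vertexId, weight) pairs for bone b.
    // The result maps each vertexId to every bone that influences it.
    public static Dictionary<uint, List<Influence>> Invert(
            IList<KeyValuePair<uint, float>[]> perBone) {
        var result = new Dictionary<uint, List<Influence>>();
        for (var bone = 0; bone < perBone.Count; bone++) {
            foreach (var pair in perBone[bone]) {
                List<Influence> list;
                // TryGetValue avoids a second dictionary lookup per weight.
                if (!result.TryGetValue(pair.Key, out list)) {
                    list = new List<Influence>();
                    result[pair.Key] = list;
                }
                list.Add(new Influence(bone, pair.Value));
            }
        }
        return result;
    }
}
```

Note that a vertex with no bone influences never appears in the resulting dictionary, so code that indexes into it directly should be prepared for a missing key.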

Once we have generated this vertex-to-boneID lookup, we can extract the vertex data from the Assimp mesh and transform the vertices to our vertex structure. We also keep a running maximum and minimum extent for the mesh, so that we can create the bounding box after extracting all vertex data.

private IEnumerable<PosNormalTexTanSkinned> ExtractVertices(Mesh mesh, IReadOnlyDictionary<uint, List<VertexWeight>> vertToBoneWeights, bool flipTexY) {
    var verts = new List<PosNormalTexTanSkinned>();
    for (var i = 0; i < mesh.VertexCount; i++) {
        var pos = mesh.HasVertices ? mesh.Vertices[i].ToVector3() : new Vector3();
        _min = Vector3.Minimize(_min, pos);
        _max = Vector3.Maximize(_max, pos);
                
        var norm = mesh.HasNormals ? mesh.Normals[i] : new Vector3D();

        var tan = mesh.HasTangentBasis ? mesh.Tangents[i] : new Vector3D();
        var texC = new Vector3D();
        if (mesh.HasTextureCoords(0)) {
            var coord = mesh.GetTextureCoords(0)[i];
            if (flipTexY) {
                coord.Y = -coord.Y;
            }
            texC = coord;
        }
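        // Note: ExtractBoneWeightsFromMesh() repurposes VertexWeight, storing the bone index in its VertexID field
        var weights = vertToBoneWeights[(uint) i].Select(w => w.Weight).ToArray();
        var boneIndices = vertToBoneWeights[(uint) i].Select(w => (byte) w.VertexID).ToArray();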

        var v = new PosNormalTexTanSkinned(pos, norm.ToVector3(), texC.ToVector2(), tan.ToVector3(), weights.First(), boneIndices);
        verts.Add(v);
    }
    return verts;
}

Loading Animation and Bone Data

Now, we have loaded all of the vertex, index and material data for our mesh, but we have not discussed loading the mesh bones and animations. For that, we need to look at our SceneAnimator class. Our SceneAnimator class stores the model skeleton and the animation set data. Its definition is as follows:

public class SceneAnimator {
    private Bone _skeleton;
    private readonly Dictionary<string, Bone> _bonesByName;
    private readonly Dictionary<string, int> _bonesToIndex;
    private readonly Dictionary<string, int> _animationNameToId;
    private readonly List<Bone> _bones;
    public List<AnimEvaluator> Animations { get; private set; }
    private int CurrentAnimationIndex { get; set; }
    public bool HasSkeleton { get { return _bones.Count > 0; } }
    public string AnimationName { get { return Animations[CurrentAnimationIndex].Name; } }
    public float AnimationSpeed { get { return Animations[CurrentAnimationIndex].TicksPerSecond; } }
    public float Duration {
        get { return Animations[CurrentAnimationIndex].Duration/ Animations[CurrentAnimationIndex].TicksPerSecond; }
    }

    public SceneAnimator() {
        _skeleton = null;
        CurrentAnimationIndex = -1;
        _bonesByName = new Dictionary<string, Bone>();
        _bonesToIndex = new Dictionary<string, int>();
        _animationNameToId = new Dictionary<string, int>();
        _bones = new List<Bone>();
        Animations = new List<AnimEvaluator>();
    }
}

Some of these members deserve some further explanation.

_skeleton – This member contains the root bone of our bone hierarchy tree.
_bonesByName – This provides us an easy way to look up a bone by name.
_bonesToIndex – This provides us an easy way to look up the index of a bone in our _bones list by the bone name.
_animationNameToId – This allows us to easily look up an animation index in our Animations list by the animation clip name.
_bones – This is a flattened version of our bone hierarchy.
Animations – This contains each animation clip that is defined for the model. We’ll look at the AnimEvaluator class in more detail a little later.

Bone Structure

We’ll need to define a simple structure to maintain our model bones. This structure will maintain a handful of different transformation matrices, the name of the bone, and references to the bone’s parent and children in the bone hierarchy tree. Maintaining both the child and parent references allows us to traverse the bone hierarchy both top-to-bottom and bottom-to-top.

public class Bone {
    public string Name { get; set; }
    // Bind space transform
    public Matrix Offset { get; set; }
    // local matrix transform
    public Matrix LocalTransform { get; set; }
    // To-root transform
    public Matrix GlobalTransform { get; set; }
    // copy of the original local transform
    public Matrix OriginalLocalTransform { get; set; }
    // parent bone reference
    public Bone Parent { get; set; }
    // child bone references
    public List<Bone> Children { get; private set; }
    public Bone() {
        Children = new List<Bone>();
    }
}

Initializing the SceneAnimator from the Assimp Scene

Our SceneAnimator.Init() function takes the imported Assimp Scene object and extracts the bones and animations from the model. Our first step is to clear any pre-existing animation and bone data, in the event that we are re-initializing the animator. Next, we extract the bone hierarchy information using our CreateBoneTree function, starting at the root node of the scene.

public void Init(Scene scene) {
    if (!scene.HasAnimations) {
        return;
    }
    Release();
    _skeleton = CreateBoneTree(scene.RootNode, null);

    // more...

Our CreateBoneTree() function walks the Assimp Scene node hierarchy. Not all nodes in this hierarchy will actually be bones (the .X file format appears to usually have one leaf devoted to bones, and one containing the skin mesh…), but we’re going to extract all of the nodes into our bone hierarchy regardless. Sometimes, you will find nodes in this hierarchy which are not named; this will mess up our name-to-bone dictionary, even though we will probably never need to access these bones, so we give them temporary names in this case. Next, we extract the local transform of the node; this is the offset of this node relative to its parent node. Assimp uses a different column-row orientation than DirectX, so we need to transpose this matrix. After we determine the local transform of the bone, we need to walk back up the bone hierarchy to the root, in order to determine the global transformation of the bone in model space. Finally, we need to create the bone trees for each child node recursively.

private int _i;
private Bone CreateBoneTree(Node node, Bone parent) {
            
    var internalNode = new Bone {
        Name = node.Name, Parent = parent
    };
    if (internalNode.Name == "") {
        internalNode.Name = "foo" + _i++;
    }

    _bonesByName[internalNode.Name] = internalNode;
    var trans = node.Transform;
    trans.Transpose();
    internalNode.LocalTransform = trans.ToMatrix();
    internalNode.OriginalLocalTransform = internalNode.LocalTransform;
    CalculateBoneToWorldTransform(internalNode);

    for (var i = 0; i < node.ChildCount; i++) {
        var child = CreateBoneTree(node.Children[i], internalNode);
        if (child != null) {
            internalNode.Children.Add(child);
        }
    }
    return internalNode;
}

private static void CalculateBoneToWorldTransform(Bone child) {
    child.GlobalTransform = child.LocalTransform;
    var parent = child.Parent;
    while (parent != null) {
        child.GlobalTransform *= parent.LocalTransform;
        parent = parent.Parent;
    }
}
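To see how a parent’s transform propagates to its children, here is a small standalone sketch of the same walk up the hierarchy. It assumes System.Numerics in place of SlimDX, and SketchBone is a hypothetical, stripped-down stand-in for the Bone class.

```csharp
using System.Numerics;

// Minimal stand-in for a bone: a local transform and a parent link.
public class SketchBone {
    public Matrix4x4 Local = Matrix4x4.Identity;
    public SketchBone Parent;

    // Row-vector convention: apply this bone's local transform first,
    // then each ancestor's local transform in turn, up to the root.
    public Matrix4x4 ToRoot() {
        var global = Local;
        for (var p = Parent; p != null; p = p.Parent) {
            global *= p.Local;
        }
        return global;
    }
}
```

For example, if an upper arm bone is translated by (1, 0, 0) and its forearm child is translated by (0, 2, 0) relative to it, the forearm ends up at (1, 2, 0) in model space. Changing only the upper arm’s local transform moves the forearm as well, which is the propagation behavior described at the start of this post.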

Next, we loop through each bone in the scene according to the meshes to create our flattened version of the bone hierarchy. In Assimp, the mesh bones and the scene nodes are not the same, but we can look up the scene node from the mesh bone by name. Once we have found the correct Bone for this Assimp Bone, we store the offset matrix (which transforms vertices from mesh space into the bone’s local space in the bind pose), and add the bone to our _bones list. For some reason, I found that a number of bones that I needed for the animation to render properly were not actually attached to the meshes and so did not have offset matrices, hence the final foreach loop. I am not sure if this is really correct, but it seems to work, and gets me a number of bones that matches the number of bones in the animations in my models.

// continuing Init()...
foreach (var mesh in scene.Meshes) {
    foreach (var bone in mesh.Bones) {
        Bone found;
        if (!_bonesByName.TryGetValue(bone.Name, out found)) continue;

        var skip = (from t in _bones let bname = bone.Name where t.Name == bname select t).Any();
        if (skip) continue;

        found.Offset = Matrix.Transpose(bone.OffsetMatrix.ToMatrix());
        _bones.Add(found);
        _bonesToIndex[found.Name] = _bones.IndexOf(found);
    }
    var mesh1 = mesh;
    foreach (var bone in _bonesByName.Keys.Where(b => mesh1.Bones.All(b1 => b1.Name != b) && b.StartsWith("Bone"))) {
        _bonesByName[bone].Offset = _bonesByName[bone].Parent.Offset;
        _bones.Add(_bonesByName[bone]);
        _bonesToIndex[bone] = _bones.IndexOf(_bonesByName[bone]);
    }
}
// more...

Extracting Animations

After we have extracted the bone information, we need to extract the animation information. For this, we will create a helper function, ExtractAnimations(), which will grab the Assimp Animations present in the Scene and convert them into our AnimEvaluator objects.

// Init() continued...
ExtractAnimations(scene);

// more Init()...

private void ExtractAnimations(Scene scene) {
    foreach (var animation in scene.Animations) {
        Animations.Add(new AnimEvaluator(animation));
    }
    for (var i = 0; i < Animations.Count; i++) {
        _animationNameToId[Animations[i].Name] = i;
    }
    CurrentAnimationIndex = 0;
}

AnimEvaluator Class

Our AnimEvaluator class represents one animation clip of the model. This class maintains some metadata about the animation, such as its name, its duration in ticks, and the number of ticks per second it should be run at, along with a list of AnimationChannels. An AnimationChannel represents the keyframes of the animation for a single bone in the model. We are also going to pre-compute the interpolated transformations for each bone for each frame of the animation; this will allow us to simply look up the bone matrices of a given frame of the animation, rather than recalculating from the keyframes each time we update the animation.

public class AnimEvaluator {
    public string Name { get; private set; }
    private List<AnimationChannel> Channels { get; set; }
    public bool PlayAnimationForward { get; set; }
    private float LastTime { get; set; }
    public float TicksPerSecond { get; set; }
    public float Duration { get; private set; }
    private List<MutableTuple<int, int, int>> LastPositions { get; set; }
    public List<List<Matrix>> Transforms { get; private set; }

    public AnimEvaluator(Animation anim) {
        LastTime = 0.0f;
        TicksPerSecond = anim.TicksPerSecond > 0.0f ? (float)anim.TicksPerSecond : 920.0f;
        Duration = (float)anim.DurationInTicks;
        Name = anim.Name;
        Channels = new List<AnimationChannel>();
        foreach (var channel in anim.NodeAnimationChannels) {
            var c = new AnimationChannel {
                Name = channel.NodeName,
                PositionKeys = channel.PositionKeys.ToList(),
                RotationKeys = channel.RotationKeys.ToList(),
                ScalingKeys = channel.ScalingKeys.ToList()
            };
            Channels.Add(c);
        }
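        // create a separate tuple per channel; Enumerable.Repeat would share one instance across all channels
        LastPositions = Enumerable.Range(0, anim.NodeAnimationChannelCount).Select(_ => new MutableTuple<int, int, int>(0, 0, 0)).ToList();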
        Transforms = new List<List<Matrix>>();
        PlayAnimationForward = true;
    }

As I mentioned, each AnimationChannel in the AnimEvaluator stores the animation keyframes for a given bone in the model. These keyframes are decomposed by Assimp as position, rotation and scaling keys, with a time position and transformation value. Our AnimationChannel class simply holds all of the keyframe values for a given bone.

public class AnimationChannel {
    public string Name { get; set; }
    public List<VectorKey> PositionKeys { get; set; }
    public List<QuaternionKey> RotationKeys { get; set; }
    public List<VectorKey> ScalingKeys { get; set; }
}

Precomputing the Animation Frames

Once we have extracted the animation keyframes into our AnimEvaluator objects, we next need to precompute the bone transforms for each animation frame for each animation. We are going to compute our animations at 30 frames per second. For smoother animations, you could instead use a higher frame-per-second value, but 30 FPS works pretty well for my models.

// SceneAnimator.Init continued...
const float timestep = 1.0f / 30.0f;
    for (var i = 0; i < Animations.Count; i++) {
        SetAnimationIndex(i);
        var dt = 0.0f;
        for (var ticks = 0.0f; ticks < Animations[i].Duration; ticks += Animations[i].TicksPerSecond/30.0f) {
            dt += timestep;
            Calculate(dt);
            var trans = new List<Matrix>();
            for (var a = 0; a < _bones.Count; a++) {
                var rotMat = _bones[a].Offset * _bones[a].GlobalTransform;
                trans.Add(rotMat);
            }
            Animations[i].Transforms.Add(trans);
        }
    }
    Console.WriteLine("Finished loading animations with " + _bones.Count + " bones");
}
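It can help to estimate how many frames this loop will store per clip, since every frame holds one matrix per bone. The sketch below is a back-of-the-envelope helper, not part of the project code; FrameCountSketch.EstimateFrameCount() is a hypothetical name, and the actual loop may differ by a frame because it accumulates floating-point ticks.

```csharp
using System;

public static class FrameCountSketch {
    // Approximate number of precomputed frames for one clip:
    // (duration in seconds) * (frames per second), rounded up.
    public static int EstimateFrameCount(float durationInTicks,
                                         float ticksPerSecond,
                                         float framesPerSecond) {
        if (durationInTicks <= 0.0f || ticksPerSecond <= 0.0f || framesPerSecond <= 0.0f) {
            return 0;
        }
        var frames = durationInTicks * framesPerSecond / ticksPerSecond;
        return (int)Math.Ceiling(frames);
    }
}
```

As a worked case, a 120-tick clip authored at 24 ticks per second lasts 5 seconds, so sampling it at 30 frames per second gives roughly 150 frames. Multiply that by the bone count and by 64 bytes per matrix to get a rough memory cost per clip.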

The Calculate() function here does the heavy lifting. In a nutshell, it advances the appropriate AnimEvaluator to the appropriate set of keyframes for the time specified, which sets the bone LocalTransforms according to the animation, then updates the bone GlobalTransforms by walking back up the bone hierarchy multiplying by the bone’s parent transforms, using the CalculateBoneToWorldTransform() function presented earlier. We also need to update the transformations for each child bone recursively.

private void Calculate(float dt) {
    if (CurrentAnimationIndex < 0 || CurrentAnimationIndex >= Animations.Count) {
        return;
    }
    Animations[CurrentAnimationIndex].Evaluate(dt, _bonesByName);
    UpdateTransforms(_skeleton);
}

private static void UpdateTransforms(Bone node) {
    CalculateBoneToWorldTransform(node);
    foreach (var child in node.Children) {
        UpdateTransforms(child);
    }
}

Our AnimEvaluator Evaluate() function looks pretty hairy, but is not actually that complex. Effectively, what we are doing here is finding the appropriate keyframe pairs to interpolate between, getting the interpolated translation, rotation and scale, and then creating the combined transformation matrix for each bone in the mesh. The hairiness is introduced by some optimizations that assume that we will be iterating through the keyframes from first to last, so that we can save our last position and search forward through the keys from there, rather than starting at the beginning of the lists.

public void Evaluate(float dt, Dictionary<string, Bone> bones) {
    dt *= TicksPerSecond;
    var time = 0.0f;
    if (Duration > 0.0f) {
        time = dt % Duration;
    }
    for (int i = 0; i < Channels.Count; i++) {
        var channel = Channels[i];
        if (!bones.ContainsKey(channel.Name)) {
            Console.WriteLine("Did not find the bone node " + channel.Name);
            continue;
        }
        // interpolate position keyframes
        var pPosition = new Vector3D();
        if (channel.PositionKeys.Count > 0) {
            var frame = (time >= LastTime) ? LastPositions[i].Item1 : 0;
            while (frame < channel.PositionKeys.Count - 1) {
                if (time < channel.PositionKeys[frame + 1].Time) {
                    break;
                }
                frame++;
            }
            if (frame >= channel.PositionKeys.Count) {
                frame = 0;
            }

            var nextFrame = (frame + 1) % channel.PositionKeys.Count;

            var key = channel.PositionKeys[frame];
            var nextKey = channel.PositionKeys[nextFrame];
            var diffTime = nextKey.Time - key.Time;
            if (diffTime < 0.0) {
                diffTime += Duration;
            }
            if (diffTime > 0.0) {
                var factor = (float)((time - key.Time) / diffTime);
                pPosition = key.Value + (nextKey.Value - key.Value) * factor;
            } else {
                pPosition = key.Value;
            }
            LastPositions[i].Item1 = frame;

        }
        // interpolate rotation keyframes
        var pRot = new Assimp.Quaternion(1, 0, 0, 0);
        if (channel.RotationKeys.Count > 0) {
            var frame = (time >= LastTime) ? LastPositions[i].Item2 : 0;
            while (frame < channel.RotationKeys.Count - 1) {
                if (time < channel.RotationKeys[frame + 1].Time) {
                    break;
                }
                frame++;
            }
            if (frame >= channel.RotationKeys.Count) {
                frame = 0;
            }
            var nextFrame = (frame + 1) % channel.RotationKeys.Count;

            var key = channel.RotationKeys[frame];
            var nextKey = channel.RotationKeys[nextFrame];
            key.Value.Normalize();
            nextKey.Value.Normalize();
            var diffTime = nextKey.Time - key.Time;
            if (diffTime < 0.0) {
                diffTime += Duration;
            }
            if (diffTime > 0) {
                var factor = (float)((time - key.Time) / diffTime);
                pRot = Assimp.Quaternion.Slerp(key.Value, nextKey.Value, factor);
            } else {
                pRot = key.Value;
            }
            // Item2 caches the last rotation key index (Item1 is reserved for position keys)
            LastPositions[i].Item2 = frame;

        }
        // interpolate scale keyframes
        var pscale = new Vector3D(1);
        if (channel.ScalingKeys.Count > 0) {
            var frame = (time >= LastTime) ? LastPositions[i].Item3 : 0;
            while (frame < channel.ScalingKeys.Count - 1) {
                if (time < channel.ScalingKeys[frame + 1].Time) {
                    break;
                }
                frame++;
            }
            if (frame >= channel.ScalingKeys.Count) {
                frame = 0;
            }
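            // scale is not interpolated; use the value of the most recent scaling key
            pscale = channel.ScalingKeys[frame].Value;
            LastPositions[i].Item3 = frame;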
        }

        // create the combined transformation matrix
        var mat = new Matrix4x4(pRot.GetMatrix());
        mat.A1 *= pscale.X; mat.B1 *= pscale.X; mat.C1 *= pscale.X;
        mat.A2 *= pscale.Y; mat.B2 *= pscale.Y; mat.C2 *= pscale.Y;
        mat.A3 *= pscale.Z; mat.B3 *= pscale.Z; mat.C3 *= pscale.Z;
        mat.A4 = pPosition.X; mat.B4 = pPosition.Y; mat.C4 = pPosition.Z;

        // transpose to get DirectX style matrix
        mat.Transpose();
        bones[channel.Name].LocalTransform = mat.ToMatrix();
    }
    LastTime = time;
}

Getting the Bone Matrices at a given point in time

At this point, we have loaded all of our model mesh data, our bones, our animation, materials, and computed the frames of all of our animations. To actually animate a model using the animation transforms, we need a way to get the appropriate batch of bone matrices for a given point in time. To do that, we will add a method to our SceneAnimator, GetTransforms(dt). This function will in turn query the currently-selected AnimEvaluator, which will determine the proper frame of the animation to return, properly clamping the time position to the animation duration.

// SceneAnimator
public List<Matrix> GetTransforms(float dt) {
    return Animations[CurrentAnimationIndex].GetTransforms(dt);
}

// AnimEvaluator
public List<Matrix> GetTransforms(float dt) {
    return Transforms[GetFrameIndexAt(dt)];
}

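private int GetFrameIndexAt(float dt) {
    // a zero-length animation has only one meaningful frame, and would divide by zero below
    if (Duration <= 0.0f) {
        return 0;
    }
    // convert seconds to ticks, then wrap into the animation's duration
    dt *= TicksPerSecond;
    var time = dt % Duration;
    var percent = time / Duration;
    if (!PlayAnimationForward) {
        percent = (percent - 1.0f) * -1.0f;
    }
    var frameIndexAt = (int)(Transforms.Count * percent);
    // clamp: reversed playback at time 0 gives percent == 1, which would index one past the last frame
    return Math.Max(0, Math.Min(frameIndexAt, Transforms.Count - 1));
}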
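
To make the time-to-frame mapping concrete, here is a minimal standalone sketch of the same idea. FrameLookup and its parameter names are hypothetical and not part of the project; the sketch assumes the frames are spread evenly across the clip, and it clamps the final index so that reversed playback cannot run off the end of the list.

```csharp
using System;

// Hypothetical standalone helper; names are illustrative, not taken from the project.
public static class FrameLookup {
    // Maps elapsed seconds to an index into frameCount precomputed frames.
    public static int FrameIndexAt(float seconds, float ticksPerSecond, float durationTicks,
                                   int frameCount, bool forward) {
        if (frameCount <= 0 || durationTicks <= 0.0f) {
            return 0; // nothing to index, or a zero-length clip
        }
        var ticks = seconds * ticksPerSecond;
        var time = ticks % durationTicks;   // wrap so the clip loops
        if (time < 0.0f) {
            time += durationTicks;          // negative input wraps around from the end
        }
        var percent = time / durationTicks; // fraction of the clip that has elapsed
        if (!forward) {
            percent = 1.0f - percent;       // reversed playback counts down from the end
        }
        var index = (int)(frameCount * percent);
        // percent == 1 would land one past the last frame, so clamp
        return Math.Min(index, frameCount - 1);
    }
}
```

With 25 ticks per second, a 100-tick clip and 100 stored frames, 5 seconds is 125 ticks, which wraps to tick 25, so FrameIndexAt(5f, 25f, 100f, 100, true) selects frame 25 and the reversed call selects frame 75.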

SkinnedModelInstance Class

As you can probably imagine, if you have made it this far with me, loading a SkinnedModel is a fairly expensive operation. We might want to have hundreds or thousands of instances of a skinned model present in our scenes, so loading each individually would be prohibitively expensive, as well as very wasteful of memory; all the animation frames for a given model would be duplicated for each SkinnedModel. Thus, similar to our BasicModelInstance class, we will create a SkinnedModelInstance wrapper class, which will allow us to render many unique instances of a given skinned model, while only needing to load the model once.

The SkinnedModelInstance class will contain a reference to a SkinnedModel, the current animation time, a world transformation matrix to position and orient the mesh in world-space, and the active animation clip name. We’ll also add a queue that will allow us to play animations in sequence, and a flag to allow the queue to loop. We’ll fetch the bone transforms through a private read-only property that returns the transforms from the SkinnedModel’s SceneAnimator at the current time position.

public class SkinnedModelInstance {
    private readonly SkinnedModel _model;
    private float _timePos;
    public Matrix World { get; set; }
    public string ClipName {
        get { return _clipName; }
        set {
            _clipName = _model.Animator.Animations.Any(a => a.Name == value) ? value : "Still";
            _model.Animator.SetAnimation(_clipName);
            _timePos = 0;
        }
    }
    private string _clipName;

    // these are the available animation clips
    public IEnumerable<string> Clips { get { return _model.Animator.Animations.Select(a => a.Name); } } 

    private readonly Queue<string> _clipQueue = new Queue<string>();
    public bool LoopClips { get; set; }
        
    // the bone transforms for the mesh instance
    private List<Matrix> FinalTransforms { get { return _model.Animator.GetTransforms(_timePos); } }
        
    public SkinnedModelInstance(string clipName, Matrix transform, SkinnedModel model) {
        World = transform;
        _model = model;
        ClipName = clipName;
    }
}

Updating the SkinnedModelInstance

Updating the model instance is very simple; all we need to do is advance the _timePos variable by the new timestep. Aside from that, we check if the time has overrun the current animation duration, and either move on to the next clip in the queue, or set the model to a default animation.

public void Update(float dt) {
    _timePos += dt;

    if (_timePos > _model.Animator.Duration) {
        if (_clipQueue.Any()) {
            ClipName = _clipQueue.Dequeue();
            if (LoopClips) {
                _clipQueue.Enqueue(ClipName);
            }
        } else {
            ClipName = "Still";
        }
    }
}
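
The queue behavior is easier to see in isolation. The following is a simplified sketch rather than the project's class: ClipSequencer, its durations dictionary and its fallback clip are all hypothetical stand-ins for the SkinnedModel and its animator, and every clip name that is enqueued is assumed to have an entry in the dictionary.

```csharp
using System.Collections.Generic;

// Hypothetical, simplified stand-in for the instance's clip handling; not the project's class.
public class ClipSequencer {
    private readonly Dictionary<string, float> _durations; // clip name -> length in seconds
    private readonly Queue<string> _queue = new Queue<string>();
    private readonly string _fallback;
    private float _timePos;

    public string Current { get; private set; }
    public bool Loop { get; set; }

    public ClipSequencer(Dictionary<string, float> durations, string fallback) {
        _durations = durations;
        _fallback = fallback;
        Current = fallback;
    }

    public void Enqueue(params string[] clips) {
        foreach (var clip in clips) {
            _queue.Enqueue(clip);
        }
    }

    public void Update(float dt) {
        _timePos += dt;
        if (_timePos <= _durations[Current]) {
            return; // the current clip is still playing
        }
        _timePos = 0; // switching clips restarts the clock
        if (_queue.Count > 0) {
            Current = _queue.Dequeue();
            if (Loop) {
                _queue.Enqueue(Current); // put it back at the end so the sequence repeats
            }
        } else {
            Current = _fallback; // nothing queued: return to the default clip
        }
    }
}
```

Given clips "Still" (1 second), "Walk" (2 seconds) and "Attack" (1 second), a sequencer that starts on "Still" with "Walk" and "Attack" queued and Loop set will move to "Walk", then "Attack", then back to "Walk" as each clip overruns. With Loop unset, it falls back to "Still" once the queue is empty.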

Drawing the SkinnedModelInstance

Drawing a SkinnedModelInstance is almost the same as for our BasicModelInstance. The only difference is that we need to upload the bone transforms to the shader. Note that this function assumes that the global shader variables (dirLights, eyePosW, etc.) are set up correctly, and that the DeviceContext’s InputAssembler stage is properly configured.

public void Draw(DeviceContext dc, Matrix viewProj, EffectPass pass) {
            
    var world = World;
    var wit = MathF.InverseTranspose(world);
    var wvp = world * viewProj;

    Effects.NormalMapFX.SetWorld(world);
    Effects.NormalMapFX.SetWorldInvTranspose(wit);
    Effects.NormalMapFX.SetWorldViewProj(wvp);
    Effects.NormalMapFX.SetTexTransform(Matrix.Identity);

    Effects.NormalMapFX.SetBoneTransforms(FinalTransforms);
            
    for (int i = 0; i < _model.SubsetCount; i++) {
        Effects.NormalMapFX.SetMaterial(_model.Materials[i]);
        Effects.NormalMapFX.SetDiffuseMap(_model.DiffuseMapSRV[i]);
        Effects.NormalMapFX.SetNormalMap(_model.NormalMapSRV[i]);
                
        pass.Apply(dc);
        _model.ModelMesh.Draw(dc, i);
    }
}

SkinnedModel Vertex Shader

We’re not quite done yet. We still need to implement the shader code to deform our mesh vertices according to the bone matrices. I’m going to simply add this new vertex shader to our Normal Mapping shader, since the changes are fairly minimal. First, we need to add a new constant buffer to hold our bone matrices:

cbuffer cbSkinned
{
    float4x4 gBoneTransforms[96];
};

We also need to define a new vertex shader input structure, to match our PosNormTexTanSkinned vertex struct. Behold the SkinnedVertexIn:

struct SkinnedVertexIn
{
    float3 PosL      : POSITION;
    float3 NormalL   : NORMAL;
    float2 Tex       : TEXCOORD;
    float4 Tan       : TANGENT;
    float  Weight0   : BLENDWEIGHT;
    int4   BoneIndex : BLENDINDICES;
};

Lastly, we need to write a new vertex shader, to apply the bone skinning transformations to the input vertex, according to the bone weights. Using only two weights makes this a very simple linear interpolation. After that, we just do our normal transformations and populate a VertexOut structure that is the same as for our previous vertex shader.

VertexOut SkinnedVS(SkinnedVertexIn vin)
{
    VertexOut vout;
    
    // first bone weight
    float weight0 = vin.Weight0;
    // calculate second bone weight
    float weight1 = 1.0f - weight0;

    // offset position by bone matrices, using weights to scale
    float4 p     = weight0 * mul(float4(vin.PosL, 1.0f), gBoneTransforms[vin.BoneIndex[0]]);
    p += weight1 * mul(float4(vin.PosL, 1.0f), gBoneTransforms[vin.BoneIndex[1]]);
    p.w = 1.0f;

    // offset normal by bone matrices, using weights to scale
    float4 n     = weight0 * mul(float4(vin.NormalL, 0.0f), gBoneTransforms[vin.BoneIndex[0]]);
    n += weight1 * mul(float4(vin.NormalL, 0.0f), gBoneTransforms[vin.BoneIndex[1]]);
    n.w = 0.0f;

    // offset tangent by bone matrices, using weights to scale
    float4 t     = weight0 * mul(float4(vin.Tan.xyz, 0.0f), gBoneTransforms[vin.BoneIndex[0]]);
    t += weight1 * mul(float4(vin.Tan.xyz, 0.0f), gBoneTransforms[vin.BoneIndex[1]]);
    t.w = 0.0f;

 
    // Transform to world space.
    vout.PosW     = mul(p, gWorld).xyz;
    vout.NormalW  = mul(n, gWorldInvTranspose).xyz;
    // transform only the xyz part of the skinned tangent; w is passed through from the input
    vout.TangentW = float4(mul(t.xyz, (float3x3)gWorld), vin.Tan.w);
    // Transform to homogeneous clip space.
    vout.PosH = mul(p, gWorldViewProj);
    
    // Output vertex attributes for interpolation across triangle.
    vout.Tex = mul(float4(vin.Tex, 0.0f, 1.0f), gTexTransform).xy;

    return vout;
}
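
The two-weight blend can also be checked on the CPU. This is a minimal sketch using System.Numerics rather than SlimDX, chosen only so that it runs without the rest of the project; SkinningSketch and its method names are hypothetical. Like the shader, it treats vectors as rows multiplied on the left of the matrix.

```csharp
using System.Numerics;

// Hypothetical CPU-side illustration of the two-bone blend; not part of the project.
public static class SkinningSketch {
    // Positions are affected by the translation part of each bone matrix.
    public static Vector3 BlendPosition(Vector3 posL, float weight0,
                                        Matrix4x4 bone0, Matrix4x4 bone1) {
        var weight1 = 1.0f - weight0; // the two weights are assumed to sum to one
        return weight0 * Vector3.Transform(posL, bone0)
             + weight1 * Vector3.Transform(posL, bone1);
    }

    // Normals and tangents are directions, so translation is ignored (w = 0 in the shader).
    public static Vector3 BlendDirection(Vector3 dirL, float weight0,
                                         Matrix4x4 bone0, Matrix4x4 bone1) {
        var weight1 = 1.0f - weight0;
        return weight0 * Vector3.TransformNormal(dirL, bone0)
             + weight1 * Vector3.TransformNormal(dirL, bone1);
    }
}
```

For example, take a vertex at (1, 0, 0) with a weight of 0.75 on a bone that does nothing (the identity matrix) and the remaining 0.25 on a bone that translates by (0, 2, 0). The blended position is 0.75 * (1, 0, 0) + 0.25 * (1, 2, 0) = (1, 0.5, 0), while a normal of (0, 1, 0) is left unchanged because translation does not affect directions.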

Finally, we need to define new shader techniques using this new vertex shader. I have implemented skinned versions of all the techniques previously defined, but I’ll only show one here. The only change is to substitute the new SkinnedVS() for VS() in the SetVertexShader call.

technique11 Light3TexSkinned
{
    pass P0
    {
        SetVertexShader( CompileShader( vs_4_0, SkinnedVS() ) );
        SetGeometryShader( NULL );
        SetPixelShader( CompileShader( ps_4_0, PS(3, true, false, false, false) ) );
    }
}

Adding the necessary effect handles to the NormalMapEffect wrapper class should be relatively straightforward; see the code at my GitHub repository for the full implementation. The one gotcha is the implementation of the SetBoneTransforms() function. There, you will have to make sure that you use SetMatrixArray(), rather than SetMatrix(), to set the EffectMatrixVariable.

public void SetBoneTransforms(List<Matrix> bones) {
    _boneTransforms.SetMatrixArray(bones.ToArray());
}

The Results:

Skinned models animating through different animation clips

Next Time

If you’ve made it all the way here, congratulations! I hope that this was not too confusing, and please let me know if there is anything here that could be better explained. Next up, I’m going to circle back to Chapter 20 and look at particle systems. Hopefully, there won’t be as much lag time on that one, since it is quite a bit simpler than skinned meshes, and I’ve got all the code done already. After that will come shadow mapping, and then hopefully screen-space ambient occlusion. At that point, we’ll have covered the whole book! I might circle back and look at some of the more interesting exercises that I’ve skipped over, but I’m not totally sure what the next steps after that will be. I have a ton more material from Carl Granberg’s two books that I’d like to look at, as well as a couple of AI books that have some interesting material. At some point, I will want to take a hard refactoring push and rework some things into a more robust engine, as well as add some other polish, like sound effects, music, networking, a menu system, etc.


Tuesday, October 15, 2013
