
3D Matrices Update Optimization

4×4 matrices are the heart of any 3D engine as far as math is concerned. In any engine, how those matrices are computed and how they are exposed through the API are two critical points for both performance and ease of development. Minko was quite generous on the second point, making it simple to access and watch the local-to-world (and world-to-local) matrices of any scene node. Yet the update strategy for those matrices was... naïve, to say the least.

TL;DR

There is a new 3D transforms API available in the dev branch that provides a 25000% boost on scene node matrix updates in the best cases, making it possible to display 25x more animated objects. You can read more about the changes on Answers.

The Problem

Here is how it worked until very recently: each scene node was the target of a TransformController. The job of this controller was to listen for changes to the ISceneNode.transform property in order to update the localToWorld and worldToLocal properties accordingly. Of course, as those properties rely on the parent’s transform, the controller also listened to it, and that’s where the trouble starts: modifying the transform of a node triggers a lot of signal executions, implying a lot of overhead. The worst-case scenario is vertex skinning: the skeleton has to be updated top to bottom (because there is no other way to do it really…). Each update on a node triggers updates on all of its descendants, so a node is recomputed once for each of its animated ancestors and the total cost grows much faster than the number of nodes. The good point of this method is that all the scene node transforms are always available and “synchronized” with each other. The very bad point is that it does a lot of unnecessary computations through recursive signals, which imply a huge function call overhead.
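
To make that overhead concrete, here is a minimal sketch that counts how often local-to-world values get recomputed when a simple chain of bones is animated top to bottom with the old propagation scheme. It is written in TypeScript rather than AS3 so that it runs standalone, and NaiveNode and countChainUpdates are hypothetical names for illustration, not Minko API.

```typescript
// Hypothetical minimal node: it only keeps what is needed to count recomputations.
class NaiveNode {
    children: NaiveNode[] = [];
    updates = 0; // how many times this node's local-to-world was recomputed

    addChild(child: NaiveNode): NaiveNode {
        this.children.push(child);
        return child;
    }

    // Stands in for "transform changed": recompute this node, then notify
    // every child, which recursively does the same for its own children.
    transformChanged(): void {
        this.updates++;
        for (const child of this.children) {
            child.transformChanged();
        }
    }
}

// Build a chain of `length` bones, then animate it top to bottom:
// one transform change per bone, root first.
function countChainUpdates(length: number): number {
    const bones: NaiveNode[] = [new NaiveNode()];
    for (let i = 1; i < length; ++i) {
        bones.push(bones[i - 1].addChild(new NaiveNode()));
    }
    for (const bone of bones) {
        bone.transformChanged();
    }
    return bones.reduce((total, bone) => total + bone.updates, 0);
}

const naiveUpdatesFor10Bones = countChainUpdates(10);   // 55 recomputations
const naiveUpdatesFor100Bones = countChainUpdates(100); // 5050 recomputations
```

For a chain of n bones, the bone at depth d is recomputed d + 1 times, which gives n(n + 1) / 2 recomputations in total where n would be enough.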

The Solution

We had three priorities:
  1. Provide a new, robust and fast 3D transformation matrices computation strategy.
  2. Keep API changes to a minimum.
  3. Make sure we don’t remove any feature.
To speed up the computation of the local-to-world matrices, we decided to update them in a batched fashion: all the local-to-world matrices of a (sub)scene have to be updated in a single method call using a single loop. This rules out updating the (sub)scene as a tree data structure, since that would mean either recursion (and a big function call overhead) or stacks (which are not the strong point of AS3). Therefore, we had to linearize the (sub)scene in order to get a list of transforms. To do this, we simply do a breadth-first traversal of the (sub)scene tree whenever it has changed in order to get a flat list of matrices. We also store a few other things in other lists, such as the number of children and the id of the parent of each node, in order to preserve the data we need to traverse that flat list as a linearized tree structure. This job is done by the TransformController.updateTransformsList() method:
private function updateTransformsList() : void
{
    var root 	: ISceneNode 	= _target.root;
    var nodes 	: Vector.<ISceneNode> 	= new <ISceneNode>[root];
    var nodeId 	: uint 		= 0;

    _nodeToId = new Dictionary(true);
    _transforms = new <Matrix4x4>[];
    _localToWorldTransformsInitialized = new <Boolean>[];
    _localToWorldTransforms = new <Matrix4x4>[];
    _worldToLocalTransforms = new <Matrix4x4>[];
    _numChildren = new <uint>[];
    _firstChildId = new <uint>[];
    _idToNode = new <ISceneNode>[];
    _parentId = new <int>[-1];

    while (nodes.length)
    {
        var node 	: ISceneNode 	= nodes.shift();
        var group 	: Group 	= node as Group;

        _nodeToId[node] = nodeId;
        _idToNode[nodeId] = node;
        _transforms[nodeId] = node.transform;
        _localToWorldTransforms[nodeId] = new Matrix4x4().lock();
        _localToWorldTransformsInitialized[nodeId] = false;

        if (group)
        {
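            var numChildren 	: uint = group.numChildren;
            // children are queued behind every pending node, so the first
            // child gets the next free id in breadth-first order
            var firstChildId 	: uint = nodeId + nodes.length + 1;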

            _numChildren[nodeId] = numChildren;
            _firstChildId[nodeId] = firstChildId;
            for (var childId : uint = 0; childId < numChildren; ++childId)
            {
                _parentId[uint(firstChildId + childId)] = nodeId;
                nodes.push(group.getChildAt(childId));
            }
        }
        else
        {
            _numChildren[nodeId] = 0;
            _firstChildId[nodeId] = 0;
        }

        ++nodeId;
    }

    _worldToLocalTransforms.length = _localToWorldTransforms.length;
    _invalidList = false;
}
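
The same linearization idea can be sketched outside of Minko. The following TypeScript version is only an illustration under simplified assumptions: TreeNode, FlatScene and linearize are hypothetical names, and node names stand in for the real scene nodes and matrices.

```typescript
// Hypothetical tree node used only for this sketch.
interface TreeNode {
    name: string;
    children: TreeNode[];
}

interface FlatScene {
    names: string[];        // node id -> node name (stands in for _idToNode)
    parentId: number[];     // node id -> id of its parent, -1 for the root
    firstChildId: number[]; // node id -> id of its first child (0 if none)
    numChildren: number[];  // node id -> number of children
}

function linearize(root: TreeNode): FlatScene {
    const flat: FlatScene = { names: [], parentId: [-1], firstChildId: [], numChildren: [] };
    const queue: TreeNode[] = [root];

    // Ids are assigned in breadth-first order: the node at queue index k has id k.
    for (let nodeId = 0; nodeId < queue.length; ++nodeId) {
        const node = queue[nodeId];
        const count = node.children.length;

        flat.names[nodeId] = node.name;
        flat.numChildren[nodeId] = count;
        // Children are appended at the end of the queue, so the id of the
        // first child is the current queue length.
        flat.firstChildId[nodeId] = count > 0 ? queue.length : 0;

        for (const child of node.children) {
            flat.parentId[queue.length] = nodeId;
            queue.push(child);
        }
    }
    return flat;
}

// root has children a and b; a has children c and d; b has child e.
const leaf = (name: string): TreeNode => ({ name, children: [] });
const sampleTree: TreeNode = {
    name: "root",
    children: [
        { name: "a", children: [leaf("c"), leaf("d")] },
        { name: "b", children: [leaf("e")] },
    ],
};
const sampleFlat = linearize(sampleTree);
// sampleFlat.names        is ["root", "a", "b", "c", "d", "e"]
// sampleFlat.parentId     is [-1, 0, 0, 1, 1, 2]
// sampleFlat.firstChildId is [1, 3, 5, 0, 0, 0]
// sampleFlat.numChildren  is [2, 2, 1, 0, 0, 0]
```

This sketch walks the queue with an index instead of calling shift(), which avoids moving the remaining elements on every step. The children of a node always occupy a contiguous range of ids, which is what lets a later pass visit them with a plain loop.
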
In order to avoid unnecessary computations, we decided to update the local-to-world matrices at most once per frame. To make sure this update happens just before rendering and that all matrices are actually up to date, we’ve added the Scene.renderingBegin signal. As you might have guessed, this signal is simply executed right before the scene starts the actual rendering operations when Scene.render() is called (so it is executed after Scene.enterFrame, which is the signal that should be used to update the scene). The update itself is the job of the TransformController.updateLocalToWorld() method:
private function updateLocalToWorld(nodeId : uint = 0) : void
{
    var numNodes 		: uint 		= _transforms.length;
    var childrenOffset		: uint		= 1;
    var rootLocalToWorld	: Matrix4x4	= _localToWorldTransforms[nodeId];
    var rootTransform		: Matrix4x4	= _transforms[nodeId];
    var root			: ISceneNode	= _idToNode[nodeId];
    
    if (rootTransform._hasChanged || !_localToWorldTransformsInitialized[nodeId])
    {
        rootLocalToWorld.copyFrom(rootTransform);
        
        if (nodeId != 0)
            rootLocalToWorld.append(_localToWorldTransforms[_parentId[nodeId]]);
        
        rootTransform._hasChanged = false;
        _localToWorldTransformsInitialized[nodeId] = true;
        root.localToWorldTransformChanged.execute(root, rootLocalToWorld);
    }
    
    for (; nodeId < numNodes; ++nodeId)
    {
        var localToWorld 	: Matrix4x4	= _localToWorldTransforms[nodeId];
        var numChildren		: uint		= _numChildren[nodeId];
        var firstChildId	: uint		= _firstChildId[nodeId];
        var lastChildId		: uint		= firstChildId + numChildren;
        var isDirty		: Boolean	= localToWorld._hasChanged;
        
        localToWorld._hasChanged = false;
        
        for (var childId : uint = firstChildId; childId < lastChildId; ++childId)
        {
            var childTransform		: Matrix4x4		= _transforms[childId];
            var childLocalToWorld	: Matrix4x4		= _localToWorldTransforms[childId];
            var childIsDirty		: Boolean		= isDirty || childTransform._hasChanged
                || !_localToWorldTransformsInitialized[childId];
            
            if (childIsDirty)
            {
                var child	: ISceneNode	= _idToNode[childId];
                
                childLocalToWorld
                    .copyFrom(childTransform)
                    .append(localToWorld);
                
                childTransform._hasChanged = false;
                _localToWorldTransformsInitialized[childId] = true;
                child.localToWorldTransformChanged.execute(child, childLocalToWorld);
            }
        }
    }
}
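
To see how the dirty flags travel down the flat list, here is a reduced TypeScript sketch of the same single-loop idea. It is not the Minko implementation: FlatTransforms and its fields are hypothetical, and each transform is reduced to a single translation offset so that composing a child with its parent is a plain addition where a real engine would multiply 4×4 matrices.

```typescript
// Flat scene in breadth-first order (parents always come before children).
interface FlatTransforms {
    local: number[];        // node id -> local offset (stand-in for a 4x4 matrix)
    localDirty: boolean[];  // node id -> local offset changed since the last update
    world: number[];        // node id -> cached local-to-world offset
    firstChildId: number[]; // node id -> id of its first child
    numChildren: number[];  // node id -> number of children
}

// Updates the cached world offsets and returns how many were recomputed.
function updateLocalToWorld(scene: FlatTransforms): number {
    const numNodes = scene.local.length;
    const worldDirty: boolean[] = scene.local.map(() => false);
    let recomputed = 0;

    // The root has no parent: its world offset is its local offset.
    if (scene.localDirty[0]) {
        scene.world[0] = scene.local[0];
        scene.localDirty[0] = false;
        worldDirty[0] = true;
        ++recomputed;
    }

    for (let nodeId = 0; nodeId < numNodes; ++nodeId) {
        const first = scene.firstChildId[nodeId];
        const last = first + scene.numChildren[nodeId];

        for (let childId = first; childId < last; ++childId) {
            // A child is recomputed if its parent moved or if it moved itself.
            if (worldDirty[nodeId] || scene.localDirty[childId]) {
                scene.world[childId] = scene.local[childId] + scene.world[nodeId];
                scene.localDirty[childId] = false;
                worldDirty[childId] = true;
                ++recomputed;
            }
        }
    }
    return recomputed;
}

// Six nodes: 0 is the root, 1 and 2 are its children,
// 3 and 4 are children of 1, and 5 is the child of 2.
const flatScene: FlatTransforms = {
    local: [10, 1, 2, 3, 4, 5],
    localDirty: [true, true, true, true, true, true],
    world: [0, 0, 0, 0, 0, 0],
    firstChildId: [1, 3, 5, 0, 0, 0],
    numChildren: [2, 2, 1, 0, 0, 0],
};

const firstPass = updateLocalToWorld(flatScene);  // 6: every node is computed once
flatScene.local[2] = 20;                          // move node 2 only
flatScene.localDirty[2] = true;
const secondPass = updateLocalToWorld(flatScene); // 2: node 2 and its child 5
const thirdPass = updateLocalToWorld(flatScene);  // 0: nothing changed
```

Because parents precede their children in the list, a single forward pass is enough, and each node is recomputed at most once per update no matter how many of its ancestors changed.
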
You can read the changelog and more details about the list of API changes on Aerys Answers.

Future Improvements

We could make the TransformController.updateLocalToWorld() method a bit faster by skipping identity matrices or by special-casing matrices that contain nothing but a translation. Memory-wise, we could also drop the _localToWorldTransformsInitialized vector altogether and check whether _localToWorldTransforms[nodeId] is null instead.
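
As a rough illustration of the identity shortcut, here is a TypeScript sketch. It says nothing about how Minko's Matrix4x4 stores or multiplies its data: Mat4, isIdentity, multiply and appendFast are hypothetical helpers, and the row-major array layout is an assumption made for this example only.

```typescript
type Mat4 = number[]; // 16 numbers, row-major (assumed layout for this sketch)

const IDENTITY: Mat4 = [
    1, 0, 0, 0,
    0, 1, 0, 0,
    0, 0, 1, 0,
    0, 0, 0, 1,
];

function isIdentity(m: Mat4): boolean {
    for (let i = 0; i < 16; ++i) {
        if (m[i] !== IDENTITY[i]) return false;
    }
    return true;
}

// Generic 4x4 product a * b: 64 multiplications.
function multiply(a: Mat4, b: Mat4): Mat4 {
    const out: Mat4 = [];
    for (let row = 0; row < 4; ++row) {
        for (let col = 0; col < 4; ++col) {
            let sum = 0;
            for (let k = 0; k < 4; ++k) {
                sum += a[row * 4 + k] * b[k * 4 + col];
            }
            out[row * 4 + col] = sum;
        }
    }
    return out;
}

// Same result as multiply(local, parentWorld), but the full product is
// skipped when either operand is the identity.
function appendFast(local: Mat4, parentWorld: Mat4): Mat4 {
    if (isIdentity(local)) return parentWorld.slice();
    if (isIdentity(parentWorld)) return local.slice();
    return multiply(local, parentWorld);
}
```

The check itself costs up to 16 comparisons per matrix, so whether it pays off depends on how many nodes in a scene actually keep an identity transform. A cached flag updated whenever the matrix changes would avoid the repeated comparisons.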
