Sublayers stack modification and prim cache, I'm missing something obvious

Hi,

I am trying to figure out a proper way to manage the following situation. We have a sublayer stack in a persistent stage that gets rendered with UsdImagingGLEngine using the Vulkan backend (USD version 24.03). Between frames the stack composition can change: sublayers might get removed, added, and so on.

What I'm seeing is that if, for example, a new layer is added to the stack, or an existing layer is removed and a new one added in its place, all existing prims are removed from the cache and then added back during repopulation. Among other things this causes shader recompilation, because material prims are also removed. Here's a log snippet of USDIMAGING_CHANGES messages from a very minimal stage used for reproduction:

[Repopulate] Populating </> on stage rootLayer
[Repopulate] Root path: </>
[Add HdPrim Info] </mysphere> adapter=UsdImagingSphereAdapter
[Add dependency] </mysphere> -> </mysphere>
[Add HdPrim Info] </Camera/Camera> adapter=UsdImagingCameraAdapter
[Add dependency] </Camera/Camera> -> </Camera/Camera>
[Repopulate] 2 variability tasks in worker
---- RENDERED FRAME 1 ------

[Objects Changed] Notice recieved from stage with root layer @anon:000002957CCAE990:rootLayer@
 - Resync queued: /
[usdPathsToResync] Updating cache map for /
[Resync Prim]: </>
[_GatherDependencies] Found entry in flattened cache for / with 2 paths
  - affected child prim: </Camera/Camera>
  - affected child prim: </mysphere>
[Remove Rprim] </mysphere>
[Remove Sprim] </Camera/Camera>
[Remove PrimInfo] </Camera/Camera>
[Remove PrimInfo] </mysphere>
[Remove dependency] </Camera/Camera> -> </Camera/Camera>
[Remove dependency] </mysphere> -> </mysphere>
[Repopulate] Populating </> on stage rootLayer
[Repopulate] Root path: </Camera/Camera>
[Repopulate] Root path: </mysphere>
[Add HdPrim Info] </Camera/Camera> adapter=UsdImagingCameraAdapter
[Add dependency] </Camera/Camera> -> </Camera/Camera>
[Add HdPrim Info] </mysphere> adapter=UsdImagingSphereAdapter
[Add dependency] </mysphere> -> </mysphere>
[Repopulate] 2 variability tasks in worker
---- RENDERED FRAME 2 ------

Before rendering frame 2 a new layer was added to the stack that added an override for the XForm of /mysphere.

Minimal code that can reproduce it (actual renderer setup and invocation details are omitted):

#include <pxr/usd/sdf/layer.h>
#include <pxr/usd/usd/stage.h>
#include <pxr/usd/usdGeom/camera.h>
#include <pxr/usd/usdGeom/sphere.h>
#include <pxr/usd/usdGeom/tokens.h>
#include <pxr/usd/usdGeom/xformCommonAPI.h>
#include <pxr/usd/usdGeom/xformable.h>

auto rootLayer = pxr::SdfLayer::CreateAnonymous("rootLayer");
auto rootStage = pxr::UsdStage::Open(rootLayer);

// Sphere
auto sphereLayer = pxr::SdfLayer::CreateAnonymous("SphereLayer");
auto sphereStage = pxr::UsdStage::Open(sphereLayer);
auto sphere = pxr::UsdGeomSphere::Define(sphereStage, pxr::SdfPath{"/mysphere"});

// Camera and its XForm
auto cameraLayer = pxr::SdfLayer::CreateAnonymous("CameraLayer");
auto cameraStage = pxr::UsdStage::Open(cameraLayer);
auto cameraXFormPrim = cameraStage->DefinePrim(pxr::SdfPath{"/Camera"}, pxr::UsdGeomTokens->Xform);
auto camera = pxr::UsdGeomCamera::Define(cameraStage, pxr::SdfPath{"/Camera/Camera"});
auto camXform = pxr::UsdGeomXformable{cameraXFormPrim};
pxr::UsdGeomXformCommonAPI(camXform).SetScale(pxr::GfVec3f{1.0f});
pxr::UsdGeomXformCommonAPI(camXform).SetRotate(pxr::GfVec3f{0.0f});
pxr::UsdGeomXformCommonAPI(camXform).SetTranslate(pxr::GfVec3f{0.0f, 0.0f, 10.0f});

// Sphere XForm override layer
auto transformLayer = pxr::SdfLayer::CreateAnonymous("TransformLayer");
auto transformStage = pxr::UsdStage::Open(transformLayer);
auto opNameToken = pxr::TfToken{"MyTransform"};
auto xFormPrim = transformStage->OverridePrim(pxr::SdfPath("/mysphere"));
auto refXform = pxr::UsdGeomXformable{xFormPrim};
pxr::UsdGeomXformOp scaleXform = refXform.AddXformOp(pxr::UsdGeomXformOp::TypeScale, pxr::UsdGeomXformOp::PrecisionFloat, opNameToken);
scaleXform.Set(pxr::GfVec3f{0.2f}, pxr::UsdTimeCode::Default());

rootLayer->InsertSubLayerPath(sphereLayer->GetIdentifier());
rootLayer->InsertSubLayerPath(cameraLayer->GetIdentifier());

// Render the stage
renderEngine->render(rootStage);

// Another layer added to sublayers stack
rootLayer->InsertSubLayerPath(transformLayer->GetIdentifier());

// Render the stage again, all prims are removed and then repopulated
renderEngine->render(rootStage);

It seems I'm missing something very obvious, because adding a layer to the subLayers stack probably shouldn't cause a total prim cache repopulation? Or, if it is by design, what would be a better approach in this kind of situation, where we want to modify the subLayers stack of a stage without struggling with the slowness it causes? Most problematic for us is the shader recompilation that happens because material prims get recreated. Strangest of all, even prims that have no relation whatsoever to the changes the new layer introduces are removed as well.

The only workaround we have found so far is to keep a set of persistent layers in the subLayers stack and transfer the content of each incoming layer into them (using either TransferContent or SdfCopySpec), but this is suboptimal due to the constant copying and gets messy when the layer count changes.
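For reference, a minimal sketch of that workaround (persistentLayer and incomingLayer are hypothetical names; the persistent layer is created once and stays in the stack):

// One-time setup: keep a persistent layer in the sublayer stack.
auto persistentLayer = pxr::SdfLayer::CreateAnonymous("PersistentLayer");
rootLayer->InsertSubLayerPath(persistentLayer->GetIdentifier());

// Per update: instead of inserting incomingLayer into the stack, copy its
// content into the persistent layer. The stack itself does not change, so
// the update arrives as fine-grained field edits instead of a full resync.
persistentLayer->TransferContent(incomingLayer);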

Hi @HendrikProosa, I wouldn't call it "by design" so much as a recognized limitation arising from competing design decisions that were higher priority and difficult to reconcile; it turns out to be a complicated set of design concerns to resolve, and adding and removing layers "live" in layerStacks is something very few artists in our pipeline need to do.

There is good news on the horizon, though. We've finally been able to prioritize coming back to the performance problem you note (which also affects layer muting; if you're not familiar with muting, it may address some of the workflow problems you're tackling). Our next software release (tentatively 24.11) should have some big improvements for these kinds of operations.


Thank you for the clarification @spiff! I hope future releases can help in this regard!

My understanding of the intricacies of USD internals is pretty limited. What's the main logical/technical difference between, let's say, swapping out a layer that describes the same over with changed parameters (for example transform changes), versus transferring the content of the new layer into a layer that is already in the subLayers stack? In the former case all prims are recreated; in the latter only a change for the affected prim is applied.

I'll also outline our use case a bit, so maybe someone can chime in with how they have approached this problem. We use USD in our proprietary toolset to implement some 3D-related functionality. From the artist's perspective, our tools are composed using operator graphs in a similar manner to Nuke and Houdini. Layers are used for capturing the modifications each graph operator applies to its incoming scene. These layers are finally linked as sublayers to the root layer of a stage that then gets rendered. Technically it all seems to work fine, but since we also have material operators which create material networks, recreation of the material prims causes very long shader recompilation times (several seconds), which is a few orders of magnitude longer than the stage update and the actual rendering take.

USD's "change processing" was designed to respond to fine-grained edits to existing layers (e.g. changing a single property value, or adding a piece of metadata), potentially batching those small changes together for the composition engine to process and propagate to affected prims/properties/etc.
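For illustration, a minimal sketch of the kind of fine-grained, batchable edit I mean, reusing the stage from your repro (the edited values are arbitrary):

{
    // SdfChangeBlock defers notification so the edits below reach change
    // processing as a single batch when the block closes.
    pxr::SdfChangeBlock batch;
    auto spherePrim = rootStage->GetPrimAtPath(pxr::SdfPath{"/mysphere"});
    pxr::UsdGeomSphere{spherePrim}.GetRadiusAttr().Set(2.0); // property value edit
    spherePrim.SetDocumentation("edited in place");          // metadata edit
}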

TransferContent is basically very simple: it iterates over every field of one layer, creating an edit in the destination layer to copy that field in. This leverages the change-processing algorithm well. However, if your "source" layer has even a single subLayer in it, you will get the same "slow path" that you are seeing when adding a layer in yourself (see the sketch after the list below). Adding/removing (non-empty) layers has two challenges that up until now we have not tackled:

  1. You’re no longer dealing with a single layer, potentially, because you need to process all of the nested sublayers of that layer to find all the fine-grained changes to process.
  2. The processing is strongly tied to mutations of the layer. When you are adding in a layer and want to know how it will affect the scene, none of the specs contributing changes are being mutated; they're just coming into being by virtue of a new layer being considered. Significant refactoring was needed to generalize the algorithm to handle both cases… plus some extra subtleties, because we do not want to stream the entire contents of multi-GB crate files into memory just to avoid recomposition costs, so we had to generalize things yet further. As a result, the granularity of change notification to properties from adding/removing/muting/unmuting a layer will be somewhat coarser than when TransferContent'ing (potentially resulting in over-invalidation of properties), but that should not affect your primary use case.
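As a rough (hypothetical) illustration of that TransferContent caveat, with incomingLayer being the layer you would otherwise insert and persistentLayer already sitting in the stack:

// TransferContent stays on the fast, fine-grained path only if the source
// layer has no nested sublayers of its own; otherwise its subLayerPaths get
// copied into the destination too, the layer stack changes, and you are back
// on the same slow resync path as inserting the layer directly.
if (incomingLayer->GetNumSubLayerPaths() == 0) {
    persistentLayer->TransferContent(incomingLayer);
}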

I am also curious what renderer you are using, because hdStorm populates a cache of compiled GLSL shader programs, and I would hope that in a case like yours, where the Materials actually haven't changed, we would just find the needed shaders already compiled.

Thanks again for the explanations!

We are using hdStorm with the Vulkan backend. What I'm seeing is that although the compiled shader program is initially in the resource cache, the material gets removed and then recreated, so the shader program is garbage collected from the resourceRegistry, and the next lookup for it fails even though the hash of the material's shader program is exactly the same.

The above is what made me suspect there might be something I'm missing. Is there a way to group sublayer editing commands so that this kind of behavior is prevented? Meaning, could I somehow group the edits I make to prevent prim and/or material removals due to "premature" garbage collection and other mechanics? I tried introducing an SdfChangeBlock, but it didn't seem to make a difference.
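For completeness, the attempt looked roughly like this, wrapping the stack edit from the repro above in a change block (prims are still removed and repopulated on the next render):

{
    pxr::SdfChangeBlock batch;
    rootLayer->InsertSubLayerPath(transformLayer->GetIdentifier());
}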