I am using pxr.UsdUtils.ComputeAllDependencies in a tool I built for submitting USD renders to the GridMarkets platform from Houdini LOPs. As part of the process, I need to preflight the files referenced in the USD stage.
Usually this goes pretty quickly, but I have a job where I had to write an individual USD file for each frame, because the large VDB files being referenced were crashing Houdini during the write. After I finished writing those, I used the USD Stitch ROP to stitch all of the USD files on disk into a single time-coded file that references all of the VDB files written for the render. Unfortunately, when I now call ComputeAllDependencies on the stitched stage, it takes on the order of tens of minutes to finish.
I loaded up the Houdini command-line tools and ran the function manually to be sure. Memory kept increasing and processing was clearly still going on, so I left it running. Eventually it came back with the correct file list, but it took WAY too long to be reasonable for a production tool.
I am pretty sure this is an edge case and that I am doing something heavy-handed in my setup that could be handled much more efficiently. But users/artists will always end up finding an edge case, and I need to be sure I can deal with it, or at the very least advise affected users on how to set things up so the stage can be processed.
I can share the USD file or attempt to create a smaller case which has the same issue if needed.
I have just been using regular old USD. And no, the stitched USD file is only about 4GB, not all that large, but the referenced VDB files get up to around 300MB/frame in some parts. I assume that is why Houdini ate 64GB of RAM and segfaulted when I tried to write out without the “Flush Memory Between Frames” option on the USD ROP node.
Would it be possible for you to export just the clip layers from Houdini, and then call usdstitchclips from the command-line, so we can better tell whether the problem is coming from the core USD code, or Houdini?
@spiff Apologies for my absence. I have been dealing with a BUNCH of other stuff and I am now circling back to this as it has reared its ugly head again.
This time I am attempting to refine my pipeline and alleviate this problem. I have managed to get the files written out without having to use the USD Stitch ROP in Houdini. I exported the caches to VDB natively, then wrote out a USD file to load them, then loaded that as a layer in LOPs to run everything. Even with all of that, I still end up with the VDBs being attached into the output USD instead of being referenced. Not sure that is a USD problem so much as a Houdini problem. Still figuring out how to set all of that up correctly.
Anyway, all of the VDBs are being referenced into the output USD file. This is the file being read by pxr.UsdUtils.ComputeAllDependencies, and it is still taking a long time to process, even though it contains only 240 unique references, each referenced in 4 places. I dumped a .usda to check, and there are 960 references to .vdb files, i.e. 4*240. I am attaching that .usda here for investigation.
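For anyone wanting to double-check a count like this on their own dump, here is a rough sketch of how the total and unique .vdb references in a .usda could be tallied with plain Python. The function name, the regex, and the sample layer text are all made up for illustration; it assumes asset paths are written in the usual single-`@` form and will miss `@@@`-delimited paths.

```python
import re
from collections import Counter

# Matches asset paths ending in .vdb written as @path@ in a .usda text dump.
# Assumption: simple single-@ delimiters, one asset path per match.
VDB_ASSET_RE = re.compile(r"@([^@\n]+\.vdb)@")


def count_vdb_references(usda_text):
    """Return (total_references, unique_files) for .vdb asset paths."""
    counts = Counter(VDB_ASSET_RE.findall(usda_text))
    return sum(counts.values()), len(counts)


# Hypothetical layer snippet: two fields, each pointing at the same two frames.
sample = '''
def Volume "smoke" {
    def OpenVDBAsset "density" {
        asset filePath.timeSamples = {
            1: @./vdb/smoke.0001.vdb@,
            2: @./vdb/smoke.0002.vdb@,
        }
    }
    def OpenVDBAsset "temperature" {
        asset filePath.timeSamples = {
            1: @./vdb/smoke.0001.vdb@,
            2: @./vdb/smoke.0002.vdb@,
        }
    }
}
'''

total, unique = count_vdb_references(sample)
print(total, unique)
```

On a real dump you would pass in the contents of the .usda file instead of `sample`.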
Sorry for the gap here, @Adalast … can you explain what you mean by “being attached into the output USD instead of being referenced”?
ComputeAllDependencies should not be trying to open any of the VDB files, just basically statting them, so I am definitely puzzled by the behavior you’re seeing. Using an ArResolverScopedCache around the call to ComputeAllDependencies would ensure we only stat each file once, but still, something seems off! Our ability to repro may be limited without the actual VDB files, and sharing those doesn’t seem practical, but we’ll at least have someone try with the file you provided.
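In Python, wrapping the call in a scoped resolver cache might look something like the sketch below. The helper name and the example path are hypothetical, and this is untested against the stage in question; it only shows the shape of the call, not a confirmed fix for the slowdown.

```python
def compute_dependencies_cached(usd_path):
    """Run ComputeAllDependencies inside an Ar resolver scoped cache.

    Hypothetical helper: usd_path is the path to the root layer on disk.
    """
    # Imported lazily so this module can still be loaded by an interpreter
    # that does not have the USD Python bindings available.
    from pxr import Ar, Sdf, UsdUtils

    # While the scoped cache is active, repeated resolves of the same
    # asset path are answered from the cache instead of hitting disk again.
    with Ar.ResolverScopedCache():
        layers, assets, unresolved = UsdUtils.ComputeAllDependencies(
            Sdf.AssetPath(usd_path)
        )
    return layers, assets, unresolved


# Example usage inside hython or another USD-enabled interpreter
# (the path below is a placeholder):
# layers, assets, unresolved = compute_dependencies_cached("/path/to/stage.usda")
# print(len(layers), len(assets), len(unresolved))
```

Comparing the wall-clock time of this against the uncached call would at least show whether repeated resolves are a meaningful part of the cost.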