Anyone seeing issues with USD on M1 Macs?

Hi everyone, we’re seeing an intermittent crash on M1 Mac builds of USD, and I’m wondering if anyone else is seeing anything similar. We have a bit of Python code that takes a USD file path and runs:

        s = Usd.Stage.CreateInMemory()
        prim = s.DefinePrim('/asset')
        prim.GetReferences().AddReference(asset_file_path)

During the AddReference call, there is (rarely, but sometimes) a crash. Attached at the bottom of the message is the stack trace of the crashing (worker) thread and the main thread at the time of the crash, starting at the AddReference call.

This is USD 23.08 with a few modifications (the dev_houdini20.0beta branch of sideeffects/USD on GitHub, to be precise), built as an arm64 binary on macOS 13 with clang 14.0 (at least that’s what the build machine is telling me).

Are we alone in this, or has anyone else seen similar issues?

Thanks,
Mark


-- TRACEBACK BEGIN --

2   libHoudiniUT.dylib                  0x00000001098ec56c stackTrace(UTsignalHandlerArg) + 272

3   libHoudiniUT.dylib                  0x00000001098ec100 signalCallback(UTsignalHandlerArg) + 320

4   libHoudiniUT.dylib                  0x0000000109c797c4 UT_Signal::processSignal(int, __siginfo*, void*) + 112

5   libsystem_platform.dylib            0x00000001a2f5ea84 _sigtramp + 56

6   libpxr_pcp.dylib                    0x000000010e1e3dd8 pxrInternal_v0_23__pxrReserved__::PcpPrimIndex_Graph::New(pxrInternal_v0_23__pxrReserved__::TfRefPtr<pxrInternal_v0_23__pxrReserved__::PcpPrimIndex_Graph> const&) + 176

7   libpxr_pcp.dylib                    0x000000010e25e880 pxrInternal_v0_23__pxrReserved__::Pcp_BuildPrimIndex(pxrInternal_v0_23__pxrReserved__::PcpLayerStackSite const&, pxrInternal_v0_23__pxrReserved__::PcpLayerStackSite const&, int, bool, bool, bool, pxrInternal_v0_23__pxrReserved__::PcpPrimIndex_StackFrame*, pxrInternal_v0_23__pxrReserved__::PcpPrimIndexInputs const&, pxrInternal_v0_23__pxrReserved__::PcpPrimIndexOutputs*) + 1280

8   libpxr_pcp.dylib                    0x000000010e25d754 pxrInternal_v0_23__pxrReserved__::PcpComputePrimIndex(pxrInternal_v0_23__pxrReserved__::SdfPath const&, pxrInternal_v0_23__pxrReserved__::TfWeakPtr<pxrInternal_v0_23__pxrReserved__::PcpLayerStack> const&, pxrInternal_v0_23__pxrReserved__::PcpPrimIndexInputs const&, pxrInternal_v0_23__pxrReserved__::PcpPrimIndexOutputs*, pxrInternal_v0_23__pxrReserved__::ArResolver*) + 572

9   libpxr_pcp.dylib                    0x000000010e1fc788 pxrInternal_v0_23__pxrReserved__::PcpCache::_ParallelIndexer::_ComputeIndex(pxrInternal_v0_23__pxrReserved__::PcpPrimIndex const*, pxrInternal_v0_23__pxrReserved__::SdfPath, bool) + 480

10  libpxr_pcp.dylib                    0x000000010e1fefa4 pxrInternal_v0_23__pxrReserved__::PcpCache::_ParallelIndexer::_ComputeIndex(pxrInternal_v0_23__pxrReserved__::PcpPrimIndex const*, pxrInternal_v0_23__pxrReserved__::SdfPath, bool)::'lambda'()::operator()() const + 60

11  libpxr_pcp.dylib                    0x000000010e1feeec pxrInternal_v0_23__pxrReserved__::WorkDispatcher::_InvokerTask<pxrInternal_v0_23__pxrReserved__::PcpCache::_ParallelIndexer::_ComputeIndex(pxrInternal_v0_23__pxrReserved__::PcpPrimIndex const*, pxrInternal_v0_23__pxrReserved__::SdfPath, bool)::'lambda'()>::execute() + 36

12  libtbb.dylib                        0x00000001025f46b0 tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::process_bypass_loop(tbb::internal::context_guard_helper<false>&, tbb::task*, long) + 440

13  libtbb.dylib                        0x00000001025f3d4c tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) + 188

14  libtbb.dylib                        0x00000001025eccf8 tbb::internal::arena::process(tbb::internal::generic_scheduler&) + 252

15  libtbb.dylib                        0x00000001025ec3d4 tbb::internal::market::process(rml::job&) + 40

16  libtbb.dylib                        0x00000001025e6564 tbb::internal::rml::private_worker::run() + 284

17  libtbb.dylib                        0x00000001025e643c tbb::internal::rml::private_worker::thread_routine(void*) + 12

18  libsystem_pthread.dylib             0x00000001a2f2ffa8 _pthread_start + 148

19  libsystem_pthread.dylib             0x00000001a2f2ada0 thread_start + 8

-- TRACEBACK END --

72011: Fatal error: Segmentation fault (sent by pid 0)

-- TRACEBACK BEGIN --

2   libHoudiniUT.dylib                  0x00000001098ec56c stackTrace(UTsignalHandlerArg) + 272

3   libHoudiniUT.dylib                  0x00000001098ec1dc signalCallback(UTsignalHandlerArg) + 540

4   libHoudiniUT.dylib                  0x0000000109c797c4 UT_Signal::processSignal(int, __siginfo*, void*) + 112

5   libsystem_platform.dylib            0x00000001a2f5ea84 _sigtramp + 56

6   libpxr_pcp.dylib                    0x000000010e1fbffc pxrInternal_v0_23__pxrReserved__::PcpPrimIndex::operator=(pxrInternal_v0_23__pxrReserved__::PcpPrimIndex&&) + 84

7   libpxr_pcp.dylib                    0x000000010e1fdd7c pxrInternal_v0_23__pxrReserved__::PcpCache::_ParallelIndexer::_PublishOneOutput(std::__1::pair<pxrInternal_v0_23__pxrReserved__::SdfPathTable<pxrInternal_v0_23__pxrReserved__::PcpPrimIndex>::NodeHandle, pxrInternal_v0_23__pxrReserved__::PcpPrimIndexOutputs>&&, bool) + 160

8   libpxr_pcp.dylib                    0x000000010e1fd700 pxrInternal_v0_23__pxrReserved__::PcpCache::_ParallelIndexer::_PublishOutputs() + 132

9   libpxr_pcp.dylib                    0x000000010e1fcc00 pxrInternal_v0_23__pxrReserved__::PcpCache::_ParallelIndexer::_ComputeIndex(pxrInternal_v0_23__pxrReserved__::PcpPrimIndex const*, pxrInternal_v0_23__pxrReserved__::SdfPath, bool) + 1624

10  libpxr_pcp.dylib                    0x000000010e1fefa4 pxrInternal_v0_23__pxrReserved__::PcpCache::_ParallelIndexer::_ComputeIndex(pxrInternal_v0_23__pxrReserved__::PcpPrimIndex const*, pxrInternal_v0_23__pxrReserved__::SdfPath, bool)::'lambda'()::operator()() const + 60

11  libpxr_pcp.dylib                    0x000000010e1feeec pxrInternal_v0_23__pxrReserved__::WorkDispatcher::_InvokerTask<pxrInternal_v0_23__pxrReserved__::PcpCache::_ParallelIndexer::_ComputeIndex(pxrInternal_v0_23__pxrReserved__::PcpPrimIndex const*, pxrInternal_v0_23__pxrReserved__::SdfPath, bool)::'lambda'()>::execute() + 36

12  libtbb.dylib                        0x00000001025f46b0 tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::process_bypass_loop(tbb::internal::context_guard_helper<false>&, tbb::task*, long) + 440

13  libtbb.dylib                        0x00000001025f3d4c tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) + 188

14  libpxr_work.dylib                   0x00000001025b2678 pxrInternal_v0_23__pxrReserved__::WorkDispatcher::Wait() + 60

15  libpxr_pcp.dylib                    0x000000010e1fc518 pxrInternal_v0_23__pxrReserved__::PcpCache::_ParallelIndexer::RunAndWait()::'lambda'()::operator()() const + 240

16  libtbb.dylib                        0x00000001025ef038 tbb::interface7::internal::isolate_within_arena(tbb::interface7::internal::delegate_base&, long) + 92

17  libpxr_pcp.dylib                    0x000000010e1fa194 pxrInternal_v0_23__pxrReserved__::PcpCache::_ParallelIndexer::RunAndWait() + 72

18  libpxr_pcp.dylib                    0x000000010e1f9540 pxrInternal_v0_23__pxrReserved__::PcpCache::_ComputePrimIndexesInParallel(std::__1::vector<pxrInternal_v0_23__pxrReserved__::SdfPath, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::SdfPath>> const&, std::__1::vector<std::__1::shared_ptr<pxrInternal_v0_23__pxrReserved__::PcpErrorBase>, std::__1::allocator<std::__1::shared_ptr<pxrInternal_v0_23__pxrReserved__::PcpErrorBase>>>*, pxrInternal_v0_23__pxrReserved__::PcpCache::_UntypedIndexingChildrenPredicate, pxrInternal_v0_23__pxrReserved__::PcpCache::_UntypedIndexingPayloadPredicate, char const*, char const*) + 1384

19  libpxr_usd.dylib                    0x000000011b3d8424 pxrInternal_v0_23__pxrReserved__::UsdStage::_ComposePrimIndexesInParallel(std::__1::vector<pxrInternal_v0_23__pxrReserved__::SdfPath, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::SdfPath>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, pxrInternal_v0_23__pxrReserved__::Usd_InstanceChanges*) + 692

20  libpxr_usd.dylib                    0x000000011b42d3c8 void pxrInternal_v0_23__pxrReserved__::UsdStage::_RecomposePrims<std::__1::map<pxrInternal_v0_23__pxrReserved__::SdfPath, std::__1::vector<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*>>, std::__1::less<pxrInternal_v0_23__pxrReserved__::SdfPath>, std::__1::allocator<std::__1::pair<pxrInternal_v0_23__pxrReserved__::SdfPath const, std::__1::vector<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*>>>>>>(std::__1::map<pxrInternal_v0_23__pxrReserved__::SdfPath, std::__1::vector<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*>>, std::__1::less<pxrInternal_v0_23__pxrReserved__::SdfPath>, std::__1::allocator<std::__1::pair<pxrInternal_v0_23__pxrReserved__::SdfPath const, std::__1::vector<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*>>>>>*) + 584

21  libpxr_usd.dylib                    0x000000011b3f6db0 void pxrInternal_v0_23__pxrReserved__::UsdStage::_Recompose<std::__1::map<pxrInternal_v0_23__pxrReserved__::SdfPath, std::__1::vector<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*>>, std::__1::less<pxrInternal_v0_23__pxrReserved__::SdfPath>, std::__1::allocator<std::__1::pair<pxrInternal_v0_23__pxrReserved__::SdfPath const, std::__1::vector<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*>>>>>>(pxrInternal_v0_23__pxrReserved__::PcpChanges const&, std::__1::map<pxrInternal_v0_23__pxrReserved__::SdfPath, std::__1::vector<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*>>, std::__1::less<pxrInternal_v0_23__pxrReserved__::SdfPath>, std::__1::allocator<std::__1::pair<pxrInternal_v0_23__pxrReserved__::SdfPath const, std::__1::vector<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::SdfChangeList::Entry const*>>>>>*) + 1000

22  libpxr_usd.dylib                    0x000000011b3f1ab0 pxrInternal_v0_23__pxrReserved__::UsdStage::_ProcessPendingChanges() + 172

23  libpxr_usd.dylib                    0x000000011b3f8298 pxrInternal_v0_23__pxrReserved__::UsdStage::_HandleLayersDidChange(pxrInternal_v0_23__pxrReserved__::SdfNotice::LayersDidChangeSentPerLayer const&) + 3544

24  libpxr_usd.dylib                    0x000000011b42f904 pxrInternal_v0_23__pxrReserved__::TfNotice::_StandardDeliverer<pxrInternal_v0_23__pxrReserved__::TfNotice::_Deliverer<pxrInternal_v0_23__pxrReserved__::TfWeakPtr<pxrInternal_v0_23__pxrReserved__::UsdStage>, pxrInternal_v0_23__pxrReserved__::TfWeakPtr<pxrInternal_v0_23__pxrReserved__::SdfLayer>, void (pxrInternal_v0_23__pxrReserved__::UsdStage::*)(pxrInternal_v0_23__pxrReserved__::SdfNotice::LayersDidChangeSentPerLayer const&), pxrInternal_v0_23__pxrReserved__::SdfNotice::LayersDidChangeSentPerLayer>>::_SendToListener(pxrInternal_v0_23__pxrReserved__::TfNotice const&, pxrInternal_v0_23__pxrReserved__::TfType const&, pxrInternal_v0_23__pxrReserved__::TfWeakBase const*, void const*, std::type_info const&, std::__1::vector<pxrInternal_v0_23__pxrReserved__::TfWeakPtr<pxrInternal_v0_23__pxrReserved__::TfNotice::Probe>, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::TfWeakPtr<pxrInternal_v0_23__pxrReserved__::TfNotice::Probe>>> const&) + 120

25  libpxr_tf.dylib                     0x000000010e36b648 pxrInternal_v0_23__pxrReserved__::Tf_NoticeRegistry::_Deliver(pxrInternal_v0_23__pxrReserved__::TfNotice const&, pxrInternal_v0_23__pxrReserved__::TfType const&, pxrInternal_v0_23__pxrReserved__::TfWeakBase const*, void const*, std::type_info const&, std::__1::vector<pxrInternal_v0_23__pxrReserved__::TfWeakPtr<pxrInternal_v0_23__pxrReserved__::TfNotice::Probe>, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::TfWeakPtr<pxrInternal_v0_23__pxrReserved__::TfNotice::Probe>>> const&, std::__1::pair<std::__1::list<pxrInternal_v0_23__pxrReserved__::TfNotice::_DelivererBase*, std::__1::allocator<pxrInternal_v0_23__pxrReserved__::TfNotice::_DelivererBase*>>*, std::__1::__list_iterator<pxrInternal_v0_23__pxrReserved__::TfNotice::_DelivererBase*, void*>> const&) + 192

26  libpxr_tf.dylib                     0x000000010e36aef4 pxrInternal_v0_23__pxrReserved__::Tf_NoticeRegistry::_Send(pxrInternal_v0_23__pxrReserved__::TfNotice const&, pxrInternal_v0_23__pxrReserved__::TfType const&, pxrInternal_v0_23__pxrReserved__::TfWeakBase const*, void const*, std::type_info const&) + 1228

27  libpxr_tf.dylib                     0x000000010e397e44 pxrInternal_v0_23__pxrReserved__::TfNotice::_Send(pxrInternal_v0_23__pxrReserved__::TfWeakBase const*, void const*, std::type_info const&) const + 112

28  libpxr_sdf.dylib                    0x0000000113b4a058 pxrInternal_v0_23__pxrReserved__::Sdf_ChangeManager::_SendNotices(pxrInternal_v0_23__pxrReserved__::Sdf_ChangeManager::_Data*) + 916

29  libpxr_sdf.dylib                    0x0000000113b49a4c pxrInternal_v0_23__pxrReserved__::Sdf_ChangeManager::_CloseChangeBlock(pxrInternal_v0_23__pxrReserved__::SdfChangeBlock const*, void const*) + 52

30  libpxr_usd.dylib                    0x000000011b39bce8 pxrInternal_v0_23__pxrReserved__::Usd_ListEditImpl<pxrInternal_v0_23__pxrReserved__::UsdReferences, pxrInternal_v0_23__pxrReserved__::SdfListEditorProxy<pxrInternal_v0_23__pxrReserved__::SdfReferenceTypePolicy>>::Add(pxrInternal_v0_23__pxrReserved__::UsdReferences const&, pxrInternal_v0_23__pxrReserved__::SdfReference const&, pxrInternal_v0_23__pxrReserved__::UsdListPosition) + 460

31  libpxr_usd.dylib                    0x000000011b39c284 pxrInternal_v0_23__pxrReserved__::UsdReferences::AddReference(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, pxrInternal_v0_23__pxrReserved__::SdfLayerOffset const&, pxrInternal_v0_23__pxrReserved__::UsdListPosition) + 168

Maybe someone who’s experienced the same can chime in, but we’ve been running USD on Arm in production for a few years without issues and without patches, including (recently) 23.8. Maybe that can serve as a data point at least.

Edit: it does show up when running the USD code for many iterations inside the same session.

I just tried this

import platform, sys
print(sys.version_info)
print(platform.processor())

from pxr import Usd
print(Usd.GetVersion())
s = Usd.Stage.CreateInMemory()
prim = s.DefinePrim('/asset')

prim.GetReferences().AddReference("/Users/dhruvgovil/Downloads/gramophone.usdz")

print("Works")

and got

sys.version_info(major=3, minor=10, micro=8, releaselevel='final', serial=0)
arm
(0, 23, 8)
Works

I tested on both macOS 13 and 14, compiled with the clang from both Xcode 14 and Xcode 15.

Okay, doing some more sleuthing here (hooray for Rez builds), with 40k iterations each:

  1. I tried going back to USD 22.8 and stepping forward, and can’t reproduce the crash for USD 22.8 through 23.2
  2. I tried stepping back to 23.5 and get the warning below, but I don’t appear to get a segfault
Warning: in _ReportErrors at line 2885 of /Users/dhruvgovil/Projects/usd/pxr/usd/usd/stage.cpp -- In </asset>: Could not open asset @/Users/dhruvgovil/Downloads/gramophone.usdz@ for reference introduced by @anon:0x109e04800:tmp.usda@</asset> -- Corrupt asset @/Users/dhruvgovil/Downloads/gramophone.usdz[gramophone.usdc]@ - ignoring invalid specs: spec </> repeated, spec at index 2 has empty path, spec at index 3 has empty path, spec at index 4

... (repeated for every number in between)

spec at index 414 has empty path.. (recomposing stage on stage @anon:0x109e04800:tmp.usda@ <0x10b808200>)
  3. I get the segfault on 23.8 sporadically. Running the same compiled universal lib with arch -x86_64 doesn’t seem to reproduce the error.

This isn’t super scientific by any means. I’m just stepping through my different versions of USD where my dependency tree hasn’t changed, so only USD should have changed.
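For anyone who wants to try a similar repeated run, here is a minimal sketch of one way to do it. It is not the exact script used for the numbers above: the helper names `run_child` and `crashed_with_segfault`, and the asset path and iteration count you pass in, are all placeholders. It runs the loop in a child interpreter, on the assumption that a segfault kills the process and so can only be detected from outside it.

```python
import signal
import subprocess
import sys

# Child script: repeats the reference-adding snippet from this thread many
# times inside one interpreter session. The asset path and iteration count
# are placeholders supplied on the command line.
CHILD_SCRIPT = """
import sys
from pxr import Usd

asset_path, iterations = sys.argv[1], int(sys.argv[2])
for _ in range(iterations):
    stage = Usd.Stage.CreateInMemory()
    prim = stage.DefinePrim('/asset')
    prim.GetReferences().AddReference(asset_path)
"""


def run_child(script, args=()):
    """Run `script` in a fresh Python interpreter and return its exit code.

    On POSIX systems, a negative return code -N means the child process
    was terminated by signal N.
    """
    cmd = [sys.executable, "-c", script] + [str(a) for a in args]
    return subprocess.run(cmd).returncode


def crashed_with_segfault(returncode):
    """True if the return code says the child was killed by SIGSEGV."""
    return returncode == -signal.SIGSEGV
```

Usage would be something like `code = run_child(CHILD_SCRIPT, ["/path/to/asset.usdz", 40000])` followed by `crashed_with_segfault(code)`. If the host application installs its own signal handler, the exit code may look different, so treat the check as a starting point only.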

If I had to guess, it’s some regression in 23.5/23.8, perhaps something like a semaphore that depends on x86 memory ordering?

@mtucker could you file an issue with Pixar?


Wow, thanks for the in-depth investigation, Dhruv! I can certainly file an issue on GitHub.
