Help with USD Audio APIs

That's great! The API docs are difficult for me to understand.

I tried using Python to add audio information to usdc files and then packaged them with the usdz tool, but the embedded sound is still silent.

I thought it might be a VtValue issue, because I did not use one in my Python code, while the docs say a VtValue is required. But I don’t understand what a VtValue is or how to use it.

Below is the code:

from pxr import Usd, UsdGeom, UsdMedia, Sdf, Vt

stage_ref = Usd.Stage.Open('2.usdc')

prim = stage_ref.GetPrimAtPath('/audio')

audio_prim = UsdMedia.SpatialAudio(prim)

audio_filepath_attr = audio_prim.CreateFilePathAttr("Tyrannosaur.mp3")
aural_mode_attr = audio_prim.CreateAuralModeAttr("nonSpatial")
playback_mode_attr = audio_prim.CreatePlaybackModeAttr("loopFromStartToEnd")


This is the resulting usdz file after zipping with usdzip:
image5.usdz (5.8 MB)

I would also like to know whether there is API documentation specifically for Python, because the official API docs are very difficult for me to understand, especially as a beginner in both USD and coding with no knowledge of the C family of languages.

Hi @micsir
Welcome to the forum! I moved your post to a new topic so it can get more eyes on it, as it was only tangentially related to the previous topic.

I’m not at my computer right now to test your code, but I can help answer your questions:

  1. Unfortunately there isn’t any Python-specific documentation. It’s not ideal, but learning to translate from the C++ API docs is the best option right now. The build can also generate Python typing information for autocomplete, which helps a bit.

  2. You mentioned that the audio is silent. Which application are you trying to play it in? Audio support depends on the app, and right now very few applications support it. QuickLook on the Mac and iPhone is one of the few that do.

  3. A VtValue is the container USD uses to wrap all attribute value types. In Python it is usually converted to and from the actual type automatically, but in C++ it means the API doesn’t need a separate function for every possible type and can use one wrapper type instead.
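To make the automatic conversion in point 3 concrete, here is a minimal sketch, assuming the USD Python bindings (pxr) are installed. The prim path /demo and the attribute names weights and label are hypothetical and exist only for this illustration.

```python
def vt_round_trip():
    """Set attributes from plain Python values and read them back."""
    # Imported here so that defining the function does not itself require
    # the USD Python bindings (pxr, e.g. from the usd-core package).
    from pxr import Sdf, Usd

    stage = Usd.Stage.CreateInMemory()
    prim = stage.DefinePrim("/demo")  # hypothetical prim, illustration only

    # Hypothetical custom attributes; the names are not part of any schema.
    weights = prim.CreateAttribute("weights", Sdf.ValueTypeNames.FloatArray)
    label = prim.CreateAttribute("label", Sdf.ValueTypeNames.String)

    # Plain Python values go in; no explicit VtValue is constructed.
    weights.Set([0.25, 0.5, 1.0])
    label.Set("hello")

    # Arrays come back as Vt array types, scalars as native Python types.
    return weights.Get(), label.Get()
```

Calling vt_round_trip() should give back a Vt.FloatArray (which can be passed to list()) and an ordinary Python str, without any VtValue appearing in the Python code.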

Thanks @dhruvgovil

The device I used is an iPhone 13 running iOS 16.8. When I open the USDZ file with QuickLook, it is silent. I will read the docs again to see if I can find the solution. Regarding VtValue, if it is convenient for you, could you provide some Python code? That would help me understand it.

For example, the API docs give the method for creating the file path attribute as USDMEDIA_API UsdAttribute CreateFilePathAttr (VtValue const &defaultValue=VtValue(), bool writeSparsely=false) const. How should I call it from Python with the Vt library?


In Python this usually converts automatically. The file path attribute holds an asset path, and you can give it any plain Python string. Your current code already does that and should be fine.
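As a minimal sketch of that conversion for the file path attribute, assuming the USD Python bindings (pxr) are installed: the in-memory stage is an assumption made to keep the example self-contained, and the helper name file_path_round_trip is hypothetical.

```python
def file_path_round_trip(audio_file="Tyrannosaur.mp3"):
    """Author filePath from a plain str and inspect what comes back."""
    # Imported here so that defining the function does not itself require
    # the USD Python bindings (pxr, e.g. from the usd-core package).
    from pxr import Sdf, Usd, UsdMedia

    # Assumption: an in-memory stage instead of the '2.usdc' from the thread.
    stage = Usd.Stage.CreateInMemory()
    audio = UsdMedia.SpatialAudio.Define(stage, "/audio")

    # The C++ signature takes a VtValue; in Python a plain str is passed.
    attr = audio.CreateFilePathAttr(audio_file)

    value = attr.Get()  # read back as an Sdf.AssetPath
    is_asset_typed = attr.GetTypeName() == Sdf.ValueTypeNames.Asset
    return value, is_asset_typed
```

The returned value should be an Sdf.AssetPath whose path property holds the original string, which matches the uniform asset filePath line in the exported file.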


The attributes also look correctly defined in your usdz export:

def Xform "audio"
    uniform token auralMode = "nonSpatial"
    uniform asset filePath = @path\to\my\downloads\image5.usdz[0/Tyrannosaur.mp3]@
    uniform token playbackMode = "loopFromStartToEnd"

However, note the def Xform "audio". I suspect the issue might be there and that the USDZ player does not recognize the SpatialAudio because it’s not defined like that.

Could you try:

from pxr import UsdMedia

UsdMedia.SpatialAudio.Define(stage, "/audio")

So that it applies the correct type definition, generating something like this:

def SpatialAudio "audio"

Which would make your code like this:

from pxr import Usd, UsdMedia

stage_ref = Usd.Stage.Open('2.usdc')

audio_prim = UsdMedia.SpatialAudio.Define(stage_ref, "/audio")

audio_filepath_attr = audio_prim.CreateFilePathAttr("Tyrannosaur.mp3")
aural_mode_attr = audio_prim.CreateAuralModeAttr("nonSpatial")
playback_mode_attr = audio_prim.CreatePlaybackModeAttr("loopFromStartToEnd")


I’m not sure whether that solves the issue. Admittedly, I’m not all that familiar with the USD API yet either.
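One way to check the difference between wrapping a prim and defining it is to compare the prim's type name at each step. This is a minimal sketch, assuming the USD Python bindings (pxr) are installed; the in-memory stage and the helper name compare_prim_types are hypothetical stand-ins for the real file.

```python
def compare_prim_types():
    """Report (type name, IsA SpatialAudio) before and after Define()."""
    # Imported here so that defining the function does not itself require
    # the USD Python bindings (pxr, e.g. from the usd-core package).
    from pxr import Usd, UsdGeom, UsdMedia

    def report(stage):
        prim = stage.GetPrimAtPath("/audio")
        return str(prim.GetTypeName()), prim.IsA(UsdMedia.SpatialAudio)

    # Assumption: an in-memory stage that mimics an existing Xform "audio".
    stage = Usd.Stage.CreateInMemory()
    UsdGeom.Xform.Define(stage, "/audio")

    # Wrapping the prim in the schema class authors attributes on it,
    # but leaves the prim's type untouched.
    wrapped = UsdMedia.SpatialAudio(stage.GetPrimAtPath("/audio"))
    wrapped.CreateAuralModeAttr("nonSpatial")
    after_wrap = report(stage)

    # Define() authors the SpatialAudio type on the prim.
    UsdMedia.SpatialAudio.Define(stage, "/audio")
    after_define = report(stage)

    return after_wrap, after_define
```

If the suspicion above is right, the first pair should still report Xform and the second should report SpatialAudio.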


Hello @dhruvgovil @BigRoyNL, I checked the docs again and have resolved the issue, thanks.

The SpatialAudio prim should be placed under an Xform prim, with the playback mode set to “loopFromStage”.
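A minimal sketch of that layout, assuming the USD Python bindings (pxr) are installed: the parent name /Sounds follows the example in the docs, while the in-memory stage and the helper name build_audio_stage are hypothetical. Packaging into a usdz file is not covered here.

```python
def build_audio_stage(audio_file="Tyrannosaur.mp3"):
    """Author a SpatialAudio prim under an Xform, looping over the stage."""
    # Imported here so that defining the function does not itself require
    # the USD Python bindings (pxr, e.g. from the usd-core package).
    from pxr import Usd, UsdGeom, UsdMedia

    # Assumption: an in-memory stage instead of the '2.usdc' from the thread.
    stage = Usd.Stage.CreateInMemory()

    # Parent Xform, then the typed SpatialAudio prim beneath it.
    UsdGeom.Xform.Define(stage, "/Sounds")
    audio = UsdMedia.SpatialAudio.Define(stage, "/Sounds/audio")

    audio.CreateFilePathAttr(audio_file)
    audio.CreateAuralModeAttr(UsdMedia.Tokens.nonSpatial)
    audio.CreatePlaybackModeAttr(UsdMedia.Tokens.loopFromStage)
    return stage
```

Calling build_audio_stage().GetRootLayer().ExportToString() shows the authored text, which should contain def Xform "Sounds" with def SpatialAudio "audio" nested inside it.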

This is the doc; see the example that begins with def Xform "Sounds" { def SpatialAudio.


NVIDIA recently published these Python API docs: UsdMedia module — pxr-usd-api 105.0.2 documentation. They’re programmatically translated from the C++ docs, so they’re not perfect, and in this particular case you still need that C++ understanding of VtValue.

You might find it useful in other cases though. Let me know if you use it and have any feedback.


@mati-nvidia is that something that could be added to the mainline USD repo? Since it’s programmatic, it would be nice to have the docs synced with the Pixar builds, or to be able to generate them locally.

PRs are already submitted for the Python docstring improvements courtesy of @elrond79: Pull requests · PixarAnimationStudios/OpenUSD · GitHub. I think the Sphinx work to generate the docs after that would be on the Pixar side.


Thanks, but I prefer another article from NVIDIA. It’s very basic, but it helps a lot, especially for a beginner like me :rofl: