Questions about VtArray designs/usages to avoid unnecessary COW checks

Hi team,

We noticed that careless use of VtArray’s operator[], emplace_back, push_back, etc. can cause a large number of unnecessary COW (copy-on-write) checks. Since these calls are easy to misuse and can noticeably degrade performance, especially when editing the scene, we’d like to know whether the VtArray design could be improved.

Taking operator[] as an example, we ran a simple test using Google Benchmark:

BENCHMARK_DEFINE_F(HydraFixture, TestVtArray1)(benchmark::State& state)
{
    const int testAmount = state.range(0);
    for (auto _ : state)
    {
        VtIntArray intArray(testAmount);
        for (int i = 0; i < testAmount; ++i)
        {
            intArray[i] = i;
        }
        
        benchmark::DoNotOptimize(intArray);
    }
}
BENCHMARK_REGISTER_F(HydraFixture, TestVtArray1)
    ->Range(1 << 10, 1 << 20);

BENCHMARK_DEFINE_F(HydraFixture, TestVtArray2)(benchmark::State& state)
{
    const int testAmount = state.range(0);
    for (auto _ : state)
    {
        VtIntArray intArray(testAmount); // same allocation size as TestVtArray1
        auto intArrayBegin = intArray.begin();
        for (int i = 0; i < testAmount; ++i)
        {
            *(intArrayBegin + i) = i;
        }
        
        benchmark::DoNotOptimize(intArray);
    }
}
BENCHMARK_REGISTER_F(HydraFixture, TestVtArray2)
    ->Range(1 << 10, 1 << 20);

Simply switching to an iterator makes a huge difference:

Run on (12 X 24 MHz CPU s)
CPU Caches:
  L1 Data 64 KiB
  L1 Instruction 128 KiB
  L2 Unified 4096 KiB (x12)
Load Average: 2.87, 3.29, 3.73
----------------------------------------------------------------------------
Benchmark                                  Time             CPU   Iterations
----------------------------------------------------------------------------
HydraFixture/TestVtArray1/1024          1702 ns         1702 ns       405955
HydraFixture/TestVtArray1/4096          6511 ns         6511 ns       107638
HydraFixture/TestVtArray1/32768        52701 ns        52700 ns        13431
HydraFixture/TestVtArray1/262144      431759 ns       431731 ns         1657
HydraFixture/TestVtArray1/1048576    1932675 ns      1932633 ns          362
HydraFixture/TestVtArray2/1024           117 ns          117 ns      6516295
HydraFixture/TestVtArray2/4096           485 ns          485 ns      1316086
HydraFixture/TestVtArray2/32768         4569 ns         4569 ns       187843
HydraFixture/TestVtArray2/262144       26763 ns        26762 ns        27422
HydraFixture/TestVtArray2/1048576     263228 ns       263219 ns         2682
Program ended with exit code: 0

We tried replacing operator[] in the current USD codebase and found such uses everywhere: Comparing dev...adsk/bugfix/vtarray-modification · autodesk-forks/USD · GitHub
Things get even worse with push_back, which also causes frequent allocations and copies. We’d like to see whether VtArray’s design or usage can be improved to avoid such unnecessary COW checks.

Thank you,
Lumina Wang


Hello. I agree that the copy on write can sometimes be tricky to get right. There are two perhaps less invasive patterns than iterators that you might consider.

If you’re in a read-only context, consider using AsConst.

If you need to write to the array, consider making a span with TfMakeSpan so that any detach is triggered up front.

VtIntArray array = {1, 2, 3};
auto array_view = TfMakeSpan(array); // detach (if needed) happens here, once
for (size_t i = 0; i < array_view.size(); ++i) {
    array_view[i] *= 2; // no COW checks needed because we operate on the span
}

Keeping read-only VtArrays const and interfacing with writable VtArrays via TfSpan can avoid most COW issues without too many code changes. Hope that’s helpful.

Hi Matthew,

Thank you for your suggestion! Yes, this can prevent unnecessary COW checks in new code, but as I mentioned in the description, the bigger problem is that the costly pattern is already widespread across the whole USD codebase: Comparing dev…adsk/bugfix/vtarray-modification · autodesk-forks/USD · GitHub

And operator[] is not the only call that results in repeated COW checks. The same thing happens with emplace_back, push_back, etc. Many devs may assume these are as cheap as their std counterparts, but they are not. That’s why we’re wondering whether VtArray’s design can be improved to avoid such unnecessary COW checks. (For example, VtArray could be “const” by default: any write would require an explicit transition, and the COW check could be resolved once at that point.)

Thank you,
Lumina Wang


I quite like that suggestion of Const by default so we can avoid most CoW checks. I really don’t think most people are aware of the CoW nature of VtArray and even those who are often forget.