Fix Metal vertex format lookup logic and reduce memory used by MVKPixelFormats lookups. #2105

billhollings · 2023-12-28T00:37:29Z

- Remove MVKPixelFormats::_mtlFormatDescIndicesByMTLVertexFormats and index into _mtlVertexFormatDescriptions using MTLVertexFormat directly.
- Fix assertion to test MTLVertexFormat < _mtlVertexFormatCount.
- Recognize that every MTLPixelFormat value can be held in uint16_t.
- Reduce array sizes to the minimum needed to hold the mapped MTLPixelFormat values, and rely on assertions to catch overflow if additional formats are added in future.

This fixes the assertion issues first identified in PR #1940.
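The direct-indexing idea in the description can be sketched as follows. This is a minimal illustration, not the MoltenVK code: DemoVertexFormat, kDemoVertexFormatCount, DemoFormatDesc, and getDemoDesc are hypothetical names standing in for MTLVertexFormat, _mtlVertexFormatCount, and the description table. It assumes the enum values are small and consecutive from zero, so the enum value itself can serve as the array index.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for MTLVertexFormat: small, consecutive values from zero.
enum DemoVertexFormat : uint32_t {
    DemoVertexFormatInvalid = 0,
    DemoVertexFormatUChar2  = 1,
    DemoVertexFormatFloat4  = 2,
};

// One past the largest mapped enum value; plays the role of _mtlVertexFormatCount.
constexpr std::size_t kDemoVertexFormatCount = DemoVertexFormatFloat4 + 1;

struct DemoFormatDesc {
    const char* name;
    uint16_t    bytesPerElement;   // values are small, so a narrow type is enough
};

// Each description sits at the index equal to its enum value, so no separate
// enum-to-index table is needed.
inline const std::array<DemoFormatDesc, kDemoVertexFormatCount>& demoDescriptions() {
    static const std::array<DemoFormatDesc, kDemoVertexFormatCount> descs = {{
        { "Invalid", 0  },
        { "UChar2",  2  },
        { "Float4",  16 },
    }};
    return descs;
}

inline const DemoFormatDesc& getDemoDesc(DemoVertexFormat fmt) {
    // Strictly less than the count: an index equal to the count is out of bounds.
    assert(fmt < kDemoVertexFormatCount && "format value exceeds table size");
    return demoDescriptions()[fmt];
}
```

With this layout, getDemoDesc(DemoVertexFormatFloat4) reads element 2 of the table directly, and the assertion fires in debug builds if a new enum value is added without growing the table.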

spnda · 2023-12-28T01:08:25Z

Is there any particular reason we don't use something like std::vector here instead of these statically sized C-arrays? This would get rid of all of this size nonsense, which was the only reason #1940 seemed correct but wasn't. And it would allow seamless addition of new formats in the future, while also avoiding magic values such as static const uint32_t _mtlPixelFormatCoreCount = MTLPixelFormatX32_Stencil8 + 2; to make older Xcode versions happy. The optimization value we get by saving one allocation is so minimal here that I don't think it should be the deciding factor.

billhollings · 2023-12-28T01:56:04Z

> Is there any particular reason we don't use something like std::vector here

You're not wrong.

However, MVKPixelFormats instances hang around forever, and are inherently static, so the design was about statically allocating the appropriate amount of memory. A std::vector will over-allocate (unless the capacity is set correctly). We could use an MVKSmallVector with the correct allocation, but we'd still want to have the assertion tests anyway, so that we're not expanding the vector beyond its capacity.

BTW...the constants _vkFormatCoreCount and _mtlPixelFormatCoreCount are not used for allocation. They handle the non-linear nature of VkFormat and MTLPixelFormat values, respectively. Beyond those respective values, the enums take large jumps between values. So we're stuck with those constants at least.

cdavis5e

Just a few minor nitpicks...

MoltenVK/MoltenVK/GPUObjects/MVKPixelFormats.mm

spnda · 2023-12-28T03:48:37Z

> However, MVKPixelFormats instances hang around forever, and are inherently static, so the design was about statically allocating the appropriate amount of memory. A std::vector will over allocate (unless the capacity is set correctly). We could use an MVKSmallVector with the correct allocation, but we'd want to still have the assertion tests anyway, so that we're not expanding the vector beyond its capacity.

std::vector::shrink_to_fit can be called at the end of the initialization, which will shrink the allocation to precisely fit as many elements as are currently stored, as no additional memory will ever be needed. It can then function as a statically sized array, which would solve your memory concern. Alternatively, std::vector::reserve could (also) be called with a guess or some statically known value. That wouldn't get rid of the value, but it wouldn't tie the correctness of the program to a single integer.
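A minimal sketch of that reserve-then-trim pattern, using plain std::vector. DemoDesc and buildDescriptions are hypothetical names for illustration. Note that the C++ standard describes shrink_to_fit as a non-binding request, so the final capacity is usually, but not necessarily, equal to the size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct DemoDesc { int value; };

// Fills a vector with `count` entries, reserving a rough upper bound first
// and asking the vector to drop unused capacity at the end.
inline std::vector<DemoDesc> buildDescriptions(std::size_t count, std::size_t capacityGuess) {
    std::vector<DemoDesc> descs;
    descs.reserve(capacityGuess);   // only a guess; correctness does not depend on it
    for (std::size_t i = 0; i < count; ++i) {
        descs.push_back(DemoDesc{ static_cast<int>(i) });
    }
    assert(descs.size() == count);
    descs.shrink_to_fit();          // non-binding request to release excess capacity
    return descs;
}
```

If capacityGuess is too small, the vector simply grows as needed; if it is too large, shrink_to_fit gives the implementation a chance to release the excess once population is complete.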

> BTW...the constants _vkFormatCoreCount and _mtlPixelFormatCoreCount are not used for allocation. They handle the non-linear nature of VkFormat and MTLPixelFormat values, respectively. Beyond those respective values, the enums take large jumps between values. So we're stuck with those constants at least.

Yes, I know. Counting the formats is inherently finicky, and having to additionally rely on magic values for newer formats makes it even worse. I was also critiquing the weird use of one statically sized array plus an unordered_map for core and extension format description lookups. Replacing that combination would help eliminate accidental errors in much the same way as the vector replacement, though it is not as important.

billhollings · 2023-12-28T19:54:21Z

> I was also critiquing the weird use of one statically sized array and an unordered_map for core and extension format description lookups.

I agree, it's a different design. 😉

Access to the format descriptions is the core operation of MVKPixelFormats, and is used by essentially all of its functions. So the intention was to make it as fast as possible. The array/map combo design was to handle this:

...
VK_FORMAT_ASTC_12x10_SRGB_BLOCK = 182,
VK_FORMAT_ASTC_12x12_UNORM_BLOCK = 183,
VK_FORMAT_ASTC_12x12_SRGB_BLOCK = 184,
VK_FORMAT_G8B8G8R8_422_UNORM = 1000156000,
VK_FORMAT_B8G8R8G8_422_UNORM = 1000156001,
...

The first 185 values (0 through 184) are consecutive from zero, lending themselves to being put into an array or vector. The remaining values cannot be handled that way. Also, the linear elements are, in general, more commonly used than the later elements, so it would be a shame to put them all in an unordered_map (where access would be something like an order of magnitude slower).
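The array/map combination described here can be sketched roughly as below. This is an illustration under assumptions, not the MVKPixelFormats code: DemoFormatTable, DemoDesc, and kDemoCoreCount are hypothetical names, and the cut-off of 185 simply reflects the enum excerpt above, where values run consecutively from 0 through 184.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <unordered_map>

struct DemoDesc { const char* name; };

// Hypothetical cut-off: values below this are consecutive from zero (0 through 184).
constexpr uint32_t kDemoCoreCount = 185;

class DemoFormatTable {
public:
    void set(uint32_t fmt, DemoDesc desc) {
        if (fmt < kDemoCoreCount) { _core[fmt] = desc; }   // fast path: direct index
        else                      { _ext[fmt] = desc; }    // sparse values: hashed lookup
    }

    // Returns nullptr for a sparse value that was never registered.
    const DemoDesc* find(uint32_t fmt) const {
        if (fmt < kDemoCoreCount) { return &_core[fmt]; }
        auto it = _ext.find(fmt);
        return it == _ext.end() ? nullptr : &it->second;
    }

private:
    std::array<DemoDesc, kDemoCoreCount> _core{};          // consecutive values
    std::unordered_map<uint32_t, DemoDesc> _ext;           // large, sparse values
};
```

The commonly used consecutive values pay only for a comparison and an array index, while the large, sparsely assigned values fall back to the hash map.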

- Add MVKInflectionMap collection to manage lookups based on enums that have a large set of consecutive elements, plus additional enum values that are more sparsely assigned.
- Recognize every MTLPixelFormat value can be held in uint16_t.
- Reduce inflection-map sizes by calling shrink_to_fit().
- runcts script log completion time (unrelated).

billhollings · 2023-12-31T16:59:40Z

Okay. I decided to have a little design fun to incorporate @spnda's recommendations.

I've added a new MVKInflectionMap class to encapsulate the combo of linear and sparse format enum elements, remove the need for assertions, and allow the excess memory to be trimmed after population.
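For readers unfamiliar with the idea, a collection of this kind might look roughly like the sketch below. This is not the MVKInflectionMap source: DemoInflectionMap and its members are hypothetical, and it uses std::vector where MoltenVK has its own container types. Keys below the inflection point are stored in a vector indexed by key, and everything else goes into a hash map.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Illustrative only; not the MoltenVK implementation.
template <typename Key, typename Value, std::size_t Inflection>
class DemoInflectionMap {
public:
    Value& operator[](Key key) {
        auto idx = static_cast<std::size_t>(key);
        if (idx < Inflection) {
            // Grow on demand, so no fixed-size assertion is needed.
            if (idx >= _linear.size()) { _linear.resize(idx + 1); }
            return _linear[idx];
        }
        return _sparse[key];
    }

    // Pre-allocate the linear part, capped at the inflection point.
    void reserve(std::size_t n) { _linear.reserve(n < Inflection ? n : Inflection); }

    // Ask the linear part to release unused capacity after population.
    void shrink_to_fit() { _linear.shrink_to_fit(); }

    std::size_t linearSize() const { return _linear.size(); }
    std::size_t sparseSize() const { return _sparse.size(); }

private:
    std::vector<Value> _linear;               // keys in [0, Inflection)
    std::unordered_map<Key, Value> _sparse;   // keys at or above Inflection
};
```

Because the linear part resizes itself when a larger key arrives, adding a new consecutive format does not require touching a hard-coded array size, and shrink_to_fit() can trim whatever capacity is left over once population is complete.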

billhollings · 2024-01-02T16:31:23Z

@cdavis5e @spnda Any further suggestions before I pull this in?

spnda

MVKInflectionMap should probably get a reserve function to avoid a lot of reallocations, which we can then use in initMTLPixelFormatCapabilities and initVkFormatCapabilities in addition to the shrink_to_fit.

MoltenVK/MoltenVK/GPUObjects/MVKPixelFormats.mm

spnda · 2024-01-04T06:51:47Z

@billhollings I also mentioned this in my review, but didn't know where to put it into the diff as a review so just put it into the main review message:

MVKInflectionMap should probably get a reserve function to avoid a lot of reallocations, which we can then use in initMTLPixelFormatCapabilities and initVkFormatCapabilities in addition to the shrink_to_fit.

That hasn't been addressed and I still think it's important. When looping through those two functions, the MVKInflectionMap calls push_back() for every element, effectively letting the MVKSmallVector resize over and over again. As the size is known, having a reserve call at the start of the function is important in my opinion.
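The effect being described can be observed with a small experiment. The sketch below uses std::vector rather than MVKSmallVector (whose growth policy may differ), and countReallocations is a hypothetical helper: it counts how often the capacity changes while appending, with and without an up-front reserve.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the vector's capacity changes while appending `count` items.
inline std::size_t countReallocations(std::size_t count, std::size_t reserveHint) {
    std::vector<int> v;
    v.reserve(reserveHint);
    std::size_t reallocations = 0;
    std::size_t lastCapacity = v.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != lastCapacity) {   // capacity changed, so storage was reallocated
            ++reallocations;
            lastCapacity = v.capacity();
        }
    }
    return reallocations;
}
```

Calling countReallocations(200, 200) returns zero, because reserve guarantees room for all 200 elements, whereas countReallocations(200, 0) returns a positive count whose exact value depends on the standard library's growth factor.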

billhollings · 2024-01-04T17:07:50Z

> @billhollings I also mentioned this in my review, but didn't know where to put it into the diff as a review so just put it into the main review message:
>
> > MVKInflectionMap should probably get a reserve function to avoid a lot of reallocations, which we can then use in initMTLPixelFormatCapabilities and initVkFormatCapabilities in addition to the shrink_to_fit.
>
> That hasn't been addressed and I still think it's important. When looping through those two functions the MVKInflectionMap calls push_back() for every element, effectively letting the MVKSmallVector resize over and over again. As the size is known, having a reserve call at the start of the function is important in my opinion.

Uggh. Sorry. I missed your initial recommendation for some reason. Good catch.

I've added reserve(), and to future-proof, I've arbitrarily pre-allocated enough capacity for roughly double the current number of formats in each case. shrink_to_fit() will take care of collapsing the excess.

Fix Metal vertex format lookup logic.

fe65485

billhollings requested review from cdavis5e and spnda December 28, 2023 00:37

billhollings force-pushed the fix-mtl-fmt-lookup branch from b6607c3 to fc9f9ca Compare December 28, 2023 00:39

cdavis5e requested changes Dec 28, 2023

billhollings force-pushed the fix-mtl-fmt-lookup branch from fc9f9ca to 88799cf Compare December 31, 2023 16:51

billhollings requested a review from cdavis5e December 31, 2023 17:00

spnda requested changes Jan 3, 2024

billhollings force-pushed the fix-mtl-fmt-lookup branch from 23bd121 to 0654928 Compare January 3, 2024 17:23

billhollings requested a review from spnda January 3, 2024 17:28

cdavis5e approved these changes Jan 3, 2024

billhollings force-pushed the fix-mtl-fmt-lookup branch from dc7f166 to a836e18 Compare January 4, 2024 17:02

spnda approved these changes Jan 4, 2024

billhollings merged commit 48adb42 into KhronosGroup:main Jan 4, 2024
6 checks passed

billhollings deleted the fix-mtl-fmt-lookup branch January 4, 2024 19:25
