Improve material system data packing #1473

Open: VReaperV wants to merge 17 commits into master from material-stages-pack


VReaperV
Contributor

@VReaperV VReaperV commented Dec 25, 2024

Partially implements #1448.

Stage data is allocated for stages rather than surface * stage, which greatly reduces the amount of required data. Texture handles and texture matrix are moved into a different buffer, since stage data ends up being often under 1kb this way. Lightmap/deluxemap handles are moved into a different UBO, because they depend on the drawSurf rather than on shader stage. All 3 are encoded into baseInstance (12 bits for shader stage, 12 bits for textures, and 8 more for lightmaps).
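
As a rough illustration of the `baseInstance` encoding described above, here is a minimal C++ sketch. The field widths (12 + 12 + 8 bits) come from the description; the bit order (shader stage in the lowest bits) and all names (`InstanceIDs`, `PackBaseInstance`, `UnpackBaseInstance`) are assumptions made for this example, not the PR's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Field widths from the PR description: 12 + 12 + 8 = 32 bits.
// ASSUMPTION: the bit order (stage lowest, lightmap highest) is illustrative only.
constexpr uint32_t STAGE_BITS = 12;
constexpr uint32_t TEXTURE_BITS = 12;
constexpr uint32_t LIGHTMAP_BITS = 8;

// Hypothetical container for the three indices.
struct InstanceIDs {
    uint32_t stage;    // index into the shader stage data
    uint32_t textures; // index into the texture data buffer
    uint32_t lightmap; // index into the lightmap/deluxemap UBO
};

// Combine the three indices into a single 32-bit value.
inline uint32_t PackBaseInstance(const InstanceIDs& ids) {
    assert(ids.stage < (1u << STAGE_BITS));
    assert(ids.textures < (1u << TEXTURE_BITS));
    assert(ids.lightmap < (1u << LIGHTMAP_BITS));
    return ids.stage
        | (ids.textures << STAGE_BITS)
        | (ids.lightmap << (STAGE_BITS + TEXTURE_BITS));
}

// Recover the three indices; a shader would apply the same shifts and masks.
inline InstanceIDs UnpackBaseInstance(uint32_t packed) {
    InstanceIDs ids;
    ids.stage = packed & ((1u << STAGE_BITS) - 1u);
    ids.textures = (packed >> STAGE_BITS) & ((1u << TEXTURE_BITS) - 1u);
    ids.lightmap = packed >> (STAGE_BITS + TEXTURE_BITS);
    return ids;
}
```

With these widths, a single draw can address at most 4096 stage entries, 4096 texture entries, and 256 lightmap entries.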

Optimised u_Color by changing it from a vec4 to a uint (the colors we use only have 8-bit precision in engine).
Also added u_ColorModulateColorGen for use with shaders other than cameraEffects: these only use 12 distinct states, encoded as 5 bits, which also allows using a uint here instead of a vec4. Added unpackUnorm4x8() define for devices that don't support ARB_gpu_shader5.

Changed u_TextureMatrix from a mat4 to mat3x2, reducing its size from 16 to just 6 components since only these 6 components have any actual effect, which allows making the TexData struct be exactly 64 bytes.

All of the above combined reduces the memory usage of material system by ~50-100x, and improves its performance and load times.

Also added r_materialSystemSkip cvar, which allows instantaneously switching between core and material renderer (as long as the latter is enabled in the first place), for much faster debugging and performance comparison.

@VReaperV VReaperV added T-Improvement Improvement for an existing feature A-Renderer T-Performance labels Dec 25, 2024
@VReaperV VReaperV force-pushed the material-stages-pack branch 6 times, most recently from 1332e02 to 981f302 Compare December 26, 2024 08:06
@slipher
Member

slipher commented Dec 26, 2024

Is there any way to split this up into multiple PRs?

@VReaperV
Contributor Author

I believe so, yes.

@VReaperV
Contributor Author

> Is there any way to split this up into multiple PRs?

Parts of it are now in: #1475, #1476, #1477, and #1478. I think that's as far as I can split it (u_TextureMatrix change would end up with a bunch of merge conflicts here).

`u_Color` was a vec4 that was being set from a `Color` object, which only has 8-bit precision per colour, so using up 4 times more memory than that to transfer it was quite atrocious. Instead, pack it into a uint32_t, then use `unpackUnorm4x8()` to unpack in the shader.

Also adds a macro for unpacking for hardware that doesn't support `GL_ARB_gpu_shader5`.
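
The packing scheme can be sketched on the CPU side roughly as follows. This is a hedged C++ illustration that mirrors the byte order of GLSL's `packUnorm4x8()` / `unpackUnorm4x8()` (first component in the least significant byte); the function names `PackUnorm4x8` and `UnpackUnorm4x8` here are hypothetical and are not taken from the engine.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Pack four floats in [0, 1] into one uint32_t with 8 bits per channel.
// Mirrors GLSL packUnorm4x8(): component 0 goes into the least significant byte.
// ASSUMPTION: names are illustrative, not the engine's.
inline uint32_t PackUnorm4x8(const std::array<float, 4>& color) {
    uint32_t packed = 0;
    for (int i = 0; i < 4; i++) {
        const float clamped = std::fmin(std::fmax(color[i], 0.0f), 1.0f);
        const uint32_t byte = static_cast<uint32_t>(std::lround(clamped * 255.0f));
        packed |= byte << (8 * i);
    }
    return packed;
}

// CPU-side mirror of GLSL unpackUnorm4x8(): each byte is divided by 255.
// A fallback macro for hardware without GL_ARB_gpu_shader5 would do the same
// shifts and masks in the shader.
inline std::array<float, 4> UnpackUnorm4x8(uint32_t packed) {
    std::array<float, 4> color = {{
        static_cast<float>(packed & 0xFFu) / 255.0f,
        static_cast<float>((packed >> 8) & 0xFFu) / 255.0f,
        static_cast<float>((packed >> 16) & 0xFFu) / 255.0f,
        static_cast<float>((packed >> 24) & 0xFFu) / 255.0f
    }};
    return color;
}
```

Since the source colour only has 8 bits per channel, the round trip through a `uint32_t` loses nothing compared to sending a full vec4.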
Add `u_ColorModulateColorGen` for use with shaders other than `cameraEffects`, since the latter can get an arbitrary vec4 for it. All other shaders only use 12 distinct states of `u_ColorModulateColorGen`, encoded as 5 bits. Also adds `u_ColorModulateLightFactor` as a global uniform, to be able to put the `genericMaterial` shader struct into 8 bytes.

Fixes an issue with material shader post-processing that was removing global uniforms that had similar names to a non-global one.
Being able to toggle `r_materialSystemSkip` makes debugging issues that are present in the material system but not in the core renderer much faster, since it can be changed without any sort of restart.
Allocate memory once per stage of each used shader, rather than once per stage for every surface that uses that shader. Change some of the material system functions to take `shaderStage_t*` instead of `drawSurf_t*` etc. as input.

Textures are put into a different storage buffer, following the structure of `textureBundle_t`. This allows decreasing the total amount of memory used, and changing the material buffer to be a UBO instead. The lightmap and deluxemap handles are put into another UBO, because they're per-surface, rather than per-shader-stage.

Also moved some duplicate code into a function.

Add `common.glsl` for vertex/fragment shaders.
Change `u_TextureMatrix` from a mat4 to a mat3x2: only these 6 components (0, 1, 4, 5, 12, 13) actually have an effect. This greatly reduces the amount of data being transferred for texture matrices, and removes some useless computations.
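
To see why those six components are enough, here is a small C++ sketch. It assumes the 4x4 matrix is stored column-major as 16 floats and that texture coordinates are transformed as `( m * vec4( st, 0.0, 1.0 ) ).xy`; both are assumptions for illustration, and the type and function names are hypothetical.

```cpp
#include <array>

// ASSUMPTION: column-major storage, so element [col * 4 + row].
using Mat4 = std::array<float, 16>;
// GLSL mat3x2 layout: 3 columns of 2 rows each, stored column by column.
using Mat3x2 = std::array<float, 6>;

struct Vec2 { float x, y; };

// Keep only the components that influence the transformed (s, t) pair.
inline Mat3x2 PackTextureMatrix(const Mat4& m) {
    Mat3x2 packed = {{ m[0], m[1], m[4], m[5], m[12], m[13] }};
    return packed;
}

// Reference: ( m * vec4( st, 0.0, 1.0 ) ).xy with the full matrix.
// The z column (m[8], m[9]) is multiplied by 0 and contributes nothing.
inline Vec2 TransformFull(const Mat4& m, Vec2 st) {
    Vec2 r;
    r.x = m[0] * st.x + m[4] * st.y + m[8] * 0.0f + m[12] * 1.0f;
    r.y = m[1] * st.x + m[5] * st.y + m[9] * 0.0f + m[13] * 1.0f;
    return r;
}

// Same result from the packed matrix: m * vec3( st, 1.0 ).
inline Vec2 TransformPacked(const Mat3x2& m, Vec2 st) {
    Vec2 r;
    r.x = m[0] * st.x + m[2] * st.y + m[4];
    r.y = m[1] * st.x + m[3] * st.y + m[5];
    return r;
}
```

Rows 2 and 3 of the full matrix only affect the discarded z and w outputs, and column 2 is multiplied by the zero z input, which leaves exactly the six listed entries.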
@VReaperV VReaperV force-pushed the material-stages-pack branch from 981f302 to 1772eea Compare January 2, 2025 12:33
Also fixes an incorrect matrix constructor in shaders.

2 participants