It seems the code allocates every block needed during decoding up front, even though these blocks are only used temporarily. Ideally, the decoded data would be written directly to the output rather than into separately allocated temporary blocks.
For example, the blocks to process could be divided evenly among the threads. Each thread allocates a single RawBlock4X4Rgba32 and decodes its blocks serially, reusing that one temporary block. After each block is decoded, the contents of the raw block are written directly to the output image at the proper location.
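A rough sketch of that idea, written in Python only to show the structure and not taken from the library's code. Everything here is hypothetical: `decode_block_into` is a stand-in decoder that treats each encoded block as a single RGBA colour, and `decode_range` / `decode_image` are made-up names. The point is that each worker owns one scratch block and copies it straight into the shared output buffer.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4  # blocks are 4x4 pixels
BPP = 4    # RGBA, one byte per channel


def decode_block_into(encoded, scratch):
    """Hypothetical stand-in decoder: fill the scratch block with one colour."""
    for i in range(BLOCK * BLOCK):
        scratch[i * BPP:(i + 1) * BPP] = encoded


def decode_range(encoded_blocks, start, stop, blocks_w, out, width, height):
    """Decode blocks [start, stop) serially, reusing a single scratch block."""
    scratch = bytearray(BLOCK * BLOCK * BPP)  # the only per-worker allocation
    for index in range(start, stop):
        decode_block_into(encoded_blocks[index], scratch)
        bx, by = index % blocks_w, index // blocks_w
        # Clip blocks that hang over the right or bottom edge of the image.
        cols = min(BLOCK, width - bx * BLOCK)
        rows = min(BLOCK, height - by * BLOCK)
        for row in range(rows):
            src = row * BLOCK * BPP
            dst = ((by * BLOCK + row) * width + bx * BLOCK) * BPP
            out[dst:dst + cols * BPP] = scratch[src:src + cols * BPP]


def decode_image(encoded_blocks, width, height, workers=4):
    """Split the blocks evenly across workers and decode into one output buffer."""
    blocks_w = (width + BLOCK - 1) // BLOCK
    blocks_h = (height + BLOCK - 1) // BLOCK
    total = blocks_w * blocks_h
    out = bytearray(width * height * BPP)
    per_worker = max(1, (total + workers - 1) // workers)  # even split, rounded up
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(decode_range, encoded_blocks, start,
                        min(start + per_worker, total),
                        blocks_w, out, width, height)
            for start in range(0, total, per_worker)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return bytes(out)
```

Each worker writes to a disjoint region of the output, so no locking is needed, and the number of temporary blocks is bounded by the number of workers rather than the number of blocks in the image.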
Not sure if this is worth it, but interested to hear your thoughts.
There's a lot that could be improved in terms of memory usage and general performance, and this would probably be one of the easier things to fix. That said, I think there's always some tradeoff between writing performant code and readable code.
I'm not going to add this change in 2.0 yet, but if someone wants to create a PR with a fix that doesn't hurt readability too much, that would be greatly appreciated.
I'll probably also look into more ways to improve performance and memory usage without sacrificing readability some time after the 2.0 release.