
Apply transforms in PreProcessor #2467

Open
wants to merge 21 commits into base: release/v2.0.0
Conversation

djdameln
Contributor

📝 Description

  • Delegate the responsibility of applying the input transforms to the PreProcessor. This is more in line with the other auxiliary components such as post-processing and evaluation, where the full functionality of the aux operation is contained in a single class.
  • To account for datasets with heterogeneous image shapes, we need to perform an additional check in the collate method and resize the images before collating if not all images have the same shape (a rough sketch of this check is included after this list).
  • Thorough manual testing may be required to check for any unwanted side-effects.
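A minimal sketch of the collate-time shape check described above, assuming the batch items are image tensors; the function name and the torchvision resize call are illustrative assumptions, not the exact code in this PR:

import torch
from torchvision.transforms.v2 import functional as F


def collate_with_resize(images: list[torch.Tensor]) -> torch.Tensor:
    """Stack images into a batch, resizing first if their shapes differ."""
    shapes = {tuple(img.shape[-2:]) for img in images}
    if len(shapes) > 1:
        # Heterogeneous shapes: resize everything to the first image's size
        # so that torch.stack does not fail. The PreProcessor still applies
        # the model-specific resize after collating.
        target_size = list(images[0].shape[-2:])
        images = [F.resize(img, target_size, antialias=True) for img in images]
    return torch.stack(images)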

✨ Changes

Select what type of change your PR is:

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • 🔨 Refactor (non-breaking change which refactors the code base)
  • 🚀 New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📚 Documentation update
  • 🔒 Security update

✅ Checklist

Before you submit your pull request, please make sure you have completed the following steps:

  • 📋 I have summarized my changes in the CHANGELOG and followed the guidelines for my type of change (skip for minor changes, documentation updates, and test enhancements).
  • 📚 I have made the necessary updates to the documentation (if applicable).
  • 🧪 I have written tests that support my changes and prove that my fix is effective or my feature works (if applicable).

For more information about code review checklists, see the Code Review Checklist.

@alexriedel1
Contributor

Yes, I also thought about the problem, and what you did now, adding augmentations to the dataset, is the only solution to this.
I would also say it's common and good practice in other frameworks.

@djdameln
Contributor Author

Yes, I also thought about the problem, and what you did now, adding augmentations to the dataset, is the only solution to this. I would also say it's common and good practice in other frameworks.

@alexriedel1 As you've seen, I re-introduced transforms on the dataset/datamodule side, but now named as augmentations.

Motivation:

In addition to the transforms applied by the PreProcessor, we now also have augmentation transforms on the dataset/datamodule side. The two serve different purposes: augmentations are optional and can be used for enriching the dataset, e.g. random transforms on the training set or test-time augmentations on the test set. PreProcessor transforms are model-specific and ensure that the input images are read correctly by the model. Usually this involves resizing and normalization, but other transforms are possible as well.

This way, we keep the model-specific transforms on the model side, which has the advantages mentioned in my earlier comment. At the same time, the augmentation transforms on the datamodule side make it easier for the user to define their custom augmentation pipeline. The main advantage over earlier designs is that the user now only has to define the augmentations, and does not need to include the model-specific transforms such as resizing and normalization (because these are handled separately by the PreProcessor).
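To illustrate, a hypothetical usage sketch of this split; the datamodule and model classes, import paths, and keyword arguments (train_augmentations on the datamodule, transform on the PreProcessor) are assumptions based on this PR rather than a confirmed API:

from torchvision.transforms.v2 import Compose, Normalize, RandomHorizontalFlip, Resize

from anomalib.data import MVTec
from anomalib.models import Padim
from anomalib.pre_processing import PreProcessor

# Augmentations: optional, dataset-side, e.g. random transforms for the training set.
datamodule = MVTec(category="bottle", train_augmentations=RandomHorizontalFlip(p=0.5))

# PreProcessor transform: model-specific, applied after collating and exported
# with the model (typically resize + normalization).
pre_processor = PreProcessor(
    transform=Compose([
        Resize((256, 256)),
        Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]),
)
model = Padim(pre_processor=pre_processor)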

Concerning resizing:

  • When no augmentations are supplied, resizing only happens on the PreProcessor side, using the model-specific transforms defined for the model, or any custom resize/transform passed to the PreProcessor by the user. When the user wants to change the image size as seen by the model, this is the resize transform they need to change.
  • Applying augmentations to the images at full resolution may be costly, so the user may want to include a Resize transform in their augmentation pipeline to speed up data loading. In this case, resizing will happen twice: once when the dataset applies the augmentations before collating, and once by the PreProcessor after collating. This may be computationally expensive, as you mentioned earlier, but only when the two resize transforms use a different output size. When they use the same size, the second one is skipped.
  • In the case of heterogeneous image sizes in the dataset, the user can now just pass a Resize transform to the augmentations argument of the datamodule (or dataset). This ensures that the dataset resizes the images before collating, so the additional resize operation in the collate method is not called. (We still keep the resize operation in the collate method as an additional safeguard.) This gives the user control over the interpolation method used for resizing the images (see the sketch below).
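A hypothetical sketch of the heterogeneous-size case; the augmentations keyword comes from this PR, while the Folder datamodule arguments and the interpolation choice are illustrative assumptions:

from torchvision.transforms import InterpolationMode
from torchvision.transforms.v2 import Resize

from anomalib.data import Folder

datamodule = Folder(
    name="my_dataset",
    root="./datasets/my_dataset",
    normal_dir="good",
    abnormal_dir="defect",
    # Resize before collating so all images in a batch share one shape,
    # with explicit control over the interpolation method.
    augmentations=Resize((512, 512), interpolation=InterpolationMode.BICUBIC),
)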

Please let me know what you think, any suggestions are more than welcome :)

@alexriedel1
Contributor

alexriedel1 commented Dec 19, 2024

Yes, I (and I guess most users too) am happy with this! You should make clear in the docs that pre-processing transforms are stored in the model after export while augmentations are not, so people know exactly which one to use for their use case.


codecov bot commented Dec 19, 2024

Codecov Report

Attention: Patch coverage is 96.99248% with 4 lines in your changes missing coverage. Please review.

Project coverage is 78.58%. Comparing base (58453ef) to head (03ce134).
Report is 1 commit behind head on release/v2.0.0.

Files with missing lines Patch % Lines
src/anomalib/data/datasets/base/video.py 66.66% 2 Missing ⚠️
src/anomalib/data/datasets/base/image.py 75.00% 1 Missing ⚠️
...malib/models/image/efficient_ad/lightning_model.py 66.66% 1 Missing ⚠️
Additional details and impacted files
@@                Coverage Diff                 @@
##           release/v2.0.0    #2467      +/-   ##
==================================================
+ Coverage           78.50%   78.58%   +0.08%     
==================================================
  Files                 303      306       +3     
  Lines               12955    12970      +15     
==================================================
+ Hits                10170    10193      +23     
+ Misses               2785     2777       -8     
Flag Coverage Δ
integration_py3.10 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Comment on lines -85 to +99
val_split_ratio: float,
train_augmentations: Transform | None = None,
val_augmentations: Transform | None = None,
test_augmentations: Transform | None = None,
augmentations: Transform | None = None,
Contributor

Looking at the naming, are we 100% sure that augmentations is the best name in this case? The reason for asking is that the type annotation for augmentations is, obviously, Transform.

Would it make complete sense to users?

Contributor

OK, so there are transforms that can be pre-defined per model, and they are stored in the exported model.
And there are augmentations that are only applied during training.
Both can be defined separately for training, testing, and validation.

Maybe it would make sense to call the transforms something like input_image_transforms? Or something that makes it clearer that they are "baked" into the model and are the last thing that transforms the images before they are processed by the model.

Contributor Author

On the datamodule side, I think augmentations is a more accurate and descriptive name than transforms. The augmentations naming signals that these are meant to define the data augmentation pipeline, while transforms is more ambiguous and can refer to both data augmentations and model-specific input image pre-processing.

On the model/pre-processor side, we could think of a more descriptive name than transform, maybe input_transforms? In any case, we should ensure that there is a thorough how-to guide that clearly explains these concepts.


Both can be defined separately for training, testing and validation.

@alexriedel1 Please note that in the latest version of this PR, we only keep one transform on the PreProcessor side. Since the model always requires the inputs in the same format, we figured there was no longer a use case for having separate model transforms for the different subsets (but please let us know if you disagree).

Contributor

Yes, that's a good idea!
