Warmup on uneven last-batch-size in validate.py #2243

Open
wants to merge 1 commit into
base: main


@eqy eqy commented Jul 26, 2024

In the spirit of warming up for JIT compilation, add a warmup iteration for the case where the very last batch has a different size, which could otherwise unwittingly trigger recompilation.
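A minimal sketch of the idea, assuming the dataset length is known and the loader does not drop the last batch. The helper name `warmup_batch_sizes` is hypothetical and is not taken from validate.py; it only lists the distinct batch sizes a run would see, so that each one can be warmed up before timing starts.

```python
def warmup_batch_sizes(num_samples: int, batch_size: int) -> list:
    """Return the distinct batch sizes a loader without drop_last would yield.

    Hypothetical helper for illustration only, not part of validate.py.
    """
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")

    sizes = []
    # At least one full batch exists when the dataset is large enough.
    if num_samples >= batch_size:
        sizes.append(batch_size)
    # A non-zero remainder means the final batch is smaller than the rest.
    remainder = num_samples % batch_size
    if remainder:
        sizes.append(remainder)
    return sizes


if __name__ == "__main__":
    # 50000 samples at batch size 256 leave a final batch of 80.
    print(warmup_batch_sizes(50000, 256))
```

A warmup pass would then run the model once on a dummy input for each returned size, so a compiled model sees both shapes before any measurement.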

@eqy
Author

eqy commented Aug 29, 2024

This also means the warmup absorbs the cuDNN autotuning time for the last batch when it is uneven, since torch.backends.cudnn.benchmark = True is set.

@rwightman
Collaborator

@eqy there's a small issue here that takes a bit of explaining, which is why this sat...

The validation script should work in streaming mode without a defined dataset length. I believe it used to work, but I made it a bit too strict (I need to fix that). So the partial batch check must handle the case where the dataset length isn't defined.

Comment out lines 307/308 in reader_wds.py:

        #if not self.num_samples:
        #    raise RuntimeError(f'Invalid split definition, num_samples not specified.')

...and then the command below should work:

python validate.py --data-dir 'pipe:curl -s -H "Authorization: Bearer $HFT" -f -L https://huggingface.co/datasets/timm/imagenet-1k-wds/resolve/main/' --dataset wds/ --split 'imagenet1k-validation-{00..10}.tar'
Validating in float32. AMP not enabled.
Loading pretrained weights from Hugging Face hub (timm/dpn92.mx_in1k)
[timm/dpn92.mx_in1k] Safe alternative available for 'pytorch_model.bin' (as 'model.safetensors'). Loading weights using safetensors.
Model dpn92 created, param count: 37668392
Data processing configuration for current model + dataset:
	input_size: (3, 224, 224)
	interpolation: bicubic
	mean: (0.48627450980392156, 0.4588235294117647, 0.40784313725490196)
	std: (0.23482446870963955, 0.23482446870963955, 0.23482446870963955)
	crop_pct: 0.875
	crop_mode: center
Test: [   0/0]  Time: 2.018s (2.018s,  126.88/s)  Loss:  0.9466 (0.9466)  Acc@1:  77.734 ( 77.734)  Acc@5:  94.141 ( 94.141)
Test: [  10/0]  Time: 0.329s (0.507s,  504.63/s)  Loss:  0.7221 (0.8023)  Acc@1:  80.469 ( 79.936)  Acc@5:  97.266 ( 95.241)
Test: [  20/0]  Time: 0.330s (0.438s,  584.59/s)  Loss:  0.7316 (0.8055)  Acc@1:  82.812 ( 80.283)  Acc@5:  95.703 ( 94.754)
Test: [  30/0]  Time: 0.328s (0.409s,  626.20/s)  Loss:  0.6582 (0.7941)  Acc@1:  83.203 ( 80.262)  Acc@5:  95.703 ( 95.030)
 * Acc@1 80.079 (19.921) Acc@5 95.031 (4.969)
--result
{
    "model": "dpn92",
    "top1": 80.0791,
    "top1_err": 19.9209,
    "top5": 95.0314,
    "top5_err": 4.9686,
    "param_count": 37.67,
    "img_size": 224,
    "crop_pct": 0.875,
    "interpolation": "bicubic"
}
