Incorrect Use of torch.no_grad() in fit_epoch Method in d2l/torch.py::Trainer::fit_epoch #2573

caydenwei · 2023-12-19T16:21:48Z

Hello,

I noticed a potential issue in the fit_epoch method in https://github.com/d2l-ai/d2l-en/blob/master/d2l/torch.py, where loss.backward() is called within a torch.no_grad() block:

self.optim.zero_grad()
with torch.no_grad():
    loss.backward()
    ...

This usage likely prevents the calculation of gradients, as loss.backward() should not be inside a torch.no_grad() block. The correct approach would be:

self.optim.zero_grad()
loss.backward()
...

Here is the original code:

    def fit_epoch(self):
        """Defined in :numref:`sec_linear_scratch`"""
        self.model.train()
        for batch in self.train_dataloader:
            loss = self.model.training_step(self.prepare_batch(batch))
            self.optim.zero_grad()
            with torch.no_grad():
                loss.backward()
                if self.gradient_clip_val > 0:  # To be discussed later
                    self.clip_gradients(self.gradient_clip_val, self.model)
                self.optim.step()
            self.train_batch_idx += 1
        if self.val_dataloader is None:
            return
        self.model.eval()
        for batch in self.val_dataloader:
            with torch.no_grad():
                self.model.validation_step(self.prepare_batch(batch))
            self.val_batch_idx += 1

Wu-Zongyu · 2023-12-30T22:00:50Z

I think it should be

self.optim.zero_grad()
loss.backward()
with torch.no_grad():
    self.optim.step()
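
For reference, a minimal standalone sketch of this ordering. It is not d2l code: the toy linear model, the random data, and the learning rate are made up for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(3, 1)  # hypothetical toy model, stands in for self.model
optim = torch.optim.SGD(model.parameters(), lr=0.1)
X, y = torch.randn(8, 3), torch.randn(8, 1)  # made-up batch

# Snapshot the parameters so the update can be checked afterwards.
before = [p.detach().clone() for p in model.parameters()]

loss = nn.functional.mse_loss(model(X), y)  # forward pass records the graph
optim.zero_grad()
loss.backward()  # gradients are computed outside no_grad
with torch.no_grad():
    optim.step()  # the parameter update itself is not recorded

grads_present = all(p.grad is not None for p in model.parameters())
params_changed = any(
    not torch.equal(b, p) for b, p in zip(before, model.parameters())
)
print(grads_present, params_changed)
```

As far as I know, the built-in torch.optim optimizers already disable gradient tracking inside step(), so the explicit block mostly matters for hand-written update rules that modify parameters in place.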

caydenwei · 2024-01-01T14:16:30Z

I think it should be

self.optim.zero_grad()
loss.backward()
with torch.no_grad():
    self.optim.step()

Apologies for not being clear earlier. I'm uncertain about the correctness of a specific part of the code found at https://github.com/d2l-ai/d2l-en/blob/master/d2l/torch.py. Here is the original code:

    def fit_epoch(self):
        """Defined in :numref:`sec_linear_scratch`"""
        self.model.train()
        for batch in self.train_dataloader:
            loss = self.model.training_step(self.prepare_batch(batch))
            self.optim.zero_grad()
            with torch.no_grad():
                loss.backward()
                if self.gradient_clip_val > 0:  # To be discussed later
                    self.clip_gradients(self.gradient_clip_val, self.model)
                self.optim.step()
            self.train_batch_idx += 1
        if self.val_dataloader is None:
            return
        self.model.eval()
        for batch in self.val_dataloader:
            with torch.no_grad():
                self.model.validation_step(self.prepare_batch(batch))
            self.val_batch_idx += 1

Brianwind · 2024-10-29T17:35:12Z

Same question. I'm confused about why the code still works in the examples (LeNet, etc.).
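
One likely explanation, as far as I understand autograd: torch.no_grad() only stops new operations from being recorded. In fit_epoch the forward pass (training_step) runs before the block, so the graph already exists and loss.backward() can still traverse it and fill in .grad. A small standalone check, not d2l code, with made-up tensor values:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Forward pass runs with grad mode enabled, so the graph is recorded.
loss = (x ** 2).sum()

with torch.no_grad():
    loss.backward()  # traverses the graph that was recorded above
    inside = x * 2   # new operations inside the block are not tracked

# Contrast: a forward pass inside no_grad produces an untracked result.
with torch.no_grad():
    untracked = (x ** 2).sum()

print(x.grad)                   # tensor([2., 4., 6.])
print(inside.requires_grad)     # False
print(untracked.requires_grad)  # False
```

So the block looks unusual but should not break training. Presumably it is there so that in-place parameter updates, such as gradient clipping or a from-scratch optimizer step, are not tracked by autograd.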
