
If action exists in batch during eval / select_actions, ACT crashes #582

Open · genemerewether opened this issue Dec 16, 2024 · 2 comments
```python
if self.config.use_vae and self.training:
    assert (
        "action" in batch
    ), "actions must be provided when using the variational objective in training mode."

batch_size = (
    batch["observation.images"]
    if "observation.images" in batch
    else batch["observation.environment_state"]
).shape[0]

# Prepare the latent for input to the transformer encoder.
if self.config.use_vae and "action" in batch:
```

Should this last line be `if self.config.use_vae and self.training and "action" in batch:`?

Otherwise you get:

```
    vae_encoder_input = torch.cat(vae_encoder_input, axis=1)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Tensors must have same number of dimensions: got 3 and 2
```
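A minimal sketch of the mismatch (with hypothetical tensor shapes, not the actual ACT shapes): the `assert` only fires when `self.training` is `True`, but the VAE branch runs whenever `"action"` is present in the batch. At eval time the batch can carry an action tensor without the chunk dimension, so the concatenation of a 3-D embedding with a 2-D one fails. Adding `self.training` to the branch condition skips the VAE encoder entirely outside training:

```python
import torch

# Hypothetical shapes: a 3-D embedding as produced during training,
# and a 2-D eval-time action embedding missing the chunk dimension.
cls_embed = torch.randn(2, 1, 10)   # (batch, seq, dim)
action_embed = torch.randn(2, 10)   # (batch, dim) -- no chunk dimension

err_msg = None
try:
    torch.cat([cls_embed, action_embed], dim=1)
except RuntimeError as e:
    err_msg = str(e)  # "Tensors must have same number of dimensions: ..."
print(err_msg)

# With the suggested `self.training` guard, the VAE branch is skipped
# outside training, so the mismatched concatenation is never attempted.
training = False
use_vae = True
batch = {"action": action_embed}
vae_branch_ran = False
if use_vae and training and "action" in batch:
    vae_branch_ran = True  # never reached at eval time
print(vae_branch_ran)
```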
@genemerewether (Author) commented:

Diffusion policy also doesn't support `select_actions` being called on a batch that has an `"action"` key. The check below will never compare true in that case.

```python
        self._queues = populate_queues(self._queues, batch)

        if len(self._queues["action"]) == 0:
```
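A hedged sketch of the reported behavior, using a simplified stand-in for lerobot's `populate_queues` (the real helper may differ): when the incoming batch contains an `"action"` key, the action queue is refilled on every call, so the `len(self._queues["action"]) == 0` check never triggers and the policy never runs its own action generation:

```python
from collections import deque

import torch

def populate_queues(queues, batch):
    # Simplified stand-in: push any batch key that has a matching queue.
    for key in batch:
        if key in queues:
            queues[key].append(batch[key])
    return queues

queues = {"observation.state": deque(maxlen=2), "action": deque(maxlen=2)}
batch = {"observation.state": torch.randn(1, 5), "action": torch.randn(1, 5)}

queues = populate_queues(queues, batch)
generated = False
if len(queues["action"]) == 0:  # never true when the batch carries "action"
    generated = True            # model inference would happen here
print(generated)
```

Because the queue is always non-empty, the branch that would invoke the diffusion model is dead code whenever the caller passes an `"action"` key.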

@reproduce-bot commented:

The following script is generated by AI Agent to help reproduce the issue:

```python
# /reproduce.py
import torch
from copy import deepcopy
from einops import repeat

class MockConfig:
    def __init__(self, use_vae=True, latent_dim=10):
        self.use_vae = use_vae
        self.latent_dim = latent_dim

class MockPolicy:
    def __init__(self, config, use_robot_state=True):
        self.config = config
        self.training = True
        self.use_robot_state = use_robot_state
        self.vae_encoder_cls_embed = torch.nn.Embedding(1, 10)
        self.vae_encoder_robot_state_input_proj = torch.nn.Linear(5, 10)
        self.vae_encoder_action_input_proj = torch.nn.Linear(5, 10)
        self.vae_encoder_pos_enc = torch.randn(1, 12, 10)
        self.vae_encoder_latent_output_proj = torch.nn.Linear(10, 20)
        self.vae_encoder = torch.nn.TransformerEncoder(
            torch.nn.TransformerEncoderLayer(d_model=10, nhead=2), num_layers=1
        )

    def forward(self, batch):
        if self.config.use_vae and self.training:
            assert "action" in batch, "actions must be provided when using the variational objective in training mode."

        batch_size = batch["observation.images"].shape[0]

        if self.config.use_vae and "action" in batch:
            cls_embed = repeat(self.vae_encoder_cls_embed.weight, "1 d -> b 1 d", b=batch_size)
            if self.use_robot_state:
                robot_state_embed = self.vae_encoder_robot_state_input_proj(batch["observation.state"]).unsqueeze(1)
            action_embed = self.vae_encoder_action_input_proj(batch["action"])

            if self.use_robot_state:
                vae_encoder_input = [cls_embed, robot_state_embed, action_embed]
            else:
                vae_encoder_input = [cls_embed, action_embed]
            try:
                vae_encoder_input = torch.cat(vae_encoder_input, axis=1)
            except RuntimeError as e:
                raise AssertionError(e)

            pos_embed = self.vae_encoder_pos_enc.clone().detach()
            cls_joint_is_pad = torch.full((batch_size, 2 if self.use_robot_state else 1), False, device=batch["observation.state"].device)
            key_padding_mask = torch.cat([cls_joint_is_pad, batch["action_is_pad"]], axis=1)
            cls_token_out = self.vae_encoder(vae_encoder_input.permute(1, 0, 2))[0]
            latent_pdf_params = self.vae_encoder_latent_output_proj(cls_token_out)
            mu = latent_pdf_params[:, :self.config.latent_dim]
            log_sigma_x2 = latent_pdf_params[:, self.config.latent_dim:]
            latent_sample = mu + log_sigma_x2.div(2).exp() * torch.randn_like(mu)
        else:
            mu, log_sigma_x2 = None, None
            latent_sample = torch.zeros([batch_size, self.config.latent_dim], dtype=torch.float32).to(batch["observation.state"].device)

def test_reproduce():
    config = MockConfig()
    policy = MockPolicy(config)

    batch = {
        "observation.images": torch.randn(2, 3, 3, 64, 64),
        "observation.state": torch.randn(2, 5),
        "action": torch.randn(2, 4, 5),
        "action_is_pad": torch.tensor([[False, False, False, False], [False, False, False, False]])
    }

    # Modify batch to simulate the issue condition
    batch["action"] = torch.randn(2, 5)  # Incorrect dimensions to trigger the issue

    try:
        policy.forward(batch)
    except RuntimeError as e:
        raise AssertionError(e)

if __name__ == "__main__":
    test_reproduce()
```

How to run:

```shell
python3 /reproduce.py
```

Expected Result:

```
/usr/local/lib/python3.10/site-packages/torch/nn/modules/transformer.py:379: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
  warnings.warn(
Traceback (most recent call last):
  File "/reproduce.py", line 41, in forward
    vae_encoder_input = torch.cat(vae_encoder_input, axis=1)
RuntimeError: Tensors must have same number of dimensions: got 3 and 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/reproduce.py", line 77, in <module>
    test_reproduce()
  File "/reproduce.py", line 72, in test_reproduce
    policy.forward(batch)
  File "/reproduce.py", line 43, in forward
    raise AssertionError(e)
AssertionError: Tensors must have same number of dimensions: got 3 and 2
```

Thank you for your valuable contribution to this project and we appreciate your feedback! Please respond with an emoji if you find this script helpful. Feel free to comment below if any improvements are needed.

Best regards from an AI Agent!
