
If action exists in batch during eval / select_actions, ACT crashes #582

Open · genemerewether opened this issue Dec 16, 2024 · 2 comments
```python
if self.config.use_vae and self.training:
    assert (
        "action" in batch
    ), "actions must be provided when using the variational objective in training mode."

batch_size = (
    batch["observation.images"]
    if "observation.images" in batch
    else batch["observation.environment_state"]
).shape[0]

# Prepare the latent for input to the transformer encoder.
if self.config.use_vae and "action" in batch:
```

Should this last line be `if self.config.use_vae and self.training and "action" in batch:`?

Otherwise you get:

```
    vae_encoder_input = torch.cat(vae_encoder_input, axis=1)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Tensors must have same number of dimensions: got 3 and 2
```
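A minimal sketch of the mismatch (with hypothetical tensor shapes, not the actual ACT shapes): the `assert` only fires when `self.training` is `True`, but the VAE branch runs whenever `"action"` is present in the batch. At eval time the batch can carry an action tensor without the chunk dimension, so the concatenation of a 3-D embedding with a 2-D one fails. Adding `self.training` to the branch condition skips the VAE encoder entirely outside training:

```python
import torch

# Hypothetical shapes: a 3-D embedding as produced during training,
# and a 2-D eval-time action embedding missing the chunk dimension.
cls_embed = torch.randn(2, 1, 10)   # (batch, seq, dim)
action_embed = torch.randn(2, 10)   # (batch, dim) -- no chunk dimension

err_msg = None
try:
    torch.cat([cls_embed, action_embed], dim=1)
except RuntimeError as e:
    err_msg = str(e)  # "Tensors must have same number of dimensions: ..."
print(err_msg)

# With the suggested `self.training` guard, the VAE branch is skipped
# outside training, so the mismatched concatenation is never attempted.
training = False
use_vae = True
batch = {"action": action_embed}
vae_branch_ran = False
if use_vae and training and "action" in batch:
    vae_branch_ran = True  # never reached at eval time
print(vae_branch_ran)
```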
@genemerewether (Author) commented:

Diffusion policy also doesn't support `select_actions` being called on a batch that has an `"action"` key. The check below will never compare true in that case.

```python
        self._queues = populate_queues(self._queues, batch)

        if len(self._queues["action"]) == 0:
```
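A hedged sketch of the reported behavior, using a simplified stand-in for lerobot's `populate_queues` (the real helper may differ): when the incoming batch contains an `"action"` key, the action queue is refilled on every call, so the `len(self._queues["action"]) == 0` check never triggers and the policy never runs its own action generation:

```python
from collections import deque

import torch

def populate_queues(queues, batch):
    # Simplified stand-in: push any batch key that has a matching queue.
    for key in batch:
        if key in queues:
            queues[key].append(batch[key])
    return queues

queues = {"observation.state": deque(maxlen=2), "action": deque(maxlen=2)}
batch = {"observation.state": torch.randn(1, 5), "action": torch.randn(1, 5)}

queues = populate_queues(queues, batch)
generated = False
if len(queues["action"]) == 0:  # never true when the batch carries "action"
    generated = True            # model inference would happen here
print(generated)
```

Because the queue is always non-empty, the branch that would invoke the diffusion model is dead code whenever the caller passes an `"action"` key.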

@reproduce-bot commented:

The following script is generated by AI Agent to help reproduce the issue:

```python
# /reproduce.py
import torch
from copy import deepcopy
from einops import repeat

class MockConfig:
    def __init__(self, use_vae=True, latent_dim=10):
        self.use_vae = use_vae
        self.latent_dim = latent_dim

class MockPolicy:
    def __init__(self, config, use_robot_state=True):
        self.config = config
        self.training = True
        self.use_robot_state = use_robot_state
        self.vae_encoder_cls_embed = torch.nn.Embedding(1, 10)
        self.vae_encoder_robot_state_input_proj = torch.nn.Linear(5, 10)
        self.vae_encoder_action_input_proj = torch.nn.Linear(5, 10)
        self.vae_encoder_pos_enc = torch.randn(1, 12, 10)
        self.vae_encoder_latent_output_proj = torch.nn.Linear(10, 20)
        self.vae_encoder = torch.nn.TransformerEncoder(
            torch.nn.TransformerEncoderLayer(d_model=10, nhead=2), num_layers=1
        )

    def forward(self, batch):
        if self.config.use_vae and self.training:
            assert "action" in batch, "actions must be provided when using the variational objective in training mode."

        batch_size = batch["observation.images"].shape[0]

        if self.config.use_vae and "action" in batch:
            cls_embed = repeat(self.vae_encoder_cls_embed.weight, "1 d -> b 1 d", b=batch_size)
            if self.use_robot_state:
                robot_state_embed = self.vae_encoder_robot_state_input_proj(batch["observation.state"]).unsqueeze(1)
            action_embed = self.vae_encoder_action_input_proj(batch["action"])

            if self.use_robot_state:
                vae_encoder_input = [cls_embed, robot_state_embed, action_embed]
            else:
                vae_encoder_input = [cls_embed, action_embed]
            try:
                vae_encoder_input = torch.cat(vae_encoder_input, axis=1)
            except RuntimeError as e:
                raise AssertionError(e)

            pos_embed = self.vae_encoder_pos_enc.clone().detach()
            cls_joint_is_pad = torch.full((batch_size, 2 if self.use_robot_state else 1), False, device=batch["observation.state"].device)
            key_padding_mask = torch.cat([cls_joint_is_pad, batch["action_is_pad"]], axis=1)
            cls_token_out = self.vae_encoder(vae_encoder_input.permute(1, 0, 2))[0]
            latent_pdf_params = self.vae_encoder_latent_output_proj(cls_token_out)
            mu = latent_pdf_params[:, :self.config.latent_dim]
            log_sigma_x2 = latent_pdf_params[:, self.config.latent_dim:]
            latent_sample = mu + log_sigma_x2.div(2).exp() * torch.randn_like(mu)
        else:
            mu, log_sigma_x2 = None, None
            latent_sample = torch.zeros([batch_size, self.config.latent_dim], dtype=torch.float32).to(batch["observation.state"].device)

def test_reproduce():
    config = MockConfig()
    policy = MockPolicy(config)

    batch = {
        "observation.images": torch.randn(2, 3, 3, 64, 64),
        "observation.state": torch.randn(2, 5),
        "action": torch.randn(2, 4, 5),
        "action_is_pad": torch.tensor([[False, False, False, False], [False, False, False, False]])
    }

    # Modify batch to simulate the issue condition
    batch["action"] = torch.randn(2, 5)  # Incorrect dimensions to trigger the issue

    try:
        policy.forward(batch)
    except RuntimeError as e:
        raise AssertionError(e)

if __name__ == "__main__":
    test_reproduce()
```

How to run:

```shell
python3 /reproduce.py
```

Expected Result:

```
/usr/local/lib/python3.10/site-packages/torch/nn/modules/transformer.py:379: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
  warnings.warn(
Traceback (most recent call last):
  File "/reproduce.py", line 41, in forward
    vae_encoder_input = torch.cat(vae_encoder_input, axis=1)
RuntimeError: Tensors must have same number of dimensions: got 3 and 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/reproduce.py", line 77, in <module>
    test_reproduce()
  File "/reproduce.py", line 72, in test_reproduce
    policy.forward(batch)
  File "/reproduce.py", line 43, in forward
    raise AssertionError(e)
AssertionError: Tensors must have same number of dimensions: got 3 and 2
```

Thank you for your valuable contribution to this project and we appreciate your feedback! Please respond with an emoji if you find this script helpful. Feel free to comment below if any improvements are needed.

Best regards from an AI Agent!
