
A collection of reinforcement learning algorithms implemented from scratch. JointGYM is designed for the effortless integration of these algorithms into various applications.


dwipddalal/joint-gym



Joint-Gym is a PyBullet-based wrapper for reinforcement learning tasks.

Building the simulation environment

  • We used PyBullet to build a physics-based simulation environment.

PyBullet is a physics engine for simulating rigid-body dynamics with contacts, and it is widely used in robotics and machine learning research. It is written in C++ but provides a Python API.

The Python API can be used to create, simulate, and control a PyBullet simulation. It provides functions for creating and manipulating objects, applying forces and torques, and retrieving information about the simulation state. It also integrates with Python libraries such as OpenAI Gym, TensorFlow, and PyTorch for reinforcement learning tasks.

In summary, PyBullet is a powerful physics engine that can be used to simulate robotic arms and other multi-body systems, and its Python API allows it to be easily integrated with Python-based machine-learning libraries.
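
Below is a minimal sketch of how the API fits together. The KUKA arm URDF ships with the optional pybullet_data package; the joint index and position target are arbitrary illustrative values.

```python
import pybullet as p
import pybullet_data

# Connect to the physics server (p.GUI opens a visualizer; p.DIRECT runs headless).
client = p.connect(p.DIRECT)
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)

# Load a ground plane and a robot arm bundled with pybullet_data.
plane_id = p.loadURDF("plane.urdf")
robot_id = p.loadURDF("kuka_iiwa/model.urdf", useFixedBase=True)

# Command joint 0 toward a position target, then step the simulation.
p.setJointMotorControl2(robot_id, jointIndex=0,
                        controlMode=p.POSITION_CONTROL,
                        targetPosition=0.5)
for _ in range(240):  # one simulated second at the default 240 Hz timestep
    p.stepSimulation()

# Read back the joint state: position, velocity, reaction forces, motor torque.
joint_pos, joint_vel, _, _ = p.getJointState(robot_id, 0)
print(joint_pos, joint_vel)

p.disconnect()
```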

Some experiments we ran to check the robustness of the PyBullet environment:

  • Controlling a 3R (three-revolute-joint) robot in PyBullet

Concepts of Reinforcement Learning

Two major classes of algorithms:

  • Model-based algorithms
  • Model-free algorithms

Two classes of learning:

  • Online Learning
  • Offline Learning

Two classes of policy:

  • On-policy
  • Off-policy

Q Function

The action-value function, also known as the Q-function, is a fundamental concept in reinforcement learning that maps a state-action pair to the expected total reward of taking that action in that state and following a specific policy thereafter.

Formally, the action-value function is defined as:

$Q(s, a) = \mathbb{E}[R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \ldots | S_t = s, A_t = a]$

where $s$ is the current state, $a$ is the action taken in that state, $R_t$ is the reward received at time $t$, $\gamma$ is a discount factor that determines the importance of future rewards, and $\mathbb{E}[\cdot]$ is the expectation operator.

The Q-function represents the quality of taking a particular action in a specific state. By computing the Q-values for all actions in each state, an agent can determine the best action to take in each state and thus optimize its behavior to maximize its expected total reward.
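
Equivalently, the Q-function can be written recursively (the Bellman equation), which is the form that value-based methods such as Q-learning exploit:

$Q^{\pi}(s, a) = \mathbb{E}[R_{t+1} + \gamma \, Q^{\pi}(S_{t+1}, A_{t+1}) \mid S_t = s, A_t = a]$

and the greedy action in a state is $a^{*} = \arg\max_{a} Q(s, a)$.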

Objective in reinforcement learning

In RL, the objective is to learn an approximate function that maps each state-action pair to its expected return.

To create a reinforcement learning model for solving the inverse kinematics problem of a 2R robotic arm, we can use a deep reinforcement learning algorithm such as deep Q-learning.

Steps to build an RL model for this case:

Define the state space: The state space is the set of all possible states that the robot arm can be in. In this case, the state space can consist of the initial and final positions of the end-effector, as well as the angles of the two joints.
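
As an illustrative sketch of such a state encoding (the flat-vector layout and the variable names are our assumptions, not fixed by the repo):

```python
import numpy as np

def make_state(theta1, theta2, target_x, target_y):
    # State: the two joint angles plus the target end-effector position.
    # A flat float vector keeps it directly usable as network input.
    return np.array([theta1, theta2, target_x, target_y], dtype=np.float32)
```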

Define the action space: The action space is the set of all possible actions that the robot can take. In this case, the action space can consist of the change in angle for each of the two joints.
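
One common way to discretize this action space (the step size and the 3 x 3 grid are illustrative assumptions):

```python
import itertools

DELTA = 0.05  # radians per step (illustrative choice)

# Each action nudges each joint by -DELTA, 0, or +DELTA,
# giving 3 * 3 = 9 discrete actions for the two joints.
ACTIONS = list(itertools.product((-DELTA, 0.0, DELTA), repeat=2))

def apply_action(theta1, theta2, action_index):
    d1, d2 = ACTIONS[action_index]
    return theta1 + d1, theta2 + d2
```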

Define the reward function: The reward function is used to evaluate the goodness of a particular action taken in a given state. In this case, we can define the reward as the negative Euclidean distance between the current position of the end-effector and the target position.
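
A sketch of that reward, using the standard planar 2R forward kinematics (the link lengths l1 and l2 are assumed parameters):

```python
import numpy as np

def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    # End-effector (x, y) of a planar 2R arm (textbook formula).
    x = l1 * np.cos(theta1) + l2 * np.cos(theta1 + theta2)
    y = l1 * np.sin(theta1) + l2 * np.sin(theta1 + theta2)
    return np.array([x, y])

def reward(theta1, theta2, target):
    # Negative Euclidean distance: values closer to 0 are better.
    return -np.linalg.norm(forward_kinematics(theta1, theta2) - target)
```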

Train the model: We can use deep Q-learning to train the model by iteratively updating the Q-values for each state-action pair. The Q-value represents the expected future reward for taking a particular action in a given state.
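
The core of such an update, sketched with PyTorch (the network size, learning rate, and discount factor are illustrative; a full DQN would add experience replay and a target network):

```python
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 9))  # 4-dim state, 9 actions
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99

def dqn_update(state, action, r, next_state, done):
    # One gradient step toward the TD target: r + gamma * max_a' Q(s', a').
    q_pred = q_net(state)[action]
    with torch.no_grad():
        q_target = r + gamma * q_net(next_state).max() * (1.0 - float(done))
    loss = (q_pred - q_target) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```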

Test the model: Once the model is trained, we can test it by inputting a new initial and final position for the end-effector and having the model output the optimal angles for the two joints to reach the final position.
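
Testing then amounts to a greedy rollout of the learned Q-network, for example (reusing the hypothetical helpers sketched above):

```python
import torch

def solve_ik(target, steps=200):
    # Greedily follow argmax-Q actions from a zero pose toward the target.
    theta1, theta2 = 0.0, 0.0
    for _ in range(steps):
        state = torch.as_tensor(make_state(theta1, theta2, *target))
        action = int(q_net(state).argmax())
        theta1, theta2 = apply_action(theta1, theta2, action)
    return theta1, theta2
```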

Analogy with Deep learning

An episode in RL plays a role analogous to an epoch in deep learning.

Results

https://drive.google.com/file/d/1lKRV3yjGzJWdM20cQ53CsYYHYlah5ED9/view?usp=sharing

Code Style

The code follows the PEP 8 style guide.
