Agile Autonomous Quadrotor Flight with Safety Constraints
For the IROS 2022 Safe Robot Learning Competition
Work done with Dr. M Vidyasagar FRS (IIT Hyderabad) and Dr. Srikanth Saripalli (Texas A&M University).
The goal is to achieve minimum-time flight for a Crazyflie quadrotor navigating through an environment with gates and obstacles. The competition injects additional perturbations (shifted gate positions, wind gusts), making robust control essential. We explored both classical and learning-based approaches in AirSim before deploying on physical hardware.
AirSim Experiments
- Stereo matching and obstacle detection for environmental awareness
- Reinforcement learning approaches tested: Deep Q-Network (DQN) and Proximal Policy Optimization (PPO)
- Safety policy: stop and wait whenever an obstacle comes within 0.2 m (see the sketch after this list)
- Safe flight distance achieved: approximately 46.1 m at conservative speeds
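In AirSim, this safety policy reduces to a distance gate on the stereo depth output. A minimal sketch, assuming a NaN-masked depth map and a velocity-level command interface; the helper names are illustrative, not the competition code:

```python
import numpy as np

SAFETY_RADIUS_M = 0.2  # stop-and-wait threshold from the safety policy

def min_obstacle_distance(depth_map: np.ndarray) -> float:
    """Nearest valid range (m) in a stereo depth map; invalid pixels assumed NaN."""
    return float(np.nanmin(depth_map))

def safe_velocity_command(depth_map: np.ndarray, v_des: np.ndarray) -> np.ndarray:
    """Forward the desired velocity, or command a hover if an obstacle is too close."""
    if min_obstacle_distance(depth_map) < SAFETY_RADIUS_M:
        return np.zeros(3)  # stop and wait until the path clears
    return v_des
```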
PPO Formulation
Action Space
Each motor \(i \in \{1, 2, 3, 4\}\) has a normalized thrust command \(f_i \in [0, 1]\). The action space is discretized into six thrust levels per motor, yielding \(6^4 = 1296\) discrete actions across the four rotors.
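A flat action index can be decoded into the four per-motor thrusts via its base-6 digits. A minimal sketch, assuming the six levels are evenly spaced over \([0, 1]\) (the exact level values are not specified here):

```python
import numpy as np

N_LEVELS = 6
LEVELS = np.linspace(0.0, 1.0, N_LEVELS)  # assumed uniform spacing: 0.0, 0.2, ..., 1.0

def decode_action(action: int) -> np.ndarray:
    """Map a flat index in [0, 6**4) to per-motor thrusts via its base-6 digits."""
    assert 0 <= action < N_LEVELS ** 4
    digits = [(action // N_LEVELS ** i) % N_LEVELS for i in range(4)]
    return LEVELS[digits]  # shape (4,): normalized thrust f_i for each rotor
```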
Observation Space
The 13-dimensional observation vector:
| Component | Dimension | Description |
|---|---|---|
| \(\mathbf{p}\) | \(\mathbb{R}^3\) | Position (x, y, z) |
| \(\mathbf{v}\) | \(\mathbb{R}^3\) | Linear velocity |
| \(\mathbf{a}\) | \(\mathbb{R}^3\) | Linear acceleration |
| \(\psi\) | \(\mathbb{R}\) | Yaw angle (radians) |
| \(\boldsymbol{\omega}\) | \(\mathbb{R}^3\) | Angular velocity |
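A minimal sketch of assembling this vector from simulator state; only the component layout and the 13-dimensional total come from the table above:

```python
import numpy as np

def build_observation(p, v, a, yaw, omega) -> np.ndarray:
    """Stack position, velocity, acceleration, yaw and angular velocity into R^13."""
    obs = np.concatenate([p, v, a, [yaw], omega]).astype(np.float32)
    assert obs.shape == (13,)  # 3 + 3 + 3 + 1 + 3
    return obs
```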
Reward Function
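A reward of roughly the following shape matches the description below; the exponential proximity bonus and the collision indicator are assumed functional forms, with \(\mathbf{p}_{\text{goal}}\) the current gate target:

\[
r_t = \exp\!\left(-\left\|\mathbf{p}_t - \mathbf{p}_{\text{goal}}\right\|\right) - C_\theta \left\|\boldsymbol{\theta}_t\right\| - C_\omega \left\|\boldsymbol{\omega}_t\right\| - 5\,\mathbb{1}[\text{collision}]
\]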
The first term provides a proximity bonus that saturates at zero far from the target and reaches a maximum of 1 at the goal. The penalty terms \(C_\theta \|\boldsymbol{\theta}\|\) and \(C_\omega \|\boldsymbol{\omega}\|\) discourage aggressive attitude changes and unstable spinning, respectively. A collision penalty of \(-5\) is applied upon contact with obstacles.
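A minimal Python sketch of the same shaping, with \(C_\theta = C_\omega = 0.1\) chosen purely for illustration:

```python
import numpy as np

C_THETA, C_OMEGA = 0.1, 0.1  # illustrative weights; the trained values are not given

def reward(p, p_goal, theta, omega, collided: bool) -> float:
    """Proximity bonus in (0, 1], attitude and spin penalties, -5 on collision."""
    r = np.exp(-np.linalg.norm(p - p_goal))  # saturates to 0 far away, 1 at the goal
    r -= C_THETA * np.linalg.norm(theta)     # penalize aggressive attitude changes
    r -= C_OMEGA * np.linalg.norm(omega)     # penalize unstable spinning
    if collided:
        r -= 5.0                             # collision penalty from the text
    return float(r)
```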
Training Configuration
| Hyperparameter | Value |
|---|---|
| Discount factor \(\gamma\) | 0.99 |
| Learning rate \(\alpha\) | \(3 \times 10^{-4}\) |
| Clip range \(\epsilon\) | 0.2 |
| GAE \(\lambda\) | 0.95 |
| Batch size | 64 rollout steps |
| Network | 2 hidden layers, 64 units each, ReLU |
| Training seeds | 50 random seeds, 500k steps each |
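These settings map directly onto a standard PPO implementation. A sketch using Stable-Baselines3, which is an assumption (the write-up does not name a library), with `make_airsim_env` a hypothetical environment factory:

```python
import torch.nn as nn
from stable_baselines3 import PPO

env = make_airsim_env()  # hypothetical factory wrapping the AirSim task as a Gym env

model = PPO(
    "MlpPolicy",
    env,
    gamma=0.99,             # discount factor
    learning_rate=3e-4,
    clip_range=0.2,         # PPO clipping epsilon
    gae_lambda=0.95,
    batch_size=64,          # minibatch size; see note below
    policy_kwargs=dict(net_arch=[64, 64], activation_fn=nn.ReLU),
    seed=0,                 # the experiments repeat this over 50 random seeds
)
model.learn(total_timesteps=500_000)
```

The table's "64 rollout steps" is read here as the minibatch size; if it instead refers to the rollout length, it would map to `n_steps`, left at its default in this sketch.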
Results (PPO, 50 seeds)
| Level | Success Rate | Mean Time (sec) |
|---|---|---|
| 0 (non-adaptive, fixed gates) | 100% | 4.5 |
| 1 (adaptive, randomized gates + wind) | 84% | 5.3 |
At Level 0 (fixed gate positions, no wind), the PPO agent achieves 100% success with a mean traversal time of 4.5 seconds. At Level 1, where gate positions are randomized and wind gusts are injected, the success rate drops to 84% and the mean time rises to 5.3 seconds, reflecting the corrective maneuvers needed under perturbation.
See this repository for more details.