abbyvansoest/maxent

null

abbyvansoest/maxent.json
{
"createdAt": "2019-03-07T13:22:05Z",
"defaultBranch": "master",
"description": null,
"fullName": "abbyvansoest/maxent",
"homepage": null,
"language": "Python",
"name": "maxent",
"pushedAt": "2019-05-30T14:17:36Z",
"stargazersCount": 13,
"topics": [],
"updatedAt": "2025-06-10T16:48:25Z",
"url": "https://github.com/abbyvansoest/maxent"
}

This is the experimental repository for entropy-based exploration: the MAXENT algorithm.

MAXENT is an algorithm that encourages efficient discovery of an unknown state space in reinforcement learning problems. It is described in the paper: https://arxiv.org/abs/1812.02690
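To give a feel for the approach, below is a minimal sketch of the exploration loop from the paper: iteratively estimate the state distribution induced by a mixture of past policies, reward rarely visited states, and add the newly trained policy to the mixture. The toy chain environment, `rollout`, and `solve_rl` here are illustrative stand-ins, not this repository's API or its actual RL subroutine.

```python
# Hedged sketch of the MaxEnt exploration loop (https://arxiv.org/abs/1812.02690).
# All names and constants below are placeholders for illustration only.
import numpy as np

N_STATES = 20        # discretized state space (toy 1-D chain)
N_EPOCHS = 10        # number of mixture components to learn
N_ROLLOUTS = 200     # rollouts used to estimate the state distribution
HORIZON = 30
EPS = 1e-3           # smoothing so log() is defined on unvisited states

def rollout(policy, rng):
    """Run one episode on a toy chain MDP; policy[s] is P(move right | s)."""
    s, visits = N_STATES // 2, np.zeros(N_STATES)
    for _ in range(HORIZON):
        visits[s] += 1
        step = 1 if rng.random() < policy[s] else -1
        s = int(np.clip(s + step, 0, N_STATES - 1))
    return visits

def estimate_distribution(mixture, rng):
    """Empirical state distribution of a uniform mixture over past policies."""
    counts = np.zeros(N_STATES)
    for _ in range(N_ROLLOUTS):
        policy = mixture[rng.integers(len(mixture))]
        counts += rollout(policy, rng)
    return counts / counts.sum()

def solve_rl(reward, rng):
    """Stand-in for the RL subroutine (the repo uses deep RL learners instead).
    Here we simply bias the policy toward the side with more reward mass."""
    right_mass = reward[N_STATES // 2:].sum()
    left_mass = reward[:N_STATES // 2].sum()
    return np.full(N_STATES, right_mass / (right_mass + left_mass))

def entropy(d):
    return -np.sum(d * np.log(d + EPS))

rng = np.random.default_rng(0)
mixture = [np.full(N_STATES, 0.5)]           # start from a uniform random policy
for epoch in range(N_EPOCHS):
    d = estimate_distribution(mixture, rng)  # state distribution of the mixture
    reward = -np.log(d + EPS)                # reward rare / unvisited states
    mixture.append(solve_rl(reward, rng))    # grow the policy mixture
    print(f"epoch {epoch}: entropy of induced distribution = {entropy(d):.3f}")
```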

This repo contains the experimental code for several OpenAI Gym MuJoCo environments: Swimmer, Ant, HalfCheetah, Walker2d, and Humanoid. The stable code for two classic control tasks lives in a separate repo: https://github.com/abbyvansoest/maxent_base

All implementations use a forked copy of OpenAI Gym, available at https://github.com/abbyvansoest/gym-fork. The fork changes the graphics used for rendering and the behavior of state resetting.

Note that this code is memory-intensive and is set up to run on a specialized deep-learning machine. To reduce the dimensionality (and the memory footprint), change the discretization setup in swimmer_utils.py.
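The sketch below illustrates the kind of state discretization such a setup configures: each continuous state dimension is bucketed into a fixed number of bins, and visitation counts are stored in a table whose size grows exponentially with the number of tracked dimensions. The bounds, bin count, and function names are placeholders, not the values or helpers actually defined in swimmer_utils.py.

```python
# Illustrative discretization sketch; constants are placeholders, not repo values.
import numpy as np

NUM_BINS = 10                              # fewer bins => smaller count table => less memory
STATE_LOW = np.array([-1.0, -1.0, -2.0])   # per-dimension lower bounds (illustrative)
STATE_HIGH = np.array([1.0, 1.0, 2.0])     # per-dimension upper bounds (illustrative)

def discretize(state):
    """Map a continuous state to a tuple of bin indices."""
    scaled = (state - STATE_LOW) / (STATE_HIGH - STATE_LOW)
    idx = np.floor(scaled * NUM_BINS).astype(int)
    return tuple(np.clip(idx, 0, NUM_BINS - 1))

# The visitation histogram has NUM_BINS ** num_dims entries, which is why the
# full MuJoCo state spaces make the experiments memory-intensive.
counts = np.zeros((NUM_BINS,) * len(STATE_LOW))
counts[discretize(np.array([0.3, -0.2, 1.5]))] += 1
```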

Dependencies: TensorFlow, OpenAI Gym (with a MuJoCo license), matplotlib, numpy, OpenAI Spinning Up, scipy

See the directory for each environment for the commands needed to run and recreate the experiments.