Run
After dependencies are installed, you can run the algorithm file directly.
python abcdrl/dqn_torch.py \
--trainer.env-id CartPole-v1 \
--trainer.total_timesteps 500000 \
--trainer.gamma 0.99 \
--trainer.learning-rate 2.5e-4 \
--trainer.capture-video \
--logger.track \
--logger.wandb-project-name abcdrl \
--logger.wandb-tags tag1 tag2
- The connector in an option name can be _ or -, so --trainer.total_timesteps and --trainer.total-timesteps are equivalent.
- The learning rate 2.5e-4 can also be written as 0.00025.
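The two notes on the command can be checked with a short sketch. The helper `normalize_flag` is hypothetical and only illustrates the idea that `_` and `-` name the same option; it is not taken from abcdrl and makes no claim about how the script parses its arguments.

```python
def normalize_flag(flag: str) -> str:
    """Hypothetical helper: treat '_' and '-' in an option name as the same connector."""
    prefix = "--" if flag.startswith("--") else ""
    # Leave the leading dashes alone and rewrite only the option name.
    return prefix + flag[len(prefix):].replace("_", "-")


# Both spellings map to the same canonical option name.
assert normalize_flag("--trainer.total_timesteps") == normalize_flag("--trainer.total-timesteps")

# Scientific notation and the plain decimal parse to the same float.
assert float("2.5e-4") == float("0.00025")
```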
Set specific GPU device
- Using gpu:0 and gpu:1
CUDA_VISIBLE_DEVICES="0,1" python abcdrl/dqn_torch.py
- Using gpu:1
CUDA_VISIBLE_DEVICES="1" python abcdrl/dqn_torch.py
- Using cpu only
python abcdrl/dqn_torch.py --trainer.no-cuda
CUDA_VISIBLE_DEVICES="" python abcdrl/dqn_torch.py
CUDA_VISIBLE_DEVICES="-1" python abcdrl/dqn_torch.py
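The same device selection can be done from a Python launcher instead of the shell. The following is a minimal sketch using only the standard library; the helper names `env_with_devices` and `visible_devices_in_child` are hypothetical, and the child process here just echoes the variable rather than running the algorithm file.

```python
import os
import subprocess
import sys


def env_with_devices(devices: str) -> dict:
    """Return a copy of the current environment with CUDA_VISIBLE_DEVICES set."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = devices
    return env


def visible_devices_in_child(devices: str) -> str:
    """Start a child Python process and report the value it sees."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
        env=env_with_devices(devices),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # "0,1" exposes two GPUs, "1" exposes one, "" hides all of them.
    for devices in ("0,1", "1", ""):
        print(repr(visible_devices_in_child(devices)))
```

Replacing the child command with `[sys.executable, "abcdrl/dqn_torch.py"]` would mirror the shell commands above. The variable has to be set before the child process starts, which is why it is passed through `env` rather than changed afterwards.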
The parameters of the algorithm file consist of two parts. The first part is the initialization parameters of the Trainer, and the second part is the parameters of the features (Logger, ...).
[Listing: abcdrl/dqn_torch.py, lines 205-230]
[Listing: abcdrl/dqn_torch.py, lines 310-320]
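The two-part layout can be illustrated with a small standalone sketch. It uses `argparse` from the standard library purely as a stand-in: the functions `build_parser` and `split_by_prefix` are hypothetical, only a handful of options are reproduced, and nothing here is taken from the actual implementation in abcdrl/dqn_torch.py. The defaults match those reported by the script's help output.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser with one argument group per part."""
    parser = argparse.ArgumentParser(prog="dqn_torch.py")

    # Part 1: initialization parameters of the Trainer.
    trainer = parser.add_argument_group("trainer arguments")
    trainer.add_argument("--trainer.env-id", dest="trainer_env_id", default="CartPole-v1")
    trainer.add_argument("--trainer.gamma", dest="trainer_gamma", type=float, default=0.99)

    # Part 2: parameters of a feature, here the Logger.
    logger = parser.add_argument_group("logger arguments")
    logger.add_argument("--logger.track", dest="logger_track", action="store_true")
    logger.add_argument(
        "--logger.wandb-project-name", dest="logger_wandb_project_name", default="abcdrl"
    )
    return parser


def split_by_prefix(namespace: argparse.Namespace) -> dict:
    """Group parsed values by their 'trainer' or 'logger' prefix."""
    parts = {"trainer": {}, "logger": {}}
    for key, value in vars(namespace).items():
        prefix, _, name = key.partition("_")
        parts[prefix][name] = value
    return parts


if __name__ == "__main__":
    args = build_parser().parse_args(["--trainer.gamma", "0.95", "--logger.track"])
    print(split_by_prefix(args))
```

Keeping a prefix per part means each group of values can be handed to its own component without the option names colliding.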
Note
Run the following command to view the algorithm parameters and the feature parameters.
python abcdrl/dqn_torch.py -h
usage: dqn_torch.py [-h] [--trainer.exp-name {None}|STR] [--trainer.seed INT]
[--trainer.no-cuda] [--trainer.capture-video]
[--trainer.env-id STR] [--trainer.num-envs INT]
[--trainer.total-timesteps INT] [--trainer.gamma FLOAT]
[--trainer.buffer-size INT]
[--trainer.start-epsilon FLOAT]
[--trainer.end-epsilon FLOAT]
[--trainer.exploration-fraction FLOAT]
[--trainer.batch-size INT] [--trainer.learning-rate FLOAT]
[--trainer.learning-starts INT]
[--trainer.target-network-frequency INT]
[--trainer.train-frequency INT] [--logger.track]
[--logger.wandb-project-name STR]
[--logger.wandb-tags STR [STR ...]]
[--logger.wandb-entity {None}|STR]
╭─ arguments ─────────────────────────────────────────────╮
│ -h, --help              show this help message and exit │
╰─────────────────────────────────────────────────────────╯
╭─ trainer arguments ─────────────────────────────────────╮
│ --trainer.exp-name {None}|STR                           │
│                         (default: None)                 │
│ --trainer.seed INT      (default: 1)                    │
│ --trainer.no-cuda       (sets: cuda=False)              │
│ --trainer.capture-video                                 │
│                         (sets: capture_video=True)      │
│ --trainer.env-id STR    (default: CartPole-v1)          │
│ --trainer.num-envs INT  (default: 1)                    │
│ --trainer.total-timesteps INT                           │
│                         (default: 500000)               │
│ --trainer.gamma FLOAT   (default: 0.99)                 │
│ --trainer.buffer-size INT                               │
│                         Collect (default: 10000)        │
│ --trainer.start-epsilon FLOAT                           │
│                         Collect (default: 1.0)          │
│ --trainer.end-epsilon FLOAT                             │
│                         Collect (default: 0.05)         │
│ --trainer.exploration-fraction FLOAT                    │
│                         Collect (default: 0.5)          │
│ --trainer.batch-size INT                                │
│                         Learn (default: 128)            │
│ --trainer.learning-rate FLOAT                           │
│                         Learn (default: 0.00025)        │
│ --trainer.learning-starts INT                           │
│                         Train (default: 10000)          │
│ --trainer.target-network-frequency INT                  │
│                         Train (default: 500)            │
│ --trainer.train-frequency INT                           │
│                         Train (default: 10)             │
╰─────────────────────────────────────────────────────────╯
╭─ logger arguments ──────────────────────────────────────╮
│ --logger.track          (sets: track=True)              │
│ --logger.wandb-project-name STR                         │
│                         (default: abcdrl)               │
│ --logger.wandb-tags STR [STR ...]                       │
│                         (default: )                     │
│ --logger.wandb-entity {None}|STR                        │
│                         (default: None)                 │
╰─────────────────────────────────────────────────────────╯
Last update:
2023-03-01