
Run 🏃

After dependencies are installed, you can run the algorithm file directly.

python abcdrl/dqn_torch.py \
    --trainer.env-id CartPole-v1 \
    --trainer.total_timesteps 500000 \
    --trainer.gamma 0.99 \
    --trainer.learning-rate 2.5e-4 \
    --trainer.capture-video \
    --logger.track \
    --logger.wandb-project-name abcdrl \
    --logger.wandb-tags tag1 tag2

  1. Words in a flag name can be joined with either _ or -, so --trainer.total_timesteps and --trainer.total-timesteps are equivalent.
  2. The learning rate can also be written as 0.00025.

Select a specific GPU device

  • Using gpu:0 and gpu:1 👇
    • CUDA_VISIBLE_DEVICES="0,1" python abcdrl/dqn_torch.py
  • Using gpu:1 👇
    • CUDA_VISIBLE_DEVICES="1" python abcdrl/dqn_torch.py
  • Using cpu only 👇
    • python abcdrl/dqn_torch.py --trainer.no-cuda
    • CUDA_VISIBLE_DEVICES="" python abcdrl/dqn_torch.py
    • CUDA_VISIBLE_DEVICES="-1" python abcdrl/dqn_torch.py
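
As a rough illustration of what these settings mean, the sketch below interprets CUDA_VISIBLE_DEVICES the way the CUDA runtime is documented to behave: an unset variable exposes every GPU, and the list is cut off at the first invalid entry, which is why both "" and "-1" leave no GPU visible. The helper name visible_gpu_indices is hypothetical and not part of abcdrl, and the sketch only handles integer indices (the real variable also accepts GPU UUIDs).

```python
import os
from typing import List, Mapping, Optional


def visible_gpu_indices(env: Mapping[str, str] = os.environ) -> Optional[List[int]]:
    """Return the GPU indices exposed by CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset, meaning all GPUs stay visible.
    Hypothetical helper for illustration only; integer indices only.
    """
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    indices: List[int] = []
    for token in raw.split(","):
        token = token.strip()
        if not token.isdigit():
            # The first invalid entry hides itself and every entry after it.
            break
        indices.append(int(token))
    return indices


if __name__ == "__main__":
    for value in ("0,1", "1", "", "-1"):
        print(repr(value), "->", visible_gpu_indices({"CUDA_VISIBLE_DEVICES": value}))
```

Visible devices are renumbered from zero inside the process, so with CUDA_VISIBLE_DEVICES="1" the program addresses physical gpu:1 as cuda:0.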

The parameters in the algorithm file come in two parts. The first part is the initialization parameters of the Trainer 🔁, and the second part is the parameters of the features (Logger 📊, ...).

abcdrl/dqn_torch.py
class Trainer:
    @dataclasses.dataclass
    class Config:
        exp_name: Optional[str] = None
        seed: int = 1
        cuda: bool = True
        capture_video: bool = False
        env_id: str = "CartPole-v1"
        num_envs: int = 1
        total_timesteps: int = 500_000
        gamma: float = 0.99
        # Collect
        buffer_size: int = 10_000
        start_epsilon: float = 1.0
        end_epsilon: float = 0.05
        exploration_fraction: float = 0.5
        # Learn
        batch_size: int = 128
        learning_rate: float = 2.5e-4
        # Train
        learning_starts: int = 10_000
        target_network_frequency: int = 500
        train_frequency: int = 10

    def __init__(self, config: Config = Config()) -> None:
        # ...

abcdrl/dqn_torch.py
class Logger:
    @dataclasses.dataclass
    class Config:
        track: bool = False
        wandb_project_name: str = "abcdrl"
        wandb_tags: List[str] = dataclasses.field(default_factory=lambda: [])
        wandb_entity: Optional[str] = None

    @classmethod
    def decorator(cls, config: Config = Config()) -> Callable[..., Generator[dict[str, Any], None, None]]:
        # ...
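
To make the link between the Config dataclasses and the dotted command-line flags concrete, here is a minimal stdlib-only sketch that derives flags such as --trainer.total-timesteps from dataclass fields. It is an illustration under stated assumptions, not how abcdrl actually builds its command line: TrainerConfig and LoggerConfig are trimmed, hypothetical stand-ins for Trainer.Config and Logger.Config, and flag_spellings, add_group, and parse_configs are made-up helper names.

```python
import argparse
import dataclasses
from typing import Dict, List


@dataclasses.dataclass
class TrainerConfig:  # hypothetical, trimmed stand-in for Trainer.Config
    env_id: str = "CartPole-v1"
    total_timesteps: int = 500_000
    learning_rate: float = 2.5e-4
    cuda: bool = True


@dataclasses.dataclass
class LoggerConfig:  # hypothetical, trimmed stand-in for Logger.Config
    track: bool = False
    wandb_project_name: str = "abcdrl"


GROUPS = {"trainer": TrainerConfig, "logger": LoggerConfig}


def flag_spellings(prefix: str, name: str) -> List[str]:
    """Return the dashed spelling and, if different, the underscore spelling."""
    dashed = f"--{prefix}.{name.replace('_', '-')}"
    underscored = f"--{prefix}.{name}"
    return [dashed] if dashed == underscored else [dashed, underscored]


def add_group(parser: argparse.ArgumentParser, prefix: str, config_cls: type) -> None:
    """Add one argument per dataclass field, namespaced by the prefix."""
    group = parser.add_argument_group(f"{prefix} arguments")
    for field in dataclasses.fields(config_cls):
        dest = f"{prefix}.{field.name}"
        if isinstance(field.default, bool):
            if field.default:
                # A True default becomes a --prefix.no-<name> switch.
                names = flag_spellings(prefix, "no_" + field.name)
                group.add_argument(*names, dest=dest, action="store_false")
            else:
                names = flag_spellings(prefix, field.name)
                group.add_argument(*names, dest=dest, action="store_true")
        else:
            names = flag_spellings(prefix, field.name)
            group.add_argument(*names, dest=dest, type=type(field.default), default=field.default)


def parse_configs(argv: List[str]) -> Dict[str, object]:
    """Parse argv and rebuild one config instance per group."""
    parser = argparse.ArgumentParser(prog="dqn_torch.py")
    for prefix, config_cls in GROUPS.items():
        add_group(parser, prefix, config_cls)
    values = vars(parser.parse_args(argv))
    return {
        prefix: config_cls(
            **{f.name: values[f"{prefix}.{f.name}"] for f in dataclasses.fields(config_cls)}
        )
        for prefix, config_cls in GROUPS.items()
    }


if __name__ == "__main__":
    configs = parse_configs(["--trainer.total_timesteps", "1000", "--trainer.no-cuda", "--logger.track"])
    print(configs["trainer"])
    print(configs["logger"])
```

Registering both spellings of each flag mirrors the rule that the connector can be _ or -, and a boolean field that defaults to True turns into a no- switch, in the same way that cuda appears as --trainer.no-cuda in the help output.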

Note

Run python abcdrl/dqn_torch.py -h to list the algorithm parameters and the feature parameters.

python abcdrl/dqn_torch.py -h
usage: dqn_torch.py [-h] [--trainer.exp-name {None}|STR] [--trainer.seed INT]
                    [--trainer.no-cuda] [--trainer.capture-video]
                    [--trainer.env-id STR] [--trainer.num-envs INT]
                    [--trainer.total-timesteps INT] [--trainer.gamma FLOAT]
                    [--trainer.buffer-size INT]
                    [--trainer.start-epsilon FLOAT]
                    [--trainer.end-epsilon FLOAT]
                    [--trainer.exploration-fraction FLOAT]
                    [--trainer.batch-size INT] [--trainer.learning-rate FLOAT]
                    [--trainer.learning-starts INT]
                    [--trainer.target-network-frequency INT]
                    [--trainer.train-frequency INT] [--logger.track]
                    [--logger.wandb-project-name STR]
                    [--logger.wandb-tags STR [STR ...]]
                    [--logger.wandb-entity {None}|STR]

╭─ arguments ─────────────────────────────────────────────╮
│ -h, --help              show this help message and exit │
╰─────────────────────────────────────────────────────────╯
╭─ trainer arguments ─────────────────────────────────────╮
│ --trainer.exp-name {None}|STR                           │
│                         (default: None)                 │
│ --trainer.seed INT      (default: 1)                    │
│ --trainer.no-cuda       (sets: cuda=False)              │
│ --trainer.capture-video                                 │
│                         (sets: capture_video=True)      │
│ --trainer.env-id STR    (default: CartPole-v1)          │
│ --trainer.num-envs INT  (default: 1)                    │
│ --trainer.total-timesteps INT                           │
│                         (default: 500000)               │
│ --trainer.gamma FLOAT   (default: 0.99)                 │
│ --trainer.buffer-size INT                               │
│                         Collect (default: 10000)        │
│ --trainer.start-epsilon FLOAT                           │
│                         Collect (default: 1.0)          │
│ --trainer.end-epsilon FLOAT                             │
│                         Collect (default: 0.05)         │
│ --trainer.exploration-fraction FLOAT                    │
│                         Collect (default: 0.5)          │
│ --trainer.batch-size INT                                │
│                         Learn (default: 128)            │
│ --trainer.learning-rate FLOAT                           │
│                         Learn (default: 0.00025)        │
│ --trainer.learning-starts INT                           │
│                         Train (default: 10000)          │
│ --trainer.target-network-frequency INT                  │
│                         Train (default: 500)            │
│ --trainer.train-frequency INT                           │
│                         Train (default: 10)             │
╰─────────────────────────────────────────────────────────╯
╭─ logger arguments ──────────────────────────────────────╮
│ --logger.track          (sets: track=True)              │
│ --logger.wandb-project-name STR                         │
│                         (default: abcdrl)               │
│ --logger.wandb-tags STR [STR ...]                       │
│                         (default: )                     │
│ --logger.wandb-entity {None}|STR                        │
│                         (default: None)                 │
╰─────────────────────────────────────────────────────────╯

Last update: 2023-03-01