LightZero

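Tutorials

  • Installation and Quick Start Guide
  • How to Customize Your Algorithms in LightZero?
  • How to Customize Your Environments in LightZero?
  • How to Set Configuration Files in LightZero
  • Logging and Monitoring System in LightZero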

API Documentation


  • Agent
    • AlphaZeroAgent
      • AlphaZeroAgent.__init__()
      • AlphaZeroAgent.batch_evaluate()
      • AlphaZeroAgent.best
      • AlphaZeroAgent.deploy()
      • AlphaZeroAgent.supported_env_list
      • AlphaZeroAgent.train()
    • MuZeroAgent
      • MuZeroAgent.__init__()
      • MuZeroAgent.batch_evaluate()
      • MuZeroAgent.best
      • MuZeroAgent.deploy()
      • MuZeroAgent.supported_env_list
      • MuZeroAgent.train()
    • EfficientZeroAgent
      • EfficientZeroAgent.__init__()
      • EfficientZeroAgent.batch_evaluate()
      • EfficientZeroAgent.best
      • EfficientZeroAgent.deploy()
      • EfficientZeroAgent.supported_env_list
      • EfficientZeroAgent.train()
    • GumbelMuZeroAgent
      • GumbelMuZeroAgent.__init__()
      • GumbelMuZeroAgent.batch_evaluate()
      • GumbelMuZeroAgent.best
      • GumbelMuZeroAgent.deploy()
      • GumbelMuZeroAgent.supported_env_list
      • GumbelMuZeroAgent.train()
    • SampledEfficientZeroAgent
      • SampledEfficientZeroAgent.__init__()
      • SampledEfficientZeroAgent.batch_evaluate()
      • SampledEfficientZeroAgent.best
      • SampledEfficientZeroAgent.deploy()
      • SampledEfficientZeroAgent.supported_env_list
      • SampledEfficientZeroAgent.train()
    • SampledAlphaZeroAgent
      • SampledAlphaZeroAgent.__init__()
      • SampledAlphaZeroAgent.batch_evaluate()
      • SampledAlphaZeroAgent.best
      • SampledAlphaZeroAgent.deploy()
      • SampledAlphaZeroAgent.supported_env_list
      • SampledAlphaZeroAgent.train()
  • Config
    • lzero.config.meta
      • __TITLE__
      • __VERSION__
      • __DESCRIPTION__
      • __AUTHOR__
      • __AUTHOR_EMAIL__
  • Entry
    • train_alphazero
      • train_alphazero.__init__()
    • eval_alphazero
      • eval_alphazero.__init__()
    • train_muzero
      • train_muzero.__init__()
    • eval_muzero
      • eval_muzero.__init__()
    • train_muzero_with_gym_env
      • train_muzero_with_gym_env.__init__()
    • eval_muzero_with_gym_env
      • eval_muzero_with_gym_env.__init__()
    • train_muzero_with_reward_model
      • train_muzero_with_reward_model.__init__()
  • Environment
    • LightZeroEnvWrapper
      • LightZeroEnvWrapper.__init__()
      • LightZeroEnvWrapper._is_protocol
      • LightZeroEnvWrapper._np_random
      • LightZeroEnvWrapper.action_space
      • LightZeroEnvWrapper.class_name()
      • LightZeroEnvWrapper.close()
      • LightZeroEnvWrapper.metadata
      • LightZeroEnvWrapper.np_random
      • LightZeroEnvWrapper.observation_space
      • LightZeroEnvWrapper.render()
      • LightZeroEnvWrapper.render_mode
      • LightZeroEnvWrapper.reset()
      • LightZeroEnvWrapper.reward_range
      • LightZeroEnvWrapper.seed()
      • LightZeroEnvWrapper.spec
      • LightZeroEnvWrapper.step()
      • LightZeroEnvWrapper.unwrapped
    • ActionDiscretizationEnvWrapper
      • ActionDiscretizationEnvWrapper.__init__()
      • ActionDiscretizationEnvWrapper._is_protocol
      • ActionDiscretizationEnvWrapper._np_random
      • ActionDiscretizationEnvWrapper.action_space
      • ActionDiscretizationEnvWrapper.class_name()
      • ActionDiscretizationEnvWrapper.close()
      • ActionDiscretizationEnvWrapper.metadata
      • ActionDiscretizationEnvWrapper.np_random
      • ActionDiscretizationEnvWrapper.observation_space
      • ActionDiscretizationEnvWrapper.render()
      • ActionDiscretizationEnvWrapper.render_mode
      • ActionDiscretizationEnvWrapper.reset()
      • ActionDiscretizationEnvWrapper.reward_range
      • ActionDiscretizationEnvWrapper.seed()
      • ActionDiscretizationEnvWrapper.spec
      • ActionDiscretizationEnvWrapper.step()
      • ActionDiscretizationEnvWrapper.unwrapped
  • MCTS
    • Buffer
      • GameBuffer
      • MuZeroBuffer
      • EfficientZeroBuffer
    • Tree Search
      • MuZeroMCTSCtree
      • EfficientZeroMCTSCtree
      • GumbelMuZeroMCTSCtree
  • 模型
    • Common
      • SimNorm
      • FeatureAndGradientHook
      • DownSample
      • RepresentationNetworkUniZero
      • RepresentationNetwork
      • RepresentationNetworkMLP
      • LatentDecoder
      • LatentEncoderForMemoryEnv
      • LatentDecoderForMemoryEnv
      • VectorDecoderForMemoryEnv
      • PredictionNetwork
      • PredictionNetworkMLP
      • PredictionHiddenNetwork
    • MuZeroModel
      • MuZeroModel
      • DynamicsNetwork
    • MuZeroModelMLP
      • MuZeroModelMLP
      • DynamicsNetwork
    • EfficientZeroModel
      • DynamicsNetwork
    • EfficientZeroModelMLP
      • DynamicsNetworkMLP
    • AlphaZeroModel
      • AlphaZeroModel
      • PredictionNetwork
    • SampledEfficientZeroModel
      • PredictionNetwork
    • SampledEfficientZeroModelMLP
      • PredictionNetworkMLP
    • StochasticMuZeroModel
      • StochasticMuZeroModel
      • DynamicsNetwork
      • AfterstatePredictionNetwork
      • ChanceEncoderBackbone
      • ChanceEncoderBackboneMLP
      • ChanceEncoder
      • StraightThroughEstimator
      • OnehotArgmax
    • StochasticMuZeroModelMLP
      • StochasticMuZeroModelMLP
  • Policy
    • AlphaZeroPolicy
      • AlphaZeroPolicy
    • MuZeroPolicy
      • MuZeroPolicy
    • EfficientZeroPolicy
      • EfficientZeroPolicy
    • Gumbel AlphaZeroPolicy
      • GumbelAlphaZeroPolicy
    • Gumbel MuZeroPolicy
      • GumbelMuZeroPolicy
    • Sampled AlphaZeroPolicy
      • SampledAlphaZeroPolicy
    • Sampled MuZeroPolicy
      • SampledMuZeroPolicy
    • Sampled EfficientZeroPolicy
      • SampledEfficientZeroPolicy
    • Stochastic MuZeroPolicy
      • StochasticMuZeroPolicy
    • UniZeroPolicy
      • UniZeroPolicy
  • Worker
    • MuZeroCollector
      • MuZeroCollector
    • MuZeroEvaluator
      • MuZeroEvaluator

© Copyright 2023, OpenDILab Contributors.