Haitham Bou Ammar
RL team leader @Huawei R&D UK & UCL H. Assistant Prof.
Posted 2022-02-28 22:26:57

RL is changing the world! The Bellman equation is at the core of #RL. Here is a concise proof in 3 (3.5 :P) steps.

Tags: curriculum learning, #offline reinforcement learning, #multi-agent reinforcement learning, #ml-agents
Views (1312), Likes (6), Comments (2)

Haitham Bou Ammar, 2022-02-28 22:46:26
Ah yea, should deffo do that. Maybe have some about those detailed derivations :D

Jun Wang 汪军, 2022-02-28 22:44:24
When are you going to start a course?