RLChina Reinforcement Learning Community
Haitham Bou Ammar
RL team leader @Huawei R&D UK & UCL H. Assistant Prof.
Profile stats: 8 points · 5 posts · 3 comments · registration rank 539 · following 4 · 16 followers
Posted 2022-03-07 02:52:21
A 4-step proof that value baselines don't affect policy gradients in #RL 😀 Just the log-trick & Fubini get you there!
Likes: 8 · Comments: 4 · Views: 1509 · Topic: Course learning
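The proof attached to that post is not reproduced on this page. As a rough sketch of how such an argument usually goes, here is one standard derivation. The symbols are assumptions introduced for illustration: $d(s)$ is a state distribution, $\pi_\theta(a \mid s)$ the policy, and $b(s)$ any baseline that does not depend on the action. It also assumes the gradient and the integral over actions can be interchanged.

$$
\begin{aligned}
\mathbb{E}_{s \sim d,\; a \sim \pi_\theta(\cdot \mid s)}\big[\nabla_\theta \log \pi_\theta(a \mid s)\, b(s)\big]
&= \int d(s)\, b(s) \int \pi_\theta(a \mid s)\, \nabla_\theta \log \pi_\theta(a \mid s)\, \mathrm{d}a\, \mathrm{d}s
&& \text{(iterated integral, Fubini)} \\
&= \int d(s)\, b(s) \int \nabla_\theta \pi_\theta(a \mid s)\, \mathrm{d}a\, \mathrm{d}s
&& \text{(log-trick: } \pi_\theta \nabla_\theta \log \pi_\theta = \nabla_\theta \pi_\theta\text{)} \\
&= \int d(s)\, b(s)\, \nabla_\theta \int \pi_\theta(a \mid s)\, \mathrm{d}a\, \mathrm{d}s
&& \text{(swap gradient and integral)} \\
&= \int d(s)\, b(s)\, \nabla_\theta 1\, \mathrm{d}s = 0 .
\end{aligned}
$$

So subtracting $b(s)$ from the return or value estimate leaves the expected policy gradient unchanged; it only affects the variance.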
Posted 2022-03-02 22:47:19
Iterated Law of Expectation: Concise Proof (6 simple steps)
When deriving the Bellman equations, we needed the iterated law of expectations. Rather than believing me, have a look at this 6-...
Likes: 3 · Comments: 1 · Views: 1193 · Topic: Course learning
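The post's own six steps are truncated on this page. For reference, a standard version of the argument for discrete random variables $X$ and $Y$ (an assumption made here to keep the notation simple; the continuous case replaces sums with integrals) reads:

$$
\begin{aligned}
\mathbb{E}\big[\mathbb{E}[X \mid Y]\big]
&= \sum_y p(y)\, \mathbb{E}[X \mid Y = y] \\
&= \sum_y p(y) \sum_x x\, p(x \mid y) \\
&= \sum_y \sum_x x\, p(x, y) \\
&= \sum_x x \sum_y p(x, y) \\
&= \sum_x x\, p(x) \\
&= \mathbb{E}[X] .
\end{aligned}
$$

The exchange of the two sums in the fourth line assumes $\mathbb{E}|X| < \infty$.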
Posted 2022-03-02 05:44:26
ELBO in 5 simple steps starting directly from Bayes' Rule!
Likes: 3 · Views: 1119 · Topic: Course learning
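The derivation from that post is not shown on this page. One common route from Bayes' rule to the ELBO, sketched here with assumed notation ($x$ observed data, $z$ latent variable, $q(z)$ any distribution over $z$ with support inside that of $p(z \mid x)$), is:

$$
\begin{aligned}
p(z \mid x) &= \frac{p(x, z)}{p(x)}
&& \text{(Bayes' rule)} \\
\log p(x) &= \log p(x, z) - \log p(z \mid x)
&& \text{(take logs, rearrange)} \\
\log p(x) &= \mathbb{E}_{q(z)}\big[\log p(x, z)\big] - \mathbb{E}_{q(z)}\big[\log p(z \mid x)\big]
&& \text{(average over } q\text{; the left side has no } z\text{)} \\
\log p(x) &= \underbrace{\mathbb{E}_{q(z)}\big[\log p(x, z) - \log q(z)\big]}_{\text{ELBO}(q)} + \underbrace{\mathbb{E}_{q(z)}\big[\log q(z) - \log p(z \mid x)\big]}_{\mathrm{KL}(q(z)\,\|\,p(z \mid x))}
&& \text{(add and subtract } \mathbb{E}_{q}[\log q(z)]\text{)} \\
\log p(x) &\ge \text{ELBO}(q)
&& \text{(since } \mathrm{KL} \ge 0\text{)}
\end{aligned}
$$

The bound is tight exactly when $q(z) = p(z \mid x)$.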
Posted 2022-02-28 22:26:57
(RL is changing the world! The Bellman equation is at the core of #RL. Here is a concise proof in 3 (3.5 :P) steps)
Likes: 6 · Comments: 2 · Views: 1294 · Topic: Course learning