Modern Reinforcement learning using Deep Learning [2024]

Model types, Algorithms and approaches, Function approximation, Deep reinforcement-learning, Deep Multi-agent Reinforcement-learning

Free tutorial

Rating: 1.0 out of 5 (4 ratings)

2,176 students

42min of on-demand video

Created by Nitsan Soffair

English

English [Auto]


What you’ll learn

Being able to start Deep reinforcement-learning research
Being able to start a Deep reinforcement-learning engineering role
Understand modern state-of-the-art Deep reinforcement-learning knowledge
Understand Deep reinforcement-learning knowledge

Requirements

Interest in Deep reinforcement-learning

Description

Hello, I am Nitsan Soffair, a Deep RL researcher at BGU.

In my Deep reinforcement-learning course, you will learn the newest state-of-the-art Deep reinforcement-learning knowledge.

You will do the following:

Get state-of-the-art knowledge regarding:
1. Model types
2. Algorithms and approaches
3. Function approximation
4. Deep reinforcement-learning
5. Deep Multi-agent Reinforcement-learning
Validate your knowledge by answering a short quiz after each lecture.
Be able to complete the course in ~2 hours.

Syllabus

Model types
1. Markov decision process (MDP): A discrete-time stochastic control process.
2. Partially observable Markov decision process (POMDP): A generalization of the MDP in which the agent cannot directly observe the underlying state.
3. Decentralized partially observable Markov decision process (Dec-POMDP): A generalization of the POMDP to multiple decentralized agents.
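The three model types above can be written side by side as tuples. This is a sketch in common textbook notation, not notation taken from the course itself: the symbols $S$ (states), $A$ (actions), $P$ (transition function), $R$ (reward function), $\gamma$ (discount factor), $\Omega$ (observations), $O$ (observation function) and $I$ (set of agents) are assumptions made here for illustration.

```latex
\begin{align*}
\text{MDP:} \quad & \langle S, A, P, R, \gamma \rangle,
  \qquad P(s' \mid s, a), \quad R(s, a) \\
\text{POMDP:} \quad & \langle S, A, P, R, \Omega, O, \gamma \rangle,
  \qquad O(o \mid s', a) \\
\text{Dec-POMDP:} \quad & \langle I, S, \{A_i\}_{i \in I}, P, R, \{\Omega_i\}_{i \in I}, O, \gamma \rangle,
  \qquad P(s' \mid s, \mathbf{a}), \quad O(\mathbf{o} \mid s', \mathbf{a}), \quad R(s, \mathbf{a})
\end{align*}
```

Read top to bottom, each line adds one ingredient: the POMDP adds observations in place of direct access to the state, and the Dec-POMDP replaces the single action and observation with a joint action $\mathbf{a}$ and joint observation $\mathbf{o}$, one component per agent, with a reward shared by all agents.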
Algorithms and approaches
1. Bellman equations: A necessary condition for optimality in dynamic programming.
2. Model-free: A model-free algorithm does not use the transition probabilities or reward function (the model) of the MDP.
3. Off-policy: An off-policy algorithm learns about one policy (the target policy) while acting in the environment with a different policy (the behavior policy).
4. Exploration-exploitation: The trade-off in reinforcement learning between exploring new actions and exploiting what is already known.
5. Value-iteration: An iterative algorithm that repeatedly applies the Bellman optimality backup.
6. SARSA: An on-policy algorithm for learning a Markov decision process policy.
7. Q-learning: A model-free reinforcement learning algorithm to learn the value of an action in a particular state.
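To make the items above concrete, here is a minimal tabular sketch in Python. It is not code from the course: the two-state MDP `P`, the discount `GAMMA`, and every function name are hypothetical, chosen only to show value iteration (which uses the model) next to the model-free SARSA and Q-learning updates and an epsilon-greedy rule for the exploration-exploitation trade-off.

```python
import random

# Hypothetical 2-state, 2-action MDP used only for illustration.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9

def value_iteration(P, gamma, tol=1e-8):
    """Apply the Bellman optimality backup until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                for a in P[s]
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

def q_learning_update(Q, s, a, r, s2, alpha, gamma):
    """Off-policy TD update: bootstrap from the greedy action in s2."""
    target = r + gamma * max(Q[s2].values())
    Q[s][a] += alpha * (target - Q[s][a])

def sarsa_update(Q, s, a, r, s2, a2, alpha, gamma):
    """On-policy TD update: bootstrap from the action a2 actually taken in s2."""
    target = r + gamma * Q[s2][a2]
    Q[s][a] += alpha * (target - Q[s][a])

def epsilon_greedy(Q, s, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the current estimate."""
    if rng.random() < epsilon:
        return rng.choice(list(Q[s]))
    return max(Q[s], key=Q[s].get)

def run_q_learning(P, gamma, steps=20000, alpha=0.1, epsilon=0.2, seed=0):
    """Learn Q from sampled transitions only; P is used here as a simulator."""
    rng = random.Random(seed)
    Q = {s: {a: 0.0 for a in P[s]} for s in P}
    s = 0
    for _ in range(steps):
        a = epsilon_greedy(Q, s, epsilon, rng)
        _, s2, r = P[s][a][0]  # transitions in this toy MDP are deterministic
        q_learning_update(Q, s, a, r, s2, alpha, gamma)
        s = s2
    return Q
```

For this toy MDP the optimal values can be checked by hand: always taking action 1 in state 1 is worth 2 / (1 - 0.9) = 20, and state 0 is worth 1 + 0.9 * 20 = 19, which is what `value_iteration(P, GAMMA)` should approach. The only difference between the two TD updates is the bootstrap term: Q-learning uses the best action in the next state, SARSA uses the action the behavior policy actually chose.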
Function approximation
1. Function approximators: Function approximation means selecting, from a well-defined class, a function that closely matches (“approximates”) a target function in a task-specific way.
2. Policy-gradient: Value-based, policy-based, and actor-critic methods, the policy gradient, and the softmax policy.
3. REINFORCE: A Monte Carlo policy-gradient algorithm.
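As a rough illustration of the softmax policy and REINFORCE, the sketch below runs the policy-gradient update on a one-step problem (a multi-armed bandit), so the return of each episode is just the sampled reward. The function names, the bandit setting, the noise level, and the learning rate are all assumptions made for this example, not material from the course.

```python
import math
import random

def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_bandit(mean_rewards, episodes=5000, lr=0.1, seed=0):
    """REINFORCE with a softmax policy on one-step episodes (no baseline)."""
    rng = random.Random(seed)
    theta = [0.0] * len(mean_rewards)  # one preference per action
    for _ in range(episodes):
        probs = softmax(theta)
        a = rng.choices(range(len(theta)), weights=probs)[0]
        ret = mean_rewards[a] + rng.gauss(0.0, 0.1)  # sampled return G
        # For a softmax policy: d/d theta_k log pi(a) = 1[k == a] - pi(k)
        for k in range(len(theta)):
            grad_log = (1.0 if k == a else 0.0) - probs[k]
            theta[k] += lr * ret * grad_log
    return softmax(theta)
```

Calling `reinforce_bandit([0.2, 1.0])` should return a policy that puts most of its probability on the second action. The update is the policy-gradient rule in its simplest form: raise the log-probability of the sampled action in proportion to the return it produced.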
Deep reinforcement-learning
1. Deep Q-Network (DQN): A deep reinforcement-learning algorithm using experience replay and fixed Q-targets.
2. Deep Recurrent Q-Learning (DRQN): A deep reinforcement-learning algorithm for POMDPs that extends DQN with an LSTM.
3. Optimistic Exploration with Pessimistic Initialization (OPIQ): A deep reinforcement-learning algorithm for MDPs based on DQN.
Deep Multi-agent Reinforcement-learning
1. Value Decomposition Networks (VDN): A multi-agent deep reinforcement-learning algorithm for Dec-POMDPs.
2. QMIX: A multi-agent deep reinforcement-learning algorithm for Dec-POMDPs.
3. QTRAN: A multi-agent deep reinforcement-learning algorithm for Dec-POMDPs.
4. Weighted QMIX: A multi-agent deep reinforcement-learning algorithm for Dec-POMDPs.
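Two ideas from the list above can be sketched without a neural network: the experience replay and fixed Q-targets used by DQN, and the additive value decomposition used by VDN. In this simplified Python sketch, plain lists and dictionaries stand in for the networks; the class and function names are hypothetical, and nothing here is the course's or any library's implementation.

```python
import random
from collections import deque
from itertools import product

class ReplayBuffer:
    """Fixed-size store of (s, a, r, s2, done) transitions, sampled uniformly."""

    def __init__(self, capacity, seed=0):
        self.data = deque(maxlen=capacity)  # oldest transitions are dropped first
        self.rng = random.Random(seed)

    def push(self, transition):
        self.data.append(transition)

    def sample(self, batch_size):
        return self.rng.sample(list(self.data), batch_size)

def td_targets(batch, q_target, gamma):
    """Fixed Q-targets: r + gamma * max_a' Q_target(s2, a') from a frozen copy.

    q_target maps a state to a list of action values; in DQN this would be
    a periodically updated copy of the online network.
    """
    return [
        r if done else r + gamma * max(q_target[s2])
        for (_, _, r, s2, done) in batch
    ]

def vdn_joint_q(per_agent_q, joint_action):
    """VDN: the joint value is the sum of each agent's own Q-value."""
    return sum(q[a] for q, a in zip(per_agent_q, joint_action))

def decentralised_greedy(per_agent_q):
    """Each agent picks its own best action, using only its own Q-values."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in per_agent_q)

def centralised_greedy(per_agent_q):
    """Brute-force search over all joint actions, for comparison only."""
    joint_actions = product(*[range(len(q)) for q in per_agent_q])
    return max(joint_actions, key=lambda ja: vdn_joint_q(per_agent_q, ja))
```

Because the joint value is a plain sum, the joint action found by `centralised_greedy` matches the one each agent finds on its own with `decentralised_greedy`, which is what allows decentralized execution. QMIX keeps this property while replacing the sum with a mixing function that is monotonic in each agent's value.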

Resources

Wikipedia
David Silver’s Reinforcement-learning course

Who this course is for:

Anyone interested in Deep reinforcement-learning


Course content

6 sections • 23 lectures • 41m total length

Model types • 3 lectures • 3min

Markov decision process (MDP) • 00:53 • 3 questions
Partially observable Markov decision process (POMDP) • 01:18 • 2 questions
Decentralized partially observable Markov decision process (Dec-POMDP) • 00:57 • 1 question

Algorithms and approaches • 7 lectures • 5min

Bellman equations • 00:47 • 3 questions
Model-free • 00:19 • 2 questions
Off-policy • 00:19 • 2 questions
Exploration-exploitation • 00:47 • 3 questions
Value-iteration • 00:54 • 3 questions
SARSA • 01:13 • 3 questions
Q-learning • 00:54 • 3 questions

Function approximation • 3 lectures • 3min

Function approximators • 00:26 • 3 questions
Policy-gradient • 01:34 • 3 questions
REINFORCE • 01:08 • 3 questions

Deep reinforcement-learning • 3 lectures • 5min

Deep Q-Network (DQN) • 01:19 • 3 questions
Deep Recurrent Q-Learning (DRQN) • 01:02 • 3 questions
Optimistic Exploration with Pessimistic Initialization (OPIQ) • 02:29 • 3 questions

Deep Multi-agent Reinforcement-learning • 4 lectures • 5min

Value Decomposition Networks (VDN) • 01:17 • 3 questions
QMIX • 01:04 • 3 questions
QTRAN • 01:46 • 3 questions
Weighted QMIX • 01:17 • 3 questions

Extra content • 3 lectures • 20min
