
Modern Reinforcement Learning Using Deep Learning

Model types, Algorithms and approaches, Function approximation, Deep reinforcement-learning, Deep Multi-agent Reinforcement-learning

Free tutorial

Rating: 1.0 out of 5 (4 ratings)

2,176 students

42min of on-demand video

Created by Nitsan Soffair

English

English [Auto]

What you’ll learn

  • Be able to start deep reinforcement-learning research
  • Be able to start a deep reinforcement-learning engineering role
  • Understand modern, state-of-the-art deep reinforcement-learning methods
  • Understand the core concepts of deep reinforcement-learning

Requirements

  • Interest in Deep reinforcement-learning

Description

Hello, I am Nitsan Soffair, a Deep RL researcher at BGU.

In my deep reinforcement-learning course you will learn the most recent state-of-the-art material in deep reinforcement learning.

You will do the following:

  1. Get state-of-the-art knowledge regarding
    1. Model types
    2. Algorithms and approaches
    3. Function approximation
    4. Deep reinforcement-learning
    5. Deep Multi-agent Reinforcement-learning
  2. Validate your knowledge with short quizzes after each lecture.
  3. Complete the course in about 2 hours.

Syllabus

  1. Model types
    1. Markov decision process (MDP): A discrete-time stochastic control process (a minimal tabular MDP appears in the first sketch after this syllabus).
    2. Partially observable Markov decision process (POMDP): A generalization of the MDP in which the agent cannot directly observe the state.
    3. Decentralized partially observable Markov decision process (Dec-POMDP): A generalization of the POMDP to multiple decentralized agents.
  2. Algorithms and approaches
    1. Bellman equations: Recursive optimality conditions underlying dynamic programming.
    2. Model-free: An algorithm that does not use a model (the transition and reward functions) of the MDP.
    3. Off-policy: An algorithm that learns about one policy (the target policy) while acting in the environment with another (the behavior policy).
    4. Exploration-exploitation: The trade-off in reinforcement learning between exploring new actions and exploiting what is already known.
    5. Value iteration: An iterative algorithm that repeatedly applies the Bellman optimality backup (see the first sketch after this syllabus).
    6. SARSA: An on-policy algorithm for learning a Markov decision process policy.
    7. Q-learning: A model-free reinforcement-learning algorithm that learns the value of an action in a particular state (Q-learning and SARSA are contrasted in a sketch after this syllabus).
  3. Function approximation
    1. Function approximators: Methods that select, from a well-defined class, a function that closely matches ("approximates") a target function in a task-specific way.
    2. Policy gradient: Value-based, policy-based, and actor-critic methods; policy gradients and the softmax policy.
    3. REINFORCE: A policy-gradient algorithm (sketched after this syllabus).
  4. Deep reinforcement-learning
    1. Deep Q-Network (DQN): A deep reinforcement-learning algorithm using experience replay and fixed Q-targets (both mechanisms are sketched after this syllabus).
    2. Deep Recurrent Q-Learning (DRQN): A deep reinforcement-learning algorithm for POMDPs that extends DQN with an LSTM.
    3. Optimistic Exploration with Pessimistic Initialization (OPIQ): A deep reinforcement-learning algorithm for MDPs based on DQN.
  5. Deep Multi-agent Reinforcement-learning
    1. Value Decomposition Networks (VDN): A multi-agent deep reinforcement-learning algorithm for Dec-POMDPs (the value-decomposition idea is sketched after this syllabus).
    2. QMIX: A multi-agent deep reinforcement-learning algorithm for Dec-POMDPs.
    3. QTRAN: A multi-agent deep reinforcement-learning algorithm for Dec-POMDPs.
    4. Weighted QMIX: A multi-agent deep reinforcement-learning algorithm for Dec-POMDPs.
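To make the first two syllabus sections concrete, here is a minimal Python sketch (not course material) of a tiny tabular MDP and of value iteration applying the Bellman optimality backup. The two-state MDP and all of its numbers are hypothetical and chosen only for illustration.

```python
# A minimal sketch: a tiny tabular MDP and value iteration applying the
# Bellman optimality backup
#   V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) * V(s') ].
# The two-state MDP and all numbers below are made up for illustration.
import numpy as np

n_states, n_actions = 2, 2
gamma = 0.9
# P[s, a, s'] = transition probability, R[s, a] = expected reward (hypothetical values).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.0, 1.0], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V = np.zeros(n_states)
for _ in range(1000):                # sweep until (approximate) convergence
    Q = R + gamma * (P @ V)          # Q[s, a] = R(s,a) + gamma * E[V(s') | s, a]
    V_new = Q.max(axis=1)            # Bellman optimality backup
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

print("optimal values:", V.round(3), "greedy policy:", Q.argmax(axis=1))
```

The POMDP and Dec-POMDP items change only what the agent sees: it receives an observation rather than the state, so values can no longer be indexed by s directly.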
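Next, a minimal sketch contrasting the off-policy Q-learning update with the on-policy SARSA update, using epsilon-greedy action selection for the exploration-exploitation trade-off. The five-state random-walk chain is a hypothetical toy environment, not one used in the course.

```python
# A minimal sketch contrasting two tabular updates on a hypothetical
# 5-state random-walk chain:
#   Q-learning (off-policy): Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
#   SARSA      (on-policy):  Q(s,a) += alpha * (r + gamma * Q(s',a')        - Q(s,a))
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
alpha, gamma, eps = 0.1, 0.95, 0.1

def step(s, a):
    """Left from state 0 ends with reward 0; right from state 4 ends with reward 1."""
    if a == 0 and s == 0:
        return s, 0.0, True
    if a == 1 and s == n_states - 1:
        return s, 1.0, True
    return s + (1 if a == 1 else -1), 0.0, False

def eps_greedy(q_row):
    """Epsilon-greedy exploration with random tie-breaking."""
    if rng.random() < eps:
        return int(rng.integers(n_actions))
    return int(rng.choice(np.flatnonzero(q_row == q_row.max())))

def q_learning(Q, s, a, r, s2, a2, done):   # bootstrap on the greedy next action
    target = r + (0.0 if done else gamma * Q[s2].max())
    Q[s, a] += alpha * (target - Q[s, a])

def sarsa(Q, s, a, r, s2, a2, done):        # bootstrap on the action actually taken
    target = r + (0.0 if done else gamma * Q[s2, a2])
    Q[s, a] += alpha * (target - Q[s, a])

def train(update, episodes=500):
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s, done = n_states // 2, False
        a = eps_greedy(Q[s])
        while not done:
            s2, r, done = step(s, a)
            a2 = eps_greedy(Q[s2])
            update(Q, s, a, r, s2, a2, done)
            s, a = s2, a2
    return Q

print("Q-learning:\n", train(q_learning).round(2))
print("SARSA:\n", train(sarsa).round(2))
```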
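For the function-approximation section, a minimal sketch of REINFORCE with a softmax policy. To keep it self-contained, each "episode" is a single pull of a hypothetical three-armed bandit, so the return is just the immediate reward; the update is the standard score-function (policy-gradient) estimator with a running baseline.

```python
# A minimal sketch of REINFORCE with a softmax policy on a hypothetical
# 3-armed bandit (one-step episodes, so the return G is the reward):
#   theta += lr * (G - baseline) * grad_theta log pi_theta(a)
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.1, 0.5, 0.9])   # hypothetical expected reward of each arm
theta = np.zeros(3)                      # softmax policy parameters (logits)
lr, baseline = 0.1, 0.0

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for _ in range(5000):
    pi = softmax(theta)
    a = rng.choice(3, p=pi)                       # sample an action from the policy
    G = true_means[a] + rng.normal(0.0, 0.1)      # return of this one-step episode
    baseline += 0.01 * (G - baseline)             # running-average baseline (variance reduction)
    grad_log_pi = -pi                             # d log pi(a) / d theta_k = 1{k=a} - pi_k
    grad_log_pi[a] += 1.0
    theta += lr * (G - baseline) * grad_log_pi    # policy-gradient (REINFORCE) update

print("learned policy:", softmax(theta).round(3))  # probability mass concentrates on arm 2
```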
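For DQN, a minimal sketch of the two ingredients named in the lecture: an experience replay buffer and fixed Q-targets. To stay dependency-free, a tabular Q-array stands in for the deep network, and the environment is the same hypothetical random-walk chain as in the Q-learning sketch; only the replay and target-network mechanics are the point.

```python
# A minimal sketch of DQN's experience replay and fixed Q-targets. A tabular
# Q-array stands in for the deep network; the environment is the hypothetical
# 5-state random-walk chain from the Q-learning sketch above.
from collections import deque
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2
alpha, gamma, eps = 0.1, 0.95, 0.1
batch_size, target_every = 32, 200

def step(s, a):
    if a == 0 and s == 0:
        return s, 0.0, True
    if a == 1 and s == n_states - 1:
        return s, 1.0, True
    return s + (1 if a == 1 else -1), 0.0, False

def eps_greedy(q_row):
    if rng.random() < eps:
        return int(rng.integers(n_actions))
    return int(rng.choice(np.flatnonzero(q_row == q_row.max())))

Q = np.zeros((n_states, n_actions))      # "online network"
Q_target = Q.copy()                      # fixed Q-target network (updated only periodically)
replay = deque(maxlen=10_000)            # experience replay buffer
total_steps = 0

for _ in range(500):                     # episodes
    s, done = n_states // 2, False
    while not done:
        a = eps_greedy(Q[s])
        s2, r, done = step(s, a)
        replay.append((s, a, r, s2, done))                # store the transition
        s = s2
        total_steps += 1

        if len(replay) >= batch_size:                     # learn from decorrelated samples
            for i in rng.choice(len(replay), size=batch_size, replace=False):
                bs, ba, br, bs2, bdone = replay[i]
                target = br + (0.0 if bdone else gamma * Q_target[bs2].max())
                Q[bs, ba] += alpha * (target - Q[bs, ba])
        if total_steps % target_every == 0:
            Q_target = Q.copy()                           # refresh the fixed targets

print(Q.round(2))
```

DRQN keeps the same training loop but replaces the feed-forward Q-network with an LSTM so the agent can integrate an observation history under partial observability.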
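Finally, for the multi-agent section, a minimal sketch of the value-decomposition idea behind VDN: the joint action-value is represented as a sum of per-agent utilities, so the greedy joint action can be recovered by each agent maximising its own utility; QMIX replaces the plain sum with a monotonic mixing network. The utilities below are made-up numbers.

```python
# A minimal sketch of VDN-style value decomposition for two agents with three
# actions each. The per-agent utilities are made-up numbers.
import numpy as np

# Q_i[i, a] = utility of agent i taking action a given its own observation.
Q_i = np.array([[0.2, 1.0, 0.1],     # agent 0
                [0.5, 0.3, 0.8]])    # agent 1

# VDN: Q_tot(a0, a1) = Q_0(a0) + Q_1(a1)   (QMIX would mix these monotonically instead)
Q_tot = Q_i[0][:, None] + Q_i[1][None, :]

centralised = tuple(int(i) for i in np.unravel_index(Q_tot.argmax(), Q_tot.shape))
decentralised = tuple(int(row.argmax()) for row in Q_i)   # each agent argmaxes alone
print(centralised, decentralised)   # both are (1, 2): the factorisation preserves the greedy joint action
```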

Resources

  • Wikipedia
  • David Silver’s Reinforcement-learning course

Who this course is for:

  • Anyone interested in deep reinforcement-learning


Course content

6 sections • 23 lectures • 41min total length

Model types: 3 lectures • 3min

  • Markov decision process (MDP): 00:53
  • Markov decision process (MDP): 3 questions
  • Partially observable Markov decision process (POMDP): 01:18
  • Partially observable Markov decision process (POMDP): 2 questions
  • Decentralized partially observable Markov decision process (Dec-POMDP): 00:57
  • Decentralized partially observable Markov decision process (Dec-POMDP): 1 question

Algorithms and approaches: 7 lectures • 5min

  • Bellman equations: 00:47
  • Bellman equations: 3 questions
  • Model-free: 00:19
  • Model-free: 2 questions
  • Off-policy: 00:19
  • Off-policy: 2 questions
  • Exploration-exploitation: 00:47
  • Exploration-exploitation: 3 questions
  • Value iteration: 00:54
  • Value iteration: 3 questions
  • SARSA: 01:13
  • SARSA: 3 questions
  • Q-learning: 00:54
  • Q-learning: 3 questions

Function approximation: 3 lectures • 3min

  • Function approximators: 00:26
  • Function approximators: 3 questions
  • Policy gradient: 01:34
  • Policy gradient: 3 questions
  • REINFORCE: 01:08
  • REINFORCE: 3 questions

Deep reinforcement-learning: 3 lectures • 5min

  • Deep Q-Network (DQN): 01:19
  • Deep Q-Network (DQN): 3 questions
  • Deep Recurrent Q-Learning (DRQN): 01:02
  • Deep Recurrent Q-Learning (DRQN): 3 questions
  • Optimistic Exploration with Pessimistic Initialization (OPIQ): 02:29
  • Optimistic Exploration with Pessimistic Initialization (OPIQ): 3 questions

Deep Multi-agent Reinforcement-learning: 4 lectures • 5min

  • Value Decomposition Networks (VDN): 01:17
  • Value Decomposition Networks (VDN): 3 questions
  • QMIX: 01:04
  • QMIX: 3 questions
  • QTRAN: 01:46
  • QTRAN: 3 questions
  • Weighted QMIX: 01:17
  • Weighted QMIX: 3 questions

Extra content: 3 lectures • 20min

  • GPT-3: 07:03
  • DALL-E: 05:09
  • CLIP: 07:37
