
Machine Learning

Alexey Badalov, 2016-05

Atari 2600
In 2015, an AI learned to play 49 different video
games with no instructions, just by trying the
controls and looking at the screen and the score.

Video Pinball

Boxing

Breakout

Stargunner

AlphaGo
In 2016, the AlphaGo AI defeated Lee Se-dol, the
world's second-best Go player, an achievement
comparable to Deep Blue's victory over Garry
Kasparov in chess 20 years earlier.

Image from The Verge

definition
Learning is the acquisition of knowledge or skills
through study, experience, or being taught.
Oxford Dictionary

Learning is a change in probability of response.
B. F. Skinner

Learning is improving performance in a task with
experience.
Tom Mitchell

reinforcement learning
You change the world from one state to another
through your actions, and sometimes you get
rewarded for it.
The meaning of life is to get the maximum total
reward.
Reinforcement learning algorithms can gradually
explore the different states and learn the long-term
reward values of different actions.
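Gradually exploring states and learning long-term reward values can be sketched with a tiny tabular Q-learning loop. This is a minimal illustration, not the algorithm from the talk: the five-state world and all parameter values below are invented.

```python
import random

random.seed(0)  # make the run repeatable

# Toy world: states 0..4 in a row; stepping right from state 3 into
# state 4 earns a reward of 1, every other move earns 0.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]  # step left, step right

def step(state, action):
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0)

# Q[s][a] is the current estimate of the long-term reward of
# taking action a in state s.
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for _ in range(500):
    s = 0
    while s != GOAL:
        # Explore a random action occasionally, otherwise exploit.
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = Q[s].index(max(Q[s]))
        s2, r = step(s, ACTIONS[a])
        # Nudge the estimate toward reward + discounted future value.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# After training, stepping right should look better than stepping
# left in every non-goal state.
print([row.index(max(row)) for row in Q[:GOAL]])
```

Because the discount factor gamma is below 1, states closer to the reward end up with higher values; this is how the long-term worth of an action propagates backwards from the reward.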

human visual system

convolutional neural
networks
Work analogously to the primary visual cortex:
successive layers extract higher-level features.
Networks can be many layers deep, but finding the
right depth and structure is not an exact science.
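The layer-by-layer feature extraction can be illustrated in plain Python. This is a hedged sketch: the kernels below are hand-picked toy "feature detectors", whereas a real convolutional network learns its kernels from data.

```python
def conv2d(image, kernel):
    """Slide a small kernel over the image (valid padding, stride 1),
    applying a ReLU non-linearity to each output value."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(max(s, 0))  # ReLU
        out.append(row)
    return out

# A 6x6 input with a vertical edge down the middle.
img = [[1, 1, 1, 0, 0, 0] for _ in range(6)]

# Layer 1: a vertical-edge detector (a simple low-level feature).
edge = conv2d(img, [[1, -1], [1, -1]])

# Layer 2: combines layer-1 responses into a broader pattern,
# a stand-in for a "higher-level" feature.
feature = conv2d(edge, [[1, 1], [1, 1]])
```

Each layer sees only the previous layer's output, so later layers respond to progressively larger and more abstract patterns, which is the analogy with successive stages of the visual cortex.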

Atari 2600

the human benchmark

1 person
game tester, but not a professional gamer
2 hours to practice for each game
no sound
no pause, no save/load

The AI reached at least 75% of his score on 29 of the 49 games.

teaching the AI
38 days' worth of recorded games were split into
SARSA fragments, which were used to train a
neural network.

S: state before the reward
A: action leading to the reward
R: reward
S': state after the reward
A': the following action

The state is 4 consecutive video frames.
The action is some combination of Atari 2600 controls.
The reward is either -1, 0, or 1, depending on which
way the score changes.

The network learned to recognize situations in
which to press specific buttons to keep increasing
the score.
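Each (S, A, R, S', A') fragment drives one update of the agent's value estimate. With a lookup table standing in for the neural network (a simplification; the two-state table and all numbers below are made up), the update looks like this:

```python
alpha = 0.1   # learning rate
gamma = 0.99  # discount: how much future reward matters

def sarsa_update(Q, s, a, r, s2, a2):
    """Nudge Q[s][a] toward the observed reward plus the discounted
    value of the action actually taken next (hence S, A, R, S', A')."""
    target = r + gamma * Q[s2][a2]
    Q[s][a] += alpha * (target - Q[s][a])

# Hypothetical two-state, two-action table.
Q = {0: {0: 0.0, 1: 0.0}, 1: {0: 0.0, 1: 0.5}}
sarsa_update(Q, s=0, a=1, r=1.0, s2=1, a2=1)
print(Q[0][1])  # moved from 0.0 toward 1.0 + 0.99 * 0.5
```

The deep network learns the same quantity, but generalizes across the enormous space of screen states instead of storing one entry per state.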

AlphaGo

Go
Players take turns placing stones on the board.
Stones that get surrounded are removed from the
board.
The goal is to capture the most territory and stones.
On the order of 10^170 different states, compared to
10^47 in chess.
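The Go figure can be sanity-checked with a quick upper bound: each of the 19 x 19 = 361 points is empty, black, or white.

```python
import math

# Upper bound on Go board configurations: 3 choices per point,
# 361 points, so at most 3^361 = 10^(361 * log10(3)) configurations.
exponent = 361 * math.log10(3)
print(round(exponent))  # 172; legal positions are fewer, ~10^170
```

The bound of about 10^172 overcounts because it includes illegal positions; the accepted count of legal positions is roughly 10^170, matching the figure above.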

teaching the AI
SARSA fragments based on 29,400,000 positions
from 160,000 games were used to teach two neural
networks. 50 GPUs worked for a month.

S: state before the reward
A: action leading to the reward
R: reward
S': state after the reward
A': the following action

The state is a carefully designed set of parameters
about each board position, such as stone colour,
turns since the last move, legality, number of
liberties, etc.
The action is a position for placing a stone.
The reward is -1 if the fragment comes from a losing
game and 1 if it comes from a winning game.

differences from Atari 2600


2 neural networks:
- a value network that judges the value of different
board positions at a glance, serving to replace
human intuition
- a policy network that, as with the Atari 2600
games, learns to recognize situations on the board
and the moves that will bring the maximum reward
Monte Carlo Tree Search: a forward-looking
algorithm that evaluates moves by repeatedly
playing against itself using the value and policy
neural networks.
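The core Monte Carlo idea, judging a move by the outcomes of many simulated games played from it, can be sketched on a toy game. This is flat Monte Carlo evaluation, not full MCTS (no search tree, no value or policy networks), and the game of Nim here is only an illustration.

```python
import random

random.seed(1)  # repeatable playouts

# Toy game (Nim): a pile of stones, players alternately remove 1-3;
# whoever takes the last stone wins.
def moves(pile):
    return [m for m in (1, 2, 3) if m <= pile]

def rollout(pile, to_move):
    """Finish the game with uniformly random moves; return the winner."""
    while True:
        pile -= random.choice(moves(pile))
        if pile == 0:
            return to_move
        to_move = 1 - to_move

def best_move(pile, playouts=2000):
    """Score each legal move for player 0 by the fraction of random
    playouts player 0 goes on to win after making it."""
    scores = {}
    for m in moves(pile):
        wins = 0
        for _ in range(playouts):
            if pile - m == 0:
                wins += 1  # taking the last stone wins immediately
            elif rollout(pile - m, to_move=1) == 0:
                wins += 1
        scores[m] = wins / playouts
    return max(scores, key=scores.get)

# From a pile of 5, leaving the opponent a multiple of 4 is the
# strong move, and random playouts already favour it.
print(best_move(5))
```

AlphaGo's search follows the same principle but grows a tree of positions, biases playouts with the policy network, and blends rollout results with the value network's judgement instead of relying on random play alone.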

the match
The game lasted 3.5 hours.
Lee Se-dol is the world's second-best player. He
says he does not consider AlphaGo a superior
player; a large factor in his loss was the novelty
of playing against a non-human opponent.

1202 CPUs
176 GPUs
100+ scientists

1 human brain
1 cup of coffee
