Action Recognition: Step-by-step Recognizing Actions with Python and Recurrent Neural Network

Length:
103 pages
51 minutes
Released:
Jul 2, 2019
ISBN:
9780463951262
Format:
Book

Description

* Research fields: Computer Vision and Machine Learning.
* Book Topic: Action recognition from videos.
* Recognition Tool: Recurrent Neural Network (RNN) with an LSTM (Long Short-Term Memory) layer and a fully connected layer.
* Programming Language: Step-by-step implementation with Python in Jupyter Notebook.
* Major Steps: Building a network, training the network, testing the network, comparing the network with an SVM (Support Vector Machines) classifier.
* Processing Units to Execute the Codes: CPU and GPU (on Google Colaboratory).
* Image Feature Extraction Tool: Pretrained VGG16 network.
* Dataset: UCF101 (the first 15 actions, 2010 videos).
* Main Results: For the testing data, the highest prediction accuracy from the RNN is 86.97%, which is a little higher than that from the SVM classifier (86.09%).
* Detailed Description:
A Recurrent Neural Network (RNN) is well suited to video action recognition. This book builds an RNN with an LSTM (Long Short-Term Memory) layer and a fully connected layer for that task.
The RNN is trained and evaluated on VGG16 features saved in .mat files. The features are extracted with a modified pretrained VGG16 network from images, and the images are converted from videos in the UCF101 dataset, which contains 13,320 videos covering 101 different actions. Please note that only the first 15 actions in this dataset are used for recognition.
The code is implemented step by step with Python in Jupyter Notebook and can be executed on both CPUs and GPUs; free GPUs on Google Colaboratory were used as hardware accelerators for most of the calculations.
To reach a higher testing accuracy, the architecture of the network was adjusted, and the parameters of the network and its optimizer were fine-tuned.
For comparison purposes only, an SVM (Support Vector Machines) classifier was also trained and tested.
For the first 15 actions in the UCF101 dataset, the highest prediction accuracy on the testing data from the RNN is 86.97%, which is a little higher than that from the SVM classifier (86.09%).
In conclusion, the RNN and the SVM classifier perform approximately the same on the task in this book, which is a little embarrassing. However, RNNs do have their own advantages in many other cases in the fields of Computer Vision and Machine Learning, and the implementation in this book can serve as an introduction to the topic, in the spirit of throwing out a minnow to catch a whale.

About the author

Dr. Magic is a Senior Software Engineer living in Long Island, New York. He loves reading and writing. He is very interested in Computer Vision and Machine Learning. He has concentrated on image processing for more than five years.

Book Sample

Action Recognition - Mark Magic

Action Recognition

Step-by-step Recognizing Actions with Python and Recurrent Neural Network

By Dr. Mark Magic

Long Island, NY, USA

The author and the editor have taken care in the preparation of this book and taken great efforts to ensure that the information and instructions contained in this book are accurate, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions.

No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. Use of the contents contained in this book is at your own risk.

If any code samples or techniques contained or described in this book are subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

Action Recognition: Step-by-step Recognizing Actions with Python and Recurrent Neural Network

Copyright 2019 Dr. Mark Magic All rights reserved.

Published by M.J. Magic Publishing. This publication is protected by copyright, and permission must be obtained from the author prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission to use material from this work, please submit a written request to Dr. Mark Magic: mark.john.magic@gmail.com.

This ebook is licensed for your personal enjoyment only. This ebook may not be re-sold or given away to other people. If you would like to share this book with another person, please purchase an additional copy for each recipient. If you’re reading this book and did not purchase it, or it was not purchased for your use only, then please return to your favorite ebook retailer and purchase your own copy. Thank you for respecting the hard work of this author.

Please remember to leave a review for this book at your favorite retailer.

This book is available in print at most online retailers.

First edition: July 2019

Table of Contents

Chapter 1: Introduction

Chapter 2: Feature Extraction and Dataset Loading

Chapter 3: Modelling with Long-Short Term Memory (LSTM) Network

Chapter 4: Model Evaluation

Chapter 5: Model Improvements

Chapter 6: Conclusions

Appendix

A.1. All codes in extract_UCF101_images.py

A.2. All codes in extract_vgg16_feat.py

A.3. All codes in Action_Recognition.ipynb

References

Postscript

About Dr. Mark Magic

Connect with Dr. Mark Magic

Other books by Dr. Mark Magic

Chapter 1: Introduction

A Recurrent Neural Network (RNN) [¹], especially one using the LSTM (Long Short-Term Memory) algorithm [²], is a great tool for video action recognition, and that is the topic of this book. The code is implemented step by step with Python in Jupyter Notebook [³] and can be run on both CPUs and GPUs. The dataset is UCF101 [⁴,⁵], which was developed by Soomro et al. at the University of Central Florida. It has 101 different actions/classes.
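As a small illustration of how the dataset is organized, the sketch below parses UCF101-style video file names, which follow the pattern v_<Action>_g<group>_c<clip>.avi, and maps a subset of actions to integer labels. The helper names (parse_ucf101_name, build_label_map) and the sample file names are hypothetical, and it is an assumption here that "the first actions" means the first ones in alphabetical order; this is not necessarily how the book's own code selects them.

```python
import re
from typing import Dict, List, NamedTuple, Optional

# UCF101 clips are named like "v_ApplyEyeMakeup_g01_c01.avi":
# action name, group number, clip number within the group.
_NAME_RE = re.compile(r"^v_([A-Za-z]+)_g(\d+)_c(\d+)\.avi$")


class Clip(NamedTuple):
    action: str
    group: int
    clip: int


def parse_ucf101_name(filename: str) -> Optional[Clip]:
    """Return the parsed parts of a UCF101 file name, or None if it does not match."""
    match = _NAME_RE.match(filename)
    if match is None:
        return None
    return Clip(match.group(1), int(match.group(2)), int(match.group(3)))


def build_label_map(filenames: List[str], num_actions: int) -> Dict[str, int]:
    """Map the first `num_actions` action names (alphabetical order) to integer labels."""
    clips = (parse_ucf101_name(name) for name in filenames)
    actions = sorted({clip.action for clip in clips if clip is not None})
    return {name: idx for idx, name in enumerate(actions[:num_actions])}


# Hypothetical sample of file names, for illustration only.
sample_files = [
    "v_Archery_g02_c03.avi",
    "v_ApplyEyeMakeup_g01_c01.avi",
    "v_ApplyLipstick_g05_c02.avi",
    "v_YoYo_g25_c05.avi",
]
label_map = build_label_map(sample_files, num_actions=3)
print(label_map)
```

With the full dataset, passing num_actions=15 would keep the 15 alphabetically first actions under the same assumption.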

The major steps of the implementation are: first, convert the videos to images; second, extract features from each image with the pretrained VGG16 network [⁶,⁷,⁸]; next, separate the features into training data and testing data with their corresponding labels; then, define an RNN with LSTM and train it with the training data; and last, evaluate the RNN with the testing data. For comparison purposes only, a Support Vector Machines (SVM) [⁹,¹⁰,¹¹] classifier is also trained and tested on the same dataset.
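The last two steps (define and train an RNN with LSTM, then evaluate it) can be sketched in PyTorch roughly as follows. This is a minimal sketch, not the book's actual network: the class name ActionLSTM, the hidden size, the 4096-dimensional feature size, the learning rate, and the random stand-in data are all assumptions made for illustration.

```python
import torch
from torch import nn


class ActionLSTM(nn.Module):
    """An LSTM layer followed by a fully connected layer; one prediction per video."""

    def __init__(self, feature_dim: int, hidden_dim: int, num_classes: int):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames has shape (batch, num_frames, feature_dim).
        _, (last_hidden, _) = self.lstm(frames)
        # last_hidden has shape (num_layers, batch, hidden_dim); use the top layer.
        return self.fc(last_hidden[-1])


torch.manual_seed(0)

# Assumed sizes: 4096-dim per-frame features, 15 action classes.
feature_dim, hidden_dim, num_classes = 4096, 64, 15

# Random stand-in data: 8 videos with 10 frames each (not real VGG16 features).
features = torch.randn(8, 10, feature_dim)
labels = torch.randint(0, num_classes, (8,))

model = ActionLSTM(feature_dim, hidden_dim, num_classes)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One training step on the stand-in batch.
model.train()
optimizer.zero_grad()
logits = model(features)
loss = loss_fn(logits, labels)
loss.backward()
optimizer.step()

# Evaluation: predicted class is the index of the largest logit.
model.eval()
with torch.no_grad():
    predictions = model(features).argmax(dim=1)
accuracy = (predictions == labels).float().mean().item()
print(f"loss={loss.item():.3f} accuracy={accuracy:.2%}")
```

Only the final hidden state is passed to the fully connected layer here, so each video gets a single class prediction regardless of its number of frames; other choices, such as averaging the outputs over all frames, are equally plausible.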

Python [¹²] is one of the best programming languages for tasks in the fields of Computer Vision and Machine Learning, which is why we chose it to implement the action recognition task in this book. Python is an interpreted, high-level, general-purpose programming language. Its design philosophy emphasizes code readability, notably through significant whitespace. It features automatic memory management. It supports multiple programming paradigms, including object-oriented, imperative, functional and procedural. It has a large and comprehensive standard library.

The Anaconda distribution of Python can be downloaded from https://www.anaconda.com/download. We use Python 3.7 in this book. After downloading Anaconda3-2018.12-Windows-x86_64.exe for 64-bit Windows operating systems, first install it with the default settings; then open Anaconda Prompt and install several libraries: install opencv-python using pip install opencv-python==3.4.3.18, install PyTorch [¹³] using pip install https://download.pytorch.org/whl/cpu/torch-1.0.0-cp37-cp37m-win_amd64.whl, and install torchvision using pip install torchvision. Jupyter Notebook is included in this distribution by default.
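After installation, a quick check such as the one below can confirm that the libraries are visible to Python. It is only a convenience sketch using the standard library; the function name find_missing is hypothetical, and note that opencv-python is imported under the module name cv2.

```python
import importlib.util
import sys

# Import names of the libraries installed with pip (opencv-python is imported as cv2).
REQUIRED_MODULES = ["cv2", "torch", "torchvision"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


print("Python version:", sys.version.split()[0])
missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are available.")
```

The check only looks the modules up without importing them, so it runs quickly even when PyTorch is installed.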

The hardware settings for the calculations on CPUs in this book are: Lenovo ThinkPad T440s PC; Windows 8.1, 64-bit Operating System; Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz 2.69 GHz Processor; and 12.0 GB RAM.

On CPUs, the calculations are very slow, and sometimes the computer may even become unresponsive or crash; therefore, GPUs on Google Colaboratory [¹⁴] are used as hardware accelerators for most of the calculations. For details on how to use Colaboratory, please see Chapter 8 of the book "Image Classification: Step-by-step Classifying Images with Python and Techniques of Computer Vision and Machine Learning" [¹⁵].

There are two major drawbacks of Colaboratory: the connection may be lost if there is no activity for about 10 to 20 minutes, even if a file is uploading or a
