F# for Machine Learning Essentials
Ebook, 353 pages, 1 hour

About this ebook

Get up and running with machine learning with F# in a fun and functional way

About This Book

- Design algorithms in F# to tackle complex computing problems
- Be a proficient F# data scientist using this simple-to-follow guide
- Solve real-world, data-related problems with robust statistical models, built for a range of datasets

Who This Book Is For

If you are a C# or an F# developer who now wants to explore the area of machine learning, then this book is for you. Familiarity with theoretical concepts and notation of mathematics and statistics would be an added advantage.

What You Will Learn

- Use F# to find patterns through raw data
- Build a set of classification systems using Accord.NET, Weka, and F#
- Run machine learning jobs on the Cloud with MBrace
- Perform mathematical operations on matrices and vectors using Math.NET
- Use a recommender system for your own problem domain
- Identify tourist spots across the globe using inputs from the user with decision tree algorithms

In Detail

The F# functional programming language enables developers to write simple code to solve complex problems. With F#, developers create consistent and predictable programs that are easier to test and reuse, simpler to parallelize, and less prone to bugs.
If you want to learn how to use F# to build machine learning systems, then this is the book you want.
Starting with an introduction to the several categories of machine learning, you will quickly learn to implement time-tested supervised learning algorithms. You will gradually move on to predicting house prices using regression analysis. You will then learn to use Accord.NET to implement SVM techniques and clustering. You will also learn to build a recommender system for your e-commerce site from scratch. Finally, you will dive into advanced topics such as implementing neural network algorithms while performing sentiment analysis on your data.

Style and approach

This book is a fast-paced tutorial guide that uses hands-on examples to explain real-world applications of machine learning. Using practical examples, the book will explore several machine learning techniques and also describe how you can use F# to build machine learning systems.
Language: English
Release date: Feb 25, 2016
ISBN: 9781783989355

    Book preview

    F# for Machine Learning Essentials - Sudipta Mukherjee

    Table of Contents

    F# for Machine Learning Essentials

    Credits

    Foreword

    About the Author

    Acknowledgments

    About the Reviewers

    www.PacktPub.com

    eBooks, discount offers, and more

    Why subscribe?

    Preface

    What this book covers

    What you need for this book

    Who this book is for

    Conventions

    Reader feedback

    Customer support

    Downloading the example code

    Downloading the color images of this book

    Errata

    Piracy

    Questions

    1. Introduction to Machine Learning

    Objective

    Getting in touch

    Different areas where machine learning is being used

    Why use F#?

    Supervised machine learning

    Training and test dataset/corpus

    Some motivating real life examples of supervised learning

    Nearest Neighbour algorithm (a.k.a k-NN algorithm)

    Distance metrics

    Decision tree algorithms

    Linear regression

    Logistic regression

    Recommender systems

    Unsupervised learning

    Machine learning frameworks

    Machine learning for fun and profit

    Recognizing handwritten digits – your Hello World ML program

    How does this work?

    Summary

    2. Linear Regression

    Objective

    Different types of linear regression algorithms

    APIs used

    Math.NET Numerics for F# 3.7.0

    Getting Math.NET

    Experimenting with Math.NET

    The basics of matrices and vectors (a short and sweet refresher)

    Creating a vector

    Creating a matrix

    Finding the transpose of a matrix

    Finding the inverse of a matrix

    Trace of a matrix

    QR decomposition of a matrix

    SVD of a matrix

    Linear regression method of least square

    Finding linear regression coefficients using F#

    Finding the linear regression coefficients using Math.NET

    Putting it together with Math.NET and FsPlot

    Multiple linear regression

    Multiple linear regression and variations using Math.NET

    Weighted linear regression

    Plotting the result of multiple linear regression

    Ridge regression

    Multivariate multiple linear regression

    Feature scaling

    Summary

    3. Classification Techniques

    Objective

    Different classification algorithms you will learn

    Some interesting things you can do

    Binary classification using k-NN

    How does it work?

    Finding cancerous cells using k-NN: a case study

    Understanding logistic regression

    The sigmoid function chart

    Binary classification using logistic regression (using Accord.NET)

    Multiclass classification using logistic regression

    How does it work?

    Multiclass classification using decision trees

    Obtaining and using WekaSharp

    How does it work?

    Predicting a traffic jam using a decision tree: a case study

    Challenge yourself!

    Summary

    4. Information Retrieval

    Objective

    Different IR algorithms you will learn

    What interesting things can you do?

    Information retrieval using tf-idf

    Measures of similarity

    Generating a PDF from a histogram

    Minkowski family

    L1 family

    Intersection family

    Inner Product family

    Fidelity family or squared-chord family

    Squared L2 family

    Shannon's Entropy family

    Combinations

    Set-based similarity measures

    Similarity of asymmetric binary attributes

    Some example usages of distance metrics

    Finding similar cookies using asymmetric binary similarity measures

    Grouping/clustering color images based on Canberra distance

    Summary

    5. Collaborative Filtering

    Objective

    Different classification algorithms you will learn

    Vocabulary of collaborative filtering

    Baseline predictors

    Basis of User-User collaborative filtering

    Implementing basic user-user collaborative filtering using F#

    Code walkthrough

    Variations of gap calculations and similarity measures

    Item-item collaborative filtering

    Top-N recommendations

    Evaluating recommendations

    Prediction accuracy

    Confusion matrix (decision support)

    Ranking accuracy metrics

    Prediction-rating correlation

    Working with real movie review data (Movie Lens)

    Summary

    6. Sentiment Analysis

    Objective

    What you will learn

    A baseline algorithm for SA using SentiWordNet lexicons

    Handling negations

    Identifying praise or criticism with sentiment orientation

    Pointwise Mutual Information

    Using SO-PMI to find sentiment analysis

    Summary

    7. Anomaly Detection

    Objective

    Different classification algorithms

    Some cool things you will do

    The different types of anomalies

    Detecting point anomalies using IQR (Interquartile Range)

    Detecting point anomalies using Grubbs' test

    Grubbs' test for multivariate data using Mahalanobis distance

    Code walkthrough

    Chi-squared statistic to determine anomalies

    Detecting anomalies using density estimation

    Strategy to convert a collective anomaly to a point anomaly problem

    Dealing with categorical data in collective anomalies

    Summary

    Index

    F# for Machine Learning Essentials

    Copyright © 2016 Packt Publishing

    All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

    Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

    Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

    First published: February 2016

    Production reference: 1190216

    Published by Packt Publishing Ltd.

    Livery Place

    35 Livery Street

    Birmingham B3 2PB, UK.

    ISBN 978-1-78398-934-8

    www.packtpub.com

    Credits

    Author

    Sudipta Mukherjee

    Reviewers

    Alena Hall

    David Stephens

    Commissioning Editor

    Ashwin Nair

    Acquisition Editors

    Harsha Bharwani

    Larissa Pinto

    Content Development Editor

    Athira Laji

    Technical Editor

    Ryan Kochery

    Copy Editor

    Alpha Singh

    Project Coordinator

    Bijal Patel

    Proofreader

    Safis Editing

    Indexer

    Rekha Nair

    Graphics

    Abhinash Sahu

    Production Coordinator

    Aparna Bhagat

    Cover Work

    Aparna Bhagat

    Foreword

    Machine Learning (ML) is one of the most impactful technologies of the last 10 years, fueled by the exponential growth of electronic data about people and their interaction with the world and each other, as well as the availability of massive computing power to extract patterns from data. Applications of ML are already affecting all of us in everyday life, whether it's face recognition in modern cameras, personalized web or product searches, or even the detection of road sign patterns in modern cars. Machine learning is a set of algorithms that learn prediction programs from past data in order to use them for future predictions—whether the prediction programs are represented as decision trees, as neural networks, or via nearest-neighbor functions.

    Another influential development in computer science is the invention of F#. Less than 10 years ago, functional programming was more of an academic endeavor than a style of programming and software development used in production systems. The development of F# since 2005 changed this forever. With F#, programmers not only benefit from type inference and easy parallelization of workflows, but also get the runtime performance that they are used to from programming in other .NET languages, such as C#. I personally witnessed this transformation at Microsoft Research and saw how data-intensive applications could be written much more safely in less than 100 lines of F# code compared to thousands of lines of C# code.

    A critically important ingredient of ML is data; it's the lifeblood of any ML algorithm. Parsing, cleaning, and visualizing data is the basis of any successful ML application and takes up the majority of the time that practitioners spend making machine learning systems work. F# proves to be the perfect bridge between data processing and analysis with ML on the one hand and the ability to invent new ML algorithms on the other.

    In this book, Sudipta Mukherjee introduces the reader to the basics of machine learning, ranging from supervised methods, such as classification learning and regression, to unsupervised methods, such as K-means clustering. Sudipta focuses on the applied aspects of machine learning and develops all algorithms in F#, both natively and by integrating with .NET libraries such as WekaSharp, Accord.NET, and Math.NET. He covers a wide range of algorithms for classification and regression learning and also explores more novel ML concepts, such as anomaly detection. The book is enriched with directly applicable source code examples, and the reader will enjoy learning about modern machine learning algorithms through the numerous examples provided.

    Dr. Ralf Herbrich

    Director of Machine Learning Science at Amazon

    About the Author

    Sudipta Mukherjee was born in Kolkata and migrated to Bangalore. He is an electronics engineer by education and a computer engineer/scientist by profession and passion. He graduated in 2004 with a degree in electronics and communication engineering.

    He has a keen interest in data structures, algorithms, text processing, natural language processing tool development, programming languages, and machine learning at large. His first book on Data Structure using C has been received quite well. Parts of the book can be read on Google Books at http://goo.gl/pttSh. The book was also translated into simplified Chinese, available from Amazon.cn at http://goo.gl/lc536. This is Sudipta's second book with Packt Publishing. His first book, .NET 4.0 Generics (http://goo.gl/MN18ce), was also received very well. During the last few years, he has been hooked on the functional programming style. His book on functional programming, Thinking in LINQ (http://goo.gl/hm0lNF), was released last year. Last year, he also gave a talk at @FuConf based on his LINQ book (https://goo.gl/umdxIX). He lives in Bangalore with his wife and son.

    Sudipta can be reached via e-mail at <sudipto80@yahoo.com> and via Twitter at @samthecoder.

    Acknowledgments

    First, I want to thank Dr. Don Syme (@dsyme) and everyone in the product team who brought F# to the world and made a fantastic integration with Visual Studio. I also want to thank Professor Andrew Ng (@AndrewYNg). I first learned about machine learning from his MOOC on machine learning at Coursera (https://www.coursera.org/learn/machine-learning).

    This book couldn't have seen the light of day without a few people: my acquisition editor, Ms. Harsha Bharwani, who persuaded me to work on this book; and my development editor, Ms. Athira Laji, who tolerated many delays in the delivery schedule but kept the bar high and got me going. She is one of the most compassionate development editors I have ever worked with. Thank you, ma'am! I have been fortunate to have two very knowledgeable reviewers on board: Mr. David Stephens (the PM of the F# programming language) (@NumberByColors) and Ms. Alena Dzenisenka (@lenadroid). The book uses several open source frameworks and F#, so thanks to all the people who have contributed to these projects. I also want to say a huge thank you to Dr. Ralf Herbrich (@rherbrich), the director of machine learning science at Amazon, Berlin, for kindly writing a foreword for the book.

    Last but not least, I must say that I am very fortunate to have a very loving family, who always stood by me whenever I needed support. My wife, Mou, made sure that I had enough time to write the chapters. We couldn't go out on weekends. I promise to make up for all the missed family time. Thank you sweetheart! My
