F# for Machine Learning Essentials
About This Book
- Design algorithms in F# to tackle complex computing problems
- Be a proficient F# data scientist using this simple-to-follow guide
- Solve real-world, data-related problems with robust statistical models, built for a range of datasets
Who This Book Is For
If you are a C# or an F# developer who now wants to explore the area of machine learning, then this book is for you. Familiarity with theoretical concepts and notation of mathematics and statistics would be an added advantage.
What You Will Learn
- Use F# to find patterns in raw data
- Build a set of classification systems using Accord.NET, Weka, and F#
- Run machine learning jobs on the Cloud with MBrace
- Perform mathematical operations on matrices and vectors using Math.NET
- Use a recommender system for your own problem domain
- Identify tourist spots across the globe using inputs from the user with decision tree algorithms
In Detail
The F# functional programming language enables developers to write simple code to solve complex problems. With F#, developers create consistent and predictable programs that are easier to test and reuse, simpler to parallelize, and less prone to bugs.
If you want to learn how to use F# to build machine learning systems, then this is the book you want.
Starting with an introduction to the several categories of machine learning, you will quickly learn to implement time-tested, supervised learning algorithms. You will gradually move on to predicting housing prices using regression analysis. You will then learn to use Accord.NET to implement SVM techniques and clustering. You will also learn to build a recommender system for your e-commerce site from scratch. Finally, you will dive into advanced topics such as implementing neural network algorithms while performing sentiment analysis on your data.
Style and approach
This book is a fast-paced tutorial guide that uses hands-on examples to explain real-world applications of machine learning. Using practical examples, the book will explore several machine learning techniques and also describe how you can use F# to build machine learning systems.
F# for Machine Learning Essentials - Sudipta Mukherjee
Table of Contents
F# for Machine Learning Essentials
Credits
Foreword
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Introduction to Machine Learning
Objective
Getting in touch
Different areas where machine learning is being used
Why use F#?
Supervised machine learning
Training and test dataset/corpus
Some motivating real life examples of supervised learning
Nearest Neighbour algorithm (a.k.a k-NN algorithm)
Distance metrics
Decision tree algorithms
Linear regression
Logistic regression
Recommender systems
Unsupervised learning
Machine learning frameworks
Machine learning for fun and profit
Recognizing handwritten digits – your Hello World ML program
How does this work?
Summary
2. Linear Regression
Objective
Different types of linear regression algorithms
APIs used
Math.NET Numerics for F# 3.7.0
Getting Math.NET
Experimenting with Math.NET
The basics of matrices and vectors (a short and sweet refresher)
Creating a vector
Creating a matrix
Finding the transpose of a matrix
Finding the inverse of a matrix
Trace of a matrix
QR decomposition of a matrix
SVD of a matrix
Linear regression method of least square
Finding linear regression coefficients using F#
Finding the linear regression coefficients using Math.NET
Putting it together with Math.NET and FsPlot
Multiple linear regression
Multiple linear regression and variations using Math.NET
Weighted linear regression
Plotting the result of multiple linear regression
Ridge regression
Multivariate multiple linear regression
Feature scaling
Summary
3. Classification Techniques
Objective
Different classification algorithms you will learn
Some interesting things you can do
Binary classification using k-NN
How does it work?
Finding cancerous cells using k-NN: a case study
Understanding logistic regression
The sigmoid function chart
Binary classification using logistic regression (using Accord.NET)
Multiclass classification using logistic regression
How does it work?
Multiclass classification using decision trees
Obtaining and using WekaSharp
How does it work?
Predicting a traffic jam using a decision tree: a case study
Challenge yourself!
Summary
4. Information Retrieval
Objective
Different IR algorithms you will learn
What interesting things can you do?
Information retrieval using tf-idf
Measures of similarity
Generating a PDF from a histogram
Minkowski family
L1 family
Intersection family
Inner Product family
Fidelity family or squared-chord family
Squared L2 family
Shannon's Entropy family
Combinations
Set-based similarity measures
Similarity of asymmetric binary attributes
Some example usages of distance metrics
Finding similar cookies using asymmetric binary similarity measures
Grouping/clustering color images based on Canberra distance
Summary
5. Collaborative Filtering
Objective
Different classification algorithms you will learn
Vocabulary of collaborative filtering
Baseline predictors
Basis of User-User collaborative filtering
Implementing basic user-user collaborative filtering using F#
Code walkthrough
Variations of gap calculations and similarity measures
Item-item collaborative filtering
Top-N recommendations
Evaluating recommendations
Prediction accuracy
Confusion matrix (decision support)
Ranking accuracy metrics
Prediction-rating correlation
Working with real movie review data (Movie Lens)
Summary
6. Sentiment Analysis
Objective
What you will learn
A baseline algorithm for SA using SentiWordNet lexicons
Handling negations
Identifying praise or criticism with sentiment orientation
Pointwise Mutual Information
Using SO-PMI to find sentiment analysis
Summary
7. Anomaly Detection
Objective
Different classification algorithms
Some cool things you will do
The different types of anomalies
Detecting point anomalies using IQR (Interquartile Range)
Detecting point anomalies using Grubb's test
Grubb's test for multivariate data using Mahalanobis distance
Code walkthrough
Chi-squared statistic to determine anomalies
Detecting anomalies using density estimation
Strategy to convert a collective anomaly to a point anomaly problem
Dealing with categorical data in collective anomalies
Summary
Index
F# for Machine Learning Essentials
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: February 2016
Production reference: 1190216
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78398-934-8
www.packtpub.com
Credits
Author
Sudipta Mukherjee
Reviewers
Alena Hall
David Stephens
Commissioning Editor
Ashwin Nair
Acquisition Editors
Harsha Bharwani
Larissa Pinto
Content Development Editor
Athira Laji
Technical Editor
Ryan Kochery
Copy Editor
Alpha Singh
Project Coordinator
Bijal Patel
Proofreader
Safis Editing
Indexer
Rekha Nair
Graphics
Abhinash Sahu
Production Coordinator
Aparna Bhagat
Cover Work
Aparna Bhagat
Foreword
Machine Learning (ML) is one of the most impactful technologies of the last 10 years, fueled by the exponential growth of electronic data about people and their interaction with the world and each other, as well as the availability of massive computing power to extract patterns from data. Applications of ML are already affecting all of us in everyday life, whether it's face recognition in modern cameras, personalized web or product searches, or even the detection of road sign patterns in modern cars. Machine learning is a set of algorithms that learn prediction programs from past data in order to use them for future predictions—whether the prediction programs are represented as decision trees, as neural networks, or via nearest-neighbor functions.
Another influential development in computer science is the invention of F#. Less than 10 years ago, functional programming was more of an academic endeavor than a style of programming and software development used in production systems. The development of F# since 2005 changed this forever. With F#, programmers are not only able to benefit from type inference and easy parallelization of workflows, but they also get the runtime performance that they are used to from programming in other .NET languages, such as C#. I personally witnessed this transformation at Microsoft Research and saw how data-intensive applications could be written much more safely in less than 100 lines of F# code compared to thousands of lines of C# code.
A critically important ingredient of ML is data; it's the lifeblood of any ML algorithm. Parsing, cleaning, and visualizing data is the basis of any successful ML application and constitutes the majority of the time that practitioners spend in making machine learning systems work. F# proves to be the perfect bridge between data processing and analysis, with ML on the one hand and the ability to invent new ML algorithms on the other.
In this book, Sudipta Mukherjee introduces the reader to the basics of machine learning, ranging from supervised methods, such as classification learning and regression, to unsupervised methods, such as K-means clustering. Sudipta focuses on the applied aspects of machine learning and develops all algorithms in F#, both natively and by integrating with .NET libraries such as WekaSharp, Accord.NET, and Math.NET. He covers a wide range of algorithms for classification and regression learning and also explores more novel ML concepts, such as anomaly detection. The book is enriched with directly applicable source code examples, and the reader will enjoy learning about modern machine learning algorithms through the numerous examples provided.
Dr. Ralf Herbrich
Director of Machine Learning Science at Amazon
About the Author
Sudipta Mukherjee was born in Kolkata and migrated to Bangalore. He is an electronics engineer by education and a computer engineer/scientist by profession and passion. He graduated in 2004 with a degree in electronics and communication engineering.
He has a keen interest in data structures, algorithms, text processing, natural language processing tools development, programming languages, and machine learning at large. His first book, on data structures using C, has been received quite well. Parts of the book can be read on Google Books at http://goo.gl/pttSh. The book was also translated into simplified Chinese, available from Amazon.cn at http://goo.gl/lc536. This is Sudipta's second book with Packt Publishing. His first book, .NET 4.0 Generics (http://goo.gl/MN18ce), was also received very well. During the last few years, he has been hooked on the functional programming style. His book on functional programming, Thinking in LINQ (http://goo.gl/hm0lNF), was released last year. Last year, he also gave a talk at @FuConf based on his LINQ book (https://goo.gl/umdxIX). He lives in Bangalore with his wife and son.
Sudipta can be reached via e-mail at <sudipto80@yahoo.com> and via Twitter at @samthecoder.
Acknowledgments
First, I want to thank Dr. Don Syme (@dsyme) and everyone in the product team who brought F# to the world and made a fantastic integration with Visual Studio. I also want to thank Professor Andrew Ng (@AndrewYNg). I first learned about machine learning from his MOOC on machine learning at Coursera (https://www.coursera.org/learn/machine-learning).
This book couldn't have seen the light of day without a few people: my acquisition editor, Ms. Harsha Bharwani, who persuaded me to work on this book; and my development editor, Ms. Athira Laji, who tolerated many delays in the delivery schedule but kept the bar high and got me going. She is one of the most compassionate development editors I have ever worked with. Thank you, ma'am! I have been fortunate to have a couple of very educated reviewers on board: Mr. David Stephens (the PM of the F# programming language) (@NumberByColors) and Ms. Alena Dzenisenka (@lenadroid). The book uses several open source frameworks and F#. So, thanks to all the people who have contributed to these projects. I also want to say a huge thank you to Dr. Ralf Herbrich (@rherbrich), the director of machine learning science at Amazon, Berlin, for kindly writing a foreword for the book.
Last but not least, I must say that I am very fortunate to have a very loving family, who always stood by me whenever I needed support. My wife, Mou, made sure that I had enough time to write the chapters. We couldn't go out on weekends. I promise to make up for all the missed family time. Thank you, sweetheart! My