
Lecture 1:

Machine Learning Introduction


Machine Learning = Hacking + Math & Statistics
• The field of machine learning is concerned with the
question of how to construct computer programs
that automatically improve with experience.
• Vast amounts of data are being generated in many
fields, and the statistician’s job is to make sense of
it all: to extract important patterns and trends, and
to understand “what the data says”. We call
this learning from data.
• Machine Learning is the training of a model from
data that generalizes a decision against a
performance measure.
Machine Learning Definition
• Arthur Samuel Definition: Machine Learning is the "field of study that
gives computers the ability to learn without being explicitly
programmed."
• “A computer program is said to learn from experience E with respect
to some task T and some performance measure P, if its performance
on T, as measured by P, improves with experience E.” -- Tom Mitchell,
Carnegie Mellon University.
Machine Learning
• So if you want your program to predict, for example, traffic patterns
at a busy intersection (task T), you can run it through a machine
learning algorithm with data about past traffic patterns (experience E)
and, if it has successfully “learned”, it will then do better at predicting
future traffic patterns (performance measure P).
• ML solves problems that cannot be solved by numerical means alone.
• The field is quite vast and is expanding rapidly, being continually
partitioned and sub-partitioned.
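
The traffic example can be made concrete with a toy sketch of Mitchell's T, E, and P. Everything here is an assumption made for illustration: the `true_traffic` pattern is invented, and the "learner" is only a lookup table that memorizes the hours it has seen and falls back to their average, not a real machine learning algorithm.

```python
def true_traffic(hour):
    """Hypothetical ground truth: cars per hour at the intersection."""
    return 20 + 10 * hour


def train(observed_hours):
    """Experience E: memorize the counts for the hours observed so far."""
    memory = {h: true_traffic(h) for h in observed_hours}
    # Fallback prediction for hours never observed: the mean of what was seen.
    fallback = sum(memory.values()) / len(memory)
    return memory, fallback


def predict(model, hour):
    """Task T: predict the number of cars for a given hour."""
    memory, fallback = model
    return memory.get(hour, fallback)


def mean_absolute_error(model):
    """Performance P: average absolute error over all 24 hours of the day."""
    total = sum(abs(predict(model, h) - true_traffic(h)) for h in range(24))
    return total / 24


# Train on the first 4, 12, and 24 hours and measure P each time.
errors = {n: mean_absolute_error(train(range(n))) for n in (4, 12, 24)}
for n, err in errors.items():
    print(f"hours observed: {n:2d}  mean absolute error: {err:.1f}")
```

In this toy setting the error shrinks as more hours are observed, which is what "performance on T, as measured by P, improves with experience E" means.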
Problem Definition Framework

• I use a simple framework when defining a new problem to address
with machine learning. The framework helps me to quickly
understand the elements and motivation for the problem and
whether machine learning is suitable or not.
• The framework involves answering three questions to varying degrees
of thoroughness:
• Step 1: What is the problem?
• Step 2: Why does the problem need to be solved?
• Step 3: How would I solve the problem?
Step 1: What is the Problem
• For example: I need a program that will tell me which tweets will get retweets.
• Formalism

• Use Tom Mitchell's formalism from the definition above to define the T, P, and E for your problem.
• For example:
• Task (T): Classify a tweet that has not been published as going to get retweets or
not.
• Experience (E): A corpus of tweets for an account where some have retweets and
some do not.
• Performance (P): Classification accuracy, the number of tweets predicted
correctly out of all tweets considered as a percentage.
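
The performance measure P above can be sketched in a few lines. The label lists are made up for illustration (1 = the tweet got retweets, 0 = it did not), and `classification_accuracy` is a hypothetical helper name, not part of any library.

```python
def classification_accuracy(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


# Hypothetical labels for eight tweets: 1 = retweeted, 0 = not retweeted.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = classification_accuracy(actual, predicted)
print(f"Classification accuracy: {accuracy:.1f}%")  # 6 of 8 correct -> 75.0%
```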

Assumptions

• Create a list of assumptions about the problem and its phrasing.
• These may be rules of thumb and domain-specific information that you think will get you
to a viable solution faster.
• It can be useful to highlight questions that can be tested against real data because
breakthroughs and innovation occur when assumptions and best practice are
demonstrated to be wrong in the face of real data. It can also be useful to highlight areas
of the problem specification that may need to be challenged, relaxed or tightened.
• For example:
• The specific words used in the tweet matter to the model.
• The specific user that retweets does not matter to the model.
• The number of retweets may matter to the model.
• Older tweets are less predictive than more recent tweets.

Mapping of input to output

• How Machine Learning Algorithms Work (they learn a mapping of
input to output)
• There is a common principle that underlies all supervised machine
learning algorithms for predictive modeling.
Learning a Function

• Machine learning algorithms are described as learning a target function (f)
that best maps input variables (X) to an output variable (Y).
Y = f(X)
• This is a general learning task where we would like to make predictions in
the future (Y) given new examples of input variables (X).
• We don’t know what the function (f) looks like or its form. If we did, we
would use it directly and we would not need to learn it from data using
machine learning algorithms.
• It is harder than you think. There is also error (e) that is independent of the
input data (X).
Y = f(X) + e
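
As a minimal sketch of Y = f(X) + e, the code below simulates data from a target function and then estimates it from the samples alone. The assumptions are loud ones: the target `f` is an invented straight line (in practice it is unknown), the error e is Gaussian noise, and the learner is ordinary least squares for a single input variable, chosen only because it fits in a few lines.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def f(x):
    """Hypothetical target function; normally this is what we do not know."""
    return 2.0 + 3.0 * x


rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.uniform(0, 10) for _ in range(500)]
ys = [f(x) + rng.gauss(0, 1.0) for x in xs]  # Y = f(X) + e

intercept, slope = fit_line(xs, ys)
print(f"estimated f(x) = {intercept:.2f} + {slope:.2f} * x")
```

The estimate lands close to the true intercept and slope but not exactly on them, because the learner only ever sees the noisy Y values.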
Error
• This error may arise, for example, from not having enough attributes to
sufficiently characterize the best mapping from X to Y. It is
called irreducible error because no matter how good we get at
estimating the target function (f), we cannot reduce it.
• That is to say, learning a function from data is a
difficult problem, and this is why the field of machine
learning and machine learning algorithms exist.
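
Irreducible error can be illustrated with a small simulation. This is a sketch under stated assumptions: the target `f` is an invented line, and the error e is modeled as Gaussian noise with standard deviation `NOISE_SD` (the slides do not specify its distribution). Even a predictor that uses the true f exactly is left with a mean squared error close to the noise variance.

```python
import random

NOISE_SD = 2.0  # assumed standard deviation of the error term e


def f(x):
    """Hypothetical true target function."""
    return 2.0 + 3.0 * x


rng = random.Random(1)  # fixed seed so the run is repeatable
xs = [rng.uniform(0, 10) for _ in range(20000)]
ys = [f(x) + rng.gauss(0, NOISE_SD) for x in xs]  # Y = f(X) + e


def mse(predict):
    """Mean squared error of a predictor on the simulated data."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


mse_true = mse(f)                         # predictor equal to the true f
mse_worse = mse(lambda x: 2.0 + 2.5 * x)  # predictor with a wrong slope

print(f"MSE using the true f:     {mse_true:.2f}")
print(f"MSE using a wrong slope:  {mse_worse:.2f}")
print(f"noise variance:           {NOISE_SD ** 2:.2f}")
```

A better estimate of f closes the gap between the second number and the first, but nothing the learner does pushes the error below the noise variance.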
