
Lecture 03

Using Functions in R
A function is a piece of code written to carry out a
specified task; it may accept arguments or parameters
(or not) and it may return one or more values (or not!)
In general, its arguments constitute the input and its
return values the output.
Functions
R comes with many functions that you can use to do sophisticated tasks
like random sampling.
For example:
You can round a number with the round function, or calculate its
factorial with the factorial function

Using a function is pretty simple. Just write the name of the function
and then the data you want the function to operate on in
parentheses:
> factorial(3)
[1] 6
> round(3.1415)
[1] 3
The data that you pass into the function is called the function's
argument. The argument can be raw data, an R object, or even the
results of another R function. In this last case, R will work from the
innermost function to the outermost, as in Figure 5:
> mean(1:6)
[1] 3.5
> mean(die)
[1] 3.5
> round(mean(die))
[1] 4
Figure 5. When you link functions together, R will resolve them from
the innermost operation to the outermost. Here R first looks up die,
then calculates the mean of one through six, then rounds the mean.
There is an R function that can help roll the die. You can simulate a roll
of the die with R's sample function. sample takes two arguments: a vector
named x and a number named size. sample will return size elements from the
vector:
> sample(x = 1:4, size = 2)
[1] 1 4
> sample(x = 1:4, size = 1)
[1] 4
> sample(x = 1:4, size = 4)
[1] 1 4 2 3
>
To roll your die and get a number back, set x to die and sample one
element from it. You'll get a new (maybe different) number each time
you roll it:
> sample(x = die, size = 1)
[1] 4
> sample(x = die, size = 1)
[1] 4
> sample(x = die, size = 1)
[1] 5
> sample(x = die, size = 1)
[1] 1
> sample(x = die, size = 1)
[1] 4

Many R functions take multiple arguments that help them do their job.
You can give a function as many arguments as you like as long as you
separate each argument with a comma.
If you're not sure which names to use with a function, you can look up the
function's arguments with args. To do this, place the name of the function in the
parentheses behind args. For example, you can see that the round function takes
two arguments, one named x and one named digits:
> args(round)
function (x, digits = 0)
NULL
>
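As a brief illustration of named arguments, the sketch below passes digits to round by name. The specific numbers are only examples chosen for this note.

```r
# round() takes x and digits; digits defaults to 0
round(3.1415)              # uses the default digits = 0
round(3.1415, digits = 2)  # keeps two decimal places

# Named arguments can be given in any order
round(digits = 2, x = 3.1415)
```

Naming the arguments makes the call easier to read and removes any dependence on their order.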
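By default, sample draws without replacement, so a value that has been drawn cannot
appear again in the same sample. Setting replace = TRUE makes sample draw with replacement.
Sampling with replacement is an easy way to create independent random samples.
Each value in your sample will be a sample of size one that is independent of the
other values. This is the correct way to simulate a pair of dice:
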
> sample(die, size = 2, replace = TRUE)
[1] 4 3
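The difference between the two sampling modes can be sketched as follows. The seed and the sample sizes are arbitrary choices made for this illustration, and die is assumed to be the vector 1:6.

```r
die <- 1:6
set.seed(1)  # arbitrary seed, only so the example is repeatable

# Without replacement (the default), six draws from six values
# always form a rearrangement of 1:6, with no repeats
no_repeat <- sample(die, size = 6)

# With replacement, each draw is independent, so repeats can occur
# and the sample may even be larger than the die itself
with_repeat <- sample(die, size = 10, replace = TRUE)

# Asking for more values than exist without replacement is an error
too_many <- tryCatch(sample(die, size = 7), error = function(e) "error")
```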
Congratulate yourself; you've just run your first simulation in R! You now have a
method for simulating the result of rolling a pair of dice. If you want to add up
the dice, you can feed your result straight into the sum function:

> dice <- sample(die, size = 2, replace = TRUE)


> dice
[1] 5 5
> sum(dice)
[1] 10
>

What would happen if you call dice multiple times? Would R generate a new pair of
dice values each time? Let's give it a try:
> dice
[1] 5 5
> dice
[1] 5 5
> dice
[1] 5 5
>

Nope. Each time you call dice, R will show you the result of that one time you called sample and
saved the output to dice. R won't rerun sample(die, 2, replace = TRUE) to create a new roll of the
dice. This is a relief in a way. Once you save a set of results to an R object, those results do
not change. Programming would be quite hard if the values of your objects changed each time you
called them. However, it would be convenient to have an object that can re-roll the dice whenever
you call it. You can make such an object by writing your own R function.
The Function Constructor
Every function in R has three basic parts:
a name, a body of code, and a set of arguments.

To make your own function, you need to replicate these parts and store
them in an R object, which you can do with the function function. To do
this, call function() and follow it with a pair of braces, {}:

my_function <- function() {}

function will build a function out of whatever R code you place
between the braces. For example, you can turn your dice code into a
function by calling:
> roll <- function() {
+   die <- 1:6
+   dice <- sample(die, size = 2, replace = TRUE)
+   sum(dice)
+ }

Don't forget to save the output of function to an R object. This object
will become your new function. To use it, write the object's name
followed by an open and closed parenthesis:

> roll()
[1] 3
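Unlike the saved object dice, the function runs its body again on every call. A minimal sketch of this, assuming the roll function defined above and using replicate to call it many times (the count of 1000 is an arbitrary choice):

```r
roll <- function() {
  die <- 1:6
  dice <- sample(die, size = 2, replace = TRUE)
  sum(dice)
}

# Each call runs the body again, so the results can differ
rolls <- replicate(1000, roll())
range(rolls)  # every sum lies between 2 and 12
```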
You can think of the parentheses as the trigger that causes R to run
the function. If you type in a function's name without the parentheses,
R will show you the code that is stored inside the function. If you type
in the name with the parentheses, R will run that code:
> roll
function() {
  die <- 1:6
  dice <- sample(die, size = 2, replace = TRUE)
  sum(dice)
}
>
The code that you place inside your function is known as the body of
the function. When you run a function in R, R will execute all of the
code in the body and then return the result of the last line of code. If
the last line of code doesn't return a value, neither will your function, so
you want to ensure that your final line of code returns a value.

One way to check this is to think about what would happen if you ran
the body of code line by line in the command line. Would R display a
result after the last line, or would it not? In the code below, the first
three lines are assignments and display nothing, whereas typing an
object's name on its own displays its value:
> dice <- sample(die, size = 2, replace = TRUE)
> two <- 1+1
> a <- sqrt(2)
> a
[1] 1.414214
> two
[1] 2
> dice
[1] 5 2
>
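The same distinction applies inside a function body. In the sketch below, quiet_roll and loud_roll are hypothetical names introduced for this note. Strictly speaking, a function whose last line is an assignment still returns the assigned value, but invisibly, so nothing is displayed at the console.

```r
die <- 1:6

# Last line is an assignment: the value is returned, but invisibly,
# so nothing is displayed when the function is called at the console
quiet_roll <- function() {
  total <- sum(sample(die, size = 2, replace = TRUE))
}

# Last line is the object itself: the value is returned and displayed
loud_roll <- function() {
  total <- sum(sample(die, size = 2, replace = TRUE))
  total
}

# withVisible() reports whether a result would be printed
quiet_visible <- withVisible(quiet_roll())$visible
loud_visible  <- withVisible(loud_roll())$visible
```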
Arguments
What if we removed one line of code from our function and changed the
name die to bones, like this?
> roll2 <- function() {
+   dice <- sample(bones, size = 2, replace = TRUE)
+   sum(dice)
+ }
Now I'll get an error when I run the function. The function needs the
object bones to do its job, but there is no object named bones to be found:
> roll2()
Error in sample(bones, size = 2, replace = TRUE) :
  object 'bones' not found
You can supply bones when you call roll2 if you make bones an
argument of the function. To do this, put the name bones in the
parentheses that follow function when you define roll2:
> roll2 <- function(bones) {
+   dice <- sample(bones, size = 2, replace = TRUE)
+   sum(dice)
+ }
Now roll2 will work as long as you supply bones when you call the
function. You can take advantage of this to roll different types of dice
each time you call roll2. Remember, we're rolling pairs of dice:
> roll2(bones = 1:4)
[1] 8
> roll2(bones = 1:6)
[1] 8
> roll2(bones = 1:20)
[1] 33
Notice that roll2 will still give an error if you do not supply a value for
the bones argument when you call roll2:
> roll2 <- function(bones) {
+   dice <- sample(bones, size = 2, replace = TRUE)
+   sum(dice)
+ }
> roll2()
Error in sample(bones, size = 2, replace = TRUE) :
argument "bones" is missing, with no default
>

You can prevent this error by giving the bones argument a default
value. To do this, set bones equal to a value when you define roll2:
> roll2 <- function(bones = 1:6) {
+   dice <- sample(bones, size = 2, replace = TRUE)
+   sum(dice)
+ }
> roll2()
[1] 8
>

You can give your functions as many arguments as you like. Just list their
names, separated by commas, in the parentheses that follow function.
Figure 3. Every function in R has the same parts, and you can use
function to create these parts.
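As a sketch of a function with more than one argument, the version below adds a second argument for the number of dice. The names roll3 and n are hypothetical and not part of the original example.

```r
# roll3 and its argument n are hypothetical names for this illustration
roll3 <- function(bones = 1:6, n = 2) {
  dice <- sample(bones, size = n, replace = TRUE)
  sum(dice)
}

roll3()                     # two six-sided dice (both defaults)
roll3(n = 3)                # three six-sided dice
roll3(bones = 1:20, n = 1)  # one twenty-sided die
```

Because both arguments have defaults, the caller only needs to name the ones that should change.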
Summary of Using Functions in R, with More Examples
Functions in R
In R, according to the base docs, you define a function
with the construct:
function ( arglist ) {body}
where the code in between the curly braces is the body
of the function. Note that when using built-in functions,
the only thing you need to worry about is how to
effectively communicate the correct input arguments
(arglist) and manage the return value(s), if any.
What are the most popular functions in R?
R has many built-in functions, and you can access many
more by installing new packages. So there's no doubt
you already use functions. This guide will show how to
write your own functions, and explain why this is
helpful for writing nice R code.

Writing your own R functions
The best way to learn more about the inner workings of
functions is to write our own.
Whether because we need to accomplish a particular task and
are not aware that a dedicated function or library already
exists, or because in the time spent searching for an
existing solution we could have come up with our own (if it
is not too complicated), we will at some point find
ourselves typing something like:

function.name <- function(arguments)
{
  computations on the arguments
  some other code
}
Below we briefly introduce function syntax, and then
look at how functions help you to write nice R code.
Writing functions is simple. Paste the following code into
your console:

sum.of.squares <- function(x,y) {
  x^2 + y^2
}
You have now created a function called sum.of.squares which requires
two arguments and returns the sum of the squares of these
arguments. Since you ran the code through the console, the function is
now available, like any of the other built-in functions within R. Running
sum.of.squares(3,4) will give you the answer 25.

The procedure for writing any other functions is similar, involving three
key steps:

1. Define the function,


2. Load the function into the R session,
3. Use the function.
1. Defining a function
Functions are defined by code with a specific format:

function.name <- function(arg1, arg2, arg3=2, ...) {
  newVar <- sin(arg1) + sin(arg2) # do some useful stuff
  newVar / arg3 # return value
}
function.name: this is the function's name. This can be any valid
variable name, but you should avoid using names that are used
elsewhere in R, such as dir, function, plot, etc.

arg1, arg2, arg3: these are the arguments of the function, also
called formals. You can write a function with any number of
arguments. These can be any R object: numbers, strings, arrays,
data frames, or even other functions; anything that
is needed for the function.name function to run.
Some arguments have default values specified, such as arg3 in
our example. Arguments without a default must have a value
supplied for the function to run. You do not need to provide a
value for those arguments with a default, as the function will
use the default value.

The ... argument: The ..., or ellipsis, element in the function
definition allows for other arguments to be passed into the
function, and passed on to another function. This technique
is often used in plotting, but has uses in many other places.
Function body: The function code between the {}
brackets is run every time the function is called. This code
might be very long or very short. Ideally functions are short
and do just one thing; problems are rarely too small to benefit
from some abstraction. Sometimes a large function is
unavoidable, but usually these can be in turn constructed from
a bunch of small functions. More on that below.
Return value: The value of the last expression evaluated in the body
is the value returned by the function. A function does not need to
return anything useful: for example, a function that makes a plot might
return nothing of interest, whereas a function that does a
mathematical operation might return a number, or a list.
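The parts described above can be seen together in a short sketch. It reuses the function.name example from this section; the input values pi / 2 are an arbitrary choice that makes the arithmetic easy to follow, and the names default_result and custom_result are introduced only for this note.

```r
function.name <- function(arg1, arg2, arg3 = 2, ...) {
  newVar <- sin(arg1) + sin(arg2)
  newVar / arg3   # value of the last expression is returned
}

# formals() lists the arguments, including the ellipsis
arg_names <- names(formals(function.name))

# arg3 falls back to its default of 2 unless a value is supplied
default_result <- function.name(pi / 2, pi / 2)            # (1 + 1) / 2
custom_result  <- function.name(pi / 2, pi / 2, arg3 = 4)  # (1 + 1) / 4
```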
2. Load the function into the R session
For R to be able to execute your function, it first needs to be read into
memory. This is just like loading a library: until you do it, the functions
contained within it cannot be called.

There are two methods for loading functions into the memory:
1. Copy the function text and paste it into the console
2. Use the source() function to load your functions from file.
Our recommendation for writing nice R code is that in most cases, you
should use the second of these options. Put your functions into a file
with an intuitive name, like plotting-fun.R and save this file within the R
folder in your project. You can then read the function into memory by
calling:
source("R/plotting-fun.R")

From the point of view of writing nice code, this approach is
nice because it leaves you with an uncluttered analysis script,
and a repository of useful functions that can be loaded into
any analysis script in your project. It also lets you group
related functions together easily.
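A minimal sketch of the source() workflow is given below. To keep it self-contained it writes the function to a temporary file, which stands in for a project file such as R/plotting-fun.R; the variable name fun_file is an assumption made for this illustration.

```r
# Write a small function to a temporary file (a stand-in for a file in R/)
fun_file <- tempfile(fileext = ".R")
writeLines("sum.of.squares <- function(x, y) {\n  x^2 + y^2\n}", fun_file)

# source() runs the file, which defines the function in the session
source(fun_file)
sum.of.squares(3, 4)
```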
3. Using your function
You can now use the function anywhere in your analysis. In
thinking about how you use functions, consider the following:

Functions in R can be treated much like any other R object.
Functions can be passed as arguments to other functions or
returned from other functions.
You can define a function inside of another function.
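These three points can be sketched in a few lines. The names square, make_power and cube are hypothetical helpers introduced here for illustration.

```r
square <- function(x) x^2

# 1. Pass a function as an argument to another function
squares <- sapply(1:3, square)

# 2. Return a function from a function
make_power <- function(exponent) {
  # 3. A function defined inside another function
  function(x) x^exponent
}

cube <- make_power(3)
cube(2)
```

The function returned by make_power remembers the exponent it was created with, which is why cube raises its input to the third power.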
A little more on the ellipsis argument

The ellipsis argument ... is a powerful way of passing an arbitrary
number of arguments to a lower level function. This is how
data.frame(a=1, b=2)
returns a data.frame with two columns and
data.frame(a=1, b=2, c=3)
returns a data.frame with three columns.


Suppose you wanted a function that plots x/y points in red, but
you want all of plot's other tricks. You can write the function
like this:
red.plot <- function(x, y, ...) {
plot(x, y, col="red", ...)
}
and then do
red.plot(1:10, 1:10, xlab="My x axis", ylab="My y axis")
and your new function will automatically pass the arguments
xlab and ylab through to plot, even though you never told
red.plot about them.
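The same mechanism works outside plotting. The sketch below, with the hypothetical helpers count_args and shout, shows the two common uses of the ellipsis: collecting the extra arguments with list(...) and forwarding them unchanged to another function.

```r
# count_args and shout are hypothetical helpers for illustration
count_args <- function(...) {
  args <- list(...)   # collect everything passed through ...
  length(args)
}

shout <- function(..., sep = " ") {
  toupper(paste(..., sep = sep))  # forward ... unchanged to paste()
}

count_args(a = 1, b = 2, c = 3)
shout("my", "x", "axis")
shout("my", "x", "axis", sep = "-")
```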
Example of a Function
pow <- function(x, y) {
  # function to print x raised to the power y
  result <- x^y
  print(paste(x,"raised to the power", y, "is", result))
}
Here, we created a function called pow().
It takes two arguments, finds the first argument raised to the power of
the second argument and prints the result in an appropriate format.
We have used a built-in function paste() which is used to concatenate
strings.
How to call a function?
We can call the above function as follows.
> pow(8, 2)
[1] "8 raised to the power 2 is 64"

> pow(2, 8)
[1] "2 raised to the power 8 is 256"
Here, the arguments used in the function declaration (x and
y) are called formal arguments and those used while calling
the function are called actual arguments.
Built-in Function
Simple examples of built-in functions are seq(), mean(), max(),
sum() and paste(). They are called directly by user-written
programs. The examples below use some of the most widely used ones.
# Create a sequence of numbers from 32 to 44.
print(seq(32,44))

# Find mean of numbers from 25 to 82.
print(mean(25:82))

# Find sum of numbers from 41 to 68.
print(sum(41:68))
When we execute the above code, it produces the following
result:

[1] 32 33 34 35 36 37 38 39 40 41 42 43 44
[1] 53.5
[1] 1526
User-defined Function
We can create user-defined functions in R. They are specific to what a
user wants and once created they can be used like the built-in
functions. Below is an example of how a function is created and used.
# Create a function to print squares of numbers in sequence.
new.function <- function(a) {
for(i in 1:a) {
b <- i^2
print(b)
}
}
Calling a Function

# Call the function new.function supplying 6 as an argument.
new.function(6)
When we execute the above code, it produces the following
result:
> new.function(6)
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
[1] 36
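For comparison, the same squares can be computed without an explicit loop, because arithmetic in R is vectorized. The sketch below also shows a variant, under the hypothetical name squares_upto, that returns the values instead of printing them and uses seq_len(a) rather than 1:a, since 1:0 counts down (1, 0) instead of being empty.

```r
# Vectorized arithmetic squares every element at once
squares <- (1:6)^2

# squares_upto is a hypothetical variant that returns its result
squares_upto <- function(a) {
  result <- numeric(a)
  for (i in seq_len(a)) {   # seq_len(0) is empty, so the loop is skipped
    result[i] <- i^2
  }
  result
}

squares_upto(6)
squares_upto(0)
```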
Calling a Function without an Argument
# Create a function without an argument.
new.function <- function() {
for(i in 1:5) {
print(i^2)
}
}

# Call the function without supplying an argument.
new.function()
When we execute the above code, it produces the following
result:

> new.function()
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
>
Calling a Function with Argument Values (by position and by name)

The arguments to a function call can be supplied in the same
sequence as defined in the function or they can be supplied in
a different sequence but assigned to the names of the
arguments.
# Create a function with arguments.
new.function <- function(a,b,c) {
result <- a * b + c
print(result)
}
# Call the function by position of arguments.
new.function(5,3,11)

# Call the function by names of the arguments.
new.function(a = 11, b = 5, c = 3)
When we execute the above code, it produces the following
result:

> # Call the function by position of arguments.
> new.function(5,3,11)
[1] 26
>
> # Call the function by names of the arguments.
> new.function(a = 11, b = 5, c = 3)
[1] 58
>
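Positional and named arguments can also be mixed in one call. R matches the named arguments first and then fills the remaining formals in order. The sketch below uses a hypothetical function combine with the same arithmetic as above, but returning the value rather than printing it.

```r
# Same arithmetic as new.function above, but returning the value
combine <- function(a, b, c) {
  a * b + c
}

by_position <- combine(5, 3, 11)              # a = 5, b = 3, c = 11
mixed       <- combine(c = 11, 5, 3)          # c by name; 5 and 3 fill a and b
by_name     <- combine(b = 3, c = 11, a = 5)  # all by name, in any order
```

All three calls bind the same values to a, b and c, so they give the same result.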
Calling a Function with Default Argument

We can define default values for the arguments in the function
definition and call the function without supplying any
argument to get the default result. But we can also call such
functions by supplying new values for the arguments to get a
non-default result.
# Create a function with arguments.
new.function <- function(a = 3, b = 6) {
result <- a * b
print(result)
}

# Call the function without giving any argument.
new.function()

# Call the function with giving new values of the argument.
new.function(9,5)
When we execute the above code, it produces the following
result:
> # Call the function without giving any argument.
> new.function()
[1] 18
>
> # Call the function with giving new values of the argument.
> new.function(9,5)
[1] 45
>
Lazy Evaluation of Function
Arguments to functions are evaluated lazily, which means
they are evaluated only when needed by the function body.
# Create a function with arguments.
new.function <- function(a,b) {
print(a^2)
print(a)
print(b)
}
# Evaluate the function without supplying one of the arguments.
new.function(6)
When we execute the above code, it produces the following
result:

> new.function(6)
[1] 36
[1] 6
Error in print(b) : argument "b" is missing, with no default
>
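The other side of lazy evaluation is that an argument which is never used is never evaluated, so leaving it out causes no error. The sketch below uses the hypothetical names uses_only_a and describe_b; missing() is the base R function for checking whether an argument was supplied.

```r
# b is never used in the body, so it is never evaluated
uses_only_a <- function(a, b) {
  a^2
}

uses_only_a(6)                           # works even though b is missing
uses_only_a(6, stop("never evaluated"))  # the stop() call is never run

# missing() lets a function check whether an argument was supplied
describe_b <- function(a, b) {
  if (missing(b)) "b was not supplied" else paste("b is", b)
}

describe_b(1)
describe_b(1, 2)
```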
Control Structures in R
As the name suggests, a control structure controls the flow of
code / commands written inside a function. A function is a set
of multiple commands written to automate a repetitive coding
task.
Example: You have 10 data sets. You want to find the mean of
Age column present in every data set. This can be done in 2
ways: either you write the code to compute mean 10 times or
you simply create a function and pass the data set to it.
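The second approach might look like the sketch below. The two small data frames are made-up stand-ins for the ten data sets, and mean_age is a hypothetical helper name.

```r
# Two small made-up data sets standing in for the ten in the example
data_sets <- list(
  first  = data.frame(Age = c(20, 30, 40)),
  second = data.frame(Age = c(18, 22))
)

# mean_age is a hypothetical helper name
mean_age <- function(data) {
  mean(data$Age)
}

# Apply the function to every data set instead of repeating the code
age_means <- sapply(data_sets, mean_age)
age_means
```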

Let's understand the control structures in R with simple
examples:
if, else: This structure is used to test a condition.
Below is the syntax:

if (<condition>){
##do something
} else {
##do something
}
Example
#initialize a variable
N <- 10
#check if this variable * 5 is > 40
if (N * 5 > 40){
print("This is easy!")
} else {
print ("It's not easy!")
}
[1] "This is easy!"
for: This structure is used when a loop is to be executed a
fixed number of times. It is commonly used for iterating over
the elements of an object (list, vector). Below is the syntax:

for (<variable> in <sequence>){
#do something
}
Example
#initialize a vector
y <- c(99,45,34,65,76,23)

#print the first 4 numbers of this vector
for(i in 1:4){
print (y[i])
}
[1] 99
[1] 45
[1] 34
[1] 65
while: It begins by testing a condition, and executes only if the
condition is found to be true. Once the loop body has executed, the
condition is tested again. Hence, it's necessary to alter the condition
so that the loop doesn't run forever. Below is an example:
#initialize a condition
Age <- 12

#check if age is less than 17
while(Age < 17){
print(Age)
Age <- Age + 1 #increase Age so the condition eventually becomes false
}
[1] 12
[1] 13
[1] 14
[1] 15
[1] 16
There are other control structures as well, but they are used less
frequently than those explained above:

1. repeat: executes a loop until it is explicitly stopped (for example with break)
2. break: breaks the execution of a loop
3. next: skips an iteration in a loop
4. return: exits a function
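The four structures can be seen working together in one short sketch. The function first_even_square is hypothetical and was written only to exercise repeat, break, next and return.

```r
# first_even_square is a hypothetical function for illustration
first_even_square <- function(limit) {
  i <- 0
  repeat {                  # loops until break or return is reached
    i <- i + 1
    if (i > limit) break    # leave the loop
    if (i %% 2 != 0) next   # skip odd numbers
    return(i^2)             # exit the function with a value
  }
  NA                        # reached only if the loop ended via break
}

first_even_square(5)  # finds 2 and returns its square
first_even_square(1)  # no even number up to 1, so the loop breaks
```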

Note: If you find the section on control structures difficult to
understand, don't worry. R is supported by various packages that
complement the work done by control structures.
Useful websites
Top 100 R packages for 2013 (Jan-May)!
https://www.r-statistics.com/2013/06/top-100-r-packages-for-2013-jan-may/
Cheat sheets
https://r-dir.com/reference/crib-sheets.html
Finding the essential R packages using the pagerank algorithm
http://blog.revolutionanalytics.com/2014/12/a-reproducible-r-example-finding-the-most-popular-packages-using-the-pagerank-algorithm.html
