6 Writing functions

For this topic on writing functions in R, we will use data on weight and height to calculate body mass index (BMI), and we will use BMI as an example to explore and demonstrate how we can create our own functions. As a refresher, body mass index is calculated as follows:

\[ \text{Body mass index} ~=~ \frac{\text{weight} ~ (\text{kg})}{\text{height} ~ (\text{m}) ^ 2} \]

Let’s say, for example, that you have been conducting research on children aged 11 years and older in 3 schools and have collected the following data:

School 1516

school1516
##     school sex ageMonths weight height
## 427   1516   1       138   24.5  126.0
## 428   1516   1       150   28.3  136.3
## 429   1516   1       162   32.2  143.5
## 430   1516   1       162   32.7  143.5
## 431   1516   1       150   28.6  137.0
## 432   1516   2       138   26.5  134.0
## 433   1516   1       150   29.9  139.2
## 434   1516   1       150   30.0  139.5
## 435   1516   1       162   34.0  148.0
## 436   1516   1       138   25.4  135.7
## 437   1516   1       150   32.3  143.0
## 438   1516   2       174   38.3  153.5
## 439   1516   2       162   41.6  151.0
## 440   1516   1       150   30.7  145.0
## 441   1516   2       186   46.8  155.2
## 442   1516   1       186   46.6  163.4
## 443   1516   1       150   33.5  145.5
## 444   1516   1       186   47.0  164.0
## 445   1516   1       174   41.1  159.5
## 446   1516   2       162   39.1  152.2
## 447   1516   2       174   40.9  155.5
## 448   1516   2       162   39.7  153.0
## 449   1516   2       162   40.9  153.2
## 450   1516   1       150   34.2  147.5
## 451   1516   2       150   41.8  149.4
## 452   1516   1       138   28.0  141.5
## 453   1516   1       138   30.0  142.0
## 454   1516   1       138   33.1  142.0
## 455   1516   1       186   46.1  167.5
## 456   1516   1       150   36.2  149.0
## 457   1516   2       162   47.4  156.0
## 458   1516   1       150   30.3  150.2
## 459   1516   2       150   36.4  152.1
## 460   1516   2       150   36.4  155.0
## 461   1516   2       150   44.1  155.0
## 462   1516   2       162   42.3  160.1
## 463   1516   2       179   50.4  163.5
## 464   1516   1       150   37.6  155.0
## 465   1516   2       138   36.0  154.5
## 466   1516   2       138   46.1  156.0

School 1522

school1522
##     school sex ageMonths weight height
## 646   1522   1       203   30.6  140.5
## 647   1522   1       174   30.8  140.0
## 648   1522   1       162   29.3  136.3
## 649   1522   1       150   24.0  132.0
## 650   1522   1       150   28.1  132.1
## 651   1522   2       150   27.2  134.9
## 652   1522   1       162   34.2  139.2
## 653   1522   1       150   25.5  134.2
## 654   1522   1       138   24.6  129.0
## 655   1522   1       174   36.4  147.5
## 656   1522   1       150   28.7  137.5
## 657   1522   1       186   45.8  155.6
## 658   1522   1       174   36.3  151.6
## 659   1522   1       150   31.0  139.5
## 660   1522   1       138   29.0  134.3
## 661   1522   1       179   38.3  155.5
## 662   1522   2       138   31.3  138.4
## 663   1522   1       162   36.5  148.8
## 664   1522   1       155   36.8  145.2
## 665   1522   1       138   28.3  136.8
## 666   1522   1       138   26.8  137.3
## 667   1522   2       138   32.6  141.4
## 668   1522   2       138   31.9  143.0
## 669   1522   1       174   42.6  160.7
## 670   1522   2       198   57.8  158.0
## 671   1522   2       162   43.9  153.5
## 672   1522   2       150   35.1  150.6
## 673   1522   2       186   52.6  159.6
## 674   1522   2       150   45.1  152.8
## 675   1522   2       138   34.6  147.2
## 676   1522   2       150   45.3  153.1
## 677   1522   1       186   51.8  170.2
## 678   1522   2       150   57.1  154.2
## 679   1522   2       138   33.5  149.2
## 680   1522   1       150   36.3  154.1
## 681   1522   1       174   44.0  169.1
## 682   1522   2       150   44.5  158.3
## 683   1522   2       150   51.5  159.1
## 684   1522   2       138   47.4  157.8
## 685   1522   2       138   36.8  158.5
## 686   1522   2       138   52.0  161.0

School 1525

school1525
##     school sex ageMonths weight height
## 752   1525   1       186   26.2    137
## 753   1525   1       186   32.7    138
## 754   1525   1       150   25.9    130
## 755   1525   1       162   30.4    137
## 756   1525   2       138   24.4    129
## 757   1525   2       138   23.8    130
## 758   1525   1       150   26.1    133
## 759   1525   1       150   26.4    135
## 760   1525   1       174   35.1    148
## 761   1525   1       162   28.7    142
## 762   1525   1       150   28.0    136
## 763   1525   1       174   34.0    149
## 764   1525   1       186   40.6    155
## 765   1525   2       150   35.8    142
## 766   1525   1       150   35.4    140
## 767   1525   2       138   27.8    137
## 768   1525   2       138   28.2    137
## 769   1525   2       138   29.7    139
## 770   1525   2       138   30.9    139
## 771   1525   1       138   28.2    137
## 772   1525   2       138   26.2    140
## 773   1525   2       138   26.6    140
## 774   1525   1       138   27.2    138
## 775   1525   2       138   27.0    141
## 776   1525   1       150   31.3    145
## 777   1525   2       162   33.9    152
## 778   1525   2       162   42.0    153
## 779   1525   2       185   38.3    157
## 780   1525   2       138   31.0    145
## 781   1525   2       138   32.3    145
## 782   1525   1       139   35.1    144
## 783   1525   2       150   36.4    152
## 784   1525   2       138   32.7    147
## 785   1525   1       174   44.9    166
## 786   1525   2       138   32.2    148
## 787   1525   2       138   36.4    148
## 788   1525   1       138   31.4    146
## 789   1525   2       138   45.0    149
## 790   1525   2       162   49.4    160
## 791   1525   2       138   34.3    150
## 792   1525   1       138   30.0    148
## 793   1525   2       150   37.0    156
## 794   1525   2       162   52.2    165
## 795   1525   2       138   42.9    158

In this dataset, height is measured in centimetres.
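Because the BMI formula expects height in metres, each height has to be divided by 100 before it is squared. As a quick check of this conversion, here is a minimal sketch using the first child listed for school 1516 (weight 24.5 kg, height 126.0 cm). The variable names weight_kg, height_cm and height_m are illustrative only and are not columns in the dataset.

```r
## Values taken from the first row of school1516 shown above
weight_kg <- 24.5
height_cm <- 126.0

## Convert height from centimetres to metres before squaring
height_m <- height_cm / 100

## BMI = weight (kg) / height (m) ^ 2
bmi <- weight_kg / height_m ^ 2
bmi
```

This should give a BMI of approximately 15.43 for this child.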

Using what we learned earlier about calculating BMI in R, I can run the following commands to get the BMI for each child in each of the schools:

## Calculate BMI for children in school 1516
school1516$weight / (school1516$height / 100) ^ 2

## Calculate BMI for children in school 1522
school1522$weight / (school1516$height / 100) ^ 2

## Calculate BMI for children in school 1525
school1525$weight / (school1516$height / 100) ^ 2

Because the commands are repetitive, I can easily copy and paste my initial line of code to calculate BMI for children in school 1516 and then just change the object names accordingly to calculate the BMI for children in the two other schools.

When I run these lines of code, I get the following results:

## Calculate BMI for children in school 1516
school1516$weight / (school1516$height / 100) ^ 2
##  [1] 15.43210 15.23333 15.63695 15.87976 15.23789 14.75830 15.43095 15.41604 15.52228 13.79349 15.79539 16.25481 18.24481 14.60166
## [15] 19.42954 17.45347 15.82409 17.47472 16.15550 16.87903 16.91463 16.95929 17.42632 15.71962 18.72730 13.98444 14.87800 16.41539
## [29] 16.43128 16.30557 19.47732 13.43083 15.73414 15.15088 18.35588 16.50280 18.85363 15.65036 15.08153 18.94313
## Calculate BMI for children in school 1522
school1522$weight / (school1516$height / 100) ^ 2
## Warning in school1522$weight/(school1516$height/100)^2: longer object length is not a multiple of shorter object length
##  [1] 19.27438 16.57903 14.22865 11.65487 14.97150 15.14814 17.65012 13.10363 11.23083 19.76704 14.03492 19.43787 15.92035 14.74435
## [15] 12.03967 14.34481 14.78490 13.57079 14.46527 12.21679 11.08343 13.92627 13.59168 19.58058 25.89564 21.92561 17.40726 26.08609
## [29] 16.07485 15.58488 18.61440 22.96095 24.68185 13.94381 15.10926 17.16604 16.64656 21.43600 19.85735 15.12163 32.75384
## Calculate BMI for children in school 1525
school1525$weight / (school1516$height / 100) ^ 2
## Warning in school1525$weight/(school1516$height/100)^2: longer object length is not a multiple of shorter object length
##  [1] 16.50290 17.60176 12.57755 14.76284 13.00016 13.25462 13.46983 13.56612 16.02447 15.58555 13.69260 14.42986 17.80624 17.02735
## [15] 14.69670 10.41216 13.32058 11.04253 12.14611 12.17362 10.83529 11.36315 11.58914 12.41023 14.02307 16.93116 20.82920 18.99425
## [29] 11.04923 14.54889 14.42308 16.13472 14.13479 18.68887 13.40271 14.20099 11.74611 18.73049 20.69522 14.09435 18.89645 19.91636
## [43] 25.34934 20.83308

The calculation for the BMI of children in school 1516 seems to have completed without issues, and a vector of BMI results has been produced. However, for school 1522 and school 1525, there is a warning saying:

## Warning in school1522$weight/(school1516$height/100)^2: longer object length is not a multiple
## of shorter object length

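Although a result has been provided, the warning indicates that something is not quite right with my calculation. When I inspect further, I notice that in my formula for school 1522 and for school 1525, the denominator still uses the height data for school 1516. School 1516 has 40 children, whereas school 1522 has 41 and school 1525 has 44, so R recycles the shorter height vector to match the longer weight vector and warns that the lengths do not divide evenly. Had all three schools been the same size, R would have returned the wrong values with no warning at all.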
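The warning arises because R recycles the shorter vector when two vectors of different lengths are combined. This can be reproduced with two short vectors. The sketch below uses made-up values; the names weight, height, recycled and warning_text are hypothetical stand-ins and not the school data.

```r
## Three hypothetical weights (kg) but only two hypothetical heights (m)
weight <- c(30, 40, 50)
height <- c(1, 2)

## R recycles the shorter vector: the third weight is divided by the first
## height again. suppressWarnings() hides the warning so we can see the values.
recycled <- suppressWarnings(weight / height ^ 2)
recycled

## Capture the text of the warning instead of the result
warning_text <- tryCatch(
  weight / height ^ 2,
  warning = function(w) conditionMessage(w)
)
warning_text

## A simple guard: compare the lengths before dividing
length(weight) == length(height)
```

Checking that the two vectors have the same length before dividing is a cheap way to catch this kind of copy-and-paste mistake.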

So, to correct this I go back to my lines of code and edit the denominators for school 1522 and school 1525 as follows:

## Calculate BMI for children in school 1516
school1516$weight / (school1516$height / 100) ^ 2

## Calculate BMI for children in school 1522
school1522$weight / (school1522$height / 100) ^ 2

## Calculate BMI for children in school 1525
school1525$weight / (school1525$height / 100) ^ 2

which gives me:

## Calculate BMI for children in school 1516
school1516$weight / (school1516$height / 100) ^ 2
##  [1] 15.43210 15.23333 15.63695 15.87976 15.23789 14.75830 15.43095 15.41604 15.52228 13.79349 15.79539 16.25481 18.24481 14.60166
## [15] 19.42954 17.45347 15.82409 17.47472 16.15550 16.87903 16.91463 16.95929 17.42632 15.71962 18.72730 13.98444 14.87800 16.41539
## [29] 16.43128 16.30557 19.47732 13.43083 15.73414 15.15088 18.35588 16.50280 18.85363 15.65036 15.08153 18.94313
## Calculate BMI for children in school 1522
school1522$weight / (school1522$height / 100) ^ 2
##  [1] 15.50132 15.71429 15.77161 13.77410 16.10277 14.94669 17.65012 14.15908 14.78277 16.73082 15.18017 18.91674 15.79459 15.92991
## [15] 16.07852 15.83937 16.34076 16.48493 17.45479 15.12217 14.21653 16.30492 15.59978 16.49597 23.15334 18.63150 15.47594 20.65000
## [29] 19.31656 15.96837 19.32626 17.88178 24.01416 15.04898 15.28626 15.38741 17.75817 20.34543 19.03550 14.64837 20.06095
## Calculate BMI for children in school 1525
school1525$weight / (school1525$height / 100) ^ 2
##  [1] 13.95919 17.17076 15.32544 16.19692 14.66258 14.08284 14.75493 14.48560 16.02447 14.23329 15.13841 15.31463 16.89906 17.75441
## [15] 18.06122 14.81166 15.02477 15.37188 15.99296 15.02477 13.36735 13.57143 14.28271 13.58081 14.88704 14.67278 17.94182 15.53816
## [29] 14.74435 15.36266 16.92708 15.75485 15.13258 16.29409 14.70051 16.61797 14.73072 20.26936 19.29687 15.24444 13.69613 15.20381
## [43] 19.17355 17.18475

I no longer get the warning message, and the expected number of BMI values has been produced for each school.

From the short example above, we realise how tedious it is to type this code every time we need to calculate BMI. It also becomes more challenging to debug the code, because we have to review and, where needed, edit each repetition of the calculation to see where it may have gone wrong (especially with a copy-and-paste approach).

It would be better (and easier) to have a function that calculates and displays the BMI values automatically. Fortunately, R allows us to do just that.

The function() function allows us to create new functions in R with the following generic syntax:

function_name <- function(argument1, argument2, ...) {
  ## Your code here
}

Using this generic syntax, we can create a function called calculate_bmi as follows:

calculate_bmi <- function(weight, height) {
  weight / height ^ 2
}

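We now have a function for calculating and outputting BMI values. Note that it expects height in metres, so heights recorded in centimetres must be divided by 100 before they are passed to the function.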
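One possible refinement, sketched here as an assumption rather than as part of the function above, is to make the function stop when the two inputs have different lengths, so that a mismatch like the one seen earlier becomes an error instead of a warning. The name calculate_bmi_checked is hypothetical.

```r
calculate_bmi_checked <- function(weight, height) {
  ## Stop early if the inputs are not numeric or differ in length
  stopifnot(
    is.numeric(weight),
    is.numeric(height),
    length(weight) == length(height)
  )

  ## Height is expected in metres
  weight / height ^ 2
}

## Works when lengths match (values from the first two rows of school1516)
calculate_bmi_checked(weight = c(24.5, 28.3), height = c(126.0, 136.3) / 100)

## Fails when lengths differ; try() lets the script continue past the error
mismatch <- try(
  calculate_bmi_checked(weight = c(24.5, 28.3, 32.2), height = c(1.26, 1.363)),
  silent = TRUE
)
inherits(mismatch, "try-error")
```

The trade-off is that the function no longer allows a single height to be recycled across many weights, which is rarely what we want for BMI anyway.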

Let us now test it with our 3 sets of data:

School 1516

calculate_bmi(
  weight = school1516$weight,
  height = school1516$height / 100
)
##  [1] 15.43210 15.23333 15.63695 15.87976 15.23789 14.75830 15.43095 15.41604 15.52228 13.79349 15.79539 16.25481 18.24481 14.60166
## [15] 19.42954 17.45347 15.82409 17.47472 16.15550 16.87903 16.91463 16.95929 17.42632 15.71962 18.72730 13.98444 14.87800 16.41539
## [29] 16.43128 16.30557 19.47732 13.43083 15.73414 15.15088 18.35588 16.50280 18.85363 15.65036 15.08153 18.94313

School 1522

calculate_bmi(
  weight = school1522$weight,
  height = school1522$height / 100
)
##  [1] 15.50132 15.71429 15.77161 13.77410 16.10277 14.94669 17.65012 14.15908 14.78277 16.73082 15.18017 18.91674 15.79459 15.92991
## [15] 16.07852 15.83937 16.34076 16.48493 17.45479 15.12217 14.21653 16.30492 15.59978 16.49597 23.15334 18.63150 15.47594 20.65000
## [29] 19.31656 15.96837 19.32626 17.88178 24.01416 15.04898 15.28626 15.38741 17.75817 20.34543 19.03550 14.64837 20.06095

School 1525

calculate_bmi(
  weight = school1525$weight,
  height = school1525$height / 100
)
##  [1] 13.95919 17.17076 15.32544 16.19692 14.66258 14.08284 14.75493 14.48560 16.02447 14.23329 15.13841 15.31463 16.89906 17.75441
## [15] 18.06122 14.81166 15.02477 15.37188 15.99296 15.02477 13.36735 13.57143 14.28271 13.58081 14.88704 14.67278 17.94182 15.53816
## [29] 14.74435 15.36266 16.92708 15.75485 15.13258 16.29409 14.70051 16.61797 14.73072 20.26936 19.29687 15.24444 13.69613 15.20381
## [43] 19.17355 17.18475
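Since the same call is repeated for each school, one way to shorten this further is to keep the data frames in a list and apply the function to each element with lapply(). The sketch below is an assumption about how this could be done. To stay self-contained, it rebuilds only the first three rows of the weight and height columns for each school, and the names schools and bmi_by_school are hypothetical.

```r
calculate_bmi <- function(weight, height) {
  weight / height ^ 2
}

## First three rows of each school, copied from the data shown earlier
schools <- list(
  school1516 = data.frame(weight = c(24.5, 28.3, 32.2), height = c(126.0, 136.3, 143.5)),
  school1522 = data.frame(weight = c(30.6, 30.8, 29.3), height = c(140.5, 140.0, 136.3)),
  school1525 = data.frame(weight = c(26.2, 32.7, 25.9), height = c(137, 138, 130))
)

## Apply calculate_bmi() to every school; heights are converted from cm to m
bmi_by_school <- lapply(
  schools,
  function(school) {
    calculate_bmi(weight = school$weight, height = school$height / 100)
  }
)

bmi_by_school
```

The result is a named list with one BMI vector per school, so adding a fourth school only requires adding one more element to the list.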

In our example here, the calculate_bmi() function made the code to calculate BMI for each student in each school a little more efficient. But the efficiency that functions provide becomes more evident when you need to perform more complex operations. For example, what if you need the mean BMI for students in each school? Without a function, we would have to write the following script for each school:

School 1516

## Calculate BMI for children in school 1516
bmi_school1516 <- school1516$weight / (school1516$height / 100) ^ 2

## Get the mean BMI for children in school 1516
mean_bmi_school1516 <- mean(bmi_school1516)

mean_bmi_school1516
## [1] 16.28491

School 1522

## Calculate BMI for children in school 1522
bmi_school1522 <- school1522$weight / (school1522$height / 100) ^ 2

## Get the mean BMI for children in school 1522
mean_bmi_school1522 <- mean(bmi_school1522)

mean_bmi_school1522
## [1] 16.89955

School 1525

## Calculate BMI for children in school 1525
bmi_school1525 <- school1525$weight / (school1525$height / 100) ^ 2

## Get the mean BMI for children in school 1525
mean_bmi_school1525 <- mean(bmi_school1525)

mean_bmi_school1525
## [1] 15.64695

As the operations/calculations we want to perform become more complex, the copy and paste method becomes more and more tedious. With the function approach, we can use the following:

calculate_mean_bmi <- function(weight, height) {
  bmi <- weight / height ^ 2
  
  mean_bmi <- mean(bmi)
  
  return(mean_bmi)
}
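Because calculate_mean_bmi() repeats the BMI formula, an alternative sketch is to have it call calculate_bmi() so that the formula lives in one place. This is a design suggestion rather than the version used in the rest of this section, and the name calculate_mean_bmi2 is hypothetical.

```r
calculate_bmi <- function(weight, height) {
  weight / height ^ 2
}

## Reuse calculate_bmi() instead of repeating the formula
calculate_mean_bmi2 <- function(weight, height) {
  bmi <- calculate_bmi(weight = weight, height = height)

  mean(bmi)
}

## Example with the first three rows of school1516 (heights converted to metres)
mean_bmi_example <- calculate_mean_bmi2(
  weight = c(24.5, 28.3, 32.2),
  height = c(126.0, 136.3, 143.5) / 100
)

mean_bmi_example
```

If the BMI formula ever needs to change, only calculate_bmi() has to be edited.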

Applying the function to the datasets, we get:

School 1516

calculate_mean_bmi(
  weight = school1516$weight,
  height = school1516$height / 100
)
## [1] 16.28491

School 1522

calculate_mean_bmi(
  weight = school1522$weight,
  height = school1522$height / 100
)
## [1] 16.89955

School 1525

calculate_mean_bmi(
  weight = school1525$weight,
  height = school1525$height / 100
)
## [1] 15.64695
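Each call above still repeats the division of height by 100. A further sketch, assuming heights are always supplied in centimetres, moves that conversion inside the function. The name calculate_mean_bmi_from_cm and its height_cm argument are hypothetical, and the example uses only the first three rows of school 1516.

```r
calculate_mean_bmi_from_cm <- function(weight, height_cm) {
  ## Convert height from centimetres to metres inside the function
  height_m <- height_cm / 100

  mean(weight / height_m ^ 2)
}

## First three rows of school1516; heights are passed as recorded (cm)
mean_bmi_first_three <- calculate_mean_bmi_from_cm(
  weight = c(24.5, 28.3, 32.2),
  height_cm = c(126.0, 136.3, 143.5)
)

mean_bmi_first_three
```

Naming the argument height_cm makes the expected unit visible at the call site, which reduces the chance of forgetting the conversion.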