6 Writing functions
For this topic, we will use data on weight and height to calculate body mass index (BMI). As a refresher, BMI is calculated as follows:
\[ \text{Body mass index} ~=~ \frac{\text{weight (kg)}}{\text{height (m)}^2} \]
We will use BMI as a running example to explore and demonstrate how we can create our own functions in R.
Let’s say, for example, that you have been doing research on children aged 11 years and older in 3 schools and have collected the following data:
School 1516
## school sex ageMonths weight height
## 427 1516 1 138 24.5 126.0
## 428 1516 1 150 28.3 136.3
## 429 1516 1 162 32.2 143.5
## 430 1516 1 162 32.7 143.5
## 431 1516 1 150 28.6 137.0
## 432 1516 2 138 26.5 134.0
## 433 1516 1 150 29.9 139.2
## 434 1516 1 150 30.0 139.5
## 435 1516 1 162 34.0 148.0
## 436 1516 1 138 25.4 135.7
## 437 1516 1 150 32.3 143.0
## 438 1516 2 174 38.3 153.5
## 439 1516 2 162 41.6 151.0
## 440 1516 1 150 30.7 145.0
## 441 1516 2 186 46.8 155.2
## 442 1516 1 186 46.6 163.4
## 443 1516 1 150 33.5 145.5
## 444 1516 1 186 47.0 164.0
## 445 1516 1 174 41.1 159.5
## 446 1516 2 162 39.1 152.2
## 447 1516 2 174 40.9 155.5
## 448 1516 2 162 39.7 153.0
## 449 1516 2 162 40.9 153.2
## 450 1516 1 150 34.2 147.5
## 451 1516 2 150 41.8 149.4
## 452 1516 1 138 28.0 141.5
## 453 1516 1 138 30.0 142.0
## 454 1516 1 138 33.1 142.0
## 455 1516 1 186 46.1 167.5
## 456 1516 1 150 36.2 149.0
## 457 1516 2 162 47.4 156.0
## 458 1516 1 150 30.3 150.2
## 459 1516 2 150 36.4 152.1
## 460 1516 2 150 36.4 155.0
## 461 1516 2 150 44.1 155.0
## 462 1516 2 162 42.3 160.1
## 463 1516 2 179 50.4 163.5
## 464 1516 1 150 37.6 155.0
## 465 1516 2 138 36.0 154.5
## 466 1516 2 138 46.1 156.0
School 1522
## school sex ageMonths weight height
## 646 1522 1 203 30.6 140.5
## 647 1522 1 174 30.8 140.0
## 648 1522 1 162 29.3 136.3
## 649 1522 1 150 24.0 132.0
## 650 1522 1 150 28.1 132.1
## 651 1522 2 150 27.2 134.9
## 652 1522 1 162 34.2 139.2
## 653 1522 1 150 25.5 134.2
## 654 1522 1 138 24.6 129.0
## 655 1522 1 174 36.4 147.5
## 656 1522 1 150 28.7 137.5
## 657 1522 1 186 45.8 155.6
## 658 1522 1 174 36.3 151.6
## 659 1522 1 150 31.0 139.5
## 660 1522 1 138 29.0 134.3
## 661 1522 1 179 38.3 155.5
## 662 1522 2 138 31.3 138.4
## 663 1522 1 162 36.5 148.8
## 664 1522 1 155 36.8 145.2
## 665 1522 1 138 28.3 136.8
## 666 1522 1 138 26.8 137.3
## 667 1522 2 138 32.6 141.4
## 668 1522 2 138 31.9 143.0
## 669 1522 1 174 42.6 160.7
## 670 1522 2 198 57.8 158.0
## 671 1522 2 162 43.9 153.5
## 672 1522 2 150 35.1 150.6
## 673 1522 2 186 52.6 159.6
## 674 1522 2 150 45.1 152.8
## 675 1522 2 138 34.6 147.2
## 676 1522 2 150 45.3 153.1
## 677 1522 1 186 51.8 170.2
## 678 1522 2 150 57.1 154.2
## 679 1522 2 138 33.5 149.2
## 680 1522 1 150 36.3 154.1
## 681 1522 1 174 44.0 169.1
## 682 1522 2 150 44.5 158.3
## 683 1522 2 150 51.5 159.1
## 684 1522 2 138 47.4 157.8
## 685 1522 2 138 36.8 158.5
## 686 1522 2 138 52.0 161.0
School 1525
## school sex ageMonths weight height
## 752 1525 1 186 26.2 137
## 753 1525 1 186 32.7 138
## 754 1525 1 150 25.9 130
## 755 1525 1 162 30.4 137
## 756 1525 2 138 24.4 129
## 757 1525 2 138 23.8 130
## 758 1525 1 150 26.1 133
## 759 1525 1 150 26.4 135
## 760 1525 1 174 35.1 148
## 761 1525 1 162 28.7 142
## 762 1525 1 150 28.0 136
## 763 1525 1 174 34.0 149
## 764 1525 1 186 40.6 155
## 765 1525 2 150 35.8 142
## 766 1525 1 150 35.4 140
## 767 1525 2 138 27.8 137
## 768 1525 2 138 28.2 137
## 769 1525 2 138 29.7 139
## 770 1525 2 138 30.9 139
## 771 1525 1 138 28.2 137
## 772 1525 2 138 26.2 140
## 773 1525 2 138 26.6 140
## 774 1525 1 138 27.2 138
## 775 1525 2 138 27.0 141
## 776 1525 1 150 31.3 145
## 777 1525 2 162 33.9 152
## 778 1525 2 162 42.0 153
## 779 1525 2 185 38.3 157
## 780 1525 2 138 31.0 145
## 781 1525 2 138 32.3 145
## 782 1525 1 139 35.1 144
## 783 1525 2 150 36.4 152
## 784 1525 2 138 32.7 147
## 785 1525 1 174 44.9 166
## 786 1525 2 138 32.2 148
## 787 1525 2 138 36.4 148
## 788 1525 1 138 31.4 146
## 789 1525 2 138 45.0 149
## 790 1525 2 162 49.4 160
## 791 1525 2 138 34.3 150
## 792 1525 1 138 30.0 148
## 793 1525 2 150 37.0 156
## 794 1525 2 162 52.2 165
## 795 1525 2 138 42.9 158
In this dataset, height is measured in centimetres.
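As a quick check of the unit conversion, the BMI of the first child listed for school 1516 (weight 24.5 kg, height 126.0 cm) can be worked out step by step. The variable names below are illustrative only and are not part of the dataset.

```r
## Values taken from the first row of the school 1516 data
weight_kg <- 24.5
height_cm <- 126.0

## Convert height from centimetres to metres before squaring
height_m <- height_cm / 100

## BMI = weight (kg) / height (m) ^ 2
bmi <- weight_kg / height_m ^ 2
bmi
```

This gives a BMI of about 15.43.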
Using what we learned earlier about calculating BMI in R, I can run the following commands to get the BMI for each child in each of the schools:
## Calculate BMI for children in school 1516
school1516$weight / (school1516$height / 100) ^ 2
## Calculate BMI for children in school 1522
school1522$weight / (school1516$height / 100) ^ 2
## Calculate BMI for children in school 1525
school1525$weight / (school1516$height / 100) ^ 2
Because the commands are repetitive, I can easily copy and paste my initial line of code to calculate BMI for children in school 1516 and then just change the object names accordingly to calculate the BMI for children in the two other schools.
When I run these lines of code, I get the following results:
## [1] 15.43210 15.23333 15.63695 15.87976 15.23789 14.75830 15.43095 15.41604 15.52228 13.79349 15.79539 16.25481 18.24481 14.60166
## [15] 19.42954 17.45347 15.82409 17.47472 16.15550 16.87903 16.91463 16.95929 17.42632 15.71962 18.72730 13.98444 14.87800 16.41539
## [29] 16.43128 16.30557 19.47732 13.43083 15.73414 15.15088 18.35588 16.50280 18.85363 15.65036 15.08153 18.94313
## Warning in school1522$weight/(school1516$height/100)^2: longer object length is not a multiple of shorter object length
## [1] 19.27438 16.57903 14.22865 11.65487 14.97150 15.14814 17.65012 13.10363 11.23083 19.76704 14.03492 19.43787 15.92035 14.74435
## [15] 12.03967 14.34481 14.78490 13.57079 14.46527 12.21679 11.08343 13.92627 13.59168 19.58058 25.89564 21.92561 17.40726 26.08609
## [29] 16.07485 15.58488 18.61440 22.96095 24.68185 13.94381 15.10926 17.16604 16.64656 21.43600 19.85735 15.12163 32.75384
## Warning in school1525$weight/(school1516$height/100)^2: longer object length is not a multiple of shorter object length
## [1] 16.50290 17.60176 12.57755 14.76284 13.00016 13.25462 13.46983 13.56612 16.02447 15.58555 13.69260 14.42986 17.80624 17.02735
## [15] 14.69670 10.41216 13.32058 11.04253 12.14611 12.17362 10.83529 11.36315 11.58914 12.41023 14.02307 16.93116 20.82920 18.99425
## [29] 11.04923 14.54889 14.42308 16.13472 14.13479 18.68887 13.40271 14.20099 11.74611 18.73049 20.69522 14.09435 18.89645 19.91636
## [43] 25.34934 20.83308
The calculation for the BMI of children in school 1516 seems to have completed without issues, and a vector of BMI results has been produced. However, for school 1522 and school 1525, there is a warning saying:
## Warning in school1522$weight/(school1516$height/100)^2: longer object length is not a multiple
## of shorter object length
Although a result has been provided, the warning indicates that something is not quite right with my calculation. When I inspect further, I notice that in my formula for school 1522 and for school 1525, the denominator is still using data for school 1516. School 1516 has 40 children while schools 1522 and 1525 have 41 and 44, so R recycles the shorter height vector to match the longer weight vector, and this is what causes the warning message.
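The recycling behaviour behind this warning can be reproduced with two short vectors. This is a minimal sketch with made-up values, not data from the schools; `suppressWarnings()` is used only so that the sketch runs quietly.

```r
## Made-up values: 3 weights but only 2 heights (in metres)
weight <- c(30, 40, 50)
height <- c(1.5, 1.6)

## R recycles the shorter vector, so the third weight is divided by the
## first height again; without suppressWarnings() R issues the
## "longer object length is not a multiple of shorter object length" warning
bmi <- suppressWarnings(weight / height ^ 2)
length(bmi)

## Comparing lengths beforehand makes the mismatch explicit
length(weight) == length(height)
```

The result has as many values as the longer vector, which is why a full-looking set of BMI values can still be wrong.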
So, to correct this I go back to my lines of code and edit the denominators for school 1522 and school 1525 as follows:
## Calculate BMI for children in school 1516
school1516$weight / (school1516$height / 100) ^ 2
## Calculate BMI for children in school 1522
school1522$weight / (school1522$height / 100) ^ 2
## Calculate BMI for children in school 1525
school1525$weight / (school1525$height / 100) ^ 2
which gives me:
## [1] 15.43210 15.23333 15.63695 15.87976 15.23789 14.75830 15.43095 15.41604 15.52228 13.79349 15.79539 16.25481 18.24481 14.60166
## [15] 19.42954 17.45347 15.82409 17.47472 16.15550 16.87903 16.91463 16.95929 17.42632 15.71962 18.72730 13.98444 14.87800 16.41539
## [29] 16.43128 16.30557 19.47732 13.43083 15.73414 15.15088 18.35588 16.50280 18.85363 15.65036 15.08153 18.94313
## [1] 15.50132 15.71429 15.77161 13.77410 16.10277 14.94669 17.65012 14.15908 14.78277 16.73082 15.18017 18.91674 15.79459 15.92991
## [15] 16.07852 15.83937 16.34076 16.48493 17.45479 15.12217 14.21653 16.30492 15.59978 16.49597 23.15334 18.63150 15.47594 20.65000
## [29] 19.31656 15.96837 19.32626 17.88178 24.01416 15.04898 15.28626 15.38741 17.75817 20.34543 19.03550 14.64837 20.06095
## [1] 13.95919 17.17076 15.32544 16.19692 14.66258 14.08284 14.75493 14.48560 16.02447 14.23329 15.13841 15.31463 16.89906 17.75441
## [15] 18.06122 14.81166 15.02477 15.37188 15.99296 15.02477 13.36735 13.57143 14.28271 13.58081 14.88704 14.67278 17.94182 15.53816
## [29] 14.74435 15.36266 16.92708 15.75485 15.13258 16.29409 14.70051 16.61797 14.73072 20.26936 19.29687 15.24444 13.69613 15.20381
## [43] 19.17355 17.18475
I no longer get the warning message, and the expected number of BMI values has been produced for each school.
From this short example, we can see how tedious it is to type in the code above every time we need to calculate BMI. It also becomes more challenging to debug the code, because we have to review and, where needed, edit each iteration of the calculation to see where it may have gone wrong (especially with a cut and paste approach).
It would be better (and easier) to have a function that calculates and displays the BMI values automatically. Fortunately, R allows us to do just that.
The function() function allows us to create new functions in R with the following generic syntax:
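In general terms, a function definition has a name, a set of arguments and a body. A minimal sketch, with placeholder names (`function_name`, `argument1`, `argument2`) chosen purely for illustration:

```r
## Placeholder names only; replace them with names that suit your task
function_name <- function(argument1, argument2) {
  ## Body: one or more expressions that use the arguments
  result <- argument1 + argument2
  ## return() hands the result back to the caller
  return(result)
}

## Calling the function with two example values
function_name(argument1 = 1, argument2 = 2)
```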
Using this generic syntax, we create a function called calculate_bmi as follows:
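A minimal sketch of such a function is given here. It assumes that height is supplied in metres, so heights recorded in centimetres are divided by 100 in the call. The small `school_sample` data frame is a hypothetical stand-in that holds only the first three rows of the school 1516 data.

```r
## Sketch of calculate_bmi(); assumes weight in kg and height in metres
calculate_bmi <- function(weight, height) {
  bmi <- weight / height ^ 2
  return(bmi)
}

## Hypothetical stand-in: first three rows of the school 1516 data
school_sample <- data.frame(
  weight = c(24.5, 28.3, 32.2),
  height = c(126.0, 136.3, 143.5)
)

## Heights are in centimetres, so convert to metres in the call
bmi_sample <- calculate_bmi(
  weight = school_sample$weight,
  height = school_sample$height / 100
)
bmi_sample
```

Keeping the unit conversion in the call leaves the function itself a direct translation of the BMI formula; doing the conversion inside the function would be an equally reasonable design.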
We now have a function for calculating and outputting BMI values.
Let us now test it with our 3 sets of data:
School 1516
## [1] 15.43210 15.23333 15.63695 15.87976 15.23789 14.75830 15.43095 15.41604 15.52228 13.79349 15.79539 16.25481 18.24481 14.60166
## [15] 19.42954 17.45347 15.82409 17.47472 16.15550 16.87903 16.91463 16.95929 17.42632 15.71962 18.72730 13.98444 14.87800 16.41539
## [29] 16.43128 16.30557 19.47732 13.43083 15.73414 15.15088 18.35588 16.50280 18.85363 15.65036 15.08153 18.94313
School 1522
## [1] 15.50132 15.71429 15.77161 13.77410 16.10277 14.94669 17.65012 14.15908 14.78277 16.73082 15.18017 18.91674 15.79459 15.92991
## [15] 16.07852 15.83937 16.34076 16.48493 17.45479 15.12217 14.21653 16.30492 15.59978 16.49597 23.15334 18.63150 15.47594 20.65000
## [29] 19.31656 15.96837 19.32626 17.88178 24.01416 15.04898 15.28626 15.38741 17.75817 20.34543 19.03550 14.64837 20.06095
School 1525
## [1] 13.95919 17.17076 15.32544 16.19692 14.66258 14.08284 14.75493 14.48560 16.02447 14.23329 15.13841 15.31463 16.89906 17.75441
## [15] 18.06122 14.81166 15.02477 15.37188 15.99296 15.02477 13.36735 13.57143 14.28271 13.58081 14.88704 14.67278 17.94182 15.53816
## [29] 14.74435 15.36266 16.92708 15.75485 15.13258 16.29409 14.70051 16.61797 14.73072 20.26936 19.29687 15.24444 13.69613 15.20381
## [43] 19.17355 17.18475
In our example here, the calculate_bmi() function helped a little in making the code to calculate BMI for each student in each school more efficient. The efficiency that functions provide becomes more evident when you need to perform more complex operations. For example, what if you need the mean BMI for students in each school? Without a function, we would have to run the following script for each school:
School 1516
## Calculate BMI for children in school 1516
bmi_school1516 <- school1516$weight / (school1516$height / 100) ^ 2
## Get the mean BMI for children in school 1516
mean_bmi_school1516 <- mean(bmi_school1516)
mean_bmi_school1516
## [1] 16.28491
School 1522
## Calculate BMI for children in school 1522
bmi_school1522 <- school1522$weight / (school1522$height / 100) ^ 2
## Get the mean BMI for children in school 1522
mean_bmi_school1522 <- mean(bmi_school1522)
mean_bmi_school1522
## [1] 16.89955
School 1525
## Calculate BMI for children in school 1525
bmi_school1525 <- school1525$weight / (school1525$height / 100) ^ 2
## Get the mean BMI for children in school 1525
mean_bmi_school1525 <- mean(bmi_school1525)
mean_bmi_school1525
## [1] 15.64695
As the operations/calculations we want to perform become more complex, the copy and paste method becomes more and more tedious. With the function approach, we can use the following:
## weight in kilograms; height in metres (divide centimetres by 100 before passing)
calculate_mean_bmi <- function(weight, height) {
  bmi <- weight / height ^ 2
  mean_bmi <- mean(bmi)
  return(mean_bmi)
}
Applying the function to the datasets, we get:
School 1516
## [1] 16.28491
School 1522
## [1] 16.89955
School 1525
## [1] 15.64695
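Once the calculation lives in a function, it can also be applied to several schools in one step. The sketch below is one possible approach, not necessarily how the values above were produced. It assumes heights are converted from centimetres to metres in the call, and `schools_sample` is a hypothetical list holding only the first three rows of schools 1516 and 1522, so the means differ from the full-school values.

```r
## Same function as in the text; expects height in metres
calculate_mean_bmi <- function(weight, height) {
  bmi <- weight / height ^ 2
  mean_bmi <- mean(bmi)
  return(mean_bmi)
}

## Hypothetical stand-in: first three rows of two of the schools
schools_sample <- list(
  school1516 = data.frame(weight = c(24.5, 28.3, 32.2),
                          height = c(126.0, 136.3, 143.5)),
  school1522 = data.frame(weight = c(30.6, 30.8, 29.3),
                          height = c(140.5, 140.0, 136.3))
)

## Apply the function to each school, converting centimetres to metres
mean_bmi_by_school <- sapply(
  schools_sample,
  function(school) calculate_mean_bmi(school$weight, school$height / 100)
)
mean_bmi_by_school
```

The result is a named numeric vector with one mean BMI per school, and any fix to the calculation only needs to be made in one place.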