combinedevents: an R package for calculating scores and marks for combined events in track and field

Introduction to the R package and the competitions

Categories: R, track and field, combined events

Author: Katie Frank

Published: July 29, 2022

With the exciting wrap-up of the 2022 World Athletics Championships[1] in Eugene, Oregon, this July, I thought it would be a good time to release this long-awaited (by me) blog post. This is my first post. I don’t know how regularly I’ll update here, but this blog should be a good place to keep track of any fun or interesting projects I work on.

In September of 2020, I submitted my first R package, combinedevents, to CRAN[2]. In this post, I am going to demonstrate how the package works and touch on its finer details (like a vignette would), but first I’ll introduce what combined events are in track and field.

What are combined events?

Combined events (or multi-events) are competitions in which athletes compete in multiple track and field events encompassing running, jumping, and throwing. Athletes earn points for their performance in each event, and at the end of the competition the points summed across all events give the total score. The winner is the athlete with the highest point total.

Track and field competitions are held both outdoors and indoors. In indoor track and field, the track is typically 200 meters, half the length of a standard outdoor track, and many of the contested events differ from those held outdoors. The most common outdoor combined events are the men’s decathlon and the women’s heptathlon, while the most common indoor combined events are the men’s heptathlon and the women’s pentathlon. Each of these combined events is detailed in the tables below.

Outdoor combined events

Men’s decathlon

The men’s decathlon consists of 10 events competed over 2 days in the following order:

Decathlon events
Day 1         Day 2
100 meters    110 meters hurdles
Long jump     Discus throw
Shot put      Pole vault
High jump     Javelin throw
400 meters    1500 meters

Women’s heptathlon

The women’s heptathlon consists of 7 events competed over 2 days in the following order:

Heptathlon events
Day 1                 Day 2
100 meters hurdles    Long jump
High jump             Javelin throw
Shot put              800 meters
200 meters

Indoor combined events

Men’s heptathlon

The men’s heptathlon consists of 7 events over 2 days in the following order:

Heptathlon events
Day 1         Day 2
60 meters     60 meters hurdles
Long jump     Pole vault
Shot put      1000 meters
High jump

Women’s pentathlon

The women’s pentathlon is a one-day competition consisting of 5 events in the following order:

Pentathlon events
60 meters hurdles
High jump
Shot put
Long jump
800 meters

How are combined events scored?

The number of points athletes earn in each event is based on a set of scoring tables created by the International Association of Athletics Federations (IAAF), now known as World Athletics[3].

These scoring tables are progressive, which means that

  1. Accomplishing more difficult feats will earn you more points.
  2. Equal improvements in performance (e.g., a half-second improvement in 100m time) are not rewarded equally. To elaborate further with the 100m example, even though going from 11.5 to 11 seconds has the same half-second improvement as going from 11 to 10.5 seconds, the increase in points scored for the latter is greater because the latter performance is more difficult for sprinters to achieve. Essentially, the scoring tables reflect the fact that it’s harder to make performance gains as you approach the limits of human performance.
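
To make the second point concrete, here is a minimal base R sketch of the 100m example. It assumes the track-event formula published by World Athletics, points = floor(A * (B - T)^C), with the men’s 100m constants A = 25.4347, B = 18, and C = 1.81. The formula, the constants, and the helper name score_100m are my additions for illustration; this is not code from the package.

```r
# Assumed World Athletics track formula: points = floor(A * (B - T)^C),
# where T is the time in seconds. Constants are the men's 100 m values.
# score_100m() is a hypothetical helper, not a combinedevents function.
score_100m <- function(time_sec, A = 25.4347, B = 18, C = 1.81) {
  floor(A * (B - time_sec)^C)
}

times  <- c(11.5, 11.0, 10.5)
points <- score_100m(times)

# Points gained for each successive half-second improvement
gains <- diff(points)
gains[2] > gains[1]  # TRUE: the faster half-second is worth more points
```

Because the exponent C is greater than 1, the curve gets steeper as times get faster, which is what makes the tables progressive.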

Introduction to the package

To get started with combinedevents, first install the package if you haven’t already, and then load the package.

# install.packages("combinedevents")
library(combinedevents)

The two main functions in combinedevents are scores() and marks(). The package also includes the data frame dec, which contains the performances of 23 decathletes at the 2016 Summer Olympics.

Using scores()

The scores() function calculates scores for combined events competitions. As an example, let’s calculate the points for decathlon champion Ashton Eaton at the 2016 Summer Olympics:

scores(
  marks = c(`100m` = 10.46, LJ = 7.94, SP = 14.73, HJ = 2.01, 
            `400m` = 46.07, `110mH` = 13.8, DT = 45.49, PV = 5.2, 
            JT = 59.77, `1500m` = "4:23.33"),
  gender = "male", 
  combined_event = "decathlon"
  )
   decathlon    mark score
1       100m   10.46   985
2         LJ    7.94  1045
3         SP   14.73   773
4         HJ    2.01   813
5       400m   46.07  1005
6      110mH    13.8  1000
7         DT   45.49   777
8         PV     5.2   972
9         JT   59.77   734
10     1500m 4:23.33   789
11     TOTAL    <NA>  8893
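
As a quick sanity check on the TOTAL row, the ten event scores can be summed by hand in base R. The vector below is simply copied from the output above; the name eaton_scores is mine.

```r
# Event scores copied from the printed output above
eaton_scores <- c(`100m` = 985, LJ = 1045, SP = 773, HJ = 813, `400m` = 1005,
                  `110mH` = 1000, DT = 777, PV = 972, JT = 734, `1500m` = 789)

total <- sum(eaton_scores)
total  # 8893, matching the TOTAL row

# Split by competition day (first five events are contested on day 1)
day1 <- sum(eaton_scores[1:5])
day2 <- sum(eaton_scores[6:10])
```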

Note: as long as the combined_event argument isn’t NULL, you don’t have to supply the names of the individual events to the marks argument in scores():

# Not run
scores(
  marks = c(10.46, 7.94, 14.73, 2.01, 46.07, 
            13.8, 45.49, 5.2, 59.77, "4:23.33"),
  gender = "male", 
  combined_event = "decathlon"
  )
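
Notice that the 1500m mark is passed as the string "4:23.33". I won’t go into how the package handles this internally, but a rough base R sketch of the idea is below: convert a "minutes:seconds" string to seconds, then apply the track-event formula. The helper to_seconds() is hypothetical, and the men’s 1500m constants (A = 0.03768, B = 480, C = 1.85) are the published World Athletics values as I understand them, not something taken from the package source.

```r
# Hypothetical helper (not part of combinedevents): "m:ss.ss" -> seconds.
# Marks without a colon are treated as plain seconds.
to_seconds <- function(mark) {
  parts <- as.numeric(strsplit(mark, ":", fixed = TRUE)[[1]])
  if (length(parts) == 1) parts else 60 * parts[1] + parts[2]
}

t1500 <- to_seconds("4:23.33")  # 263.33 seconds

# Assumed track formula floor(A * (B - T)^C) with the men's 1500 m constants
score_1500m <- floor(0.03768 * (480 - t1500)^1.85)
score_1500m  # 789, the same score shown for Eaton's 1500m
```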

Figure 1: Ashton Eaton at the 2016 Summer Olympics.

Another feature of scores() is that it allows you to calculate the points for as many individual events as you want without having to specify a particular combined event.

scores(
  marks = c(LJ = 7, LJ = 7.01, LJ = 7.02,
            `400m` = 50, `400m` = 49.5, `400m` = 49),
  gender = "male"
  )
  event  mark score
1    LJ  7.00   814
2    LJ  7.01   816
3    LJ  7.02   818
4  400m 50.00   815
5  400m 49.50   838
6  400m 49.00   861

Using marks()

The marks() function calculates marks for track and field combined events competitions. This function performs the opposite action of scores(): you give it the scores you want to obtain, and it gives you the marks you need to achieve those scores. To see its usefulness, let’s first consider the performance of heptathlon champion Katarina Johnson-Thompson at the 2019 World Athletics Championships:

(hep_example <- scores(
  marks = c(`100mH` = 13.09, HJ = 1.95, SP = 13.86,
            `200m` = 23.08, LJ = 6.77, JT = 43.93, `800m` = "2:07.26"),
  gender = "female", 
  combined_event = "heptathlon"
  ))
  heptathlon    mark score
1      100mH   13.09  1111
2         HJ    1.95  1171
3         SP   13.86   785
4       200m   23.08  1071
5         LJ    6.77  1095
6         JT   43.93   743
7       800m 2:07.26  1005
8      TOTAL    <NA>  6981

The vector of scores for the events comprising the heptathlon can be easily extracted from the object hep_example.

(hep_scores <- hep_example$scores)
100mH    HJ    SP  200m    LJ    JT  800m 
 1111  1171   785  1071  1095   743  1005 

Now, let’s see the values of the marks returned when we supply hep_scores to marks().

marks(scores = hep_scores, gender = "female", combined_event = "heptathlon")
  heptathlon    mark score
1      100mH   13.09  1111
2         HJ    1.95  1171
3         SP   13.86   785
4       200m   23.08  1071
5         LJ    6.77  1095
6         JT   43.92   743
7       800m 2:07.30  1005
8      TOTAL    <NA>  6981

Notice that the marks for the first five events are the same as those in hep_example but are different for JT and 800m. In particular, the mark returned for JT is 1cm shorter than her actual mark and for 800m is 40 milliseconds (or 0.04 of a second) slower than her actual mark. This is a result of how the marks() function was written: for track events, the function returns the slowest time needed to achieve the input score. Similarly, for jumping and throwing events, marks() returns the shortest distance necessary to achieve the input score.
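
The javelin result can be reproduced with a short base R sketch. It assumes the field-event formula published by World Athletics, points = floor(A * (M - B)^C) with M in meters, and the women’s javelin constants A = 15.9803, B = 3.8, and C = 1.04. The helper score_jt_women() is hypothetical and is not how the package itself is written.

```r
# Assumed World Athletics field formula: points = floor(A * (M - B)^C),
# with M in meters. Constants are the women's javelin values.
# score_jt_women() is a hypothetical helper, not a combinedevents function.
score_jt_women <- function(m, A = 15.9803, B = 3.8, C = 1.04) {
  floor(A * (m - B)^C)
}

jt_scores <- score_jt_women(c(43.91, 43.92, 43.93))
jt_scores  # 742 743 743
```

Both 43.92 and 43.93 are worth 743 points, while 43.91 falls one point short, so 43.92 is the shortest throw that earns 743.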

Figure 2: Katarina Johnson-Thompson at the 2019 World Athletics Championships.

A couple of asides about marks()

While marks() acts as the natural opposite of scores(), the function is NOT the inverse of scores() because, as we just saw, you can have two different marks mapped to the same score.

For some events, when a score is given to marks(), the score returned may be different from the one input because some scores are not actually possible. This behavior stems from the fact that track and field measurements are only so granular: the finest units of measurement are hundredths of a second for track events and centimeters for jumping and throwing events. Thus, when an impossible score is given to marks(), the function will return the closest higher score that corresponds to a mark. To get a better idea of what I mean, let’s calculate the scores for two high jump marks: one of 2m (or about 6\(^{\prime}\) 6.75\(^{\prime\prime}\)) and the other 2.01m (roughly 6\(^{\prime}\) 7\(^{\prime\prime}\)).

scores(c(HJ = 2, HJ = 2.01), "male")
  event mark score
1    HJ 2.00   803
2    HJ 2.01   813

From the output, we see that scores of \(\{804, 805, \dots, 812\}\) are not possible for the men’s high jump event. So, when we give marks() those scores, only scores of 813 are returned.

HJ_scores <- 804:812
names(HJ_scores) <- rep("HJ", length(HJ_scores))

marks(HJ_scores, "male")
  event mark score
1    HJ 2.01   813
2    HJ 2.01   813
3    HJ 2.01   813
4    HJ 2.01   813
5    HJ 2.01   813
6    HJ 2.01   813
7    HJ 2.01   813
8    HJ 2.01   813
9    HJ 2.01   813
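
For the curious, this behavior can be mimicked with a small base R search. The sketch assumes the World Athletics field-event formula, points = floor(A * (M - B)^C), with M in centimeters for jumps and the men’s high jump constants A = 0.8465, B = 75, and C = 1.42. Both helper functions are hypothetical stand-ins; I’m not claiming this is how marks() is implemented.

```r
# Assumed World Athletics formula for jumps, with the height in centimeters.
# Both helpers are hypothetical, not combinedevents functions.
score_hj_men <- function(cm, A = 0.8465, B = 75, C = 1.42) {
  floor(A * (cm - B)^C)
}

# Lowest whole-centimeter height whose score is at least the target
mark_hj_men <- function(target, heights_cm = 76:300) {
  heights_cm[which(score_hj_men(heights_cm) >= target)[1]]
}

heights <- vapply(804:812, mark_hj_men, integer(1))
unique(heights)               # 201, i.e. 2.01m for every target score
score_hj_men(c(200, 201))     # 803 813
```

Searching over whole centimeters, rather than solving the formula algebraically, keeps the result on the grid of marks that can actually be measured.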

Final thoughts

I hope this post provided a solid introduction to both combinedevents, the package, and combined events, the competitions. For more information on the package, I recommend checking out its documentation (i.e., run help(package = "combinedevents")). Lastly, while I briefly touched on how combined events are scored, I’d like to take a more in-depth look at this topic in a future post. Thanks for reading!

Footnotes

  1. Outside of the Olympics, the World Athletics Championships are the most well-known athletics competitions. They are typically held every two years. There is also an indoor version of the event held every two years.

  2. The latest version (version 0.1.1) was released on 02/03/2021. It contains several very minor updates to the originally released package.

  3. World Athletics is the international governing body of track and field along with several other athletics sports.