Use reproducible examples (in R) when asking for help

3 minute read

Published:

Regardless of the language you are coding in, at some point, there will come a time when you just need to ask for help. When you do, you should always include a reproducible example to help explain your problem. Failing to do so will just irritate readers and limit the number of helpful responses.

So what exactly is a reproducible example? Stack overflow has a nice page on this.

For R users there is a wonderful package created by Jenny Bryan (one of the Rstudio folks) to make life a little easier. The package is called reprex and it is a dream to use.

So let’s create a reproducible example! I was recently asked why one of the packages i developed (buoydata) was not returning any buoy information for a given spatial region (between latitudes [33-34] and longitudes [-120 -119]). For example

buoydata::buoyDataWorld |>
  dplyr::filter(LAT > 33,LAT < 34) |>
  dplyr::filter(LON > -119, LON < -120)
#> # A tibble: 0 × 13
#> # ℹ 13 variables: ID <chr>, Y1 <dbl>, YN <dbl>, nYEARS <dbl>, LAT <dbl>,
#> #   LON <dbl>, STATION_LOC <chr>, STATION_NAME <chr>, TTYPE <chr>,
#> #   TIMEZONE <chr>, OWNER <chr>, OWNERNAME <chr>, COUNTRYCODE <chr>

Created on 2025-02-10 with reprex v2.1.0

This chunk of R code was created using the reprex package (as is displayed under the output). Generating this is simple. All you need to do is wrap the reprex function (from the reprex package) around your code.

reprex::reprex({
  buoydata::buoyDataWorld |>
    dplyr::filter(LAT > 33,LAT < 34) |>
    dplyr::filter(LON > -119, LON < -120)
})

When this this bit of code is executed in R, it is rendered to the clipboard (and also to the viewer tab in RStudio). All you need to do is paste it! Whether that is in an email, on stack overflow, or as an issue on GitHub

The recipient can then see exactly what you are trying to do, the functions you are using, and the expected output. This makes it easier for anyone to help troubleshoot your problem.

For this particular example, the problem is that the LON arguments were in reverse order. Eg.

buoydata::buoyDataWorld |>
  dplyr::filter(LAT > 33,LAT < 34) |>
  dplyr::filter(LON > -120 , LON < -119)
#> # A tibble: 8 × 13
#>   ID       Y1    YN nYEARS   LAT   LON STATION_LOC   STATION_NAME TTYPE TIMEZONE
#>   <chr> <dbl> <dbl>  <dbl> <dbl> <dbl> <chr>         <chr>        <chr> <chr>   
#> 1 46025  1982  2023     42  33.8 -119. 33NM WSW of … Santa Monic… 3-me… P       
#> 2 46219  2004  2023     20  33.2 -120. San Nicolas … <NA>         Wave… P       
#> 3 46238  2008  2013      6  33.8 -120. Santa Cruz B… <NA>         Wave… P       
#> 4 46249  2011  2012      2  33.8 -120. Santa Cruz I… <NA>         Wave… P       
#> 5 46251  2013  2023     11  33.8 -120. Santa Cruz B… <NA>         Wave… P       
#> 6 46252  2014  2015      2  34.0 -119. Anacapa Pass… <NA>         Wave… P       
#> 7 46255  2015  2017      3  33.4 -120. Begg Rock, C… <NA>         Wave… P       
#> 8 46262  2017  2018      2  33.7 -119. Santa Barbar… <NA>         Wave… P       
#> # ℹ 3 more variables: OWNER <chr>, OWNERNAME <chr>, COUNTRYCODE <chr>

Created on 2025-02-10 with reprex v2.1.0

A quick, and easy resolution!

Tags: