Test-driven development is hard...and important.

I’ve been developing an R package that interacts with the Databrary.org API and with Datavyu annotation files stored locally or on Databrary alongside shared videos. If you’re curious, you can download the databraryapi package from this GitHub repository: https://github.com/PLAY-behaviorome/databraryapi.

Like many people in the software world, I’m entirely self-taught. Ok, I took a class in C programming at the U.S. Department of Agriculture’s Graduate School in the year before I applied to graduate schools in cognitive neuroscience. But my R package development skills are entirely self-taught. I must thank the incredibly generous geniuses who have gone before me and who share their code and their talents so freely and openly. Without the almost instant availability of these resources, my progress would be much, much slower.

Developers are an opinionated bunch, and there are at least as many styles (fads?) in software development as their are developers. One style that I have started to try to emulate is ‘test-driven development’. In TDD, the idea is that you create tests for how each part of your package should respond given this or that input. If your tests are through enough and correct, your package should work…at least within the boundaries of what you tested.

For the latest version of the package (0.1.4), I added a bunch of new tests to evaluate several new functions I’ve added to the package. Let’s just say that getting through my own self-designed test battery was challenging. But as a result, the code is cleaner and less buggy than it would be if I hadn’t gone this route.

In the larger sense, TDD is sort of a “plan for the worst” style. I like it because I know it forces me to be more precise and specific than I might otherwise choose to be. Since I’ve been using the databraryapi more often for keeping tabs on what’s going on in the Databrary world, that’s a very good thing.

In case you’re curious what the package can do, check this out:

## Skipping install of 'databraryapi' from a github remote, the SHA1 (a6a700ea) has not changed since last install.
##   Use `force = TRUE` to force installation

Here is a list of some recently authorized researchers:

## # A tibble: 6 x 6
##      id sortname  prename     affiliation               url   institution
##   <int> <chr>     <chr>       <chr>                     <chr> <lgl>      
## 1  4744 Tadepalli Prasad      Oregon State University   <NA>  NA         
## 2  4871 Black     Jessica Jan Saint Vincent College     <NA>  NA         
## 3  4477 Prull     Matthew     Whitman College           <NA>  NA         
## 4   753 Childers  Jane        Trinity University        <NA>  NA         
## 5  4881 Lindskog  Marcus      Uppsala University        <NA>  NA         
## 6  4852 Hudson    Danae       Missouri State University <NA>  NA

And here is a very simple plot of the growth in authorized researchers and institutions over time:

with(databraryapi::read_csv_data_as_df(), plot(Auth_Investigators, Institutions))
## No encoding supplied: defaulting to UTF-8.

I’ve said in other places that I think scientists will eventually interact with their data programmatically – via scripts like this – with the data and materials stored in repositories that others can also access programmatically. Furthermore, I think that the philosophy of test-driven development can help make our software AND the results and findings we derive from it more robust and reproducible.

Rick Gilmore
Rick Gilmore
Professor of Psychology

My research interests include perceptual development, big data approaches to behavioral science, and open science.