How to download Kobotoolbox data in R

Kobotoolbox is a popular platform used by non-profit organizations across the globe to collect data. With an R package, it is now possible to download data from Kobotoolbox into R conveniently.

Kobotoolbox

Kobotoolbox is perhaps the most widely used data collection tool among non-profit organizations across the globe. UNDP uses it to fight malaria, while the Society for Odonate Studies uses it to track dragonfly migration. I came across the tool a couple of years ago while helping Jaljeevika with its initiative to start using data science in the fisheries industry in India.

Developed, maintained and supported by the community, Kobotoolbox is an advanced tool with modern capabilities. It can even collect data at a remote location without connectivity and transfer the data to the server once the device connects to the internet. Despite these capabilities, I found it a little difficult to download the data into R directly. I had to create an export manually and download the ‘csv’ or ‘xls’ file. But an export does not update itself when new data is added to the survey, so a new export had to be created every time the data needed to be analysed.

The other option was to use the Kobotoolbox APIs. This did make it possible to access data directly without creating an export. However, it was an additional task and a bit inconvenient. Usually, one can easily find a package that is a simple-to-use wrapper around an API. That is why I started working on the KoboconnectR package. After several failed attempts, it is finally on CRAN (https://cran.r-project.org).

KoboconnectR

KoboconnectR is a small R package with simple functions that enable downloading data from Kobotoolbox.

Installation

The package can be installed from CRAN.

install.packages("KoboconnectR")

The development version can be installed using:

# install.packages("devtools")  # Install devtools first, if not already installed
devtools::install_github("asitav-sen/KoboconnectR")

Check the API token

library(KoboconnectR)
get_kobo_token(url="kobo.humanitarianresponse.info", uname="userid", pwd="password")

Check that a token is returned. The output usually looks like this:

$token
[1] "nask976bdshuiqw9829nsh718"

Check the support documentation of Kobotoolbox for details.
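The calls in this post pass the user id and password as plain strings. As a minimal sketch, the credentials could instead be read from environment variables so that they do not end up in a script. The helper `get_kobo_credentials()` and the variable names `KOBO_USER` and `KOBO_PWD` are hypothetical and chosen only for this example; they are not part of KoboconnectR.

```r
# Hypothetical helper (not part of KoboconnectR): read Kobotoolbox credentials
# from the environment variables KOBO_USER and KOBO_PWD (names assumed here).
get_kobo_credentials <- function() {
  uname <- Sys.getenv("KOBO_USER", unset = NA)
  pwd <- Sys.getenv("KOBO_PWD", unset = NA)
  # Fail early if either variable is missing or empty
  if (is.na(uname) || is.na(pwd) || uname == "" || pwd == "") {
    stop("Set KOBO_USER and KOBO_PWD, for example in ~/.Renviron")
  }
  list(uname = uname, pwd = pwd)
}

# Usage (requires the package and network access, so it is left commented out):
# creds <- get_kobo_credentials()
# KoboconnectR::get_kobo_token(url = "kobo.humanitarianresponse.info",
#                              uname = creds$uname, pwd = creds$pwd)
```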

Extracting data

Extracting data is a two-step process. The first step is to identify the asset id. The second is to use the asset id to extract the data.

Check the assets you have access to

KoboconnectR::kobotools_api(url="kobo.humanitarianresponse.info", simplified=TRUE, uname="userid", pwd="password")

The output will be a simplified data frame if simplified = TRUE is used in the kobotools_api function.

From the data frame, you can find the asset id you need (under the column asset).
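Looking up the asset id by eye works for a handful of surveys. For longer lists, a small filter can help. The sketch below runs on a mock data frame: the `asset` column is the one mentioned above, while the `name` column, the sample values and the helper `find_asset_id()` are assumptions made for this illustration.

```r
# Hypothetical helper: return the asset ids whose survey name matches a pattern.
# Assumes the data frame has a name column ("name") and an id column ("asset").
find_asset_id <- function(assets, pattern, name_col = "name", id_col = "asset") {
  hits <- assets[grepl(pattern, assets[[name_col]], ignore.case = TRUE), , drop = FALSE]
  hits[[id_col]]
}

# Mock data standing in for the output of kobotools_api(simplified = TRUE)
assets <- data.frame(
  name = c("Rapid Pond Survey", "Household Survey"),
  asset = c("aXXXXXXXX1", "aXXXXXXXX2"),
  stringsAsFactors = FALSE
)

pond_id <- find_asset_id(assets, "pond")
print(pond_id)
```

If the real data frame uses a different column for the survey name, pass it through `name_col`.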


Extract the data

Update (Download data directly using new function)

The data can now be extracted directly using the kobo_df_download() function. This function creates an export, downloads the data and then deletes the export from the server. Please note the sleep parameter. It defines the time, in seconds, that the function waits before each step, for example the time it waits after creating an export before attempting to download it. The default value is 2, i.e. 2 seconds. If the download fails, please increase the value of sleep (using the development version) and try again. If the problem persists, please do not hesitate to raise an issue at the link provided below.

KoboconnectR::kobo_df_download(uname = "username", pwd = "password",
                               assetid = "assetid",
                               all = "false", lang = "_default",  # pass lang only once, e.g. "English (en)" instead of "_default"
                               hierarchy = "false", include_grp = "true", grp_sep = "/", fsep = ";",
                               multi_sel = "both", media_url = "true", fields = NULL, sub_ids = NULL,
                               sleep = 2)  # seconds to wait before each step
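The advice to increase sleep when a download fails can be automated. The following is a minimal sketch, not a feature of KoboconnectR: `download_with_backoff()` is a hypothetical wrapper that retries a download function with progressively longer waits. It assumes that a successful download returns a data frame and that anything else (an error or another return value) counts as a failure.

```r
# Hypothetical wrapper: call download_fn(sleep = s) for each s in sleeps until
# a data frame comes back. Assumes success means "returns a data frame".
download_with_backoff <- function(download_fn, sleeps = c(2, 5, 10)) {
  for (s in sleeps) {
    # Treat an error as a failed attempt instead of aborting
    result <- tryCatch(download_fn(sleep = s), error = function(e) NULL)
    if (is.data.frame(result)) {
      return(result)
    }
    message("Download failed with sleep = ", s, "; trying a longer wait.")
  }
  stop("Download failed for all sleep values: ", paste(sleeps, collapse = ", "))
}

# Usage (requires the package and network access, so it is left commented out):
# df <- download_with_backoff(function(sleep) {
#   KoboconnectR::kobo_df_download(uname = "username", pwd = "password",
#                                  assetid = "assetid", sleep = sleep)
# })
```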

Alternative methods are mentioned below.

Use the asset id and plug it into the kobotools_kpi_data function to download the data.

KoboconnectR::kobotools_kpi_data(assetid= "assetid", url="kobo.humanitarianresponse.info", uname="username", pwd="password")

This returns a list parsed from the JSON response. The main data is usually inside results.

The following shows the summary of the list returned by one kobotools_kpi_data query.


##          Length Class  Mode   
## count     1     -none- numeric
## next      0     -none- NULL   
## previous  0     -none- NULL   
## results  39     -none- list
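Since results is a list of submissions, it usually has to be reshaped before analysis. The sketch below shows one way to do that in base R, using mock records. The helper `records_to_df()` and the field names in the mock data are assumptions for illustration. In this simplified version, fields that are missing from a record, or that are nested (not a single value), become NA, and all values are kept as character strings.

```r
# Hypothetical helper: turn a list of records (named lists) into a data frame.
# Missing or nested fields become NA; all values are stored as character.
records_to_df <- function(records) {
  fields <- unique(unlist(lapply(records, names)))
  rows <- lapply(records, function(rec) {
    vals <- lapply(fields, function(f) {
      v <- rec[[f]]
      if (is.null(v) || is.list(v) || length(v) != 1) NA_character_ else as.character(v)
    })
    names(vals) <- fields
    # check.names = FALSE keeps field names such as "_id" unchanged
    as.data.frame(vals, stringsAsFactors = FALSE, check.names = FALSE)
  })
  do.call(rbind, rows)
}

# Mock records standing in for the results element of the downloaded list
records <- list(
  list(`_id` = 1, pond_name = "Pond A"),
  list(`_id` = 2, pond_name = "Pond B", remarks = "dry in summer")
)
survey_df <- records_to_df(records)
print(survey_df)
```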

The data downloaded using this procedure is in JSON format, which, for some of us, is not a very convenient format to deal with. We prefer straightforward ‘csv’ and ‘xls’ files. That is why it is possible to create exports from R using KoboconnectR. The export can then be downloaded in R.

‘Kobotoolbox’ provides the ability to create exports of survey results. These exports can be in different formats, including ‘csv’ and ‘xls.’ The manual process is described in the documentation.

Viewing the list of existing exports

Using the ‘kobo_exports()’ function, you can view the list of existing exports. The list includes the URL of each ‘xls’ or ‘csv’ file, which you can use to download, import or read the file in R.

KoboconnectR::kobo_exports(url="kobo.humanitarianresponse.info", uname="", pwd="")

It is possible to use the URL of the export to download the data in R.

Creating export

Exported data is not updated automatically when new data is entered in the survey. So, you may need to create a new export to include the new data. To create an export, use the ‘kobo_export_create()’ function. Please note that ‘Kobotoolbox’ has limited storage, and you may have to delete existing exports manually to clean up.

# Create the export; the URL of the new export is returned
new_export_url <- KoboconnectR::kobo_export_create(url="kobo.humanitarianresponse.info", uname="", pwd="",
                               assetid="", type="csv", all="false", lang="_default",
                               hierarchy="false", include_grp="true", grp_sep="/")

# Download the export with the same credentials
resp <- httr::GET(new_export_url, httr::authenticate(user="userid", password="password"))
raw_content <- httr::content(resp, as="raw") # Extract the raw bytes of the file
writeBin(raw_content, "data.csv") # Write to a local file
read.csv("data.csv", sep=";") |> head() |> DT::datatable() # Read and display the first rows

On successful execution, the URL of the created export will be returned and printed. You can use this URL to download, import or read data in R.

## Export instruction sent successfully. Waiting for result.
## [1] "Export successful"
## [1] "https://kobo.humanitarianresponse.info/private-media/scary_scarecrow/exports/Yavatmal_Rapid_Pond_Survey_-_latest_version_-_English_en_-_2022-05-26-08-18-47.csv"
## [[1]]
## [1] "https://kobo.humanitarianresponse.info/private-media/scary_scarecrow/exports/Yavatmal_Rapid_Pond_Survey_-_latest_version_-_English_en_-_2022-05-26-08-18-47.csv"
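Writing the export to a local file is optional. As a minimal sketch, the raw bytes of a ‘csv’ export can also be parsed in memory. The helper `parse_export_bytes()` is hypothetical, and the mock bytes below stand in for the raw content of a downloaded export. The sketch assumes a text export with ‘;’ as the field separator, as in the example above.

```r
# Hypothetical helper: parse the raw bytes of a csv export without a temp file.
# Assumes a plain-text export that uses ";" as the field separator.
parse_export_bytes <- function(raw_bytes, sep = ";") {
  text <- rawToChar(raw_bytes)
  utils::read.csv(text = text, sep = sep, stringsAsFactors = FALSE)
}

# Mock bytes standing in for the raw content of a downloaded export
mock_bytes <- charToRaw("id;pond_name\n1;Pond A\n2;Pond B\n")
export_df <- parse_export_bytes(mock_bytes)
print(export_df)
```

Keeping a local copy, as in the original snippet, is still useful when the same export needs to be read again later.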

Please download the package, use it and let me know your feedback.

Call for contribution

Please feel free to contribute to the package. The development version is available on GitHub. Please do not hesitate to raise issues or suggest new ideas there.