Programmatic access
You can access refbank data in R through the refbankr package, or in Python directly through the Redivis API.
The codebook contains descriptions of all tables and what each column means.
- Install the refbankr R package:
remotes::install_github("refbank/refbankr")
Access the data:
library(refbankr)
# Load table as tidyverse tibble
datasets <- get_datasets()
# experiment-level information about the conditions
conditions <- get_conditions()
# trial-level information about who was in the trial, what the stimuli were, and when in a game it occurred
trials <- get_trials()
# information about participant choices on each trial
choices <- get_choices()
# language data for each trial
messages <- get_messages()
# meta-data about stimuli
images <- get_images()
# meta-data about participants
participants <- get_players()
There are also pre-computed summary tables that give a quick overview of the data without downloading the full tables.
# summary stats aggregated per game
per_game_summary <- get_per_game_summary()
# counts of data per dataset and condition
dataset_summary <- get_dataset_summary()
By default, functions return all datasets in the current version of the data, but you can specify a different version or a specific set of datasets.
# learn what the current version number is
get_current_version()
# request a particular version and/or a particular set of datasets
trials <- get_trials(version="7.3", datasets=c("hawkins2023_frompartners", "hawkins2021_respect"))
For testing, you can limit the number of results retrieved.
trials <- get_trials(max_results=100)
# which rows are returned is non-deterministic
For convenience, you can get some tables with other tables already joined in.
messages <- get_messages(include_trial_data=TRUE,
                         include_player_data=TRUE,
                         include_image_data=TRUE,
                         include_condition_data=TRUE)
choices <- get_choices(include_trial_data=TRUE,
                       include_player_data=TRUE,
                       include_image_data=TRUE,
                       include_condition_data=TRUE)
trials <- get_trials(include_image_data=TRUE,
                     include_condition_data=TRUE)
You can also download the image files where available.
download_image_files(destination="images/")
In addition to the primary data tables, there are also derived tables of processed data, including vector embeddings, cosine similarities, linguistic parses, and message annotations.
embeddings <- get_sbert_embeddings()
# available sim_type values:
# "to_last", "to_next", "to_first", "diverge", "diff", "idiosyncrasy"
similarities <- get_cosine_similarities(sim_type = c("to_last", "to_first"))
# stanza-parsed linguistic output for each message
parsed <- get_parsed_messages()
# human or model annotations for messages
annotated <- get_annotated_messages()
See the refbankr GitHub repository for more information.
- Install the redivis-python client library:
pip install --upgrade redivis
Access the data:
import redivis
organization = redivis.organization("datapages")
dataset = organization.dataset("refbank")
table = dataset.table("summary")
# Load table as a dataframe
df = table.to_pandas_dataframe()
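The Python route returns plain pandas dataframes, so the joined-table convenience that refbankr offers in R would have to be reproduced by hand. The sketch below shows one way to do that with a left join in pandas. It uses small toy dataframes in place of real refbank tables, and the column names (`trial_id`, `text`, `target`, `rep_num`) and the helper `join_trial_data` are hypothetical, chosen only for illustration; consult the codebook for the actual table and column names.

```python
import pandas as pd

# Toy stand-ins for two refbank tables loaded via to_pandas_dataframe().
# All column names below are assumptions, not the real refbank schema.
messages = pd.DataFrame({
    "trial_id": [1, 1, 2],
    "text": ["the dancer", "arms up", "ice skater"],
})
trials = pd.DataFrame({
    "trial_id": [1, 2, 3],
    "target": ["A", "B", "C"],
    "rep_num": [1, 1, 2],
})


def join_trial_data(messages, trials, key="trial_id"):
    """Left-join trial-level columns onto messages (hypothetical helper).

    A left join keeps every message row. validate="many_to_one" raises
    an error if the key is not unique in the trials table, which guards
    against accidentally duplicating message rows.
    """
    return messages.merge(trials, on=key, how="left", validate="many_to_one")


joined = join_trial_data(messages, trials)
print(joined)
```

A left join is used here so that no message is dropped when its trial row is missing; an inner join would silently discard such messages.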