A dataset references a specific collection of patient data (e.g. EHR or CLAIMS).


A dataset is one or more tables whose format is defined by a schema.


Datasets Can Share Schemas

For example, the datasets EHR_TUVA and EHR_TUVA_SAMPLE both have the schema EHR_TUVA_SCHEMA.

Unlike schemas, not all datasets are available to every user: availability depends on your subscription plan. Users on the free trial start with access to all the sample datasets and can upgrade to access the premium datasets.


Listing datasets, or getting information on a specific dataset, returns the following fields:

  • Name: The name of the dataset
  • Schema: The schema of that dataset
  • Description: A description of that dataset
  • Available: Whether that dataset is currently available to you
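
As a rough illustration of how these four fields might be handled on the client side, here is a minimal Python sketch. The `DatasetInfo` class, the helper functions, and the sample records (including their descriptions and availability values) are hypothetical and exist only for this example; they are not part of the documented API, and the real response format may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetInfo:
    """Hypothetical client-side record mirroring the four fields listed above."""
    name: str         # the name of the dataset
    schema: str       # the schema of that dataset
    description: str  # a description of that dataset
    available: bool   # whether the dataset is currently available to you


def group_by_schema(datasets):
    """Map each schema name to the names of the datasets that use it."""
    groups = defaultdict(list)
    for ds in datasets:
        groups[ds.schema].append(ds.name)
    return dict(groups)


def available_names(datasets):
    """Return the names of the datasets flagged as available."""
    return [ds.name for ds in datasets if ds.available]


# Illustrative records only: the descriptions and availability flags are
# made up to resemble what a free-trial user might see.
datasets = [
    DatasetInfo("EHR_TUVA", "EHR_TUVA_SCHEMA", "Premium EHR data (illustrative)", False),
    DatasetInfo("EHR_TUVA_SAMPLE", "EHR_TUVA_SCHEMA", "Sample EHR data (illustrative)", True),
]

print(group_by_schema(datasets))
print(available_names(datasets))
```

Grouping by the schema field shows two datasets sharing one schema, and filtering on the availability flag picks out the datasets that can actually be used under the current plan.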


You can list all datasets here.

For a given dataset you can retrieve data, subject to the following:

  • For sample datasets, you can retrieve data without specifying a cohort.
  • For premium datasets, a cohort must be specified, and each export is limited to 50,000 patients.
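
The two rules above can be sketched as a small client-side planning helper. This is only a sketch under stated assumptions: the function names, the `is_sample` flag, and the idea of splitting a large cohort into several exports are hypothetical. The source states only that a cohort is required for premium datasets and that each export is limited to 50,000 patients; how the service treats a larger cohort is not documented here.

```python
MAX_PATIENTS_PER_EXPORT = 50_000  # per-export limit stated for premium datasets


def split_cohort(patient_ids, limit=MAX_PATIENTS_PER_EXPORT):
    """Split a list of patient IDs into consecutive batches of at most `limit`."""
    return [patient_ids[i:i + limit] for i in range(0, len(patient_ids), limit)]


def plan_exports(is_sample, patient_ids=None):
    """Return the cohort to use for each export request (hypothetical helper).

    `is_sample` is an assumed flag saying whether the dataset is a sample dataset.
    A `None` entry in the result means "export without a cohort".
    """
    if patient_ids is None:
        if is_sample:
            return [None]  # sample datasets: no cohort needed
        raise ValueError("premium datasets require a cohort")
    ids = list(patient_ids)
    if is_sample:
        return [ids]  # the 50,000-patient limit is stated only for premium datasets
    return split_cohort(ids)


# A premium cohort of 120,000 patients would need three exports.
batches = plan_exports(is_sample=False, patient_ids=range(120_000))
print([len(batch) for batch in batches])
```

Splitting on the client is one possible design choice; a caller could just as well reject oversized cohorts up front and ask the user to narrow the cohort.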