CohortCharacteristics

  1. Installation
  2. Getting Started
  3. Core Concepts
  4. Summary Functions
    1. Core Characterization:
  5. Table Functions
    1. Formatted Output:
  6. Visualization Functions
    1. Plotting Capabilities:

The CohortCharacteristics R package summarizes and visualizes the characteristics of patient cohorts in data mapped to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). It provides functions for analyzing patient demographics, cohort overlap, attrition, and the timing between cohort entries. These summaries help researchers understand the composition of their study cohorts and report it transparently and reproducibly.

Installation

You can install the latest version of CohortCharacteristics from CRAN:

install.packages("CohortCharacteristics")

Alternatively, to install the development version from GitHub, use the following command:

# install.packages("pak")
pak::pkg_install("darwin-eu/CohortCharacteristics")

Getting Started

Here’s a simple example to get you started with CohortCharacteristics. It creates a mock CDM that already contains example cohorts, summarizes the characteristics of one of them, and displays the result as a table.

# Load necessary libraries
library(CDMConnector)
library(CohortCharacteristics)
library(dplyr)

# 1. Create a mock CDM instance
cdm <- mockCohortCharacteristics()

# 2. Summarise cohort characteristics
characteristics <- summariseCharacteristics(
  cohort = cdm$cohort1
)

# 3. View the results as a table
tableCharacteristics(characteristics)

Core Concepts

The package follows a standardized three-tier analysis pattern: summarise*(), plot*(), and table*(). Each summarise*() function returns a standardized result object (summarised_result), and the matching plot*() and table*() functions take that object as input to produce visualizations and formatted tables.

flowchart TD
    subgraph "Input Layer"
        CDM["OMOP CDM Database"]
        COHORT["cohort_table objects"]
    end

    subgraph "Analysis Pipeline"
        A[summarise* functions] --> B{summarised_result objects}
        B --> C[plot* functions]
        B --> D[table* functions]
    end

    subgraph "Output Layer"
        C --> E["Interactive Visualizations"]
        D --> F["Formatted Tables"]
    end

    CDM --> A
    COHORT --> A
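
As a minimal sketch of this pattern, the snippet below runs one summarise*() call and passes the same result object to the matching table*() and plot*() functions. It assumes the mock data from mockCohortCharacteristics() used in Getting Started. The table and plot calls may need optional rendering packages to be installed.

```r
library(CohortCharacteristics)

# Mock CDM with example cohorts (same helper as in Getting Started)
cdm <- mockCohortCharacteristics()

# Tier 1: summarise*() returns a standardized result object
result <- summariseCohortAttrition(cdm$cohort1)

# The result is a data frame with a fixed set of columns
class(result)
colnames(result)

# Tiers 2 and 3: the same object feeds both output functions
tableCohortAttrition(result)
plotCohortAttrition(result)
```

Because every summarise*() function returns the same kind of object, results can be stored or inspected with ordinary data frame tools before being turned into tables or plots.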

Summary Functions

Core Characterization:

  • summariseCharacteristics(strata, cohortId, ageGroup, tableIntersect, cohortIntersect, conceptIntersect) - Generate comprehensive cohort characteristics.
  • summariseLargeScaleCharacteristics(window, eventInWindow, minimumFrequency) - Summarize all clinical events in time windows.
  • summariseCohortTiming(restrictToFirstEntry) - Analyze timing between cohort entries.
  • summariseCohortOverlap() - Analyze overlap between multiple cohorts.
  • summariseCohortAttrition() - Generate cohort attrition summaries.
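
As a hedged illustration of the ageGroup argument listed above, the following sketch adds two age bands to the characteristics summary. The band boundaries are arbitrary values chosen for this example. Argument names and defaults can differ between package versions, so check the help page of your installed version.

```r
library(CohortCharacteristics)
library(dplyr)

cdm <- mockCohortCharacteristics()

# Summarize cohort1, adding two illustrative age bands (0-49 and 50-150)
result <- summariseCharacteristics(
  cohort = cdm$cohort1,
  ageGroup = list(c(0, 49), c(50, 150))
)

# List which variables were summarized
result |>
  distinct(variable_name)
```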

Table Functions

Formatted Output:

  • tableCharacteristics(result, header) - Create formatted characteristic tables.
  • tableLargeScaleCharacteristics(result) - Format large-scale characteristic results.
  • tableTopLargeScaleCharacteristics(result, topConcepts) - Show top concepts from large-scale analysis.
  • tableCohortOverlap(result) - Format cohort overlap tables.
  • tableCohortTiming(result, timeScale, uniqueCombinations) - Format timing analysis tables.
  • tableCohortAttrition(result) - Format attrition tables.

Visualization Functions

Plotting Capabilities:

  • plotCharacteristics(result, plotStyle, facet, colour) - Create characteristic plots (boxplot, barplot).
  • plotCohortOverlap(result, uniqueCombinations) - Visualize cohort overlaps.
  • plotCohortTiming(result, plotType, timeScale, uniqueCombinations) - Create timing plots (boxplot, densityplot).
  • plotCohortAttrition(result, type) - Generate attrition flowcharts.
  • plotComparedLargeScaleCharacteristics(result, colour, reference) - Create comparative scatter plots for large-scale characteristics.

Copyright © 2017-2025 IOMED Medical Solutions SL.