Quarto reports • visOmopResults

visOmopResults supports table and plot creation functions that can be used to generate publication-ready figures and tables. As an example, in this vignette we show how to create tables and figures for a DARWIN-EU study report.

Quarto document settings

In the header of the Quarto document, we can set a title to identify the study and the document. Additionally, we can point to a Word document containing the template styles for DARWIN reports. The DOCX template can be found here, together with the Quarto script used to generate the report.

A Quarto header for producing a Word report might look like this:

---
title: "DARWIN-EU PX-CY-Z: Tables and Figures"
format:
  docx:
    reference-doc: inst/darwinReportRef.docx
    fig-cap-location: top
execute:
  echo: false
  message: false
  warning: false
lof: true
---

Next, we use the first R chunk to load packages, load data, and set options/variables used across the script. The results we use are obtained from mock data created with the IncidencePrevalence and CohortCharacteristics packages, which we have stored in the [inst](https://github.com/darwin-eu/visOmopResults/tree/main/inst) folder of the packge.

# Load necessary packages ----
library(visOmopResults)
library(IncidencePrevalence)
library(CohortCharacteristics)
library(dplyr)
library(tidyr)
library(ggplot2)

# Load mock results stored in the package ----
data <- visOmopResults::data

# Global options ----
knitr::opts_chunk$set(
  out.width  = "95%",  # figures occupy ~95% of document width
  out.height = "auto",
  dpi        = 320,    # ensure figure quality
  fig.width  = 6,      # default aspect ratio (can be overridden per-figure)
  fig.height = 3,
  results    = "asis"  # enable Markdown produced via cat() inside chunks
)

# DARWIN style for visOmopResults plots and tables.
style <- "darwin"
tableType <- "flextable"
plotType  <- "ggplot"
setGlobalPlotOptions(style = style, type = plotType) 
setGlobalTableOptions(style = style, type = tableType)

# Calibri font in ggplot figures (requires the extrafont package to be available)
# Read vignette on styles to learn more
requireExtrafont()

For tables we choose the "flextable" type because it transfers best to Word documents.

The function requireExtrafont() installs (if needed) the extrafont package and loads the Calibri font for use with ggplot2 when using the DARWIN style.

Characterisation results

The report starts with a table showing baseline characteristics of the denominator cohort in our mock study.

data$summarised_characteristics |>
  dplyr::filter(variable_name != "Sex") |>
  tableCharacteristics(
    header = c("sex"),
    hide   = c("cdm_name", "cohort_name", "table_name"),
    type   = tableType,
    .options = list(style = style)
  )

Variable name	Variable level	Estimate name	Sex
Variable name	Variable level	Estimate name	overall	Female	Male
Number records	–	N	2,564	1,302	1,262
Number subjects	–	N	2,564	1,302	1,262
Cohort start date	–	Median [Q25 - Q75]	2008-01-01 [2008-01-01 - 2008-01-01]	2008-01-01 [2008-01-01 - 2008-01-01]	2008-01-01 [2008-01-01 - 2008-01-01]
		Range	2008-01-01 to 2008-01-01	2008-01-01 to 2008-01-01	2008-01-01 to 2008-01-01
Cohort end date	–	Median [Q25 - Q75]	2018-01-01 [2018-01-01 - 2018-01-01]	2018-01-01 [2018-01-01 - 2018-01-01]	2018-01-01 [2018-01-01 - 2018-01-01]
		Range	2008-01-17 to 2018-01-01	2008-01-17 to 2018-01-01	2009-03-24 to 2018-01-01
Age	–	Median [Q25 - Q75]	46 [36 - 56]	45 [36 - 55]	46 [37 - 56]
		Mean (SD)	47.54 (14.81)	47.52 (15.21)	47.57 (14.40)
		Range	21 to 99	21 to 99	21 to 98
Prior observation	–	Median [Q25 - Q75]	16,834 [13,477 - 20,588]	16,734 [13,320 - 20,410]	16,932 [13,636 - 20,634]
		Mean (SD)	17,545.28 (5,406.72)	17,535.78 (5,551.58)	17,555.07 (5,255.26)
		Range	7,729 to 36,260	7,929 to 36,260	7,729 to 36,115
Future observation	–	Median [Q25 - Q75]	4,010 [3,891 - 4,117]	4,018 [3,900 - 4,118]	4,005 [3,878 - 4,116]
		Mean (SD)	3,924.53 (418.47)	3,937.65 (405.74)	3,910.98 (430.96)
		Range	16 to 4,201	16 to 4,199	448 to 4,201
Days in cohort	–	Median [Q25 - Q75]	3,654 [3,654 - 3,654]	3,654 [3,654 - 3,654]	3,654 [3,654 - 3,654]
		Mean (SD)	3,594.61 (332.90)	3,601.79 (324.60)	3,587.20 (341.22)
		Range	17 to 3,654	17 to 3,654	449 to 3,654
Comorbidities flag -inf to 0	Hiv	N (%)	475 (18.53%)	245 (18.82%)	230 (18.23%)
	Depression	N (%)	2,531 (98.71%)	1,285 (98.69%)	1,246 (98.73%)
	Asthma	N (%)	1,234 (48.13%)	623 (47.85%)	611 (48.42%)
Comedications flag -180 to 0	Antidiabetes	N (%)	26 (1.01%)	11 (0.84%)	15 (1.19%)
	Opioids	N (%)	49 (1.91%)	27 (2.07%)	22 (1.74%)

If the default order of variable names provided by tableCharacteristics() does not match the study requirements, we can create a customised table with this package.

data$summarised_characteristics |>
  dplyr::filter(variable_name != "Sex") |>
  dplyr::mutate(
    variable_name = customiseText(
      variable_name, 
      custom = c(
        "Comorbidities"  = "Comorbidities flag -inf to 0", 
        "Comedications"  = "Comedications flag -180 to 0"
      )
    ),
    variable_level = customiseText(
      variable_level, 
      custom = c("HIV" = "Hiv")
    )
  ) |>
  visOmopTable(
    header = c("sex"),
    estimateName = c(
      "N (%)"               = "<count> (<percentage>%)",
      "N"                   = "<count>",
      "Median [Q25 - Q75]"  = "<median> [<q25> - <q75>]",
      "Mean (SD)"           = "<mean> (<sd>)",
      "Range"               = "<min> to <max>"
    ),
    factor = list(
      "sex" = c("overall", "Male", "Female"),
      "variable_name" = c(
        "Number records", "Number subjects", "Age", "Days in cohort", "Prior observation",
        "Future observation", "Cohort start date", "Cohort end date",
        "Comedications", "Comorbidities"
      ),
      "variable_level" = c(NA, "Asthma", "Depression", "HIV", "Opioids", "Antidiabetes")
    ),
    hide = c("cdm_name", "cohort_name")
  )

Variable name	Variable level	Estimate name	Sex
Variable name	Variable level	Estimate name	overall	Male	Female
Number records	–	N	2,564	1,262	1,302
Number subjects	–	N	2,564	1,262	1,302
Age	–	Median [Q25 - Q75]	46 [36 - 56]	46 [37 - 56]	45 [36 - 55]
		Mean (SD)	47.54 (14.81)	47.57 (14.40)	47.52 (15.21)
		Range	21 to 99	21 to 98	21 to 99
Days in cohort	–	Median [Q25 - Q75]	3,654 [3,654 - 3,654]	3,654 [3,654 - 3,654]	3,654 [3,654 - 3,654]
		Mean (SD)	3,594.61 (332.90)	3,587.20 (341.22)	3,601.79 (324.60)
		Range	17 to 3,654	449 to 3,654	17 to 3,654
Prior observation	–	Median [Q25 - Q75]	16,834 [13,477 - 20,588]	16,932 [13,636 - 20,634]	16,734 [13,320 - 20,410]
		Mean (SD)	17,545.28 (5,406.72)	17,555.07 (5,255.26)	17,535.78 (5,551.58)
		Range	7,729 to 36,260	7,729 to 36,115	7,929 to 36,260
Future observation	–	Median [Q25 - Q75]	4,010 [3,891 - 4,117]	4,005 [3,878 - 4,116]	4,018 [3,900 - 4,118]
		Mean (SD)	3,924.53 (418.47)	3,910.98 (430.96)	3,937.65 (405.74)
		Range	16 to 4,201	448 to 4,201	16 to 4,199
Cohort start date	–	Median [Q25 - Q75]	2008-01-01 [2008-01-01 - 2008-01-01]	2008-01-01 [2008-01-01 - 2008-01-01]	2008-01-01 [2008-01-01 - 2008-01-01]
		Range	2008-01-01 to 2008-01-01	2008-01-01 to 2008-01-01	2008-01-01 to 2008-01-01
Cohort end date	–	Median [Q25 - Q75]	2018-01-01 [2018-01-01 - 2018-01-01]	2018-01-01 [2018-01-01 - 2018-01-01]	2018-01-01 [2018-01-01 - 2018-01-01]
		Range	2008-01-17 to 2018-01-01	2009-03-24 to 2018-01-01	2008-01-17 to 2018-01-01
Comedications	Opioids	N (%)	49 (1.91%)	22 (1.74%)	27 (2.07%)
	Antidiabetes	N (%)	26 (1.01%)	15 (1.19%)	11 (0.84%)
Comorbidities	Asthma	N (%)	1,234 (48.13%)	611 (48.42%)	623 (47.85%)
	Depression	N (%)	2,531 (98.71%)	1,246 (98.73%)	1,285 (98.69%)
	HIV	N (%)	475 (18.53%)	230 (18.23%)	245 (18.82%)

Additionally, we can show the number of people in the overall cohort and stratified by sex. We use the CohortCharacteristics plotting function and then apply the DARWIN style and rotate the axis labels.

data$summarised_characteristics |>
  dplyr::filter(variable_name %in% c("Number records")) |>
  plotCharacteristics(colour = "sex") +
  themeVisOmop(style = style) +
  coord_flip()

Incidence results

For the incidence results, we want to plot incidence over time by sex group. We can do that with the plotting functions in IncidencePrevalence. We add the "darwin" style and rotate x-axis labels for readability.

data$incidence |>
  dplyr::filter(strata_name == "sex") |>
  plotIncidence(colour = "sex", facet = "sex", ribbon = TRUE) +
  themeVisOmop(style = style) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))

Additional results

Sometimes a study includes an analysis not yet supported by the OMOP Tidy R packages. In that case, we might or might not convert our result to a summarised_result, either way, there will be no package-specific table or plot function.

For these cases, we can use visOmopResults to create styled tables and plots that match the rest of the report.

As an example, we have mock results for a measurement change after an intervention. The tibble looks like this:

data$measurement_change
#> # A tibble: 9 × 8
#>   cohort_name sex     variable_name   median   min   max   q25   q75
#>   <chr>       <chr>   <chr>            <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 denominator overall value_before        60    24   117    55 77.2 
#> 2 denominator overall value_after         58    22   108    55 72   
#> 3 denominator overall change_in_value     -2    -2    -9     0 -5.25
#> 4 denominator male    value_before        65    30   120    61 80   
#> 5 denominator male    value_after         66    33   117    62 79   
#> 6 denominator male    change_in_value      1     3    -3     1 -1   
#> 7 denominator female  value_before        53    20   100    48 59   
#> 8 denominator female  value_after         50    19   100    47 60   
#> 9 denominator female  change_in_value     -3    -1     0    -1  1

First, we display results in a table. We pivot to the standard columns estimate_name, estimate_type, and estimate_value; this lets us format estimates and use the header argument in the visTable() function.

data$measurement_change |> 
  tidyr::pivot_longer(
    cols      = c("median", "min", "max", "q25", "q75"),
    names_to  = "estimate_name",
    values_to = "estimate_value"
  ) |>
  dplyr::mutate(
    estimate_type = "numeric",
    estimate_value = as.character(estimate_value),
    variable_name  = customiseText(variable_name),
    sex            = customiseText(sex)
  ) |>
  visTable(
    header = "sex",
    estimateName = c(
      "Median [Q25 - Q75]" = "<median> [<q25> - <q75>]",
      "Range"              = "<min> to <max>"
    ),
    hide   = c("cohort_name", "estimate_type"),
    rename = c("Estimate" = "estimate_name", "Variable" = "variable_name")
  )

Variable	Estimate	Sex
Variable	Estimate	Overall	Male	Female
Value before	Median [Q25 - Q75]	60.00 [55.00 - 77.25]	65.00 [61.00 - 80.00]	53.00 [48.00 - 59.00]
	Range	24.00 to 117.00	30.00 to 120.00	20.00 to 100.00
Value after	Median [Q25 - Q75]	58.00 [55.00 - 72.00]	66.00 [62.00 - 79.00]	50.00 [47.00 - 60.00]
	Range	22.00 to 108.00	33.00 to 117.00	19.00 to 100.00
Change in value	Median [Q25 - Q75]	-2.00 [0.00 - -5.25]	1.00 [1.00 - -1.00]	-3.00 [-1.00 - 1.00]
	Range	-2.00 to -9.00	3.00 to -3.00	-1.00 to 0.00

Additionally, we can visualise values before and after in a boxplot:

data$measurement_change |>  
  dplyr::filter(variable_name %in% c("value_before", "value_after")) |>
  dplyr::mutate(
    variable_name = customiseText(variable_name),
    sex           = customiseText(sex)
  ) |>
  boxPlot(x = "variable_name", facet = "sex", colour = "variable_name") +
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank()) +
  xlab("")

Note that we did not specify type or style in the visOmopResults functions above because we set these globally at the beginning of the document.

Captions and footers

In Markdown

In the reference Word document, styles are defined for figure legends (CaptionDarwin) and footers (FooterDarwin). These can be used in Quarto as follows:

:::{custom-style="CaptionDarwin"}
**Table 1:** Baseline population characteristics.
:::

In an R code chunk

We might want to create captions and footers inside an R code chunk (e.g., when generating multiple tables/figures in a loop, or when programmatically numbering them). In those cases, we can use cat() within a chunk:

num_table <- 1
cat(paste0(
  ':::{custom-style="CaptionDarwin"}\n',
  '**Table ', num_table, ':** Baseline population characteristics.\n',
  ':::\n'
))
num_table <- num_table + 1

It’s important that we use the chunk option results = 'asis' so that the content produced by cat() is correctly interpreted by Quarto/Pandoc as Markdown.

Report example

You can access an example Quarto script and the reference DOCX document here.