
Use Achilles counts to filter a codelist, keeping only the codes used in the database

Usage

subsetToCodesInUse(
  x,
  cdm,
  minimumCount = 0L,
  table = c("condition_occurrence", "device_exposure", "drug_exposure", "measurement",
    "observation", "procedure_occurrence", "visit_occurrence")
)

Arguments

x

A codelist

cdm

A cdm_reference object created via CDMConnector.

minimumCount

Codes with a frequency below this value will be removed.

table

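Name(s) of the CDM table(s) to consider when checking whether codes are in use. The default covers all of the tables listed under Usage.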

Value

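The codelist, filtered using Achilles counts so that it contains only the codes used in the database. As the example shows, a codelist with no codes in use (cs2) does not appear in the result.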
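
The filtering can be pictured with a small base R sketch. This is an illustration under stated assumptions, not the package's implementation: `achillesCounts`, its column names, and the helper `keepCodesInUse()` are all hypothetical stand-ins, whereas the real function reads its counts from the Achilles tables in the cdm reference.

```r
# Hypothetical stand-in for Achilles record counts: one row per concept.
# (Assumption: the real function takes these counts from the cdm reference.)
achillesCounts <- data.frame(
  concept_id = c(3, 4, 5),
  n_records  = c(2, 40, 120)
)

# Illustrative helper, not part of CodelistGenerator.
keepCodesInUse <- function(x, counts, minimumCount = 0L) {
  # Concepts that appear in the counts and meet the threshold
  inUse <- counts$concept_id[counts$n_records >= minimumCount]
  # Keep only those concepts in each codelist
  filtered <- lapply(x, function(codes) codes[codes %in% inUse])
  # Drop codelists that end up with no codes
  Filter(function(codes) length(codes) > 0, filtered)
}

codelist <- list(cs1 = c(3, 4, 5), cs2 = 999)

# Default threshold: any concept with a recorded count is kept
keepCodesInUse(codelist, achillesCounts)

# Stricter threshold: concept 3 (2 records) is removed as well
keepCodesInUse(codelist, achillesCounts, minimumCount = 10L)
```

In this sketch a code absent from the counts is treated as not in use regardless of `minimumCount`, which is why `cs2` is dropped in both calls.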

Examples

# \donttest{
cdm <- mockVocabRef("database")
#> Warning: ! 5 column in person do not match expected column type:
#>  `person_id` is numeric but expected integer
#>  `gender_concept_id` is numeric but expected integer
#>  `year_of_birth` is numeric but expected integer
#>  `race_concept_id` is numeric but expected integer
#>  `ethnicity_concept_id` is numeric but expected integer
#> Warning: ! 3 column in observation_period do not match expected column type:
#>  `observation_period_id` is numeric but expected integer
#>  `person_id` is numeric but expected integer
#>  `period_type_concept_id` is numeric but expected integer
#> Warning: ! 8 column in cdm_source do not match expected column type:
#>  `cdm_source_abbreviation` is logical but expected character
#>  `cdm_holder` is logical but expected character
#>  `source_description` is logical but expected character
#>  `source_documentation_reference` is logical but expected character
#>  `cdm_etl_reference` is logical but expected character
#>  `source_release_date` is logical but expected date
#>  `cdm_release_date` is logical but expected date
#>  `vocabulary_version` is logical but expected character
#> Warning: ! 3 column in concept do not match expected column type:
#>  `valid_start_date` is logical but expected date
#>  `valid_end_date` is logical but expected date
#>  `invalid_reason` is logical but expected character
#> Warning: ! 1 column in vocabulary do not match expected column type:
#>  `vocabulary_concept_id` is numeric but expected integer
#> Warning: ! 3 column in concept_relationship do not match expected column type:
#>  `valid_start_date` is logical but expected date
#>  `valid_end_date` is logical but expected date
#>  `invalid_reason` is logical but expected character
#> Warning: ! 1 column in concept_synonym do not match expected column type:
#>  `language_concept_id` is logical but expected integer
#> Warning: ! 2 column in concept_ancestor do not match expected column type:
#>  `min_levels_of_separation` is numeric but expected integer
#>  `max_levels_of_separation` is numeric but expected integer
#> Warning: ! 7 column in drug_strength do not match expected column type:
#>  `amount_value` is logical but expected numeric
#>  `amount_unit_concept_id` is numeric but expected integer
#>  `numerator_unit_concept_id` is numeric but expected integer
#>  `denominator_unit_concept_id` is numeric but expected integer
#>  `box_size` is logical but expected integer
#>  `valid_start_date` is logical but expected date
#>  `valid_end_date` is logical but expected date
#> Warning: ! 8 column in achilles_analysis do not match expected column type:
#>  `analysis_id` is numeric but expected integer
#>  `analysis_name` is numeric but expected character
#>  `stratum_1_name` is logical but expected character
#>  `stratum_2_name` is logical but expected character
#>  `stratum_3_name` is logical but expected character
#>  `stratum_4_name` is logical but expected character
#>  `stratum_5_name` is logical but expected character
#>  `category` is logical but expected character
#> Warning: ! 7 column in achilles_results do not match expected column type:
#>  `analysis_id` is numeric but expected integer
#>  `stratum_1` is numeric but expected character
#>  `stratum_2` is logical but expected character
#>  `stratum_3` is logical but expected character
#>  `stratum_4` is logical but expected character
#>  `stratum_5` is logical but expected character
#>  `count_value` is numeric but expected integer
#> Warning: ! 16 column in achilles_results_dist do not match expected column type:
#>  `analysis_id` is numeric but expected integer
#>  `stratum_1` is logical but expected character
#>  `stratum_2` is logical but expected character
#>  `stratum_3` is logical but expected character
#>  `stratum_4` is logical but expected character
#>  `stratum_5` is logical but expected character
#>  `min_value` is logical but expected integer
#>  `max_value` is logical but expected integer
#>  `avg_value` is logical but expected numeric
#>  `stdev_value` is logical but expected numeric
#>  `median_value` is logical but expected numeric
#>  `p10_value` is logical but expected numeric
#>  `p25_value` is logical but expected numeric
#>  `p75_value` is logical but expected numeric
#>  `p90_value` is logical but expected numeric
#>  `count_value` is numeric but expected integer
#> Warning: There are observation period end dates after the current date: 2024-11-11
#>  The latest max observation period end date found is 2025-12-31
#> Warning: ! 5 column in person do not match expected column type:
#>  `person_id` is numeric but expected integer
#>  `gender_concept_id` is numeric but expected integer
#>  `year_of_birth` is numeric but expected integer
#>  `race_concept_id` is numeric but expected integer
#>  `ethnicity_concept_id` is numeric but expected integer
#> Warning: ! 3 column in observation_period do not match expected column type:
#>  `observation_period_id` is numeric but expected integer
#>  `person_id` is numeric but expected integer
#>  `period_type_concept_id` is numeric but expected integer
#> Warning: There are observation period end dates after the current date: 2024-11-11
#>  The latest max observation period end date found is 2025-12-31
codes <- getCandidateCodes(cdm = cdm,
                           keywords = "arthritis",
                           domains = "Condition",
                           includeDescendants = FALSE)
#> Limiting to domains of interest
#> Getting concepts to include
#> Search completed. Finishing up.
#>  3 candidate concepts identified
#> Time taken: 0 minutes and 0 seconds
x <- subsetToCodesInUse(list("cs1" = codes$concept_id,
                             "cs2" = 999),
                        cdm = cdm)
#> No codes from codelist cs2 found in the database

x
#> $cs1
#> [1] 4 5
#> 
CDMConnector::cdmDisconnect(cdm)
# }