Generate candidate codelist for the OMOP CDM
This function generates a set of codes that can be considered for creating a phenotype using the OMOP CDM.
Usage
getCandidateCodes(
  cdm,
  keywords,
  exclude = NULL,
  domains = "Condition",
  standardConcept = "Standard",
  searchInSynonyms = FALSE,
  searchNonStandard = FALSE,
  includeDescendants = TRUE,
  includeAncestor = FALSE
)
Arguments
- cdm
A cdm_reference object created via CDMConnector.
- keywords
Character vector of words to search for. Where more than one word is given (e.g. "knee osteoarthritis"), all combinations of those words in different positions (e.g. "osteoarthritis of knee") should be identified.
- exclude
Character vector of words to identify concepts to exclude.
- domains
Character vector with one or more of the OMOP CDM domains.
- standardConcept
Character vector with one or more of "Standard", "Classification", and "Non-standard". These correspond to the flags used for the standard_concept field in the concept table of the cdm.
- searchInSynonyms
Either TRUE or FALSE. If TRUE, the search uses both the primary name in the concept table and synonyms from the concept synonym table.
- searchNonStandard
Either TRUE or FALSE. If TRUE the code will also search via non-standard concepts.
- includeDescendants
Either TRUE or FALSE. If TRUE descendant concepts of identified concepts will be included in the candidate codelist.
- includeAncestor
Either TRUE or FALSE. If TRUE the direct ancestor concepts of identified concepts will be included in the candidate codelist.
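The behaviour described for keywords and exclude can be illustrated with a small base R sketch. It is not the package's implementation: matchesAllWords() and the conceptNames vector are hypothetical, introduced only to show the idea that a multi-word keyword matches a concept name when every word appears in it, in any position, and that exclusion terms then remove matches.

```r
# Hypothetical helper, not part of CodelistGenerator: TRUE when every word
# of `keyword` occurs somewhere in `name`, ignoring case and word order.
matchesAllWords <- function(name, keyword) {
  words <- strsplit(tolower(keyword), "\\s+")[[1]]
  all(vapply(
    words,
    function(w) grepl(w, tolower(name), fixed = TRUE),
    logical(1)
  ))
}

# Illustrative concept names (made up for this sketch)
conceptNames <- c(
  "Osteoarthritis of knee",
  "Knee osteoarthritis",
  "Osteoarthritis of hip",
  "Post-traumatic osteoarthritis of knee"
)

# Keep names containing both "knee" and "osteoarthritis", in any position
included <- conceptNames[
  vapply(conceptNames, matchesAllWords, logical(1),
         keyword = "knee osteoarthritis")
]

# Drop names that match an exclusion term, mirroring the role of `exclude`
candidates <- included[
  !vapply(included, matchesAllWords, logical(1), keyword = "post-traumatic")
]

print(candidates)
```

The real function works against the vocabulary tables of the cdm rather than an in-memory vector, so this only approximates the matching logic on plain strings.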
Examples
# \donttest{
cdm <- CodelistGenerator::mockVocabRef()
#> Warning: ! 5 column in person do not match expected column type:
#> • `person_id` is numeric but expected integer
#> • `gender_concept_id` is numeric but expected integer
#> • `year_of_birth` is numeric but expected integer
#> • `race_concept_id` is numeric but expected integer
#> • `ethnicity_concept_id` is numeric but expected integer
#> Warning: ! 3 column in observation_period do not match expected column type:
#> • `observation_period_id` is numeric but expected integer
#> • `person_id` is numeric but expected integer
#> • `period_type_concept_id` is numeric but expected integer
#> Warning: ! 8 column in cdm_source do not match expected column type:
#> • `cdm_source_abbreviation` is logical but expected character
#> • `cdm_holder` is logical but expected character
#> • `source_description` is logical but expected character
#> • `source_documentation_reference` is logical but expected character
#> • `cdm_etl_reference` is logical but expected character
#> • `source_release_date` is logical but expected date
#> • `cdm_release_date` is logical but expected date
#> • `vocabulary_version` is logical but expected character
#> Warning: ! 3 column in concept do not match expected column type:
#> • `valid_start_date` is logical but expected date
#> • `valid_end_date` is logical but expected date
#> • `invalid_reason` is logical but expected character
#> Warning: ! 1 column in vocabulary do not match expected column type:
#> • `vocabulary_concept_id` is numeric but expected integer
#> Warning: ! 3 column in concept_relationship do not match expected column type:
#> • `valid_start_date` is logical but expected date
#> • `valid_end_date` is logical but expected date
#> • `invalid_reason` is logical but expected character
#> Warning: ! 1 column in concept_synonym do not match expected column type:
#> • `language_concept_id` is logical but expected integer
#> Warning: ! 2 column in concept_ancestor do not match expected column type:
#> • `min_levels_of_separation` is numeric but expected integer
#> • `max_levels_of_separation` is numeric but expected integer
#> Warning: ! 7 column in drug_strength do not match expected column type:
#> • `amount_value` is logical but expected numeric
#> • `amount_unit_concept_id` is numeric but expected integer
#> • `numerator_unit_concept_id` is numeric but expected integer
#> • `denominator_unit_concept_id` is numeric but expected integer
#> • `box_size` is logical but expected integer
#> • `valid_start_date` is logical but expected date
#> • `valid_end_date` is logical but expected date
#> Warning: ! 8 column in achilles_analysis do not match expected column type:
#> • `analysis_id` is numeric but expected integer
#> • `analysis_name` is numeric but expected character
#> • `stratum_1_name` is logical but expected character
#> • `stratum_2_name` is logical but expected character
#> • `stratum_3_name` is logical but expected character
#> • `stratum_4_name` is logical but expected character
#> • `stratum_5_name` is logical but expected character
#> • `category` is logical but expected character
#> Warning: ! 7 column in achilles_results do not match expected column type:
#> • `analysis_id` is numeric but expected integer
#> • `stratum_1` is numeric but expected character
#> • `stratum_2` is logical but expected character
#> • `stratum_3` is logical but expected character
#> • `stratum_4` is logical but expected character
#> • `stratum_5` is logical but expected character
#> • `count_value` is numeric but expected integer
#> Warning: ! 16 column in achilles_results_dist do not match expected column type:
#> • `analysis_id` is numeric but expected integer
#> • `stratum_1` is logical but expected character
#> • `stratum_2` is logical but expected character
#> • `stratum_3` is logical but expected character
#> • `stratum_4` is logical but expected character
#> • `stratum_5` is logical but expected character
#> • `min_value` is logical but expected integer
#> • `max_value` is logical but expected integer
#> • `avg_value` is logical but expected numeric
#> • `stdev_value` is logical but expected numeric
#> • `median_value` is logical but expected numeric
#> • `p10_value` is logical but expected numeric
#> • `p25_value` is logical but expected numeric
#> • `p75_value` is logical but expected numeric
#> • `p90_value` is logical but expected numeric
#> • `count_value` is numeric but expected integer
#> Warning: There are observation period end dates after the current date: 2024-11-11
#> ℹ The latest max observation period end date found is 2025-12-31
CodelistGenerator::getCandidateCodes(
  cdm = cdm,
  keywords = "osteoarthritis"
)
#> Limiting to domains of interest
#> Getting concepts to include
#> Adding descendants
#> Search completed. Finishing up.
#> ✔ 2 candidate concepts identified
#> Time taken: 0 minutes and 0 seconds
#> # A tibble: 2 × 6
#> concept_id found_from concept_name domain_id vocabulary_id standard_concept
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 4 From initial… Osteoarthri… Condition SNOMED S
#> 2 5 From initial… Osteoarthri… Condition SNOMED S
# }
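A further call might combine several of the arguments documented above. The sketch below uses only argument names from the Usage section; the keyword and exclusion values are illustrative assumptions, and what the mock vocabulary returns for them is not shown because it depends on the mock data. The call is guarded so the block is skipped when the package is not installed.

```r
# Sketch only: keyword and exclusion values are illustrative assumptions.
if (requireNamespace("CodelistGenerator", quietly = TRUE)) {
  cdm <- CodelistGenerator::mockVocabRef()

  codes <- CodelistGenerator::getCandidateCodes(
    cdm = cdm,
    keywords = "knee osteoarthritis",  # words may appear in any position
    exclude = "hip",                   # drop concepts matching this term
    domains = "Condition",
    standardConcept = "Standard",
    searchInSynonyms = TRUE,           # also search the concept synonym table
    includeDescendants = TRUE,
    includeAncestor = FALSE
  )

  print(codes)
}
```

Setting searchInSynonyms = TRUE widens the search, while exclude narrows it, so reviewing the found_from column of the result can help explain why each concept was picked up.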