SpadaHC: a database to improve the classification of variants in hereditary cancer genes in the Spanish population.
José Marcos Moreno-Cabrera, Lidia Feliubadaló, Marta Pineda, Patricia Prada-Dacasa, Mireia Ramos-Muntada, Jesús Del Valle, Joan Brunet, Bernat Gel, María Currás-Freixes, Bruna Calsina, Milton E Salazar-Hidalgo, Marta Rodríguez-Balada, Bàrbara Roig, Sara Fernández-Castillejo, Mercedes Durán Domínguez, Mónica Arranz Ledo, Mar Infante Sanz, Adela Castillejo, Estela Dámaso, José L Soto, Montserrat de Miguel, Beatriz Hidalgo Calero, José M Sánchez-Zapardiel, Teresa Ramon Y Cajal, Adriana Lasa, Alexandra Gisbert-Beamud, Anael López-Novo, Clara Ruiz-Ponte, Miriam Potrony, María I Álvarez-Mora, Ana Osorio, Isabel Lorda-Sánchez, Mercedes Robledo, Alberto Cascón, Anna Ruiz, Nino Spataro, Imma Hernan, Emma Borràs, Alejandro Moles-Fernández, Julie Earl, Juan Cadiñanos, Ana B Sánchez-Heras, Anna Bigas, Gabriel Capellá, Conxi Lázaro
Published in: Database: the journal of biological databases and curation (2024)
Accurate classification of genetic variants is crucial for clinical decision-making in hereditary cancer. In Spain, genetic diagnostic laboratories have traditionally approached this task independently due to the lack of a dedicated resource. Here we present SpadaHC, a web-based database for sharing variants in hereditary cancer genes in the Spanish population. SpadaHC is implemented using a three-tier architecture consisting of a relational database, a web tool and a bioinformatics pipeline. Contributing laboratories can share variant classifications as well as variants from individuals, the latter in Variant Call Format (VCF). The platform supports open and restricted access, flexible dataset submissions, automatic pseudo-anonymization, VCF quality control, variant normalization and liftover between genome builds. Users can flexibly explore and search data, receive automatic discrepancy notifications and access SpadaHC population frequencies based on many criteria. In February 2024, SpadaHC included 18 laboratory members, storing 1.17 million variants from 4306 patients and 16 343 laboratory classifications. In the first analysis of the shared data, we identified 84 genetic variants with clinically relevant discrepancies in their classifications and addressed them through a three-phase resolution strategy. This work highlights the importance of data sharing to promote consistency in variant classifications among laboratories, so that patients and family members can benefit from more accurate clinical management.
Database URL: https://spadahc.ciberisciii.es/.
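The abstract does not define how a clinically relevant discrepancy is detected, so the following Python sketch is only an illustration under stated assumptions. It assumes the common five-tier classification scheme and treats a variant as discrepant when at least one laboratory reports it as pathogenic or likely pathogenic while another reports any other class. The function name, the variant identifiers and the laboratory names are hypothetical and are not taken from SpadaHC.

```python
from collections import defaultdict

# Assumption: these two classes are the ones that change clinical management.
ACTIONABLE = {"pathogenic", "likely pathogenic"}


def clinically_relevant_discrepancies(submissions):
    """Return variants classified on both sides of the actionable boundary.

    submissions: iterable of (variant_id, laboratory, classification) tuples.
    The result maps each discrepant variant_id to {laboratory: classification}.
    """
    by_variant = defaultdict(dict)
    for variant_id, laboratory, classification in submissions:
        # Normalize case and whitespace so "Pathogenic " matches "pathogenic".
        by_variant[variant_id][laboratory] = classification.strip().lower()

    discrepant = {}
    for variant_id, labs in by_variant.items():
        # The set holds True and/or False: actionable vs non-actionable calls.
        sides = {label in ACTIONABLE for label in labs.values()}
        if len(sides) == 2:
            discrepant[variant_id] = labs
    return discrepant


# Hypothetical submissions from three hypothetical laboratories.
example = [
    ("variant_1", "lab_A", "Pathogenic"),
    ("variant_1", "lab_B", "Uncertain significance"),
    ("variant_2", "lab_A", "Likely pathogenic"),
    ("variant_2", "lab_C", "Pathogenic"),
    ("variant_3", "lab_B", "Benign"),
]
result = clinically_relevant_discrepancies(example)
```

In this example only `variant_1` is flagged: `variant_2` differs between laboratories but both calls stay on the actionable side, and `variant_3` has a single submission. A real platform would also need rules for differences that do not cross this boundary, which this sketch deliberately ignores.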
Keyphrases
- copy number
- papillary thyroid
- end stage renal disease
- deep learning
- machine learning
- ejection fraction
- newly diagnosed
- chronic kidney disease
- prognostic factors
- quality control
- high resolution
- health information
- high throughput
- gene expression
- transcription factor
- childhood cancer
- single cell
- data analysis
- bioinformatics analysis