dfrA trimethoprim resistance genes found in Gram-negative bacteria: compilation and unambiguous numbering.

Stephanie J Ambrose, Ruth M Hall

Published in: The Journal of Antimicrobial Chemotherapy (2021)

To track the spread of antibiotic resistance genes, accurate identification of individual genes is essential. Acquired trimethoprim resistance genes, which encode trimethoprim-insensitive homologues of the sensitive dihydrofolate reductases encoded by the folA genes of bacteria, are increasingly found in genome sequences. However, naming and numbering in publicly available records (journal publications or entries in the GenBank non-redundant DNA database) have not always been unambiguous, and the nomenclature has evolved over time. Here, the changes in nomenclature and the most commonly encountered problems and pitfalls affecting dfrA gene identification, which arise from historically incorrect or inaccurate numbering, are explained. The complete set of dfrA genes/DfrA proteins found in Gram-negative bacteria for which readily searchable sequence information is currently available has been compiled, using less than 98% identity for both the gene and the derived protein sequence as the criterion for assignment of a new number. In most cases, trimethoprim resistance has been demonstrated. The gene context, predominantly in a gene cassette or near the ori end of CR1 or CR2, is also covered. The RefSeq database that underpins the programs used to automatically identify resistance genes in genome data sets has been curated to assign all sequences listed to the correct number. This led to the assignment of corrected or new gene numbers to several mis-assigned sequences. The unique numbers assigned for the dfrA/DfrA set are now listed in the RefSeq database, which we propose provides a way forward that should end future duplication of numbers and the confusion it causes.
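
As an illustration only, the 98% identity threshold mentioned in the abstract could be sketched as below. This is a minimal sketch under stated assumptions, not the authors' procedure: the function names `percent_identity` and `needs_new_number` are hypothetical, the sequences are assumed to have been aligned already (equal length, `-` for gaps), identity is counted over aligned columns, and the rule is read as requiring both the gene and the derived protein to fall below 98% identity to the closest numbered sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: inputs are already aligned (equal length, '-' marks a gap).
    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def needs_new_number(dna_identity: float, protein_identity: float,
                     threshold: float = 98.0) -> bool:
    """One reading of the abstract's rule (an assumption): a new number is
    assigned only when BOTH the gene and the derived protein are below the
    threshold relative to the closest already-numbered sequence."""
    return dna_identity < threshold and protein_identity < threshold


if __name__ == "__main__":
    # Toy sequences, not real dfrA data.
    dna_id = percent_identity("ATGCATGC", "ATGCATGA")
    print(f"toy DNA identity: {dna_id:.1f}%")
    print("new number needed:", needs_new_number(dna_id, 90.0))
```

In practice the identities would come from a proper pairwise alignment of the candidate against every numbered dfrA/DfrA reference, and the highest-scoring reference would be the one compared against the threshold.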
