July 2, 2019

[White Paper] A Comparison of NamesforLife 16S rDNA data vs. Silva v.132 and Greengenes 13.5.99

NamesforLife recently published a white paper comparing the precision of 16S classification using the NamesforLife HQ16S data product with that obtained using the non-redundant Silva and Greengenes 16S databases.

This study presents two lines of concrete evidence for the dangers of using incorrectly annotated data:

  • Both Silva and Greengenes were found to contain 16S sequences for type strains that had not been updated with the correct nomenclature. These sequences yield incorrect identifications, even when sequence similarity is > 99%.
  • Even after reannotation of both data sets, the identification error rates remain high because of incomplete taxonomic coverage.

These problems can only be overcome by continuously revising the underlying reference data and re-evaluating old or existing data in light of new information, including new names. Subscribers to NamesforLife data and nomenclature services are protected from these problems, as these accumulated changes are integrated into all of our services and data products.

WP-N4L-20190702 (301kB PDF)

