Abstract
Background: The metabolic capacity for nitrogen fixation is known to be present in several prokaryotic species
scattered across taxonomic groups. Experimental detection of nitrogen fixation in microbes requires species-specific
conditions, making it difficult to obtain a comprehensive census of this trait. The recent and rapid increase in the
availability of microbial genome sequences affords novel opportunities to re-examine the occurrence and
distribution of nitrogen fixation genes. The current practice for computational prediction of nitrogen fixation is to
use the presence of the nifH and/or nifD genes.
Results: Based on a careful comparison of the repertoire of nitrogen fixation genes in known diazotroph species, we
propose a new criterion for computational prediction of nitrogen fixation: the presence of a minimum set of six
genes coding for structural and biosynthetic components, namely NifHDK and NifENB. Using this criterion, we
conducted a comprehensive search in fully sequenced genomes and identified 149 diazotrophic species, including
82 known diazotrophs and 67 species not known to fix nitrogen. The taxonomic distribution of nitrogen fixation in
Archaea was limited to the Euryarchaeota phylum; within the Bacteria domain we predict that nitrogen fixation
occurs in 13 different phyla. Of these, seven phyla had not hitherto been known to contain species capable of
nitrogen fixation. Our analyses also identified protein sequences that are similar to nitrogenase in organisms that do
not meet the minimum-gene-set criterion. The existence of nitrogenase-like proteins lacking conserved co-factor
ligands in both diazotrophs and non-diazotrophs suggests their potential for performing other, as yet unidentified,
metabolic functions.
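The minimum-gene-set criterion described above can be sketched as a simple set test. This is only an illustration, not the search procedure used in this work: it assumes each genome has already been annotated and reduced to a list of gene names, and the function names and the input format are hypothetical.

```python
# Minimum gene set named in the abstract: NifHDK (structural components)
# and NifENB (biosynthetic components).
MINIMUM_NIF_SET = frozenset({"nifH", "nifD", "nifK", "nifE", "nifN", "nifB"})


def missing_nif_genes(genome_genes):
    """Return the required nif genes absent from a genome, sorted by name.

    `genome_genes` is assumed to be an iterable of gene names taken from an
    existing annotation (a hypothetical input format). Matching ignores case.
    """
    found = {gene.lower() for gene in genome_genes}
    return sorted(g for g in MINIMUM_NIF_SET if g.lower() not in found)


def is_predicted_diazotroph(genome_genes):
    """Apply the criterion: all six genes must be present."""
    return not missing_nif_genes(genome_genes)


# A genome carrying only nifH and nifD passes the older nifH/nifD test
# but fails the six-gene criterion.
print(is_predicted_diazotroph(["nifH", "nifD"]))
print(missing_nif_genes(["nifH", "nifD"]))
```

Under this sketch, a genome that encodes nitrogenase-like sequences but lacks any of the six genes is not predicted to be a diazotroph, which is how such organisms are separated from the predicted nitrogen fixers.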
Conclusions: Our predictions expand the known phylogenetic diversity of nitrogen fixation, and suggest that this
trait may be much more common in nature than is currently thought. The diverse phylogenetic distribution of
nitrogenase-like proteins indicates potential new roles for anciently duplicated and divergent members of this
group of enzymes.