A gene family is a set of several similar genes, formed by duplication of a single original gene, that generally have similar biochemical functions. One such family comprises the genes for the human haemoglobin subunits. The 10 genes are in two clusters on different chromosomes, called the α-globin and β-globin loci.

Genes are categorized into families based on shared nucleotide or protein sequences. Phylogenetic techniques can be used as a more rigorous test. The positions of exons within the coding sequence can also be used to infer common ancestry. When the sequence of the protein encoded by a gene is known, researchers can apply methods that find similarities among protein sequences, which provide more information than similarities or differences among DNA sequences. Knowledge of the protein's secondary structure gives further information about ancestry, since the organization of secondary structural elements would presumably be conserved even if the amino acid sequence changes considerably.
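The idea of grouping genes by shared sequence can be illustrated with a minimal sketch. It assumes the protein sequences are already aligned to equal length, uses plain percent identity as the similarity measure, and groups any two sequences whose identity reaches a cutoff (single-linkage grouping). The sequences, gene names, and the 0.5 cutoff are all hypothetical, and real analyses would use proper alignment and phylogenetic methods rather than this shortcut.

```python
from __future__ import annotations

from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Fraction of aligned columns holding the same residue.

    Assumes `a` and `b` are pre-aligned (equal length, '-' for gaps).
    Columns that are gaps in both sequences are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return matches / len(columns)


def group_into_families(seqs: dict[str, str], threshold: float = 0.5) -> list[set[str]]:
    """Single-linkage grouping: link two genes if identity >= threshold."""
    parent = {name: name for name in seqs}

    def find(name: str) -> str:
        # Follow parent links to the representative of the group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    families: dict[str, set[str]] = {}
    for name in seqs:
        families.setdefault(find(name), set()).add(name)
    return list(families.values())


# Hypothetical toy sequences, invented for illustration only.
toy_genes = {
    "geneA1": "MVLSPADKTN",
    "geneA2": "MVLSAADKTN",
    "geneB1": "MGHFTEEDKA",
    "geneB2": "MGHFTEEEKA",
    "geneC": "QWERTYIPLK",
}

for family in group_into_families(toy_genes):
    print(sorted(family))
```

Single-linkage grouping is chosen here only because it is short to write; it can chain distantly related sequences together, which is one reason phylogenetic techniques serve as the more rigorous test.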
Evolution of a Gene Family
Unequal crossing over generates gene families. The left side illustrates an unequal crossing over event and the two products that are generated. One product carries a deletion and the other a duplication of the same region. In this example, the duplicated region contains a second complete copy of a single gene (B). The right side illustrates a second round of unequal crossing over that can occur in a genome that is homozygous for the original duplicated chromosome. In this case, the crossover event has occurred between the two copies of the original gene. Only the duplicated product generated by this event is shown. Over time, the three copies of the B gene can diverge into three distinct functional units (B1, B2, and B3) of a gene family cluster.
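The two rounds of unequal crossing over described in the caption can be mimicked with a small sketch. It is a deliberate simplification: each chromosome is a list of gene labels, and a crossover is modelled as joining the left part of one misaligned homolog to the right part of the other, with breakpoints falling between genes. The function name and the breakpoint indices are hypothetical choices made for this illustration.

```python
def unequal_crossover(chrom1, chrom2, break1, break2):
    """Exchange arms between two misaligned homologs.

    `break1` and `break2` are the numbers of genes to the left of the
    breakpoint on each chromosome. When they differ, one product gains
    genes and the reciprocal product loses them.
    """
    product_a = chrom1[:break1] + chrom2[break2:]
    product_b = chrom2[:break2] + chrom1[break1:]
    return product_a, product_b


original = ["A", "B", "C"]

# First round: break after B on one homolog and before B on the other.
duplicated, deleted = unequal_crossover(original, original, 2, 1)
print(duplicated)  # ['A', 'B', 'B', 'C']
print(deleted)     # ['A', 'C']

# Second round, between two copies of the duplicated chromosome.
triplicated, reciprocal = unequal_crossover(duplicated, duplicated, 3, 2)
print(triplicated)  # ['A', 'B', 'B', 'B', 'C']
print(reciprocal)   # ['A', 'B', 'C']
```

The sketch tracks only copy number. The later divergence of the three B copies into B1, B2, and B3 would come from independent mutations accumulating in each copy, which is not modelled here.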
Methods that compare protein sequences or structures often rely on predictions made from the DNA sequence. If the genes of a gene family encode proteins, the term protein family is often used in an analogous manner to gene family. The expansion or contraction of gene families along a specific lineage can be due to chance or can be the result of natural selection. Distinguishing between these two cases is often difficult in practice. Recent work uses a combination of statistical models and algorithmic techniques to detect gene families that are under the effect of natural selection.
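One way to picture the "chance" alternative is a neutral birth-death simulation, sketched below. This is not the method used in the work mentioned above. It is a minimal illustration that assumes every gene copy is duplicated or lost with the same small probability per time step, and then asks how often chance alone yields a family at least as large as the one observed. All parameter values and function names are hypothetical.

```python
import random


def simulate_family_size(start, steps, rate, rng):
    """Simulate neutral gain and loss of gene copies.

    In each step every copy is duplicated with probability `rate`,
    lost with probability `rate`, and otherwise left unchanged.
    """
    size = start
    for _step in range(steps):
        change = 0
        for _copy in range(size):
            u = rng.random()
            if u < rate:
                change += 1      # duplication
            elif u < 2 * rate:
                change -= 1      # loss
        size += change
        if size == 0:            # the family has gone extinct
            break
    return size


def empirical_p_value(observed_size, start, steps, rate, replicates, seed=0):
    """Share of neutral simulations ending at least as large as observed."""
    rng = random.Random(seed)
    sims = [simulate_family_size(start, steps, rate, rng) for _ in range(replicates)]
    as_extreme = sum(1 for s in sims if s >= observed_size)
    # Add-one correction keeps the estimate strictly above zero.
    return (as_extreme + 1) / (replicates + 1)


# Hypothetical numbers: a family that grew from 10 to 30 copies.
p = empirical_p_value(observed_size=30, start=10, steps=50, rate=0.02, replicates=500)
print(f"empirical p-value under the neutral sketch: {p:.3f}")
```

A small p-value would suggest that the expansion is hard to explain by this particular neutral model, but it would not by itself demonstrate natural selection, because the model's assumptions (equal and constant rates, independent copies) may not hold.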
In contrast to gene families, gene complexes are simply tightly linked groups of genes, often created via gene duplication (sometimes called segmental duplication if the duplicates remain side by side). In a gene complex, each gene has a similar though slightly diverged function.