Prokaryotes contain clustered regularly interspaced short palindromic repeats (CRISPR) that confer resistance against invasive genetic elements. Although these repetitive sequences were first discovered during early sequencing efforts in E. coli in 1987, it was Jansen et al. who coined the acronym CRISPR in 2002, after similar structures had been observed in other bacterial and archaeal genomes (Jansen et al., 2002). However, the function of these sequences remained unknown until 2005, when three independent reports showed that some parts of the CRISPR array are identical to fragments of viruses and plasmids (Bolotin et al., 2005; Mojica et al., 2005; Pourcel et al., 2005). These findings indicated that CRISPR might provide prokaryotes with a form of adaptive immunity against foreign genetic elements. In 2007, Barrangou et al. confirmed this hypothesis by showing that bacteria challenged with phages integrated new sequences into their genome that were identical to phage sequences. Furthermore, removal or addition of CRISPR sequences modified the phage resistance phenotype (Barrangou et al., 2007). Further studies on the CRISPR system showed that it acts in three stages: (1) adaptation, whereby short fragments of foreign DNA (virus or plasmid; so-called ‘spacers’) are incorporated at the leader sequence (L) of the bacterial CRISPR array (Figure 1), which is located in close proximity to the cas (CRISPR-associated) genes (Horvath and Barrangou, 2010); (2) expression, whereby transcription of the CRISPR array (and thereby of the spacer sequences) generates a long pre-crRNA (precursor CRISPR RNA) transcript, which is further processed into mature crRNAs by the Cas proteins; and (3) interference, whereby these crRNAs guide the Cas proteins to the complementary foreign DNA, resulting in its cleavage and subsequent degradation.
CRISPR/Cas systems can be classified into three types (I, II, and III), which have been further divided into 10 subtypes (IA-IF, IIA-B, IIIA-B) based on major functional and structural differences (Chylinski et al., 2014). Common to all is the presence of two cas genes (cas1 and cas2), which are required for spacer acquisition. However, crRNA maturation and cleavage activity depend on additional requirements and further Cas proteins, which vary between the types (Makarova et al., 2011). For example, both type I and type II systems require a well-defined short protospacer adjacent motif (PAM), which the Cas proteins use for target recognition and cleavage (Sternberg et al., 2014). Further, for the pre-crRNA
Although the Type IIA system is absent from archaea and present in only 5% of bacterial genomes, it has been adopted for genetic engineering applications due to its minimal requirements (Horvath and Barrangou, 2013). Based on the Streptococcus pyogenes Type IIA system, Jinek et al. showed in a seminal paper that a single synthetic guide RNA (gRNA) consisting of a fusion of crRNA and tracrRNA is sufficient to achieve Cas9 endonuclease-mediated cleavage of a specific target DNA sequence (Jinek et al., 2012). Hence, only two components (Cas9 and gRNA) are required to introduce a site-specific double-strand break (DSB) in the genome. Constrained only by the PAM (NGG), essentially any genomic sequence of the form N20NGG can be targeted with the CRISPR/Cas9 system. Cas9 cleavage activity is determined by two catalytic domains, the RuvC-like and the HNH domain, which independently cleave the non-complementary and the complementary strand, respectively. Because the tracrRNA is universal, Cas9 can be directed to new target sites by replacing the 20-nt crRNA sequence. Thus, in contrast to ZFNs and TALENs, the specificity of RNA-guided endonucleases (RGENs) can be customized by simply replacing a short synthetic RNA molecule, without the need to engineer multiple protein domains for each target site (Gaj et al., 2013).
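The N20NGG targeting rule described above can be illustrated with a short script. The following is only a minimal sketch, not a validated design tool: the function names (`reverse_complement`, `find_target_sites`) and the coordinate convention (0-based index of the leftmost base of the 23-nt site on the given strand) are assumptions made for illustration, and no filtering for specificity or gRNA efficiency is attempted.

```python
import re

# Translation table for complementing an unambiguous DNA sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Lookahead so that overlapping N20NGG sites are all reported.
# Group 1: 20-nt protospacer, group 2: 3-nt PAM (NGG).
SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_target_sites(seq):
    """List candidate sites of the form N20NGG on both strands.

    Returns tuples (strand, start, protospacer, pam), where start is the
    0-based forward-strand index of the leftmost base of the 23-nt site
    (a hypothetical convention chosen for this sketch).
    """
    seq = seq.upper()
    sites = []
    for m in SITE.finditer(seq):
        sites.append(("+", m.start(), m.group(1), m.group(2)))
    rc = reverse_complement(seq)
    for m in SITE.finditer(rc):
        # Map the reverse-strand hit back to forward-strand coordinates.
        start = len(seq) - (m.start() + 23)
        sites.append(("-", start, m.group(1), m.group(2)))
    return sites


if __name__ == "__main__":
    example = "A" * 20 + "TGG"
    for site in find_target_sites(example):
        print(site)
```

Retargeting Cas9 then amounts to choosing one of the reported 20-nt protospacers as the new crRNA-derived portion of the gRNA, while the rest of the gRNA stays unchanged.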
Shortly after the minimal components of the CRISPR system were identified, two independent research groups reported successful adaptation of Cas9/gRNA to achieve genome editing in mammalian cells (Cong et al., 2013; Mali et al., 2013a). Here, a codon-optimized version of the Cas9 protein bearing a nuclear localization signal was supplied together with two gRNAs simultaneously, resulting in cleavage of both loci and demonstrating the possibility of multiplexed genome engineering. Since then, additional reports have demonstrated efficient and precise mutagenesis across many model organisms, including mouse (Horii et al., 2014; Shen et al., 2013; Wang et al., 2013b)
Target recognition by Cas9 relies on base-pair complementarity between the first 20 nucleotides of an engineered gRNA and the respective genomic DNA sequence adjacent to the PAM. However, recent reports have demonstrated that multiple mismatches are tolerated between the gRNA and its complementary target sequence (Hsu et al., 2013; Jiang et al., 2013), thereby raising the concern of undesired off-target mutagenesis. Indeed, Fu et al. showed relatively high levels of off-target cleavage in human cells (Fu et al., 2013). To circumvent such off-target effects and thereby improve the specificity of Cas9, a ‘double nicking strategy’ analogous to the use of heterodimeric FokI domains can be employed: an aspartate-to-alanine substitution in the catalytic RuvC domain (D10A) (Cong et al., 2013; Gasiunas et al., 2012; Jinek et al., 2012) converts Cas9 into a nickase, because cleavage of the non-complementary strand is impaired. The mutant therefore introduces only a ‘nick’ in the complementary DNA strand instead of a double-strand break (DSB), which minimizes the risk of off-target mutagenesis because individual nicks are repaired by the cell via the high-fidelity base excision repair pathway (Dianov and Hübscher, 2013). However, two nicks in close proximity on opposite DNA strands are still recognized by the cell as a DSB. Thus, using two gRNAs along with the nickase enables efficient mutagenesis (Mali et al., 2013b) (Figure 2). A systematic study testing various distances between the two gRNAs revealed that effective cleavage is achieved (i) with an ‘offset’ between −10 nt and 100 nt, (ii) when the gRNAs have a tail-to-tail configuration (gRNAs oriented with the PAM sites most distal from one another), and (iii) when the resulting nicks produce 5′ overhangs (Ran et al., 2013a). Notably, a direct overlap of both gRNAs on opposite strands abolishes cleavage activity, presumably due to steric hindrance of the nickase pair.
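The geometric criteria for a nickase pair can be expressed as a simple check. The sketch below is illustrative only and rests on several assumptions that are not taken from the cited studies: the `Guide` class, the forward-strand coordinate convention, and the definition of the offset as the gap between the PAM-distal ends of the two 20-nt protospacers are all hypothetical choices. The thresholds (−10 nt to 100 nt) are those quoted in the text, and the resulting overhang type is not computed but is assumed to follow from the PAM-out orientation.

```python
from dataclasses import dataclass

GUIDE_LEN = 20      # protospacer length in nt
MIN_OFFSET = -10    # lower offset bound quoted in the text
MAX_OFFSET = 100    # upper offset bound quoted in the text


@dataclass(frozen=True)
class Guide:
    """Hypothetical description of a gRNA target site.

    start:  0-based forward-strand index of the leftmost protospacer base
    strand: "+" if the PAM lies to the right of the protospacer,
            "-" if it lies to the left (forward-strand view)
    """
    start: int
    strand: str


def offset(left, right):
    """Gap between the PAM-distal ends of a PAM-out pair (assumed definition).

    Negative values mean that the two protospacers partially overlap.
    """
    return right.start - (left.start + GUIDE_LEN)


def is_valid_nickase_pair(a, b):
    """Check tail-to-tail (PAM-out) orientation and the offset window."""
    left, right = sorted((a, b), key=lambda g: g.start)
    # PAM-out: left guide has its PAM on the far left, right guide on the far right.
    pam_out = left.strand == "-" and right.strand == "+"
    return pam_out and MIN_OFFSET <= offset(left, right) <= MAX_OFFSET


if __name__ == "__main__":
    pair = (Guide(100, "-"), Guide(130, "+"))
    print(offset(*pair), is_valid_nickase_pair(*pair))
```

Under these assumptions, a pair whose protospacers overlap completely has an offset of −20 nt and is rejected, in line with the observation that direct overlap abolishes cleavage.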