Alternative Splicing

Alternative splicing allows one gene to produce multiple protein isoforms — dramatically expanding the proteome from a limited genome.

The human genome contains roughly 20,000 protein-coding genes. The human proteome, the complete set of proteins, contains well over 100,000 distinct protein forms. How? The genome doesn't encode 100,000 genes. The discrepancy is resolved largely by alternative splicing: the ability of a single pre-mRNA to be spliced in multiple ways, producing different combinations of exons and therefore different protein isoforms.

Alternative splicing is not an exception or an edge case. It's the rule: approximately 95% of human multi-exon genes undergo alternative splicing. It's one of the primary mechanisms through which eukaryotic complexity arises from a surprisingly small genome.

Splicing Recap

As established in an earlier chapter, after transcription the pre-mRNA still contains all of its introns. The spliceosome, a large complex of snRNAs (U1, U2, U4, U5, U6) and ~150 proteins, identifies exon-intron boundaries by recognizing consensus sequences (5' splice site GU, branch point, polypyrimidine tract, 3' splice site AG) and catalyzes intron removal.

Constitutive splicing removes every intron and joins all exons: the same outcome every time. Alternative splicing uses different combinations of splice sites to produce distinct isoforms.

Modes of Alternative Splicing

There are five main patterns:

Exon Skipping

The most common mode (~40% of events). An exon is included in some transcripts and excluded from others. Inclusion or skipping is controlled by the relative strength of the splice sites flanking the exon and by regulatory proteins.

Pre-mRNA:  [Exon 1]—[Exon 2]—[Exon 3]—[Exon 4]
Isoform A: [Exon 1]—[Exon 2]—[Exon 3]—[Exon 4]  (include exon 3)
Isoform B: [Exon 1]—[Exon 2]—[Exon 4]            (skip exon 3)
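
The diagram above can be mimicked in a few lines of Python. This is only a toy sketch: exons are plain labels rather than sequences, and the `splice` helper is a made-up name for illustration, not part of any library.

```python
def splice(exons, skip=frozenset()):
    """Join exons in their original order, omitting any listed in `skip`."""
    return [exon for exon in exons if exon not in skip]

# Exon labels taken from the diagram above.
pre_mrna = ["Exon 1", "Exon 2", "Exon 3", "Exon 4"]

isoform_a = splice(pre_mrna)                   # include exon 3
isoform_b = splice(pre_mrna, skip={"Exon 3"})  # skip exon 3

print("Isoform A:", "-".join(isoform_a))
print("Isoform B:", "-".join(isoform_b))
```

The order of the remaining exons never changes; splicing only decides which ones are kept.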

Alternative 5' Splice Site

Different 5' splice sites are used, changing the 5' boundary of an intron and thus including or excluding a portion of the upstream exon.

Alternative 3' Splice Site

Different 3' splice sites are used, changing the 3' boundary of an intron and thus including or excluding a portion of the downstream exon.

Intron Retention

An intron is retained in the mature mRNA rather than being spliced out. This often produces a non-functional transcript (with a premature stop codon, triggering nonsense-mediated decay, NMD) but can also produce functional isoforms. More common in plants; less common in animals, though still prevalent in some cell types.

Mutually Exclusive Exons

Two or more exons that are never included in the same transcript. The transcript always includes exactly one of them.
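
To see how these modes combine, here is a minimal sketch that enumerates the isoforms allowed by a small gene model. The model, the exon names, and the `enumerate_isoforms` helper are all hypothetical. Each position lists the choices available there: a constitutive exon has one choice, a cassette exon has two (include or skip), and a mutually exclusive pair offers exactly one exon of the pair.

```python
from itertools import product

# Hypothetical gene model: one entry per position, listing the allowed choices.
gene_model = [
    [("E1",)],             # constitutive: always included
    [("E2",), ()],         # cassette exon: included or skipped
    [("E3a",), ("E3b",)],  # mutually exclusive: exactly one of the pair
    [("E4",)],             # constitutive
]

def enumerate_isoforms(model):
    """Return every exon chain obtained by picking one choice per position."""
    isoforms = []
    for combo in product(*model):
        isoforms.append(tuple(exon for choice in combo for exon in choice))
    return isoforms

isoforms = enumerate_isoforms(gene_model)
for isoform in isoforms:
    print("-".join(isoform))
```

With one cassette exon and one mutually exclusive pair, this toy gene already yields four isoforms, and none of them contains both E3a and E3b.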

Alternative splicing as feature flags in a compiled binary

Imagine a codebase where certain modules can be compiled in or out depending on build flags. The source is the same; the compiled binary differs. Alternative splicing is this mechanism operating at the RNA level. The gene (source) is fixed. Different cells, at different times or in different conditions, produce different "builds" by including or excluding exons.

The downstream consequence: two cells with identical DNA can express functionally distinct proteins from the same gene.

Regulation: Splicing Enhancers and Silencers

Splice sites alone don't fully determine which splicing pattern occurs. Local sequences in the pre-mRNA regulate spliceosome assembly:

  • Exonic Splicing Enhancers (ESEs): sequences within exons that promote inclusion
  • Exonic Splicing Silencers (ESSs): sequences within exons that promote skipping
  • Intronic Splicing Enhancers (ISEs): promote inclusion when in adjacent introns
  • Intronic Splicing Silencers (ISSs): promote skipping

These sequences are bound by RNA-binding proteins (RBPs), especially SR proteins (serine/arginine-rich; typically activators) and hnRNPs (heterogeneous nuclear ribonucleoproteins; often repressors). The balance of these RBPs determines which isoform is produced.

Key RBPs:

  • SRSF1 (ASF/SF2): canonical SR activator
  • hnRNP A1: often antagonizes SR proteins; promotes exon skipping
  • NOVA1/NOVA2: neuron-specific splicing regulators; control splicing of many neuronal genes
  • PTBP1: represses inclusion of neuron-specific exons in non-neural cells; PTBP1 downregulation during differentiation allows inclusion of neural-specific exons

Functional Consequences of Isoforms

Alternative splicing can change:

  • Protein domain composition: including or excluding a domain changes the protein's interactions and functions
  • Subcellular localization: a localization signal in an alternatively spliced exon can redirect the protein
  • Protein stability: some isoforms are more stable; others have shorter half-lives
  • Enzymatic activity: active site residues can be affected
  • Dimerization: isoforms can differ in their ability to form homo- or heterodimers

Classic examples:

BRCA1 produces multiple isoforms through alternative splicing. Isoforms lacking functional BRCT domains can have altered DNA repair and tumor suppressor activity.

BCL-X (BCL2L1 gene): the long isoform BCL-XL is anti-apoptotic (prevents programmed cell death); the short isoform BCL-XS is pro-apoptotic. These are produced from the same gene by alternative 5' splice site usage. The balance between the two isoforms helps determine whether a cell lives or dies.

VEGF-A (vascular endothelial growth factor) has multiple isoforms with different binding affinities and diffusion properties — controlling whether the angiogenic signal is local or diffuses widely.

Tau (MAPT gene): multiple exons are alternatively spliced, producing 6 isoforms with different microtubule-binding properties. An imbalance in tau isoforms is implicated in tauopathies including Alzheimer's disease.

Disease-Causing Splicing Mutations

Mutations that disrupt splice sites are a major class of pathogenic variants. They can cause:

  • Exon skipping: loss of the downstream exon → truncated protein
  • Intron retention: retained intron → premature stop codon → NMD
  • Cryptic splice site activation: a nearby sequence with partial homology to a splice site gets activated → aberrant isoform

An estimated 15–50% of pathogenic single-nucleotide variants affect splicing, either at canonical splice sites or in ESEs/ESSs. Many "missense" variants in coding sequence actually disrupt splicing by eliminating an ESE rather than (only) changing the amino acid.

This has important implications for variant interpretation: a variant in the middle of an exon, with no predicted amino acid change, can still be pathogenic if it destroys an ESE. Standard annotation pipelines that only consider protein-level effects miss this class.
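
A minimal sketch of why a protein-only check misses this class of variant. Everything here is illustrative: the sequence is invented, the codon table covers only the codons used, and the single motif in `ESE_MOTIFS` is a hypothetical stand-in for real enhancer predictions, which rely on much richer motif models.

```python
# Partial codon table (standard genetic code), covering only the codons used below.
CODONS = {"ATG": "M", "GAA": "E", "GAG": "E", "CTG": "L", "AAA": "K"}

# Hypothetical ESE motif set, for illustration only.
ESE_MOTIFS = {"GAAGAA"}

def translate(seq):
    """Translate an in-frame coding sequence codon by codon."""
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq), 3))

def ese_hits(seq):
    """Return the motifs from ESE_MOTIFS that occur in the sequence."""
    return {motif for motif in ESE_MOTIFS if motif in seq}

def apply_snv(seq, pos, alt):
    """Substitute one base at a 0-based position."""
    return seq[:pos] + alt + seq[pos + 1:]

ref = "ATGGAAGAACTGAAA"          # invented exon fragment: M E E L K
var = apply_snv(ref, 5, "G")     # GAA -> GAG in the second codon

is_synonymous = translate(ref) == translate(var)
lost_motifs = ese_hits(ref) - ese_hits(var)

print("synonymous:", is_synonymous)
print("ESE motifs lost:", lost_motifs)
```

The variant leaves the protein unchanged, so a protein-only annotation would call it silent, yet the toy enhancer motif is gone.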

Splicing-targeted therapeutics

The ability to manipulate splicing has become therapeutically useful. Antisense oligonucleotides (ASOs) can be designed to bind pre-mRNA sequences and block or expose splice sites:

  • Nusinersen (Spinraza): treats spinal muscular atrophy (SMA) by blocking an ISS in the SMN2 pre-mRNA, forcing inclusion of exon 7 and producing functional SMN protein
  • Eteplirsen (Exondys 51): treats Duchenne muscular dystrophy by skipping exon 51, restoring the reading frame of the dystrophin mRNA
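
The reading-frame logic behind exon skipping is simple arithmetic: the frame survives only if the total number of nucleotides removed is a multiple of 3. A minimal sketch follows. The exon names and lengths are made up for illustration and are not real dystrophin values.

```python
def frame_preserved(exon_lengths, removed):
    """True if removing the given exons deletes a multiple of 3 nucleotides."""
    return sum(exon_lengths[exon] for exon in removed) % 3 == 0

# Hypothetical exon lengths in nucleotides (not real dystrophin values).
exon_lengths = {"A": 90, "B": 100, "C": 110, "D": 120}

deletion = ["B"]              # a genomic deletion removes 100 nt: frameshift
with_skip = deletion + ["C"]  # also skipping the neighbouring exon removes 210 nt

print("deletion alone in frame:", frame_preserved(exon_lengths, deletion))
print("deletion plus skip in frame:", frame_preserved(exon_lengths, with_skip))
```

Skipping an extra exon shortens the protein further, but it brings the downstream codons back into frame, which is the trade-off this kind of therapy accepts.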

Splice-switching is now a validated therapeutic mechanism, with multiple approved drugs.

Measuring Alternative Splicing with RNA-seq

Standard RNA-seq workflows count reads per gene, averaging over all isoforms. Detecting alternative splicing requires dedicated approaches:

Isoform-level quantification tools: Kallisto, Salmon, and RSEM directly quantify transcript isoforms (not just genes) by probabilistically assigning reads to known transcripts. Output: TPM/estimated counts per transcript.
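
To illustrate what gene-level counting hides, here is a small sketch that takes per-transcript TPM values and derives both the gene-level total and each isoform's share of it. The table is a simplified, hypothetical stand-in for a quantifier's per-transcript output; the gene and transcript names and the values are invented.

```python
from collections import defaultdict

# Hypothetical per-transcript abundances as (gene, transcript, TPM).
transcript_tpm = [
    ("GENE1", "GENE1-201", 30.0),
    ("GENE1", "GENE1-202", 10.0),
    ("GENE2", "GENE2-201", 5.0),
]

def gene_level_tpm(rows):
    """Sum transcript TPMs per gene (the view a gene-level workflow gives)."""
    totals = defaultdict(float)
    for gene, _transcript, tpm in rows:
        totals[gene] += tpm
    return dict(totals)

def isoform_fractions(rows):
    """Fraction of each gene's total expression contributed by each transcript."""
    totals = gene_level_tpm(rows)
    return {
        transcript: (tpm / totals[gene] if totals[gene] > 0 else 0.0)
        for gene, transcript, tpm in rows
    }

gene_tpm = gene_level_tpm(transcript_tpm)
fractions = isoform_fractions(transcript_tpm)
print(gene_tpm)
print(fractions)
```

Two samples can share the same gene-level total while their isoform fractions differ completely, which is exactly the signal that per-gene counting averages away.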

Differential splicing analysis: rMATS, SUPPA2, and DEXSeq identify which splicing events change between conditions. rMATS and SUPPA2 quantify Percent Spliced In (PSI, ψ), the fraction of transcripts that include a given exon, and test for differences; DEXSeq instead tests for differential usage of individual exons.

PSI ranges from 0 (exon always skipped) to 1 (exon always included). A PSI change of 0.2 between conditions means the exon shifts from, say, 40% to 60% included, which is often biologically meaningful.
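
As a worked example, PSI can be sketched as the share of informative reads that support inclusion. The read counts below are invented to reproduce the 40% to 60% shift, and the sketch deliberately ignores the length and coverage corrections that dedicated tools typically apply.

```python
def psi(inclusion_reads, skipping_reads):
    """Percent Spliced In: fraction of informative reads supporting inclusion."""
    total = inclusion_reads + skipping_reads
    if total == 0:
        return None  # no informative reads, PSI is undefined
    return inclusion_reads / total

# Hypothetical read counts for one exon in two conditions.
psi_control = psi(40, 60)
psi_treated = psi(60, 40)
delta_psi = psi_treated - psi_control

print(f"PSI control={psi_control:.2f}, treated={psi_treated:.2f}, delta={delta_psi:+.2f}")
```

Returning `None` when there are no informative reads keeps an unmeasured exon from being mistaken for one that is always skipped.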

Long-read sequencing (Oxford Nanopore, PacBio) sequences full-length transcripts, directly revealing isoform structures without inference from short reads. It is increasingly used for isoform discovery in tissues with complex splicing patterns (especially brain).
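
Once each long read has been reduced to the chain of exons it covers, counting isoforms becomes a simple tally. The sketch below assumes that reduction has already happened (alignment and exon assignment are out of scope here), and the reads themselves are invented.

```python
from collections import Counter

# Hypothetical long reads, each already summarized as its ordered exon chain.
long_reads = [
    ("E1", "E2", "E3", "E4"),
    ("E1", "E2", "E4"),
    ("E1", "E2", "E3", "E4"),
    ("E1", "E2", "E3", "E4"),
    ("E1", "E2", "E4"),
]

# Each distinct exon chain is one observed isoform structure.
isoform_counts = Counter(long_reads)

for chain, n_reads in isoform_counts.most_common():
    print("-".join(chain), n_reads)
```

No probabilistic assignment is needed in this idealized case, because each read spans the whole transcript and identifies its isoform directly.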

Alternative Splicing and the Proteome

Alternative splicing dramatically expands protein diversity beyond the ~20,000 gene count:

  • Multiple isoforms per gene
  • Isoforms with distinct interaction partners, localization, stability
  • Isoform ratios that change during differentiation, disease, and aging

This means that the same variant (mutation) can have different effects depending on which isoforms are expressed in a given cell type. A variant might affect a domain present in the ubiquitous isoform but absent in the brain-specific isoform, so the phenotype is not brain-related.
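
The isoform-dependence of a variant's effect can be sketched as a lookup: which isoforms actually contain the exon the variant falls in? The isoform names, exon labels, and the `affected_isoforms` helper are hypothetical.

```python
# Hypothetical isoforms of one gene, each described by the exons it contains.
isoforms = {
    "ubiquitous": {"E1", "E2", "E3", "E4"},
    "brain_specific": {"E1", "E2", "E4"},  # lacks E3
}

def affected_isoforms(isoforms, variant_exon):
    """Names of the isoforms that include the exon carrying the variant."""
    return sorted(name for name, exons in isoforms.items() if variant_exon in exons)

# A variant located in exon E3 touches only the isoform that includes E3.
affected = affected_isoforms(isoforms, "E3")
print(affected)
```

A variant in E3 never reaches the brain-specific isoform, so tissues that express only that isoform would be spared in this toy model.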

For anyone working in genomics, alternative splicing is not an advanced topic. It's part of the baseline: every variant call, every expression analysis, every annotation query involves decisions about which isoforms count and how to handle them.