Research Summary

The Weissman laboratory is looking at how cells ensure that proteins fold into their correct shape, as well as the role of protein misfolding in disease and normal physiology. We are also developing experimental and analytical approaches for exploring the organizational principles of biological systems and globally monitoring protein translation through ribosome profiling.

Driven by technical and analytical advances, biology is going through a transformation from a descriptive to a more principled and information-rich science. These efforts are critical to meet the central challenge of the post-genomic era: defining the information that is encoded in genomes and understanding how this information is expressed in space and time. The Weissman laboratory has been uniquely positioned both to help drive and to take advantage of this transformation. We have developed novel approaches that have transformed fields, including ribosome profiling for monitoring how proteins are made in vivo, as well as the CRISPRi, CRISPRa, and CRISPRoff approaches for turning down, up, or off, respectively, the expression of any gene in a human cell. Critically, we have also used these approaches to gain fundamental “textbook” biological insights, focusing on the broad area of how the cell ensures that proteins (the workhorses of the cell) fold into their proper three-dimensional structure, a prerequisite to their function, as well as on the causes and consequences of protein misfolding, which underlies some of the most devastating human diseases, such as Alzheimer’s disease.

The ability to bridge large-scale approaches and mechanistic investigations is a key focus of the Weissman lab. Beyond the immediate payoff of these efforts, we hope our work will contribute to the broader goal of developing more principled, less ad hoc approaches for defining gene functions.

Below are descriptions of six broad project areas of the lab.

Ribosome Profiling

The advent of microarray technology, and more recently RNA-seq, has made it possible to monitor the internal state of a cell with unprecedented precision. However, these approaches monitor messenger RNA (mRNA) levels, whereas in the large majority of cases it is proteins that are directly responsible for mediating a gene's cellular functions. We know that mRNA levels are often a highly imperfect proxy for protein production because of the extensive use of translational control.

Invented by my lab in 2009, ribosome profiling has revolutionized our ability to monitor protein synthesis in vivo, making it possible to determine the start, stop, reading frame, chaperone engagement, subcellular location, and rate of synthesis of virtually every protein in a cell. Ribosome profiling takes advantage of the fact that, as a ribosome reads a messenger RNA, it shields a small piece of this message from digestion. Sequencing of these protected fragments thus provides an exact snapshot of what part of a protein was being made by each ribosome in a cell.
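
To make the idea of reading translation from protected fragments concrete, here is a minimal sketch in Python. It is an illustration only, not the lab's analysis pipeline. It assumes each footprint is reported as the 0-based position of its 5' end on a transcript and that the ribosomal P-site lies a fixed 12 nucleotides downstream, a commonly used starting value that real data would need to calibrate. The function names and the toy positions are hypothetical.

```python
from collections import Counter

# Assumption: a fixed 5'-end-to-P-site offset; real experiments calibrate
# this per footprint length rather than hard-coding it.
P_SITE_OFFSET = 12


def p_site_positions(five_prime_ends, offset=P_SITE_OFFSET):
    """Shift each footprint 5' end to the inferred P-site position."""
    return [pos + offset for pos in five_prime_ends]


def frame_counts(p_sites, cds_start):
    """Count P-sites in each reading frame relative to the start codon."""
    counts = Counter((p - cds_start) % 3 for p in p_sites)
    return [counts.get(frame, 0) for frame in range(3)]


def codon_occupancy(p_sites, cds_start, cds_end):
    """Count P-sites per codon index (0 = start codon) inside the CDS."""
    occupancy = Counter()
    for p in p_sites:
        if cds_start <= p < cds_end:
            occupancy[(p - cds_start) // 3] += 1
    return dict(occupancy)


# Toy data (hypothetical): a coding sequence spanning positions 100..129.
ends = [88, 88, 91, 94, 94, 94, 95, 97]
sites = p_site_positions(ends)
print(frame_counts(sites, cds_start=100))
print(codon_occupancy(sites, cds_start=100, cds_end=130))
```

A strong excess of P-sites in one frame is the kind of three-nucleotide periodicity that lets footprints report the reading frame, and the per-codon counts indicate which part of the protein each ribosome was making.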

We have gone on to exploit ribosome profiling to make a number of fundamental insights into the nature of translation. These include: (1) Discovery of the principle of proportional synthesis, in which different subunits in multi-subunit complexes are made in direct proportion to their stoichiometry. This reveals an important aspect of cellular “economy” that provides critical insights into fields as disparate as engineering synthetic biological systems and understanding cancer. (2) Definition of the complete protein coding capacity of genomes. We can now sequence the genome of any organism or individual rapidly and at low cost, but in order to make sense of these data we have to know what proteins are encoded in these genomes. In every organism explored (bacteria, yeast, viruses, and mammalian systems), ribosome profiling has revealed a universe of new proteins, including a new class of “micropeptides” that are far shorter than standard proteins but nonetheless can play critical and diverse roles. We have now shown that hundreds of these micropeptides play essential and diverse roles in human cells. (3) Elucidation of the principles of cotranslational engagement and targeting of proteins. As soon as a protein emerges from the ribosome, it is engaged by a series of factors that help that protein reach its native state and its proper location in the cell. We have developed a variant of the ribosome profiling technique that allows one to watch this process with unprecedented precision and used this, among other things, to rewrite our textbook understanding of how proteins are targeted for secretion to the outside world.
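
The proportional synthesis principle lends itself to a simple numerical check, sketched below in Python. This is a toy illustration under stated assumptions, not the published analysis: footprint density (footprints per nucleotide of coding sequence) is used as a proxy for synthesis rate, and the gene names, counts, lengths, and stoichiometries are invented.

```python
def synthesis_rate(footprint_count, cds_length_nt):
    """Footprint density (reads per nucleotide), a proxy for synthesis rate."""
    return footprint_count / cds_length_nt


def normalized_rates(genes, stoichiometry):
    """Synthesis rate per subunit copy, scaled so the first gene equals 1.0."""
    per_copy = {
        name: synthesis_rate(count, length) / stoichiometry[name]
        for name, (count, length) in genes.items()
    }
    reference = next(iter(per_copy.values()))
    return {name: value / reference for name, value in per_copy.items()}


# Hypothetical complex: one copy of subA and two copies of subB.
genes = {"subA": (300, 600), "subB": (900, 900)}  # (footprints, CDS length)
stoichiometry = {"subA": 1, "subB": 2}

ratios = normalized_rates(genes, stoichiometry)
print(ratios)
```

Under proportional synthesis, every value in `ratios` should be close to 1.0; values far from 1.0 would indicate a subunit made in excess or deficit relative to its stoichiometry.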

Protein Folding in the Endoplasmic Reticulum

As a rule, proteins that enter the secretory pathway fold within the endoplasmic reticulum (ER). The ER establishes and maintains a highly specialized environment optimized for folding. Understanding how this is accomplished is a major focus of our research.

But of course the ER is much more than just the entry point to the secretory pathway. This organelle integrates lipid biosynthesis, protein folding, glycosylation, quality control and degradation and plays a key role in cellular homeostasis, infection, and immune defense. Accordingly, our efforts have taken a more holistic approach, encompassing all aspects of ER function.

Major findings include the following:

Identification and characterization of Yos9 as a sugar sensor of misfolded proteins

How misfolded forms are recognized and distinguished from the myriad of folding substrates represents a fascinating problem in discrimination. Cells must strike a balance between avoiding overly promiscuous destruction of slow-folding proteins and preventing the escape of toxic forms. For most luminal proteins, this discrimination depends not only on the folding status of a polypeptide, but also on its glycosylation state.

We, and others concurrently, identified the conserved luminal protein Yos9 as the lectin sensor responsible for probing the glycosylation state of misfolded proteins. This allowed us, in a series of papers, to work out the elaborate mechanism used to correctly mark and degrade terminally misfolded proteins. This strategy of marking potential substrates by a sugar modification and decoding the signal in a second recognition event provides a proofreading mechanism for enhancing substrate specificity. We are now recapitulating this modification and recognition cascade in vitro from purified components. I am particularly excited about this work as it represents a terrific opportunity to provide our most in-depth structural and mechanistic understanding to the fascinating question of how the ER tells right from wrong.

Discovery of a novel branch of the metazoan UPR involving targeted mRNA destruction

Elegant work pioneered by my UCSF/HHMI colleague and collaborator Peter Walter identified Ire1 as a key ER sensor of misfolded proteins. These studies revealed the canonical activation pathway in which unfolded proteins sensed in the ER lumen induce the cytoplasmic nuclease domain of Ire1 to cleave the mRNA encoding XBP-1 (Hac1 in yeast), enabling splicing and production of the active transcription factor. This allows for the remodeling of the ER in response to stress. But transcriptional responses are slow. We found that in metazoans, activation of the Ire1 endonuclease independently induces the rapid turnover of many mRNAs encoding membrane and secreted proteins through a pathway we call regulated Ire1-dependent decay (RIDD). This response is well suited to complement other UPR mechanisms by selectively halting production of proteins that challenge the ER and clearing the translocation and folding machinery for the subsequent remodeling process. More recently, we, and independently the Papa group, provided evidence that cells use a multi-tiered mechanism by which different conditions in the ER lead to distinct outputs (transcriptional versus RIDD) from Ire1. This allows for a nuanced response to different stresses and provides the potential for therapeutics that "surgically" target specific Ire1 outputs, preventing proapoptotic effects while keeping protective outputs intact.

Identification of the GET pathway: a conserved system responsible for the biogenesis of tail-anchored membrane proteins

Tail-anchored (TA) proteins, defined by the presence of a single C-terminal transmembrane domain (TMD), play critical roles throughout the secretory pathway and in mitochondria. Because of its position near the C terminus, the TMD of TA proteins is occluded by the ribosome until translation is completed. Thus, TA proteins cannot exploit the classic cotranslational SRP/Sec61 translocation mechanism used by most secretory pathway proteins. Starting from our Genetic Interaction Map approach (see below), we identified the Guided Entry of Tail-anchor (GET) proteins as three components of a conserved machinery responsible for the biogenesis of TA proteins. Further genetic interaction maps identified two additional ribosome-associated components (Get4 and Get5). These and other observations suggest a pathway for the biogenesis of TA proteins in which Get4/5 recognize TA proteins as synthesis is completed. The TA protein is then delivered to a soluble ATPase, Get3, which is subsequently recruited to the ER membrane by a receptor composed of the Get1/2 proteins, thereby allowing proper membrane insertion.

Discovery of a molecular caliper mechanism for determining the length of very long-chain fatty acids

Despite their numerous essential functions, our understanding of how very long-chain fatty acids (VLCFAs) are made was highly limited. From an academic standpoint, the chain length diversity of VLCFAs, which enables their functional diversity, represents a fascinating example of how polymerization of simple building blocks can be used to build chemically complex macromolecules. From a practical standpoint, there are potential therapeutic and commercial benefits to be gained from manipulating VLCFA synthesis. A key obstacle to these endeavors has been the fact that VLCFAs are made by a collection of detergent-labile, multi-enzyme complexes embedded in the ER membrane. Consequently, even though the basic outline of the chemistry underlying VLCFA synthesis was established in the 1960s, the minimal set of proteins, including the key dehydratase component, needed for VLCFA synthesis was not known.

We identified Yjl097w (Phs1p) as the founding member of a new family of membrane proteins that act as the VLCFA dehydratase. This missing piece allowed us to reconstitute VLCFA synthesis in proteoliposomes containing purified Phs1p and three other membrane proteins. We used this in vitro system to address what is arguably the most important question regarding VLCFA synthesis: how does the biosynthetic machinery instruct the precise number of two-carbon additions that yields a product of defined length? Our studies revealed a fascinating mechanism, akin to a caliper, in which the chain length is determined by the distance between the cytosolic active site that mediates two-carbon addition and a lysine near the luminal end of a transmembrane helix. By stepping this lysine residue along one face of the helix toward the cytosol, we engineered novel synthases with correspondingly shorter VLCFA outputs.

Identification of the Orm family of proteins as critical mediators of sphingolipid homeostasis

Although recent studies have shed light on the critical roles of sphingolipids both as structural components of membranes and critical signaling molecules, how cells sense and regulate the levels of sphingolipids has remained a mystery. We found that members of the highly conserved but previously uncharacterized Orm protein family inhibit the first and rate-limiting step of de novo sphingolipid production. Moreover, the activity of Orm proteins is dynamically regulated by a key feedback loop: when sphingolipid synthesis is disrupted, Orm proteins are inactivated by phosphorylation, thus providing cells with an elegant mechanism to couple sphingolipid production to metabolic demand.

Our findings are of particular interest in light of the recent identification of a major genetic risk factor for asthma near the locus of the human ORM gene homolog ORMDL3. Polymorphisms that alter ORMDL3 expression account for up to ~20% of childhood asthma cases and have also been implicated in other inflammation-based diseases such as type 1 diabetes. Thus, our finding that alterations in ORM gene activity profoundly affect sphingolipid metabolism raises the testable hypothesis that misregulation of sphingolipids contributes directly to the pathogenesis of asthma. We are now extending our initial studies to define the mechanism by which Ormdl proteins are regulated in mammalian cells. Recent progress in identifying upstream components of the yeast phosphorylation feedback pathway will guide these efforts and may also enable us to define a molecular sensor of sphingolipid levels. Thus, our work on Orm family proteins promises to inform a fundamental question in cell biology and may also reveal novel roles for sphingolipids in inflammatory diseases.

CRISPR-Based Engineering of the Epigenome

While the catalog of mammalian genes, and of when and where these genes are turned on, is rapidly expanding, our understanding of their function lags behind. The Weissman lab and collaborators developed robust technologies that enable the systematic investigation of the cellular consequences of turning on or off individual genes. We have identified rules for specific targeting of transcriptional repressors (CRISPRi), typically achieving 90-99% knockdown with minimal off-target effects, and activators (CRISPRa) to endogenous genes via endonuclease-deficient Cas9. Using these rules, we constructed and validated genome-scale CRISPRi and CRISPRa libraries that enable systematic analysis of gene function, including both essential and nonessential genes as well as long noncoding RNAs. Collectively, this work established CRISPRi and CRISPRa as powerful tools that provide rich and complementary information for mapping complex pathways. We have also adapted this approach to allow the large-scale analysis of double knockdowns. This enables the systematic search for synthetic lethal interactions that will inform the rational design of combination drug therapies. We are broadly applying the CRISPRi/a approach to understanding disease mechanisms, defining drug targets, and even potentially treating disease by reversibly regulating gene expression without permanently altering patients’ DNA. Most recently, we have developed variants of CRISPRi/a, termed CRISPRoff and CRISPRon, that recruit DNA methylases and demethylases, respectively, enabling the rewriting of the DNA methylome and allowing heritable and programmable turning off or on of the vast majority of genes. This has provided an unprecedented tool for the heritable control of gene expression and for understanding the mechanism of epigenetic control of gene expression.

Mapping Cancer Evolution

The Weissman lab has designed, built, and optimized molecular recorder technologies, which make it possible in principle to capture critical features of a cell’s life (environmental insults, developmental decisions, external and internal signals, ancestry and progeny) in a defined and compact region within its genome. These recordings can then be read out in a massively parallel manner using droplet-based single-cell RNA-sequencing technology. Just as the flight recorder of a plane provides critical forensic information about the normal operation of a plane and how these operations failed, molecular recorders provide an unprecedented view of normal biology and disease. We have built upon our molecular recorder technology in several ways that make it both more easily deployable to study various biological contexts of interest and more information-rich for larger, longer, and deeper recording experiments. Together with Nir Yosef’s group, we have developed a computational pipeline for processing lineage tracing data from large recording experiments, algorithms for phylogenetic tree reconstruction, and a framework for interpreting meaningful biology from lineage tracing data. With these improvements, we have expanded the possible applications for molecular recording and lineage tracing experiments. In addition to our initial application to mammalian embryogenesis, a major focus of the Weissman lab is to use these molecular recorder technologies to study critical aspects of different stages of cancer progression, including tracing the rate, routes, and drivers of cancer metastasis as well as the evolutionary process by which a cell containing an oncogenic mutation develops into an aggressive tumor and evades therapeutic challenge.
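
As a rough illustration of how recorded edits can be turned into a lineage tree, the Python sketch below applies a simple greedy splitting heuristic to a toy character matrix. It is not the pipeline described above, only a simplified stand-in. It assumes each cell is described by a tuple of edit states at a fixed set of recording sites, with 0 meaning unedited, and it ignores missing data and repeated independent edits, both of which real data would require handling. All names and values are hypothetical.

```python
from collections import Counter


def most_common_edit(cells):
    """Return the (site, state) edit shared by the most cells, or None."""
    counts = Counter(
        (site, state)
        for states in cells.values()
        for site, state in enumerate(states)
        if state != 0  # 0 denotes an unedited site
    )
    # Edits present in every cell cannot split the group, so skip them.
    informative = {edit: n for edit, n in counts.items() if n < len(cells)}
    if not informative:
        return None
    # Sorting first makes tie-breaking deterministic.
    return max(sorted(informative), key=informative.get)


def build_tree(cells):
    """Recursively split cells on the most common edit; leaves are cell names."""
    if len(cells) == 1:
        return next(iter(cells))
    edit = most_common_edit(cells)
    if edit is None:
        return sorted(cells)  # unresolved group of indistinguishable cells
    site, state = edit
    with_edit = {c: s for c, s in cells.items() if s[site] == state}
    without = {c: s for c, s in cells.items() if s[site] != state}
    return (build_tree(with_edit), build_tree(without))


# Toy character matrix (hypothetical): three recording sites per cell.
cells = {
    "c1": (1, 2, 5),
    "c2": (1, 2, 0),
    "c3": (1, 0, 3),
    "c4": (0, 0, 4),
}
tree = build_tree(cells)
print(tree)
```

The intuition is that an edit shared by many cells probably occurred early, so splitting on it first places early events near the root of the tree.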

Understanding the Organizational Principles of Cellular Systems

How cellular and organismal complexity emerges from combinatorial expression of genes is a central question in biology that has been a major focus of my lab for nearly two decades. We helped develop the first systematic quantitative genetic interaction maps in 2005 and applied this approach to understanding how the ER functions as a protein folding organelle. We have gone on to build similar maps in other organisms, including fission yeast (S. pombe) and mammalian cells, and to use these to explore a remarkably diverse range of biological phenomena, including how combinatorial perturbations can drive complex cell differentiation. We have also extended this genetic interaction paradigm to study other quantitative phenotypes, including, most recently, information-rich, high-content phenotyping approaches such as Perturb-seq (single-cell RNA-sequencing pooled CRISPR screens). To achieve this, we developed an analytical framework for interpreting high-dimensional landscapes of cell states (manifolds) constructed from transcriptional phenotypes and complemented this with machine learning approaches to predict interactions, enabling the exploration of vastly larger genetic interaction manifolds.
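
One common way to quantify a genetic interaction, sketched below in Python, is to compare the phenotype of a double perturbation with the value expected if the two perturbations acted independently. The sketch assumes a multiplicative null model on relative fitness (wild type = 1.0), which is a widely used convention but not necessarily the scoring scheme used in the maps described here. The threshold and all fitness values are invented for illustration.

```python
def interaction_score(fitness_a, fitness_b, fitness_ab):
    """Deviation of the double perturbation from the multiplicative expectation."""
    return fitness_ab - fitness_a * fitness_b


def classify(score, threshold=0.1):
    """Label an interaction; the threshold is an arbitrary illustrative cutoff."""
    if score < -threshold:
        return "aggravating"  # worse than expected, e.g. synthetic sick/lethal
    if score > threshold:
        return "alleviating"  # better than expected, e.g. same pathway
    return "neutral"


# Hypothetical relative fitness values (wild type = 1.0).
fitness_a, fitness_b = 0.8, 0.5
for fitness_ab in (0.1, 0.4, 0.7):
    score = interaction_score(fitness_a, fitness_b, fitness_ab)
    print(fitness_ab, round(score, 3), classify(score))
```

Computing such a score for every pair of perturbations gives a matrix of interactions, and genes with similar interaction profiles across that matrix often act in the same pathway or complex.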

Information-rich Genotype-Phenotype Maps

A central goal of genetics is to define the relationships between genotypes and phenotypes. High-content phenotypic screens such as Perturb-seq (pooled CRISPR-based screens with single-cell RNA-sequencing readouts) enable massively parallel functional genomic mapping but, to date, have been used at limited scales. We recently performed the first genome-scale Perturb-seq targeting all expressed genes with CRISPR interference (CRISPRi) across >2.5 million human cells and developed a framework to power biological discovery with the resulting genotype-phenotype map. We used transcriptional phenotypes to predict the function of poorly characterized genes, uncovering new regulators of ribosome biogenesis (including CCDC86, ZNF236, and SPATA5L1), transcription (C7orf26), and mitochondrial respiration (TMEM242). In addition to assigning gene function, single-cell transcriptional phenotypes allowed for in-depth dissection of complex cellular phenomena, from RNA processing to differentiation. We leveraged this ability to systematically identify the genetic drivers and consequences of aneuploidy and to discover an unanticipated layer of stress-specific regulation of the mitochondrial genome. Our information-rich genotype-phenotype map thus reveals a multidimensional portrait of gene function and cellular behavior.