The mammalian liver consists of hexagonal-shaped lobules, radially polarized by blood flow and morphogens1–4. Key liver genes have been shown to be differentially expressed along the lobule axis, a phenomenon termed zonation5,6, but a detailed genome-wide reconstruction of this spatial division of labor has not been achieved. Here we measure the entire transcriptome of thousands of mouse liver cells and infer their lobule coordinates based on a panel of zonated landmark genes, characterized with single molecule fluorescence in-situ hybridization7. Using this approach we obtain the zonation profiles of all liver genes with high spatial resolution. We find that more than 50% of liver genes are significantly zonated and uncover abundant non-monotonic profiles that peak at the mid-lobule layers. These include a spatial order of bile-acid biosynthesis enzymes that matches their position in the enzymatic cascade. Our approach can facilitate reconstruction of similar spatial genomic blueprints for other mammalian organs. The liver is a heterogeneous tissue that consists of hepatocytes operating in repeating anatomical units termed lobules. The hexagonally shaped lobules are composed of about 15 concentric layers of hepatocytes1,4. Blood flowing from portal veins and hepatic arteries at the corners of the lobules towards draining central veins creates gradients of oxygen, nutrients and hormones. In addition, Wnt morphogens secreted from the central vein generate an inverse polarizing field2. In line with this graded microenvironment, different radial layers sub-specialize in distinct processes, a phenomenon termed ‘liver zonation’6,8. This spatial division of labor is thought to confer optimality for liver function. For example, the outer highly oxygenated periportal lobule layers express higher levels of enzymes involved in energy-demanding tasks such as gluconeogenesis and ureagenesis, whereas the inner pericentral layers specialize in glycolysis and xenobiotic metabolism. Zonation patterns of different liver functions have been traditionally studied using RNA in-situ hybridization and immunohistochemistry5,8. These techniques are limited in their sensitivity and throughput. High-throughput methods identified genome-wide differences in expression between cells enriched for portal and central hepatocytes, however the spatial resolution of these studies was limited to two populations, isolated via either perfusion techniques9 or laser capture microdissection10. Whole-genome reconstruction of liver zonation requires a technique to measure the entire transcriptome and the lobule coordinates of many liver cells. Single cell RNA sequencing (scRNAseq) enables measuring the genome-wide expression patterns of thousands of cells11–14, however this technique requires tissue dissociation, thus losing the spatial context of each cell. Combining scRNAseq with in-situ expression measurements of landmark genes can enable inference of the original tissue coordinates. Such approach has recently been applied to dissect spatial gene expression signatures during embryogenesis15,16. Liver tissue is particularly challenging, as most liver genes exhibit spatially-graded rather than binary expression patterns. We have previously applied single molecule Fluorescence in-situ Hybridization (smFISH) to measure the mRNA content of hepatocytes in intact mouse liver7. This technique has the sensitivity and dynamic range to measure precise mRNA distributions at distinct lobule coordinates. We therefore combined smFISH with scRNAseq to achieve a genome-wide reconstruction of liver zonation (Fig. 1). As a panel of landmark genes we chose six highly expressed liver genes with diverse zonation patterns – the pericentral genes Glul and Cyp2e19 and the periportal genes Ass110, Asl10, Alb8 and Cyp2f29, (Fig. 2a, Extended Data Fig. 1). We segmented thousands of liver cells spanning the porto-central lobule axis of fasted mice, and reconstructed the spatially-dependent probability of observing cells with distinct mRNA levels (Fig. 2b, Supplementary information). In parallel, we used massively parallel single cell RNA-seq (MARS-seq)17 to measure the entire transcriptome of more than 1,500 dissociated liver cells (Table S1). Our measurements revealed three distinct populations consisting of Kupffer cells, endothelial cells and hepatocytes (Extended Data Fig. 2a-c). Hepatocytes were highly heterogeneous in their expression patterns and exhibited clear gradients that matched our landmark gene profiles (Fig. 3a, b). The urea cycle gene Ass1 and the glutamine synthetase gene Glul were expressed in a mutually exclusive manner in the single-cell data, demonstrating the purity of the single cell isolation (Extended Data Fig. 2d-e). To combine the scRNAseq results with our smFISH-obtained zonation profiles we divided the porto-central lobule axis into nine layers and developed a probabilistic inference algorithm to compute the likelihood of each cell to belong to any of these layers, based on the expression of our panel of landmark genes (Extended Data Fig. 3, Supplementary information, Tables S2, S7). Our reconstruction accuracy approached saturation at 6 genes. It was strongly dependent on the extent of zonation of our landmark genes, and only weakly dependent on the intra-layer cell-to-cell variability (Extended Data Fig. 4). We validated the precision of our reconstructed zonation profiles using smFISH on 20 genes with diverse profiles and found an excellent overall correspondence between the predicted and measured profiles (Extended Data Fig. 5). Our analysis revealed that around 50% of the expressed liver genes are non-randomly spatially zonated (Fig. 3c, Table S3, 3496 of 7277 genes with qval Hsd3b7->Cyp8b1->Cyp27a1, exhibited a spatial order that matched the position in the enzymatic cascade (Fig. 4b, c). While Cyp7a1 and Hsd3b7 were most abundant in the pericentral layer 1, the next enzyme – Cyp8b1 peaked in layers 2-3 (Fig. 4b,c, Extended Data Fig. 10f). Baat, the gene encoding bile acid-CoA:amino acid N-acyltransferase, the enzyme that conjugates bile acids prior to their secretion, was also depleted in layer 1 (Fig. 4b). The lower levels of Cyp8b1 in layer 1 compared to layer 2 may potentially indicate that intermediates are transferred between layers 1 and 2. Alternatively it could suggest that Cyp7a1 and Hsd3b7 harbor additional functions in layer 1 or that Cyp8b1 harbors additional functions in layer 2. Using the database of KEGG metabolic networks we identified additional zonated liver enzyme pairs that are significantly spatially discordant and either operate on the same substrate, produce the same product, or act sequentially such that the product of one enzyme is the substrate of the other (Table S6). This constitutes a resource for further exploration of liver division of labor. Our study challenges the traditional binary classification of liver into periportal and pericentral hepatocytes and reveals multiple roles for the intermediate lobule coordinates. The broad spatial heterogeneity revealed here may represent an optimal balance between the diverse liver tasks28. Our work provides the resource to explore the design principles governing this global spatial division of labor. The detailed spatial information of liver gene expression presented here can serve as an important tier in efforts to achieve comprehensive tissue-level metabolic reconstructions29. Liver gene expression is highly dynamic, with multiple genes showing circadian and metabolic variations30. While our study focused on zonation profiles of fasted mice, similar reconstructions will decipher the dynamics of liver zonation profiles along these temporal axes. Another natural extension is the analysis of liver pathologies. Reconstruction of liver zonation profiles in response to liver intoxication will expose the plasticity of zonation and the extent to which different layers compensate for the functional loss of other layers. Our approach for combining smFISH and scRNAseq can be readily applied to extract spatial blueprints of other structured mammalian organs in diverse physiological and pathological states. Methods Mice and tissues All animal studies were approved by the Institutional Animal Care and Use Committee of WIS. C57bl6 male mice age 2 month were housed under reverse phase cycle, and fasted for 5 hours starting at 7AM. All mice were anesthetized with an intraperitoneal injection of a ketamine (100 mg/kg) and xylazine (10 mg/kg) mixture. For smFISH, liver tissues were harvested and fixed in 4% paraformaldehyde for 3 hours; incubated overnight with 30% sucrose in 4% paraformaldehyde and then embedded in OCT. 7 µm cryosections were used for hybridization. Mouse liver cells for RNAseq were extracted from four mice and each smFISH result was based on at least 2 mice. Our study did not include randomization or blinding. Hybridization and imaging Probe library constructions, hybridization procedures and imaging conditions were previously described7,31,32. All smFISH probe libraries (Table S8), with the exception of Glutamine Synthetase (Glul), were coupled to Cy5. To detect cell borders alexa fluor 488 conjugated phalloidin (Rhenium A12379) was added to the GLOX buffer wash32. Portal node was identified morphologically on DAPI images based on bile ductile, central vein was identified using smFISH for Glul in TMR, included in all hybridizations. For zonation profiles, images were taken as scans spanning the portal node to the central vein. To compute single cell mRNA concentration of our landmark genes we used imageM32 to detect dots, extracted the sum of all dot intensities of the smFISH signal for each segmented cell within the first 3μm of the Z-stack, and divided by the segmented cell volume to obtain the intensity concentration per cell. Since the landmark genes were highly abundant, dots often coalesced, leading to underestimates of the true cellular gene expression when counting dots, whereas the sum of dot intensities was found to better capture the full dynamic range33. The gene expression distributions of the landmark genes were based on at least 800 cells from at least 10 lobules and 2 mice per gene. To validate the predicted zonation we imaged 20 additional genes on at least 10 lobules and 2 mice per gene (Extended Data Fig. 5b). Profiles for Igfbp1, Cyp27a1, Glud1, Cyp8b1, Igfbp2, Pck1, Cps1, Arg1, G6pc, Uox, Igf1, Pigr and Acly were generated by counting dots and dividing the number of dots in radial layers spanning the porto-central axis by the layer volume. Profiles for Cyp1a2, Rnase4, Gsta3, Ugt1a1, Hamp, Mup3 and Apoa1 were generated by quantifying the average background-subtracted intensity of the smFISH images at sequential lobule layers. Hepatocytes isolation Mouse hepatocytes were isolated by a modification of the two-step collagenase perfusion method of Seglen34 from 5 hours fasted, 2 months old male C57bl6 mice. Single cells were isolated from four mice. Digestion step was performed with Liberase Blendzyme 3 recombinant collagenase (Roche Diagnostics) according to the manufacturer’s instruction. Isolated hepatocytes were taken directly to sorting. Single-cell sorting Cells were sorted with SORP-FACSAriaII machine using a 130 μm nozzle. Dead cells were excluded on the basis of 5 μg/ml propidium iodide (Invitrogen) incorporation. Cells adhering to each other (i.e., doublets) were eliminated on the basis of pulse width. We used a 25,000-250,000 FSC-A gate. For three of the mice a #1.5 ND filter and was used to enrich for hepatocytes, whereas for the fourth mouse a #1 ND filter was used to enrich for non-parenchymal cells. Hepatocytes were sorted into 384-well cell capture plates containing 2 μl of lysis solution and barcoded poly(T) reverse-transcription (RT) primers for single-cell RNA-seq17. Barcoded single cell capture plates were prepared with a Bravo automated liquid handling platform (Agilent) as described previously17. Four empty wells were kept in each 384-well plate as a no-cell control during data analysis. Immediately after sorting, each plate was spun down to ensure cell immersion into the lysis solution, snap frozen on dry ice and stored at -80°C until processed. Massively Parallel Single Cell RNA-Seq (MARS-Seq) library preparation Single cell libraries were prepared, as described in Jaitin et. al.17. Briefly, mRNA from cells sorted into MARS-seq capture plates were barcoded and converted into cDNA and pooled using an automated pipeline. The pooled sample was then linearly amplified by T7 in vitro transcription and the resulting RNA was fragmented and converted into sequencing ready library by tagging the samples with pool barcodes and Illumina sequences during ligation, reverse transcription and PCR. Each pool of cells was tested for library quality and concentration was assessed as described in Jaitin et. al.17. Mapping of single-cell reads to mouse reference genome (mm9) was done using HISAT version 0.1.635 and reads with multiple mapping positions were excluded. Reads were associated with genes if they were mapped to an exon defined by a reference set obtained from the UCSC genome browser. Exons of different genes that share genomic position on the same strand were considered as a single gene with concatenated gene symbol. Corrected read counts were evaluated based on unique molecular identifiers (UMI) as described in Jaitin et. al.17. Immunohistochemistry Formalin-fixed and paraffin-embedded livers were sectioned (5μm), deparaffinized, rehydrated and permeabilized in 0.3% hydrogen peroxide in PBS for 20min at RT. Antigen retrieval was carried out by boiling in citrate buffer (pH=6) for 10min in a pressure cooker. Sections were blocked with normal goat serum (normal goat serum Vector labs, S-100) and incubated for 1h at RT with anti-Cyp8b1 (Santa Cruz, sc-23515, diluted 1:50 in PBS). Detection was performed by incubation with a peroxidase-conjugated anti-rabbit antibody (Zytomed Systems, ZUC032) using DAB as chromogen. Statistical analysis P-values for comparison of porto-central bias of different groups (Wnt+ and Wnt-, Ras+ and Ras- and hypoxia-induced and repressed) were obtained by Wilcoxon rank-sum tests on the porto-central bias of all genes, defined as the difference between the zonation profile values in layers 1 and 9 divided by the mean over all layers. The concise set of non-monotonic genes (Extended Data Fig. 10a) were defined as genes with Kruaskal-Wallis q 2) or decrease (d), log fold 1) or lower (f, fold<1)expression in hypopituitary mice23 (FDR-corrected p-value<0.05). Extended Data Figure 9 KEGG pathways enriched for zonated genes. (a) Average max-normalized zonation profiles of KEGG pathways enriched for zonated genes. Panel displays all pathways with more than 10 genes and hypergeometric q-value<0.1 (Table S5 exhibits full results). Each profile was normalized by its maximum along all layers. (b) Periportal bias of liver secreted proteins. Genes are based on an annotated liver secretome37. (c) Pericentral bias for liver detoxification genes. Shown are genes for cytochrome P450, Uridine 5'-diphospho-glucuronosyltransferase and glutathione S transferase. Images in (b-c) include significantly zonated genes from each pathway (Kruskal-Wallis qval<0.01 and more than 5*10-5 of the total cellular UMIs on average in at least one of the 9 layers), each profile normalized to maximum of 1. Extended Data Figure 10 Non-monotonic zonation profiles of liver genes. (a) Max-normalized zonation profiles of the concise set of 46 non-monotonic genes (Methods). (b) Mup3 is highly variable but exhibits a peak at layer 7 and a decreased expression in both the pericentral layer 1 and periportal layer 9. (c) Igfbp2 exhibits a non-monotonic zonation profile, peaking at layers 3-6. Scale bar-20µm. CV – central vein. Enlarged rectangles in (b-c) mark the periportal layers (blue), mid-lobule layers (green) and pericentral layers (magenta). Scale bar-8µm. Micrographs are representative of at least 10 lobules and at least two mice per gene. (d) Igfbp2, Igfbp1 and Igfbp4 are downstream to Igf1, the secreted protein that they bind and stabilize. (e) Non-monotonic zonation profiles observed do not stem from ploidy-specific regulation. Each dot represents a pair of adjacent cells (within 30µm of each other), x-axis is the higher ploidy cell, y-axis is the lower ploidy cell. Pvalues are Wilcoxon sign-ranked tests. Gene expression for Igfbp2, Hamp, and Cyp8b1 was quantified as the number of smFISH dots divided by the segmented cell volumes; Mup3 expression was quantified as the average intensity, due to its higher abundance. Quantification for each gene based on at least 60 pairs. (f) Immunohistochemistry of Cyp8b1 demonstrates higher protein concentration in layers 2-3 (dashed curve) compared to layer 1 (dotted curve). Scale bar-50μm. Micrographs are representative of 7 lobules in 2 mice. Supplementary Material Supplementary Information is linked to the online version of the paper at www.nature.com/nature. Supplementary Guide Supplementary Information Supplementary Table 1 Supplementary Table 2 Supplementary Table 3 Supplementary Table 4 Supplementary Table 5 Supplementary Table 6 Supplementary Table 7 Supplementary Table 8