Hematopoietic stem cells (HSCs) represent a rare population of cells residing in the Bone Marrow (BM) at the top of hematopoietic hierarchy. A critical balance is maintained between self-renewal and lineage differentiation of HSCs to maintain hematopoietic homeostasis. With aging, this balance is altered with an increase of long term HSC self-renewal
and a myeloid biased
differentiation, which favors the appearance of myeloid leukemias and anemias. This report presents the implemented data analysis strategy and results that characterize the mouse HSC pool in bone marrow (BM) and its intrinsic cellular changes and variations during aging with single cell RNA-seq.
We generated single cell RNA-seq data from pools of young and old HSPCs, isolated from mouse BMs. We used Seurat (Stuart et al. 2019) and Monocle (Qiu et al. 2017) R packages to respectively cluster the cells and order them along their differentiation process. We also analyse regulon activities with pyScenic (Aibar et al. 2017) to identify transcription factor activity changes upon aging.
## Load environnement
suppressMessages(library(Seurat))
suppressMessages(library(monocle))
suppressMessages(library(gProfileR))
suppressMessages(library(RColorBrewer))
suppressMessages(library(gridExtra))
suppressMessages(library(grid))
suppressMessages(library(getopt))
suppressMessages(library(scales))
suppressMessages(library(biomaRt))
suppressMessages(library(plyr))
suppressMessages(library(stringr))
suppressMessages(library(knitr))
suppressMessages(library(reshape2))
suppressMessages(library(cowplot))
suppressMessages(library(ggplotify))
suppressMessages(library(EnhancedVolcano))
suppressMessages(library(readxl))
suppressMessages(library(ggforce))
suppressMessages(library(mgcv))
suppressMessages(library(pheatmap))
suppressMessages(library(yaml))
suppressMessages(library(rlang))
source("../R_src/getSigPerCells.R")
source("../R_src/Enrichment.R")
source("../R_src/funForSeurat.R")
source("../R_src/funForPlot.R")
source("../R_src/makeBranchedHeatmap.R")
source("../R_src/DoMultipleBarHeatmap.R")
blank <- grid.rect(gp=gpar(col="white"),draw = F)
colorTrajState <-c("#999999","#56B4E9","#E69F00")
colorPhases <- c(brewer.pal(9,"RdPu"))[c(3,6,9)]
colorCellType <- c(brewer.pal(4,"Set2"))[c(1,3,4,2)]
divColor <- c(brewer.pal(12,"Paired"))[c(1,2)]
color <- F #wether or not colour the cluster in table enrichment
colorBin <- c(grey.colors(4)[c(1,3)],"#56B4E9","#E69F00","#924900")
colorAnnot <- c(brewer.pal(12,"Paired"))
legendPointSize <- 5
legendBoxSize <-10
dir.create(path = "paperImages/svg",recursive = T)
## Warning in dir.create(path = "paperImages/svg", recursive = T):
## 'paperImages/svg' already exists
dir.create(path = "paperImages/png",recursive = T)
## Warning in dir.create(path = "paperImages/png", recursive = T):
## 'paperImages/png' already exists
Unsupervised clustering of young and old HSPCs revealed 15 clusters gathering mainly undifferentiated and to a lesser extend lineage-primed HSPCs
Seurat integration and clustering workflow was done in the snakemake workflow (code available in the git repository). The results are loaded from its output.
## Load Seurat results
hspc.combined <- readRDS("../output/Seurat3_integration/combined.rds")
# Define levels order
hspc.combined@meta.data$predicted <- factor(hspc.combined@meta.data$predicted, levels = c("LTHSC","STHSC","MPP2","MPP3"))
hspc.combined@meta.data$phases <- mapvalues(hspc.combined@meta.data$phases,from = c("G1_G0","S","G2_M"), to = c("G1/G0","S","G2/M"))
hspc.combined@meta.data$phases <- factor(hspc.combined@meta.data$phases, levels = c("G1/G0","S","G2/M"))
hspc.combined@meta.data$predicted <- factor(hspc.combined@meta.data$predicted, levels = c("LTHSC","STHSC","MPP2","MPP3"))
hspc.combined@meta.data$AGE <- factor(hspc.combined@meta.data$AGE, levels = c("Young","Old"))
Seurat was used to integrate the 4 samples together, visualize the cells in a 2 dimensionnal space with the UMAP reduction dimension technique and cluster the cells with the Louvain algorithm as previously described (Stuart et al. 2019). UMAP plot of all 15 000 HSPCs from young and old ages. In total, 15 clusters are identified and are visualized by different colors. Cluster names are attributed according to enrichment analysis of cluster markers using gprofiler and previous published HSC signatures.
Idents(hspc.combined) <- "integrated_snn_res.0.5"
new.ident.name = c("diff","np2","np3","rep","np4","np1",
"div","tgf","pEr","ifn",
"pMk","pNeu","pMast","pT","pB")
names(x = new.ident.name) <- levels(x = hspc.combined)
hspc.combined <- RenameIdents(object = hspc.combined, new.ident.name)
## We choose an order that roughly follow proportion of cell type assigned with CaSTLe see below
clustersOrderFig1 <- c("tgf","np1","np2","ifn","np3","np4","diff","rep","div","pMk","pT","pB","pEr","pNeu","pMast")
Idents(hspc.combined) <- factor(Idents(hspc.combined), levels = clustersOrderFig1)
hspc.combined@meta.data$numclust <-Idents(hspc.combined)
colorClusters <- c("#999999","#004949","#009292","#ff6db6","#ffb6db",
"#490092","#006ddb","#b66dff","#6db6ff","#b6dbff",
"#920000","#924900","#db6d00","#24ff24","#ffff6d")
#hspc.combined.umap <- hspc.combined
Idents(hspc.combined) <- factor(Idents(hspc.combined), levels = rev(clustersOrderFig1))
umap <- DimPlot(object = hspc.combined, reduction = "umap", label = F, pt.size = 0.25,cols = rev(colorClusters)) +
xlab("UMAP 1") +
ylab("UMAP 2")
svg(file="paperImages/svg/fig1B_umap.svg",height = 6,width = 7)
umap
dev.off()
## png
## 2
umap
Zoom on primed B cells
umap_pB <- DimPlot(object = hspc.combined, reduction = "umap", label = F, pt.size = 1,cols = rev(colorClusters))
umap_pB <- umap_pB +xlim(c(12.2,12.5))+ ylim(c(1.1,1.5)) +
scale_fill_manual( values= colorClusters) +
xlab("UMAP 1") +
ylab("UMAP 2")
svg(file="paperImages/svg/fig1B2_umap.svg",height = 3,width = 3.5)
umap_pB
## Warning: Removed 14877 rows containing missing values (geom_point).
dev.off()
## png
## 2
umap_pB
## Warning: Removed 14877 rows containing missing values (geom_point).
remove(umap)
remove(umap_pB)
Load DEG analysis results from the workflow for cluster markers. Rename the cluster with their final name according to our study. Write this result in the supplementary table 1.
markers <- read.table("../output/Seurat3_integration/biology/markers_res0.5/markers.tsv",sep = "\t",header = T)
exclusiveMarkers <- markers[which(markers$gene %in% names(which(table(markers$gene) == 1))),]
markers$cluster <- mapvalues(x = markers$cluster,from = unique(markers$cluster),to = new.ident.name)
markers$cluster <- factor(markers$cluster,levels = rev(clustersOrderFig1))
markers <- markers[order(markers$cluster),c("cluster","gene","p_val_adj","p_val","avg_logFC","pct.1","pct.2")]
WriteXLS::WriteXLS(x =markers,"paperTables/clusterMarkersTableS1.xlsx",row.names = F)
Perform a new gProfiler analysis of cluster markers with no hierarchical filtering. Write this result in the supplementary table 2.
col <- c("term.id","domain","term.name","p.value","intersection","term.size","query.size","overlap.size")
domainOrder <- c("BP", "MF", "CC", "cor","hp","keg","rea", "tf")
dir.create("paperTables/gProfileR/clusterMarkers/",recursive = T)
## Warning in dir.create("paperTables/gProfileR/clusterMarkers/", recursive =
## T): 'paperTables/gProfileR/clusterMarkers' already exists
gprofileTables <- list()
for (c in rev(clustersOrderFig1)) {
gprofileTables[[c]] <- gprofiler(query = as.vector(markers$gene[which(markers$cluster == c)]),
organism = "mmusculus",
custom_bg = rownames(hspc.combined),
hier_filtering = "none",
ordered_query = F)
gprofileTables[[c]] <- gprofileTables[[c]][,col]
gprofileTables[[c]]$domain <- factor(gprofileTables[[c]]$domain,levels = domainOrder)
gprofileTables[[c]] <- gprofileTables[[c]][order(gprofileTables[[c]]$domain,gprofileTables[[c]]$p.value),]
write.table(x = gprofileTables[[c]],
file = paste("paperTables/gProfileR/clusterMarkers/",
c,
".tsv",sep = ""),
sep = '\t',quote = F,row.names = F)
}
WriteXLS::WriteXLS(x =gprofileTables,"paperTables/clusterMarkersGprofileTableS2.xlsx")
Load summary table of clusters characteristcs generated in the workflow. Rename the cluster with their final name according to our study. Write this result in the supplementary table 3.
tableSnakemake <- read.table("../output/Seurat3_integration/biology/markers_res0.5/clusters_table.tsv")
rownames(tableSnakemake) <- c("diff","np2","np3","rep","np4","np1",
"div","tgf","pEr","ifn",
"pMk","pNeu","pMast","pT","pB")
colnames(tableSnakemake) <- c("number of cells", "percentage of cells",
"STHSC","MPP3","LTHSC", "MPP2",
"Young","Old","Young_B","Old_B","Young_A","Old_A",
"G1/G0","S","G2/M","median nUMI","median percentage mitochondrial RNA","HSC chambers","B cells Chambers","NK cells Chambers","T cells, naive Chambers","Monocytes Chambers", "Granulocytes Chambers","Nuc. erythrocytes Chambers",
"Differenciated cells Chambers","Lymphoid Chambers","Myeloid Chambers","Mm HSC Runx1 Wu","Mm HSC Tcf7 Wu","Mm HSC Ivanova",
"Mm HSC Ramalho","HSC Explorer","Mm LT HSC Venezia",
"Mm Proliferation Venezia","Mm Quiescence Venezia",
"Mm Adult HSC UP Venezia", "Polarity factors Ting",
"Novel HSC regul polar Ting",paste(colnames(tableSnakemake)[c(39:50)],"Rodriguez Fraticelli" ))
colnameS3a <- c("number of cells", "percentage of cells",colnames(tableSnakemake)[c(18:50)])
colnameS3b <- c("LTHSC","STHSC","MPP2","MPP3")
colnameS3c <- c("G1/G0","S","G2/M")
colnameS3d <- c("Young","Old","Young_A","Young_B","Old_A","Old_B")
colnameS3all <- c(colnameS3a,colnameS3b,colnameS3c,colnameS3d,"median nUMI","median percentage mitochondrial RNA")
tableSnakemake <- tableSnakemake[rev(clustersOrderFig1),]
tableS3 <- list(S3a_signatures = tableSnakemake[rev(clustersOrderFig1),colnameS3a],
S3b_subtypes = tableSnakemake[rev(clustersOrderFig1),colnameS3b],
S3c_phases = tableSnakemake[rev(clustersOrderFig1),colnameS3c],
S3d_ages_samples = tableSnakemake[rev(clustersOrderFig1),colnameS3d],
S3all = tableSnakemake[rev(clustersOrderFig1),colnameS3all])
#tableSnakemake <- tableSnakemake[,colnameS3all]
WriteXLS::WriteXLS(x =tableS3,"paperTables/clusterTablesS3.xlsx", row.names = T)
Selected enrichments and corresponding p-values adjusted for multiple testing (padj) obtained for cluster makers using gprofiler. You need to install condformat (not available in conda) to draw the colored table version
#Load a handmade summary of enrichment analysis of cluster markers
tableRecap <- read_excel("results/tableRecapSeuratClusters.xlsx")
rn <- tableRecap$`cluster Name`
tableRecap <- data.frame(tableRecap[,c("cluster Name","Relevant enrichment","pval")])
rownames(tableRecap) <- rn
tableRecap <- tableRecap[rev(clustersOrderFig1),]
tableRecap$pval <- format(as.numeric(tableRecap$pval),digits = 2,format = "e")
tableRecap$Relevant.enrichment <- gsub(tableRecap$Relevant.enrichment,pattern = "to ", replacement = "to\n")
tableRecap$Relevant.enrichment <- gsub(tableRecap$Relevant.enrichment,pattern = "organ ", replacement = "organ\n")
colnames(tableRecap) <- c("Cluster","Selected enrichment","padj")
rownames(tableRecap) <- NULL
if(color == T) {
library("condformat")
enrichmentTable <- condformat(tableRecap) %>%
rule_fill_discrete(
columns = Cluster,
expression = Cluster,
colours = c("tgf" = colorClusters[1],
"np1" = colorClusters[2],
"np2" = colorClusters[3],
"ifn" = colorClusters[4],
"np3" = colorClusters[5],
"np4" = colorClusters[6],
"diff" = colorClusters[7],
"rep" = colorClusters[8],
"div" = colorClusters[9],
"pMk" = colorClusters[10],
"pT" = colorClusters[11],
"pB" = colorClusters[12],
"pEr" = colorClusters[13],
"pNeu" = colorClusters[14],
"pMast" = colorClusters[15])) %>%
theme_grob(rows = NULL) %>% condformat2grob()
} else {
enrichmentTable <- tableGrob(tableRecap,theme=ttheme_default(base_size = 8) ,rows = NULL)
}
grid.arrange(enrichmentTable)
svg(file="paperImages/svg/fig1C_enrichSummary.svg",height = 6,width = 4.5)
grid.arrange(enrichmentTable)
dev.off()
## png
## 2
remove(enrichmentTable)
We plot the expression of markers for each clusters in the umap.
conserved.markers.to.plot <- c("Egr1","Mycn","Procr","Stat1","Mllt3",
"Abhd2","Cd34","Lig1","Gpsm2","Itga2b",
"Il7r","Cd79a","Klf1","Mpo","Hdc")
legendFeaturePlot <- g_legend(FeaturePlot(hspc.combined,features = "Cd34")+scale_colour_gradient(name
= "level",low = "lightgrey",high = "blue",labels = c("0",rep("",2),"max") ),direction = "vertical")
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
getFeaturePlot <- function(seurat,feature,markers = NULL) {
if (!is.null(markers)) {
associated.clusters <- paste(unique(markers$cluster[which(markers$gene == feature)]),collapse = ", ")
p <- FeaturePlot(seurat,features = feature) +
NoLegend() +
xlab("UMAP 1") +
ylab("UMAP 2") +
labs(title = bquote(~italic(.(feature))), subtitle = bquote(~(.(associated.clusters)))) +
theme(axis.ticks = element_blank(),
axis.title = element_blank(),
axis.text = element_blank(),
plot.subtitle = element_text(hjust = 0.5,size = 12))
print(associated.clusters)
} else {
p <- FeaturePlot(seurat,features = feature) +NoLegend() + theme(axis.ticks = element_blank() )
}
return(p)
}
featureplots <- lapply(rev(conserved.markers.to.plot),getFeaturePlot,seurat = hspc.combined,markers = markers)
## [1] "pMast, pNeu"
## [1] "pMast, pNeu"
## [1] "pEr"
## [1] "pB"
## [1] "pT"
## [1] "pMast, pMk"
## [1] "div"
## [1] "pNeu, pEr, pMk, div, rep"
## [1] "pNeu, rep, diff"
## [1] "np4"
## [1] "np3, ifn, np2, tgf"
## [1] "ifn"
## [1] "np2, tgf"
## [1] "np1, tgf"
## [1] "tgf"
plot <- cowplot::plot_grid(plotlist = featureplots,nrow=5)
fig2C <- arrangeGrob(plot,legendFeaturePlot,widths =c(11,2))
grid.arrange(fig2C)
svg(file="paperImages/svg/fig1D_markersUmap.svg",height = 14,width = 9)
grid.arrange(fig2C)
dev.off()
## png
## 2
png(file="paperImages/png/fig1D_markersUmap.png",height = 14,width = 9,units = "in",res = 300)
grid.arrange(fig2C)
dev.off()
## png
## 2
remove(plot)
We use a scRNA seq HSPCs dataset with cells FACS sorted in the different sub-classes of HSPCs (Rodriguez-Fraticelli et al. 2018) to label our cells with CaSTLe (Lieberman, Rokach, and Shay 2018). This tool consists in transfer learning from a source dataset with annotated cells to a target dataset (our cells).
for the source dataset:
Fist control of HSPC subtype proportions in each ages and each sample both assigned with CaSTLe and determined by FACS.
Load Facs data.
facs <- read.csv("results/FACSsorting.csv",sep = "\t",row.names = 1)
facs <- facs[c("old_A","old_B","young_A","young_B"),]
facs2 <- facs*table(hspc.combined$sampleName)*0.01
facs3 <- facs2
facs3$AGE <- c("Old","Old","Young","Young")
ct_facs_total <- apply(facs2,MARGIN = 2,FUN = "as.numeric")
ct_facs_total <- colSums(ct_facs_total)
ct_facs <- as.data.frame(facs3 %>% dplyr::group_by(AGE) %>% dplyr::summarize_all(sum))
ct_facs <- melt(ct_facs,id.vars = "AGE")
ct_facs <- ct_facs[order(ct_facs$AGE,decreasing = T),]
We compute the proportion of HSPC sub-classes assign with CaSTLe for each cluster.
# predicted cell type by AGE
ctAge <- ddply(hspc.combined@meta.data,~AGE + predicted,nrow)
ctAge <- ctAge[order(ctAge$AGE,decreasing = T),]
ctAge$AGE <- factor(ctAge$AGE, levels = c("Young","Old"))
ctAge$Assign <- "CaSTLe"
ct_facs$Assign <- "FACS"
colnames(ct_facs) <- c("AGE","CellType","V1","Assign")
colnames(ctAge) <- c("AGE","CellType","V1","Assign")
ctAge <- rbind(ct_facs,ctAge)
ctAge$AGE <- factor(ctAge$AGE, levels = c("Young","Old"))
ctAge$Assign <- factor(ctAge$Assign, levels = c("FACS","CaSTLe"))
#Predicted and FACS cell type by sample
ctAge2 <- ddply(hspc.combined@meta.data,~sampleName + predicted,nrow)
facs3$Sample <- c("Old_A","Old_B","Young_A","Young_B")
ct_facs2 <- melt(facs3[,-5],id.vars = "Sample")
ct_facs2$Assign <- "FACS"
ctAge2$Assign <- "CaSTLe"
colnames(ct_facs2) <- c("Sample","CellType","V1","Assign")
ctAge2$Sample <- firstup(ctAge2$sampleName)
ctAge2 <- ctAge2[,-1]
colnames(ctAge2) <- c('CellType',"V1","Assign","Sample")
ctAge2 <- rbind(ct_facs2,ctAge2)
ctAge2$Sample <- factor(ctAge2$Sample, levels = c("Young_A","Young_B","Old_A","Old_B"))
ctAge2$Assign <- factor(ctAge2$Assign, levels = c("FACS","CaSTLe"))
Barplot of CaSTLe versus FACS HSPC subclass proportion for all data
ctAll <- ggplot(ctAge, aes(fill = CellType,y = V1, x=Assign)) + labs(fill = "cell types:") +
geom_bar( stat="identity", position="fill")+
scale_fill_manual( values= colorCellType)+
scale_y_continuous(name = "HSPC subtypes (%)", labels = c(0,25,50,75,100))+
ylab(label = "")+xlab(label = "") + theme(text = element_text(size = 10)) +
theme(axis.text = element_text(size = 10))+
theme(axis.title = element_text(size=10),axis.text.x = element_text(angle = 45, hjust = 1)) +
theme(legend.title=element_blank())
ctAll
svg("paperImages/svg/figS1B_ctAll.svg",height = 5,width = 3)
ctAll
dev.off()
## png
## 2
#remove(ctAll)
Barplot of CaSTLe versus FACS HSPC subclass proportion at each age
ctAGE <- ggplot(ctAge, aes(fill = CellType,y = V1, x=Assign)) + labs(fill = "cell types:") +
facet_wrap(~AGE,nrow = 1) +
geom_bar( stat="identity", position="fill")+
scale_fill_manual( values= colorCellType)+
scale_y_continuous(name = "HSPC subtypes (%)", labels = c(0,25,50,75,100))+
ylab(label = "")+xlab(label = "") +
theme(legend.title=element_blank(),axis.text.x = element_text(angle = 45, hjust = 1))
ctAGE
svg(file="paperImages/svg/figS1C_ctAge.svg",height = 7,width = 5)
ctAGE
dev.off()
## png
## 2
#remove(ctAGE)
Barplot of CaSTLe versus FACS HSPC subclass proportion in each sample
ctSAMPLE <- ggplot(ctAge2, aes(fill = CellType,y = V1, x=Assign)) + labs(fill = "cell types:") +
facet_wrap(~Sample,nrow = 1) +
geom_bar( stat="identity", position="fill")+
scale_fill_manual( values= colorCellType)+
scale_y_continuous(name = "HSPC subtypes (%)", labels = c(0,25,50,75,100))+
ylab(label = "")+xlab(label = "") + theme(text = element_text(size = 10)) +
theme(axis.text = element_text(size = 10),axis.text.x = element_text(angle = 45, hjust = 1))+
theme(axis.title = element_text(size=10)) +
theme(legend.title=element_blank())
ctSAMPLE
svg(file="paperImages/svg/figS1D_ctAgeSmp.svg",height = 5,width = 7)
ctSAMPLE
dev.off()
## png
## 2
Then we can vizualize HSPC subtype proportions in each cluster.
#The getPredictedPropPerClustBarplot function is in the file R_src/funForseurat.R
predictedClust <- getPredictedPropPerClustBarplot(hspc.combined) + scale_fill_manual(values = colorCellType) + scale_y_continuous(name = "HSPC subtypes (%)", labels = c(0,25,50,75,100))
## Scale for 'fill' is already present. Adding another scale for 'fill',
## which will replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which
## will replace the existing scale.
predictedClust
svg(file="paperImages/svg/fig1E_bpCtClust.svg",height = 5,width = 5)
grid.arrange(predictedClust)
dev.off()
## png
## 2
#remove(predictedClust)
We compute the percentage of computationally assigned cell cycle (G1_G0, S and G2_M) phases in each of the 15 clusters with cyclone function (Scialdone et al. 2015) from the scran package (Lun, McCarthy, and Marioni 2016).
#The getPhasePropPerClustBarplot function code is in the file R_src/funForseurat.R
phasesClust <- getPhasePropPerClustBarplot(hspc.combined) + scale_fill_manual( values= colorPhases) + scale_y_continuous(name = "Cell cycle phases (%)", labels = c(0,25,50,75,100))
## Scale for 'fill' is already present. Adding another scale for 'fill',
## which will replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which
## will replace the existing scale.
phasesClust
svg(file="paperImages/svg/fig1F_bpPhaseClust.svg",height = 5,width = 5)
grid.arrange(phasesClust)
dev.off()
## png
## 2
remove(phasesClust)
Violin plots of selected markers for clusters obtained with Seurat function VlnPlot.
vlnS1 <- VlnPlot(object = hspc.combined,features = rev(conserved.markers.to.plot),pt.size = 0,ncol = 3,cols = rev(colorClusters),combine = F)
modifyVln <- function(v) {
v <- v +
theme(plot.title = element_text(size = 18,face = "italic")) +
xlab("") +
NoLegend()
}
vlnS1 <- lapply(vlnS1, modifyVln )
vlnS1 <- cowplot::plot_grid(plotlist = vlnS1,nrow=5)
svg("paperImages/svg/figS2_vlnClust.svg",height = 15,width = 12.5)
vlnS1
dev.off()
## png
## 2
We look for aging differences in each cluster. We analyse first the proportion of young and old cells in each cluster.
Umap with cell are colored according to their age.
## png
## 2
zoom on primed B cells
umap_pB <- DimPlot(object = hspc.combined,group.by = "AGE", reduction = "umap", label = F, pt.size = 1,cols = hue_pal()(2)[c(2,1)])
umap_pB <- umap_pB +
xlim(c(12.2,12.5))+ ylim(c(1.1,1.5)) +
xlab("UMAP 1") +
ylab("UMAP 2")
svg(file="paperImages/svg/fig2A2_umap.svg",height = 3,width = 3.5)
umap_pB
## Warning: Removed 14877 rows containing missing values (geom_point).
dev.off()
## png
## 2
umap_pB
## Warning: Removed 14877 rows containing missing values (geom_point).
remove(umap_pB)
Percentage of young and old HSPCs calculated for each cluster. Clusters for which the proportion of old or young cells are significantly higher than expected (pvalue of hyper-geometric test < 0.05) are colored.
#getAGEPropPerClustBarplot and getEnrichAge are in the file R_src/funForSeurat.R.
age <- suppressMessages(getAGEPropPerClustBarplot(hspc.combined) + labs(fill = "age:")+scale_fill_manual(values = hue_pal()(2)[c(1,2)])+ scale_y_continuous( labels = c("0%","25%","50%","75%","100%")))
ageEnrich <- getEnrichAge(hspc.combined,clustCol ="numclust",metaCol = "AGE")
ageEnrich <- as.data.frame(t(ageEnrich))
ageEnrich$color <- "black"
ageEnrich[which(as.numeric(as.vector(ageEnrich$phyper)) < 0.05 & ageEnrich$enrich == "Young"),"color"] <- hue_pal()(2)[2]
ageEnrich[which(as.numeric(as.vector(ageEnrich$phyper)) < 0.05 & ageEnrich$enrich == "Old"),"color"] <- hue_pal()(2)[1]
age <- age +
theme(axis.text.y = element_text(colour = ageEnrich[,'color'])) +
scale_fill_discrete(breaks=c("Young","Old")) +
guides(colour = guide_legend(override.aes = list(size=legendBoxSize)))
## Scale for 'fill' is already present. Adding another scale for 'fill',
## which will replace the existing scale.
age
svg(file="paperImages/svg/fig2B_bpAge.svg",height = 5,width = 5)
age
dev.off()
## png
## 2
Violin plots of Ly6d and Trp53inp1 in the primed B cluster in comparison to the other HSPCs.
## png
## 2
We got a clear batch effect between our sample that may come from the two single-cell RNA seq platform so we decide to do DEG with analysis separately for each platform separately and gathered the results after thanks to the FindConservedMarkers function of Seurat.
Make the DEG analysis with aging in all the cells without and with batch consideration. Print aging markers first identified without batch consideration with an opposite variation with aging in the two batches.
hspc.combined$platform <- str_split_fixed(hspc.combined$sampleName,"_",2)[,2]
Idents(object = hspc.combined) <- "AGE"
# Test global aging without batch consideration
ageMarkersWoBatchCr <- FindMarkers(hspc.combined,ident.1 = "Young",ident.2 = "Old",logfc.threshold = 0.25)
ageMarkersWoBatchCr <- ageMarkersWoBatchCr[which(ageMarkersWoBatchCr$p_val_adj < 0.05),]
# Test batch separately
agingGlobal <- FindConservedMarkers(object = hspc.combined,ident.1 = "Old",ident.2 = "Young",logfc.threshold = 0,grouping.var = "platform",verbose = T)
## Testing group B: (Old) vs (Young)
## Testing group A: (Old) vs (Young)
agingGlobalBatchEffect <- agingGlobal[which(agingGlobal$B_avg_logFC/agingGlobal$A_avg_logFC < 0),]
# Print aging markers identified without batch consideration with an opposite variation with aging in the two batches.
rownames(ageMarkersWoBatchCr)[which(rownames(ageMarkersWoBatchCr) %in% rownames(agingGlobalBatchEffect))]
## [1] "Rpl23a-ps3" "Gm9493" "Rpl9-ps6" "Gm8730" "Rpl10-ps3"
## [6] "Rpl6l" "Rpl13a-ps1" "Rpl13-ps3" "Gm10036" "Rps18-ps3"
## [11] "Gm10260" "Rpl10" "Gm11808" "Gm2000" "Gapdh"
## [16] "Gm10269" "Phpt1" "H2afy" "Rps27rt" "Rbm3"
## [21] "Gm6133" "Gm17541" "Gm10116" "Lrrc58" "Wdr89"
## [26] "Rps26-ps1" "Rpl15" "Gm6576" "Gm8797" "Bri3"
## [31] "Gm8444" "Gm5093" "Hdgf" "Gm6525" "Anp32b"
## [36] "Comt" "Rpl27-ps3" "Igfbp4" "Gm10076" "Gm15013"
## [41] "Rpl36-ps3"
Keep only aging markers identified in the two batch separately that have the same sens of variation with aging on both batches and we write the results in the supplementary table 4.
agingGlobal <- agingGlobal[which(agingGlobal$B_avg_logFC/agingGlobal$A_avg_logFC > 0),]
agingGlobal$Gene <- rownames(agingGlobal)
agingGlobal$min_avg_logFC <- NA
for (g in rownames(agingGlobal)) {
if(agingGlobal[g,"A_avg_logFC"] < 0) {
agingGlobal[g,"min_avg_logFC"] <- max(agingGlobal[g,c("A_avg_logFC","B_avg_logFC")])
} else {
agingGlobal[g,"min_avg_logFC"] <- min(agingGlobal[g,c("A_avg_logFC","B_avg_logFC")])
}
}
agingGlobal <- agingGlobal[order(agingGlobal$min_avg_logFC),]
write.table(x = agingGlobal,"paperTables/agingGlobalMarkers.tsv",sep = "\t",quote = F,row.names = F,col.names = T)
agingGlobal$Sens <- "down in Old vs Young"
agingGlobal$Sens[which(agingGlobal$min_avg_logFC > 0)] <- "up in Old vs Young"
colS4 <- c("Gene","Sens","minimump_p_val","max_pval","min_avg_logFC",
"A_p_val_adj","A_p_val","A_avg_logFC","A_pct.1","A_pct.2",
"B_p_val_adj","B_p_val","B_avg_logFC","B_pct.1","B_pct.2")
agingGlobal <- agingGlobal[,colS4]
WriteXLS::WriteXLS(x =agingGlobal,"paperTables/agingMarkersGlobalTableS4.xlsx",row.names = F)
Volcano plot of global differential expression upon aging test on all cells. black dots indicate significant DEGs (qval < 0.05 and average log fold change > 0.25). 6733 genes (expressed in at least 10% of young and 10% of old cells) are tested.
labs <- as.vector(agingGlobal$Gene[which(agingGlobal$avg_logFC < -1 | agingGlobal$avg_logFC>0.5)])
labs <- c("Vwf","Clu","Slamf1","Procr","Nupr1","Plac8","Sox4","Cd34","Cd48")
rownames(agingGlobal) <- agingGlobal$Gene
volcShort <- EnhancedVolcano(agingGlobal,
lab = rownames(agingGlobal),
x = 'min_avg_logFC',
y = 'minimump_p_val',
axisLabSize = 12,
titleLabSize = 12,
FCcutoff = 0.25,
transcriptLabSize = 3,
boxedlabels = TRUE,drawConnectors = T,
selectLab = labs,legendPosition = "bottom",title = NULL,subtitle = NULL,
col = c("grey", "grey", "grey", "black"),
legend = c("NS", "LogFC", "P", "P & 'LogFC'") ) +xlab(label = "LogFC") + ylab(label="-Log10(P)")
## Warning: One or more P values is 0. Converting to minimum possible value...
volcShort + NoLegend()
svg(filename="paperImages/svg/figS3_volc.svg", width=8, height=6)
volcShort + NoLegend()
dev.off()
## png
## 2
remove(volcShort)
DEG with aging analysis per clusters. Like for the global analysis we perform separately the test in the two batches and gathered the results. We write the result in the suplementary table S5.
FindAgingMarkers4 <- function(cluster,
hspc.combined,
max.cells.per.ident = Inf,
filterOnPadj = T,
logfc.threshold = 0.25,
min.pct = 0.1) {
hspc.combined$cluster.AGE <- paste(hspc.combined$numclust, hspc.combined$AGE, sep = "_")
Idents(object = hspc.combined) <- "cluster.AGE"
agingMarkers <- FindConservedMarkers(hspc.combined,
grouping.var = "platform",
test.use="wilcox",
ident.1 = paste0(cluster,"_Old"),
ident.2 = paste0(cluster,"_Young"),
verbose = TRUE, min.pct = min.pct,
logfc.threshold = logfc.threshold,
max.cells.per.ident = max.cells.per.ident
)
#keep only markers with conserved logfc sign
agingMarkers <- agingMarkers[which(agingMarkers$B_avg_logFC/agingMarkers$A_avg_logFC > 0),]
#Add a column the min abs(logFC)
agingMarkers$min_avg_logFC <- NA
for (g in rownames(agingMarkers)) {
if(agingMarkers[g,"A_avg_logFC"] < 0) {
agingMarkers[g,"min_avg_logFC"] <- max(agingMarkers[g,c("A_avg_logFC","B_avg_logFC")])
} else {
agingMarkers[g,"min_avg_logFC"] <- min(agingMarkers[g,c("A_avg_logFC","B_avg_logFC")])
}
}
agingMarkers$Cluster <- paste(cluster,"_Old_up",sep ="")
agingMarkers$Cluster[which(agingMarkers$min_avg_logFC < 0)] <- paste(cluster,"_Old_down",sep ="")
agingMarkers$Gene <- rownames(agingMarkers)
agingMarkers <- agingMarkers[order(agingMarkers$min_avg_logFC),]
if (filterOnPadj) {
agingMarkers <- agingMarkers[which(agingMarkers$A_p_val_adj < 0.05 & agingMarkers$B_p_val_adj < 0.05),]
}
return(agingMarkers)
}
Idents(object = hspc.combined) <- hspc.combined@meta.data$numclust
# DEG with agning analysis in cluster pB (number 12) does not identified agning markers because on around 40 cells it contains only 4 young cells and we cannot use FindConservedMarkers because there is less than 3 young cells belonging to pB in one batch
tryCatch(
expr = {
FindAgingMarkers4("pB",hspc.combined,logfc.threshold = 0.25,filterOnPadj = F)
},
error = function(e){
print(e)
},
warning = function(w){
},
finally = {
}
)
## Testing group B: (pB_Old) vs (pB_Young)
## <simpleError in FindMarkers.default(object = data.use, slot = data.slot, counts = counts, cells.1 = ident.1, cells.2 = ident.2, features = features, reduction = reduction, logfc.threshold = logfc.threshold, test.use = test.use, min.pct = min.pct, min.diff.pct = min.diff.pct, verbose = verbose, only.pos = only.pos, max.cells.per.ident = max.cells.per.ident, random.seed = random.seed, latent.vars = latent.vars, min.cells.feature = min.cells.feature, min.cells.group = min.cells.group, pseudocount.use = pseudocount.use, ...): Cell group 2 has fewer than 3 cells>
#We perform the DEG with aging analysis for the others:
agingMarkersPerClust <- lapply(clustersOrderFig1[-12],
FindAgingMarkers4,hspc.combined,logfc.threshold = 0.25,filterOnPadj = F)
## Testing group B: (tgf_Old) vs (tgf_Young)
## Testing group A: (tgf_Old) vs (tgf_Young)
## Testing group B: (np1_Old) vs (np1_Young)
## Testing group A: (np1_Old) vs (np1_Young)
## Testing group B: (np2_Old) vs (np2_Young)
## Testing group A: (np2_Old) vs (np2_Young)
## Testing group B: (ifn_Old) vs (ifn_Young)
## Testing group A: (ifn_Old) vs (ifn_Young)
## Testing group B: (np3_Old) vs (np3_Young)
## Testing group A: (np3_Old) vs (np3_Young)
## Testing group B: (np4_Old) vs (np4_Young)
## Testing group A: (np4_Old) vs (np4_Young)
## Testing group B: (diff_Old) vs (diff_Young)
## Testing group A: (diff_Old) vs (diff_Young)
## Testing group B: (rep_Old) vs (rep_Young)
## Testing group A: (rep_Old) vs (rep_Young)
## Testing group B: (div_Old) vs (div_Young)
## Testing group A: (div_Old) vs (div_Young)
## Testing group B: (pMk_Old) vs (pMk_Young)
## Testing group A: (pMk_Old) vs (pMk_Young)
## Testing group B: (pT_Old) vs (pT_Young)
## Testing group A: (pT_Old) vs (pT_Young)
## Testing group B: (pEr_Old) vs (pEr_Young)
## Testing group A: (pEr_Old) vs (pEr_Young)
## Testing group B: (pNeu_Old) vs (pNeu_Young)
## Testing group A: (pNeu_Old) vs (pNeu_Young)
## Testing group B: (pMast_Old) vs (pMast_Young)
## Testing group A: (pMast_Old) vs (pMast_Young)
agingMarkersPerClustTable <- agingMarkersPerClust[[1]]
for (cm in c(2:(length(agingMarkersPerClust)))) {
agingMarkersPerClustTable <- rbind(agingMarkersPerClustTable,agingMarkersPerClust[[cm]])
}
# Filter non significant markers on combined p value
agingMarkersPerClustTable <- agingMarkersPerClustTable[which(agingMarkersPerClustTable$minimump_p_val<0.05),]
colS5 <- c("Gene","Cluster","Sens","minimump_p_val","max_pval","min_avg_logFC",
"A_p_val_adj","A_p_val","A_avg_logFC","A_pct.1","A_pct.2",
"B_p_val_adj","B_p_val","B_avg_logFC","B_pct.1","B_pct.2")
agingMarkersPerClustTable$Sens <- "down in Old vs Young"
agingMarkersPerClustTable$Sens[which(agingMarkersPerClustTable$min_avg_logFC > 0)] <- "up in Old vs Young"
agingMarkersPerClustTable$Cluster <- str_split_fixed(agingMarkersPerClustTable$Cluster,pattern = "_",n = 2)[,1]
agingMarkersPerClustTable$Cluster <- factor(agingMarkersPerClustTable$Cluster,levels = rev(clustersOrderFig1))
agingMarkersPerClustTable <- agingMarkersPerClustTable[order(agingMarkersPerClustTable$Cluster,agingMarkersPerClustTable$Sens,-abs(agingMarkersPerClustTable$min_avg_logFC)),]
agingMarkersPerClustTable <- agingMarkersPerClustTable[,colS5]
WriteXLS::WriteXLS(x =agingMarkersPerClustTable,"paperTables/agingMarkersPerClustTableS5.xlsx",row.names = F)
Multiple heatmaps for the top aging markers (|min_avg_log_fold_change| > 0.5 an combined p_val < 0.05) per clusters, primed clusters gathered together.
# select the top aging markers
markersOld <- as.character(unique(agingMarkersPerClustTable[which(agingMarkersPerClustTable$min_avg_logFC >0.5),"Gene"]))
markersYoung <- as.character(unique(agingMarkersPerClustTable[which(agingMarkersPerClustTable$min_avg_logFC < -0.5),"Gene"]))
# get the desired order of the cells for the heatmap (per cluster and per age)
orderCellsTable <- hspc.combined@meta.data[,c("numclust","AGE")]
orderCellsTable <- orderCellsTable[order(orderCellsTable$numclust,orderCellsTable$AGE),]
# heatmap are build on the scaled data for the selected markers
hspc.combined <- ScaleData(hspc.combined,features =unique(c(markersOld,markersYoung)))
## Centering and scaling data matrix
dataOld <- GetAssayData(hspc.combined,slot="scale.data")[unique(c(markersOld)),rownames(orderCellsTable)]
dataYoung <- GetAssayData(hspc.combined,slot="scale.data")[unique(c(markersYoung)),rownames(orderCellsTable)]
# get the markers order by hierarchical clustering of the scaled data (first positive markers then negative) use of UPGMA method
ordOld <- hclust( dist(dataOld, method = "euclidean"),method = "average")$order
ordYoung <- hclust( dist(dataYoung, method = "euclidean"),method = "average")$order
#Group primed clusters
hspc.combinedForDoHeatmap <- hspc.combined
levels(hspc.combinedForDoHeatmap$AGE) <- c("Old","Young")
Idents(hspc.combinedForDoHeatmap)[which(Idents(hspc.combinedForDoHeatmap) %in% c("pT","pEr","pMk","pNeu","pMast","pB"))] <- "pMk"
new.ident.name0 = rev(c("tgf","np1","np2","ifn","np3","np4","diff","rep","div","primed"))
names(x = new.ident.name0) <- rev(levels(x = hspc.combinedForDoHeatmap))
hspc.combinedForDoHeatmap <- RenameIdents(object = hspc.combinedForDoHeatmap,new.ident.name0 )
hspc.combinedForDoHeatmap$Cluster <- Idents(hspc.combinedForDoHeatmap)
#Rename for ordering
#Cluster need to be in the alphabetic order tp have the correct order with the DoMultipleHeatmap function
new.ident = paste(rev(LETTERS[1:length(unique(Idents(hspc.combinedForDoHeatmap)))]),sep = "_",levels(hspc.combinedForDoHeatmap))
names(x = new.ident) <- levels(x = hspc.combinedForDoHeatmap)
hspc.combinedForDoHeatmap <- RenameIdents(object = hspc.combinedForDoHeatmap,new.ident )
hspc.combinedForDoHeatmap$Cluster <- Idents(hspc.combinedForDoHeatmap)
colorHMDE1 <- c(colorClusters[c(1:length(levels(x = hspc.combinedForDoHeatmap))-1)],"#e9c39b")
# Load the code for the DoMultiBarHeatmap
#source("../R_src/DoMultipleBarHeatmap.R")
MultiBarHeatmap1 <- DoMultiBarHeatmap(hspc.combinedForDoHeatmap,
assay = 'RNA',
features = c(unique(markersYoung)[ordYoung],unique(markersOld)[ordOld]),
group.by='Cluster',
additional.group.by = 'AGE',
group.col = colorHMDE1) +
scale_color_manual(values = colorHMDE1) + theme(axis.text.y = element_text(size = 10,face = "italic")) + guides(color = FALSE)
MultiBarHeatmap1
dir.create("paperImages/png/")
## Warning in dir.create("paperImages/png/"): 'paperImages/png' already exists
png("paperImages/png/fig3A_DEGAgingHeatmap.png",units = "in",height = 9 , width = 14,res = 220)
MultiBarHeatmap1
dev.off()
## png
## 2
MultiBarHeatmap of top DEG upon aging only for primed clusters.
hspc.combinedForDoHeatmap <- hspc.combined
levels(hspc.combinedForDoHeatmap$AGE) <- c("Old","Young")
hspc.combinedForDoHeatmap$Cluster <- hspc.combinedForDoHeatmap$numclust
Idents(hspc.combinedForDoHeatmap) <- hspc.combinedForDoHeatmap$Cluster
hspc.combinedForDoHeatmap$Cluster <- factor(hspc.combinedForDoHeatmap$Cluster,levels = unique(hspc.combinedForDoHeatmap$Cluster))
hspc.combinedForDoHeatmap <- subset(hspc.combinedForDoHeatmap,ident =clustersOrderFig1[c(10:15)])
#renaming cluster name same as above
new.ident = paste(LETTERS[1:length(unique(Idents(hspc.combinedForDoHeatmap)))],sep = "_",levels(hspc.combinedForDoHeatmap))
names(x = new.ident) <- levels(x = hspc.combinedForDoHeatmap)
hspc.combinedForDoHeatmap <- RenameIdents(object = hspc.combinedForDoHeatmap,new.ident )
hspc.combinedForDoHeatmap$Cluster <- Idents(hspc.combinedForDoHeatmap)
colorHMDE2 <- colorClusters[c(10:15)]
MultiBarHeatmap2 <- DoMultiBarHeatmap(hspc.combinedForDoHeatmap,
assay = 'RNA',
features = unique(c(markersYoung[ordYoung],markersOld[ordOld])),
group.by='Cluster',
additional.group.by = 'AGE',
group.col = colorHMDE2) + theme(axis.text.y = element_text(size = 8,
face = "italic"))+ guides(color = FALSE)
MultiBarHeatmap2
png("paperImages/png/figS4_DEGAgingHeatmap.png",width = 8, height = 9,res = 220, units = "in")
MultiBarHeatmap2
dev.off()
## png
## 2
We compare LFC global and LFC per clusters. We put a 0 global log fold change for all aging cluster markers not recovered in global aging markers (eg with opposite sign of log fold change depending of the platform)
agingMarkersPerClustTable$global_avg_logFC <- 0
globalRecoveredMarkers <- agingMarkersPerClustTable$Gene[which(agingMarkersPerClustTable$Gene %in% agingGlobal$Gene)]
for (g in globalRecoveredMarkers) {
agingMarkersPerClustTable[which(agingMarkersPerClustTable$Gene == g),'global_avg_logFC'] <- agingGlobal[g,'min_avg_logFC']
}
agingMarkersPerClustTable$Cluster <- factor(agingMarkersPerClustTable$Cluster,levels = rev(clustersOrderFig1))
fmt_dcimals <- function(decimals=1){
function(x) format(x,nsmall = decimals,scientific = FALSE)
}
plot <- ggplot(data = agingMarkersPerClustTable) +
aes(y = min_avg_logFC,x=global_avg_logFC) +
geom_point(size = 0.5) +
scale_x_continuous(breaks = c(-1,0,1)) +
facet_wrap(~Cluster,ncol = 5,drop = T)+
theme(legend.position = "bottom") +
xlab("Global logFC")+
ylab("Cluster logFC") +
theme(axis.text = element_text(size = 10))
plot
svg("paperImages/svg/figS5_LFCperClustVsGlobal.svg",height = 5,width = 7)
plot
dev.off()
## png
## 2
Enrichment analysis of aging markers per cluster gathered by age (with gprofiler using no hierarching filter). We write the results in the suppelementary table 6.
# Rearrange table to use gProfileAnalysis homemade function
agingMarkersPerClustTable2 <- agingMarkersPerClustTable
agingMarkersPerClustTable2$Cluster2 <- agingMarkersPerClustTable2$Cluster
agingMarkersPerClustTable2$Cluster <- gsub(x = agingMarkersPerClustTable2$Sens,
pattern = " ",replacement = "_")
gprofiler_aging_results <- suppressMessages(gProfileAnalysis(deg_clust = agingMarkersPerClustTable2,
outdir = paste("./paperTables/gProfileR/agingMarkersCombinedTestGlobalEnrich", sep =""),
background = rownames(hspc.combined),
hier_filtering= "none",
colors = hue_pal()(length(unique(agingMarkersPerClustTable2$Cluster)))))
gprofiler_aging_results <-list(Old_down = gprofiler_aging_results$down_in_Old_vs_Young[,col],
Old_up = gprofiler_aging_results$up_in_Old_vs_Young[,col])
WriteXLS::WriteXLS(x =gprofiler_aging_results,"paperTables/agingEnrichTableS6.xlsx")
Computation of expression score for some aging feature selected from enrichment analysis.
trMisRegInCancer <- gprofiler_aging_results$Old_up[which(gprofiler_aging_results$Old_up$term.name == "Transcriptional misregulation in cancer"),"intersection"]
trMisRegInCancer <- firstup(str_split(tolower(trMisRegInCancer),",")[[1]])
hspc.combined <- scoreCells3(hspc.combined,signature = trMisRegInCancer,sigName = "TMC",outdir = NULL)
hemostasis <- gprofiler_aging_results$Old_up[which(gprofiler_aging_results$Old_up$term.name == "Hemostasis"),"intersection"]
hemostasis <- firstup(str_split(tolower(hemostasis),",")[[1]])
hspc.combined <- scoreCells3(hspc.combined,signature = hemostasis,sigName = "HEM",outdir = NULL)
CAM <- gprofiler_aging_results$Old_up[which(gprofiler_aging_results$Old_up$term.name == "Cell adhesion molecules (CAMs)" ),"intersection"]
CAM <- firstup(str_split(tolower(CAM),",")[[1]])
CAM <- c(paste0(str_split_fixed(CAM,"-",2)[,1],"-", firstup(str_split_fixed(CAM,"-",2)[,2])))
CAM[which(endsWith(CAM,"-"))] <- str_split_fixed(string = CAM[which(endsWith(CAM,"-"))],
pattern = "-",n = 2)[,1]
hspc.combined <- scoreCells3(hspc.combined,signature = CAM,sigName = "CAM",outdir = NULL)
hemDev <- gprofiler_aging_results$Old_down[which(gprofiler_aging_results$Old_down$term.name == "hematopoietic or lymphoid organ development"),"intersection"]
hemDev <- firstup(str_split(tolower(hemDev),",")[[1]])
hspc.combined <- scoreCells3(hspc.combined,signature = hemDev,sigName = "HLOD",outdir = NULL)
MHC <- gprofiler_aging_results$Old_up[which(gprofiler_aging_results$Old_up$term.name == "MHC protein complex"),"intersection"]
MHC <- firstup(str_split(tolower(MHC),",")[[1]])
MHC <- c(paste0(str_split_fixed(MHC,"-",2)[,1],"-", firstup(str_split_fixed(MHC,"-",2)[,2])))
MHC[which(endsWith(MHC,"-"))] <- str_split_fixed(string = MHC[which(endsWith(MHC,"-"))],
pattern = "-",n = 2)[,1]
hspc.combined <- scoreCells3(hspc.combined,signature = MHC,sigName = "MHC",outdir = NULL)
Combined violin plots of these scores per clusters and per ages and vlnPlot of some selected aging markers. We also make some violin plots for aging markers related to specific clusters.
vlnPlotAgeCol <- function(feature,seurat,noClustLab = T,italic = T, legend = F) {
plot <- VlnPlot(object = seurat,
features = feature,
group.by = "numclust",
pt.size = 0,
split.by = "AGE",
ncol = 1,
cols = hue_pal()(2)[c(1,2)]
) +
xlab(label = NULL) +
ylab(label = NULL) +
theme(axis.text.x = element_text(angle = 0, hjust = 0.5,size =12),
plot.title = element_text(size = 18))
if(feature == 'CAM') #For a better visibility we arrange the scale for CAM signature
plot <- plot + scale_y_continuous(breaks = c(-0.5,0.5,1.5))
if(noClustLab) {
plot <-plot + theme(axis.text.y = element_blank())
}
if(italic) {
plot <- plot + labs(title = bquote(~italic(.(feature))))
}
if(!legend) {
plot <- plot + NoLegend()
}
plot <- plot + coord_flip()
return(plot)
}
selectedMarkers <- c("Tcrg-C4","Mcpt8","Fcer1a","Alcam","Cdkn1a","Nr4a1")
vlnGene1 <- ggplotGrob(vlnPlotAgeCol(feature = selectedMarkers[1],hspc.combined,noClustLab = F) + xlab(label = " \n "))
vlnListGenes <- lapply(selectedMarkers[-1], vlnPlotAgeCol,seurat = hspc.combined)
vlnListGenes <- lapply(vlnListGenes,ggplotGrob)
selectedSig <- c("HLOD","HEM","CAM","MHC","TMC")
vlnSig1 <- ggplotGrob(vlnPlotAgeCol(feature = selectedSig[1],hspc.combined,noClustLab = F,italic = F))
vlnListSigs <- lapply(selectedSig[-1], vlnPlotAgeCol,seurat = hspc.combined,italic = F)
## Scale for 'y' is already present. Adding another scale for 'y', which
## will replace the existing scale.
vlnListSigs <- lapply(vlnListSigs,ggplotGrob)
grobSigs <- arrangeGrob(cbind(vlnSig1,
vlnListSigs[[1]],vlnListSigs[[2]],
vlnListSigs[[3]],vlnListSigs[[4]],
vlnGene1,
vlnListGenes[[1]],
vlnListGenes[[2]],vlnListGenes[[3]],
vlnListGenes[[4]],vlnListGenes[[5]],
size = "last"))
legend3bc <- g_legend(vlnPlotAgeCol(feature = "Mcpt8",hspc.combined,legend = T)+
guides(colour = guide_legend(override.aes = list(size=10)),"horizontal"))
grid.arrange(grobSigs,
blank,
legend3bc, heights = c(10,1,1))
png("paperImages/png/fig3BC_vlns.png",width = 15, height = 6,units = "in",res = 300)
grid.arrange(grobSigs,
blank,
legend3bc, heights = c(10,1,1))
dev.off()
## png
## 2
svg("paperImages/svg/fig3BC_vlns.svg",width = 15, height = 6)
grid.arrange(grobSigs,
blank,
legend3bc, heights = c(10,1,1))
dev.off()
## png
## 2
Pseudotime ordering of young and old HSPCs
Pseudotime ordering consists in computationally order cells along trajectories,allowing the unbiased study of cellular dynamic processes. In our study we use Monocle 2 (Qiu et al. 2017) to order HSPCs along early hematopoiesis differenciation.
We used the integrated matrix of Seurat workflow as input of Monocle. Primed B cluster are excluded from this analysis because of its high distance from the other cells in the UMAP. This exclusion is also supported by the fact that monocle 2 put the pB cells away from the trajectory of old cells ordered alone (see below supplementary figure 6).
Pseudotime ordering was performed in the snakemake workflow. We load the results.
monocle <- readRDS("../output/MonocleIntegrationWoPb/monocleWithCounts.rds")
#Need to update the object (new version of MonocleIntegration)
fData(monocle) <- data.frame(gene_short_name = rownames(Biobase::exprs(monocle)))
rownames(fData(monocle)) <- fData(monocle)$gene_short_name
monocle@expressionFamily <- negbinomial.size()
pData(monocle)$predicted <- factor(monocle$predicted,levels = c("LTHSC","STHSC","MPP2","MPP3"))
pData(monocle)$phases <- mapvalues(x=pData(monocle)$phases,from = c("G1_G0","S","G2_M"),to =c("G1/G0","S","G2/M") )
pData(monocle)$AGE <- factor(monocle$AGE,levels = c("Young","Old"))
pData(monocle)$Cluster <- hspc.combined$numclust[rownames(pData(monocle))]
pData(monocle)$numclust <- hspc.combined$numclust[rownames(pData(monocle))]
Plotting monocle States
monocle$monocleState <- monocle$State
colorTrajStateFinal <- c("#999999", "#E69F00", "#56B4E9", "#009E73","#F0E442")
colorTrajStateFinal <- colorTrajStateFinal[c(1,5,4,2,3)]
ddrtreeState <- plot_cell_trajectory(monocle,theta = 180,show_branch_points = F) + scale_color_manual(values = colorTrajStateFinal)
ddrtreeState
svg("paperImages/svg/fig4A_ddrtreeState.svg",height = 5,width = 5)
ddrtreeState + guides(colour = guide_legend(override.aes = list(size=5))) +
theme(legend.text = element_text(size = 16),
axis.title = element_text(size = 18))
dev.off()
## png
## 2
Plotting cell type proportion by monocle states to get the departure of the trajectory.
getStateCt <- function(monocle,splitByAge = F,coord_flip = T) {
stateAge <- ddply(pData(monocle),~AGE + State + predicted,nrow)
stateAge$predicted <- factor(stateAge$predicted, levels = c("MPP3","MPP2","STHSC","LTHSC"))
if (!splitByAge) {
stateAGE <- ggplot(data.frame(rev(stateAge)), aes(fill = predicted,y = V1, x=State)) +
geom_bar( stat="identity", position="fill")+
scale_y_continuous(name = "HSPC subtypes (%)", labels = c(0,25,50,75,100))+
ylab(label = "")+xlab(label = "") +
scale_fill_manual(breaks = c("LTHSC","STHSC","MPP2","MPP3"),values = rev(colorCellType)) +
theme(legend.title=element_blank(),
legend.position = "top",
axis.title.x = element_text(size = 12))+
guides(fill =guide_legend(ncol=2,nrow=2,byrow=TRUE))
}else {
stateAGE <- ggplot(data.frame(rev(stateAge)), aes(fill = predicted,y = V1, x=State,alpha = AGE)) +
geom_bar( stat="identity", position="fill")+
scale_y_continuous(name = "HSPC subtypes (%)", labels = c(0,25,50,75,100))+
ylab(label = "")+xlab(label = "") +
scale_alpha_manual(values = c(0.5,1)) +
scale_fill_manual(breaks = c("LTHSC","STHSC","MPP2","MPP3"),values = rev(colorCellType)) +
theme(legend.title=element_blank(),
legend.position = "top",
axis.title.x = element_text(size = 12)) +
guides(colour =guide_legend(ncol=2,nrow=2,byrow=TRUE))
}
if(coord_flip) {
stateAGE <- stateAGE + coord_flip()
}
return(stateAGE)
}
stateCt <- getStateCt(monocle)
stateCt
svg("paperImages/svg/fig4B_ctStateBp.svg",height = 3,width = 3.5)
stateCt
dev.off()
## png
## 2
Plot cell pseudotime values in the trajectory.
ddrtreePt <- plot_cell_trajectory(monocle,color_by = "Pseudotime",theta = 180,show_branch_points = F)
ddrtreePt <- ddrtreePt + viridis::scale_color_viridis(option = "D") +
theme(legend.text = element_text(size = 18),
axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_text(size = 18),
legend.key.width = unit(0.3, "in"))
ddrtreePt
svg("paperImages/svg/fig4C_ddrtreePt.svg",height = 5,width = 5)
ddrtreePt
dev.off()
## png
## 2
We identified the state markers with seurat FindAllMarkers function (default wilcox test).
# Add monocle results to seurat combined object
hspc.combined.wopB <- subset(hspc.combined, idents = "pB",invert = TRUE)
hspc.combined.wopB$Pseudotime <- pData(monocle)[rownames(hspc.combined.wopB@meta.data),"Pseudotime"]
hspc.combined.wopB$State <- pData(monocle)[rownames(hspc.combined.wopB@meta.data),"State"]
Idents(hspc.combined.wopB) <- "State"
stateMarkers <- FindAllMarkers(hspc.combined.wopB,only.pos = T,logfc.threshold= 0.25)
## Calculating cluster 1
## Calculating cluster 2
## Calculating cluster 3
## Calculating cluster 4
## Calculating cluster 5
stateMarkers <- stateMarkers[which(markers$p_val_adj < 0.05),]
State 2 share a lot of its markers with state 4. We identified markers of state 2 versus state 4. We write the results of state markers tests in supplementary table 7.
markers2vs4 <- FindMarkers(hspc.combined.wopB,ident.1 = "2",ident.2 = "4",logfc.threshold = 0.25)
markers2vs4 <- markers2vs4[which(markers2vs4$p_val_adj < 0.05),]
markers2vs4$sens <- "down in 2 vs 4"
markers2vs4$sens[which(markers2vs4$avg_logFC > 0)] <- "up in 2 vs 4"
markers2vs4$gene = rownames(markers2vs4)
markers2vs4 <- markers2vs4[,c("gene","sens","p_val_adj","p_val","avg_logFC","pct.1","pct.2")]
markers2vs4 <- markers2vs4[order(markers2vs4$sens,markers2vs4$avg_logFC,-abs(markers2vs4$avg_logFC),decreasing = c(T,F,F)),]
#write the two result tables
stateMarkers <- stateMarkers[order(stateMarkers$cluster),c("cluster","gene","p_val_adj","p_val","avg_logFC","pct.1","pct.2")]
colnames(stateMarkers)[1] <- "state"
stateMarkers <- list(All = stateMarkers,
markers2vs4 = markers2vs4)
WriteXLS::WriteXLS(x =stateMarkers,"paperTables/stateMarkersTableS7.xlsx",row.names = F)
We characterize the state with some signatures (Chambers et al. 2007) from the litterature and genes we identified as state markers. Signature score were computed in the snakemake workflow with the AddModuleScore function of Seurat.
genes <- c("Procr","Hdc","Pf4","Gata3")
# Function to pass log-normalised data by Seurat to Monocle
addSeuratSlotDataToMonocle <- function(short_names,monocle,seurat,slot = "data") {
for (g in short_names) {
pData(monocle)[, g] <- GetAssayData(seurat,slot = slot)[g,rownames(pData(monocle))]
}
return(monocle)
}
monocle <- addSeuratSlotDataToMonocle(genes,monocle,hspc.combined)
plotFeature <- function(monocle,
feature,
shortName = NULL,
threeCol = F,
colHigh = "red",
colLow = "grey80",
#sameScaleForAllSig =T,
colLowIf3col = "blue"
) {
p <- plot_cell_trajectory(monocle,
color_by = feature,
theta = 180,
show_branch_points = F,
cell_size = 0.5,
show_tree = F) +
scale_color_gradient(low =colLow, high = colHigh) +
theme(axis.ticks = element_blank(),
axis.text = element_blank(),
axis.title = element_blank(),
plot.title = element_text(size = 16))
if (threeCol) {
p <- p +
scale_color_gradient2(low = colLowIf3col, mid =colLow, high = colHigh,
breaks = c(min(pData(monocle)[,feature]),0, max(pData(monocle)[,feature])),
limits = c(min(pData(monocle)[,feature]),
max(pData(monocle)[,feature])),
labels = c(round(min(pData(monocle)[,feature]),digits = 2),
0,round(max(pData(monocle)[,feature]),digits = 2))) +
theme(legend.title = element_blank(),
legend.key.width = unit(0.4, "in"),
legend.key.height = unit(0.15, "in"),
legend.justification = "center")
}
if(is.null(shortName)) {
p <- p + labs(title = bquote(~italic(.(feature))))
} else {
p <- p + labs(title = shortName)
}
return(p)
}
trajChambersHSC <- plotFeature(monocle,feature = "HSC_Chambers",shortName = "HSC",threeCol = T)
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
trajChambersNaiveT <- plotFeature(monocle,"NaiveT_Chambers","Naive T",threeCol = T)
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
trajChambersMye <- plotFeature(monocle, "Mye_Chambers","Myeloid",threeCol = T)
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
trajChambersNer <- plotFeature(monocle, "N.er_Chambers","Erythroid",threeCol = T)
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
ProcrTraj <- plotFeature(monocle,feature = "Procr")
HdcTraj <- plotFeature(monocle,feature = "Hdc")
Gata3Traj <- plotFeature(monocle,feature = "Gata3")
Pf4Traj <- plotFeature(monocle,feature = "Pf4")
legendGene <- g_legend(Pf4Traj + scale_colour_gradient(name = "Level",
breaks = c(0,max(monocle$Pf4)),
labels = c("0","max"),
low= 'grey80',high='red'),
direction="vertical")
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
grobSig2 <- arrangeGrob(rbind(cbind(ggplotGrob(trajChambersHSC),
ggplotGrob(trajChambersNaiveT),
ggplotGrob(trajChambersMye),
ggplotGrob(trajChambersNer),size = 'last'),
cbind(ggplotGrob(ProcrTraj + NoLegend()),
ggplotGrob(Gata3Traj+ NoLegend()),
ggplotGrob(HdcTraj+ NoLegend()),
ggplotGrob(Pf4Traj + NoLegend()),size = 'last'),
size = 'last'))
grid.arrange(grobSig2,arrangeGrob(blank,legendGene,ncol = 1,heights = c(1,1)),nrow=1,widths = c(10,1))
png("paperImages/png/fig4D_testGRadient2.png",width = 12,height = 6,res = 300,units = "in")
grid.arrange(grobSig2,arrangeGrob(blank,legendGene,ncol = 1,heights = c(1,1)),nrow=1,widths = c(10,1))
dev.off()
## png
## 2
svg("paperImages/svg/fig4D_testGRadient2.svg",width = 12,height = 6)
grid.arrange(grobSig2,arrangeGrob(blank,legendGene,ncol = 1,heights = c(1,1)),nrow=1,widths = c(10,1))
dev.off()
## png
## 2
Analysis of Seurat cluster spreading along the trajecotry (4E). Analysis of state proprotion in each Seurat cluster (4F).
# Define a new order for the cluster depending of their median pseudotime value and their majority state
clusterOrder <- c("np2","np3","tgf","ifn","np1","np4","diff","pT","pNeu","pMast","div","pEr","pMk","rep")
monocle$Cluster <- factor(monocle$Cluster,levels = clusterOrder)
monocle$numclust <- factor(monocle$numclust,levels = clusterOrder)
clusterStatePlot2 <- function(monocle,splitByAge = F) {
clusterStateAge <- ddply(pData(monocle),~numclust + State + AGE,nrow)
table(clusterStateAge$numclust,clusterStateAge$State)
clusterStateAge$AGE <- factor(clusterStateAge$AGE,levels = c("Old","Young"))
if (splitByAge) {
clusterStatePlot <- ggplot(clusterStateAge,
aes(fill=factor(as.character(State),levels = rev(levels(monocle$State))),
alpha = AGE ,x=numclust,y=V1)) + geom_bar( stat="identity", position="fill")+
scale_y_continuous(name = "Monocle state (%)", labels = c(0,25,50,75,100)) +
#ylab(label = "State")+xlab(label = "") + coord_flip() +
theme(legend.title=element_blank())+
xlab(label = "") +
scale_fill_manual(breaks= levels(monocle$State),values = rev(colorTrajStateFinal)) +
scale_alpha_manual(breaks = c("Young","Old"),values = c(0.5,1)) + coord_flip() +
theme(legend.position = "bottom")
} else {
clusterStatePlot <- ggplot(clusterStateAge,
aes(fill=factor(as.character(State),
levels = rev(levels(monocle$State))),
x=numclust,y=V1)) + geom_bar( stat="identity", position="fill")+
scale_y_continuous(name = "Monocle states (%)", labels = c(0,25,50,75,100)) +
#ylab(label = "State")+xlab(label = "") + coord_flip() +
theme(legend.title=element_blank())+
xlab(label = "") +
scale_fill_manual(breaks= levels(monocle$State),values = rev(colorTrajStateFinal)) +
coord_flip() +
theme(legend.position = "bottom")
}
}
getMajClust <- function(meta,factor) {
meta$majState <- NA
for (c in unique(meta[,factor])) {
meta$majState[which(meta[,factor] == c)] <- names(which.max(table(meta$State[which(meta[,factor] == c)])))
}
return(meta)
}
getPseudotimeBoxPlot <- function(monocle,factor ="finalclust",stretched =F,splitByAge = F) {
metaData <- pData(monocle)
if(stretched) {
metaData$Pseudotime <- 100 * metaData$Pseudotime/max(metaData$Pseudotime)
}
meta <- metaData
meta <- getMajClust(meta,factor)
if(splitByAge) {
ggplot(meta,aes_string(x=factor,y="Pseudotime",fill = "majState",alpha = "AGE")) +
geom_boxplot() +
ylab(label = "Pseudotime") +
xlab(label = "") +
scale_fill_manual(values = colorTrajStateFinal[as.numeric(sort(unique(meta$majState)))])+
scale_alpha_manual(values = c(0.5,1)) +
coord_flip()
} else {
ggplot(meta,aes_string(x=factor,y="Pseudotime",fill = "majState")) +
geom_boxplot() +
ylab(label = "Pseudotime") +
xlab(label = "") +
scale_fill_manual(values = colorTrajStateFinal[as.numeric(sort(unique(meta$majState)))])+
scale_alpha_manual(values = c(0.5,1)) +
coord_flip()
}
}
pseudotimeBoxPlot <- getPseudotimeBoxPlot(monocle,factor = "numclust")
clusterStatePlot <- clusterStatePlot2(monocle)
legClusterPt <- g_legend(clusterStatePlot)
clustStatePt <- arrangeGrob(arrangeGrob(pseudotimeBoxPlot+NoLegend(),clusterStatePlot+theme(axis.text.y = element_blank())+NoLegend(),nrow= 1,widths = c(3,2)),legClusterPt,ncol = 1,heights =c(10,1))
grid.arrange(clustStatePt)
svg("paperImages/svg/fig4EF_clustStatePt.svg",height=5,width=5)
grid.arrange(clustStatePt)
dev.off()
## png
## 2
Plot a tree view of the differentiation trajectory
treeTraj <- plot_complex_cell_trajectory(monocle,show_branch_points = F) + scale_color_manual(values = colorTrajStateFinal) + coord_flip() + scale_y_reverse() + scale_x_reverse()
treeTraj
svg("paperImages/svg/fig5A_complexTrajVert.svg",height = 5,width = 4)
treeTraj + guides(colour = guide_legend(override.aes = list(size=5))) +
theme(legend.text = element_text(size = 14))
dev.off()
## png
## 2
State proportions at each age (4H). Primed States (states 2,4,5) proportion at each age (4I).
statesAge <- ddply(pData(monocle),~AGE + State,nrow)
statesAge$AGE <- factor(statesAge$AGE, levels = c("Young","Old"))
bpStateAge1 <- ggplot(data.frame(rev(statesAge)),
aes(fill = State,y = V1, x=AGE,alpha = AGE)) +
geom_bar( stat="identity", position="fill")+
scale_fill_manual( values= colorTrajStateFinal)+
scale_alpha_manual(breaks = c("Young","Old"),values = c(1,0.5)) +
scale_y_continuous(name = "Monocle state (%)", labels = c(0,25,50,75,100))+
ylab(label = "")+xlab(label = "") +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
guides(alpha = FALSE)
statesAge <- ddply(pData(monocle)[which(pData(monocle)$State %in% c(2,4,5)),],~AGE + State,nrow)
statesAge$AGE <- factor(statesAge$AGE, levels = c("Young","Old"))
bpStateAge2 <- ggplot(data.frame(rev(statesAge)), aes(fill = State,y = V1, x=AGE,alpha = AGE)) +
geom_bar( stat="identity", position="fill")+
scale_fill_manual( values= colorTrajStateFinal[c(2,4,5)])+
scale_y_continuous(name = "Monocle state (%)", labels = c(0,25,50,75,100))+
ylab(label = "")+xlab(label = "") +
scale_alpha_manual(breaks = c("Young","Old"),values = c(1,0.5)) +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
guides(alpha = FALSE)
bpStateAgeSupp <- arrangeGrob(arrangeGrob(bpStateAge1+NoLegend(),bpStateAge2 + NoLegend(),nrow= 1),g_legend(bpStateAge1,"horizontal"),nrow = 2,heights = c(10,1))
grid.arrange(bpStateAgeSupp)
svg(file="paperImages/svg/fig4HI_bpStateAge2.svg",height = 6,width = 4)
grid.arrange(bpStateAgeSupp)
dev.off()
## png
## 2
loading results of young and old separtely ordered with monocle from seurat3 integration results
# load monocle results
monocleYoung <- readRDS("../output/young/MonocleIntegration/monocleWithCounts.rds")
monocleOld <- readRDS("../output/old/MonocleIntegration/monocleWithCounts.rds")
oldCells <- colnames(hspc.combined)[which(startsWith(colnames(hspc.combined),"old"))]
youngCells <- colnames(hspc.combined)[which(startsWith(colnames(hspc.combined),"young"))]
# reorder factors
pData(monocleYoung)$predicted <- factor(monocleYoung$predicted,levels = c("LTHSC","STHSC","MPP2","MPP3"))
pData(monocleOld)$predicted <- factor(monocleOld$predicted,levels = c("LTHSC","STHSC","MPP2","MPP3"))
# Add seurat cluster (determined on the whole data)
pData(monocleOld)[oldCells,"cluster"] <- hspc.combined@meta.data[oldCells,"numclust"]
pData(monocleYoung)[youngCells,"cluster"] <- hspc.combined@meta.data[youngCells,"numclust"]
#reorder cluster
clusterOrder <- c("pB","np2","np3","tgf","ifn","np1","np4","diff","pT","pNeu","pMast","div","pEr","pMk","rep")
monocleOld$numclust <- factor(monocleOld$cluster,levels = clusterOrder)
monocleYoung$numclust <- factor(monocleYoung$cluster,levels = clusterOrder)
# Rename states to avoid confusion
monocleYoung$State <- factor(monocleYoung$State,level = c("1","2","3","6","7","8"))
monocleOld$State <- factor(monocleOld$State,level = c("1","2","3","6","7","8"))
monocleYoung$State[which(monocleYoung$State == "1")] <- "6"
monocleYoung$State[which(monocleYoung$State == "2")] <- "7"
monocleYoung$State[which(monocleYoung$State == "3")] <- "8"
# switch state in old traj to have the same order as the two other trajectory
monocleOld$State[which(monocleOld$State == "1")] <- "6"
monocleOld$State[which(monocleOld$State == "2")] <- "8"
monocleOld$State[which(monocleOld$State == "3")] <- "7"
# Drop old levels
monocleYoung$State <- factor(monocleYoung$State,level = c("6","7","8"))
monocleOld$State <- factor(monocleOld$State,level = c("6","7","8"))
# redefine states by specifying a pB State that correspond to cells of the pB cluster in the whole dataset
monocleOld$StateWithPb <- monocleOld$State
monocleYoung$StateWithPb <- monocleYoung$State
monocleYoung$StateWithPb <- factor(monocleYoung$State,level = c(c("6","7","8","pB")))
monocleOld$StateWithPb <- factor(monocleOld$State,level = c(c("6","7","8","pB")))
monocleOld$StateWithPb[which(monocleOld$numclust == "pB")] <- "pB"
monocleYoung$StateWithPb[which(monocleYoung$numclust == "pB")] <- "pB"
# Make the plot A B & C young
ddrtreeStateYoung <- plot_cell_trajectory(monocleYoung,
theta = 180,
color_by = "StateWithPb",
show_branch_points = F,
cell_size = 0.75) +
scale_color_manual(values = colorTrajStateFinal[c(1,4,5)]) +
scale_color_manual(values = c(colorTrajStateFinal[c(1,4,5)],"#924900"))
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
stateCtYoung <- getStateCt(monocleYoung,coord_flip = F) + scale_fill_manual(breaks = c("LTHSC", "STHSC", "MPP2", "MPP3"), values = rev(colorCellType))
## Scale for 'fill' is already present. Adding another scale for 'fill',
## which will replace the existing scale.
ddrtreePtYoung <- plot_cell_trajectory(monocleYoung,
color_by = "Pseudotime",
theta = 180,
show_branch_points = F) +
viridis::scale_color_viridis(option = "D") +
theme(axis.text = element_blank(),
axis.ticks = element_blank()
)
# Make the plot A B & C old
ddrtreeStateOld <- plot_cell_trajectory(monocleOld,
theta = 180,
color_by = "StateWithPb",
show_branch_points = F,
cell_size = 0.75) +
scale_color_manual(values = c(colorTrajStateFinal[c(1,4,5)],"#924900"))
stateCtOld <- getStateCt(monocleOld,coord_flip = F) + scale_fill_manual(breaks = c("LTHSC", "STHSC", "MPP2", "MPP3"), values = rev(colorCellType)) + theme(legend.position = "bottom")
## Scale for 'fill' is already present. Adding another scale for 'fill',
## which will replace the existing scale.
ddrtreePtOld <- plot_cell_trajectory(monocleOld,color_by = "Pseudotime",theta = 180,show_branch_points = F)+ viridis::scale_color_viridis(option = "D") +
viridis::scale_color_viridis(option = "D") +
theme(axis.text = element_blank(),
axis.ticks = element_blank()
)
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
# Make C & D plot young
pseudotimeBoxPlotYoung <- getPseudotimeBoxPlot(monocleYoung,factor = "numclust") +
scale_fill_manual(breaks = c(6,7,8),
values = colorTrajStateFinal[c(1,4,5)])
## Scale for 'fill' is already present. Adding another scale for 'fill',
## which will replace the existing scale.
clusterStatePlotYoung <- clusterStatePlot2(monocleYoung)+
scale_fill_manual(breaks = c(6,7,8),
values = rev(colorTrajStateFinal[c(1,4,5)])) #rev color order because level reversed in the clusterStatePlot2 function
## Scale for 'fill' is already present. Adding another scale for 'fill',
## which will replace the existing scale.
# Make C & D plot old
pseudotimeBoxPlotOld <- getPseudotimeBoxPlot(monocleOld,factor = "numclust") +
scale_fill_manual(breaks = c(6,7,8),
values = colorTrajStateFinal[c(1,4,5)])
## Scale for 'fill' is already present. Adding another scale for 'fill',
## which will replace the existing scale.
clusterStatePlotOld <- clusterStatePlot2(monocleOld) +
scale_fill_manual(breaks = c(6,7,8),
values = rev(colorTrajStateFinal[c(1,4,5)]))
## Scale for 'fill' is already present. Adding another scale for 'fill',
## which will replace the existing scale.
Draw the figure
legClusterPt <- g_legend(ddrtreePtOld,"horizontal")
legDdrtreeState <- g_legend(ddrtreeStateYoung +
theme(legend.text = element_text(size = 16),
legend.title = element_blank()) +
guides(colour = guide_legend(override.aes = list(size=5))),
"horizontal")
legPredicted <- g_legend(stateCtYoung,"horizontal")
clusterStatePlotOldForLeg <- clusterStatePlotOld+
guides(colour = guide_legend(override.aes = list(size=legendBoxSize)))
legStateBox <- g_legend(clusterStatePlotOldForLeg,"horizontal")
youngMonocleGrob <- arrangeGrob(ddrtreeStateYoung+NoLegend(),
stateCtYoung + NoLegend(),
ddrtreePtYoung + NoLegend(),
pseudotimeBoxPlotYoung + NoLegend(),
clusterStatePlotYoung + NoLegend() + theme(axis.text.y = element_blank()),nrow=1, widths = c(2,1,2,1.75,1.25))
oldMonocleGrob <- arrangeGrob(ddrtreeStateOld+NoLegend(),
stateCtOld+NoLegend(),
ddrtreePtOld + NoLegend(),
pseudotimeBoxPlotOld+ NoLegend(),
clusterStatePlotOld+ NoLegend() + theme(axis.text.y = element_blank()),nrow = 1, widths = c(2,1,2,1.75,1.25))
gridS6 <- arrangeGrob(youngMonocleGrob,oldMonocleGrob,
arrangeGrob(legDdrtreeState,blank,legPredicted,blank,legClusterPt,blank,legStateBox,nrow=1,
widths = c(2,0.5,1.2,0.25,1.9,2,2)),
ncol = 1,
heights= c(6,6,1))
grid.arrange(gridS6)
svg("paperImages/svg/figS6_monocleByAgeNew.svg",height=7,width=14)
grid.arrange(gridS6)
dev.off()
## png
## 2
png("paperImages/png/figS6_monocleByAgeNew.png",height=7,width=14,res = 300,units = "in")
grid.arrange(gridS6)
dev.off()
## png
## 2
Analysis of Seurat cluster repartition in the trajectory
# reorder cluster depending of their median pseudotime value and their majority state
clusterOrder <- c("np2","np3","tgf","ifn","np1","np4","diff","pT","pNeu","pMast","div","pEr","pMk","rep")
monocle$Cluster <- factor(monocle$Cluster,levels = clusterOrder)
monocle$numclust <- factor(monocle$numclust,levels = clusterOrder)
# Plot all clusters in ddrtree one by one
plotClusterInTraj <- function(cluster,monocle) {
pData(monocle)[,cluster] <- "other"
pData(monocle)[which(pData(monocle)$numclust == cluster),cluster] <- cluster
pData(monocle)[,cluster] <- factor(pData(monocle)[,cluster],levels = c(cluster,"other"))
ddrtreeClust <- plot_cell_trajectory(monocle,color_by = cluster,theta = 180,show_branch_points = F,cell_size = 0.25) +
theme(axis.title = element_blank(),
axis.text = element_blank(),
axis.ticks = element_blank()) +
ggtitle(cluster) +scale_colour_manual(values = c("red","grey"))+
geom_point(aes_string(alpha=cluster,colour = cluster),size = 0.25)+
scale_alpha_manual(values = c(1,0)) + NoLegend()
return(ddrtreeClust)
}
plotClustInTrajList <- lapply(clusterOrder,plotClusterInTraj,monocle = monocle)
figS4 <- cowplot::plot_grid(plotlist = plotClustInTrajList,nrow=3)
# figS4B <- arrangeGrob(figSNB,top=grid::textGrob("B",x=0,hjust=0,gp=gpar(fontface="bold")),
# left = grid::textGrob(" Old ",x=0,hjust=0,gp=gpar(fontface="bold")))
grid.arrange(figS4)
svg("paperImages/svg/figureS4_clustInTraj.svg",height = 9,width = 14)
grid.arrange(figS4)
dev.off()
## png
## 2
png("paperImages/png/figureS4_clustInTraj.png",height = 9,width = 14,res = 300,units = "in")
grid.arrange(figS4)
dev.off()
## png
## 2
Same plots as 4EF but with a split on the age (S8 A & B).
pseudotimeBoxPlotByAge <- getPseudotimeBoxPlot(monocle,factor = "numclust",splitByAge = T)
clusterStatePlotByAge <- clusterStatePlot2(monocle,splitByAge = T) + scale_y_continuous(name = "Monocle states (%)",labels = c(0,25,50,75,100))
## Scale for 'y' is already present. Adding another scale for 'y', which
## will replace the existing scale.
legClusterPtByAge <- g_legend(clusterStatePlotByAge)
clustStatePtByAge <- arrangeGrob(arrangeGrob(pseudotimeBoxPlotByAge+NoLegend(),
clusterStatePlotByAge+theme(axis.text.y = element_blank())+NoLegend(),nrow= 1,widths = c(3,2)),legClusterPtByAge,ncol = 1,heights =c(10,1))
# grid.arrange(clustStatePtByAge)
# svg("paperImages/svg/figS8_clustStatePtByAge.svg",height=6,width=7)
# grid.arrange(clustStatePtByAge)
# dev.off()
Add a plot for branchhing age dependant localisation of each cluster (S8C)
clusterOrder <- c("np2","np3","tgf","ifn","np1","np4","diff","pT","pNeu","pMast","div","pEr","pMk","rep")
monocle$Cluster <- factor(monocle$Cluster,levels = clusterOrder)
monocle$numclust <- factor(monocle$numclust,levels = clusterOrder)
clusterStatePlotAGE2 <- function(monocle,splitByAge = F) {
clusterStateAge <- ddply(pData(monocle),~numclust + State + AGE,nrow)
clusterStateAge$AGE <- factor(clusterStateAge$AGE,levels = c("Old","Young"))
clusterStateAge$AGE <- substr(clusterStateAge$AGE, start = 1, stop = 1)
clusterStateAge$clustAGE <- paste(clusterStateAge$numclust,clusterStateAge$AGE,sep = "_")
ageOrder <- rep(c("O","Y"),14)
clusterAgeOrder <- paste(c("np2","np2","np3","np3","tgf","tgf",
"ifn","ifn","np1","np1",
"np4","np4","diff","diff",
"pT","pT","pNeu","pNeu","pMast","pMast",
"div","div","pEr","pEr",
"pMk","pMk","rep","rep"),ageOrder,sep = "_")
clusterStateAge$clustAGE <- factor(clusterStateAge$clustAGE,levels = clusterAgeOrder)
table(clusterStateAge$numclust,clusterStateAge$State)
clusterStatePlot <- ggplot(clusterStateAge,
aes(fill=factor(as.character(State),levels = c("5","4","3","2","1")),
alpha = AGE ,x=clustAGE,y=V1)) + geom_bar( stat="identity", position="fill")+
scale_y_continuous(name = "Monocle state (%)", labels = c(0,25,50,75,100)) +
#ylab(label = "State")+xlab(label = "") + coord_flip() +
theme(legend.title=element_blank())+
xlab(label = "") +
scale_fill_manual(breaks= c(1:5),values = rev(colorTrajStateFinal)) +
scale_alpha_manual(breaks = c("Young","Old"),values = c(0.5,1)) + coord_flip() +
theme(legend.position = "bottom")
}
clusterStatePlotByAge2 <- clusterStatePlotAGE2(monocle) + scale_y_continuous(name = "Monocle states (%)",labels = c(0,25,50,75,100)) + theme(axis.text.y = element_text(size = 9))
## Scale for 'y' is already present. Adding another scale for 'y', which
## will replace the existing scale.
clustStatePtByAge <- arrangeGrob(arrangeGrob(pseudotimeBoxPlotByAge+NoLegend(),clusterStatePlotByAge+theme(axis.text.y = element_blank())+NoLegend(),clusterStatePlotByAge2+NoLegend(),nrow= 1,widths = c(3,2, 2.5)),legClusterPtByAge,ncol = 1,heights =c(10,1))
grid.arrange(clustStatePtByAge)
svg("paperImages/svg/figS8_clustStatePtByAge.svg",height=6,width=10)
grid.arrange(clustStatePtByAge)
dev.off()
## png
## 2
We use SCENIC (Aibar et al. 2017) to analyse Regulons of young and old HSPCs in the differentiation trajectory. We decide to focus only on transcription factors identified as clusters markers with Seurat. We decide to exclude negative regulons because of the high drop out rate in 10X chromium data that biaises inhibition detection.
Regulons analysis of young and old HSPCs in the differentiation trajectory in state 1,2 and 3 at the first branching between monocle state 2 (conducing to primed T fate) and 3 (tconducing the second branching).
regulonsTable <- read.csv("../output/ScenicAll/AUCell/regulons_enrichment.csv",check.names = F)
rownames(regulonsTable) <- regulonsTable[,1]
# Exclude pB cells to plot regulons in Traj without pB cells
regulonsTable <- regulonsTable[rownames(pData(monocle)),]
# Keep only positive regulons
pData(monocle)[,colnames(regulonsTable)[which(endsWith(colnames(regulonsTable),suffix = "+)"))]] <- regulonsTable[rownames(pData(monocle)),colnames(regulonsTable)[which(endsWith(colnames(regulonsTable),suffix = "+)"))]]
#Load function to make branched heatmap of regulons
#source("../R_src/makeBranchedHeatmap.R")
hmYoungOldBranchT_avg <- regulons_heatmap2(monocle,
regulonsTable,
first_traj_states=c(1,2),
second_traj_states=c(1,3),
cluster=4,
legend=TRUE,
add_annotation_row = F,
showTree = F,
clusterRow =T,
gaps_row = NULL,
legend_breaks = NA,
clusterFirstData = T,
clustering_method = "average",
colorStates = colorTrajStateFinal[c(1,2,3)],
breaksAdapt = F)
svg("paperImages/svg/fig5A_regulonsFirstP_avg.svg",height = 7,width = 7)
grid.arrange(hmYoungOldBranchT_avg$gtable)
dev.off()
## png
## 2
png("paperImages/png/fig5A_regulonsFirstP_avg.png",height = 7,width = 7,units = 'in',res = 300)
grid.arrange(hmYoungOldBranchT_avg$gtable)
dev.off()
## png
## 2
Regulons analysis of young and old HSPCs in the differentiation trajectory in states 1,3,4,5 at the second branching between monocle state 4 (conducing to primed NeuMast fate) and 3 (conducing to MkEr fate).
h1Pt <- max(pData(monocle)$Pseudotime[which(pData(monocle)$State == 3)]) + max(pData(monocle)$Pseudotime[which(pData(monocle)$State == 2)])
h2Pt <- max(pData(monocle)$Pseudotime[which(pData(monocle)$State == 4)]) + max(pData(monocle)$Pseudotime[which(pData(monocle)$State == 5)])
h2FigWidth <- h2Pt*7/h1Pt
hmYoungOldBranchME_avg <- regulons_heatmap2(monocle,
regulonsTable,
first_traj_states=c(1,3,4),
second_traj_states=c(1,3,5),
cluster=3,
legend=TRUE,
add_annotation_row = F,
showTree = F,
clusterRow =T,
gaps_row = NULL,
legend_breaks = NA,
clusterFirstData = T,
clustering_method = "average",
colorStates = colorTrajStateFinal[c(1,3,4,5)])
svg("paperImages/svg/fig5C_regulonsMyeP_avg.svg",height = 7,width = h2FigWidth)
grid.arrange(hmYoungOldBranchME_avg$gtable)
dev.off()
## png
## 2
png("paperImages/png/fig5C_regulonsMyeP_avg.png",height = 7,width = h2FigWidth,units = 'in',res = 300)
grid.arrange(hmYoungOldBranchME_avg$gtable)
dev.off()
## png
## 2
Make the regulons table excluding negative ones and write it in supplementary table 8.
regulons <- RJSONIO::fromJSON("../output/ScenicAll//cis_target/regulons.json")
# keep only positive regulons
regulons <- regulons[which(endsWith(names(regulons),"(+)"))]
# Interaction table of positive regulons and their gene targets for the analysis in cytoscape
regulonsTable<- matrix(ncol = 2)
for (i in names(regulons)) {
add <- cbind(rep(i,length(regulons[[i]])),regulons[[i]])
regulonsTable<- rbind(regulonsTable,add)
}
regulonsTable<- regulonsTable[-1,]
colnames(regulonsTable) <- c("regulon","gene")
regulonsTable <- as.data.frame(regulonsTable)
# keep only the name of the tf for the regulons
regulonsTable$regulon <- str_split_fixed(regulonsTable$regulon,"\\(",n=2)[,1]
dir.create("paperTables/Cytoscape",showWarnings = F,recursive = T)
write.table(regulonsTable,file = "paperTables/Cytoscape/regulonGene.csv",sep = "\t",col.names = T,row.names = F)
#Regulons table for the paper
regulonsTable2 <- data.frame(regulon = names(regulons), target = rep(NA,length(regulons)))
for (i in c(1:length(regulonsTable2$regulon))){
regulonsTable2[i,"target"] <- paste(regulons[[regulonsTable2[i,"regulon"]]],collapse = ",")
}
#write.table(regulonsTable2,file = "paperTables/regulonsTableTableS8.csv",sep = "\t",col.names = T,row.names = F)
WriteXLS::WriteXLS(x =regulonsTable2,"paperTables/regulonsTableTableS8.xlsx",row.names = F)
Plot the score for Gata3 and Klf4 regulons toward T Lymphocyte fate
plotScoreRegulon <- function(monocle,
traj,
regulon,
se = T,
ptBranching =max(pData(monocle)[which(pData(monocle)$State ==1),"Pseudotime"])) {
plot <- ggplot(pData(monocle)[which(pData(monocle)$State %in% traj),] ,aes(color = AGE,y = get(regulon), x=Pseudotime)) +
geom_smooth(se = se) +
ylab("AUCell score") +
scale_color_manual(values = hue_pal()(2)[c(2,1)]) +
geom_vline(aes(xintercept=ptBranching), color="black", linetype="dashed", size=1)
return(plot)
}
regGata3 <- plotScoreRegulon(monocle,c(1,2),"Gata3(+)")
regKlf4 <- plotScoreRegulon(monocle,c(1,2),"Klf4(+)")
legendRegulonScore <- g_legend(regKlf4 +theme(title = element_blank()),"vertical")
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
regKlf4 <- regKlf4+NoLegend()+ ggtitle("Klf4")+
theme(plot.title = element_text(size = 16))
regGata3 <- regGata3+NoLegend()+ggtitle("Gata3") +
theme(plot.title = element_text(size = 16))
regScores <- arrangeGrob(regKlf4,
blank,
regGata3,
ncol = 1,
heights = c(8,2,8))
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
grid.arrange(regScores)
svg("paperImages/svg/fig5D_Gata3Klf4reg.svg",height=5,width=4)
grid.arrange(regScores,blank,legendRegulonScore,nrow=1,widths = c(5,0.2,2))
dev.off()
## png
## 2
Inspect proportion of Gata3 targets in pT cluster markers an Klf4 targets in tgf cluster markers
Gata3TargetsInpT <- regulons$`Gata3(+)`[which(regulons$`Gata3(+)` %in% markers[which(markers$cluster == "pT"),"gene"])]
Klf4TargetsIntgf <- regulons$`Klf4(+)`[which(regulons$`Klf4(+)` %in% markers[which(markers$cluster == "tgf"),"gene"])]
# proportion of markers in the targets
length(Gata3TargetsInpT)/length(regulons$`Gata3(+)`)
## [1] 0.2666667
length(Klf4TargetsIntgf)/length(regulons$`Klf4(+)`)
## [1] 0.625
Barplot of targets number per regulons
regulonTargetCount <- ddply(as.data.frame(regulonsTable),~regulon,nrow)
regulonTargetCountBp <- ggplot(regulonTargetCount[,c("regulon","V1")], aes(y = V1, x=regulon)) +
geom_bar(stat = "identity",position = "dodge")+
theme(axis.text.x = element_text(angle = 45, hjust = 1))+
xlab(label = "")+
ylab(label = "number of targets")+
scale_y_log10() +
theme(legend.title=element_blank(),legend.position = "bottom")
regulonTargetCountBp
svg("paperImages/svg/figS8_regulonTargetNumber.svg", height = 6,width = 12)
regulonTargetCountBp
dev.off()
## png
## 2
HSPC subtypes and Cell cycle phases analysis of young and old cells along the differentiation trajectory
Proportion of cell cycle phases assigned with cyclone in young and old HSPCs.
phasesAge <- ddply(hspc.combined@meta.data,~AGE + phases,nrow)
phasesAge$AGE <- factor(phasesAge$AGE, levels = c("Young","Old"))
phasesAge$predicted <- "HSPC"
phasesAGE <- ggplot(data.frame(rev(phasesAge)), aes(fill = phases,y = V1, x=AGE)) +
facet_wrap(~predicted,nrow = 1) +
geom_bar( stat="identity", position="fill")+
scale_fill_manual( values= colorPhases)+
scale_y_continuous(name = "Cell cycle phases (%)", labels = c(0,25,50,75,100))+
ylab(label = "")+xlab(label = "") + theme(axis.text.x = element_text(angle = 45, hjust = 1))
#phasesAGE
# svg("paperImages/svg/fig6B_phaseAge.svg",height = 5,width = 3)
# phasesAGE
# dev.off()
#
# remove(phasesAGE)
Proportion of cell cycle phases assigned with cyclone in young and old HSPCs by subtypes.
# Per age
phaseCtAge <- ddply(hspc.combined@meta.data,~AGE + predicted + phases,nrow)
phaseCtAge$AGE <- factor(phaseCtAge$AGE, levels = c("Young","Old"))
ctPhasesAGE <- ggplot(data.frame(rev(phaseCtAge)), aes(fill = phases,y = V1, x=AGE)) +
facet_wrap(~predicted,nrow = 1) +
geom_bar( stat="identity", position="fill")+
scale_fill_manual( values= colorPhases)+
scale_y_continuous(name = "Cell cycle phases (%)", labels = c(0,25,50,75,100))+
ylab(label = "")+xlab(label = "")+ theme(axis.text.x = element_text(angle = 45, hjust = 1))
#ctPhasesAGE
legPhases <- g_legend(ctPhasesAGE,"horizontal")
grobPhaseAge <- arrangeGrob(arrangeGrob(phasesAGE+ NoLegend(),ctPhasesAGE + NoLegend(),nrow=1,widths = c(1.5,4)),
legPhases,ncol = 1,heights = c(10,1))
grid.arrange(grobPhaseAge)
svg("paperImages/svg/fig6C_phaseAgeCt.svg",height = 4,width = 6)
grid.arrange(grobPhaseAge)
dev.off()
## png
## 2
png("paperImages/png/fig6C_phaseAgeCt.png",height = 4,width = 6,res = 300,units = "in")
grid.arrange(grobPhaseAge)
dev.off()
## png
## 2
remove(ctPhasesAGE)
Chi2 independence test for several couple of variable and associated p-values.
### Do the tests
### With All data
phases_AGE_chi2 <- chisq.test(table(hspc.combined@meta.data$phases,hspc.combined@meta.data$AGE))
phases_ct_chi2 <- chisq.test(table(hspc.combined@meta.data$phases,hspc.combined@meta.data$predicted))
chitest_all <- data.frame(AGE = phases_AGE_chi2$p.value,cell_type = phases_ct_chi2$p.value)
rownames(chitest_all) <- "Pearson's Chi-squared test p-value"
### By cell type
chisqByCellType <- function(cellType,table) {
tableST <- table[which(table$predicted==cellType),]
result <- chisq.test(table(tableST$phases,tableST$AGE))
print(cellType)
print(result)
return(cellType = result$p.value)
}
chiTest_phase_AGE_ByCT <- lapply(c("LTHSC","STHSC","MPP2","MPP3"),FUN = chisqByCellType,hspc.combined@meta.data)
## [1] "LTHSC"
##
## Pearson's Chi-squared test
##
## data: table(tableST$phases, tableST$AGE)
## X-squared = 6.6329, df = 2, p-value = 0.03628
##
## [1] "STHSC"
##
## Pearson's Chi-squared test
##
## data: table(tableST$phases, tableST$AGE)
## X-squared = 2.5772, df = 2, p-value = 0.2757
##
## [1] "MPP2"
##
## Pearson's Chi-squared test
##
## data: table(tableST$phases, tableST$AGE)
## X-squared = 10.202, df = 2, p-value = 0.00609
##
## [1] "MPP3"
##
## Pearson's Chi-squared test
##
## data: table(tableST$phases, tableST$AGE)
## X-squared = 0.54738, df = 2, p-value = 0.7606
names(chiTest_phase_AGE_ByCT) <- c("LTHSC","STHSC","MPP2","MPP3")
chiTest_phase_AGE_ByCT <- as.data.frame(chiTest_phase_AGE_ByCT)
rownames(chiTest_phase_AGE_ByCT) <- "Pearson's Chi-squared test p-value"
### By Age
chisqByAGE <- function(AGE,table) {
tableST <- table[which(table$AGE==AGE),]
result <- chisq.test(table(tableST$phases,tableST$predicted))
print(AGE)
print(result)
return(AGE = result$p.value)
}
chiTest_phase_ct_ByAGE <- lapply(c("Young","Old"),FUN = chisqByAGE,hspc.combined@meta.data)
## [1] "Young"
##
## Pearson's Chi-squared test
##
## data: table(tableST$phases, tableST$predicted)
## X-squared = 345.41, df = 6, p-value < 2.2e-16
##
## [1] "Old"
##
## Pearson's Chi-squared test
##
## data: table(tableST$phases, tableST$predicted)
## X-squared = 170.33, df = 6, p-value < 2.2e-16
names(chiTest_phase_ct_ByAGE) <- c("Young","Old")
chiTest_phase_ct_ByAGE <- as.data.frame(chiTest_phase_ct_ByAGE)
rownames(chiTest_phase_ct_ByAGE) <- "Pearson's Chi-squared test p-value"
Results saved in supplementary table 9.
# Adapt the format
table1 <- t(chitest_all)
table1 <- format(table1,digits = 2,scientific = T)
table2 <- t(chiTest_phase_AGE_ByCT)
table2 <- format(table2,digits = 2,scientific = T)
table3 <- t(chiTest_phase_ct_ByAGE)
table3 <- format(table3,digits = 2,scientific = T)
# Rename the row
colnames(table1) <- "Chi2 pval"
rownames(table1) <- c("Cell cycle - age","Cell cycle - cell type")
colnames(table2) <- "Chi2 pval"
rownames(table2) <- c("Cell cycle - age for LTHSC",
"Cell cycle - age for STHSC",
"Cell cycle - age for MPP2",
"Cell cycle - age for MPP3")
colnames(table3) <- "Chi2 pval"
rownames(table3) <- c("Cell cycle - cell type for Young",
"Cell cycle - cell type for Old")
# Color in red significant dependance
tableChi2 <- rbind(table1,table2,table3)
tableChi2Color <- rep("black", dim(tableChi2)[1])
tableChi2Color[which(as.numeric(tableChi2) < 0.05)] <- "red"
table6d <- tableGrob(tableChi2,theme=ttheme_default(base_size = 12,core=list(fg_params=list(col=tableChi2Color))))
WriteXLS::WriteXLS(x =as.data.frame(tableChi2),"paperTables/chi2TableS9.xlsx", row.names = T)
Quiescence and Proliferation signature (Venezia et al. 2004) analysis.
The signatures in differenciation trajectory with a 2 divergent color gradient:
trajProliferationVenezia <- plotFeature(monocle,
feature = "Mm_Proliferation_Venezia",
threeCol = T,
shortName = "Proliferation") +
theme(legend.key.width = unit(0.5,units = "in")) +
theme(axis.title = element_blank(),
legend.text = element_text(size = 14),
legend.position = "right",
legend.key.width = unit(0.2, "in"),
legend.key.height = unit(0.4, "in"),
plot.title = element_text(size = 20)
)
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
trajQuiescenceVenezia <- plotFeature(monocle,
feature = "Mm_Quiescence_Venezia",
threeCol = T,
shortName = "Quiescence") +
theme(legend.key.width = unit(0.5,units = "in")) +
theme(axis.title = element_blank(),
legend.text = element_text(size = 14),
legend.position = "right",
legend.key.width = unit(0.2, "in"),
legend.key.height = unit(0.4, "in"),
plot.title = element_text(size = 20))
## Scale for 'colour' is already present. Adding another scale for
## 'colour', which will replace the existing scale.
grobScoreInTraj <- arrangeGrob(rbind(ggplotGrob(trajQuiescenceVenezia),
ggplotGrob(trajProliferationVenezia),size = 'last'),ncol = 1)
The signature score in differentiation pseudotime.
ptBranchingT <- max(pData(monocle)[which(pData(monocle)$State ==1),"Pseudotime"])
ptBranchingM <- max(pData(monocle)[which(pData(monocle)$State ==3),"Pseudotime"])
plotScore <- function(monocle,
traj,
sig,
shortName,
se = T,
ptBranching1 = NULL,
ptBranching2 = NULL) {
plot <- ggplot(pData(monocle)[which(pData(monocle)$State %in% traj),] ,aes(color = AGE,y = get(sig), x=Pseudotime)) +
geom_smooth(se = se) +
ylab("Score")+
scale_color_manual(values = hue_pal()(2)[c(1,2)]) +
ggtitle(shortName) +
theme(legend.text = element_text(size = 18),
axis.text = element_text(size = 16),
axis.title = element_text(size = 18),
plot.title = element_text(size = 20))+
guides(colour = guide_legend(override.aes = list(size=8)))
if (!is.null(ptBranching1)) {
plot <- plot + geom_vline(aes(xintercept=ptBranching1), color="red", linetype="dashed", size=1)
}
if (!is.null(ptBranching2)) {
plot <- plot + geom_vline(aes(xintercept=ptBranching2), color="black", linetype="dashed", size=1)
}
return(plot)
}
prolifSig <- plotScore(monocle,c(1,2,3,4,5),
"Mm_Proliferation_Venezia",
"Proliferation",
ptBranching1 = ptBranchingM, ptBranching2 = ptBranchingT
)
quiescSig <- plotScore(monocle,c(1,2,3,4,5),
"Mm_Quiescence_Venezia",
"Quiescence",
ptBranching1 = ptBranchingM, ptBranching2 = ptBranchingT)
grobScoreInPt <- arrangeGrob(arrangeGrob(quiescSig + NoLegend(),prolifSig + NoLegend(),ncol =1),
g_legend(quiescSig + theme(legend.title = element_blank()),"vertical"),nrow = 1,
widths = c(10,3))
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
png("paperImages/png/fig6C_signature.png",res = 300,width = 8,height = 5,units = 'in')
grid.arrange(grobScoreInTraj,grobScoreInPt,nrow=1,widths = c(1.5,2))
dev.off()
## png
## 2
Density of young and old cells per fate (T,NM and ME) along pseudotime.
pData(monocle)$AGE <- factor(pData(monocle)$AGE,levels = c("Old","Young"))
densityFateT <- ggplot(pData(monocle)[which(pData(monocle)$State %in% c(1,2)),],aes(x=Pseudotime,fill = AGE)) +
geom_density(alpha=0.5) +
geom_vline(xintercept = max(pData(monocle)[which(pData(monocle)$State == 1),"Pseudotime"]), linetype="dashed", color = "black",size=1) +
theme(legend.position = c("bottom")) +
xlab(label = "Pseudotime")+
xlim(c(0,max(pData(monocle)$Pseudotime))) +
guides(colour =guide_legend(ncol=1,nrow=2,byrow=TRUE))+
scale_fill_manual(breaks=c("Young","Old"),values = hue_pal()(2)) +
annotate("text", label = "1", x = 6, y = 0.22, size = 8, colour = colorTrajStateFinal[1]) +
annotate("text", label = "2", x = 12, y = 0.22, size = 8, colour = colorTrajStateFinal[2])
densityFateM <- ggplot(pData(monocle)[which(pData(monocle)$State %in% c(1,3,4)),],aes(x=Pseudotime,fill = AGE)) +
geom_density(alpha=0.5) +
geom_vline(xintercept = max(pData(monocle)[which(pData(monocle)$State == 3),"Pseudotime"]), linetype="dashed", color = "red",size=1) +
geom_vline(xintercept = max(pData(monocle)[which(pData(monocle)$State == 1),"Pseudotime"]), linetype="dashed", color = "black",size=1) +
theme(legend.position = c("bottom")) +
xlab(label = "Pseudotime")+
xlim(c(0,max(pData(monocle)$Pseudotime))) +
guides(colour =guide_legend(ncol=1,nrow=2,byrow=TRUE))+
scale_fill_manual(breaks=c("Young","Old"),values = hue_pal()(2))+
annotate("text", label = "1", x = 6, y = 0.15, size = 8, colour = colorTrajStateFinal[1]) +
annotate("text", label = "3", x = 10.4, y = 0.15, size = 8, colour = colorTrajStateFinal[3]) +
annotate("text", label = "4", x = 16, y = 0.15, size = 8, colour = colorTrajStateFinal[4])
densityFateE <- ggplot(pData(monocle)[which(pData(monocle)$State %in% c(1,3,5)),],aes(x=Pseudotime,fill = AGE)) +
geom_density(alpha=0.5) +
geom_vline(xintercept = max(pData(monocle)[which(pData(monocle)$State == 3),"Pseudotime"]), linetype="dashed", color = "red",size=1) +
geom_vline(xintercept = max(pData(monocle)[which(pData(monocle)$State == 1),"Pseudotime"]), linetype="dashed", color = "black",size=1) +
theme(legend.position = c("bottom")) +
xlab(label = "Pseudotime")+
xlim(c(0,max(pData(monocle)$Pseudotime))) +
guides(colour =guide_legend(ncol=1,nrow=2,byrow=TRUE))+
scale_fill_manual(breaks=c("Young","Old"),values = hue_pal()(2))+
annotate("text", label = "1", x = 6, y = 0.099, size = 8, colour = colorTrajStateFinal[1]) +
annotate("text", label = "3", x = 10.4, y = 0.099, size = 8, colour = colorTrajStateFinal[3]) +
annotate("text", label = "5", x = 16, y = 0.099, size = 8, colour = colorTrajStateFinal[5])
A function to make stacked plot of cell metadata along pseudotime (cut in a given number of bin).
getAreaPlot <- function(monocle,
traj,
column = "predicted",
colorFactor = NULL,
order = NULL,
branchingT = NULL,
branchingME = NULL,
breaks = 20) {
if(!is.null(order)) {
pData(monocle)[,column] <- factor(pData(monocle)[,column],levels = order)
}
if(is.null(colorFactor)) {
colorFactor <- hue_pal()(length(unique(pData(monocle)[,column])))
}
if (is.null(pData(monocle)$PseudotimeBin)){
pData(monocle)$PseudotimeBin <- cut(pData(monocle)$Pseudotime,breaks = breaks,labels = FALSE)
}
if (is.null(branchingT)){
branchingT <- max(pData(monocle)[which(pData(monocle)$State == 1),"Pseudotime"])*breaks/max(pData(monocle)$Pseudotime)
}
if (is.null(branchingME)){
branchingME <- max(pData(monocle)[which(pData(monocle)$State == 3),"Pseudotime"])*breaks/max(pData(monocle)$Pseudotime)
}
column <<- column
metaDataYoung <- ddply(pData(monocle)[which(pData(monocle)$State %in% traj & pData(monocle)$AGE == "Young"),],
~eval(parse(text = column)) + PseudotimeBin+AGE,nrow)
metaDataOld <- ddply(pData(monocle)[which(pData(monocle)$State %in% traj & pData(monocle)$AGE == "Old"),],
~eval(parse(text = column)) + PseudotimeBin + AGE,nrow)
colnames(metaDataYoung)[1] <- column
colnames(metaDataOld)[1] <- column
if(!2 %in% traj) {
plot <- ggplot() +
geom_area(data = metaDataYoung, aes(x=PseudotimeBin, y= V1, alpha = AGE, fill=eval(parse(text = column)))) +
geom_area(data = metaDataOld, aes(x=PseudotimeBin, y= -V1, alpha = AGE, fill=eval(parse(text = column)))) +
geom_vline(xintercept = branchingT, linetype = "dashed",color = "black",size = 1)+
geom_vline(xintercept = branchingME, linetype = "dashed",color = "red",size = 1)+
geom_hline(yintercept=0,color='white',size = 1)+
scale_alpha_manual(values = c(0.5,1)) +
xlim(c(0,breaks)) +
scale_fill_manual( values= colorFactor,name = column)+
xlab("Pseudotime bin") +
ylab("Cell count")
} else {
plot <- ggplot() +
geom_area(data = metaDataYoung, aes(x=PseudotimeBin, y= V1, alpha = AGE, fill=eval(parse(text = column)))) +
geom_area(data = metaDataOld, aes(x=PseudotimeBin, y= -V1, alpha = AGE, fill=eval(parse(text = column)))) +
geom_vline(xintercept = branchingT, linetype = "dashed",color = "black",size = 1)+
geom_hline(yintercept=0,color='white',size = 1)+
scale_alpha_manual(values = c(0.5,1)) +
xlim(c(0,breaks)) +
scale_fill_manual( values= colorFactor,name = column)+
xlab("Pseudotime bin") +
ylab("Cell count")
}
return(plot)
}
Make stacked plot of HSPC subtypes along pseudotime with young (old) cells in the upper (lower) panel. Use the previous function to draw the stacked plot fot predicted cell type for the three fates (T,NM and ME) along pseudotime cut into 50 bins.
# To get the same bin as stacked plot for cell types, cut the bin on all the data before subsetting
breaks = 50
pData(monocle)$PseudotimeBin <- cut(pData(monocle)$Pseudotime,breaks = breaks,labels = FALSE)
branchingME <- max(pData(monocle)[which(pData(monocle)$State == 3),"Pseudotime"])*breaks/max(pData(monocle)$Pseudotime)
branchingT <- max(pData(monocle)[which(pData(monocle)$State == 1),"Pseudotime"])*breaks/max(pData(monocle)$Pseudotime)
predictedFateT <- getAreaPlot(monocle,
c(1,2),
column = "predicted",
colorFactor = rev(colorCellType),
order = c("MPP3","MPP2","STHSC","LTHSC"),
breaks = 50) +
scale_y_continuous(breaks=c(-600,-300,0,300,600),labels=c("600","300","0","300","600"),limits =c(-675,650)) +
annotate("text", label = "1", x = 15, y = 550, size = 8, colour = colorTrajStateFinal[1]) +
annotate("text", label = "2", x = 30, y = 550, size = 8, colour = colorTrajStateFinal[2]) +
scale_fill_manual(breaks=c("LTHSC","STHSC","MPP2","MPP3"), values= rev(colorCellType),name = "predicted") + guides(alpha = FALSE) +theme(legend.title = element_blank())
## Scale for 'fill' is already present. Adding another scale for 'fill',
## which will replace the existing scale.
predictedFateM <- getAreaPlot(monocle,
c(1,3,4),
column = "predicted",
colorFactor = rev(colorCellType),
order = c("MPP3","MPP2","STHSC","LTHSC"),
breaks = 50)+ scale_y_continuous(breaks=c(-600,-300,0,300,600),labels=c("600","300","0","300","600"),limits =c(-675,650)) +
annotate("text", label = "1", x = 15, y = 550, size = 8, colour = colorTrajStateFinal[1]) +
annotate("text", label = "3", x = 26, y = 550, size = 8, colour = colorTrajStateFinal[3]) +
annotate("text", label = "4", x = 40, y = 550, size = 8, colour = colorTrajStateFinal[4])
predictedFateE <- getAreaPlot(monocle,
c(1,3,5),
column = "predicted",
colorFactor = rev(colorCellType),
order = c("MPP3","MPP2","STHSC","LTHSC"),
breaks = 50)+
ylim(c(-675,650)) +
scale_y_continuous(breaks=c(-600,-300,0,300,600),labels=c("600","300","0","300","600"),limits =c(-675,650)) +
annotate("text", label = "1", x = 15, y = 550, size = 8, colour = colorTrajStateFinal[1]) +
annotate("text", label = "3", x = 26, y = 550, size = 8, colour = colorTrajStateFinal[3]) +
annotate("text", label = "5", x = 40, y = 550, size = 8, colour = colorTrajStateFinal[5])
## Scale for 'y' is already present. Adding another scale for 'y', which
## will replace the existing scale.
ylabel <- 45.5
A function to draw division rate along pseudotime cut into a given number of bins
getJoinDivRateBp <- function (monocle,
phases = c("G2/M"),
breaks = 50) {
metaDiv <- pData(monocle)[which(pData(monocle)$phases %in% phases),]
traj <- unique(pData(monocle)$State)
df3 <- ddply(metaDiv,~AGE + PseudotimeBin ,nrow,.drop = FALSE)
df4 <- ddply(pData(monocle),~AGE + PseudotimeBin ,nrow,.drop = FALSE)
df3$totalCellsInBin <- NA
df3$totalCellsInBin[which(df3$AGE == "Young")] <-df4$V1[which(df4$AGE == "Young" & df4$PseudotimeBin %in% df3$PseudotimeBin[which(df3$AGE == "Young")])]
df3$totalCellsInBin[which(df3$AGE == "Old")] <- df4$V1[which(df4$AGE == "Old" & df4$PseudotimeBin %in% df3$PseudotimeBin[which(df3$AGE == "Old")])]
if(!2 %in% traj) {
plot <- ggplot(df3, aes(fill = AGE, y = V1/totalCellsInBin, x=PseudotimeBin))+
#facet_wrap(~Bin,nrow = 1) +
geom_bar(stat = "identity",position = "dodge")+
scale_y_continuous(name = "Div/cells in bin",limits = c(0,0.53))+
#ylab(label = "")+xlab(label = "") +
geom_vline(xintercept =branchingME, linetype="dashed",color = "red", size=1) +
geom_vline(xintercept =branchingT, linetype="dashed",color = "black", size=1) +
scale_fill_manual(breaks=c("Young","Old"),values = hue_pal()(2))+
xlim(c(0,breaks)) +
theme(legend.position = "bottom") +
xlab("Pseudotime bin")
} else {
plot <- ggplot(df3, aes(fill = AGE, y = V1/totalCellsInBin, x=PseudotimeBin))+
#facet_wrap(~Bin,nrow = 1) +
geom_bar(stat = "identity",position = "dodge")+
scale_y_continuous(name = "Div/cells in bin",limits = c(0,0.53))+
#ylab(label = "")+xlab(label = "") +
#geom_vline(xintercept =branchingME, linetype="dashed",color = "red", size=1) +
geom_vline(xintercept =branchingT, linetype="dashed",color = "black", size=1) +
scale_fill_manual(breaks=c("Young","Old"),values = hue_pal()(2))+
xlim(c(0,breaks)) +
theme(legend.position = "bottom") +
xlab("Pseudotime bin")
}
return(plot)
}
Use this function to draw division rate for old and young cells along pseudotime cut into 50 bins for the three fates (T,NM,ME)
divRateBpT <- getJoinDivRateBp(monocle[,which(pData(monocle)$State %in% c(1,2))]) +
annotate("text", label = "1", x = 15, y = 0.45, size = 8, colour = colorTrajStateFinal[1]) +
annotate("text", label = "2", x = 30, y = 0.45, size = 8, colour = colorTrajStateFinal[2])
divRateBpM <- getJoinDivRateBp(monocle[,which(pData(monocle)$State %in% c(1,3,4))]) +
annotate("text", label = "1", x = 15, y = 0.45, size = 8, colour = colorTrajStateFinal[1]) +
annotate("text", label = "3", x = 26, y = 0.45, size = 8, colour = colorTrajStateFinal[3]) +
annotate("text", label = "4", x = 40, y = 0.45, size = 8, colour = colorTrajStateFinal[4])
divRateBpE <- getJoinDivRateBp(monocle[,which(pData(monocle)$State %in% c(1,3,5))]) +
annotate("text", label = "1", x = 15, y = 0.45, size = 8, colour = colorTrajStateFinal[1]) +
annotate("text", label = "3", x = 26, y = 0.45, size = 8, colour = colorTrajStateFinal[3]) +
annotate("text", label = "5", x = 40, y = 0.45, size = 8, colour = colorTrajStateFinal[5])
Draw the final figure 6E, 6F 6G thah correspond to density plot, division rate and hspc subtypes stacked plot along the three pass of the trajectory (toward T, NeuMast and MkEr fates)
arrangeText <- function(sizeText = 19,sizeLabel = 21) {
adjustment <- theme(axis.title = element_text(size = sizeLabel),
axis.text.y = element_text(size = sizeText),
axis.text.x = element_text(size = sizeText))
return(adjustment)
}
densityLegend <- g_legend(densityFateM +
theme(legend.direction = 'vertical',
legend.title = element_blank(),
legend.text = element_text(size = 16),
legend.key = element_rect(size = 5),
legend.key.size = unit(1.25, 'lines'),
legend.spacing.x = unit(0.5, 'lines')),"vertical")
divRateLegend <- g_legend(divRateBpE + labs(fill="") +
theme(legend.direction = 'vertical',
legend.text = element_text(size = 16),
legend.key = element_rect(size = 5),
legend.key.size = unit(1.25, 'lines')),"vertical")
predictedLegend <- g_legend(predictedFateT +
labs(fill="") + guides(alpha = FALSE) +
theme(legend.direction = 'vertical',
legend.title = element_blank(),
legend.text = element_text(size = 16),
legend.key = element_rect(size = 5),
legend.key.size = unit(1.25, 'lines'),
legend.spacing.x = unit(0.5, 'lines')),"vertical")
grobFateT <- rbind(ggplotGrob(densityFateT + NoLegend() + arrangeText() + ylab("density\n")+ xlab("Pseudotime\n")),
ggplotGrob(divRateBpT + NoLegend() + arrangeText()+ xlab("Pseudotime bin\n")),
ggplotGrob(predictedFateT+NoLegend() + arrangeText() + xlab("Pseudotime bin\n")),size = "last")
grobFateM <- rbind(ggplotGrob(densityFateM + NoLegend() + arrangeText() + ylab("density\n")+ xlab("Pseudotime\n")),
ggplotGrob(divRateBpM + NoLegend() + arrangeText()+ xlab("Pseudotime bin\n")),
ggplotGrob(predictedFateM + NoLegend() + arrangeText() + xlab("Pseudotime bin\n")),size = "last")
grobFateE <- rbind(ggplotGrob(densityFateE + NoLegend() + arrangeText()+ ylab("density\n")+ xlab("Pseudotime\n")),
ggplotGrob(divRateBpE + NoLegend() + arrangeText() + xlab("Pseudotime bin\n")),
ggplotGrob(predictedFateE + NoLegend() + arrangeText()+ xlab("Pseudotime bin\n")),size = "last")
grobLegend <- arrangeGrob(densityLegend,divRateLegend,predictedLegend,ncol = 1)
grid.arrange(grobFateT,grobFateM,grobFateE,grobLegend,widths = c(10,10,10,3),nrow = 1)
svg("paperImages/svg/fig6H_densityCtPhasesTraj_row.svg",height = 10,width = 16)
grid.arrange(grobFateT,blank,grobFateM,blank,grobFateE,blank,grobLegend,widths = c(10,0.5,10,0.5,10,0.5,3),nrow = 1)
dev.off()
## png
## 2
Some gene expression along pseudotime that illustrate the delay of old HSPCs activation with age. We switch monocle normalisation method to be able to use the monocle process to fit the gene expression trend over pseudotime.
ptBranchingT <- max(pData(monocle)[which(pData(monocle)$State ==1),"Pseudotime"])
ptBranchingM <- max(pData(monocle)[which(pData(monocle)$State ==3),"Pseudotime"])
gbm_cds_old <- monocle[,which(pData(monocle)$AGE == "Old")]
gbm_cds_young <- monocle[,which(pData(monocle)$AGE == "Young")]
gbm_cds_old <- estimateSizeFactors(gbm_cds_old)
gbm_cds_young <- estimateSizeFactors(gbm_cds_young)
selectedGenes <- c("Ccnb1","Mki67","Cd48","Cdkn1a","Cdkn1b","Cdkn2c")
#selectedGenes <- c("Ccnb1","Mki67","Cd48","Cdkn2c","Cdkn1a")
#selectedGenes <- c("Cdkn1c","Cdkn2a","Cdkn1b","Cdkn2c","Cdkn1a")
p1 <- plot_genes_in_pseudotime(gbm_cds_young[selectedGenes],
ncol = 6,
panel_order = selectedGenes) +
scale_color_manual(values = colorTrajStateFinal) +
theme(legend.position = "right") +
geom_vline(xintercept = ptBranchingM,color = "red",linetype = "dashed",size =1) +
theme(strip.text = element_text(face = "italic"))
myfunction <- function(var ) {
print(var)
result <- c(rep("",length(rownames(var))))
print(result)
return(result)
}
labeller <- function(labels, multi_line = TRUE)
{
as.empty <- function(labels) {
return(" ")
}
labels <- lapply(labels, as.empty)
if (multi_line) {
labels
}
else {
collapse_labels_lines(labels)
}
}
p2 <- plot_genes_in_pseudotime(gbm_cds_old[selectedGenes],
ncol = 6,
panel_order = selectedGenes) +
scale_color_manual(values = colorTrajStateFinal) +
theme(legend.position = "right") +
geom_vline(xintercept = ptBranchingM,color = "red",linetype = "dashed",size = 1) +
facet_wrap(~feature_label,labeller=labeller,nrow = 1,
ncol = 6, scales = "free_y")
p1$labels$x <- "Pseudotime"
p2$labels$x <- "Pseudotime"
p2 <- p2 + theme(strip.text = element_text(face = "italic"))
legendPt <- g_legend(p1+ guides(colour = guide_legend(override.aes = list(size=5))) +
theme(legend.text = element_text(size = 14))+theme(legend.position = "bottom"),"horizontal")
grid <- arrangeGrob(p1+NoLegend(),p2+NoLegend(),legendPt,heights = c(10,10,1))
grid.arrange(grid)
svg("paperImages/svg/fig6I_genesCellCycleDiff.svg",height = 4.5,width = 12)
grid.arrange(grid)
dev.off()
## png
## 2
png("paperImages/png/fig6I_genesCdkn.png",height = 4.5,width = 12,units = "in",res = 300)
grid.arrange(grid)
dev.off()
## png
## 2
saveRDS(monocle,"monocle_report.rds")
saveRDS(hspc.combined,"seurat_report.rds")
Aibar, Sara, Carmen Bravo González-Blas, Thomas Moerman, Hana Imrichova, Gert Hulselmans, Florian Rambow, Jean-Christophe Marine, et al. 2017. “SCENIC: Single-Cell Regulatory Network Inference and Clustering.” Nature Methods 14 (11): 1083.
Chambers, Stuart M, Nathan C Boles, Kuan-Yin K Lin, Megan P Tierney, Teresa V Bowman, Steven B Bradfute, Alice J Chen, et al. 2007. “Hematopoietic Fingerprints: An Expression Database of Stem Cells and Their Progeny.” Cell Stem Cell 1 (5): 578–91.
Lieberman, Yuval, Lior Rokach, and Tal Shay. 2018. “CaSTLe–Classification of Single Cells by Transfer Learning: Harnessing the Power of Publicly Available Single Cell Rna Sequencing Experiments to Annotate New Experiments.” PloS One 13 (10): e0205499.
Lun, Aaron T. L., Davis J. McCarthy, and John C. Marioni. 2016. “A Step-by-Step Workflow for Low-Level Analysis of Single-Cell Rna-Seq Data with Bioconductor.” F1000Res. 5: 2122. https://doi.org/10.12688/f1000research.9501.2.
Qiu, Xiaojie, Qi Mao, Ying Tang, Li Wang, Raghav Chawla, Hannah A Pliner, and Cole Trapnell. 2017. “Reversed Graph Embedding Resolves Complex Single-Cell Trajectories.” Nature Methods 14 (10): 979.
Rodriguez-Fraticelli, Alejo E, Samuel L Wolock, Caleb S Weinreb, Riccardo Panero, Sachin H Patel, Maja Jankovic, Jianlong Sun, Raffaele A Calogero, Allon M Klein, and Fernando D Camargo. 2018. “Clonal Analysis of Lineage Fate in Native Haematopoiesis.” Nature 553 (7687): 212.
Scialdone, Antonio, Kedar N Natarajan, Luis R Saraiva, Valentina Proserpio, Sarah A Teichmann, Oliver Stegle, John C Marioni, and Florian Buettner. 2015. “Computational Assignment of Cell-Cycle Stage from Single-Cell Transcriptome Data.” Methods 85: 54–61.
Stuart, Tim, Andrew Butler, Paul Hoffman, Christoph Hafemeister, Efthymia Papalexi, William M Mauck III, Yuhan Hao, Marlon Stoeckius, Peter Smibert, and Rahul Satija. 2019. “Comprehensive Integration of Single-Cell Data.” Cell 177 (7): 1888–1902.
Venezia, Teresa A, Akil A Merchant, Carlos A Ramos, Nathan L Whitehouse, Andrew S Young, Chad A Shaw, and Margaret A Goodell. 2004. “Molecular Signatures of Proliferation and Quiescence in Hematopoietic Stem Cells.” PLoS Biology 2 (10).