Summary
This study employs machine learning algorithms to systematically screen key biomarkers closely linked to the Immunogenic Cell Death (ICD) process and deeply analyze their regulatory mechanisms in the sepsis immune microenvironment. It further explores traditional Chinese medicine (TCM) monomers and small-molecule drugs with regulatory potential via high-throughput targeted gene screening.
Intersecting 5445 differentially expressed genes (DEGs) with 34 ICD-related genes identified 20 ICD-associated DEGs. Among the prediction models, the glmBoost + RF algorithm combination performed best and screened 10 key genes; ROC curve analysis showed that CD8A, IFNGR1, and ENTPD1 had the strongest predictive ability.
Specifically, CD8A was downregulated, positively correlated with CD8 T cells but negatively with neutrophils. IFNGR1 and ENTPD1 were upregulated, with opposite correlations. A ceRNA regulatory network targeting CD8A and ENTPD1 was constructed to explore upstream non-coding mechanisms.
Stigmasterol, an active component of Corn Silk, Ginseng, and Scutellariae Radix, showed stable binding to the proteins encoded by the three key genes in molecular docking. This study may provide a new perspective for subsequent basic research on precision treatment of sepsis.
Keywords
Immunogenic cell death, Machine Learning, Sepsis, Pharmacological Screening Strategy
Introduction
Sepsis is a life-threatening organ dysfunction syndrome caused by the combined effects of a dysregulated immune response and extensive inflammatory damage triggered by infection, and it is one of the leading causes of death among critically ill patients1,2,3. According to the World Health Organization, there are approximately 49 million sepsis cases globally each year, with about 11 million related deaths4. Retrospective analyses have shown that the in-hospital mortality rate of sepsis is 17%, and it can be as high as 26% among critically ill patients5. These data highlight the severity of sepsis as a risk factor for death. Currently, the management of sepsis mainly involves three aspects: controlling infection, stabilizing hemodynamics, and regulating the host response6. These measures center on drug therapy, fluid resuscitation, and immunomodulatory therapy, such as the use of antibiotics, glucocorticoids, and anti-inflammatory drugs. Nevertheless, the mortality rate of sepsis remains high. Therefore, it is necessary to identify effective therapeutic targets and intervention strategies for sepsis.
The immune response of patients with sepsis to pathogens is primarily mediated through the interaction between pathogen-associated molecular patterns (PAMPs) and pattern recognition receptors (PRRs). PRRs are proteins capable of recognizing molecules frequently associated with pathogens. PRRs can be activated by damage-associated molecular patterns (DAMPs) in the host cell nucleus, mitochondria, and cytoplasm, and these molecules are released into the bloodstream from cells during sepsis7. DAMPs are non-microbial molecules in the host cell nucleus or cytoplasm. Similar to PAMPs, they can also be recognized by PRRs. When released from cells after tissue injury, they can serve as potent activators of the immune system, triggering and sustaining non-infectious inflammatory responses, leading to systemic inflammation, organ damage, and even death8,9,10. Evidently, during the pathological process of sepsis, the release of DAMPs activates the immune system and triggers an inflammatory response closely related to the pathophysiological process of sepsis.
ICD is a distinct form of cell death that activates the immune system through DAMPs and PAMPs. It activates multiple transcription factors, promoting the production and release of pro-inflammatory and anti-inflammatory mediators to combat invading pathogens and activate the adaptive immune response. ICD plays an important role in anti-tumor immunity, autoimmune diseases, and infectious diseases. For example, a review elaborated on the application of DAMP-mediated ICD in various cancer immunotherapies11. Additionally, Wang et al. found that DAMP-mediated ICD upregulates the expression levels of heat shock protein 90AA1 (HSP90AA1) and P2RX7, which in turn induces excessive phosphorylation of tau protein and neuroinflammatory responses, thus playing a key role in the pathogenesis of Alzheimer's disease12. Another review indicated that DAMP-mediated ICD is an important driver of autoimmune reactive granulomatosis13. These findings suggest that DAMP-mediated ICD plays an important role in both neoplastic and non-neoplastic diseases. However, the role of ICD in sepsis and its regulatory mechanism have not been fully investigated. Notably, in the pathological process of sepsis, the release of DAMPs activates the immune system and triggers an inflammatory response; in infection-related cell death, the dying cells likewise release DAMPs, which activate the immune system and promote the immune response. DAMPs therefore play a crucial role in both sepsis and ICD, suggesting a possible association between the two. This study aims to explore the key regulatory genes of ICD in sepsis and to reveal how these genes affect the immune response and disease progression in sepsis. We hypothesize that by identifying and studying these key genes, we can elucidate the regulatory network of ICD in sepsis and provide potential molecular targets for the development of novel therapeutic strategies.
Currently, in addition to common treatment options for sepsis such as early fluid resuscitation, anti-infection therapy, and the use of vasoactive drugs, traditional Chinese medicine (TCM) has also emerged as a promising therapeutic strategy. Among TCM preparations, the Chinese herbal compound Xuebijing has been shown in multicenter, randomized, double-blind controlled trials to reduce the 28-day mortality rate of sepsis14,15. Besides Xuebijing, some studies have reported that various Chinese herbal compound preparations, including Astragalus injection, Banxia Xiexin Decoction, Dahuang Fuzi Decoction, and Shenmai injection, have positive effects on the treatment of sepsis16, supporting the therapeutic efficacy of TCM preparations for sepsis. To further understand the mechanism of action of Chinese herbal compound preparations in sepsis, many researchers have explored the associations between Chinese herbal monomers, small-molecule drugs, and sepsis. Some studies have reported that multiple small-molecule drugs such as emodin, salidroside, and ginsenosides can improve the clinical symptoms of sepsis by regulating signaling pathways such as NF-κB, STAT3, STAT1, and PI3K17. In addition, cichoric acid alleviates sepsis-induced cardiac damage in mice by inhibiting the activity of succinate dehydrogenase (SDH) and succinate deposition in mitochondria, and by slowing down HIF-1α induction and the production of mitochondrial reactive oxygen species (ROS)18. IR-61 activates Nrf2 by directly inhibiting the Keap1-Nrf2 interaction, thereby enhancing the antibacterial function of macrophages and improving the outcome of septic mice18. Compound 7460-0250 can reduce lung injury and improve the survival rate of septic mice19.
Collectively, these studies indicate that Chinese herbal medicines and small-molecule compounds may exert therapeutic effects against sepsis. It is therefore important to explore in depth the potential therapeutic targets of Chinese herbal monomers and small-molecule drugs for sepsis, especially ICD-related regulatory genes, which may provide new ideas for the treatment of sepsis patients.
Therefore, this study will explore the differentially expressed genes (DEGs) associated with ICD and sepsis through multi-omics data analysis. Ten machine learning algorithms and their 101 combination schemes will be used to construct high-precision prediction models to obtain key ICD-related DEGs. In-depth analysis of immune infiltration characteristics and functional enrichment analysis will be carried out for these key genes. Subsequently, a sepsis cell model will be constructed by inducing RAW264.7 macrophages with lipopolysaccharide (LPS), and the expression levels of key ICD genes in the sepsis macrophage model will be verified by quantitative real-time polymerase chain reaction (qRT-PCR). Finally, databases such as the COREMINE database and the Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform (TCMSP) will be used to screen potential traditional Chinese medicine monomers and small-molecule drugs targeting key ICD-related DEGs, and the binding potential and affinity between related drugs and target proteins will be evaluated through molecular docking simulation technology. This study may provide a new perspective for subsequent exploration of diagnostic markers and therapeutic targets for sepsis.
Methods
Data sources
Using "sepsis" as the keyword, we downloaded the microarray chip data of human species required for this study from the GEO database and obtained a total of 13 datasets: GSE10361, GSE10474, GSE28750, GSE40012, GSE54514, GSE57065, GSE66890, GSE67530, GSE69528, GSE95233, GSE131761, GSE32707, and GSE65682. The first 11 datasets were used as the training set, and the last two were used as the test set. The training set contained gene expression chip data from 619 sepsis patients and 212 healthy individuals. The test set GSE32707 contained data from 79 sepsis patients and 34 healthy individuals, while the test set GSE65682 contained data from 192 sepsis patients and 42 healthy individuals. Regulatory genes related to ICD were obtained through literature review, and a total of 34 genes were ultimately identified20.
Analysis of DEGs
In this study, the limma package in R was used to perform differential gene expression analysis on the microarray data of sepsis samples and healthy controls. Differentially expressed genes were screened according to the criteria of absolute log2 fold change (|log2FC|) > 1 and adjusted P-value (FDR) < 0.05.
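The screening rule above can be illustrated with a minimal Python sketch. It is not the limma workflow used in the study; it only applies the two stated cut-offs to a table of results. The gene names GENE_X and GENE_Y and all numeric values are hypothetical.

```python
# Hypothetical limma-style results: (gene, log2 fold change, BH-adjusted P value).
# All numbers are invented for illustration only.
results = [
    ("CD8A", -1.42, 0.0003),
    ("IFNGR1", 1.18, 0.0010),
    ("GENE_X", 0.35, 0.0200),   # significant, but effect size too small
    ("GENE_Y", 1.90, 0.2000),   # large effect, but not significant
]

def select_degs(rows, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Keep genes with |log2FC| > lfc_cutoff and adjusted P < fdr_cutoff."""
    selected = []
    for gene, log_fc, adj_p in rows:
        if abs(log_fc) > lfc_cutoff and adj_p < fdr_cutoff:
            direction = "up" if log_fc > 0 else "down"
            selected.append((gene, direction))
    return selected

degs = select_degs(results)
print(degs)
```

Both conditions must hold at once, so a gene with a large fold change but a non-significant adjusted P value is excluded, as is a significant gene with a small fold change.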
Immunology-related analysis
Evaluation of immune cell infiltration
Using the IOBR software package in the R language environment and combining it with the advanced CIBERSORT algorithm, a comprehensive analysis of immune cell infiltration was conducted on the samples in the GSE32707 dataset. This algorithm can estimate the relative abundances of 22 major immune cell subsets based on gene expression data, revealing the differences in immune cell composition among different samples.
Association between DEGs and immune factors
Pearson correlation analysis was used to explore the potential association between DEGs in the dataset and specific immune cell subsets. These immune cell subsets include, but are not limited to, plasma cells, T cells, and other key immune cell types that play critical roles in immune response and disease progression. The Benjamini-Hochberg method was used for FDR correction of multiple comparisons, and a corrected P < 0.05 was considered statistically significant.
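As a hedged illustration of the two statistical steps named above, the following Python sketch computes a Pearson correlation coefficient and applies the Benjamini-Hochberg adjustment to a list of P values. The expression values, cell fractions, and raw P values are hypothetical; the study itself performed these steps in R.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical gene expression and immune cell fraction across four samples.
gene_expr = [1.0, 2.0, 3.0, 4.0]
cell_fraction = [0.40, 0.30, 0.25, 0.10]
r = pearson_r(gene_expr, cell_fraction)

# Hypothetical raw P values from four gene-cell correlation tests.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

In this toy example only the smallest raw P value remains below 0.05 after adjustment, which shows why correlations that look significant before correction may not be reported as such afterwards.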
Machine learning algorithms
To build a high-precision, high-stability consensus on PCD-related genes, we integrated 10 machine learning algorithms and 101 combinations of them. These algorithms include random survival forest (RSF), elastic net (Enet), lasso regression, ridge regression, stepwise Cox regression, CoxBoost, partial least squares regression for Cox models (plsRcox), supervised principal components (SuperPC), generalized boosted regression modeling (GBM), and survival support vector machine (survival-SVM). The signature generation process included the following steps: (a) building predictive models using the 101 algorithm combinations and leave-one-out cross-validation (LOOCV) in the TCGA-KIRC cohort; (b) further cross-validating all models using the GSE22541 dataset; and (c) calculating Harrell's concordance index (C-index) for each model, with a model judged excellent if its average C-index or test-set C-index was > 0.7.
Predictive accuracy
ROC curves were plotted using the pROC package in R to assess the diagnostic potential of key DEGs. A nomogram of risk scores was constructed using the rms package, and calibration curves were drawn to verify the predictive accuracy of the model. In addition, the rmda package was used to draw decision curves and clinical impact curves to evaluate the value of the model in clinical decision making. Generally, an area under the ROC curve greater than 0.7 indicates a biomarker with high diagnostic discrimination power.
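To make the AUC criterion concrete, the sketch below computes the area under the ROC curve in its rank-based (Mann-Whitney) form: the probability that a randomly chosen case scores higher than a randomly chosen control. It is a plain Python illustration, not the pROC implementation, and the scores are hypothetical. It assumes that higher scores indicate sepsis; for a down-regulated marker the scores would need to be negated first.

```python
def roc_auc(scores, labels):
    """AUC as P(case score > control score); ties count as one half."""
    cases = [s for s, lab in zip(scores, labels) if lab == 1]
    controls = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Hypothetical marker values: label 1 = sepsis, label 0 = healthy control.
scores = [0.9, 0.4, 0.6, 0.3]
labels = [1, 1, 0, 0]
auc = roc_auc(scores, labels)
print(auc)  # 3 of the 4 case-control pairs are ordered correctly -> 0.75
```

An AUC of 0.5 corresponds to random ordering of cases and controls, which is why values above 0.7 are commonly read as useful discrimination.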
Gene Set Variation Analysis (GSVA)
Gene set variation analysis (GSVA) is a nonparametric, unsupervised method for assessing gene set enrichment in transcriptomic data. By computing a composite score for each gene set in each sample, the method translates gene-level changes into pathway-level changes, which are then used to infer the biological function of a sample. In this study, we downloaded gene sets from the Molecular Signatures Database and used the GSVA algorithm to generate a composite score for each gene set to assess potential biological function changes between samples.
GSEA enrichment analysis
GSEA software (version 3.0) was obtained from the GSEA official website (https://www.gsea-msigdb.org/gsea/index.jsp), and samples were divided into a high-expression group (>50%) and a low-expression group (<50%) according to the expression level of each core gene. The c2.cp.kegg.v7.4.symbols.gmt gene set was downloaded from the Molecular Signatures Database for pathway analysis and molecular mechanism studies. For the GSEA runs, the minimum gene set size was set to 5 and the maximum to 5000, and resampling was performed thousands of times. Pathways with P < 0.05 and NES above the threshold were screened out and visualized by category.
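The grouping step can be sketched as follows, reading ">50%" and "<50%" as above and below the median expression of the core gene, which is an assumption about the cut-off. Sample names and expression values are hypothetical.

```python
from statistics import median

def split_by_median(expression):
    """Split samples into high and low groups around the median expression.

    `expression` maps sample name -> expression of one core gene.
    Samples exactly at the median fall into neither group.
    """
    cut = median(expression.values())
    high = sorted(s for s, v in expression.items() if v > cut)
    low = sorted(s for s, v in expression.items() if v < cut)
    return high, low

# Hypothetical expression of one core gene in four samples.
expr = {"s1": 2.0, "s2": 5.0, "s3": 3.0, "s4": 7.0}
high_group, low_group = split_by_median(expr)
print(high_group, low_group)
```

The two groups then serve as the phenotype labels that GSEA compares when ranking genes for enrichment.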
ceRNA network construction
ceRNA networks were constructed using the miRanda database (http://www.microRNA.org), the miRDB database (https://mirdb.org/), and the TargetScan database (https://www.targetscan.org/vert_80/), and visualized using Cytoscape 3.7.2 software.
RT-qPCR
In this study, a lipopolysaccharide (LPS)-induced sepsis macrophage model (RAW264.7) was used for in vitro validation. RAW264.7 cells (mouse monocyte-macrophage leukemia cells, CL-1090) were purchased from Wuhan Procell Life Science & Technology Co., Ltd. RAW264.7 cells were cultured in DMEM medium and seeded in 6-well plates (density 1×10^6). LPS (1 μg/mL) was added during the logarithmic growth phase, and the cells were treated for 24 hours to establish an in vitro sepsis cell model. The control group received the same volume of PBS. Total RNA was extracted from the cells with TRIzol reagent and reverse transcribed into cDNA with the PrimeScript RT kit. RT-qPCR was performed with a SYBR Green qPCR kit to detect the mRNA expression levels of CD8A, IFNGR1 and ENTPD1. The 2^-ΔΔCt method was used to calculate relative gene expression, and a t-test was used to compare differences between groups.
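The 2^-ΔΔCt calculation can be written out as a short worked example. The sketch assumes normalization to a reference (housekeeping) gene, which the text does not name, and all Ct values are hypothetical.

```python
from statistics import mean

def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene (treated vs. control) by 2^-ddCt.

    Each argument is a list of replicate Ct values. The reference gene
    is a hypothetical housekeeping gene, not specified in the text.
    """
    delta_ct_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    delta_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2 ** (-delta_delta_ct)

# Hypothetical triplicate Ct values.
fold_change = relative_expression(
    ct_target_treated=[26.0, 26.5, 25.5],   # target gene, LPS group
    ct_ref_treated=[18.0, 18.5, 17.5],      # reference gene, LPS group
    ct_target_control=[25.0, 25.5, 24.5],   # target gene, PBS group
    ct_ref_control=[18.0, 18.5, 17.5],      # reference gene, PBS group
)
print(fold_change)  # ddCt = (26 - 18) - (25 - 18) = 1, so 2^-1 = 0.5
```

A value below 1 indicates lower expression in the LPS group than in the control group, and a value above 1 indicates higher expression.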
Network pharmacology screening potential target drugs
Using the Coremine database (coremine.com/medical/?locale=zh_CN), traditional Chinese medicines and small-molecule drugs mapped to CD8A, IFNGR1 and ENTPD1 were screened at P < 0.05. Subsequently, using the TCMSP database (www.tcmsp-e.com/load_intro.php?id=43), the active ingredients of the traditional Chinese medicines mapped to the three key genes were screened under the conditions of oral bioavailability (OB) ≥ 30% and drug-likeness (DL) ≥ 0.7.
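The TCMSP screening conditions amount to a two-threshold filter, sketched below in Python. The compound names and their OB and DL values are hypothetical placeholders and are not taken from TCMSP.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    name: str
    ob: float  # oral bioavailability, in percent
    dl: float  # drug-likeness score

def passes_screen(compound, ob_min=30.0, dl_min=0.7):
    """Apply the stated thresholds: OB >= 30% and DL >= 0.7."""
    return compound.ob >= ob_min and compound.dl >= dl_min

# Hypothetical candidates, for illustration only.
candidates = [
    Compound("compound_A", ob=43.8, dl=0.76),  # passes both thresholds
    Compound("compound_B", ob=12.5, dl=0.81),  # fails on OB
    Compound("compound_C", ob=55.0, dl=0.21),  # fails on DL
]
kept = [c.name for c in candidates if passes_screen(c)]
print(kept)
```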
Molecular docking simulation technology
The three-dimensional structure files of the target proteins were obtained from the PDB protein database (https://www2.rcsb.org/), and the structure files of the active components of traditional Chinese medicine and the small-molecule drugs were obtained from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/). The binding ability between the target proteins and these active components and small-molecule drugs was then predicted by molecular docking with CB-DOCK2 (https://cadd.labshare.cn/cb-dock2/php/index.php).
Results
Identification and enrichment analysis of ICD related DEGs in sepsis
Our overall research process is shown in Fig. 1. Sepsis-related transcriptome data and clinical information were downloaded from the GEO database. A total of 11 datasets were included: GSE10361, GSE10474, GSE131761, GSE28750, GSE40012, GSE54514, GSE57065, GSE66890, GSE67530, GSE69528, GSE95233. The training set was constructed after deduplication and integration of the original data. The expression matrix contained a total of 6819 genes, and a total of 890 sepsis patients and 288 healthy controls were included. In addition, two independent datasets, GSE32707 and GSE65682, were selected as test sets. Principal component analysis (PCA) was used to evaluate the batch effect, and the samples before and after batch correction were visualized (Fig. 2A, B). The ComBat algorithm was used for batch correction. After correction, the sample distributions of the datasets tended to be consistent, the batch deviation across datasets was effectively eliminated, and the quality of the integrated data was improved. DEGs between groups were analyzed with the limma package. The screening criteria were |log2FC| > 1 and Benjamini-Hochberg-corrected P < 0.05. Finally, 5445 sepsis-related DEGs were obtained (Fig. 2C). Based on the domestic and international literature, 34 ICD-related genes (Table S1) were obtained. The two gene lists were intersected, and a total of 20 ICD-related DEGs were screened (Fig. 2D). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed using the clusterProfiler package, with FDR < 0.05 as the threshold for significant enrichment. The GO enrichment results showed that ICD-related DEGs were mainly enriched in immune regulation processes such as positive regulation of cytokine production and immune response-regulating signaling pathways (Fig. 2E).
KEGG enrichment analysis suggested that the above genes were involved in multiple pathways such as lipid metabolism, pathogen infection, antigen processing and presentation (Fig.2F).
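The intersection step that yields the ICD-related DEGs is a plain set operation, illustrated below. The two gene lists are small hypothetical stand-ins for the 5445 DEGs and the 34 ICD-related genes; GENE_X and GENE_Z are placeholder symbols.

```python
# Toy stand-ins for the DEG list and the literature-derived ICD gene list.
deg_symbols = {"CD8A", "IFNGR1", "ENTPD1", "TLR4", "GENE_X"}
icd_symbols = {"CD8A", "IFNGR1", "ENTPD1", "TLR4", "GENE_Z"}

# Genes present in both lists are the ICD-related DEGs.
icd_degs = sorted(deg_symbols & icd_symbols)
print(icd_degs)
```

Only symbols that appear in both lists are retained, so the result can never be larger than the shorter of the two lists.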
In this study, 20 ICD-related DEGs were finally screened. According to pathway annotations in public databases, ICD is a cell death phenotype defined by its immune-activating function, and it can be mediated by classical forms of programmed cell death (PCD) such as apoptosis, pyroptosis, ferroptosis, necroptosis and autophagy, so genes overlap between different cell death pathways. The screened genes overlapped strongly with the pyroptosis and apoptosis pathways, which are frequent in sepsis, and overlapped less with ferroptosis, necroptosis and autophagy. Among them, core genes such as CD8A, IFNGR1, and ENTPD1 are mainly involved in immune response and inflammation regulation, overlap little with non-immune programmed cell death pathways, and show a clear functional bias toward ICD regulation.
Immune Cell Infiltration of ICD-Related DEGs
In this study, the CIBERSORT algorithm was used to estimate the infiltration proportions of 22 immune cell types based on the expression matrix. The Wilcoxon rank sum test was used for differences between groups, and P < 0.05 was considered statistically significant. Correlations between immune cells were analyzed by Pearson correlation analysis with Benjamini-Hochberg FDR correction for multiple comparisons, and corrected P < 0.05 was considered a significant correlation. The analysis of immune cell infiltration in the training set showed significant differences in activated NK cells, M0/M1 macrophages and dendritic cells between the sepsis group and the healthy control group (P < 0.05, Fig. 3A). The correlation heat map between immune cells showed the correlation characteristics of each cell subgroup (Fig. 3B). In the test set GSE32707, no significant difference in the proportion of immune cell infiltration was observed between the sepsis group and the control group (Fig. 3C), but correlation analysis showed significant correlations between Tfh cells and mast cells, Tfh cells and dendritic cells, and M1 macrophages and eosinophils (corrected P < 0.05, Fig. 3D). In the test set GSE65682, there was a significant difference in the infiltration proportion of most immune cell subsets between the sepsis group and the control group (P < 0.05, Fig. 3E). Correlation analysis showed a significant positive correlation between dendritic cells and CD8+ T cells (corrected P < 0.05, Fig. 3F). These results suggest that ICD-related differential genes not only affect the distribution of immune cell numbers but may also further regulate cell biological functions. There was a significant positive correlation between CD8A and CD8 T cells, suggesting that the expression level of CD8A could directly affect the activation and killing function of cytotoxic T cells.
CD8A, IFNGR1 and ENTPD1 were significantly correlated with neutrophils, indicating that the three genes may be involved in the regulation of neutrophil recruitment, activation and inflammatory phenotype transformation, and may play an important role in the systemic inflammatory disorder of sepsis.
Screening of 101 Machine Learning Combinations and Evaluation of the Optimal Prognostic Model
Subsequently, LOOCV was used to evaluate the 101 combinations of machine learning algorithms, and the average area under the curve (AUC) of each combination in the internal training set and the external test sets was calculated to construct a sepsis prediction model. Among them, the glmBoost + RF (generalized linear model boosting + random forest) model was selected as the optimal model (Fig. 4A). A total of 831 samples (including 212 healthy controls and 619 sepsis patients) were included in the training set. The receiver operating characteristic (ROC) curve showed that the AUC of the model was 0.997 (95% confidence interval: 0.995-0.998, Fig. 4B). The corresponding confusion matrix shows the agreement between the model predictions and the actual grouping, where true negative (TN) = 176, false positive (FP) = 36, false negative (FN) = 3, true positive (TP) = 616 (Fig. 4C). A total of 113 samples were included in the test set GSE32707 (including 34 healthy controls and 79 sepsis patients). The AUC of the ROC curve was 0.694 (95% confidence interval: 0.575-0.797, Fig. 4D), and the corresponding confusion matrix showed TN = 1, FP = 33, FN = 2, TP = 77 (Fig. 4E). The AUC of this cohort is relatively low, mainly because of the heterogeneity of clinical samples and the differences in detection platforms across GEO datasets: the clinical characteristics and disease staging of the patients included in GSE32707 differ from those of the other cohorts, and an independent detection chip was used. Even after batch correction, a small amount of residual technical bias remains. In addition, the core purpose of this model is to screen ICD-related characteristic genes, and its parameters were not optimized for the external cohort, so its cross-cohort generalization ability is reduced. Another test set, GSE65682, included 234 samples (including 42 healthy controls and 192 sepsis patients).
The ROC curve AUC was 0.987 (95% confidence interval: 0.976-0.996, Fig. 4F), and the corresponding confusion matrix showed TN = 35, FP = 7, FN = 6, TP = 186 (Fig. 4G). The main purpose of the machine learning model in this study is to screen ICD-related characteristic genes, rather than to construct a nomogram for clinical diagnosis and prediction. Therefore, AUC was selected as the core evaluation index of the model. Calibration curves and decision curves are mostly used to evaluate the performance of clinical risk prediction models, so these two analyses are not included in this study. glmBoost + RF is an ensemble-learning black-box model whose internal decision logic is difficult to interpret intuitively. This study uses the model to screen ICD-related feature genes and does not carry out model interpretability analysis or visualization such as SHAP (SHapley Additive exPlanations) or LIME (Local Interpretable Model-agnostic Explanations).
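The confusion-matrix counts can be turned into sensitivity, specificity and accuracy with simple arithmetic. The sketch below uses the GSE65682 cohort and assumes that sepsis is the positive class and that 186 of the 192 sepsis patients and 35 of the 42 healthy controls were classified correctly, which is the reading that matches the stated group sizes. The function name is illustrative.

```python
def classification_metrics(cases_correct, cases_total,
                           controls_correct, controls_total):
    """Sensitivity, specificity and accuracy from per-class correct counts."""
    sensitivity = cases_correct / cases_total          # TP / (TP + FN)
    specificity = controls_correct / controls_total    # TN / (TN + FP)
    accuracy = (cases_correct + controls_correct) / (cases_total + controls_total)
    return sensitivity, specificity, accuracy

# GSE65682 under the assumption stated above: 192 sepsis, 42 controls.
sens, spec, acc = classification_metrics(186, 192, 35, 42)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Reporting sensitivity and specificity alongside AUC is useful here because the cohorts are imbalanced, so accuracy alone is dominated by the larger sepsis group.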
Identification of Key ICD-Related DEGs and Functional Immune Analysis of Key Genes
Based on the optimal prognostic model glmBoost+RF, 10 key ICD-related DEGs were identified, namely CD8A, LY96, IFNGR1, ENTPD1, NT5E, CASP1, IL1R1, BAX, IL17RA and TLR4, which all exhibited significant differences between sepsis and control groups. Additionally, among these key genes, CD8A and NT5E were under-expressed, while the remaining genes were over-expressed (Fig. S1A). Further analysis of the correlations among these 10 genes revealed the highest correlations between CD8A and LY96, and IFNGR1 and TLR4 (Fig. S1B). ROC curve analysis identified the top three genes with high diagnostic sensitivity as CD8A (AUC=0.806), IFNGR1 (AUC=0.788), and ENTPD1 (AUC=0.780) (Fig. S1C). Figures 5D-I demonstrate significant differences and correlations in the expression of CD8A, IFNGR1, and ENTPD1 between sepsis patients and normal control groups in relation to immune cell infiltration.
Specific Signaling Mechanisms Associated with the Key Genes CD8A, IFNGR1, and ENTPD1
Based on the training set of 1178 samples (including 890 patients with sepsis and 288 healthy controls), we performed GSVA and GSEA on the core genes CD8A, IFNGR1, and ENTPD1 to analyze the key signaling pathways associated with the progression of sepsis. Pathway enrichment analysis based on the GSVA algorithm (Fig. S2) showed that the CD8A high-expression group was mainly enriched in primary immunodeficiency and the T cell receptor signaling pathway, whereas the low-expression group was significantly related to sphingolipid metabolism and pantothenic acid and coenzyme A biosynthesis pathways. The IFNGR1 high-expression group was mainly enriched in pantothenic acid and coenzyme A biosynthesis and the sphingolipid metabolism pathway; the low-expression group was related to purine metabolism and the primary immunodeficiency pathway. The ENTPD1 high-expression group was mainly enriched in sphingolipid metabolism and sphingolipid biosynthesis pathways; the low-expression group was related to cell adhesion molecules (CAMs) and RNA polymerase pathways. The pathway enrichment characteristics were further verified by GSEA. Based on the KEGG pathway gene set, an absolute normalized enrichment score |NES| > 1 and FDR q-value < 0.05 were used as the threshold for significant enrichment (Fig. 5). The results showed that the CD8A high-expression group was significantly enriched in the T cell receptor signaling pathway, and the low-expression group was enriched in the lysosomal pathway (Fig. 5A-B); the IFNGR1 high-expression group was enriched in the insulin signaling pathway, and the low-expression group was enriched in the antigen processing and presentation pathway (Fig. 5C-D); the ENTPD1 high-expression group was enriched in the insulin signaling pathway, and the low-expression group was enriched in the CAMs pathway (Fig. 5E-F).
The above results reveal the key signaling pathways associated with core genes from different levels, and provide a theoretical basis for analyzing the molecular mechanism of sepsis and exploring potential therapeutic targets.
ceRNA Network of the Key Genes CD8A, IFNGR1, and ENTPD1
Subsequently, we constructed a ceRNA regulatory network of CD8A and ENTPD1 based on the 1178 samples in the training set (including 890 sepsis patients and 288 healthy controls), aiming to analyze the upstream post-transcriptional regulation patterns of the two genes. ceRNAs can participate in the regulation of gene expression through the endogenous competition mechanism of lncRNA-miRNA-mRNA, which is an important post-transcriptional way to regulate cell phenotype and inflammatory response. For CD8A (Fig. 6A), this study used the StarBase, miRDB, and TargetScan databases to predict target interactions, with binding energy ≤ -20 kcal/mol and context score percentile ≥ 95 as the screening thresholds, combined with Pearson correlation analysis (corrected P < 0.05). One key regulatory miRNA, hsa-miR-875-3p, was finally screened, and four lncRNAs (CDR1-AS, PCBP3-OT1, RP11-64K12.8, FRMPD3-AS1) were predicted to interact with this miRNA, constituting the core lncRNA-hsa-miR-875-3p-CD8A regulatory axis. For ENTPD1 (Fig. 6B), a large number of potential interacting lncRNAs and miRNAs were obtained through the same prediction and screening strategy. To highlight the core regulatory relationships, only the key interaction nodes are displayed in the figure, and not all network members are listed. This complex network suggests that ENTPD1 is regulated by multiple ceRNA pathways and that its expression pattern is subject to multi-level post-transcriptional regulation in the sepsis microenvironment. It should be noted that acute sepsis is dynamic over time. Because the clinical samples and in vitro experiments in this study were limited to a single time point, the ceRNA network constructed here is a static result and cannot reflect changes in regulatory patterns at different disease stages or intervention times.
Validation of expression of key ICD-related DEGs in a macrophage model of sepsis
RT-qPCR showed that CD8A mRNA expression decreased significantly after LPS treatment of RAW264.7 cells for 24 hours (P < 0.05), which was consistent with the bioinformatics analysis. IFNGR1 and ENTPD1 mRNA expression levels did not change significantly (P > 0.05) (Fig. S3). Considering the experimental design and the gene expression characteristics, the discrepancy can be attributed to three main reasons. First, this study used only a single time point of 24 h of LPS intervention; ICD-related gene expression changes dynamically over time, and 24 h may not be the window in which IFNGR1 and ENTPD1 are differentially expressed. Second, the data of this study were derived from human peripheral blood samples, whereas mouse RAW264.7 macrophages were used for in vitro verification, and background gene expression levels differ between the two types of cells: CD8A is mainly expressed in T cells and its baseline expression in macrophages is extremely low, so expression changes after stimulation were readily detected, whereas the background expression of IFNGR1 and ENTPD1 in macrophages is high and the relative fluctuation of mRNA after inflammatory stimulation is small. Third, ENTPD1 encodes the membrane protein CD39, whose biological function mainly depends on the regulation of protein expression and enzyme activity, so the mRNA level alone cannot fully reflect its functional status. CRT, ATP and HMGB1 are classic markers for verifying the ICD phenotype. In this study, only gene mRNA expression was measured, and these markers were not assessed, so verification of the ICD phenotype is incomplete.
CD8A, IFNGR1 and ENTPD1: Potential Traditional Chinese Medicine, Active Constituents and Small Molecular Drug Screening
In this study, virtual drug screening and molecular docking analysis were carried out by targeting CD8A, IFNGR1 and ENTPD1, the core ICD-related differential genes in sepsis. First, the Coremine database was used to mine traditional Chinese medicines and small-molecule compounds potentially associated with the three targets. Then, based on the TCMSP database, the general screening criteria for active components of traditional Chinese medicine (OB ≥ 30%, DL ≥ 0.7) were applied to further screen active monomers with good druggability. The traditional Chinese medicines associated with CD8A were silkworm and corn silk, and the corresponding small-molecule drugs mainly included IFN-γ, maltose, maltotriose, phosphorus-32 and posatilin. The traditional Chinese medicines associated with IFNGR1 were ganoderma lucidum and ginseng, and the corresponding small-molecule drugs mainly included conivaptan, erlotinib, zidovudine, clarithromycin and tributyrin. The traditional Chinese medicines associated with ENTPD1 were raspberry, scutellaria baicalensis, and tangerine red, and the corresponding small-molecule drugs mainly included adenosine, adenosine diphosphate, adenosine monophosphate, adenosine triphosphate, and Quemliclustat.
For the traditional Chinese medicines identified above, the main active ingredients were screened using the TCMSP database. The main active ingredients of silkworm and corn silk were folic acid (FA), stigmast-4-en-3,6-diol, resveratrol, sitosterol and stigmasterol. The main active components of G. lucidum and P. ginseng were ganoderic acid-Y, ganoderic alcohol-F, ganoderic aldehyde-A, β-sitosterol and stigmasterol. The main active components of raspberry, scutellaria baicalensis and tangerine red were baicalin, raspberry acid, β-sitosterol, sitosterol and stigmasterol. Molecular docking analysis was then carried out. The structures of the target proteins were obtained as follows: CD8A corresponds to PDB ID 1AKJ and IFNGR1 to PDB ID 1FG9, both downloaded from the PDB protein database, whereas a predicted structure (AF-P49961-F1) was used for ENTPD1. The three-dimensional structures of all candidate small-molecule drugs and active ingredients of traditional Chinese medicine were downloaded from the PubChem database. Docking simulations were run on the CB-DOCK2 platform. The AutoDock Vina binding energy (kcal/mol) was used as the evaluation index, and a binding energy ≤ -6.0 kcal/mol was taken as the criterion for good binding affinity. The docking results showed that CD8A, IFNGR1 and ENTPD1 bound well to most candidate small molecules and active components of traditional Chinese medicine (binding energy ≤ -6.0 kcal/mol); only the binding energy between CD8A and phosphorus-32 was > -6.0 kcal/mol, suggesting poor binding stability for this pair. The docking conformations and interactions between CD8A and representative active components of traditional Chinese medicine and small-molecule drugs are shown in Fig. 7. The corresponding docking results for IFNGR1 and ENTPD1 are shown in Supplementary Fig. S4 and Figs. S5-S6.
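As an illustration of how the -6.0 kcal/mol criterion partitions docking results, the following Python sketch classifies target-ligand pairs by binding energy. Only the cutoff comes from the text; the function names, the placeholder target and ligand labels, and the toy energies are hypothetical and are not docking results from this study.

```python
GOOD_BINDING_CUTOFF = -6.0  # kcal/mol, criterion stated in the text


def has_good_affinity(binding_energy, cutoff=GOOD_BINDING_CUTOFF):
    """More negative energies mean stronger predicted binding."""
    return binding_energy <= cutoff


def split_by_affinity(scores):
    """Split {(target, ligand): energy} into good and poor binders."""
    good, poor = {}, {}
    for pair, energy in scores.items():
        bucket = good if has_good_affinity(energy) else poor
        bucket[pair] = energy
    return good, poor


# Toy energies for illustration only; not results of this study.
toy_scores = {
    ("target_X", "ligand_1"): -8.4,
    ("target_X", "ligand_2"): -6.0,  # exactly on the cutoff: counted as good
    ("target_X", "ligand_3"): -3.1,
}
good, poor = split_by_affinity(toy_scores)
print(sorted(good))
print(sorted(poor))
```

Because binding energies are negative, "≤ -6.0" selects the stronger binders; a pair with an energy above the cutoff, such as CD8A with phosphorus-32 in the text, falls into the poor group.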
Discussion
Sepsis is a complex inflammatory response triggered by systemic infection that not only disrupts the balance of the immune system but may also lead to multiple organ dysfunction. With advances in medical technology, the survival rate of patients with sepsis has improved significantly, but the pathogenesis of the disease is complex and not yet fully understood. The current lack of effective biomarkers for the timely diagnosis and treatment of sepsis limits our ability to understand the nature of the disease and to develop precise treatment strategies. Therefore, in-depth exploration of the molecular mechanisms of sepsis and the search for precise biomarkers and therapeutic targets are essential to improve patient outcomes. Immunotherapy has demonstrated significant clinical efficacy in a variety of diseases. Apoptosis is traditionally considered a non-immunogenic process, whereas necrosis and necroptosis play a central role in inflammation and the immune response. However, the concept of ICD provides new insight, suggesting that certain forms of cell death can activate the immune system to fight cancer21,22,23. This concept not only opens up new strategies for cancer treatment but may also inform sepsis treatment. Despite widespread interest in the role of ICD in cancer treatment, research on ICD in sepsis remains relatively scarce. To our knowledge, this study is the first to explore immunogenic cell death in sepsis using systematic bioinformatics methods and to search for potential therapeutic biomarkers, providing a new perspective on the immune regulation mechanism of sepsis and new directions for future experimental research and clinical treatment.
The main objective and innovation of our study are as follows. We identified ICD-associated targets in adult patients with sepsis. Based on this set of targets, we further screened relevant monomeric compounds from traditional Chinese medicine (TCM) and small-molecule compounds, aiming to modulate the immune microenvironment in sepsis. The distinctiveness of this research lies not in screening either TCM monomers or small-molecule compounds alone, but in concurrently screening both against the same set of key targets. This approach provides a foundation for developing integrated treatment strategies combining traditional Chinese and Western medicine.
We used machine learning techniques to examine the role of ICD-related genes in the immune response of sepsis patients. The main findings of this study are as follows. First, 20 ICD-related DEGs were identified, and functional enrichment analysis revealed their dominant role in immune regulatory pathways. Second, after comprehensive evaluation of multiple machine learning algorithms, the glmBoost + RF combination was selected as the optimal model, and 10 key ICD-related DEGs were further screened. ROC curve analysis showed that CD8A, IFNGR1 and ENTPD1 ranked in the top three for predictive ability as diagnostic markers. Third, we analyzed the signaling mechanisms of these three key genes, and the results showed that they are closely related to immune regulation. Fourth, RT-qPCR was used to examine the expression levels of the three key genes in a sepsis macrophage model. Fifth, we constructed ceRNA networks for CD8A and ENTPD1, providing new insights into the regulatory networks of these genes in sepsis. Finally, we screened potential TCM monomers and small-molecule compounds through COREMINE, TCMSP and other databases, and evaluated the binding affinity of CD8A, IFNGR1 and ENTPD1 to these candidates through one-to-one docking simulation.
In this study, three core ICD-related markers, CD8A, IFNGR1 and ENTPD1, were screened based on a multi-patient clinical cohort. The clinical transcriptome data are the core basis for proposing these genes as potential biomarkers of sepsis. LPS-induced RAW264.7 macrophages were used only as a preliminary in vitro model to observe the expression characteristics of the candidate genes in an inflammatory environment. The absence of a significant difference in IFNGR1 and ENTPD1 expression in this in vitro verification does not negate their value as markers but reflects the limitations of the experimental system. On the one hand, gene expression during immunogenic cell death changes dynamically over time, and a single 24 h intervention point can hardly capture the expression characteristics of all genes. On the other hand, differences in species and cell type directly affect gene expression patterns. The baseline expression of IFNGR1 and ENTPD1 in macrophages is high, so the mRNA changes caused by inflammatory stimulation are unlikely to reach statistical significance. Notably, CD39, encoded by ENTPD1, is a functional membrane protein whose anti-inflammatory and immunoregulatory effects are mainly determined by protein abundance and enzyme activity; changes at the transcript level cannot fully represent changes in protein function.
In addition, it should be noted that immunogenic cell death is not an independent cell death pathway and can be mediated by a variety of classical programmed cell death processes. The 20 ICD-related differential genes identified in this study partially overlap with the pyroptosis and apoptosis pathways. Such overlap is common in models of sepsis-related inflammatory injury and is consistent with inflammatory cell death being the main pathological feature of sepsis. At the same time, the overlap of the core candidate genes CD8A, IFNGR1 and ENTPD1 with non-immune-dependent cell death pathways such as ferroptosis, necroptosis and autophagy is limited, which further supports their specificity at the level of ICD-related immune regulation.
Based on the glmBoost + RF algorithm and ROC curve analysis, CD8A, IFNGR1 and ENTPD1 were defined as differentially expressed genes associated with immune-related disease in sepsis patients. CD8A encodes the CD8 α chain of the dimeric CD8 protein, which is expressed primarily on the surface of cytotoxic T cells and is essential for cell-mediated immune defense and T cell development. CD8A expression levels have been shown to be important quantitative predictors for evaluating immunotherapy response and immune cell infiltration24,25. For example, the CD8A gene was found to predict the severity of chronic sinusitis and could be used as a diagnostic biomarker for rheumatoid arthritis. CD8A expression was significantly upregulated in early NSCLC stage N and correlated with good overall survival. Bioinformatics studies suggest that sepsis patients may have poorer outcomes when CD8A expression is downregulated26. In addition, a bioinformatics study identified CD8A as a key gene in sepsis diagnosis and prognosis27. The present study further supports this view through screening with 101 machine learning algorithms. CD8A has also been identified as an ICD-associated prognostic gene for low-grade gliomas and breast cancer28,29,30, suggesting that CD8A may play a key role in immunotherapy response in a variety of diseases. Our results support the importance of CD8A as an ICD-associated gene in sepsis.
Sepsis has been shown to be associated with decreased CD8 T cell numbers and functional responses, which increase the risk of secondary infections and lead to poor patient outcomes31. In this study, bioinformatics analysis showed that CD8A expression was significantly reduced in sepsis, and RT-qPCR showed that CD8A mRNA expression was significantly reduced in the sepsis macrophage model after 24 hours of LPS intervention, further supporting this view. CD8A was positively correlated with CD8 T cells in the immune cell correlation analysis, suggesting that low CD8A expression in sepsis may be linked to reduced CD8 T cell number or function. This is consistent with the cellular immunosuppression observed in sepsis32. Furthermore, the immune cell correlation analysis revealed a significant negative correlation between CD8A and neutrophils, suggesting that low CD8A expression in sepsis may be associated with abnormal neutrophil accumulation. Sepsis has been found to induce a delayed apoptotic state in neutrophils, resulting in persistent dysfunction and the release of immature neutrophils. These immature neutrophils can trigger oxidative bursts, cell migration, complement activation, and decreased bacterial clearance, ultimately leading to persistent immune dysfunction and sustained inflammatory responses33. Combined with the results of the immune infiltration analysis, these findings suggest that CD8A may not only serve as a diagnostic marker for sepsis but may also participate in regulating the activation and effector function of CD8 T cells. Down-regulation of CD8A in sepsis may inhibit the normal immune response of cytotoxic T cells, resulting in impaired cellular immune function. At the same time, low expression of this gene is related to abnormal neutrophil activation, in which neutrophils enter a delayed apoptotic state, continuously release inflammatory mediators, and aggravate systemic inflammatory injury.
To further study the regulatory mechanism of CD8A in sepsis patients, we constructed a ceRNA regulatory network. The results showed that the upstream regulatory network of CD8A consisted of the microRNA hsa-miR-875-3p and four long non-coding RNAs (CDR1-AS, PCBP3-OT1, RP11-64K12.8 and FRMPD3-AS1). Under the classical ceRNA mechanism, these lncRNAs can act as molecular sponges that specifically bind and sequester hsa-miR-875-3p, relieving miRNA-mediated repression of CD8A mRNA (degradation and translational inhibition) and thereby up-regulating CD8A expression. In this study, CD8A expression was down-regulated in sepsis, suggesting that the balance of this ceRNA regulatory axis is disrupted, which may indirectly regulate CD8 T cell function and the process of immunogenic cell death by affecting CD8A expression, and thus contribute to immune dysregulation in sepsis. Exosome-mediated long non-coding RNA HCG18 has been shown to promote M2 macrophage polarization by reducing the level of miR-875-3p in macrophages, exerting immunosuppressive effects and promoting gastric cancer cell metastasis34. In addition, miR-875-3p has been reported to inhibit or slow the development and progression of prostate cancer, colorectal cancer, hepatocellular carcinoma and breast cancer35. However, no published study has systematically explored the regulatory role of hsa-miR-875-3p and its upstream lncRNAs in the sepsis immune response and ICD. The lncRNA-hsa-miR-875-3p-CD8A axis identified in this study may provide a new direction for subsequent mechanistic research. This study also constructed a ceRNA network for ENTPD1, which is co-regulated by multiple lncRNA-miRNA pathways. As a key gene regulating inflammation and immune homeostasis, ENTPD1 is involved in the inflammatory response and ICD process of sepsis. 
The complex ceRNA regulatory network upstream of ENTPD1 suggests that ENTPD1 expression in vivo is finely regulated at multiple levels and through multiple pathways. In follow-up work, the top-ranked core lncRNAs and miRNAs can be selected from the network to verify their regulatory effects on ENTPD1 and to analyze the molecular mechanism by which the ceRNA pathway contributes to pathological injury in sepsis. Given the acute course of sepsis, the activity of the ceRNA regulatory pathway may change dynamically with disease progression. Owing to limitations in samples and experimental conditions, this study completed only a static interaction analysis, and experiments at multiple time points are needed to verify the regulatory patterns at different disease stages.
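The structure of the predicted CD8A ceRNA network can be sketched as two edge lists joined on the shared miRNA. The lncRNA, miRNA and mRNA names below are those reported in the text (with the miRNA written as hsa-miR-875-3p); the variable and function names are illustrative, and the sketch encodes predicted interactions only, not experimentally validated binding.

```python
from collections import defaultdict

# Predicted lncRNA -> miRNA sponge edges reported for the CD8A network.
lncrna_to_mirna = [
    ("CDR1-AS", "hsa-miR-875-3p"),
    ("PCBP3-OT1", "hsa-miR-875-3p"),
    ("RP11-64K12.8", "hsa-miR-875-3p"),
    ("FRMPD3-AS1", "hsa-miR-875-3p"),
]
# Predicted miRNA -> mRNA targeting edge.
mirna_to_mrna = [("hsa-miR-875-3p", "CD8A")]


def cerna_axes(sponge_edges, target_edges):
    """Join the two edge lists on the miRNA to list lncRNA-miRNA-mRNA axes."""
    targets_by_mirna = defaultdict(list)
    for mirna, mrna in target_edges:
        targets_by_mirna[mirna].append(mrna)
    return [
        (lncrna, mirna, mrna)
        for lncrna, mirna in sponge_edges
        for mrna in targets_by_mirna.get(mirna, [])
    ]


axes = cerna_axes(lncrna_to_mirna, mirna_to_mrna)
for lncrna, mirna, mrna in axes:
    print(f"{lncrna} -| {mirna} -| {mrna}")
```

The same join would apply to the larger ENTPD1 network, where several miRNAs appear and each lncRNA-miRNA pair yields its own axis.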
In this study, we observed a significant upregulation of the IFNGR1 gene, which encodes interferon-gamma receptor 1 (IFN-γR1), a cell-surface receptor that plays a crucial role in interferon-gamma (IFN-γ) signaling. Our immune cell correlation analysis further revealed a significant positive correlation between IFNGR1 and neutrophils. Previous research has found that enhanced IFN-γ signaling can promote the recruitment of neutrophils36, and in the context of sepsis, IFN-γ can induce the formation of immunosuppressive neutrophils37. Upregulation of IFNGR1 may enhance the body's defense against pathogens, but it also poses risks, including triggering excessive inflammatory responses and causing tissue damage. Therefore, IFNGR1 and its regulatory role in neutrophils may become potential targets for the treatment of sepsis, suggesting new therapeutic strategies and research directions. As a functional receptor for interferon-γ, IFNGR1 can, when up-regulated, amplify downstream IFN-γ signaling, thereby promoting neutrophil recruitment and inducing a shift toward a pro-inflammatory phenotype. On the one hand, this regulation can enhance the body's ability to clear pathogens. On the other hand, it can drive excessive inflammatory responses and disrupt immune homeostasis, which is one of the important causes of the onset and progression of the inflammatory storm in sepsis.
In this study, we found that ENTPD1 was upregulated. ENTPD1 encodes CD39, a key cell membrane protein that plays a crucial role in the regulation of the immune system. Our immune cell correlation analysis revealed a significant positive correlation between ENTPD1 and neutrophils. As previous studies have pointed out38,39, the widespread expression of CD39 in neutrophils is crucial for regulating the activity of these cells. The multifaceted role of CD39 in sepsis patients is particularly noteworthy. On the one hand, enhanced CD39 can inhibit the P2X7R response and trigger adenosine signaling, which helps to limit systemic inflammation during acute sepsis and promotes the recovery of liver function40. Enhanced CD39 has also been shown to alleviate lipopolysaccharide (LPS)-induced renal tubular epithelial cell injury, enhance cell viability, inhibit apoptosis, and suppress the activation of the NLRP3 inflammasome41. On the other hand, increased expression of CD39 on regulatory T cells (Tregs) is associated with poor prognosis in sepsis patients42. Notably, studies have shown that miR-155 can increase the proportion of CD39-positive Tregs in sepsis patients, which may subsequently enhance immunosuppression43. Therefore, the double-edged effect of CD39 in sepsis suggests its potential importance in disease treatment and may provide new perspectives for future therapeutic strategies. The CD39 protein encoded by ENTPD1 plays an immunoregulatory role by regulating purine metabolism and the adenosine signaling pathway. Increased expression of this gene can regulate the inflammatory activity of neutrophils and macrophages. On the one hand, it can inhibit the activation of the NLRP3 inflammasome and reduce acute inflammatory injury. 
On the other hand, it can also induce immunosuppression by regulating the function of regulatory T cells, showing a two-way regulatory feature, and ultimately affect the disease progression of sepsis.
To explore potential drugs targeting CD8A, IFNGR1 and ENTPD1, the key ICD-related differential genes in sepsis, this study used network pharmacology analysis with the COREMINE and TCMSP Chinese herbal medicine databases and identified Bombyx mori L. and corn silk as the medicines mapped to CD8A.
Studies have shown that Bombyx mori L., one of the main materials of traditional Chinese medicine compound preparations, may be a potential new drug for the treatment of diabetic nephropathy through its anti-inflammatory and antioxidant activity and its improvement of insulin resistance, and has broad development potential44. In addition, Bombyx mori L. can enhance NK cell activity and induce NK cell maturation and activation, affecting the immunomodulatory response45. This suggests that, besides antagonizing the inflammatory response, Bombyx mori L. may also modulate the systemic immune response by activating innate immune cells in vivo. Control of the systemic inflammatory response is one of the main management measures for patients with sepsis, so Bombyx mori L. may be a candidate Chinese medicine for the treatment of sepsis. Folic acid is one of the main active ingredients of Bombyx mori L. and is an essential vitamin for the human body. When serum folate levels are too low, there is an increased risk of gastrointestinal disorders, sepsis, and serum creatinine abnormalities46. Corn silk can inhibit TNF- and LPS-induced ICAM-1 expression and endothelial cell adhesion, which may be one of the mechanisms of its anti-inflammatory effect47. Schottenol, sitosterol and stigmasterol, the main active components of corn silk, are widely distributed in plants. Phytosterols have been shown to have anti-inflammatory effects48,49. In this study, molecular docking simulations indicated that folic acid, schottenol, sitosterol and stigmasterol have strong binding affinity for CD8A, a key ICD-related differential gene. Therefore, Bombyx mori L. and corn silk, which are mapped to CD8A, and their important active components may be potential candidate drugs for sepsis. These components have reported anti-inflammatory, antioxidant and immunomodulatory activities, which can inhibit the release of inflammatory factors and regulate the function of immune cells. 
They are consistent with the pathological characteristics of systemic inflammatory storm and immune disorder in sepsis, and are also expected to play a role in disease intervention by regulating the ICD process.
Ganoderma, a Chinese medicine associated with IFNGR1, is a medicinal fungus commonly used in traditional Chinese medicine prescriptions and has immunomodulatory, anti-aging, antibacterial and anticancer activities. Ganoderma has been reported to modulate inflammation and oxidative stress and to reduce cell death by inhibiting the NF-κB pathway50. The purified Ganoderma compounds Ganoderic-acid-Y, Ganoderiol-F and Ganoderic-aldehyde-A have certain antiviral and antitumor effects51,52,53. The anti-inflammatory effect of Ganoderma makes it an attractive candidate treatment for sepsis. Molecular docking simulation showed that the binding energies of the main active components of Ganoderma (Ganoderic-acid-Y, Ganoderiol-F and Ganoderic-aldehyde-A) with IFNGR1 were all more favorable than -8.0, suggesting good binding affinity and stability. This further supports the potential of Ganoderma as a candidate treatment for sepsis. Ginseng is also a commonly used Chinese herbal medicine, with a positive effect on alleviating inflammatory reactions and diseases54. Active ingredients such as ganoderic acids and phytosterols can regulate multiple inflammatory and immune-related signaling pathways and interfere with the process of cell death. These anti-inflammatory and immunoregulatory properties give them potential value for intervening in sepsis and the related ICD process.
Rubi Fructus, Scutellaria Radix and Citri Exocarpium Rubrum, which are associated with ENTPD1, have been reported to play an important role in regulating the inflammatory response. Rubi Fructus, for example, significantly inhibited weight gain, hyperlipidemia, inflammation and fat accumulation induced by a high-fat diet in mice, and improved the intestinal flora environment55. Scutellaria Radix effectively ameliorated the clinical symptoms of ulcerative colitis in mice and inhibited pro-inflammatory cytokines and mediators to reduce the inflammatory response56. Citri Exocarpium Rubrum has been shown to exhibit anti-inflammatory and anti-cancer properties in vivo and in vitro, and has also been found to have antioxidant activity57,58. Based on this research, we speculate that Rubi Fructus, Scutellaria Radix and Citri Exocarpium Rubrum may have multi-dimensional regulatory effects in sepsis. The pathological mechanism of sepsis is closely related to inflammatory reactions and immune dysregulation, and the known pharmacological properties of these three herbs (regulating intestinal flora, inhibiting pro-inflammatory factors and anti-oxidation) may play a protective role in the pathological process of sepsis, providing a theoretical basis for TCM intervention in sepsis. These traditional Chinese medicines and their active ingredients can inhibit the secretion of pro-inflammatory mediators, regulate the body's immune homeostasis, and act on the core pathological links of sepsis, providing pharmacological support for treating sepsis by targeting ICD-related targets.
In addition, sitosterol, one of the main active components of Corn Silk and Scutellaria Radix, has molecular docking binding affinities of -8.3 and -8.9 with CD8A and ENTPD1, respectively, indicating high binding stability with both targets. Beta-sitosterol is one of the main active components of Ginseng and Scutellaria Radix, and its docking affinity scores with IFNGR1 and ENTPD1 are -9.1 and -9.0, respectively, which likewise indicates high binding stability. Notably, stigmasterol, a main active ingredient of Corn Silk, Ginseng and Scutellaria Radix, showed better binding in the molecular docking studies. Its binding affinities for CD8A, IFNGR1 and ENTPD1 reached -9.3, -9.2 and -9.2, respectively, which were stronger than those of sitosterol and beta-sitosterol. In terms of the pathological mechanism of sepsis, CD8A is involved in the recognition and activation of immune cells, IFNGR1 is a key receptor regulating the immune response, and ENTPD1 is closely related to the regulation of the inflammatory response; these three targets play an important role in the core links of immune imbalance and inflammatory storm in sepsis. Stigmasterol is predicted to bind stably to all three key targets, which suggests that it may intervene in the sepsis process along multiple dimensions, such as immune regulation and inflammation inhibition, through multi-target synergy. Therefore, stigmasterol, as a candidate TCM monomer for sepsis, offers a new research direction and application prospect for multi-target therapy of sepsis by virtue of its wide target coverage and favorable binding stability. Phytosterol compounds have been reported in a number of pharmacological studies to have broad-spectrum anti-inflammatory and immunoregulatory effects, and can participate in the regulation of cell death and inflammatory pathways. 
Combined with the molecular docking results of this study, this suggests that these components may act on multiple core targets simultaneously and have pharmacological potential for multi-target intervention in sepsis-related ICD.
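The multi-target comparison in this paragraph can be illustrated with a short Python sketch that collects the docking scores quoted above and checks which phytosterol has a reported score for all three targets. The scores are the values stated in the text, assumed here to be AutoDock Vina binding energies in kcal/mol; pairs not quoted in this paragraph are simply absent from the dictionary, which does not imply that they were not docked. The variable and function names are illustrative.

```python
TARGETS = ("CD8A", "IFNGR1", "ENTPD1")

# Docking scores quoted in the paragraph above (assumed kcal/mol).
reported_scores = {
    "sitosterol": {"CD8A": -8.3, "ENTPD1": -8.9},
    "beta-sitosterol": {"IFNGR1": -9.1, "ENTPD1": -9.0},
    "stigmasterol": {"CD8A": -9.3, "IFNGR1": -9.2, "ENTPD1": -9.2},
}


def covers_all_targets(scores, targets=TARGETS):
    """True when a score is available for every target of interest."""
    return all(target in scores for target in targets)


def mean_energy(scores):
    """Average binding energy over the targets that have a score."""
    return sum(scores.values()) / len(scores)


multi_target = [
    name for name, scores in reported_scores.items()
    if covers_all_targets(scores)
]
print(multi_target)
for name, scores in reported_scores.items():
    print(f"{name}: mean {mean_energy(scores):.2f} over {len(scores)} target(s)")
```

A mean over different numbers of targets is only a rough summary; comparing ligands target by target, as the text does, is the more direct reading of the docking output.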
In this study, we integrated multi-omics data with machine learning to screen and identify markers. We first carried out broad screening using differential expression analysis, gene set variation analysis, functional enrichment and ceRNA network construction. On this basis, three ICD-related differential genes (ICD-related DEGs) were further explored, and their expression characteristics were examined by RT-qPCR in an in vitro model, which supports the reliability of the results. Subsequently, immune infiltration analysis, GSVA, GSEA and drug screening were systematically carried out for these three ICD-related DEGs, helping to clarify their core functions in sepsis and laying a foundation for identifying new disease-associated targets and potential treatment options.
Although this study improved the reliability of the results through multi-cohort integration analysis and machine learning algorithm optimization, the following limitations should be noted. 1. Potential impact of data heterogeneity: This study relied primarily on the GEO database for multi-omics data analysis, and this database has certain limitations. Because the data are derived from populations of different ethnicities, regions and medical settings, and cover samples from different pathological stages of sepsis, significant batch effects may exist among the datasets. This heterogeneity may interfere with accurate identification of differentially expressed genes in sepsis and reduce the specificity of biomarker screening. Differences in clinical features and detection platforms between datasets are also an important reason for the low AUC value in the external test cohort GSE32707, which further reflects the impact of sample heterogeneity in public databases on model performance. 2. Completeness of the ICD gene set: The ICD-related gene set was constructed from a systematic literature search, but the literature screening process was subjective. Bias in the selection of search terms (e.g., using "immunogenic cell death" as the main keyword) and in database coverage (mainly PubMed, Embase and other English-language databases) may have led to the omission of some non-English literature or emerging studies, thus affecting the completeness of the gene set. 3. Limitations of the in vitro model and experimental verification: The LPS-induced RAW264.7 macrophage model is only a classical in vitro inflammation model; it cannot simulate the complex multicellular microenvironment of human sepsis, such as immune cell-endothelial cell-epithelial cell interactions, nor can it reflect in vivo pathological phenotypes such as animal survival rate, organ damage and circulating inflammatory factors. 
Moreover, the in vitro verification used a single time point (24 h of LPS intervention) and measured only mRNA expression; the mRNA results for IFNGR1 and ENTPD1 deviated from the clinical transcriptome data owing to the timing of ICD gene expression, differences between human and mouse, and differences in baseline expression across cell types. In addition, ENTPD1 encodes the functional membrane protein CD39, whose biological activity depends mainly on protein expression and enzyme activity, so mRNA levels alone cannot fully reflect its functional status. Acute sepsis also changes dynamically over time, whereas both the clinical samples and the in vitro experiments in this study were designed around a single time point, and the constructed ceRNA network is a static analysis that cannot reflect differences in molecular regulation across time scales. Furthermore, CRT, ATP and HMGB1 are classic phenotypic markers of ICD; this study did not measure these indicators, so direct verification of the ICD phenotype is lacking. 4. Coverage gaps in drug screening databases: COREMINE, TCMSP and other databases were used to screen Chinese medicine monomers. Although these databases contain comprehensive drug information, significant gaps remain. A large number of active ingredients in traditional Chinese medicine prescriptions (such as secondary metabolites formed after compound compatibility) and unknown compounds in folk herbs have not been systematically recorded. Limitations in database coverage may cause drugs with potential ICD regulatory activity to be missed, affecting the comprehensiveness of therapeutic target mining. 5. Methodological limitations of molecular docking: This study used molecular docking simulations to assess the binding affinity of drugs to targets, an approach with inherent shortcomings. 
Static models cannot accurately reflect the dynamics of protein-ligand interactions and do not consider the effects of cell microenvironment (e.g. pH, ion concentration) on binding kinetics. This simplification may lead to overestimation or underestimation of the efficacy of candidate drugs, requiring more advanced molecular dynamics simulations to verify.
6. The ceRNA network is only a bioinformatics prediction: In this study, the lncRNA-miRNA-mRNA regulatory network was constructed from multiple databases, and only interaction analysis at the bioinformatics level was completed. The actual binding relationships between molecules were not verified by experiments such as dual-luciferase reporter assays and RNA immunoprecipitation. In addition, the network contains a large number of nodes; this study analyzed only the core regulatory axis, and the functions of other potential regulatory molecules still need to be explored and verified. 7. Drug screening lacks pharmacological experimental verification: In this study, virtual screening of candidate Chinese medicines and small-molecule compounds was completed only through database mining, network pharmacology and molecular docking. Although the relevant pharmacological rationale was supplemented from the existing literature, in vitro cell efficacy experiments and in vivo animal pharmacological experiments were not carried out, so the actual regulatory effects of the candidate drugs on target gene expression, inflammation level and immunogenic cell death could not be directly verified. Systematic pharmacological experiments are still needed to verify the efficacy and mechanism of the candidate drugs. 8. Insufficient interpretability of the machine learning model: The glmBoost + RF integrated model used in this study is a typical black-box model. Owing to limited analysis conditions, SHAP and LIME feature attribution analysis and visualization were not carried out, and the internal decision logic of the model is not shown intuitively. If a clinical prediction model is constructed in the future, these interpretability analyses will be added to improve the transparency of the model. Taken together, these limitations affect the depth and breadth of the findings. 
Future studies should integrate multicenter clinical cohorts to reduce data bias, dynamically monitor ICD-related pathways with multi-omics techniques, construct 3D cell co-culture or organoid models that more closely approximate physiological states, and develop more comprehensive natural drug databases, thereby enhancing the translational value of the findings.
In summary, three key ICD-related DEGs, CD8A, IFNGR1 and ENTPD1, were identified in sepsis, and their key roles in the immune regulatory response were confirmed. In addition, a ceRNA network for CD8A and ENTPD1 was constructed. Small-molecule drugs related to CD8A, IFNGR1 and ENTPD1 were screened through network pharmacology, and the potential binding sites between these three key genes and the mapped small molecules were predicted by molecular docking simulations, providing a new perspective for exploring key regulatory mechanisms.
Abbreviations
AUC: area under the curve; CAMs: cell adhesion molecules; DAMPs: damage-associated molecular patterns; DEGs: differentially expressed genes; DL: drug-likeness; Enet: elastic net; FA: folic acid; GBM: generalized boosted regression model; GO: Gene Ontology; GSEA: gene set enrichment analysis; GSVA: gene set variation analysis; ICD: immunogenic cell death; KEGG: Kyoto Encyclopedia of Genes and Genomes; LOOCV: leave-one-out cross-validation; LPS: lipopolysaccharide; OB: oral bioavailability; PAMPs: pathogen-associated molecular patterns; PCA: principal component analysis; PCD: programmed cell death; PRRs: pattern recognition receptors; plsRcox: partial least squares regression for Cox models; qRT-PCR: quantitative real-time polymerase chain reaction; ROC: receiver operating characteristic; RSF: random survival forest; SDH: succinate dehydrogenase; SuperPC: supervised principal component analysis; TCM: traditional Chinese medicine; TCMSP: Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform
Authors' contributions
Qiuli Chen, Yongbing Yu and Fanyan Ou conceived the idea for the study and designed the overall research framework. Qiuli Chen, Yongbing Yu and Fanyan Ou developed the experimental methods used in the study. Junzhi Yang was responsible for developing and implementing the software used for data analysis. Binbin Li and Lixiong Zeng collected and organized the data used in the study. Fanyan Ou, Houyu Gan and Guo Qian were responsible for cleaning and preprocessing the raw data. Qiuli Chen, Yongbing Yu and Fanyan Ou wrote the first draft of the manuscript. Binbin Li and Guo Qian created the visualizations and figures for the paper. Kanglai Wei, Jie Yang, Jihua Feng and Jianfeng Zhang supervised the entire research process. All authors contributed to the review and editing of the manuscript. All authors discussed, revised, and proofread the manuscript.
Funding
1. Guangxi Natural Science Foundation (No. 2021GXNSFBA196017).
2. National Natural Science Foundation of China (No. 82302461 and No. 82360374).
3. Guangxi Zhuang Autonomous Region Young Talents Program.
4. Guangxi Medical University Youth Science Fund Program (No. GXMUYSFB2026057).
Acknowledgements
The authors wish to acknowledge that Qiuli Chen, Yongbing Yu and Fanyan Ou contributed equally to this work and are considered co-first authors. Jihua Feng, Jianfeng Zhang and Jie Yang are co-corresponding authors. We would like to express our sincere gratitude to Guangxi Anrenxin Biotechnology Co., Ltd. and the Guangxi Key Laboratory of Intelligent Precision Medicine for their generous support throughout the research process.
Ethics approval and consent to participate
Not applicable
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article. A detailed description of data availability can be found at the following link: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi.
Competing interests
The authors declare that they have no competing interests.
References