Li Hongjie (李宏杰). Bioinformatics analysis of differentially expressed genes in diabetic foot based on the GEO gene dataset. 2025. biomedRxiv.202502.00048
Bioinformatics analysis of differentially expressed genes in diabetic foot based on the GEO gene dataset
Corresponding author: Li Hongjie, 1083856574@qq.com
DOI:10.12201/bmr.202502.00048
Abstract: Objective To identify key genes associated with the onset and progression of diabetic foot through bioinformatics analysis of the GSE143735 dataset in the GEO (Gene Expression Omnibus) database.
Methods Transcriptome profiling was performed on the GSE143735 mRNA microarray dataset. Patients in the dataset were divided into two groups according to the presence of diabetic foot ulcers: a diabetic foot ulcer group (DFU) and a group of diabetic patients without foot ulcers (DS). Differentially expressed genes (DEGs) between the two groups were identified with the online tool GEO2R, and a volcano plot was drawn. GO and KEGG pathway enrichment analyses were carried out with the DAVID database, and enrichment plots were generated. The identified DEGs were then submitted to the STRING public database, the interaction data between these genes and their target genes were downloaded, and a protein-protein interaction (PPI) network was constructed in Cytoscape to screen for core genes.
Results A total of 706 DEGs were identified between the DFU and DS groups, of which 470 were up-regulated and 236 were down-regulated. In the GO cellular component category, the DEGs were mainly enriched in extracellular space, nucleosome, collagen-containing extracellular matrix, and extracellular exosome. In the GO molecular function category, the main terms were DNA binding, protein heterodimerization activity, structural constituent of chromatin, enzyme binding, nucleosomal DNA binding, cytokine activity, microtubule binding, extracellular matrix structural constituent, structural molecule activity, and serine-type endopeptidase inhibitor activity. In the GO biological process category, the DEGs were mainly involved in nucleosome assembly, chromatin organization, keratinization, telomere organization, positive regulation of gene expression, positive regulation of cell population proliferation, protein localization to CENP-A containing chromatin, heterochromatin formation, antibacterial humoral response, and antimicrobial humoral immune response mediated by antimicrobial peptide. KEGG pathway analysis mainly implicated alcoholism, systemic lupus erythematosus, neutrophil extracellular trap formation, viral carcinogenesis, transcriptional misregulation in cancer, neuroactive ligand-receptor interaction, shigellosis, necroptosis, ATP-dependent chromatin remodeling, and chemical carcinogenesis (receptor activation). The PPI network yielded ten core genes: H2BC14, H3C14, H2BC17, H4C6, H2BC9, H3C1, H2BC3, H3C12, H3C13, and H2AC18. According to the volcano plot, all ten were expressed at low levels in the DFU group.
Conclusion Low expression of H2BC14, H3C14, H2BC17, H4C6, H2BC9, H3C1, H2BC3, H3C12, H3C13, and H2AC18 may serve as a predictor of diabetic foot ulcers in diabetic patients; the risk of diabetic foot in these patients could be predicted by measuring the expression levels of these genes.
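The DEG screening step in the Methods can be illustrated with a minimal sketch. It assumes a tab-separated table with the columns `logFC`, `adj.P.Val`, and `Gene.symbol`, as GEO2R typically exports for microarray series. The abstract does not state the cutoffs used, so the thresholds below (|logFC| ≥ 1, adjusted p < 0.05) are assumptions, and the rows are made-up toy values with hypothetical gene names, not results from GSE143735.

```python
import csv
import io

# Assumed cutoffs: the abstract does not report the thresholds actually used.
LOGFC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05


def classify_degs(rows, logfc_cutoff=LOGFC_CUTOFF, adj_p_cutoff=ADJ_P_CUTOFF):
    """Split GEO2R-style rows into up- and down-regulated gene symbol lists."""
    up, down = [], []
    for row in rows:
        symbol = row["Gene.symbol"].strip()
        if not symbol:
            continue  # skip probes without a gene annotation
        try:
            logfc = float(row["logFC"])
            adj_p = float(row["adj.P.Val"])
        except ValueError:
            continue  # skip rows with missing or non-numeric statistics
        if adj_p >= adj_p_cutoff:
            continue  # not significant after multiple-testing correction
        if logfc >= logfc_cutoff:
            up.append(symbol)
        elif logfc <= -logfc_cutoff:
            down.append(symbol)
    return up, down


# Toy table with invented values and hypothetical gene names (illustration only).
TOY_TSV = (
    "ID\tadj.P.Val\tlogFC\tGene.symbol\n"
    "p1\t0.001\t2.5\tGENE_A\n"
    "p2\t0.010\t-1.8\tGENE_B\n"
    "p3\t0.200\t3.0\tGENE_C\n"
    "p4\t0.001\t0.4\tGENE_D\n"
    "p5\t0.001\t-2.0\t\n"
)

rows = list(csv.DictReader(io.StringIO(TOY_TSV), delimiter="\t"))
up_genes, down_genes = classify_degs(rows)
print(len(up_genes), "up-regulated;", len(down_genes), "down-regulated")
```

The sign convention of `logFC` depends on which group is defined first in GEO2R, so "up" and "down" here are relative to whichever contrast was configured.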
Key words: diabetic foot; GEO database; differentially expressed genes; bioinformatics analysis
Submitted: 2025-02-24
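The core-gene screening described in the abstract can be sketched as ranking the nodes of the interaction network by connectivity. The abstract does not say which ranking method was applied in Cytoscape, so node degree is used here purely as an assumption. The edge list and node names are hypothetical toy data, not the STRING export from the study.

```python
from collections import Counter


def rank_hubs(edges, top_n=10):
    """Rank nodes of an undirected interaction network by degree.

    `edges` is an iterable of (node_a, node_b) pairs. Self-loops and
    duplicate pairs (in either direction) are counted only once or ignored.
    """
    degree = Counter()
    seen = set()
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        key = frozenset((a, b))
        if key in seen:
            continue  # ignore duplicate edges such as (B, A) after (A, B)
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name so that ties are deterministic.
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical toy network (illustration only).
TOY_EDGES = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P3", "P2"),  # duplicate of the previous edge, reversed
    ("P4", "P5"),
]

hubs = rank_hubs(TOY_EDGES, top_n=2)
print(hubs)
```

Degree is only one of several centrality measures; other scores may produce a different top-ten list from the same network.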
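Copyright notice: The author independently holds the copyright to this paper; the preprint system holds only the right to archive it permanently. No one may reuse it without permission.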
Version history: No. 1, submitted 2025-02-03, ID bmr.202502.00048V1