Using Alias Sampling Strategy Based on Network Embeddings to Detect Protein Complexes

Bibliographic Details
Published in: IEEE Access, Vol. 8, pp. 211773-211783
Main Authors: Liu, Xiaoxia; Sang, Shengtian; Wang, Xiaoxu
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2020
Description
Summary: Detecting protein complexes from available protein-protein interaction (PPI) data helps to deepen our understanding of the mechanisms of biological activities. In recent years, various computational methods have been developed for identifying protein complexes from PPI networks. Almost all of the basic computational methods depend mainly on topological analysis of PPI networks. However, most of them fail to capture, at the same time, the global and local topological structures of the PPI networks and the diversity of connectivity patterns between individual nodes. To solve this problem, we propose a node-embedding-based alias sampling extension method to detect protein complexes. More specifically, for a given set of seed nodes, the method first uses an alias sampling strategy based on protein node embedding similarities to select nodes that could potentially be added. It then uses a new conductance measure, which can better quantify the likelihood that a subgraph is a protein complex, to decide whether to extend the current candidate subgraph. Evaluated on six real yeast PPI networks, our method outperforms state-of-the-art methods in detecting protein complexes. Furthermore, the experimental results demonstrate that the protein complexes predicted by our method have higher biological significance.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3040327
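
The summary names two building blocks: alias sampling over embedding similarities, and a conductance score for candidate subgraphs. The minimal Python sketch below illustrates both under stated assumptions and is not the authors' implementation. It uses the standard Walker alias method for constant-time weighted sampling, and the classical graph conductance (cut edges divided by the smaller side's volume) as a stand-in, because the record does not define the paper's new conductance measure. The function names, the clipping of negative similarities to zero, and the toy graph are all hypothetical choices made for illustration.

```python
import random
from typing import Dict, Hashable, List, Sequence, Set, Tuple


def build_alias_table(weights: Sequence[float]) -> Tuple[List[float], List[int]]:
    """Build Walker alias tables for non-negative weights (O(n) setup)."""
    n = len(weights)
    total = float(sum(weights))
    if n == 0 or total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    # Scale so that the average bucket mass is exactly 1.
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s] = scaled[s]   # bucket s keeps its own mass ...
        alias[s] = l          # ... and is topped up by outcome l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:   # leftovers are full buckets
        prob[i] = 1.0
        alias[i] = i
    return prob, alias


def alias_draw(prob: List[float], alias: List[int], rng: random.Random) -> int:
    """Draw one index in O(1): pick a bucket, then bucket owner or its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


def similarity_weights(similarities: Sequence[float]) -> List[float]:
    """Assumption: negative similarities are clipped to zero before sampling."""
    return [max(0.0, s) for s in similarities]


def conductance(adj: Dict[Hashable, Set[Hashable]], nodes: Set[Hashable]) -> float:
    """Classical conductance of `nodes` in an undirected graph.

    This is the textbook definition, not the paper's new measure.
    """
    cut = sum(1 for u in nodes for v in adj[u] if v not in nodes)
    vol_in = sum(len(adj[u]) for u in nodes)
    vol_out = sum(len(neigh) for neigh in adj.values()) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom > 0 else 1.0


# Toy graph (hypothetical): two triangles joined by the edge c-d.
TOY_GRAPH: Dict[str, Set[str]] = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
```

In an extension loop of the kind the summary describes, one would build the alias table from the similarities between the current subgraph and its neighbouring proteins, draw a candidate with `alias_draw`, and keep it only if the score of the enlarged subgraph improves. A lower classical conductance indicates a more tightly separated subgraph; how the paper's measure differs is not stated in this record.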