Significance of subcellular localization prediction:

Proteins must be transported to the correct organelles of a cell and folded into the correct 3-D structures to perform their functions properly. Knowing a protein's subcellular localization is therefore one step towards understanding its function. Proteins can exist in different locations within a cell, and some proteins can even reside at, or move between, two or more subcellular locations simultaneously. Such multi-location proteins play important roles in metabolic processes that take place in more than one compartment. As an essential topic in proteomics research and molecular cell biology, protein subcellular localization is critically important for protein function annotation, drug target discovery, and drug design. Efficient and reliable computational methods are being developed to complement biological experiments such as fluorescence microscopy imaging.


Specific information about HybridGO-Loc:

HybridGO-Loc stands for mining Hybrid features on Gene Ontology (GO) for protein subcellular Localization prediction. The predictor extracts protein features from two perspectives of GO information (GO frequency occurrences and GO semantic similarity) and then processes them with a multi-label multi-class SVM classifier that uses an adaptive decision scheme. HybridGO-Loc can handle both single-location and multi-location proteins.
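
The two ideas named above (GO frequency features and an adaptive multi-label decision) can be illustrated with a minimal Python sketch. This is not the HybridGO-Loc implementation: the function names, the `theta` parameter, and the thresholding rule (keep every location whose score reaches min(1.0, theta * best score), falling back to the single best location when no score is positive) are assumptions chosen for illustration, and the per-location SVM scores are taken as given.

```python
from collections import Counter


def go_frequency_vector(go_terms, vocabulary):
    """Count how often each GO term in `vocabulary` occurs in `go_terms`.

    `go_terms` is the list of GO terms retrieved for one protein
    (repeats allowed); `vocabulary` fixes the order of the features.
    """
    counts = Counter(go_terms)
    return [counts[term] for term in vocabulary]


def adaptive_decision(scores, theta=0.5):
    """Pick the predicted location indices from per-location scores.

    Hypothetical rule (an assumption, not taken from the source):
    - if no score is positive, return only the best-scoring location;
    - otherwise keep every location whose score is at least
      min(1.0, theta * best_score).
    """
    best = max(range(len(scores)), key=scores.__getitem__)
    s_max = scores[best]
    if s_max <= 0:
        return [best]
    threshold = min(1.0, theta * s_max)
    return [i for i, s in enumerate(scores) if s >= threshold]
```

With this rule, a protein always receives at least one location, and a second location is reported only when its score is reasonably close to the best one.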

For viral proteins, HybridGO-Loc is designed to predict 6 subcellular locations of multi-label viral proteins: (1) viral capsid; (2) host cell membrane; (3) host endoplasmic reticulum; (4) host cytoplasm; (5) host nucleus; and (6) secreted. The predictor is not designed for non-viral proteins, so prediction results for non-viral proteins are arbitrary and meaningless.

For plant proteins, HybridGO-Loc is designed to predict 12 subcellular locations of multi-label plant proteins: (1) cell membrane; (2) cell wall; (3) chloroplast; (4) cytoplasm; (5) endoplasmic reticulum; (6) extracellular; (7) Golgi apparatus; (8) mitochondrion; (9) nucleus; (10) peroxisome; (11) plastid; and (12) vacuole. Note that (11) plastid covers all plastid groups other than (3) chloroplast. The predictor is not designed for non-plant proteins, so prediction results for non-plant proteins are arbitrary and meaningless.

Some notes:

  1. The HybridGO-Loc predictor accepts either protein accession numbers (ACs) or protein sequences as input.
  2. Letters in ACs (in UniProtKB format) should be in uppercase.
  3. When amino acid (AA) sequences are used as input, HybridGO-Loc calls BLAST to look for the ACs of the closest proteins. The predictor will therefore take much longer to determine the subcellular location(s) for sequence input than for AC input.
  4. For AC inputs starting with the letters 'O', 'P' and 'Q', HybridGO-Loc produces results very quickly because the AC-to-GO-term mapping is held in an in-memory hash table. For ACs starting with other letters, HybridGO-Loc may take longer for each AC.
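
The notes above can be summarised as a small input-checking routine that a user might run before submitting entries. The Python sketch below is illustrative only and is not part of HybridGO-Loc: the function name `classify_input` and the route labels are hypothetical, the accession pattern is the one published in the UniProt documentation, and converting lowercase letters to uppercase is a convenience added here rather than documented server behaviour.

```python
import re

# UniProtKB accession pattern as published in the UniProt documentation.
UNIPROT_AC = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)

# One-letter amino acid codes, including ambiguity codes (B, X, Z)
# and the rare residues selenocysteine (U) and pyrrolysine (O).
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWYBXZUO")


def classify_input(entry):
    """Classify one input entry as an accession number or a sequence.

    Returns (kind, normalised_value, route), where `route` is a
    hypothetical label for the processing path described in the notes:
    - "hash_table" for ACs starting with O, P or Q (fast lookup),
    - "other_lookup" for ACs starting with any other letter,
    - "blast" for amino acid sequences (slowest path).
    """
    upper = entry.strip().upper()  # ACs should be uppercase (note 2)
    if UNIPROT_AC.match(upper):
        route = "hash_table" if upper[0] in "OPQ" else "other_lookup"
        return ("ac", upper, route)
    sequence = "".join(upper.split())  # drop spaces and line breaks
    if sequence and set(sequence) <= AMINO_ACIDS:
        return ("sequence", sequence, "blast")
    raise ValueError(f"not a UniProtKB AC or an amino acid sequence: {entry!r}")
```

For example, `classify_input("p05067")` returns `("ac", "P05067", "hash_table")`, while a raw sequence such as `"MKT AYIAK"` is routed to the BLAST path.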