Projects/READY-GO: Prediction of protein function from peptide sequences

From Research management course

Problem 75

  • Title: Projects/READY-GO: Prediction of protein function from peptide sequences
  • Problem: Protein functions are encoded in their one-dimensional amino acid sequences. Consecutive or non-consecutive amino acid motifs present in these sequences can be viewed as signals that are recognized by the cellular machinery. As a result, the protein may be transported to a specific cellular compartment, perform a catalytic reaction, participate in signaling pathways or in the formation of the cell cytoskeleton, or regulate gene expression, among other roles. Here, we propose to develop a probabilistic model able to identify such signals and guide the discovery of novel regulatory mechanisms associated with them. The model should be interpretable and, at the same time, use as few biological priors as possible, to avoid biasing the results toward what is already known.
  • Data:
    1. Dataset from DeepLoc (on this page): a multi-FASTA file, where each entry (header line matching « ^> ») corresponds to a protein
    2. DeepLoc-1.0: Eukaryotic protein subcellular localization predictor
    3. UniProtKB - Q9H400 (LIME1_HUMAN)
    4. GO-CAM
    5. Saccharomyces Genome Database (SGD)
  • References:
    1. https://elifesciences.org/articles/39397
    2. Transcripts’ evolutionary history and structural dynamics give mechanistic insights into the functional diversity of the JNK family, by Elodie Laine et al., 2020 (bioRxiv, GitHub)
    3. Draft problem statement
  • Basic solution: in the lab paper
  • Method: to be established
  • Authors: Elodie Laine, Sergei Grudinin, and Vadim Strijov
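
The Data entry above describes the DeepLoc set as a multi-FASTA file in which each header line (starting with ">") opens one protein entry. As a minimal sketch of how such a file could be read into (header, sequence) pairs, assuming only the generic FASTA convention and nothing about the DeepLoc header layout, one might write the following. The function name `read_multi_fasta` and the two toy records are hypothetical and serve only as an illustration; they are not part of the project or of DeepLoc.

```python
from typing import Iterable, Iterator, Tuple


def read_multi_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a multi-FASTA file.

    A line starting with ">" opens a new entry; the following lines,
    up to the next ">" line, are concatenated into one sequence.
    """
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                # close the previous entry before starting a new one
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Toy input (made-up records, not taken from the DeepLoc data set).
example = """>protA hypothetical annotation
MKTAYIAK
QRQISFVK
>protB
MSLLTEVE
""".splitlines()

records = list(read_multi_fasta(example))
for header, sequence in records:
    print(header, len(sequence))
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, so a large file is streamed entry by entry rather than loaded at once. How the header text should be split into identifier and annotation fields depends on the actual file and is left open here.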