Difference between revisions of "Projects/READY-GO: Prediction of protein functional from peptide sequences"
From Research management course
m (Wiki moved page Projects/GOFUN: Prediction of protein functional from peptide sequences to Projects/READY-GO: Prediction of protein functional from peptide sequences: Title change.)
Revision as of 18:16, 24 September 2020
Problem 75
- Title: Projects/READY-GO: Prediction of protein functional from peptide sequences
- Problem: Protein functions are encoded in their one-dimensional amino acid sequences. Consecutive or non-consecutive amino acid motifs in these sequences can be viewed as signals that are recognised by the cellular machinery. Depending on these signals, a protein may, for example, be transported into a specific cellular compartment, catalyse a reaction, participate in a signalling pathway or in the formation of the cell cytoskeleton, or regulate gene expression. Here, we propose to develop a probabilistic model that identifies such signals and guides the discovery of novel regulatory mechanisms associated with them. The model should be interpretable and, at the same time, use as few biological priors as possible, to avoid biasing the results toward what is already known.
- Data:
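- The DeepLoc dataset, available from the DeepLoc data page (http://www.cbs.dtu.dk/services/DeepLoc/data.php). It is a multi-FASTA file in which each entry (a line starting with ">") corresponds to a protein.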
- DeepLoc-1.0: Eukaryotic protein subcellular localization predictor
- UniProtKB - Q9H400 (LIME1_HUMAN)
- GO-CAM
- Saccharomyces Genome Database (SGD)
- References:
- https://elifesciences.org/articles/39397
- Transcripts’ evolutionary history and structural dynamics give mechanistic insights into the functional diversity of the JNK family, by Elodie Laine et al., 2020, bioRxiv, GitHub
- Draft problem statement
- Basic solution: in the lab paper
- Method: to be established
- Authors: Elodie Laine, Sergei Grudinin, and Vadim Strijov
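
The Data item above describes a multi-FASTA file in which every entry starts with a ">" line. As a minimal sketch of how such a file could be read, the following Python function groups sequence lines under their header. The function name `parse_multi_fasta` and the two example records are hypothetical and only illustrate the generic FASTA layout; the exact header format of the DeepLoc file is not assumed here.

```python
from typing import Dict, Iterable, Optional


def parse_multi_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each header (text after '>') to its concatenated sequence."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A '>' line opens a new entry, i.e. a new protein.
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped; join them into one string.
            records[header] += line
    return records


# Made-up records for illustration only (not taken from the DeepLoc file).
example = """>protein_A hypothetical annotation
MKTAYIAKQR
QISFVKSHFS
>protein_B hypothetical annotation
MSDNELLK
"""

records = parse_multi_fasta(example.splitlines())
for header, sequence in records.items():
    print(header, len(sequence))
```

To read a real file, an open file handle can be passed in place of `example.splitlines()`, since the function accepts any iterable of lines. If two entries share the same header, the later one overwrites the earlier one in this sketch.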