| Title: | Prediction of Amyloid Proteins |
|---|---|
| Description: | Predicts amyloid proteins using random forests trained on the n-gram encoded peptides. The implemented algorithm can be accessed from both the command line and shiny-based GUI. |
| Authors: | Michal Burdukiewicz [cre, aut] (ORCID: <https://orcid.org/0000-0001-8926-582X>), Piotr Sobczyk [ctb], Jaroslaw Chilimoniuk [ctb] (ORCID: <https://orcid.org/0000-0001-5467-018X>), Stefan Roediger [ctb] (ORCID: <https://orcid.org/0000-0002-1441-6512>), Dominik Rafacz [ctb] |
| Maintainer: | Michal Burdukiewicz <[email protected]> |
| License: | GPL-3 |
| Version: | 1.2 |
| Built: | 2026-06-04 08:54:12 UTC |
| Source: | https://github.com/michbur/amylogram |
Amyloids are proteins associated with a number of clinical disorders (e.g., Alzheimer's, Creutzfeldt-Jakob's and Huntington's diseases). Despite their diversity, all amyloid proteins can undergo aggregation initiated by 6- to 15-residue segments called hot spots. As a result, amyloids form unique, zipper-like beta-structures, which are often harmful. To find the patterns defining the hot spots, we developed AmyloGram, a novel predictor of amyloidogenicity based on random forests.
AmyloGram is available as an R function (predict.ag_model) or a Shiny GUI (AmyloGram_gui).
The package also includes the benchmark data set pep424.
Burdukiewicz MJ, Sobczyk P, Roediger S, Duda-Madej A, Mackiewicz P, Kotulska M. (2017) Amyloidogenic motifs revealed by n-gram analysis. Scientific Reports 7 https://doi.org/10.1038/s41598-017-13210-9
Launches a graphical user interface that predicts the presence of amyloids.
AmyloGram_gui()
Any ad-blocking software may cause malfunctions.
Random forest grown using the ranger package with additional
information.
A list of length three: random forest, a vector of important n-grams and the best-performing encoding.
Checks if an object is a protein (contains letters from the one-letter amino acid code).
is_protein(object)
| object | |
|---|---|
TRUE or FALSE.
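The check described above can be approximated in base R. The following is a minimal sketch, not the package's implementation: `looks_like_protein` and `standard_aa` are hypothetical names, and the sketch assumes that only the 20 standard one-letter codes are valid, whereas is_protein may accept a different alphabet.

```r
# Hypothetical helper, not part of AmyloGram: checks whether every character
# of a sequence belongs to the 20 standard one-letter amino acid codes.
# Assumption: the alphabet accepted by is_protein may differ from this one.
standard_aa <- c("A", "C", "D", "E", "F", "G", "H", "I", "K", "L",
                 "M", "N", "P", "Q", "R", "S", "T", "V", "W", "Y")

looks_like_protein <- function(object) {
  # Accept either a single string or a vector of single characters.
  residues <- unlist(strsplit(as.character(object), ""))
  # An empty input is not treated as a protein.
  length(residues) > 0 && all(toupper(residues) %in% standard_aa)
}

looks_like_protein("GNNQQNY")              # all letters are standard residues
looks_like_protein(c("g", "n", "n", "q"))  # lower case is converted first
looks_like_protein("ACGU123")              # contains non-amino-acid symbols
```

Converting to upper case before the comparison is a design choice of this sketch, so that lower-case input is not rejected on case alone.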
Benchmark dataset for PASTA 2.0. Five sequences shorter than six amino acids (1% of the original dataset) were removed.
pep424
A list of 424 peptides (class SeqFastaAA).
Walsh, I., Seno, F., Tosatto, S.C.E., and Trovato, A. (2014). PASTA 2.0: an improved server for protein aggregation prediction. Nucleic Acids Research gku399.
Recognizes amyloids using the AmyloGram algorithm.
## S3 method for class 'ag_model'
predict(object, newdata, ...)
| object | |
|---|---|
| newdata | |
| ... | further arguments passed to or from other methods. |
data(AmyloGram_model)
data(pep424)
predict(AmyloGram_model, pep424[c(4, 10)])
Prints ag_model objects.
## S3 method for class 'ag_model'
print(x, ...)
| x | |
|---|---|
| ... | further arguments passed to or from other methods. |
data(AmyloGram_model)
print(AmyloGram_model)
Prints ag_prediction objects.
## S3 method for class 'ag_prediction'
print(x, ...)
| x | |
|---|---|
| ... | further arguments passed to or from other methods. |
data(AmyloGram_model)
data(pep424)
pred <- predict(AmyloGram_model, pep424[c(4, 10)])
print(pred)
Reads sequence data saved in a text file.
read_txt(connection)
| connection | a |
|---|---|
The input file should contain one or more amino acid sequences separated by empty line(s).
A list of sequences. Each element has class SeqFastaAA. If connection contains no characters, the function issues a warning and returns NULL.
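The input format described above (sequences separated by one or more empty lines) can be parsed with a few lines of base R. This is a rough sketch under stated assumptions, not the code behind read_txt: `parse_blank_separated` is a hypothetical name, the result is a plain list of character vectors rather than SeqFastaAA objects, and consecutive non-empty lines are assumed to belong to the same sequence.

```r
# Hypothetical helper, not the package's read_txt: splits lines of text into
# sequences wherever one or more empty lines occur.
parse_blank_separated <- function(lines) {
  lines <- trimws(lines)
  is_blank <- lines == ""
  if (all(is_blank)) {
    # Mirrors the documented behaviour for empty input: warn and return NULL.
    warning("No sequences found.")
    return(NULL)
  }
  # Every blank line increments the counter, so lines separated by blank
  # lines receive different group labels.
  group <- cumsum(is_blank)
  chunks <- split(lines[!is_blank], group[!is_blank])
  # Assumption: wrapped lines of one chunk form a single sequence.
  unname(lapply(chunks, function(chunk) {
    strsplit(paste(chunk, collapse = ""), "")[[1]]
  }))
}

# Toy input, invented for illustration: two sequences, the second one wrapped.
example_lines <- c("GNNQQNY", "", "", "KLVFF", "AEDVG")
seqs <- parse_blank_separated(example_lines)
length(seqs)
```

For a file on disk, the same helper could be fed with `readLines()` on the file path.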
Sensitivity, specificity and Matthews correlation coefficient (MCC) of AmyloGram for different cutoffs, computed on the pep424 dataset.
spec_sens
A data frame with four columns and 99 rows.
Walsh, I., Seno, F., Tosatto, S.C.E., and Trovato, A. (2014). PASTA 2.0: an improved server for protein aggregation prediction. Nucleic Acids Research gku399.
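To illustrate how a table like spec_sens can be derived, the sketch below computes sensitivity, specificity and MCC over a grid of cutoffs in base R. Everything in it is an assumption made for illustration: the helper `metrics_at_cutoff`, the column names, the grid 0.01 to 0.99, and the toy probabilities and labels are invented and do not reproduce the values stored in the package.

```r
# Hypothetical helper: confusion-matrix metrics for a single cutoff.
# 'prob' holds predicted amyloid probabilities, 'truth' the known labels.
metrics_at_cutoff <- function(prob, truth, cutoff) {
  pred <- prob >= cutoff
  # as.numeric() avoids integer overflow in the MCC denominator.
  tp <- as.numeric(sum(pred & truth))
  tn <- as.numeric(sum(!pred & !truth))
  fp <- as.numeric(sum(pred & !truth))
  fn <- as.numeric(sum(!pred & truth))
  denom <- sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  data.frame(
    cutoff = cutoff,
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    # MCC is undefined when any marginal total is zero.
    mcc = if (denom == 0) NA_real_ else (tp * tn - fp * fn) / denom
  )
}

# Toy data, invented for illustration only.
prob  <- c(0.9, 0.8, 0.4, 0.3, 0.7, 0.1)
truth <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)

cutoffs <- (1:99) / 100  # assumed grid of 99 cutoffs
toy_metrics <- do.call(
  rbind,
  lapply(cutoffs, metrics_at_cutoff, prob = prob, truth = truth)
)
dim(toy_metrics)
```

Raising the cutoff trades sensitivity for specificity, and MCC summarises both in a single number, which is why a table of this kind is useful for choosing a decision threshold.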