ORFLine: a bioinformatic pipeline to prioritize small open reading frames identifies candidate secreted small proteins from lymphocytes
Hu F., Lu J., Matheson LS., Díaz-Muñoz MD., Saveliev A., Xu J., Turner M.
Motivation: Annotating small open reading frames (smORFs) of <100 codons (<300 nucleotides) is challenging because of the large number of such sequences in the genome.
Results: We developed a computational pipeline, named ORFLine, that stringently identifies smORFs and classifies them according to their position within transcripts. Using ORFLine, we identified and systematically characterized a total of 5744 unique smORFs in datasets from mouse B and T lymphocytes. We further searched these smORFs for the presence of a signal peptide, which identified known secreted chemokines as well as novel micropeptides. Four novel micropeptides show evidence of secretion and are therefore candidate mediators of immunoregulatory functions.
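To make the two core ideas concrete (the <100-codon size cutoff and classification by position within a transcript), the following is a minimal sketch and not the ORFLine implementation. It rests on several assumptions that the abstract does not specify: ORFs start at ATG and end at the first in-frame stop codon, the codon count excludes the stop codon, and the function names (`find_smorfs`, `classify_by_position`) and category labels are hypothetical. The stringency filters and the signal peptide search are not modeled.

```python
# Illustrative sketch only; names and category labels are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
MAX_CODONS = 100  # smORFs are defined in the text as <100 codons


def find_smorfs(seq, max_codons=MAX_CODONS):
    """Return (start, end) pairs for ATG-initiated ORFs with < max_codons codons.

    Coordinates are 0-based and half-open; `end` includes the stop codon.
    Assumption: for each stop codon only the most upstream in-frame ATG is
    considered, and the codon count excludes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    seen_stops = set()
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk downstream in frame until the first stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if pos not in seen_stops and n_codons < max_codons:
                    orfs.append((start, pos + 3))
                seen_stops.add(pos)
                break
    return orfs


def classify_by_position(orf, cds=None):
    """Label an ORF relative to the annotated CDS of its transcript.

    `cds` is a (start, end) pair in the same coordinates, or None when the
    transcript has no annotated coding sequence.
    """
    start, end = orf
    if cds is None:
        return "noncoding_transcript"
    cds_start, cds_end = cds
    if end <= cds_start:
        return "upstream"
    if start >= cds_end:
        return "downstream"
    return "overlapping"


# Toy transcript: a short ORF before, at, and after a made-up annotated CDS.
toy = "CC" + "ATGAAATAG" + "GG" + "ATGGCCGCCGCCTAA" + "C" + "ATGCCCTGA"
toy_cds = (13, 28)
labels = [classify_by_position(orf, toy_cds) for orf in find_smorfs(toy)]
print(labels)  # ['upstream', 'overlapping', 'downstream']
```

In this toy case the ORF labeled "overlapping" is the annotated CDS itself; a real analysis would additionally need to separate annotated coding sequences from novel smORFs.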