Computational techniques to recover missing gene expression data

Fraidouni, Negin
Záruba, Gergely V.

Fraidouni, Negin; Zaruba, Gergely V. 2018. Computational techniques to recover missing gene expression data. Advances in Science, Technology and Engineering Systems, vol. 3, no. 6, pp. 233-242


Almost every cell in the human body contains the same number of genes, so what distinguishes one cell from another is which genes are expressed at any given time. Gene expression can be measured by quantifying the amount of mRNA molecules, but doing so is expensive and time consuming. Computational methods can help biologists measure gene expression more efficiently by predicting unmeasured values from partial measurements. In this paper we describe how a gene expression dataset can be recovered using Euclidean distance, the Pearson correlation coefficient, cosine similarity, and robust PCA (RPCA). We treat the gene expression data as a matrix with missing values, in which rows correspond to genes and columns to subjects. To find the missing values, we assume that the data matrix is low rank. In one approach, we use different similarity metrics to find similar genes. In another approach, we employ RPCA to separate the underlying low-rank matrix from the sparse noise. We use existing implementations of state-of-the-art algorithms to compare their accuracy. We show that the RPCA approach outperforms the other approaches, reaching improvement factors beyond 4.8 in mean squared error.
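
The similarity-based idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `pearson_similarity` and `impute_by_similar_genes`, the parameter `k`, and the similarity-weighted averaging rule are all assumptions made for illustration, and only the Pearson variant is shown (the RPCA approach is not covered here). The sketch assumes a genes-by-subjects NumPy array in which missing entries are marked with NaN.

```python
import numpy as np


def pearson_similarity(a, b):
    """Pearson correlation over the positions observed in both vectors.

    Returns 0.0 when fewer than two shared observations exist or when
    either vector is constant on the shared positions.
    """
    shared = ~np.isnan(a) & ~np.isnan(b)
    if shared.sum() < 2:
        return 0.0
    x, y = a[shared], b[shared]
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    return float((xc * yc).sum() / denom) if denom > 0 else 0.0


def impute_by_similar_genes(data, k=2):
    """Fill NaN entries of a genes x subjects matrix (hypothetical helper).

    Each missing value is replaced by a similarity-weighted average of the
    values that the k most positively correlated genes have for the same
    subject. This is an illustrative rule, not the paper's exact method.
    """
    filled = data.copy()
    n_genes = data.shape[0]
    for g in range(n_genes):
        missing = np.isnan(data[g])
        if not missing.any():
            continue
        # Similarity of gene g to every other gene; exclude g itself.
        sims = np.array([
            pearson_similarity(data[g], data[h]) if h != g else -np.inf
            for h in range(n_genes)
        ])
        order = np.argsort(sims)[::-1]  # most similar genes first
        for s in np.where(missing)[0]:
            # Keep positively correlated genes that observed subject s.
            donors = [h for h in order
                      if sims[h] > 0 and not np.isnan(data[h, s])][:k]
            if donors:
                weights = sims[donors]
                filled[g, s] = np.dot(weights, data[donors, s]) / weights.sum()
            else:
                # Fallback (assumption): mean of the gene's observed values.
                filled[g, s] = np.nanmean(data[g])
    return filled


if __name__ == "__main__":
    # Toy matrix: 4 genes (rows) x 4 subjects (columns), one missing value.
    toy = np.array([
        [1.0, 2.0, 3.0, np.nan],
        [1.0, 2.0, 3.0, 4.0],
        [1.0, 2.0, 3.0, 4.0],
        [4.0, 3.0, 2.0, 1.0],
    ])
    print(impute_by_similar_genes(toy, k=2))
```

Averaging raw values from neighbouring genes ignores differences in scale between genes, so a practical version would likely normalise each row first; swapping `pearson_similarity` for a Euclidean or cosine measure changes only how the neighbours are ranked.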

© 2018 Advances in Science, Technology and Engineering Systems. During 2016, ASTES Journal started to publish articles under the Creative Commons Attribution License and are now using the latest version of the CC BY license, which grants authors the most extensive rights. This means that all articles published in ASTES Journal, including data, graphics, and supplements, can be linked from external sources, scanned by search engines, re-used by text mining applications or websites, blogs, etc. free of charge under the sole condition of proper accreditation of the source and original publisher. Important Note: some articles (especially Reviews) may contain figures, tables or text taken from other publications, for which ASTES Journal does not hold the copyright or the right to re-license the published material. Please note that you should inquire with the original copyright holder (usually the original publisher or authors), whether or not this material can be re-used.