Mahmoud Ahmed

Postdoc - Cancer Genomics

Re-implementation of an algorithm to integrate transcriptome and ChIP-seq data


Preprint


Mahmoud Ahmed, Deok Ryong Kim
bioRxiv, Cold Spring Harbor Laboratory, 2022 Nov 25


Cite

APA
Ahmed, M., & Kim, D. R. (2022). Re-implementation of an algorithm to integrate transcriptome and ChIP-seq data. BioRxiv. https://doi.org/10.1101/2022.11.23.517753


Chicago/Turabian
Ahmed, Mahmoud, and Deok Ryong Kim. “Re-Implementation of an Algorithm to Integrate Transcriptome and ChIP-Seq Data.” bioRxiv (November 25, 2022).


MLA
Ahmed, Mahmoud, and Deok Ryong Kim. “Re-Implementation of an Algorithm to Integrate Transcriptome and ChIP-Seq Data.” BioRxiv, Cold Spring Harbor Laboratory, Nov. 2022, doi:10.1101/2022.11.23.517753.


BibTeX

@article{ahmed2022a,
  title = {Re-implementation of an algorithm to integrate transcriptome and ChIP-seq data},
  year = {2022},
  month = nov,
  day = {25},
  journal = {bioRxiv},
  publisher = {Cold Spring Harbor Laboratory},
  doi = {10.1101/2022.11.23.517753},
  author = {Ahmed, Mahmoud and Kim, Deok Ryong},
  month_numeric = {11}
}

Abstract

Transcription factor binding to a gene's regulatory region induces or represses the gene's expression. Binding and expression target analysis (BETA) integrates binding and gene expression data to predict this function. First, the regulatory potential of the factor is modeled with a decay function of the distance between its binding sites and the transcription start sites. Then, the differential expression statistics from an experiment in which the factor was perturbed represent the binding effect. The rank product of the two values is used to order the genes by importance. This algorithm was originally implemented in Python. We reimplemented it in R to take advantage of existing data structures and other tools for downstream analyses. Here, we attempted to replicate the findings of the original BETA paper. We applied the new implementation to the same datasets using default and varying inputs and cutoffs. We successfully replicated the original results. Moreover, we showed that the method responded appropriately to changes in the input and was robust to the choice of cutoffs in statistical testing.
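
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' R implementation. The abstract does not state the decay function, so the exponential form exp(-(0.5 + 4d/d0)) with d0 = 100 kb is an assumption here, as are the tie handling, the normalization of ranks by the number of genes, and all function names and toy values, which are hypothetical.

```python
import math


def regulatory_potential(distances_bp, scale_bp=100_000):
    """Sum a decaying contribution for each binding site near one gene.

    distances_bp: signed distances (bp) from binding sites to the gene's TSS.
    The decay form and the 100 kb scale are assumptions, not taken from the abstract.
    """
    total = 0.0
    for d in distances_bp:
        delta = abs(d) / scale_bp
        if delta <= 1.0:  # sites beyond the window contribute nothing
            total += math.exp(-(0.5 + 4.0 * delta))
    return total


def descending_ranks(values):
    """Rank 1 = largest value; ties are broken by input order (a simplification)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


def rank_product(potentials, expression_stats):
    """Multiply the normalized binding rank by the normalized expression rank."""
    n = len(potentials)
    binding_ranks = descending_ranks(potentials)
    expression_ranks = descending_ranks([abs(s) for s in expression_stats])
    return [(b / n) * (e / n) for b, e in zip(binding_ranks, expression_ranks)]


# Hypothetical toy data: binding-site distances and differential expression statistics.
peak_distances = {"geneA": [500, -2000], "geneB": [80_000], "geneC": []}
stats = {"geneA": 5.2, "geneB": -1.1, "geneC": 0.3}

genes = list(peak_distances)
potentials = [regulatory_potential(peak_distances[g]) for g in genes]
scores = rank_product(potentials, [stats[g] for g in genes])

# A smaller rank product means stronger combined evidence of being a direct target.
ranked = sorted(zip(genes, scores), key=lambda pair: pair[1])
for gene, score in ranked:
    print(gene, round(score, 3))
```

In this sketch, a gene with nearby binding sites and a large expression change receives a small rank product and therefore sorts to the top of the list.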