1. Web-based front-end for Emerald

The goal of this project is to create a web-based front-end for the C++ command-line application Emerald. This lets the user interact with the application through a graphical user interface and makes visualizations possible. Emerald analyzes protein sequences and computes regions of the proteins that are likely to be important for the protein. It can be applied in various ways: for example, we envision using it to speed up protein structure prediction with AlphaFold2, whose developers received the Nobel Prize in Chemistry in 2024.
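
As a rough illustration of how a back-end might hand a request to a command-line program, the sketch below runs an arbitrary command and captures its output. It is only a sketch under assumptions: the choice of Python, the helper name run_cli, and the timeout value are all hypothetical, and Emerald's actual arguments are not specified here, so the command is supplied by the caller.

```python
import subprocess
import sys
from typing import Sequence


def run_cli(command: Sequence[str], timeout_s: float = 60.0) -> dict:
    """Run a command-line tool and return its exit code and captured output.

    Hypothetical helper: the command list is supplied by the caller, because
    the real Emerald invocation is not specified in this description.
    """
    completed = subprocess.run(
        list(command),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output as text
        timeout=timeout_s,    # raise subprocess.TimeoutExpired on long runs
        check=False,          # report failures through the return code
    )
    return {
        "returncode": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


if __name__ == "__main__":
    # Stand-in command; a real back-end would pass the Emerald invocation here.
    result = run_cli([sys.executable, "-c", "print('ok')"])
    print(result["returncode"], result["stdout"].strip())
```

A web handler could call such a helper and return the captured output to the front-end for display or visualization.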

Objectives

Implementation environment

Front-end

Back-end

2. Fusion gene discovery from long RNA reads

Fusion genes arise from aberrant chromosomal rearrangements and in some cases cause cancer. Long RNA reads (such as those from Oxford Nanopore and PacBio sequencing) have already proved to be a great instrument for transcript and gene discovery. In this project we plan to examine existing tools [1-3], simulate sequencing data containing fusion genes, and develop our own algorithm for fusion gene discovery.
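
To make the simulation step concrete, here is a toy sketch of how a fusion transcript and noisy reads derived from it could be generated. It is not a realistic long-read simulator: the function names, the full-length reads, and the uniform substitution/insertion/deletion error model are all simplifying assumptions made for illustration.

```python
import random


def make_fusion_transcript(transcript_a, transcript_b, breakpoint_a, breakpoint_b):
    """Join the 5' part of transcript A to the 3' part of transcript B."""
    return transcript_a[:breakpoint_a] + transcript_b[breakpoint_b:]


def add_errors(sequence, error_rate, rng):
    """Corrupt a sequence: each base is hit by an error with probability error_rate."""
    out = []
    for base in sequence:
        if rng.random() < error_rate:
            kind = rng.choice(("sub", "ins", "del"))
            if kind == "sub":
                # Replace the base with a different nucleotide.
                out.append(rng.choice([b for b in "ACGT" if b != base]))
            elif kind == "ins":
                # Keep the base and insert a random nucleotide after it.
                out.append(base)
                out.append(rng.choice("ACGT"))
            # "del": the base is dropped, so nothing is appended.
        else:
            out.append(base)
    return "".join(out)


def simulate_reads(transcript, n_reads, error_rate=0.05, seed=0):
    """Generate n_reads full-length noisy copies of a transcript (toy model)."""
    rng = random.Random(seed)  # fixed seed keeps the simulation reproducible
    return [add_errors(transcript, error_rate, rng) for _ in range(n_reads)]


if __name__ == "__main__":
    fusion = make_fusion_transcript("AAAACCCC", "GGGGTTTT", 4, 4)
    for read in simulate_reads(fusion, n_reads=3, error_rate=0.1, seed=1):
        print(read)
```

Because the breakpoints are known by construction, reads simulated this way provide a ground truth against which fusion detection tools can be compared.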

References

  1. Davidson, N. M., Chen, Y., Sadras, T., Ryland, G. L., Blombery, P., Ekert, P. G., ... & Oshlack, A. (2022). JAFFAL: detecting fusion genes with long-read transcriptome sequencing. Genome Biology, 23(1), 10.
  2. Liu, Q., Hu, Y., Stucky, A., Fang, L., Zhong, J. F., & Wang, K. (2020). LongGF: computational algorithm and software tool for fast and accurate detection of gene fusions by long-read transcriptome sequencing. BMC Genomics, 21, 1-12.
  3. Chen, Y., Wang, Y., Chen, W., Tan, Z., Song, Y., Human Genome Structural Variation Consortium, ... & Chong, Z. (2023). Gene fusion detection and characterization in long-read cancer transcriptome sequencing data with FusionSeeker. Cancer Research, 83(1), 28-33.