1. Web-based front-end for Emerald
The goal of this project is to create a web-based front-end for the C++ command-line application Emerald. The front-end lets the user interact with the application through a graphical user interface and makes visualizations possible. Emerald analyzes protein sequences and computes regions of the proteins that are likely to be important for the protein. It can be applied in various ways: for example, we envision using it to speed up protein structure prediction with AlphaFold2, whose developers were awarded the Nobel Prize in Chemistry in 2024.
Objectives
- Web-based front-end
- Support the features of Emerald
- Visualize the output
- Create interactive tools to analyze the output data
- Easy to use (the target audience includes, for example, biologists)
Implementation environment
Front-end
- Web-based
- Implemented in HTML, CSS, and JavaScript
- Using libraries such as React
Back-end
- Emerald is already written (in C++) and uses a single non-standard library, GMP
- Emerald should run in the browser via WebAssembly (we are open to other suggestions proposed by the students, as long as the end product is easy to use)
2. Fusion gene discovery from long RNA reads
Fusion genes arise from aberrant chromosomal rearrangements and, in some cases, cause cancer.
Long RNA reads (such as those from Oxford Nanopore and PacBio) have already proved to be a great instrument for transcript and gene discovery.
In this project we plan to examine existing tools [1-3], simulate sequencing data containing fusion genes, and develop our own algorithm for fusion gene discovery.
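To make the simulation step more concrete, here is a minimal sketch of how reads from a fusion transcript could be generated. It is an illustration under simplifying assumptions, not the planned pipeline: the two "genes" are random sequences, the breakpoints are chosen arbitrarily, and the error model contains substitutions only (real long reads also have insertions and deletions). All function names and parameters (`make_fusion`, `sample_read`, `spans_breakpoint`, `sub_rate`, `flank`) are hypothetical.

```python
import random


def random_sequence(length, rng):
    """Return a random DNA string of the given length (stand-in for a real transcript)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def make_fusion(transcript_a, transcript_b, break_a, break_b):
    """Join the 5' part of A (up to break_a) with the 3' part of B (from break_b)."""
    return transcript_a[:break_a] + transcript_b[break_b:]


def sample_read(transcript, min_len, rng, sub_rate=0.05):
    """Sample one read: a random substring with substitution errors only.

    Returns the start position on the transcript and the read sequence.
    """
    read_len = rng.randint(min(min_len, len(transcript)), len(transcript))
    start = rng.randint(0, len(transcript) - read_len)
    bases = []
    for base in transcript[start:start + read_len]:
        if rng.random() < sub_rate:
            # Replace the base with one of the three other nucleotides.
            base = rng.choice([b for b in "ACGT" if b != base])
        bases.append(base)
    return start, "".join(bases)


def spans_breakpoint(start, read_len, breakpoint, flank=20):
    """True if the read covers at least `flank` bases on both sides of the junction."""
    return start + flank <= breakpoint and start + read_len >= breakpoint + flank


rng = random.Random(42)  # fixed seed so the simulation is reproducible
gene_a = random_sequence(600, rng)
gene_b = random_sequence(800, rng)
fusion = make_fusion(gene_a, gene_b, break_a=350, break_b=300)

reads = [sample_read(fusion, 200, rng) for _ in range(100)]
n_spanning = sum(spans_breakpoint(start, len(read), 350) for start, read in reads)
print(f"{n_spanning} of {len(reads)} simulated reads span the fusion junction")
```

Because the true breakpoint is known for every simulated read, data of this kind could serve as ground truth when comparing the existing tools with a new algorithm. For realistic data, a dedicated long-read simulator and real transcript annotations would presumably replace the random sequences and the toy error model.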
References
[1] Davidson, N. M., Chen, Y., Sadras, T., Ryland, G. L., Blombery, P., Ekert, P. G., ... & Oshlack, A. (2022). JAFFAL: detecting fusion genes with long-read transcriptome sequencing. Genome Biology, 23(1), 10.
[2] Liu, Q., Hu, Y., Stucky, A., Fang, L., Zhong, J. F., & Wang, K. (2020). LongGF: computational algorithm and software tool for fast and accurate detection of gene fusions by long-read transcriptome sequencing. BMC Genomics, 21, 1-12.
[3] Chen, Y., Wang, Y., Chen, W., Tan, Z., Song, Y., Human Genome Structural Variation Consortium, ... & Chong, Z. (2023). Gene fusion detection and characterization in long-read cancer transcriptome sequencing data with FusionSeeker. Cancer Research, 83(1), 28-33.