Chin-Yu Lee (李清宇)
Research Fellow in Precision Medicine
Bioinformatics Engineer
Medical Technologist

About Me

My name is Chin-Yu Lee. I graduated from Taipei Medical University with a degree in Medical Laboratory Science and Biotechnology, and I am a licensed medical technologist. I also hold a master's degree in Toxicology from National Taiwan University. My research focused on using induced pluripotent stem cells (iPSCs) to study the combined effects of heavy metals and polycyclic aromatic hydrocarbons (PAHs) on East Asian lung adenocarcinoma. I specialize in bioinformatics, including RNA-seq, scRNA-seq, and methylation array analyses, and I am skilled in R, Linux, and NGS pipelines. With a strong background in lab operations, data analysis, and programming, I aim to integrate bioinformatics and AI to advance precision medicine and contribute to the biotechnology industry.

Education

Master's Degree
Toxicology Graduate Institute, National Taiwan University
2022 ~ 2024
GPA: 3.86
Bachelor's Degree
Medical Laboratory Science and Biotechnology, Taipei Medical University
2017 ~ 2021
GPA: 3.84

Experience

Dec 2024 ~ Present
Research Fellow in Precision Medicine, Medcom Biotech
Jul 2022 ~ Aug 2024
Graduate Student, National Taiwan University
Aug 2020 ~ Dec 2020
Medical Laboratory Technologist Intern, Taipei Municipal Wanfang Hospital
Jul 2018 ~ Jul 2020
Caretaker of Laboratory Animal Center, Taipei Medical University
Jul 2018 ~ Jun 2020
Lab Intern, Taipei Medical University

Skills

My expertise centers on bioinformatics, including RNA-seq, scRNA-seq, methylation array, and NGS analyses. I am proficient with bioinformatics tools such as GATK4, BWA, Samtools, Bedtools, FastQC, and MultiQC.

I also have extensive experience with programming languages and computing tools, including R, Python, Bash, Linux, Docker, and Git. My skill set extends to AI, with a focus on machine learning and deep learning.

(Created using WordCloud, Matplotlib, Python)

Portfolio

RNAseq Analysis

This repository contains a collection of R scripts for bioinformatics and genomic data analysis. Key features include workflows for RNA-seq analysis, gene set enrichment analysis, transcriptional regulatory network analysis, and pathway analysis. It also includes specialized scripts for working with The Cancer Genome Atlas (TCGA) data and preprocessing genomic datasets. The repository provides a comprehensive toolkit for researchers aiming to perform advanced analyses of high-throughput sequencing data.

[GitHub]
Methylation Analysis

This repository contains scripts for the initial processing and analysis of methylation array data, primarily using the SeSAMe package in R. Key scripts include sesame_methylation_analysis.R for processing methylation data, methy_heatmap.R for visualizing methylation patterns, and plotArg.R for generating custom plots. These scripts were developed as part of a project at National Taiwan University to streamline methylation analysis workflows.

[GitHub]
scRNA-seq Analysis

This repository contains the final project from a transcriptomics course at National Taiwan University. It demonstrates the creation of a custom analysis workflow for single-cell RNA sequencing (scRNA-seq) data based on a published paper. Key files include final_project.R and scRNA-seq.R, which detail the analytical process, and the rendered report final_project.html. The project highlights the use of tools like loupeR and RMarkdown for documenting and visualizing results.

[GitHub]
Transcriptome Analysis Course

This repository contains assignments and exercises from the Transcriptomics course at National Taiwan University. It includes weekly homework files focusing on transcriptomics analysis (e.g., HWweek2.R, HWweek3.R) and a data visualization folder with scripts for creating heatmaps, ggplot2 visualizations, and analyzing cell viability. Additionally, a package directory provides relevant resources and documentation for deeper exploration of transcriptomics workflows and R programming fundamentals.

[GitHub]
Data Science

This repository is a comprehensive collection showcasing data science and programming skills. It includes Python basics and advanced concepts like regular expressions, unit testing, and concurrent programming, along with practical implementations of machine learning algorithms. It also features topics on object-oriented programming (OOP), data visualization using Matplotlib, and data manipulation with tools like NumPy, Pandas, and scikit-learn. This resource serves as a solid foundation for aspiring data scientists and developers, providing hands-on examples and structured learning for practical applications.

[GitHub]
Computer Science

This repository focuses on fundamental concepts of computer science, including Bash scripting, Docker containerization, Git version control, and HTML development. Each folder contains practical examples and exercises to build proficiency in essential tools and technologies for software development and systems operations.

[GitHub]
NGS

This repository is dedicated to Next-Generation Sequencing (NGS) data analysis, featuring tools and workflows for genomic studies. It includes directories for popular tools such as BCFtools, BEDtools, and SAMtools, as well as scripts for Whole Exome Sequencing (WES) analysis. It also includes a hands-on WES practice script for processing and analyzing genomic data.

[GitHub]
Deep Learning

This repository focuses on deep learning concepts and applications. It includes key resources like Jupyter notebooks on neural networks and image classification, offering hands-on examples and implementations. The repository also contains curated learning materials to support in-depth understanding of deep learning fundamentals and practical applications, making it a valuable resource for anyone exploring artificial intelligence and machine learning.

[GitHub]
My personal website

This repository hosts my personal website built with HTML and CSS. It features interactive visualizations, such as word clouds, and highlights my projects and experiences. The repository is deployed using GitHub Pages and serves as a showcase of my technical skills and portfolio.

[GitHub]

Certifications

Medical Technologist Certification
Medical Technologist License

Certified by the Examination Yuan, Taiwan. This license demonstrates expertise in medical laboratory science, including clinical practices, diagnostic techniques, and quality assurance, as well as proficiency in collaborating with healthcare teams and operating advanced diagnostic equipment.

Bioinformatics Certification
Genomic Data Science

Advanced certification in genomic data analysis, including NGS data processing, DNA sequencing, bioinformatics tools (BWA, Samtools, Bedtools), and statistical modeling using command line tools and R.

SAS Statistical Business Analyst

This certification demonstrates proficiency in using SAS software for statistical modeling and data analysis in a business context. Key skills include hypothesis testing, linear and logistic regression, and model fitting to make data-driven decisions.

Google Data Analytics

The Google Data Analytics certification showcases expertise in data cleaning, data visualization, and analysis using tools like Google Sheets, SQL, Tableau, and R. It focuses on preparing, processing, and analyzing data to drive strategic business decisions.

Machine Learning/AI Engineer

The Machine Learning/AI Engineer program from Codecademy provides a strong foundation in machine learning concepts, neural networks, natural language processing, and AI model deployment. It focuses on using TensorFlow and scikit-learn to build intelligent systems.

Build Deep Learning Models with TensorFlow

This program from Codecademy focuses on constructing and training deep learning models. It covers neural networks, convolutional networks, and recurrent networks using TensorFlow, enabling practical applications of deep learning in real-world scenarios.

Software Engineering for Data Scientists

This course by Codecademy equips data scientists with essential software engineering skills. It covers version control, modular code development, testing, and deployment, enabling data scientists to write clean, efficient, and scalable code for real-world applications.

Data Science Foundations

This course provides a comprehensive introduction to the core concepts and tools of data science. It covers data manipulation, statistical analysis, data visualization, and fundamental programming skills, laying a solid foundation for aspiring data scientists.

R for Data Science

I have successfully completed the R for Data Science program, which provided comprehensive training in R programming, data analysis, and data science techniques. This certification demonstrates my ability to apply data-driven approaches to solving complex problems using R.