Gunjun Lee

Email: gunjunlee97@gmail.com

Address: Seoul, Republic of Korea

Blog: blog.gunjunlee.com

GitHub: gunjunlee

Research interests: Computer architecture, Machine learning, LLM, Speculative decoding, LLM inference / serving

Work Experience

Server Manager Scale lab

Seoul, Republic of Korea 2024.03~

Machine Learning Engineer Hyperconnect (Match Group, Inc)

Seoul, Republic of Korea 2019.10~2023.07

Developed a machine-learning-based recommendation system for in-app matching and deployed it to production.
Built model training, inference, and data pipelines with NVIDIA Triton Inference Server, PyTorch, BigQuery, Kubeflow, Prometheus/Grafana, and FastAPI.
Optimized ML serving costs by reducing software overhead.
Developed an iOS application based on Core ML with custom kernel functions written in Metal.

Machine Learning Engineer EstSoft

Seoul, Republic of Korea 2018.04~2019.10

Education

Seoul National University, PhD in Department of Intelligence and Information 2025.09 - Current (Advisor: Jung Ho Ahn)

Seoul National University, MS in Department of Intelligence and Information 2024.03 - 2025.08 (Advisor: Jung Ho Ahn)

Seoul National University, BS in Electrical and Computer Engineering 2015.03 - 2022.02

Publications

From Tokens to Layers: Redefining Stall-Free Scheduling for LLM Serving with Layered Prefill (MLSys, 2026)

Gunjun Lee, Jiwon Kim, Jaiyoung Park, Younjoo Lee, Jung Ho Ahn
keywords & skills: C++, CUDA, vLLM, torch.compile, CUDA graph, ZeroMQ, Tensor Parallelism, torch Extension, nsys, ncu

The New LLM Bottleneck: A Systems Perspective on Latent Attention and Mixture-of-Experts (arXiv, 2025)

Sungmin Yun, Seonyong Park, Hwayong Nam, Younjoo Lee, Gunjun Lee, Kwanhee Kyung, Sangpyo Kim, Nam Sung Kim, Jongmin Kim, Hyungyo Kim, Juhwan Cho, Seungmin Baek, Jung Ho Ahn