Jun Bai (白骏)
Research Engineer
BIGAI NLCo Lab
About Me

I received my Ph.D. from the School of Computer Science and Engineering at Beihang University, China, where I was supervised by Prof. Wenge Rong and Prof. Chuantao Yin.

My research primarily focuses on building Trustworthy AI systems. My current research themes include:

  • Investigating and Improving the Faithfulness of LLMs.
  • Exploring and Enhancing the CoT Monitorability of LRMs.
  • Understanding the Internal Workings of LLMs through Mechanistic Interpretability.
Education
  • Beihang University
    Ph.D. in Computer Science
    Sep. 2020 - Nov. 2024
Employment
  • BIGAI NLCo Lab
    Research Engineer
    Dec. 2024 - present
News
2025
🚀 We are excited to announce the release of the Native Parallel Reasoner.
Dec 09
RouterLens, TongSearch-QR, and CogAtom have been accepted by EMNLP 2025.
Aug 20
CLG has been accepted by ACL 2025.
May 15
Selected Publications
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Tong Wu*, Yang Liu*, Jun Bai*, Zixia Jia, Shuyi Zhang, Ziyong Lin, Yanting Wang, Song-Chun Zhu, Zilong Zheng (* equal contribution)

Preprint 2025

Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs

Jun Bai, Minghao Tong, Yang Liu, Zixia Jia, Zilong Zheng

EMNLP 2025 Top 2%

Reinforced Query Reasoners for Reasoning-intensive Retrieval Tasks

Xubo Qin, Jun Bai, Jiaqi Li, Zixia Jia, Zilong Zheng

EMNLP 2025

CogAtom: From Cognitive Atoms to Olympiad-level Mathematical Reasoning in Large Language Models

Zhuofan Chen, Jiyuan He, Yichi Zhang, Xing Hu, Haoxing Wen, Jun Bai, Wenge Rong

EMNLP Findings 2025

Selecting Text Classification Model through Maximizing Posterior Evidence over Informative Sub-space

Zhiwei Sun, Jun Bai#, Zhuofan Chen, Chen Li, Wenge Rong, Zhang Xiong (# corresponding author)

Frontiers of Computer Science 2025

Selecting Demonstrations for Many-Shot In-Context Learning via Gradient Matching

Jianfei Zhang, Bei Li, Jun Bai, Rumei Li, Yanmeng Wang, Chenghua Lin, Wenge Rong

ACL Findings 2025

Rectifying and Discriminating Hard Negatives for Biomedical Retrieval Question Answering

Jun Bai, Zhenzi Li, Bo Zhao, Chen Li, Chenghua Lin, Wenge Rong

IEEE Transactions on Computational Biology and Bioinformatics 2025

Disentangling Preference Representation and Text Generation for Efficient Individual Preference Alignment

Jianfei Zhang, Jun Bai, Bei Li, Yanmeng Wang, Rumei Li, Chenghua Lin, Wenge Rong

COLING 2025

Leveraging Estimated Transferability over Human Intuition for Model Selection in Text Ranking

Jun Bai, Zhuofan Chen, Zhenzi Li, Hanhua Hong, Jianfei Zhang, Chen Li, Chenghua Lin, Wenge Rong

EMNLP 2024

How to Determine the Most Powerful Pre-trained Language Model without Brute Force Fine-tuning? An Empirical Survey

Jun Bai, Xiaofeng Zhang, Chen Li, Hanhua Hong, Xi Xu, Chenghua Lin, Wenge Rong

EMNLP Findings 2023

Permutation Invariant Training for Paraphrase Identification

Jun Bai, Chuantao Yin, Hanhua Hong, Jianfei Zhang, Chen Li, Yanmeng Wang, Wenge Rong

ICASSP 2023 Oral

Improving Variational Autoencoders with Density Gap-based Regularization

Jianfei Zhang, Jun Bai, Chenghua Lin, Yanmeng Wang, Wenge Rong

NeurIPS 2022

Improving Biomedical ReQA with Consistent NLI-transfer and Post-whitening

Jun Bai, Chuantao Yin, Zimeng Wu, Jianfei Zhang, Yanmeng Wang, Guanyi Jia, Wenge Rong, Zhang Xiong

IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022

Adversarial Knowledge Distillation based Biomedical Factoid Question Answering

Jun Bai, Chuantao Yin, Jianfei Zhang, Yanmeng Wang, Yi Dong, Wenge Rong, Zhang Xiong

IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022

Enhancing Dual-encoders with Question and Answer Cross-embeddings for Answer Retrieval

Yanmeng Wang, Jun Bai, Ye Wang, Jianfei Zhang, Wenge Rong, Zongcheng Ji, Shaojun Wang, Jing Xiao

EMNLP Findings 2021

Paragraph Level Multi-perspective Context Modeling for Question Generation

Jun Bai, Wenge Rong, Feiyu Xia, Yanmeng Wang, Yuanxin Ouyang, Zhang Xiong

ICASSP 2021
