A Comparison of Chain-of-Thought Bypass Errors in Large Reasoning Models trained with RLVR vs. SFT
2025
Alistair Cheong
Mini Research Project done as part of CMU's Undergraduate First Year Writing Course
Blog Post Coming Soon
Cao Yuxuan, Wu Jiayang, Alistair Cheong Liang Chuen, Bryan Shan Guanrong, Theodore Lee Chong Jen, Sherman Chann Zhi Shen
3rd Workshop, C3NLP, NAACL 2025
Mahsa Paknezhad, Cuong Phuc Ngo, Amadeus Aristo Winarto, Alistair Cheong, Chuen Yang Beh, Jiayang Wu, Hwee Kuan Lee
Neurocomputing, Volume 495, Pages 178-193