Pohang University of Science and Technology, South Korea
Daijin Kim received the BS degree in Electronic Engineering from Yonsei University, Seoul, South Korea, in 1981, and the MS degree in Electrical Engineering from the Korea Advanced Institute of Science and Technology (KAIST), Taejon, in 1984. In 1991, he received the PhD degree in Electrical and Computer Engineering from Syracuse University, Syracuse, NY. From 1992 to 1999, he was an Associate Professor in the Department of Computer Engineering at DongA University, Pusan, Korea. He is currently a Professor in the Department of Computer Science and Engineering at POSTECH, Pohang, Korea. His research interests include face and human analysis, machine intelligence, and advanced driver assistance systems.
Recently, many face alignment methods based on convolutional neural networks (CNNs) have been introduced because of their high accuracy. However, they do not run in real time because of their high computational cost. In this paper, we propose a three-stage convolutional neural regression network (CNRN) that achieves highly accurate face alignment in real time. The first stage consists of one CNRN that maps the facial image to the center positions of seven facial parts (eyes, nose, mouth, etc.). We obtain 68 local facial patches by aligning the mean shape to the center positions of the seven facial parts. The second stage consists of seven independent CNRNs, where each CNRN maps the local facial patches within its facial part to their displacements in the x and y directions toward the target positions. From the fitted facial features we then generate a warped facial image. The third stage consists of one CNRN that maps the warped facial image to the appearance error. We repeat the second and third stages until the appearance error becomes small. The proposed method is fast because it fits the facial parts first and then the facial features within each part, in a coarse-to-fine manner, and because each CNRN is relatively simple. It is highly accurate because it refines the facial features iteratively, combining local regression on the facial features with global regression on the warped appearance image. In the experiments, the proposed method yields more accurate and stable face alignment and tracking under heavy occlusion and large pose variation than existing state-of-the-art methods, and it runs in real time.
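The control flow of the three stages can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the regressors `stage1`, `stage2`, and `stage3` and the `warp` function are hypothetical callables supplied by the caller, the appearance error is assumed to be a scalar, and the patch size, iteration cap `max_iters`, and threshold `tol` are placeholder values that do not come from the paper.

```python
import numpy as np

def init_shape(centers, mean_shape, parts):
    """Translate each part of the mean shape so its centroid lies on the
    predicted part center (assumed form of the stage-1 alignment)."""
    shape = mean_shape.astype(float).copy()
    for p, idx in enumerate(parts):
        shape[idx] += centers[p] - shape[idx].mean(axis=0)
    return shape

def crop_patches(image, points, half=8):
    """Cut a (2*half, 2*half) patch around each (x, y) point of a 2-D image,
    replicating edge pixels at the borders. The patch size is an assumption."""
    padded = np.pad(image, half, mode="edge")
    patches = []
    for x, y in np.rint(points).astype(int):
        x = int(np.clip(x, 0, image.shape[1] - 1))
        y = int(np.clip(y, 0, image.shape[0] - 1))
        patches.append(padded[y:y + 2 * half, x:x + 2 * half])
    return np.stack(patches)

def align(image, stage1, stage2, stage3, warp, mean_shape, parts,
          max_iters=5, tol=1e-2):
    """Coarse-to-fine fitting loop.

    stage1(image)    -> (7, 2) part centers          (hypothetical regressor)
    stage2[p](patch) -> (n_p, 2) x/y displacements   (one regressor per part)
    stage3(warped)   -> scalar appearance error      (assumed to be a scalar)
    warp(image, shape) -> warped facial image        (hypothetical helper)
    """
    # Stage 1: place the mean shape using the predicted part centers.
    shape = init_shape(stage1(image), mean_shape, parts)
    err = float("inf")
    for _ in range(max_iters):
        # Stage 2: each part regresses displacements from its own patches.
        for p, idx in enumerate(parts):
            shape[idx] += stage2[p](crop_patches(image, shape[idx]))
        # Stage 3: global check on the warped appearance.
        err = float(stage3(warp(image, shape)))
        if err < tol:
            break
    return shape, err
```

Here `parts` is a list of seven index arrays that partition the 68 landmarks among the facial parts, and `mean_shape` is a `(68, 2)` array of `(x, y)` positions. Passing the regressors as callables keeps the loop independent of any particular network framework.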