TY - GEN
T1 - Hierarchical Coding for Talking-Head Video
AU - Liu, Yu
AU - Li, Shibo
AU - Zhu, Shuyuan
AU - Yeung, Siu Kei Au
AU - Wen, Xing
AU - Zeng, Bing
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Talking-head video is very popular in video conferencing and social media, where the camera captures the movement of the user's head and changes in facial expression. In this paper, we propose a hierarchical coding scheme for the compression of talking-head video. In the proposed method, three data layers (one base layer, one enhancement layer, and one feature layer) are formed as the input to the encoder. More specifically, the base layer is generated by spatially sub-sampling the source video, the enhancement layer is composed of specific key frames, and the feature layer is produced from the extracted facial landmarks. These layers are compressed separately but fused together at the decoder side to reconstruct the video signal. To achieve a high-quality reconstruction, we design a multi-feature fusion network in which the feature layer guides the fusion of the base layer and the enhancement layer. Experimental results demonstrate the good performance of the proposed method for the coding of talking-head video.
KW - HEVC
KW - Talking-head video
KW - coding
KW - facial landmarks
KW - fusion
UR - http://www.scopus.com/inward/record.url?scp=85142493054&partnerID=8YFLogxK
U2 - 10.1109/ISCAS48785.2022.9937480
DO - 10.1109/ISCAS48785.2022.9937480
M3 - Conference contribution
AN - SCOPUS:85142493054
T3 - Proceedings - IEEE International Symposium on Circuits and Systems
SP - 3043
EP - 3047
BT - IEEE International Symposium on Circuits and Systems, ISCAS 2022
T2 - 2022 IEEE International Symposium on Circuits and Systems, ISCAS 2022
Y2 - 27 May 2022 through 1 June 2022
ER -