Abstract: Audio-driven 3D talking head generation aims to synthesize vivid and realistic facial expressions from speech signals. Although current systems have mastered lip synchronization, the lack of ...