ZIYUE ZENG

Master’s Student → Incoming PhD Student

Waseda University

Biography

Ziyue Zeng is a master’s student at Waseda University (Watanabe Laboratory) and will begin his PhD at the Kato Laboratory in September 2026. His research focuses on diffusion-based deepfake detection, video generation and frame interpolation, and 3DGS-based talking-head synthesis. He is also a Research Assistant at NICT Japan, working on next-generation video coding technologies.

Interests
  • Diffusion-based Deepfake Detection
  • Video Generation & Frame Interpolation
  • Audio-driven Talking Head Modeling & 3DGS
  • Next-Generation Video Coding
Education
  • Ph.D. in Fundamental Science and Engineering (Kato Lab), starting Sep. 2026

    Waseda University

  • M.S. in Fundamental Science and Engineering (Watanabe Lab), Sep. 2024 – Jul. 2026

    Waseda University

  • B.S. in Artificial Intelligence, Sep. 2020 – Jul. 2024

    Chongqing University

Skills

Technical
  • Python
  • Deep Learning
  • Data Science

Hobbies
  • Mountain Climbing
  • Cats
  • Photography

Experience

Research Assistant, NICT Japan
April 2025 – April 2026 | Tokyo, Japan
  • Applying frame-interpolation techniques to video compression and transmission
  • Video generation and frame interpolation using diffusion models: Bi-AGMI (IEEE GCCE 2025, oral)
  • Developing video slicing algorithms driven by motion-intensity analysis: FRS (IEVC 2026)

Graduate Student (Watanabe Laboratory)
September 2024 – July 2026 | Tokyo, Japan
  • Developed expertise in the theory, implementation, and application of diffusion models
  • Focused on deepfake detection using a novel feature extractor: TSG (ACM MMAsia 2025, oral)
  • Collaborated on combining Stable Video Diffusion with ControlNet for large-motion frame interpolation

Research Assistant
September 2021 – June 2024 | Chongqing, China
  • Conducted research on information theory and information fusion
  • Published two journal articles on information fusion (Impact Factor 5.3)

Projects

Audio-Driven 3D Talking Head
A 3DMM-guided diffusion framework for audio-driven talking head generation based on 3DGS, enabling controllable facial motion and high-fidelity video synthesis.
TSG: Time Step Generating
A universal synthesized deepfake image & video detector, independent of pretraining models, specific datasets, or sampling algorithms. (ACM MMAsia 2025 Oral)
Next-Generation Video Coding
Research on a new video encoding paradigm driven by diffusion models, starting from ultra-low bit-rates to gradually resolve bottlenecks in traditional codecs.

Gallery

Contact

My email address is below. If you have any questions, please feel free to contact me.