Bridging deep learning, embodiment, and interaction design—building multimodal systems where virtual humans listen, speak, and gesture with nuance in real time.
Co-Speech Gesture Generation
Multimodal Representation Learning
Digital Humans for XR
Agentic & RAG-Driven Avatars
LLMs + Motion
Explore Research
View Demos
10+ yrs
XR & AI Research
20+
Peer-Reviewed Pubs
2×
Major Awards
InnoCORE Fellow
KAIST · SNU Vision Lab
Seoul, KR
I design scalable hybrid pipelines for text-to-gesture generation, controllable motion synthesis, and LLM-infused digital humans.
Email
[email protected]
Google Scholar
Publications
My work is anchored in multimodal understanding & generation—aligning text, audio, motion, and dense knowledge sources.
Architectures for aligning language, prosody, and 3D motion—using contrastive learning and shared latent spaces to generate co-speech gestures that feel intentional rather than random.
GestureCLR-style encoders
Temporal transformers
Shared latent spaces
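As an illustrative sketch only (not the actual GestureCLR implementation), contrastive alignment of text and motion embeddings in a shared latent space can be written as a symmetric InfoNCE loss over paired batches; the function name and temperature value here are assumptions:

```python
import numpy as np

def info_nce(text_emb, motion_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired text/motion embeddings.

    text_emb, motion_emb: (batch, dim) arrays; row i of each is a matched pair.
    """
    # L2-normalize so dot products become cosine similarities
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    m = motion_emb / np.linalg.norm(motion_emb, axis=1, keepdims=True)
    logits = t @ m.T / temperature       # (batch, batch) similarity matrix
    idx = np.arange(len(logits))         # matched pairs lie on the diagonal

    def xent(lg):
        # cross-entropy with the diagonal as the target class
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()

    # average both directions: text-to-motion and motion-to-text
    return 0.5 * (xent(logits) + xent(logits.T))
```

Minimizing this loss pulls each text embedding toward its paired motion embedding and pushes it away from the other motions in the batch, which is what lets retrieval in the shared space return gestures that feel intentional.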
Systems like RIDGE and Enhanced Gesture Units enable virtual agents to perform semantically aligned, style-aware gestures in real-time AR/MR environments.
Co-speech gestures
Pre-viz & storyboarding
Medical explainers
Retrieval-Augmented Generation and agent frameworks that let virtual humans reason over dense corpora while responding with grounded language and matching non-verbal behavior.
RAG pipelines
Context-aware avatars
Evaluation frameworks
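A minimal sketch of the retrieve-then-generate pattern behind such pipelines, with a toy bag-of-words similarity standing in for a learned embedding model; the corpus, function names, and prompt template are illustrative, not the deployed system:

```python
from collections import Counter
import math

def embed(text):
    """Toy bag-of-words 'embedding' (a real system would use a learned encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=2):
    """Rank corpus passages by similarity to the query; return the top k."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(query, corpus):
    """Ground the language model's answer in the retrieved passages."""
    context = "\n".join(f"- {p}" for p in retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\nQuestion: {query}"
```

In a full avatar pipeline, the grounded answer produced from such a prompt would also drive the agent's matching non-verbal behavior, rather than being returned as text alone.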
Visualizing motion synthesis and gesture generation.
* Placeholder videos
Text-to-Gesture Synthesis
Screenplay to 3D Animation
RAG-Driven Historical Avatar
2025 – Ongoing
InnoCORE Postdoctoral Researcher
KAIST
Controllable human motion synthesis and LLM-driven context understanding for virtual human behavior.
2023 – 2025
Postdoctoral Researcher
KIST
Human-to-avatar gesticulation learning and interactive digital human technology for flagship virtual avatar platforms.
2017 – 2023
Research Assistant · Ph.D.
UST – KIST School
Ph.D. in AI-Robotics on scalable hybrid text-to-gesture generation for interactive digital humans.
2014 – 2017
Engineer Instructor
University of Central Punjab
Designed and delivered lab curricula in programming fundamentals, OOP, and databases.
Computer Animation & Virtual Worlds · Best Paper Award (CASA 2024)
RIDGE fuses specific, LLM-generated rules with generalized, contrastively learned gesture retrieval in a shared latent space.
Hybrid rule + deep learning
Shared latent motion space
Industry-ready pipeline
KOCCA Flagship · ISMAR & SIGGRAPH Asia Real-Time Live
A screenplay-to-3D pipeline that converts scripts into shot-level visualizations with virtual actors.
Screenplay understanding
Virtual cinematography
Computer Animation & Virtual Worlds · ISMAR 2025
A Retrieval-Augmented Generation system that navigates dense historical archives to answer nuanced queries.
RAG pipelines
Cultural heritage
A sample of recent work. Full list available on Google Scholar.
G. Ali, H.Y. Kim, J.-I. Hwang
Hybrid rule + retrieval framework delivering realistic, deployable co-speech gestures.
G. Ali, W. Kim, M.S. Anwar, J.-I. Hwang, A. Choi
Gesture units for cross-lingual, semantically aligned gesture synthesis.
J.H. Lee, G. Ali, J.-I. Hwang
RAG system over dense historical corpora, powering embodied explainers.
G. Ali, H. Kim, B. Han, et al.
Screenplay-driven multi-output pre-visualization for film production.
Email
[email protected]
Location
Seoul, South Korea
LinkedIn
/in/ghazanfar309
Scholar
Profile ↗
For speaking invitations, research collaborations, or co-development of digital human applications, feel free to reach out.