AI lip-sync is the use of generative AI to re-animate an actor's mouth movements so they match a new piece of dialogue. The audio is replaced, usually with a translated voiceover, a script change, or a fix to a misspoken line, and the visible mouth shapes on the actor's face are regenerated frame by frame so the performance reads as if the new line had been spoken on set.
Why AI lip-sync exists
For decades, the only way to change what an actor says on screen was to bring them back into a studio, re-record the line via ADR (automated dialogue replacement), and accept that the visible mouth movements would no longer match. Audiences are surprisingly tolerant of this in feature film and TV, but they are not tolerant of it in advertising. A 30-second commercial lives or dies on credibility, and a dubbed mouth that drifts a frame out of sync is enough to break the spell.
AI lip-sync solves the visible half of the problem. The audio is replaced as before, and the lower face is regenerated to match the new sounds. The original performance — eyes, brows, head movement, body language — is preserved. Only the mouth shape changes.
How AI lip-sync works
At a high level, the pipeline does three things. First, it analyses the new audio to extract a sequence of phonemes (the building blocks of speech). Second, it maps each phoneme to a viseme (the corresponding mouth shape). Third, it generates the lower-face frames that produce those visemes while staying consistent with the original lighting, skin tone, head pose, and beard or makeup detail in every frame.
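To make the first two steps concrete, here is a minimal sketch in Python. The phoneme labels, the viseme names, and the phonemes_to_visemes function are illustrative assumptions rather than any particular standard; a real pipeline would take its timed phonemes from a forced aligner run against the new audio.

```python
# Illustrative sketch: map a timed phoneme sequence to a viseme timeline.
# The phoneme labels (ARPAbet-style) and viseme classes below are
# assumptions for illustration; production pipelines use a forced
# aligner and a richer, model-specific viseme set.

PHONEME_TO_VISEME = {
    "P": "lips_closed", "B": "lips_closed", "M": "lips_closed",
    "F": "lip_to_teeth", "V": "lip_to_teeth",
    "AA": "open_wide", "AE": "open_wide",
    "OW": "rounded", "UW": "rounded", "W": "rounded",
    "S": "teeth_together", "Z": "teeth_together",
    "SIL": "neutral",  # silence
}

def phonemes_to_visemes(timed_phonemes):
    """Convert (phoneme, start_sec, end_sec) tuples into a viseme timeline.

    Unknown phonemes fall back to a neutral mouth; adjacent identical
    visemes are merged so the animation target is a clean step function.
    """
    timeline = []
    for phoneme, start, end in timed_phonemes:
        viseme = PHONEME_TO_VISEME.get(phoneme, "neutral")
        if timeline and timeline[-1][0] == viseme:
            # Extend the previous segment instead of adding a duplicate.
            timeline[-1] = (viseme, timeline[-1][1], end)
        else:
            timeline.append((viseme, start, end))
    return timeline

if __name__ == "__main__":
    # "Bob" spoken over roughly a third of a second.
    aligned = [("B", 0.00, 0.08), ("AA", 0.08, 0.22), ("B", 0.22, 0.30)]
    for viseme, start, end in phonemes_to_visemes(aligned):
        print(f"{viseme}: {start:.2f}s - {end:.2f}s")
```

The viseme timeline is what the generation model animates against; everything interesting about quality happens in the third step, where those target shapes are rendered back into the original footage.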
The hard part is the third step. A naïve implementation produces a mouth that looks pasted on. A good implementation re-renders the lower face into the original cinematography so the swap is invisible. The difference between the two is mostly down to how much production-grade calibration sits around the model, not the model itself.
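As a rough illustration of what that calibration involves, the sketch below shows one common ingredient: matching the colour statistics of a generated lower-face patch to the original frame, then feathering the blend so the seam disappears. The composite_lower_face function, the LAB mean-and-variance transfer, and the fixed blur size are illustrative assumptions, using OpenCV and NumPy; a production pipeline layers far more per-frame correction around the model.

```python
import cv2
import numpy as np

def composite_lower_face(frame, generated_patch, mask):
    """Blend a generated lower-face patch into the original frame.

    frame, generated_patch: uint8 BGR images of the same size.
    mask: float32 HxW in [0, 1], 1 where the generated mouth should show.
    The LAB statistics transfer and feathered mask are one illustrative
    piece of calibration, not a complete compositing pipeline.
    """
    # Match colour statistics in LAB space so the patch inherits the
    # frame's grade and skin tone rather than the model's.
    frame_lab = cv2.cvtColor(frame, cv2.COLOR_BGR2LAB).astype(np.float32)
    patch_lab = cv2.cvtColor(generated_patch, cv2.COLOR_BGR2LAB).astype(np.float32)
    for c in range(3):
        f_mean, f_std = frame_lab[..., c].mean(), frame_lab[..., c].std() + 1e-6
        p_mean, p_std = patch_lab[..., c].mean(), patch_lab[..., c].std() + 1e-6
        patch_lab[..., c] = (patch_lab[..., c] - p_mean) / p_std * f_std + f_mean
    matched = cv2.cvtColor(np.clip(patch_lab, 0, 255).astype(np.uint8),
                           cv2.COLOR_LAB2BGR)

    # Feather the mask so the boundary between generated and original
    # pixels is a gradient rather than a visible hard edge.
    soft = cv2.GaussianBlur(mask, (31, 31), 0)[..., None]
    out = soft * matched.astype(np.float32) + (1 - soft) * frame.astype(np.float32)
    return out.astype(np.uint8)
```

Skipping either the colour transfer or the feathering is a reliable way to produce the pasted-on look described above.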
Where AI lip-sync is used in advertising
Multi-market localisation. The most common use case. A hero film is shot in one language and lip-synced into many. Combined with face-swap localisation, the result is a market variant that reads as locally cast and locally voiced from a single shoot. See multi-market versioning for the wider workflow.
Late script changes. When a line needs to change after the shoot — a legal note, a product update, a creative tweak — AI lip-sync can implement the change without a reshoot. Turnaround drops from weeks to days.
Performance fixes. A misspoken line, a fluffed brand name, or a take that is otherwise the best on the day but has one bad word can be repaired in post.
Research and creative testing. Brands can test the same talent in several language variants in parallel. Because only the audio and the mouth change, the test variable is cleanly isolated.
Consent, contracts, and ethics
AI lip-sync modifies an actor's performance, which means it requires the actor's explicit, written permission. In practice, the talent agreement must allow AI modification of the performance and spell out the scope of permitted use. If the original contract pre-dates the technology, a written addendum is needed before the work can be commissioned. Working without that consent is a deepfake; working with it is normal advertising production.
The full set of standards Myth Labs applies to voice cloning and likeness work — including documented consent, fully licensed synthetic talent, and human review on every output — is set out on our AI trust and governance page.
What AI lip-sync cannot fix
It is worth being clear about the limits of the technique. AI lip-sync is excellent at making the mouth match new dialogue, but it is not a substitute for a good performance. If the original take has the wrong emotional reading, lip-sync will not change that. If the actor's body language reads as confused while the new line is meant to read as confident, the disconnect will still be visible to the audience.
Lip-sync also cannot fix timing. If the new line is materially longer than the original, the editor will need to find frames to extend the shot or accept a cut that lands earlier in the line. This is one of the reasons lip-sync briefs work best when the new dialogue has roughly the same syllable count as the original; translators and copywriters who understand this constraint produce variants that integrate cleanly.
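One practical way to honour that constraint is to check a proposed line against the original before it reaches the lip-sync stage. The sketch below is a rough heuristic that assumes English-like text; the vowel-group syllable counter and the 20% tolerance are illustrative assumptions, not a production rule.

```python
import re

def estimate_syllables(line: str) -> int:
    """Very rough syllable estimate: count vowel groups per word.

    Good enough to flag a rewrite that is far longer or shorter than
    the original; not a substitute for a proper phonetic count.
    """
    words = re.findall(r"[a-zA-Z']+", line.lower())
    return sum(max(1, len(re.findall(r"[aeiouy]+", w))) for w in words)

def fits_original(original: str, new_line: str, tolerance: float = 0.2) -> bool:
    """Flag whether the new dialogue is within tolerance of the original.

    tolerance=0.2 means the syllable count may drift by up to 20%
    before the editor has a timing problem to solve.
    """
    orig = estimate_syllables(original)
    new = estimate_syllables(new_line)
    return abs(new - orig) / max(orig, 1) <= tolerance

print(fits_original("Try the new roast today",
                    "Taste our new roast now"))          # True: lengths match
print(fits_original("Try the new roast today",
                    "Discover our completely new signature roast blend today"))  # False
```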
Quality and turnaround
Quality varies enormously across providers. The visible artefacts to watch for are mouth corners that flicker frame to frame, teeth that change shape between cuts, and a lower face that drifts out of skin-tone match with the rest of the head. A production-grade pipeline catches these in QC and either reprocesses the affected shots or hands them to a compositor for a manual fix.
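Some of those artefacts can be flagged automatically before a human reviewer ever opens the shot. The sketch below is a minimal flicker check over the mouth region, assuming decoded frames as NumPy arrays and a fixed lower-face crop; the flag_mouth_flicker function and its threshold are illustrative assumptions, and a real QC pass would compensate for head motion first.

```python
import numpy as np

def flag_mouth_flicker(frames, mouth_box, threshold=12.0):
    """Flag frame pairs where the mouth region jumps more than expected.

    frames: list of HxWx3 uint8 arrays from the lip-synced shot.
    mouth_box: (top, bottom, left, right) crop of the lower face.
    threshold: mean absolute pixel difference above which the change
    reads as flicker rather than articulation (illustrative value;
    real QC would calibrate it per shot and stabilise the crop first).
    """
    top, bottom, left, right = mouth_box
    flagged = []
    prev = None
    for i, frame in enumerate(frames):
        crop = frame[top:bottom, left:right].astype(np.float32)
        if prev is not None:
            diff = np.abs(crop - prev).mean()
            if diff > threshold:
                flagged.append((i, round(float(diff), 1)))
        prev = crop
    return flagged  # frame indices to send back for reprocessing or comp

# Synthetic example: steady frames with one artificial jump at frame 3.
rng = np.random.default_rng(0)
base = rng.integers(0, 255, (64, 64, 3), dtype=np.uint8)
frames = [base.copy() for _ in range(6)]
frames[3] = 255 - frames[3]  # simulate a flickered frame
print(flag_mouth_flicker(frames, (32, 64, 16, 48)))
```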
Turnaround on a typical brief is one to three working days per language for a 30-second spot, including human QC. At Myth Labs, AI lip-sync runs as one stage inside our Agent Myth AdLocalise pipeline, alongside face-swap, voice cloning, visual resynthesis, and cultural adaptation.
For the wider context on how AI is reshaping localisation, see our long-form guide on face-swap localisation in advertising and the ad localisation process.
