TimeColor: Flexible Reference Colorization via Temporal Concatenation

¹Tsinghua University, ²HKUST

Abstract

Most colorization models condition on only a single reference, typically the first frame of the scene. This ignores other sources of conditional data, such as character sheets, background images, or arbitrary colorized frames. We propose TimeColor, a sketch-based video colorization model that supports heterogeneous, variable-count references through explicit per-reference region assignment. TimeColor encodes references as additional latent frames that are concatenated temporally, so they are processed concurrently in each diffusion step while the model's parameter count stays fixed. In addition to modality-disjoint RoPE indexing, TimeColor uses spatiotemporal correspondence-masked attention to enforce subject-reference binding. These mechanisms mitigate shortcutting and cross-identity palette leakage. Experiments on SAKUGA-42M under both single- and multi-reference protocols show that TimeColor improves color fidelity, identity consistency, and temporal stability over prior baselines.
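The three mechanisms named in the abstract can be illustrated with a small NumPy sketch. This is not the TimeColor implementation; it is a toy illustration under stated assumptions. The function names (concat_references, temporal_positions, correspondence_mask), the tensor layout (frames, height, width, channels), the ref_offset value used to separate position ranges, and the exact masking rule (video tokens see all video tokens plus only their assigned reference; reference tokens see only their own reference) are all hypothetical choices made for clarity.

```python
import numpy as np

def concat_references(video_latents, ref_latents):
    """Append reference latents as extra frames along the time axis.

    video_latents: (T, H, W, C); ref_latents: list of R arrays of shape (H, W, C).
    Returns an array of shape (T + R, H, W, C).
    """
    refs = np.stack(ref_latents, axis=0)
    return np.concatenate([video_latents, refs], axis=0)

def temporal_positions(num_video_frames, num_refs, ref_offset=1000):
    """Temporal position indices with disjoint ranges per modality.

    ref_offset is an assumed constant that keeps reference indices
    away from the indices used by the video frames.
    """
    video_pos = np.arange(num_video_frames)
    ref_pos = ref_offset + np.arange(num_refs)
    return np.concatenate([video_pos, ref_pos])

def correspondence_mask(region_ids, num_refs):
    """Boolean attention mask of shape (N, N), rows = queries, cols = keys.

    region_ids: (T, H, W) ints; value r in [0, num_refs) binds that video
    token to reference r, and -1 leaves it unbound.
    Tokens are flattened frame by frame, video frames first, then references.
    """
    T, H, W = region_ids.shape
    hw = H * W
    n_video = T * hw
    n_total = n_video + num_refs * hw

    # owner[k]: reference that token k belongs to (-1 for video tokens).
    owner = np.full(n_total, -1)
    owner[n_video:] = np.repeat(np.arange(num_refs), hw)

    # assigned[q]: reference that query q may read from.
    assigned = np.concatenate([region_ids.reshape(-1), owner[n_video:]])

    is_video = owner == -1
    mask = np.zeros((n_total, n_total), dtype=bool)
    # Video queries attend to every video key.
    mask[np.ix_(is_video, is_video)] = True
    # Any query attends to the keys of the reference it is bound to.
    mask |= (assigned[:, None] == owner[None, :]) & (owner[None, :] >= 0)
    return mask
```

In this sketch the sequence grows only along the time axis, so the same attention layers handle any number of references without new parameters; the mask is what prevents a region bound to one reference from reading colors from another.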

Gallery: Multi-Reference

Stack a flexible number of subject and background references across time to colorize the entire scene.

References
Lineart
Result

Gallery: Arbitrary Frame Reference

Use a single-frame reference from any moment to guide the colorization.

Reference
Lineart
Result

Gallery: Starting Frame Reference

Colorize the whole scene from its first frame.

Reference
Lineart
Result

Flexible Usage

Demonstrating reference/scene reusability with consistent 720×480 displays

Reuse references across scenes
Tailor the references for the current scene

Comparison

Methods: Ours, LVCD, ToonCrafter, AniDoc, LongAnimation, ToonComposer

Disclaimer

All images, videos, and related materials on this page are provided exclusively for academic and research use. TimeColor is a research project and is not intended for commercial applications.