Kolors is a text-to-image generation model developed by the Kuaishou Kolors team. It is based on latent diffusion and generates images from text prompts written in either Chinese or English.
Kolors was evaluated against other state-of-the-art models on KolorsPrompts, an evaluation dataset of more than 1,000 prompts spanning multiple categories and evaluation dimensions. The evaluation combined human and machine assessment, and the team reports strong results for Kolors in visual appeal, text faithfulness, and overall satisfaction.
Clone the repository and install dependencies:
git clone https://github.com/Kwai-Kolors/Kolors
cd Kolors
conda create --name kolors python=3.8
conda activate kolors
pip install -r requirements.txt
python3 setup.py install
Download model weights:
huggingface-cli download --resume-download Kwai-Kolors/Kolors --local-dir weights/Kolors
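Before sampling, it can help to confirm that the download actually populated the target directory. The following is a minimal sketch and not part of the Kolors repository: the helper name `weights_ready` is hypothetical, and it only checks that `weights/Kolors` (the `--local-dir` used above) exists and contains at least one file. It does not validate checksums or any particular file layout.

```python
from pathlib import Path


def weights_ready(weights_dir: str = "weights/Kolors") -> bool:
    """Return True if weights_dir exists and contains at least one file.

    Hypothetical helper: it only confirms that something was downloaded,
    not that the checkpoint is complete or uncorrupted.
    """
    root = Path(weights_dir)
    if not root.is_dir():
        return False
    # rglob("*") also walks subdirectories, in case the files are nested.
    return any(p.is_file() for p in root.rglob("*"))


if __name__ == "__main__":
    if weights_ready():
        print("Weights found in weights/Kolors")
    else:
        print("Weights missing; re-run the huggingface-cli download command")
```

Run it from the repository root so that the relative path resolves to the same directory the download command wrote to.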
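To generate an image from a text prompt (the example prompt is Chinese for "a photo of a ladybug, macro, zoom, high quality, cinematic, holding a sign that reads '可图'"):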
python3 scripts/sample.py "一张瓢虫的照片,微距,变焦,高质量,电影,拿着一个牌子,写着‘可图’"
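To try several prompts in a row, the command above can be wrapped in a small driver. This is a sketch under stated assumptions, not something shipped with Kolors: the file name `batch_sample.py` and the functions `build_sample_command` and `run_prompts` are hypothetical, and the only behavior taken from the command above is that `scripts/sample.py` accepts the prompt as its first positional argument. The sketch does not manage output files, so check how `scripts/sample.py` names its output before relying on it for batches.

```python
import subprocess
import sys
from typing import List


def build_sample_command(prompt: str) -> List[str]:
    """Build the argv list for one scripts/sample.py invocation."""
    # Passing the prompt as its own list element avoids shell quoting
    # problems with spaces, commas, and non-ASCII quotation marks.
    return [sys.executable, "scripts/sample.py", prompt]


def run_prompts(prompts: List[str]) -> None:
    """Run scripts/sample.py once per prompt, stopping on the first failure."""
    for prompt in prompts:
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(build_sample_command(prompt), check=True)


if __name__ == "__main__":
    # Prompts are read from the command line, for example:
    #   python3 batch_sample.py "一只猫" "a red bicycle"
    run_prompts(sys.argv[1:])
```

Each prompt starts a fresh process, so the model is reloaded every time. That keeps the sketch independent of the script's internals at the cost of speed.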
To launch the interactive web demo instead:
python3 scripts/sampleui.py
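The Kolors code is released under the Apache-2.0 license. The model weights are open for academic research, while commercial use requires registering with the licensor, so check the license terms in the repository before deploying Kolors commercially.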
For further technical details, the technical report, and community discussions, see the official Kolors GitHub repository.