MAP-Elites with Transverse Assessment for Multimodal Problems in Creative Domains
International Conference on Artificial Intelligence in Music, Sound, Art and Design (2024)
Abstract
The recent advances in language-based generative models have paved the way
for the orchestration of multiple generators of different artefact types (text,
image, audio, etc.) into one system. Presently, many open-source pre-trained
models combine text with other modalities, thus enabling shared vector
embeddings to be compared across different generators. Within this context we
propose a novel approach to handle multimodal creative tasks using Quality
Diversity evolution. Our contribution is a variation of the MAP-Elites
algorithm, MAP-Elites with Transverse Assessment (MEliTA), which is tailored
for multimodal creative tasks and leverages deep learned models that assess
coherence across modalities. MEliTA decouples the artefacts' modalities and
promotes cross-pollination between elites. As a test bed for this algorithm, we
generate text descriptions and cover images for a hypothetical video game and
assign each artefact a unique modality-specific behavioural characteristic.
Results indicate that MEliTA can improve text-to-image mappings within the
solution space, compared to a baseline MAP-Elites algorithm that strictly
treats each image-text pair as one solution. Our approach represents a
significant step forward in multimodal bottom-up orchestration and lays the
groundwork for more complex systems coordinating multimodal creative agents in
the future.
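
To make the contrast with the baseline concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes that each modality is reduced to a single float, that each modality has its own one-dimensional behavioural characteristic (`bc`), and that `coherence` stands in for a learned cross-modal model; all names and parameters (`BINS`, `try_insert`, `step`, `run`) are hypothetical. With `transverse=False` every text-image pair is treated as one indivisible solution, as in the baseline. With `transverse=True` a newly mutated part is additionally paired with the other-modality part of every existing elite, which is one plausible reading of "decoupling" and "cross-pollination"; the abstract does not specify which elites are actually considered.

```python
import random

BINS = 5  # cells per behavioural dimension (hypothetical value)


def bc(value):
    """Map a value in [0, 1) to a bin index; stands in for a modality-specific BC."""
    return min(int(value * BINS), BINS - 1)


def coherence(text, image):
    """Toy stand-in for a learned cross-modal coherence score (higher is better)."""
    return 1.0 - abs(text - image)


def try_insert(archive, text, image):
    """Standard MAP-Elites replacement: keep the fitter solution in each cell."""
    cell = (bc(text), bc(image))
    fit = coherence(text, image)
    if cell not in archive or fit > archive[cell][0]:
        archive[cell] = (fit, text, image)
        return True
    return False


def mutate(value, rng):
    """Gaussian perturbation, clamped so the value stays inside [0, 1)."""
    return min(max(value + rng.gauss(0.0, 0.1), 0.0), 0.999)


def step(archive, rng, transverse):
    """One iteration: pick an elite, mutate one modality, try to place the result."""
    _, text, image = rng.choice(list(archive.values()))
    change_text = rng.random() < 0.5
    if change_text:
        text = mutate(text, rng)
    else:
        image = mutate(image, rng)
    # Baseline behaviour: the mutated pair is evaluated as a single solution.
    try_insert(archive, text, image)
    if transverse:
        # Assumed transverse step: pair the new part with the other-modality
        # part of every current elite and keep any pairing that wins its cell.
        for _, other_text, other_image in list(archive.values()):
            if change_text:
                try_insert(archive, text, other_image)
            else:
                try_insert(archive, other_text, image)


def run(iterations=200, transverse=True, seed=0):
    """Seed the archive with random pairs, then evolve it for a fixed budget."""
    rng = random.Random(seed)
    archive = {}
    for _ in range(10):
        try_insert(archive, rng.random(), rng.random())
    for _ in range(iterations):
        step(archive, rng, transverse)
    return archive
```

Calling `run(transverse=False)` and `run(transverse=True)` with the same seed gives two archives whose coverage and per-cell coherence can be compared; the toy scores say nothing about the results reported in the paper.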