Optimization of Multimodal Generative Models for Creative Content Generation

Ayushman Bhowmik, Ruchi Sharma, Tiansheng Yang*, Lu Wang, Bharati Rathore, Hrudaya Kumar Tripathy

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

This research paper presents an in-depth examination of recent developments in multimodal generative models with a specific focus on enhancing creative content generation. We introduce a novel architectural framework that seamlessly integrates text, image, and audio modalities, enabling cross-modal translation of creative concepts. Extensive empirical evaluations demonstrate the model’s proficiency in generating creative content, characterized by high quality, coherence, and diversity. The applications in everyday life are wide-ranging, including digital and physical marketing and the personalization of everyday choices.
Original language: English
Title of host publication: Proceedings of Fifth Doctoral Symposium on Computational Intelligence
Subtitle of host publication: DoSCI 2024, Volume 4
Editors: Abhishek Swaroop, Vineet Kansal, Giancarlo Fortino, Aboul Ella Hassanien
Publisher: Springer
Pages: 291-299
ISBN (Electronic): 978-981-97-6726-7
ISBN (Print): 978-981-97-6725-0
DOIs
Publication status: E-pub ahead of print - 6 Nov 2024
Event: 5th Doctoral Symposium on Computational Intelligence - Institute of Engineering & Technology, Lucknow, India
Duration: 10 May 2024 to 10 May 2024

Publication series

Name: Lecture Notes in Networks and Systems
Publisher: Springer
ISSN (Print): 2367-3370
ISSN (Electronic): 2367-3389

Conference

Conference: 5th Doctoral Symposium on Computational Intelligence
Abbreviated title: DoSCI 2024
Country/Territory: India
City: Lucknow
Period: 10/05/24 to 10/05/24
