Large Model Based Crossmodal Chinese Poetry Creation
Oct 4, 2024
L. Yang
Equal contribution
张志东
Equal contribution
K. Niu
S. Pan
W. Zhu
C. Ma
Figure: System Structure
Abstract
Generating Chinese poetry is a complex task with significant potential for large models. However, most current systems support only a single input modality, and their output lacks interpretability. This paper proposes a large-model-based system that supports cross-modal input of text and images, provides interpretable annotations for generated Chinese poems, and supports multiple rounds of iterative optimization. First, it analyzes images with CLIP and MiniGPT-4 and uses ERNIE-4.0 to generate descriptive text from the analysis. Then, it uses ERNIE-4.0 to generate classical Chinese poems from the input text and the descriptive text, with prompts we devised based on the CRISPE framework. Finally, it evaluates and then optimizes the generated poems with few-shot prompts. Preliminary evaluations have validated the efficacy of our poetry scoring criteria and demonstrated the superior performance of the system when text and images are combined as cross-modal input.
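The three stages described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the paper's implementation: the names `CrispePrompt`, `create_poem`, `llm`, and `analyze_image` are hypothetical, the prompt wording is invented for the example, and the model calls are passed in as plain callables rather than real ERNIE-4.0, CLIP, or MiniGPT-4 clients. The CRISPE fields follow the commonly cited expansion (Capacity and Role, Insight, Statement, Personality, Experiment), which may differ from the prompts the authors actually used.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# Text-in, text-out model call; a stand-in for ERNIE-4.0 (assumption).
LLM = Callable[[str], str]
# Image-path-in, text-out analyzer; a stand-in for CLIP + MiniGPT-4 (assumption).
ImageAnalyzer = Callable[[str], str]


@dataclass
class CrispePrompt:
    """Hypothetical container for the five CRISPE prompt fields."""
    capacity_and_role: str
    insight: str
    statement: str
    personality: str
    experiment: str

    def render(self) -> str:
        # Join the five fields into one prompt string, one field per line.
        return "\n".join([
            f"Capacity and role: {self.capacity_and_role}",
            f"Insight: {self.insight}",
            f"Statement: {self.statement}",
            f"Personality: {self.personality}",
            f"Experiment: {self.experiment}",
        ])


def create_poem(
    llm: LLM,
    analyze_image: ImageAnalyzer,
    text: Optional[str] = None,
    image_path: Optional[str] = None,
    examples: Sequence[Tuple[str, str]] = (),
    rounds: int = 1,
) -> str:
    """Sketch of the pipeline: describe, generate, then evaluate and revise."""
    context = []
    if text:
        context.append(text)
    if image_path:
        # Stage 1: turn the raw image analysis into descriptive text.
        analysis = analyze_image(image_path)
        context.append(llm(f"Describe this scene for a poet: {analysis}"))
    if not context:
        raise ValueError("Provide text, an image, or both.")

    # Stage 2: generate a poem from a CRISPE-style prompt.
    prompt = CrispePrompt(
        capacity_and_role="You are a classical Chinese poet.",
        insight="; ".join(context),
        statement="Write one classical Chinese poem and annotate each line.",
        personality="Refined and concise.",
        experiment="Return a single poem.",
    )
    poem = llm(prompt.render())

    # Stage 3: few-shot evaluation followed by revision, repeated per round.
    shots = "\n".join(f"Poem: {p}\nCritique: {c}" for p, c in examples)
    for _ in range(rounds):
        feedback = llm(f"{shots}\nPoem: {poem}\nCritique:")
        poem = llm(f"Revise the poem using the critique.\nPoem: {poem}\nCritique: {feedback}")
    return poem
```

Passing the models in as callables keeps the sketch independent of any particular API client. With text and image input and `rounds=1`, the model is called four times: once each for description, generation, evaluation, and revision.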
Type
Publication
In 2024 IEEE Smart World Congress (SWC)