02444nas a2200277 4500000000100000000000100001008004100002260001200043653004700055653002200102653003200124100001700156700001100173700001700184700001400201700001300215700001600228700001600244700001000260245010000270856008300370300000900453490001300462520167700475022001402152 9998 d c02/202410aAspect-Based Multimodal Sentiment Analysis10aOptimal Transport10aSocial Media Opinion Mining1 aLinhao Zhang1 aLi Jin1 aGuangluan Xu1 aXiaoyu Li1 aXian Sun1 aZequn Zhang1 aYanan Zhang1 aQi Li00aOptimal Target-Oriented Knowledge Transportation For Aspect-Based Multimodal Sentiment Analysis uhttps://www.ijimai.org/journal/sites/default/files/2024-02/ip2024_02_005_0.pdf a1-110 vIn press3 aAspect-based multimodal sentiment analysis in social media scenarios aims to identify the sentiment polarity of each aspect term mentioned in a piece of multimodal user-generated content. Previous approaches for this interdisciplinary multimodal task mainly rely on coarse-grained fusion mechanisms at the data level or decision level, which have the following three shortcomings: (1) ignoring the category knowledge of the sentiment target (mentioned in the text) in visual information; (2) being unable to assess the importance of maintaining target interaction during the unimodal encoding process, which results in indiscriminative representations across aspect terms; (3) suffering from the semantic gap between multiple modalities. To tackle these issues, we propose an optimal target-oriented knowledge transportation network (OtarNet) for this task. Firstly, the visual category knowledge is explicitly transported through input space translation and reformulation. Secondly, with the reformulated knowledge containing the target and category information, the target sensitivity is well maintained in the unimodal representations through a multistage target-oriented interaction mechanism. 
Finally, to eliminate the distributional modality gap by integrating complementary knowledge, the target-sensitive features of multiple modalities are implicitly transported through the optimal transport interaction module. Our model achieves state-of-the-art performance on three benchmark datasets: Twitter-15, Twitter-17 and Yelp, and an extensive ablation study demonstrates the superiority and effectiveness of OtarNet. a1989-1660