02714nas a2200265 4500000000100000000000100001008004100002260001200043653002500055653002700080653001600107653004600123653002600169100001900195700001400214700001800228700001600246700001600262245011900278856005800397300000900455490001300464520195700477022001402434 9998 d c09/202410aAttention Mechanisms10aContextual Information10aMulti-Modal10aSpatio-Temporal Interaction and Awareness10aTrajectory Prediction1 aXiaoliang Wang1 aLian Zhou1 aKuan-Ching Li1 aShiqi Zheng1 aHuijing Fan00aIAtraj: Multi-Modal Trajectory Prediction Through Contextual Information Spatio-Temporal Interaction and Awareness uhttps://www.ijimai.org/journal/bibcite/reference/3488 a1-120 vIn press3 aAccurately and feasibly predicting the future trajectories of autonomous vehicles is a critically important task. However, this task faces significant challenges due to the variability of driving intentions and the complexity of social interactions. These challenges primarily arise from the need to understand one’s driving behaviors and model the interaction information of the surrounding environment. A substantial amount of research has been focused on integrating interaction information from the surrounding environment, mainly using raster images or High-Definition maps (HD maps). However, the real-time update of environmental maps and the high computational cost associated with processing interaction information using compatible technologies such as vision have become limiting factors. Additionally, ineffective simulation and modeling of real driving scenarios, coupled with inadequate understanding of contextual environmental information, result in lower prediction accuracy. 
To overcome these challenges, we propose a multi-modal trajectory prediction model based on sequence modeling, IAtraj, incorporating multiple attention mechanisms, which focuses on the three critical elements in real traffic scenarios: the target agent’s historical trajectory, effective interactions with neighboring vehicles, and lane supervision and retention strategies. To better model these elements, we design modules for Temporal Interaction (TI), Spatial Interaction (SI), and Lane Awareness (LA). Through extensive experiments conducted on the publicly available nuScenes dataset, IAtraj exhibits outstanding performance, successfully addressing the challenges of temporal dependencies in trajectory sequences and the representation of scene changes. Finally, comprehensive ablation experiments validate the effectiveness of each significant module, reinforcing the reliability and robustness of IAtraj in dealing with complex traffic scenarios. a1989-1660