01615nas a2200265 4500000000100000000000100001008004100002260001200043653002000055653001800075653003100093653000800124653002500132100001200157700001400169700001500183700001700198700001300215245007500228856005800303300001000361490000600371520095800377022001401335 2025 d c03/202510aComputer vision10aDeep Learning10aGated Graph Neural Network10aHOI10aImage Classification1 aZhan Su1 aRuiyun Yu1 aShihao Zou1 aBingyang Guo1 aLi Cheng00aSpatial-Aware Multi-Level Parsing Network for Human-Object Interaction uhttps://www.ijimai.org/journal/bibcite/reference/3334 a39-480 v93 aHuman-Object Interaction (HOI) detection focuses on human-centered visual relationship detection, which is a challenging task due to the complexity and diversity of image content. Unlike most recent HOI detection works that only rely on paired instance-level information in the union range, our proposed Spatial-aware Multilevel Parsing Network (SMPNet) uses a multi-level information detection strategy, including instance-level visual features of detected human-object pair, part-level related features of the human body, and scene-level features extracted by the graph neural network. After fusing the three levels of features, the HOI relationship is predicted. We validate our method on two public datasets, V-COCO and HICO-DET. Compared with prior works, our proposed method achieves the state-of-the-art results on both datasets in terms of mAProle, which demonstrates the effectiveness of our proposed multi-level information detection strategy. a1989-1660