TY - JOUR
KW - Attention Mechanisms
KW - Convolutional Peephole Long Short-Term Memory
KW - Feature Selection
KW - Improved Jellyfish Optimization Algorithm
KW - Speech Emotion Recognition
AU - Ramya Paramasivam
AU - K. Lavanya
AU - Parameshachari Bidare Divakarachari
AU - David Camacho
AB - Speech Emotion Recognition (SER) plays an important role in affective computing and is widely used in applications related to medicine, entertainment, and other domains. Emotional understanding improves user-machine interaction by making it more responsive. The main issues in SER are the presence of irrelevant features and the increased complexity of analyzing huge datasets. Therefore, this research introduces a well-organized framework that uses an Improved Jellyfish Optimization Algorithm (IJOA) for feature selection and performs classification with a Convolutional Peephole Long Short-Term Memory (CP-LSTM) network with an attention mechanism. Raw data are acquired from five datasets, namely EMO-DB, IEMOCAP, RAVDESS, Surrey Audio-Visual Expressed Emotion (SAVEE), and the Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D). Undesired partitions are removed from the audio signal during pre-processing, and the signal is then fed into the feature selection phase using IJOA. Finally, emotion classification is performed using the CP-LSTM with attention mechanism. Experimental outcomes clearly show that the proposed CP-LSTM with attention mechanism is more efficient, in terms of accuracy, than the existing DNN-DHO, DH-AS, D-CNN, and CEOAS methods. The classification accuracies of the proposed CP-LSTM with attention mechanism for the EMO-DB, IEMOCAP, RAVDESS, and SAVEE datasets are 99.59%, 99.88%, 99.54%, and 98.89%, respectively, which is comparably higher than other existing techniques.
IS - In press
M1 - In press
N2 - Speech Emotion Recognition (SER) plays an important role in affective computing and is widely used in applications related to medicine, entertainment, and other domains. Emotional understanding improves user-machine interaction by making it more responsive. The main issues in SER are the presence of irrelevant features and the increased complexity of analyzing huge datasets. Therefore, this research introduces a well-organized framework that uses an Improved Jellyfish Optimization Algorithm (IJOA) for feature selection and performs classification with a Convolutional Peephole Long Short-Term Memory (CP-LSTM) network with an attention mechanism. Raw data are acquired from five datasets, namely EMO-DB, IEMOCAP, RAVDESS, Surrey Audio-Visual Expressed Emotion (SAVEE), and the Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D). Undesired partitions are removed from the audio signal during pre-processing, and the signal is then fed into the feature selection phase using IJOA. Finally, emotion classification is performed using the CP-LSTM with attention mechanism. Experimental outcomes clearly show that the proposed CP-LSTM with attention mechanism is more efficient, in terms of accuracy, than the existing DNN-DHO, DH-AS, D-CNN, and CEOAS methods. The classification accuracies of the proposed CP-LSTM with attention mechanism for the EMO-DB, IEMOCAP, RAVDESS, and SAVEE datasets are 99.59%, 99.88%, 99.54%, and 98.89%, respectively, which is comparably higher than other existing techniques.
PY - 9998
SE - 1
SP - 1
EP - 14
T2 - International Journal of Interactive Multimedia and Artificial Intelligence
TI - A Robust Framework for Speech Emotion Recognition Using Attention Based Convolutional Peephole LSTM
UR - https://www.ijimai.org/journal/bibcite/reference/3532
VL - In press
SN - 1989-1660
ER -