TY - JOUR KW - ChatGPT KW - Generative AI KW - Mathematical Problems KW - Wolfram Mathematica AU - Alejandro L. García Navarro AU - Nataliia Koneva AU - José Alberto Hernández AU - Alfonso Sánchez-Macián AB - In November 2022, ChatGPT v3.5 was announced to the world. Since then, Generative Artificial Intelligence (GAI) has appeared in the news almost daily, showing impressive capabilities at solving multiple tasks that have surprised the research community and the world in general. Indeed the number of tasks that ChatGPT and other Large Language Models (LLMs) can do are unimaginable, especially when dealing with natural text. Text generation, summarisation, translation, and transformation (into poems, songs, or other styles) are some of its strengths. However, when it comes to reasoning or mathematical calculations, ChatGPT finds difficulties. In this work, we compare different flavors of ChatGPT (v3.5, v4, and Wolfram GPT) at solving 20 mathematical tasks, from high school and first-year engineering courses. We show that GPT-4 is far more powerful than ChatGPT-3.5, and further that the use of Wolfram GPT can even slightly improve the results obtained with GPT-4 at these mathematical tasks. IS - In press M1 - In press N2 - In November 2022, ChatGPT v3.5 was announced to the world. Since then, Generative Artificial Intelligence (GAI) has appeared in the news almost daily, showing impressive capabilities at solving multiple tasks that have surprised the research community and the world in general. Indeed the number of tasks that ChatGPT and other Large Language Models (LLMs) can do are unimaginable, especially when dealing with natural text. Text generation, summarisation, translation, and transformation (into poems, songs, or other styles) are some of its strengths. However, when it comes to reasoning or mathematical calculations, ChatGPT finds difficulties. In this work, we compare different flavors of ChatGPT (v3.5, v4, and Wolfram GPT) at solving 20 mathematical tasks, from high school and first-year engineering courses. We show that GPT-4 is far more powerful than ChatGPT-3.5, and further that the use of Wolfram GPT can even slightly improve the results obtained with GPT-4 at these mathematical tasks. PY - 9998 SE - 1 SP - 1 EP - 11 T2 - International Journal of Interactive Multimedia and Artificial Intelligence TI - On the Use of Large Language Models at Solving Math Problems: A Comparison Between GPT-4, LlaMA-2 and Gemini UR - https://www.ijimai.org/journal/bibcite/reference/3565 VL - In press SN - 1989-1660 ER -