Workshops:
Scope
With the rapid growth of video surveillance applications and services, the volume of surveillance video has become extremely “big”, making human monitoring tedious and difficult. There is therefore a huge demand for smart surveillance techniques that can perform monitoring automatically or semi-automatically. Firstly, with the huge amount of surveillance video in storage, video analysis tasks such as event detection, action recognition, and video summarization are of increasing importance in applications including events-of-interest retrieval and abnormality detection. Secondly, with the fast increase of semantic data (e.g., objects’ trajectories and bounding boxes) extracted by video analysis techniques, semantic data have become an essential data type in surveillance systems, introducing new challenging topics, such as efficient semantic data processing and semantic data compression, to the community. Thirdly, with the rapid shift from static, centralized processing to dynamic collaborative computing among distributed video processing nodes or cameras, new challenges such as multi-camera joint analysis, human re-identification, and distributed video processing are emerging. Addressing these challenges requires extending existing approaches or exploring new, feasible techniques. This workshop is intended to provide a forum for researchers and engineers to present their latest innovations and share their experiences on all aspects of the design and implementation of new surveillance video analysis and processing techniques.
Link to the workshop website and/or CfP:
Confirmed list of workshop chairs:
- Prof. Weiyao Lin, Shanghai Jiao Tong University, China
- Prof. John See, Heriot-Watt University Malaysia, Malaysia
- Prof. Yipeng Liu, University of Electronic Science and Technology of China, China
- Prof. Junhui Hou, City University of Hong Kong, China
- Prof. Thierry Bouwmans, La Rochelle Université, France
- Prof. Thittaporn Ganokratana, King Mongkut’s University of Technology Thonburi, Thailand
Scope
This workshop will explore advancements in large multimodal models for pixel-level scene understanding, addressing challenges such as generalization across domains, efficient model training, and integration of vision-language understanding at the pixel level. Topics include semantic and instance segmentation, multimodal data fusion, autonomous systems, and novel applications in healthcare, robotics, and geospatial analysis.
Link to the workshop website and/or CfP:
https://sites.google.com/view/lmm-psu
Confirmed list of workshop chairs:
- Rao Muhammad Anwer, MBZUAI, rao.anwer@mbzuai.ac.ae
- Hisham Cholakkal, MBZUAI, hisham.cholakkal@mbzuai.ac.ae
- Jorma Laaksonen, Aalto University, jorma.laaksonen@aalto.fi
- Wenguan Wang, Zhejiang University, wenguanwang.ai@gmail.com
- Jiale Cao, Tianjin University, connor@tju.edu.cn
- Yutong Xie, University of Adelaide, Australia / MBZUAI, UAE, yutong.xie@mbzuai.ac.ae
Scope
Recent advances in AI-Generated Content (AIGC) have become an innovative engine for digital content generation, drawing more and more attention from both academia and industry. Across creative fields, AI has sparked new genres and experimentation in painting, music, film, storytelling, fashion, and design. Researchers are exploring the concept of co-creation with AI systems as well as the ethical implications of AI-generated images and texts. AI has been applied to art-historical research and media studies. The aesthetic value of AI-generated content and AI’s impact on art appreciation have also been contested subjects in recent scholarship. AI has not only exhibited creative potential but has also stimulated research from diverse perspectives in neuroscience, cognitive science, psychology, literature, art history, and media and communication studies. Despite these promising features of AI for art, we still face many challenges, such as biases in AI models, the lack of transparency and explainability in algorithms, and copyright issues surrounding training data and AI artworks.
This is the 7th AIART workshop, to be held in conjunction with ICME 2025 in Nantes, France. It aims to bring forward cutting-edge technologies and the most recent advances in the area of AI art, as well as perspectives from neuroscience, cognitive science, psychology, literature, art history, and media and communication studies.
Link to the workshop website and/or CfP:
Confirmed list of workshop chairs:
- Luntian Mou, Beijing University of Technology, Beijing, China (Email: ltmou@bjut.edu.cn)
- Feng Gao, Peking University, Beijing, China (Email: gaof@pku.edu.cn)
- Kejun Zhang, Zhejiang University, Hangzhou, China (Email: zhangkejun@zju.edu.cn)
- Zeyu Wang, The Hong Kong University of Science and Technology (Guangzhou), China, (Email: zeyuwang@ust.hk)
- Gerui Wang, Stanford University, California, USA, (Email: grwang@stanford.edu)
- Ling Fan, Tezign.com; Tongji University Design Artificial Intelligence Lab, Shanghai, China, (Email: lfan@tongji.edu.cn)
- Nick Bryan-Kinns, University of the Arts London, London, United Kingdom, (Email: n.bryankinns@arts.ac.uk)
- Ambarish Natu, Australian Government, Australian Capital Territory, Australia, (Email: ambarish.natu@gmail.com)
Scope
Underwater information processing plays a critical role in addressing global challenges, including marine resource exploration, environmental conservation, and ecological protection. However, the underwater environment remains one of the most complex and dynamic domains, with persistent challenges such as low visibility, acoustic distortion, and heterogeneous data sources. The integration of cutting-edge multimedia technologies and advanced multi-modal data fusion presents transformative solutions to these challenges, offering innovative pathways for understanding and utilizing underwater environments.
This workshop invites high-quality submissions that explore the use of multimedia technologies, such as imaging, video, sonar, acoustics, and laser sensing, for advancing underwater information processing. It aims to foster interdisciplinary collaboration and highlight the latest breakthroughs in leveraging multimedia for environmental monitoring, navigation, and resource exploration.
Link to the workshop website and/or CfP:
Confirmed list of workshop chairs:
- Junyu Dong (Ocean University of China, China)
- Guangtao Zhai (Shanghai Jiao Tong University, China)
- Yakun Ju (University of Leicester, UK)
- Sen Wang (Imperial College London, UK)
- Hui Yu (University of Glasgow, UK)
Scope
The second edition of the LIVES workshop aims to address the critical challenges and opportunities in achieving ultra-low latency in live video streaming. With the proliferation of video streaming applications, ranging from virtual events and esports to online learning and gaming, the demand for seamless, real-time content delivery is more pressing than ever. This workshop will explore cutting-edge research, innovations, and practical solutions for minimizing latency across the entire streaming workflow, ensuring a superior Quality of Experience (QoE) for end users.
Link to the workshop website and/or CfP:
https://athena.itec.aau.at/events/surpassing-latency-limits-in-adaptive-live-video-streaming/
Confirmed list of workshop chairs:
- Abdelhak Bentaleb, Assistant Professor of Computer Science, Concordia University
- Farzad Tashtarian, Postdoctoral Researcher, Alpen-Adria-Universität Klagenfurt, Austria
- Tanja Kojić, Research Assistant, Technische Universität Berlin, Germany
Scope
Facial expressions play a vital role in human communication, accounting for about 55% of how we interpret others’ feelings and attitudes. Research in psychology, neuroscience, and computer science has extensively explored facial expressions, with deep learning advancements enabling significant progress in facial expression recognition. However, the assumption that facial expressions are a universal language signaling internal emotional states is increasingly questioned due to consistent cross-cultural disagreements in interpreting emotion and intensity. This workshop will explore cultural variations in the perception, interpretation, and expression of emotions, challenging the universality hypothesis and examining the psychological and biological factors influencing facial expression recognition across cultures.
Link to the workshop website and/or CfP:
https://sites.google.com/view/icme2025workshopculture/main
Confirmed list of workshop chairs:
- Dr Yante Li, University of Oulu
- Hongchuan Yu, Bournemouth University
- Hui Yu, University of Glasgow
- Guoying Zhao, University of Oulu
Scope
Today, ubiquitous multimedia sensors and large-scale computing infrastructures are producing 3D multi-modality data at a rapid velocity, such as 3D point clouds acquired with LiDAR sensors, RGB-D videos recorded by Kinect cameras, meshes of varying topology, and volumetric data. 3D multimedia combines content forms such as text, audio, images, and video with 3D information, enabling better perception of the world, which is three-dimensional rather than two-dimensional. For example, a robot can manipulate an object successfully by recognizing it via RGB frames and perceiving its size via a point cloud. Researchers have strived to push the limits of 3D multimedia search and generation in applications such as autonomous driving, robotic visual navigation, smart industrial manufacturing, logistics distribution, and logistics picking. 3D multimedia (e.g., videos and point clouds) can also help agents grasp, move, and place packages automatically in logistics picking systems. Therefore, 3D multimedia analytics is one of the fundamental problems in multimedia understanding. Unlike 3D vision, 3D multimedia analytics mainly concentrates on fusing 3D content with other media. It is a very challenging problem that involves multiple tasks such as human 3D mesh recovery and analysis, 3D shape and scene generation from real-world data, 3D virtual talking heads, 3D multimedia classification and retrieval, 3D semantic segmentation, 3D object detection and tracking, 3D multimedia scene understanding, and so on.
The purpose of this workshop is therefore to: 1) bring together the state-of-the-art research on 3D multimedia analysis; 2) call for a coordinated effort to understand the opportunities and challenges emerging in 3D multimedia analysis; 3) identify key tasks and evaluate the state-of-the-art methods; 4) showcase innovative methodologies and ideas; 5) introduce interesting real-world 3D multimedia analysis systems or applications; and 6) propose new real-world or simulated datasets and discuss future directions. We solicit original contributions in all fields of 3D multimedia analysis that explore multi-modality data to generate strong 3D data representations. We believe this workshop will offer a timely collection of research updates to benefit researchers and practitioners in the broad multimedia communities.
Link to the workshop website and/or CfP:
https://3dmm-icme2025.github.io/
Confirmed list of workshop chairs:
- Peng Dai, Noah’s Ark Lab, Toronto, Canada
- Shan An, Tianjin University, Tianjin, China
- Kun Liu, JD Explore Academy, Beijing, China
- Xuri Ge, Shandong University, Jinan, China
- Guoxin Wang, Zhejiang University, Hangzhou, China
- Wu Liu, University of Science and Technology of China, Hefei, China
- Antonios Gasteratos, Democritus University of Thrace, Greece
Scope
This one-day workshop will explore the dynamic intersection of artificial intelligence and multimedia, with an emphasis on music and audio technologies. It examines how AI is transforming music creation, recognition, and education, along with ethical and legal implications and business opportunities. We will investigate how AI is changing the music industry and education, from composition to performance, production, collaboration, and audience experience. Participants will gain insights into the technological challenges in music and how AI can enhance creativity, enabling musicians and producers to push the boundaries of their art. The workshop will cover topics such as AI-driven music composition, where algorithms generate melodies, harmonies, and even full orchestral arrangements. We will discuss how AI tools assist in sound design, remixing, and mastering, allowing for new sonic possibilities and efficiencies in music production. Additionally, we will examine AI’s impact on music education and the careers of musicians, exploring advanced learning tools and teaching methods. AI technologies are being increasingly adopted in the music and entertainment industry. The workshop will also discuss the legal and ethical implications of AI in music, including questions of authorship, originality, and the evolving role of human artists in an increasingly automated world. This workshop is designed for AI researchers, musicians, producers, and educators interested in the current status and future of AI in music.
Link to the workshop website and/or CfP:
https://ai4musicians.org/2025icme.html
Confirmed list of workshop chairs:
- Yung-Hsiang Lu, Purdue, USA
- Kristen Yeon-Ji Yun, Purdue, USA
- George K. Thiruvathukal, Loyola University Chicago, USA
- Benjamin Shiue-Hal Chou
Scope
Accurate and reliable analysis of human movement has become increasingly critical in domains such as sports performance, healthcare, rehabilitation, and human-computer interaction. Recent advancements in AI, particularly generative models, have transformed the field by enabling the synthesis of realistic human motion, augmenting datasets, and improving motion prediction and analysis. Combined with innovations in edge computing, wearable devices, and multimodal data fusion, these technologies are driving breakthroughs in real-time, non-invasive human motion analysis. This workshop provides a platform to explore cutting-edge methodologies, applications, and challenges in leveraging computer vision and generative AI for human motion analysis. Key topics include the integration of generative AI for motion synthesis and simulation, privacy-preserving models, and systems addressing fairness and inclusivity. Emerging applications include personalized rehabilitation systems, virtual coaching, immersive entertainment, synthetic data for bias mitigation, and crowd motion prediction. The forum aims to connect researchers, practitioners, and industry professionals to discuss theoretical advancements, real-world implementations, and future directions in this evolving field.
Link to the workshop website and/or CfP:
https://sites.google.com/view/hma2025/
Confirmed list of workshop chairs:
- Vinay Kaushik, Assistant Professor, Indian Institute of Information Technology, Sonepat, India, vkaushik@iiitsonepat.ac.in
- Amit Kumar Gupta, CTO, Vuemotion Labs, Sydney, Australia, amit.kmr.gupta@gmail.com, amitgupta@vuemotion.com
- Jianquan Liu, Ph.D., Director of Video Insights Discovery Research Group, Visual Intelligence Research Laboratories, NEC Corporation, Japan, jqliu@acm.org
- Vittorio Murino, Ph.D., Dipartimento di Informatica, Università degli Studi di Verona, Ca’ Vignal 2, Strada Le Grazie 15, 37134 Verona, Italy, vittorio.murino@univr.it
- Prerana Mukherjee, Jawaharlal Nehru University, Delhi, India, prerana@jnu.ac.in
- Brejesh Lall, Professor, Indian Institute of Technology, Delhi, India, brejesh@ee.iitd.ac.in