CBMI 2026 aims to host special sessions during the conference. Special sessions are mini-venues, each focusing on a topic within the content-based multimedia indexing field that is not directly covered by the conference's list of topics but is beneficial to the community.
Each special session should include four to five papers, which can be invited or regular submissions. Special session papers will supplement the regular research papers and be included in the proceedings of CBMI 2026.
To ensure the high quality of all conference papers, all papers submitted to special sessions at CBMI 2026, including invited papers, will be peer-reviewed through the standard review process.
Special Session List:
Despite the advances in automated content description using deep learning, and the emergence of joint image-text embedding models, many video retrieval tasks still require a human user in the loop. This is particularly true when the information need is fuzzy (e.g., a question about the content), or when the underlying dataset is homogeneous, i.e., contains data from one domain, differing only in small details and with little or no editorial structure.
Interactive video retrieval (IVR) systems address these challenges. To assess their performance, multimedia retrieval benchmarks such as the Video Browser Showdown (VBS) and the Lifelog Search Challenge (LSC) have been established. These benchmarks provide large-scale datasets as well as task settings and evaluation protocols, making it possible to measure progress in research on IVR systems. However, to achieve the best possible performance, the participating systems are usually operated by members of their development teams. This setting does not allow the usability aspects of a system to be measured properly, even though these are important for deploying it successfully in a target application context, where it needs to be operated by domain experts rather than video retrieval researchers.
This special session aims to provide better insights into how usable such systems are for users who have a solid IT background but are unfamiliar with the details of how the system operates. The special session thus calls for papers describing IVR systems, focusing on the features that make them usable by these users, and will provide a forum for demonstrating and testing these systems.
The contributions to this session will be short papers (4 pages + references) describing the participating IVR system, or multiple such systems to be compared. In contrast to papers such as those submitted to VBS or LSC, these papers should focus on usability for users who are not retrieval experts, and on the search and browsing features expected to be of particular interest to these users. Papers may also provide an analysis of a system’s performance in VBS, LSC or a similar competition, and present hypotheses concerning the usability of the system by novices, to be tested in the interactive session at CBMI.
The review process is single-blind, i.e. submissions do not need to be anonymized. The papers will be reviewed by two reviewers of the special session (organizers or other IVR experts) and at least one reviewer from the general pool of CBMI reviewers.
Important dates:
Paper deadline: 11 MAY 2026 (extended from 20 APRIL 2026)
Notification: 15 JUNE 2026 (extended from 22 MAY 2026)
Camera-ready: 22 JUNE 2026 (extended from 15 JUNE 2026)
Paper submission: Author Guidelines
Please indicate in the comments that this paper is for SS VR4B-2026
SS chairs
Cultural heritage offers space for a wide spectrum of disciplines and scientific fields to develop and implement innovative methods and toolkits based on suitable and appropriate use cases. Musical cultural heritage is an under-represented sector that combines both tangible and intangible cultural heritage. The MusiCHER Special Session aims to enhance the discussion on multimedia content analysis and indexing, multimedia user experiences, and applications of multimedia indexing and retrieval within the context of musical cultural heritage. Emerging Artificial Intelligence and immersive technologies could also play a supportive role in achieving greater accuracy in the study, analysis, and conservation, as well as a better understanding, of complex (e.g., church organs), rare, and/or endangered (e.g., ancient musical instruments) musical cultural heritage. The MusiCHER SS welcomes contributions from a wide spectrum of disciplines and fields that set musical cultural heritage as a priority.
Important dates:
Paper deadline: 11 MAY 2026 (extended from 20 APRIL 2026)
Notification: 15 JUNE 2026 (extended from 22 MAY 2026)
Camera-ready: 22 JUNE 2026 (extended from 15 JUNE 2026)
Paper submission: Author Guidelines
Please indicate in the comments that this paper is for SS MusiCHER-2026
SS chairs
Understanding human behaviour is needed in many application areas: for example, safety at work or in transport, security, supporting the elderly, supporting learning, and supporting wellbeing. Humans are good at understanding other humans, which they do by observing their behaviour and/or through interaction, but AI lags notably behind in this respect due to marked individual differences and the context-dependency of human behaviour. In addition, many studies of this problem collect data in controlled laboratory settings, which do not reveal the full complexity and context-dependency of human behaviour and hence do not provide adequate data for training AI to better understand it.
Topics of this special session include the following:
Important dates:
Paper deadline: 11 MAY 2026 (extended from 20 APRIL 2026)
Notification: 15 JUNE 2026 (extended from 22 MAY 2026)
Camera-ready: 22 JUNE 2026 (extended from 15 JUNE 2026)
Paper submission: Author Guidelines
Please indicate in the comments that this paper is for SS UHBER-2026
SS chairs
Multimedia technologies are transforming not only how information and experiences are created and consumed, but also how communities interact with digital and physical spaces, participate in decision-making processes, and shape their social and physical environments. Advances in multimedia analysis, artificial intelligence, and interactive systems are increasingly embedded in processes such as public space planning, mobility design, and civic participation, enabling the collection, analysis, and visualization of multimodal data (e.g., audio, visual, spatial, and interaction data) that shape how citizens communicate, participate, and make decisions in public and social contexts.
However, these developments raise critical challenges related to ethics, transparency, accessibility, inclusion, and trust, particularly when AI-driven multimedia systems are deployed in socially sensitive and participatory contexts.
This special session addresses these challenges by focusing on human-centered approaches at the intersection of ethical AI and participatory design of public spaces and mobility. It brings together research that integrates multimedia systems, AI-driven tools, and co-creation methodologies to support inclusive community engagement while ensuring responsible, transparent, and trustworthy technological development. The session is organized by two Horizon Europe projects, SPICE and ALFIE, which emphasize ethics-by-design, participatory frameworks, and robust data governance practices in the development of digital tools for societal use.
A central motivation of the session is the recognition that multimedia technologies are no longer neutral intermediaries, but active agents shaping social interaction, access to information, and collective decision-making. As such, their design and deployment require explicit consideration of ethical, legal, and societal requirements, including data protection, explainability, accessibility, and accountability. In parallel, participatory and co-creative approaches highlight the need for multimedia and AI systems that can meaningfully incorporate citizen input, diverse perspectives, and local knowledge, particularly in the context of public space transformation, mobility, and sustainability.
The session therefore aims to foster contributions that explore how multimedia analysis, AI, and interactive systems can support:
Overall, this special session focuses on how multimedia analysis, indexing, and AI technologies can be applied in ethical and participatory real-world contexts. It brings together research on multimedia systems, AI-driven tools, and interactive platforms with approaches that ensure transparency, accessibility, data protection, and citizen involvement. It provides a dedicated forum for transdisciplinary contributions that combine methodological innovation with real-world experimentation, contributing to the development of multimedia systems that are not only technically advanced, but also aligned with societal values and community needs.
Important dates:
Paper deadline: 11 MAY 2026 (extended from 20 APRIL 2026)
Notification: 15 JUNE 2026 (extended from 22 MAY 2026)
Camera-ready: 22 JUNE 2026 (extended from 15 JUNE 2026)
Paper submission: Author Guidelines
Please indicate in the comments that this paper is for SS ETCE-2026
SS chairs
Recent advances in machine learning, and in particular deep learning, have led to remarkable performance gains in multimedia analysis tasks. However, these advances have also raised questions about the reliability, explainability, and fairness of model predictions for decision-making (e.g., the black-box nature of deep models and the risk of biased outcomes). This lack of transparency and potential unfairness raises many ethical and political concerns that prevent wider adoption of this potentially highly beneficial technology, especially when such systems are deployed in high-stakes or socially sensitive domains.
Most multimedia applications, such as person detection/tracking, face recognition, or lifelog analysis, involve sensitive personal information. This raises both legal issues, such as data protection requirements and the ongoing European AI regulation, and ethical concerns related to discrimination, demographic bias, and potential misuse of these technologies.
These challenges are particularly acute in multimedia applications, where models operate on high-dimensional, multimodal data, and where predictions frequently rely on subtle semantic cues that are difficult to interpret even for human experts. Biases may emerge from data imbalance, annotation practices, model design, or deployment contexts, and may disproportionately affect certain individuals or communities. It is therefore crucial not only to understand how predictions correlate with information perception and expert decision-making but also whether they are equitable across groups and aligned with societal values. The objective of eXplainable AI (XAI) and Fair AI is to improve transparency, mitigate bias, and foster meaningful human understanding of AI systems.
This special session focuses on methods and applications for explainable and fair multimedia analysis, with an emphasis on explanations that are faithful to the underlying models, meaningful to end users, actionable for domain experts, and supportive of bias detection and mitigation. The goal is to bring together researchers and practitioners working on theoretical, methodological, and applied aspects of explainability, fairness, evaluation, and interaction in multimedia AI systems.
Topics of interest include (but are not limited to):
The special session aims to collect high-quality scientific contributions that advance the state of the art in explainable and fair multimedia analysis, and to foster interdisciplinary discussion on how transparency, fairness, and accountability can be jointly addressed in multimedia AI systems. By integrating explainability and fairness, the session seeks to promote trustworthy AI technologies that enhance societal benefit while minimizing risks of bias, discrimination, and unintended harm.
Important dates:
Paper deadline: 11 MAY 2026 (extended from 20 APRIL 2026)
Notification: 15 JUNE 2026 (extended from 22 MAY 2026)
Camera-ready: 22 JUNE 2026 (extended from 15 JUNE 2026)
Paper submission: Author Guidelines
Please indicate in the comments that this paper is for SS ExFMA-2026
SS chairs
Smart agriculture relies on Artificial Intelligence (AI) models and Earth Observation (EO) data. EO data is Big Data: the EU's Copernicus EO programme alone produces petabytes of data per day. In this context, AI models must learn from and deal with multimodal data. The development of robust AI models for smart agriculture fundamentally depends on access to large-scale, high-quality EO datasets that capture the spatial, temporal and spectral variability inherent in agricultural systems. Using Deep Learning architectures for crop classification, phenological monitoring and stress detection requires massive volumes of data spanning diverse geographical regions, climatic conditions and growing seasons to ensure model generalizability.
Multimodal data fusion for agriculture represents an emerging interdisciplinary field that combines heterogeneous data sources—including satellite imagery, drone-based sensors, IoT devices, weather data, soil sensors, and genomic information—to create comprehensive analytical frameworks for precision agriculture and sustainable food production. This approach leverages advanced machine learning, computer vision and signal processing techniques to integrate temporal, spatial and spectral data streams, enabling more accurate crop monitoring, yield prediction, disease detection and resource optimization than any single data modality could achieve alone.
The field addresses critical challenges in feeding a growing global population while minimizing environmental impact, drawing on expertise from remote sensing, agricultural science, data science and environmental engineering.
This special session welcomes contributions that use different types of data for agricultural applications and are related to one or more of the following topics of interest:
Fusion Methodologies and Algorithms
Data Acquisition and Sensing Technologies
Agricultural Applications
Emerging Challenges
Integration with Decision Support
Important dates:
Paper deadline: 11 MAY 2026 (extended from 20 APRIL 2026)
Notification: 15 JUNE 2026 (extended from 22 MAY 2026)
Camera-ready: 22 JUNE 2026 (extended from 15 JUNE 2026)
Paper submission: Author Guidelines
Please indicate in the comments that this paper is for SS MDFSA-2026
SS chairs