Papers Accepted at International Conferences
Multimodal research is widely studied in the fields of machine learning and deep learning, and a large number of multimodal papers have been published. From international conferences related to deep learning, such as
- International Conference on Learning Representations (ICLR)
- Conference and Workshop on Neural Information Processing Systems (NeurIPS)
- Conference on Computer Vision and Pattern Recognition (CVPR)
and other venues, this article introduces the papers related to multimodal learning that were accepted in 2020.
List of Multimodal Papers
ICML (International Conference on Machine Learning)
ICLR (International Conference on Learning Representations)
- AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures
- Variational Hetero-Encoder Randomized GANs for Joint Image-Text Modeling
NeurIPS (Conference and Workshop on Neural Information Processing Systems)
- Self-Supervised MultiModal Versatile Networks
- The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
- Multimodal Graph Networks for Compositional Generalization in Visual Question Answering
- Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
- Labelling unlabelled videos from scratch with multi-modal self-supervision
- Deep Multimodal Fusion by Channel Exchanging
- Multimodal Generative Learning Utilizing Jensen-Shannon-Divergence
- Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders
- CoMIR: Contrastive Multimodal Image Representation for Registration
- CodeCMR: Cross-Modal Retrieval For Function-Level Binary Source Code Matching
- Self-Supervised Learning by Cross-Modal Audio-Video Clustering
- Learning Representations from Audio-Visual Spatial Alignment
- Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching
- See, Hear, Explore: Curiosity via Audio-Visual Association
IJCAI (International Joint Conference on Artificial Intelligence)
- EViLBERT: Learning Task-Agnostic Multimodal Sense Embeddings
- Embodied Multimodal Multitask Learning
- Triple-GAIL: A Multi-Modal Imitation Learning Framework with Generative Adversarial Nets
- Interpretable Multimodal Learning for Intelligent Regulation in Online Payment Systems
- A Similarity Inference Metric for RGB-Infrared Cross-Modality Person Re-identification
- Mucko: Multi-Layer Cross-Modal Knowledge Reasoning for Fact-based Visual Question Answering
- Set and Rebase: Determining the Semantic Graph Connectivity for Unsupervised Cross-Modal Hashing
- Modeling Dense Cross-Modal Interactions for Joint Entity-Relation Extraction
CVPR (Conference on Computer Vision and Pattern Recognition)
- Multi-Modal Domain Adaptation for Fine-Grained Action Recognition
- Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior (overview video)
- Creating Something From Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing (overview video)
- Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data
- Semantically Multi-modal Image Synthesis
- Cross-Modal Deep Face Normals With Deactivable Skip Connections
- Knowledge As Priors: Cross-Modal Knowledge Generalization for Datasets Without Superior Knowledge (overview video)
- Cross-Modal Pattern-Propagation for RGB-T Tracking
- A Local-to-Global Approach to Multi-Modal Movie Scene Segmentation
- End-to-End Adversarial-Attention Network for Multi-Modal Clustering
- CoverNet: Multimodal Behavior Prediction Using Trajectory Sets
- Where, What, Whether: Multi-Modal Learning Meets Pedestrian Detection
- Multimodal Categorization of Crisis Events in Social Media
- Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA (overview video)
- Modality Shifting Attention Network for Multi-Modal Video Question Answering (overview video)
- Hypergraph Attention Networks for Multimodal Learning
- MMTM: Multimodal Transfer Module for CNN Fusion
- What Makes Training Multi-Modal Classification Networks Hard?
- Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather (overview video)
- Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation (overview video)
- MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion (overview video)
- Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text (overview video)
- EmotiCon: Context-Aware Multimodal Emotion Recognition Using Frege’s Principle (overview video)
- Multi-Modality Cross Attention Network for Image and Sentence Matching
- nuScenes: A Multimodal Dataset for Autonomous Driving
- Discriminative Multi-Modality Speech Recognition
ACL (Association for Computational Linguistics)
- A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation
- A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks
- CH-SIMS: A Chinese Multimodal Sentiment Analysis Dataset with Fine-grained Annotation of Modality
- Integrating Multimodal Information in Large Pretrained Transformers
- MMPE: A Multi-Modal Interface for Post-Editing Machine Translation
- Multimodal Neural Graph Memory Networks for Visual Question Answering
- MultiQT: Multimodal learning for real-time question tracking in speech
- Towards Emotion-aided Multi-modal Dialogue Act Classification
- Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting
- Fatality Killed the Cat or: BabelPic, a Multimodal Dataset for Non-Concrete Concepts
- Multimodal and Multiresolution Speech Recognition with Transformers
- Multimodal Quality Estimation for Machine Translation
- Multimodal Transformer for Multimodal Machine Translation
- GAIA: A Fine-grained Multimedia Knowledge Extraction System
- Adaptive Transformers for Learning Multimodal Representations
- Cross-media Structured Common Space for Multimedia Event Extraction
- Cross-modal Coherence Modeling for Caption Generation
- Cross-modal Language Generation using Pivot Stabilization for Web-scale Language Coverage
- Cross-Modality Relevance for Reasoning on Language and Vision