
Communications of the ACM

ACM TechNews

Mutual Attention Inception Network Developed for Remote Sensing Visual Question Answering

A diagram depicting a part of the proposed method.

Remote sensing visual question answering mainly aims to make semantic understanding of remote sensing images (RSIs) objective and interactive. Specifically, given an RSI, an intelligent agent answers a question about the remote sensing scene.

Credit: XIOPM

Chinese Academy of Sciences researchers have designed a novel mutual attention inception network (MAIN) and a remote sensing visual question answering (RSIVQA) dataset.

Remote sensing visual question answering chiefly aims to make semantic comprehension of remote sensing images (RSIs) objective and interactive. Most existing techniques are limited because they disregard the spatial information of RSIs and the word-level semantic information of questions.

MAIN combines a representation module and a fusion module. The representation module is designed to extract image and question features that provide better representations, while the fusion module reinforces the joint image-question representation so that it discriminates the correct answer more reliably.
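To make the representation-then-fusion idea concrete, here is a minimal sketch of a generic mutual (co-)attention step in plain Python. It is not the authors' MAIN architecture, whose details are not given in this summary. The function name `mutual_attention`, the mean-pooled affinity scores, and the element-wise product used for fusion are all assumptions chosen for illustration; the inputs stand in for image-region features and question-word features that a representation module would produce.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, rows):
    """Combine row vectors into one vector using the given weights."""
    dim = len(rows[0])
    return [sum(w * row[k] for w, row in zip(weights, rows)) for k in range(dim)]


def mutual_attention(image_regions, question_words):
    """Hypothetical co-attention: each modality is re-weighted by the other.

    image_regions: list of region feature vectors (spatial information).
    question_words: list of word feature vectors (word-level semantics).
    Returns (region_weights, word_weights, fused_vector).
    """
    # affinity[i][j] measures how well region i matches word j.
    affinity = [[dot(r, w) for w in question_words] for r in image_regions]

    # Assumption: score a region by its mean affinity to all words,
    # and a word by its mean affinity to all regions.
    n_regions, n_words = len(image_regions), len(question_words)
    region_scores = [sum(row) / n_words for row in affinity]
    word_scores = [
        sum(affinity[i][j] for i in range(n_regions)) / n_regions
        for j in range(n_words)
    ]

    region_weights = softmax(region_scores)
    word_weights = softmax(word_scores)

    # Attended summaries of each modality.
    image_vec = weighted_sum(region_weights, image_regions)
    question_vec = weighted_sum(word_weights, question_words)

    # Assumption: fuse by element-wise product of the two summaries.
    fused = [a * b for a, b in zip(image_vec, question_vec)]
    return region_weights, word_weights, fused


# Toy data: three image regions and two question words, two features each.
regions = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
words = [[1.0, 0.0], [1.0, 0.0]]
region_w, word_w, fused = mutual_attention(regions, words)
```

In this toy input, the first region is the only one that matches the question words, so it receives the largest attention weight. A real system would learn the features and add a classifier over the fused vector to pick an answer.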

Experimental results indicated that the method captures the alignment between images and questions, as measured by several evaluation metrics.

From Chinese Academy of Sciences


Abstracts Copyright © 2021 SmithBucklin, Washington, DC, USA

