
Oluwagbenga Orimoogunje Master’s Thesis Defense, Friday, May 1, 2026 @ 3:30 pm Central Time
May 1 @ 3:30 pm - 4:30 pm
COMMITTEE CHAIR: Dr. Lijun Qian
TITLE: LLM-ASSISTED TASK-ADAPTIVE SEMANTIC COMMUNICATIONS OVER NON-IDEAL CHANNELS FOR VISION-LANGUAGE SYSTEMS
ABSTRACT: The growth of intelligent wireless systems and bandwidth-limited edge devices has created a need for communication methods that move beyond bit-level data transmission. Semantic communication addresses this by transmitting task-relevant meaning, which reduces bandwidth usage while preserving task performance. However, existing approaches often rely on frozen encoders without task feedback and are trained under fixed or ideal signal-to-noise ratio (SNR) conditions. These limitations lead to poor generalization in realistic noisy channels. This thesis proposes a task-adaptive semantic communication framework for Visual Question Answering (VQA) over bandwidth-constrained Additive White Gaussian Noise (AWGN) channels. The system transmits a compact 16,384-dimensional semantic feature map from a ResNet-50 encoder, achieving a 9× compression of the 224 × 224 × 3 input image while maintaining strong task accuracy under channel noise. A frozen BERT-based question encoder provides language grounding, and a multimodal fusion head performs classification over 1,000 answer classes. Two main contributions are introduced. First, a multi-SNR training strategy improves robustness by training the encoder across a wide SNR range with a bias toward low-SNR conditions. This increases mean VQA accuracy for SNR = -5 dB by 3.8%, with a maximum gain of 6.3% at -20 dB, while preserving high-SNR performance (Δ = +0.3%). Second, a semantic feedback mechanism is developed using embedding-based language models, including GTE-Qwen2-7B, SBERT (all-mpnet-base-v2), BERT-base-uncased, and GPT-2, as frozen reward oracles in a REINFORCE framework. These models compute cosine similarity between predicted and ground-truth answers, providing richer training signals than binary rewards and enabling consistent encoder adaptation under noise and bandwidth constraints.
This approach yields accuracy gains of 1.9–2.1% over the supervised baseline, while vision-based and heuristic rewards underperform and can reduce accuracy by up to 4.7% at -20 dB. Experiments on the VQAv2 dataset using a single NVIDIA Tesla V100 GPU confirm that text-based semantic similarity rewards outperform vision-based and heuristic feedback for encoder adaptation. This work presents a systematic comparison of embedding-based semantic judges as REINFORCE reward signals and demonstrates their effectiveness under realistic noisy channel conditions. The results support the development of robust and bandwidth-efficient semantic communication systems for future wireless networks.
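ILLUSTRATION: For attendees less familiar with the building blocks named in the abstract, the following is a minimal Python sketch and not the thesis implementation. It shows the compression arithmetic, an AWGN channel at a target SNR, one possible way to sample training SNRs with a bias toward low values, and a cosine-similarity reward. All function names, the -20 to +20 dB range, the squared-uniform bias, and the toy vectors standing in for language-model embeddings are assumptions made for illustration only.

```python
import math
import random


def compression_ratio(height, width, channels, feature_dim):
    """Ratio of raw image values to transmitted feature values."""
    return (height * width * channels) / feature_dim


def awgn(features, snr_db, rng):
    """Add white Gaussian noise scaled so that the SNR equals snr_db."""
    signal_power = sum(x * x for x in features) / len(features)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in features]


def sample_training_snr(rng, low=-20.0, high=20.0, low_bias=2.0):
    """Draw an SNR in [low, high] dB, skewed toward the low end.

    Hypothetical scheme: raising a uniform draw to a power > 1
    concentrates probability mass near `low`.
    """
    u = rng.random() ** low_bias
    return low + u * (high - low)


def cosine_reward(pred_vec, true_vec):
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    dot = sum(p * t for p, t in zip(pred_vec, true_vec))
    norm = math.sqrt(sum(p * p for p in pred_vec)) * math.sqrt(
        sum(t * t for t in true_vec)
    )
    return dot / norm


if __name__ == "__main__":
    rng = random.Random(0)
    # 224 x 224 x 3 image values versus a 16,384-dimensional feature map.
    print(compression_ratio(224, 224, 3, 16384))
    # Pass a toy feature vector through the channel at a sampled SNR.
    features = [rng.gauss(0.0, 1.0) for _ in range(16384)]
    snr_db = sample_training_snr(rng)
    received = awgn(features, snr_db, rng)
    print(snr_db, len(received))
    # A graded reward, unlike a binary right/wrong signal.
    print(cosine_reward([1.0, 0.0, 1.0], [1.0, 0.5, 1.0]))
```

The compression figure works out to 150,528 / 16,384 = 9.1875, consistent with the stated 9×. In a REINFORCE setup of the kind the abstract describes, a reward like `cosine_reward` would weight the log-probability of the sampled answer, so that near-miss answers still contribute a useful training signal.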
Keywords: Semantic Communications, Deep Learning, Visual Question Answering, Signal-to-Noise Ratio
Room Location: Electrical Engineering Conference Room 315D


