Research Applications of Neurosymbolic Methods
Project Overview
In our lab, we explore how neurosymbolic methods, probabilistic inference, and learning-based optimization can be applied to real-world domains. You will work on combining neural perception with symbolic structure, uncertainty modeling, and interpretable reasoning to address applied problems in vision, human-AI interaction, healthcare, energy systems, and multimodal understanding.
Research Focus Areas
1. Computer Vision and Video Understanding
Focus: Building models that understand visual environments, recognize human activities, and reason over temporal and relational structure in videos.
Key Problems:
- Activity recognition in egocentric and third-person video
- Procedural task understanding in complex environments
- Object detection and tracking in dynamic scenes
- Scene graph construction and relational reasoning
Methods:
- Deep learning for visual recognition and feature extraction
- Neurosymbolic models for interpretable activity and event reasoning
- Probabilistic temporal models for multi-step prediction (see the sketch after this list)
- Graph-based representations for structured scene understanding
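To make the probabilistic temporal modeling concrete, here is a minimal sketch (not our actual pipeline): HMM-style forward filtering that combines per-frame neural classifier scores with a hand-written symbolic transition matrix over activity labels. The labels, probabilities, and random "scores" are hypothetical placeholders.

```python
import numpy as np

# Hypothetical activity labels and a hand-written transition matrix
# encoding symbolic structure (e.g., "chop" tends to precede "stir").
ACTIVITIES = ["chop", "stir", "pour"]
TRANSITIONS = np.array([
    [0.7, 0.2, 0.1],   # chop -> chop / stir / pour
    [0.1, 0.7, 0.2],   # stir -> ...
    [0.1, 0.2, 0.7],   # pour -> ...
])

def forward_filter(frame_likelihoods: np.ndarray) -> np.ndarray:
    """HMM-style forward filtering: combine per-frame neural scores
    (treated as emission likelihoods) with the symbolic transition
    structure to get a posterior over the current activity."""
    n_frames, n_states = frame_likelihoods.shape
    belief = np.full(n_states, 1.0 / n_states)   # uniform prior
    posteriors = np.empty_like(frame_likelihoods)
    for t in range(n_frames):
        belief = belief @ TRANSITIONS            # predict step
        belief *= frame_likelihoods[t]           # update with neural scores
        belief /= belief.sum()                   # renormalize
        posteriors[t] = belief
    return posteriors

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for softmax outputs of a per-frame activity classifier.
    fake_scores = rng.dirichlet(np.ones(len(ACTIVITIES)), size=10)
    for t, p in enumerate(forward_filter(fake_scores)):
        print(t, ACTIVITIES[int(p.argmax())], p.round(2))
```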
Example Work:
- CaptainCook4D: Egocentric 4D dataset for procedural task understanding (NeurIPS’24, DMLR’23)
- Explainable Activity Recognition: Interpretable models for human activity understanding (TiiS’23)
- Neurosymbolic Models for Activity Recognition and Image Classification: Deep dependency networks for multi-label classification in images and videos (AISTATS’24)
2. Human-AI Interaction and Task Guidance
Focus: Developing systems that provide real-time assistance for physical and cognitive tasks through perception, prediction, and symbolic task knowledge.
Key Problems:
- Real-time task guidance in augmented reality
- Predictive assistance for multi-step procedural workflows
- Error detection and recovery in human activities
- Adaptive instruction generation
Methods:
- Neurosymbolic models integrating perception with symbolic task graphs (see the sketch after this list)
- Probabilistic inference for action prediction and intent estimation
- Multimodal reasoning over visual, language, and contextual signals
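As a toy illustration of pairing perception with a symbolic task graph, the sketch below checks which steps of a hypothetical recipe become available once earlier steps are observed as complete. The step names, prerequisite edges, and the idea of feeding it activity-recognition output are illustrative assumptions, not a description of our deployed systems.

```python
# Minimal task-graph guidance: surface the steps whose prerequisites
# have all been confirmed by a perception module. Hypothetical steps.
TASK_GRAPH = {                       # step -> prerequisite steps
    "boil water": [],
    "chop vegetables": [],
    "add vegetables": ["boil water", "chop vegetables"],
    "season": ["add vegetables"],
}

def next_steps(completed: set) -> list:
    """Return steps that are ready to execute: not yet completed and
    with every prerequisite already observed as done."""
    return [
        step for step, prereqs in TASK_GRAPH.items()
        if step not in completed and all(p in completed for p in prereqs)
    ]

if __name__ == "__main__":
    # Stand-in for activity-recognition output on the video so far.
    observed_done = {"boil water"}
    print(next_steps(observed_done))   # ['chop vegetables']
```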
Example Work:
- Predictive Task Guidance in AR: Real-time guidance systems for complex tasks (IEEE VR’24)
- CaptainCook4D: Egocentric 4D dataset for procedural task understanding (NeurIPS’24, DMLR’23)
- Real-time AR Guidance Systems: Built systems accelerating task completion (DARPA PTG)
3. Medical and Healthcare Applications
Focus: Applying AI to clinical decision support, diagnostics, and health systems optimization.
Key Problems:
- Disease diagnosis and prognosis
- Treatment planning and personalization
- Medical image analysis
- Healthcare resource optimization
Methods:
- Probabilistic models for uncertainty quantification and risk estimation (see the sketch after this list)
- Explainable AI for clinical decision support
- Graph-based patient modeling and knowledge graph inference
- Learning-based models for diagnostic and prognostic prediction
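As one hedged example of uncertainty quantification for risk estimation, the sketch below computes a Beta-Binomial posterior and a credible interval over an adverse-event rate for a hypothetical cohort; the counts and prior are made up for illustration and do not come from any of our studies.

```python
import numpy as np

def risk_posterior(events: int, patients: int, prior=(1.0, 1.0)):
    """Beta-Binomial conjugate update: with a Beta(a, b) prior and
    `events` adverse outcomes out of `patients`, the posterior over
    the event rate is Beta(a + events, b + patients - events)."""
    a, b = prior
    return a + events, b + patients - events

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical cohort: 12 adverse events in 200 patients.
    a_post, b_post = risk_posterior(events=12, patients=200)
    samples = rng.beta(a_post, b_post, size=100_000)
    lo, hi = np.percentile(samples, [2.5, 97.5])
    print(f"mean risk {samples.mean():.3f}, "
          f"95% credible interval [{lo:.3f}, {hi:.3f}]")
```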
Applications:
- Disease spread modeling and intervention planning
- Personalized treatment and risk-based stratification
- Explainable medical image analysis
- Hospital resource allocation and scheduling
4. Energy Systems and Infrastructure
Focus: Optimizing and forecasting behavior in large-scale infrastructure systems.
Key Problems:
- Power grid optimization and stability analysis
- Smart grid management and demand-side forecasting
- Maintenance scheduling in large infrastructure networks
- Integration of renewable energy sources
Methods:
- Graph neural networks for grid and network modeling
- Reinforcement learning for dynamic resource allocation
- Probabilistic models for forecasting and reliability analysis
- Combinatorial optimization for scheduling and planning (see the sketch after this list)
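To illustrate the combinatorial-optimization flavor of this work, here is a classic weighted-interval-scheduling dynamic program, framed as choosing non-overlapping maintenance windows of maximum total priority. The windows and priorities are hypothetical; real grid scheduling involves many more constraints.

```python
import bisect

# Hypothetical maintenance windows: (start_hour, end_hour, priority).
WINDOWS = [(0, 4, 3.0), (2, 6, 5.0), (5, 8, 2.0), (7, 11, 4.0)]

def best_schedule(windows):
    """Weighted interval scheduling via dynamic programming: pick a
    set of non-overlapping windows maximizing total priority."""
    jobs = sorted(windows, key=lambda w: w[1])   # sort by end time
    ends = [w[1] for w in jobs]
    best = [0.0] * (len(jobs) + 1)               # best[i]: first i jobs
    for i, (start, end, weight) in enumerate(jobs, start=1):
        # Latest earlier job finishing no later than `start` (compatible).
        p = bisect.bisect_right(ends, start, 0, i - 1)
        best[i] = max(best[i - 1], best[p] + weight)
    return best[-1]

if __name__ == "__main__":
    print(best_schedule(WINDOWS))   # 9.0: windows (2, 6) and (7, 11)
```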
5. Natural Language Processing and Reasoning
Focus: Developing systems that combine language, vision, and structured knowledge for reasoning and decision making.
Key Problems:
- Multimodal reasoning across text, images, and video
- Knowledge-grounded question answering
- Language-guided planning and action prediction
- Document understanding and structured information extraction
Methods:
- Neurosymbolic models integrating language with symbolic knowledge bases (see the sketch after this list)
- Probabilistic reasoning for ambiguity resolution
- Graph-based representations for knowledge and relational structure
- Deep learning for language understanding and grounding tasks
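As a minimal sketch of knowledge-grounded reasoning, the snippet below pairs a tiny triple store with a transitive is_a rule; in a real system a neural language model would map the question to the symbolic query. All entities, relations, and the query interface are hypothetical.

```python
# Tiny triple store standing in for a symbolic knowledge base.
TRIPLES = {
    ("aspirin", "treats", "headache"),
    ("headache", "is_a", "symptom"),
    ("symptom", "is_a", "clinical_finding"),
}

def query(subj=None, rel=None, obj=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in TRIPLES
            if (subj is None or t[0] == subj)
            and (rel is None or t[1] == rel)
            and (obj is None or t[2] == obj)]

def is_a_closure(entity):
    """Chain the transitive is_a rule: if x is_a y and y is_a z,
    then x is_a z. A minimal stand-in for symbolic reasoning."""
    found, frontier = set(), {entity}
    while frontier:
        nxt = {t[2] for e in frontier for t in query(subj=e, rel="is_a")}
        frontier = nxt - found
        found |= nxt
    return found

if __name__ == "__main__":
    # A neural parser would map "What does aspirin treat?" to this query.
    print(query(subj="aspirin", rel="treats"))
    print(is_a_closure("headache"))  # {'symptom', 'clinical_finding'}
```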
Applications:
- Visual question answering and multimodal inference
- Instruction following and task planning
- Knowledge base reasoning and retrieval
- Multimodal document and scene interpretation
How To Apply
Please submit your details using the Google Form.
Note: In the form, select “Applications in Multimodal Reasoning” or “Applications in Computer Vision and Video Understanding”, or choose “Other” and specify your interests.
Selected students may be invited for a brief meeting to discuss fit and potential directions.
For general lab information and university details, see the main hiring page.