Peixi Xiong

Staff AI Research Scientist/Engineer & Analog Film Photographer

Ph.D. in Computer Vision from Northwestern University, advised by Prof. Ying Wu. Currently a Staff AI Research Scientist/Engineer at Intel Labs, working on multimodal learning, structured reasoning, and agentic AI. Also an analog film photographer shooting medium format landscapes.

Google Scholar · LinkedIn · GitHub · Instagram · CV

Research & Academic

Focused on multimodal learning, structured reasoning, and foundation model applications, including Retrieval-Augmented Generation and agentic AI systems.

Published at CVPR, ECCV, EMNLP, ICLR, AAAI, ICCV, and WACV
Multiple U.S. patents in visual reasoning, RAG, and generative AI
Editorial Advisory Board Member, Information Processing & Management (Elsevier)

Analog Film Photography

Landscape-focused analog photography, with ongoing collaborations with film brands, labs, and manufacturers, as well as community features.

Medium format and 35mm analog film work
Self-developed C-41, E-6, and ECN-2 film processes
Collaborations with film brands, labs, and manufacturers, plus community features

Research

Research Summary

My research sits at the intersection of computer vision and natural language processing. I work on multimodal learning, structured visual reasoning, and foundation model applications, including Retrieval-Augmented Generation (RAG) and agentic AI systems. I am currently a Staff AI Research Scientist/Engineer at Intel Labs.

Research Interests

Agentic AI · Structured Reasoning · Multimodal Learning · Visual Question Answering · Computer Vision · Retrieval-Augmented Generation

Publications

Selected Publications

A selection of recent and representative work.

CVPR 2026

RARE: Learn to RAnk and REtrieve for Monocular 3D Object Detection

Hyeonjeong Park, Peixi Xiong, Pei Yu, Wei Tang

CVPR 2025

Learning Partonomic 3D Reconstruction from Image Collections

Xiaoqian Ruan, Pei Yu, Dian Jia, Hyeonjeong Park, Peixi Xiong, Wei Tang

AAAI 2025

DASK: Distribution Rehearsing via Adaptive Style Kernel Learning for Exemplar-Free Lifelong Person Re-Identification

Kunlun Xu, Chenghao Jiang, Peixi Xiong, Yuxin Peng, Jiahuan Zhou

ICLR 2025 · SCOPE Workshop

Context Is All You Need: Efficient Retrieval Augmented Generation for Domain Specific AI

Peixi Xiong, Chaunte W. Lacewell, Sameh Gobriel, Nilesh Jain

ECCV 2024

Textual-Visual Logic Challenge: Understanding and Reasoning in Text-to-Image Generation

Peixi Xiong, Michael Kozuch, Nilesh Jain

EMNLP 2024 · Findings

Learning to Ask Denotative and Connotative Questions for Knowledge-based VQA

Xiaoying Xing, Peixi Xiong, Lei Fan, Yunxuan Li, Ying Wu

Neurocomputing 2022 · Journal

Visual Question Answering by Pattern Matching and Reasoning

Huayi Zhan, Peixi Xiong, Xin Wang, Xin Wang, Lan Yang

CVPR 2020 · Oral

TA-Student VQA: Multi-Agents Training by Self-Questioning

Peixi Xiong, Ying Wu

CVPR 2019

Visual Query Answering by Entity-Attribute Graph Matching and Reasoning

Peixi Xiong, Huayi Zhan, Xin Wang, Baivab Sinha, Ying Wu

Intellectual Property

Patents

U.S. patents in visual reasoning, retrieval-augmented generation, and generative AI.

Retrieval-Augmented Generation for Domain-Specific Technical Documents

US Patent App. 19/340,352 · 2026

Multi-Modality Reinforcement Learning in Logic-Rich Scene Generation

US Patent App. 19/229,127 · 2025

Multi-Granularity Alignment for Visual Question Answering

US Patent 12,210,835 · 2025

Network for Structure-Based Text-to-Image Generation

US Patent App. 18/400,561 · 2024

Semantic-Guided Transformer for Object Recognition and Radiance Field-Based Novel View

US Patent App. 18/475,353 · 2024

Experience

Selected Industry Experience

Prior research internships in computer vision and multimodal AI.

Research Intern

Samsung Research America — Mountain View, CA · Summer 2021

Proposed a hierarchical Transformer for VQA reasoning (MGA-VQA) and developed a decision fusion module for effective multi-Transformer collaboration.

Research Intern — Computer Vision

Microsoft Corporation — Redmond, WA · Summer 2020

Proposed a graph-based architecture for VQA reasoning (SA-VQA) and incorporated multi-head attention mechanisms for enhanced multimodal alignment.

Research Intern

SAIC Innovation Center — San Jose, CA · Summer 2018

Proposed a novel architecture for LiDAR-based autonomous driving performing simultaneous object detection and instance segmentation, with attention-enhanced small object detection.

Professional Activities