Me
Hexiang (Frank) Hu
Member of Technical Staff @xAI
hexiang.frank.hu [at] gmail.com
I believe that:
'There is only one heroism in the world:
to see the world as it is and to love it.'
-- Romain Rolland

Biography

Hexiang Hu is a Member of Technical Staff at xAI. Prior to that, he was a Research Scientist at Google DeepMind. He earned his Ph.D. in Computer Science from the Viterbi School of Engineering at the University of Southern California (USC). His long-term research goal is to build agents that understand human language in perceptual and embodied environments. [ CV ]

Present
xAI
Member of Technical Staff
Nov 2024
Google DeepMind
Research Scientist
May 2021
University of Southern California
Ph.D. in Computer Science

News

 June 2024
MagicLens accepted at ICML 2024 as an Oral.
 May 2024
Imagen 3 released.
 March 2024
Instruct-Imagen accepted at CVPR 2024 as an Oral.
 Dec 2023
Gemini released.

Publications

(*: Indicating equal contribution.)
Grok 3
Blog Post
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
ICML 2024 (Oral), Vienna, Austria
Instruct-Imagen: Image Generation with Multi-modal Instruction
CVPR 2024 (Oral), Seattle, WA
Subject-driven Text-to-Image Generation via Apprenticeship Learning
NeurIPS 2023, New Orleans, LA
Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities
ICCV 2023 (Oral), Paris, France
Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?
EMNLP 2023
PaLI-X: On Scaling up a Multilingual Vision and Language Model
CVPR 2024, Seattle, WA
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
ICML 2023 (Oral), Honolulu, HI
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
ICLR 2023, Kigali, Rwanda
Learning the Best Pooling Strategy for Visual Semantic Embedding
CVPR 2021 (Oral), Virtual
Few-Shot Learning via Embedding Adaptation with Set-to-Set Functions
CVPR 2020, Seattle, WA