Jiahui Yu, PhD

Member of Technical Staff, OpenAI

Google Scholar
GitHub  /  LinkedIn  /  Twitter


I am a Research Scientist at OpenAI. Before that, I was a Staff Research Scientist and Manager at Google Brain and Google DeepMind. I received my PhD from the University of Illinois at Urbana-Champaign, advised by Professor Thomas Huang, and my Bachelor's degree with distinction from the School of the Gifted Young in Computer Science, University of Science and Technology of China. During my graduate studies, I enjoyed a diverse and enriching array of internships at Microsoft Research Asia, Megvii, Adobe Research, Snap Research, Jump Trading, Baidu Research, Nvidia Research, and Google Brain.

I work on sequence modeling (language, speech, video, financial data), computer vision, generative models, and high performance computing.


Publications

(Google Scholar Profile)

Gemini: A Family of Highly Capable Multimodal Models
Gemini Team Google: Jiahui Yu, Co-Lead, Multimodal Vision.
ArXiv 2023 / Gemini

PaLM 2 Technical Report
PaLM2 Team Google: Jiahui Yu, Core Contributor, Architecture and Modeling Workstream; and Contributor, Fine-tuning Workstream.
ArXiv 2023 / PaLM2

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Jiahui Yu, Yuanzhong Xu, Jing Yu Koh, Thang Luong, Gunjan Baid, Zirui Wang, Vijay Vasudevan, Alexander Ku, Yinfei Yang, Burcu Karagol Ayan, Ben Hutchinson, Wei Han, Zarana Parekh, Xin Li, Han Zhang, Jason Baldridge, Yonghui Wu
TMLR 2022 / parti.research.google

CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu, Zirui Wang, Vijay Vasudevan, Legg Yeung, Mojtaba Seyedhosseini, Yonghui Wu
TMLR 2022 / Google AI Blog

Vector-quantized Image Modeling with Improved VQGAN
Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alex Ku, Yuanzhong Xu, Jason Baldridge, Yonghui Wu
ICLR 2022 / Google AI Blog

SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
Zirui Wang, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, Yuan Cao
ICLR 2022 / Google AI Blog

Self-supervised Learning with Random-projection Quantizer for Speech Recognition
Chung-Cheng Chiu, James Qin, Yu Zhang, Jiahui Yu, Yonghui Wu
ICML 2022

Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners
Shen Yan, Tao Zhu, Zirui Wang, Yuan Cao, Mi Zhang, Soham Ghosh, Yonghui Wu, Jiahui Yu
ArXiv 2022

Exploiting Category Names for Few-Shot Classification with Vision-Language Models
Taihong Xiao, Zirui Wang, Liangliang Cao, Jiahui Yu, Shengyang Dai, Ming-Hsuan Yang
ArXiv 2022

Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling
Jiahui Yu, Wei Han, Anmol Gulati, Chung-Cheng Chiu, Bo Li, Tara Sainath, Yonghui Wu, Ruoming Pang.
ICLR 2021

FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization
Jiahui Yu, Chung-Cheng Chiu, Bo Li, Shuo-yiin Chang, Tara Sainath, Yanzhang He, Arun Narayanan, Wei Han, Anmol Gulati, Yonghui Wu, Ruoming Pang.
ICASSP 2021

A Better and Faster End-to-End Model for Streaming ASR
Bo Li, Anmol Gulati, Jiahui Yu, Tara Sainath, Chung-Cheng Chiu, Arun Narayanan, Shuo-Yiin Chang, Ruoming Pang, Yanzhang He, James Qin, Wei Han, Qiao Liang, Yu Zhang, Trevor Strohman, Yonghui Wu.
ICASSP 2021

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang, Daniel Park, Wei Han, et al.
IEEE Journal of Selected Topics in Signal Processing, 2021

An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling
Tara Sainath, Yanzhang He, Arun Narayanan, et al.
INTERSPEECH 2021

Cascaded Encoders for Unifying Streaming and Non-streaming ASR
Arun Narayanan, Tara Sainath, Ruoming Pang, Jiahui Yu, Chung-Cheng Chiu, Rohit Prabhavalkar, Ehsan Variani, Trevor Strohman.
ICASSP 2021

Dynamic Sparsity Neural Networks for Automatic Speech Recognition
Zhaofeng Wu, Ding Zhao, Qiao Liang, Jiahui Yu, Anmol Gulati, Ruoming Pang.
ICASSP 2021

Co-training Transformer with Videos and Images Improves Action Recognition
Bowen Zhang, Jiahui Yu, Christopher Fifty, Wei Han, Andrew Dai, Ruoming Pang, Fei Sha
ArXiv 2021

Neural Sparse Representation for Image Restoration
Yuchen Fan, Jiahui Yu, Yiqun Mei, Yulun Zhang, Yun Fu, Ding Liu, Thomas Huang.
NeurIPS 2020 / Code

Generative Adversarial Networks for Image and Video Synthesis: Algorithms and Applications
Ming-Yu Liu, Xun Huang, Jiahui Yu, Ting-Chun Wang, Arun Mallya.
Proceedings of the IEEE 2020

BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models
Jiahui Yu, Pengchong Jin, Hanxiao Liu, Gabriel Bender, Pieter-Jan Kindermans, Mingxing Tan, Thomas Huang, Xiaodan Song, Ruoming Pang, Quoc Le.
ECCV 2020

ContextNet: Improving Convolutional Neural Networks for ASR with Global Context
Wei Han, Zhengdong Zhang, Yu Zhang, Jiahui Yu, Chung-Cheng Chiu, James Qin, Anmol Gulati, Ruoming Pang, Yonghui Wu.
INTERSPEECH 2020

Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang.
INTERSPEECH 2020

FSNet: Compression of Deep Convolutional Neural Networks by Filter Summary
Yingzhen Yang, Jiahui Yu, Nebojsa Jojic, Jun Huan, Thomas Huang.
ICLR 2020 / OpenReview

Scale-wise Convolution for Image Restoration
Yuchen Fan, Jiahui Yu, Ding Liu, Thomas Huang.
AAAI 2020 / Code

Pyramid Attention Networks for Image Restoration
Yiqun Mei, Yuchen Fan, Yulun Zhang, Jiahui Yu, Yuqian Zhou, Ding Liu, Yun Fu, Thomas Huang, Humphrey Shi.
IJCV 2020

AutoSlim: Towards One-Shot Architecture Search for Channel Numbers
Jiahui Yu and Thomas Huang.
NeurIPS Workshop 2019 / Code

Universally Slimmable Networks and Improved Training Techniques
Jiahui Yu and Thomas Huang.
ICCV 2019 / Code

Free-Form Image Inpainting with Gated Convolution (DeepFill v2)
Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas Huang.
ICCV 2019 (Oral Presentation) / Code

Fast Proximal Gradient Descent for Non-convex Optimization
Yingzhen Yang and Jiahui Yu.
UAI 2019 / Code

Foreground-aware Image Inpainting
Wei Xiong, Jiahui Yu, Zhe Lin, Jimei Yang, Xin Lu, Connelly Barnes, Jiebo Luo.
CVPR 2019

Slimmable Neural Networks
Jiahui Yu, Linjie Yang, Ning Xu, Jianchao Yang, Thomas Huang.
ICLR 2019 / Code / OpenReview
Top-10 Rated Papers on OpenReview, ICLR 2019.

Wide Activation for Efficient and Accurate Image Super-Resolution
Jiahui Yu, Yuchen Fan, Jianchao Yang, Ning Xu, Zhaowen Wang, Xinchao Wang, Thomas Huang.
BMVC 2019 (challenge report) / Code
Won 1st place in the NTIRE Challenge on Single Image Super-Resolution, CVPR 2018.

Improving Object Detection from Scratch via Gated Feature Reuse
Zhiqiang Shen, Honghui Shi, Jiahui Yu, Hai Phan, Rogerio Feris, Liangliang Cao, Ding Liu, Xinchao Wang, Thomas Huang, Marios Savvides.
BMVC 2019 / Code

Generative Image Inpainting with Contextual Attention (DeepFill v1)
Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, Thomas Huang.
CVPR 2018 / Code

Neighborhood Regularized l1-Graph
Yingzhen Yang, Jiashi Feng, Jiahui Yu, Jianchao Yang, Pushmeet Kohli, Thomas Huang.
UAI 2017

Support Regularized Sparse Coding and Its Fast Encoder
Yingzhen Yang, Jiahui Yu, Pushmeet Kohli, Jianchao Yang, Thomas Huang.
ICLR 2017

UnitBox: An Advanced Object Detection Network
Jiahui Yu, Yuning Jiang, Zhangyang Wang, Zhimin Cao, Thomas Huang.
ACMMM 2016
Adapted to the official TensorFlow Object Detection API.


Service

  • Journal Reviewer: TPAMI, IJCV, TIP, TMM, TMI, TVCG, TVCJ, TCSVT, TSC, TGRS, IMAGE, JSTSP, JVCI, NEUCOM, PR, SPL, MULT, TNNLS, ENG, IROS, ACCESS, Hindawi.
  • Conference Reviewer: CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, SIGGRAPH, ACM Multimedia, AAAI, IJCAI, ICME, PRCV, PG, ACCV, ICASSP.