Junjie Fei’s Homepage

I am currently a PhD student at the King Abdullah University of Science and Technology (KAUST), under the supervision of Prof. Mohamed Elhoseiny. Before that, I obtained my BS and MS degrees from Chongqing University and Xiamen University, respectively. I also gained valuable research experience as a visiting student / research assistant at SUSTech VIP Lab and KAUST Vision CAIR. Please refer to my CV for more details.

My recent research interests focus on vision-language multimodal learning and AI-generated content (AIGC). Feel free to drop me an email at junjiefei@outlook.com / junjie.fei@kaust.edu.sa if you are interested in collaborating.

News

  • [2024/08] Joined KAUST as a PhD student!
  • [2023/07] One paper accepted to ICCV 2023!
  • [2023/04] Our project Caption Anything is publicly released!

Research

(* equal contribution)

Kestrel: Point Grounding Multimodal LLM for Part-Aware 3D Vision-Language Understanding
Junjie Fei*, Mahmoud Ahmed*, Jian Ding, Eslam Mohamed Bakr, Mohamed Elhoseiny
arXiv, 2024
project / paper

Kestrel is a part-aware 3D MLLM with point grounding, capable of comprehending and generating language while locating objects and their materials at the part level.

Transferable Decoding with Visual Entities for Zero-Shot Image Captioning
Junjie Fei*, Teng Wang*, Jinrui Zhang, Zhenyu He, Chengjie Wang, Feng Zheng
ICCV, 2023
code / paper

Improving the transferability of zero-shot captioning for out-of-domain images by addressing the modality bias and object hallucination that arise when adapting pre-trained vision-language models and large language models.

Caption Anything: Interactive Image Description with Diverse Multimodal Controls
Teng Wang*, Jinrui Zhang*, Junjie Fei*, Hao Zheng, Yunlong Tang, Zhe Li, Mingqi Gao, Shanshan Zhao
arXiv, 2023
code / paper / demo

Caption Anything is an interactive image-to-text tool that generates descriptions for any user-specified object within an image, offering a variety of language styles and visual controls to suit different user preferences.

Hybrid Microwave Imaging of 3-D Objects Using LSM and BIM Aided by a CNN U-Net
Feng Han, Miao Zhong, Junjie Fei
IEEE Transactions on Geoscience and Remote Sensing (2-year IF: 8.125, ranking: 42/708)
paper

An efficient and accurate 3-D quantitative hybrid microwave imaging method, which incorporates a 3-D U-Net to further refine the reconstructed objects.

Fast 3-D Electromagnetic Full-Wave Inversion of Dielectric Anisotropic Objects Based on ResU-Net Enhanced by Variational Born Iterative Method
Junjie Fei, Yanjin Chen, Miao Zhong, Feng Han
IEEE Transactions on Antennas and Propagation (2-year IF: 4.824, ranking: 71/708)
paper

A ResU-Net, enhanced by the variational Born iterative method, is proposed to directly reconstruct 3-D anisotropic objects from the received electromagnetic field data.