Artificial Intelligence Research Scientist (Gen AI - Multimodal Learning)

Eluvio
Full-time
On-site
Berkeley, California, United States
Description

Eluvio is a highly focused and ambitious team of systems, networking, application, and video software engineers, AI scientists, ML engineers, and security experts working together to implement the vision of the Content Fabric - a decentralized platform for video and commerce with the ambition of serving the world's Internet video. The Eluvio Content Fabric provides an innovative distributed and decentralized video processing framework with just-in-time and personalized experiences, made possible through our state-of-the-art real-time content routing and just-in-time code execution. We are headquartered in Berkeley, CA. 

We are currently looking for a full-time research scientist for the AI Team. As an AI Research Scientist, you will have the opportunity to work with the AI Team on cutting-edge generative AI technologies for improving the content/video descriptors as well as our software offerings for generative AI-based content. 

The scope of this work includes (but is not limited to):

  • Survey: Exploring generative AI technologies targeting multimedia content improvement and/or generation.    
  • Research (Proof of Concept):  
    • Design and development of advanced generative techniques for content improvement (e.g., A/V quality, resolution), as well as video description, understanding, and summarization.
    • Design and development of advanced generative techniques for user-guided content generation, e.g., text-to-image, text-to-video, etc.
  • Development (Engineering)
    • Integrating the developed model into the Eluvio ML stack.

You will also be able to work on unique multimodal data sets, including video, audio, speech, text, image, and 3D, and on a very specific product line unique to Eluvio’s AI stack offering. This work will directly impact the product. We highly encourage publishing our research in top conferences.



Requirements

  • Has a Master's degree in a relevant field; Ph.D. preferred.
  • Has research experience with modern generative technologies such as diffusion models, GANs, etc. for video restoration and/or text-to-image (text-to-video) generation.
  • Has working knowledge of multimodal learning (CLIP, BLIP, etc.) and a good understanding of advanced learning settings such as self-supervised, semi-supervised, and transductive learning.
  • Has prior publications in top-tier conferences (e.g., NeurIPS, ICML, ICLR, CVPR, KDD, RecSys) and/or journals.
  • Has strong experience with Python-based ecosystems (PyTorch, TensorFlow, JAX, etc.).

Nice to Have

  • Working knowledge of JavaScript, Rust, etc.
  • Full-stack application experience, including operating and using video tools such as ffmpeg.


Benefits

Medical, dental, 401(k)