Machine Learning Researcher

OpenReq
Full-time
On-site
Cupertino, California, United States
Overview
With Etched ASICs, we have fundamentally different constraints than existing AI chips. We don’t
have the same batch-size-latency tradeoff as GPUs. We can parallelize workloads and digest
large contexts much more efficiently than GPUs.

Sohu enables entirely new research directions and products. When our chips come out, these
use cases need to already be mature and visceral. Whether real-time video, agents, speculative
decoding, or new tree search algorithms, we must create the market for our hardware.
In this role, you will lead efforts on research directions that trade compute for improved
understanding and speed, particularly with agents and new search techniques.

Representative projects
● Combine multi-agent systems with retrieval-augmented generation (RAG) to improve question-answering quality
● Evaluate new systems on standard benchmarks
● Design new, verifiable benchmarks for agent reasoning
● Design recommendation systems based on LLM content understanding

You may be a good fit if you have:
● Past projects or publications with substantial impact in ML and/or CV (quality > quantity)
● Strong hands-on engineering skills, particularly with Python, PyTorch, CUDA, DDP/FSDP
● Deep understanding of open- and closed-source model architectures and open-source
libraries for transformer training and inference
● Familiarity with LlamaIndex, LangChain, MetaGPT, CoT, ToT, and beam search
● Ability to think outside the box and make tradeoffs considering feasibility, quality, and
time-to-ship of a project

Strong candidates may also have experience with:
● Knowledge graphs
● Named entity recognition
● Tool calling and coder LLMs
● SWE-bench and SWE-agent
 
We encourage you to apply even if you do not believe you meet every single qualification.