Research Engineer (Focused on Search/IR)

Job Overview

Location

San Francisco, CA (Hybrid) OR Remote (Americas, UTC-3 to UTC-10)

Job Type

Full-time

Category

Data Science

Date Posted

March 21, 2026

Full Job Description

📋 Description

  • As a Research Engineer focused on Search/IR at Firecrawl Inc., you will own and advance the core search and information retrieval systems that power the company's ability to extract, index, rank, and serve web content at massive scale, directly enabling developers worldwide to access LLM-ready data through a single API call. This role is critical to Firecrawl's mission of building essential infrastructure for super-intelligence to gather web data, where your work impacts billions of documents and the freshness, relevance, and speed of search results.
  • Day to day, you will:
      • Design, build, and operate scalable search indexes handling billions of documents, optimizing for latency and storage efficiency.
      • Own the full search pipeline, from ingestion and processing to indexing, ranking, query understanding, and serving layers, ensuring end-to-end reliability and performance.
      • Develop and iterate on ranking models and relevance scoring systems, using techniques like BM25, learned ranking, and embedding-based retrieval to improve search quality.
      • Implement systems for freshness, deduplication, and incremental indexing to keep the index current without full rebuilds, managing continuous updates at scale.
      • Design and run rigorous experiments to test hypotheses, measure impact, and ship improvements to production quickly based on data-driven decisions.
      • Collaborate closely with the research team to align search/IR advancements with model training and broader product strategy, bridging infrastructure and AI innovation.
  • Firecrawl is a small, fast-moving, technical team that has achieved 8 figures in ARR and over 90k GitHub stars in just one year by building the fastest way for developers to get LLM-ready web data. The company operates with intense urgency, shipping deep technical improvements rapidly to stay ahead in the AI infrastructure space, and values engineers who can build, improve, and ship on evolving systems without waiting for perfect documentation or abstraction.
  • In this role, you will deepen your expertise in large-scale search and information retrieval systems, gain end-to-end ownership of a production-critical infrastructure pipeline, and have the autonomy to experiment, measure, and ship improvements that directly impact product quality and scalability. This positions you as a key contributor to the foundational data layer powering next-generation AI applications.

🎯 Requirements

  • 3+ years of experience building search or information retrieval (IR) systems at massive scale, handling billions of documents with real latency and throughput requirements
  • Hands-on experience designing and improving ranking, relevance, and query understanding systems, including knowledge of BM25, learned ranking, embedding-based retrieval, and production deployment of ranking models
  • Proven ability to own the full search stack, from ingestion pipelines and indexing to serving layers, including expertise in sharding, schema evolution, index compaction, and incremental/freshness-focused updates without full rebuilds

🏖️ Benefits

  • Competitive salary range of $180,000–$270,000/year, adjusted fairly for non-U.S. based employees based on local cost of living
  • Equity grant of up to 0.15% in the company, allowing direct ownership in the impact you help create
  • Generous time-off policy including 15 days mandatory PTO, unlimited additional days after 24 (just ask), 12 weeks fully paid parental leave, and a 3-month paid sabbatical after 4 years of service

Skills & Technologies

Go
Elasticsearch
GitHub
Remote
$180k-270k

About Firecrawl Inc.

Firecrawl Inc. provides an API that converts entire websites into clean markdown or structured data. Designed for AI applications, the service crawls all accessible subpages, renders dynamic content, and returns LLM-ready output without requiring sitemaps. It includes built-in scraping, search, and extraction capabilities for building knowledge bases, fine-tuning datasets, or powering chatbots. The company targets developers and data teams who need reliable web content ingestion at scale, offering cloud-hosted endpoints and self-hosted options under a usage-based pricing model.
