
Job Overview
Location
San Francisco, CA (Hybrid) OR Remote (Americas, UTC-3 to UTC-10)
Job Type
Full-time
Category
Data Science
Date Posted
March 21, 2026
Full Job Description
📋 Description
- As a Research Engineer focused on Search/IR at Firecrawl Inc., you will own and advance the core search and information retrieval systems that power the company's ability to extract, index, rank, and serve web content at massive scale, directly enabling developers worldwide to access LLM-ready data through a single API call. This role is critical to Firecrawl’s mission of building essential infrastructure for super-intelligence to gather web data; your work will affect billions of documents and the freshness, relevance, and speed of search results.
- Day to day, you will:
  - Design, build, and operate scalable search indexes handling billions of documents, optimizing for latency and storage efficiency.
  - Own the full search pipeline, from ingestion and processing to indexing, ranking, query understanding, and serving layers, ensuring end-to-end reliability and performance.
  - Develop and iterate on ranking models and relevance scoring systems, using techniques like BM25, learned ranking, and embedding-based retrieval to improve search quality.
  - Implement systems for freshness, deduplication, and incremental indexing that keep the index current without full rebuilds and manage continuous updates at scale.
  - Design and run rigorous experiments to test hypotheses, measure impact, and ship improvements to production quickly based on data.
  - Collaborate closely with the research team to align search/IR advancements with model training and broader product strategy, bridging infrastructure and AI innovation.
- Firecrawl is a small, fast-moving, technical team that has reached eight figures in ARR and over 90k GitHub stars in just one year by building the fastest way for developers to get LLM-ready web data. The company operates with an intense sense of urgency, shipping deep technical improvements rapidly to stay ahead in the AI infrastructure space, and values engineers who can build, improve, and ship on evolving systems without waiting for perfect documentation or abstraction.
- In this role, you will deepen your expertise in large-scale search and information retrieval systems, gain end-to-end ownership of a production-critical infrastructure pipeline, and have the autonomy to experiment, measure, and ship improvements that directly impact product quality and scalability. This positions you as a key contributor to the foundational data layer powering next-generation AI applications.
🎯 Requirements
- 3+ years of experience building search or information retrieval (IR) systems at massive scale, handling billions of documents with real latency and throughput requirements
- Hands-on experience designing and improving ranking, relevance, and query understanding systems, including knowledge of BM25, learned ranking, embedding-based retrieval, and production deployment of ranking models
- Proven ability to own the full search stack, from ingestion pipelines and indexing to serving layers, including expertise in sharding, schema evolution, index compaction, and incremental/freshness-focused updates without full rebuilds
🏖️ Benefits
- Competitive salary range of $180,000–$270,000/year, adjusted fairly for non-U.S.-based employees based on local cost of living
- Equity grant of up to 0.15% in the company, allowing direct ownership in the impact you help create
- Generous time-off policy including 15 days of mandatory PTO, unlimited additional days after 24 (just ask), 12 weeks of fully paid parental leave, and a 3-month paid sabbatical after 4 years of service
About Firecrawl Inc.
Firecrawl Inc. provides an API that converts entire websites into clean markdown or structured data. Designed for AI applications, the service crawls all accessible subpages, renders dynamic content, and returns LLM-ready output without requiring sitemaps. It includes built-in scraping, search, and extraction capabilities for building knowledge bases, fine-tuning datasets, or powering chatbots. The company targets developers and data teams who need reliable web content ingestion at scale, offering cloud-hosted endpoints and self-hosted options under a usage-based pricing model.