Founding AI Engineer

Rad Hires
Remote

Job details

Full-time

Full job description

Founding AI Engineer

Location: United States, Canada, Romania, Ukraine, Pakistan, Brazil, Argentina, Colombia

Type: Full-time | Remote

About the Company

Our client is an early-stage startup building an AI intelligence layer for the commercial real estate industry, embedded directly inside Excel. Think of it as an AI analyst that lives inside the analyst's existing workbook — purpose-built for institutional CRE acquisitions. It understands domain-specific financial models and can handle everything from parsing offering documents to building financial models and running market research, all with full provenance and human-in-the-loop control.

The Role

We're looking for a Founding AI/ML Engineer to own the intelligence core of the platform. This is not about fine-tuning generic models — it's about building the reasoning, extraction, orchestration, and evaluation systems that make an AI analyst trustworthy enough to use in a $100M+ transaction.

You'll be the first dedicated AI/ML hire, working directly with the founders, and owning the full AI layer from day one.

What You'll Own

Multi-agent orchestration: coordinator + sub-agent architecture with a planning loop, routing tasks to the right model (OpenAI, Anthropic, Gemini, Perplexity, Mistral) based on cost, quality, and latency.
Document intelligence pipelines: stateless extraction for financial documents such as offering memoranda (OMs), P&Ls, and rent rolls, with per-field confidence scoring and bounding-box provenance.
RAG and retrieval infrastructure: vector-backed retrieval with hybrid search, embedding pipeline management, and context assembly for grounded model responses.
Evaluation and quality infrastructure: parser quality harnesses, extraction accuracy benchmarks, LLM output scoring, and feedback loops from analyst corrections.
Prompt architecture and context management: system prompt design, tool schema engineering, context window optimization, and few-shot construction from live deal data.
Provenance and hallucination controls: every output traces to a source document, page, and bounding box. If it can't be cited, it's flagged as an assumption.
Model strategy: track frontier model releases and make build-vs-buy calls on fine-tuning, custom classifiers, and retrieval augmentation.

The Stack

Orchestrator: FastAPI (async Python), SSE streaming, multi-agent architecture
AI / LLM: GPT-4.1, Claude Sonnet / Opus, Gemini 2.0 Flash, Perplexity Sonar Pro, Mistral
Retrieval: pgvector, PostgreSQL, hybrid RAG, embedding pipelines
Parsers: Azure Doc Intelligence, Mistral OCR, stateless extraction
Evals: Custom harnesses, labeling pipelines, correction feedback loops
Infra: Azure Container Apps, Service Bus, Blob Storage, Docker Compose
Frontend: Next.js, React 19 + Office.js (you interface with it, but do not own it)

Must-Have

5+ years building AI/ML systems in production — real systems, real users, real failure modes.
LLM orchestration experience: tool-calling, multi-step reasoning chains, agent architectures, streaming.
Expert-level Python: async FastAPI, type-annotated, well-structured.
RAG systems: embedding pipeline design, hybrid retrieval, context assembly, chunk strategy.
Evals mindset: you measure model quality systematically through benchmarks, harnesses, and accuracy scoring.
Startup operating mode: you scope your own work, make judgment calls, and ship without waiting for consensus.

Strong-to-Have

Document extraction / OCR (Azure Doc Intelligence, Textract, or equivalent)
Fine-tuning experience (LoRA, RLHF, DPO, or classifier fine-tuning)
Vector database depth (pgvector, Pinecone, Weaviate)
Financial document literacy (P&Ls, rent rolls, structured financial data)
Multi-modal models: document layout understanding, table extraction, bounding-box grounding
Prompt security: adversarial inputs, injection hardening, output validation
Azure AI Services: OpenAI on Azure, Doc Intelligence, Blob-backed pipelines

Your First 30 Days

Week 1: Stand up the full stack locally. Run parser pipelines end-to-end on real documents. Understand the overall architecture.
Week 2: Deep-dive the extraction pipelines. Trace a document through parse, extract, normalize, map, and write. Identify the weakest quality link.
Week 3: Ship a measurable eval — a harness that scores extraction accuracy on a labeled document set and establishes a baseline.
Week 4: Own an improvement — better field normalization, improved context assembly, or a new document type. Ship it with a passing eval.

Why This Role

Full ownership of the AI layer of a real institutional product.
Hard, unsolved problems in document extraction, multi-agent reliability, and provenance in an agentic write path.
High-stakes domain: a $20T+ market where same-day turnaround wins deals. Your work directly affects whether a deal closes.
Frontier model access: working with the latest models from OpenAI, Anthropic, Google, and Mistral in production.
Operator founder: deep domain expertise from someone who has sat in the analyst, investment committee (IC), and deal lead seats.
