Compute the Narrative

Earnings Call Transcripts & Presentations

Alpha isn't just in the numbers; it's in the tone, the hesitation, and the Q&A. We convert unstructured earnings calls and investor decks into machine-readable datasets. Access 15+ years of perfectly parsed transcripts, synchronized slide decks, and speaker-level sentiment scoring.

Get API Key

Read Documentation


1Mn+

Transcripts

15+ Years

Historical Data

Speaker-ID

Separation

NLP-Ready

JSON Structure

Why Use Nextmark Insider Data?

Ready for Your LLM. Stop spending 80% of your time cleaning PDFs. We provide the clean text, mapped metadata, and slide content you need to feed your RAG pipelines immediately.

Get the Dataset

Get Started

Speaker & Role Mapping

Generic transcripts are messy. We explicitly separate "Management Remarks" from the "Q&A Session." We identify every speaker by Role (CEO, CFO) and Name, allowing you to track who said what. Run sentiment analysis specifically on the CFO's answers during the Q&A to spot hesitation.

Presentation Deck Parsing

Don't ignore the slides. We scrape and OCR the accompanying Earnings Presentation (PDF), extracting the text and tables from every slide. We link specific slide content to the timestamp in the transcript where it was discussed, giving you the full multimedia context.

NLP Optimized "Chunks"

Building a RAG bot? We offer a "Chunked" feed. Instead of one massive text blob, retrieve transcripts pre-split into semantic paragraphs with embedded metadata (Ticker, Quarter, Speaker). This drastically improves vector search accuracy for queries like "Show me all guidance updates."
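The idea of a chunk with embedded metadata can be sketched as follows. The schema of the chunked feed is not shown on this page, so the record shape here (`id`, `text`, `metadata`) and the helpers `to_chunks` and `filter_chunks` are hypothetical, built only from the Ticker, Quarter, and Speaker fields named above.

```python
# Hypothetical sketch: flatten one transcript into chunk records that carry
# their own metadata. Field names are assumptions, not the feed's real schema.

def to_chunks(call):
    """Return one chunk per segment, each tagged with ticker, quarter, speaker."""
    chunks = []
    for i, seg in enumerate(call["segments"]):
        chunks.append({
            "id": f'{call["ticker"]}-{call["quarter"]}-{i}',
            "text": seg["text"],
            "metadata": {
                "ticker": call["ticker"],
                "quarter": call["quarter"],
                "speaker": seg["speaker_name"],
            },
        })
    return chunks


def filter_chunks(chunks, **wanted):
    """Keep only chunks whose metadata matches every key/value in `wanted`."""
    return [
        c for c in chunks
        if all(c["metadata"].get(k) == v for k, v in wanted.items())
    ]


call = {
    "ticker": "UBER",
    "quarter": "2024-Q3",
    "segments": [
        {"speaker_name": "Dara Khosrowshahi",
         "text": "We are seeing unprecedented demand in the mobility segment..."},
        {"speaker_name": "Prashanth Mahendra-Rajah",
         "text": "Freight remains a cyclical headwind, but we expect..."},
    ],
}

chunks = to_chunks(call)
ceo_chunks = filter_chunks(chunks, speaker="Dara Khosrowshahi")
print(len(chunks), len(ceo_chunks))
```

Filtering on metadata before running a vector search narrows the candidate set, which is one plausible reason per-chunk metadata helps retrieval quality.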

Data Delivery Formats

Built for Your Stack

Text, Audio, or Vector. Choose the format that fits your research workflow.

Get Started

REST API (JSON)

Get me the Q&A session text from Microsoft's Q3 call.

Bulk Feed (CSV/Excel)

Download the entire history of S&P 500 transcripts for training a custom financial BERT model.

Vector DB (RAG-Ready)

Our pre-embedded feed allows you to query concepts ("Supply Chain Headwinds") without managing your own embedding model.

SQL Direct Connect

Plug our database directly into your internal warehouse, such as Snowflake or BigQuery.

Data Output Sample

Structured for Precise NLP Analysis

Get Started

JSON

{
  "ticker": "UBER",
  "quarter": "2024-Q3",
  "date": "2024-11-05",
  "presentation_url": "https://nextmark.data/decks/uber_q3_24.pdf",
  "segments": [
    {
      "segment_type": "Management_Remarks",
      "speaker_name": "Dara Khosrowshahi",
      "speaker_role": "CEO",
      "text": "We are seeing unprecedented demand in the mobility segment...",
      "sentiment_score": 0.85,
      "linked_slide": 4
    },
    {
      "segment_type": "Q&A",
      "speaker_name": "Analyst (Goldman Sachs)",
      "text": "Can you elaborate on the margin compression in freight?",
      "sentiment_score": -0.12
    },
    {
      "segment_type": "Q&A_Response",
      "speaker_name": "Prashanth Mahendra-Rajah",
      "speaker_role": "CFO",
      "text": "Freight remains a cyclical headwind, but we expect...",
      "sentiment_score": 0.05
    }
  ]
}

Key Fields

segment_type : Crucial for filtering. Many algo-traders ignore the Management_Remarks segments (scripted and PR-polished) and focus entirely on the Q&A_Response segments, where management is more likely to slip up or reveal true sentiment.
sentiment_score : A pre-calculated NLP score (-1.0 to +1.0) for that specific paragraph. This allows you to plot the "Emotional Arc" of the call: did the CFO sound confident at the start but defensive during the Q&A?
linked_slide : Direct context. We map the spoken text to the specific slide number being presented, allowing your analysts to view the chart the CEO is describing in real time.
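A minimal sketch of how these fields might be used, assuming a payload shaped like the sample above (trimmed here to the fields that matter). The helper names `segments_of_type` and `mean_sentiment_by_type` are illustrative and not part of any published client library.

```python
import json
from collections import defaultdict

# Trimmed copy of the sample payload, keeping only the fields used below.
SAMPLE = """
{
  "ticker": "UBER",
  "quarter": "2024-Q3",
  "segments": [
    {"segment_type": "Management_Remarks", "speaker_role": "CEO",
     "sentiment_score": 0.85, "linked_slide": 4},
    {"segment_type": "Q&A", "sentiment_score": -0.12},
    {"segment_type": "Q&A_Response", "speaker_role": "CFO",
     "sentiment_score": 0.05}
  ]
}
"""


def segments_of_type(call, segment_type):
    """Return the segments whose segment_type matches exactly."""
    return [s for s in call["segments"] if s["segment_type"] == segment_type]


def mean_sentiment_by_type(call):
    """Average sentiment_score per segment_type, a coarse view of the call's arc."""
    buckets = defaultdict(list)
    for seg in call["segments"]:
        buckets[seg["segment_type"]].append(seg["sentiment_score"])
    return {kind: sum(scores) / len(scores) for kind, scores in buckets.items()}


call = json.loads(SAMPLE)
responses = segments_of_type(call, "Q&A_Response")
arc = mean_sentiment_by_type(call)

for kind, score in arc.items():
    print(f"{kind}: {score:+.2f}")
```

Note that some segments (the analyst question in the sample) carry no speaker_role or linked_slide, so code that reads those keys should use `.get()` rather than direct indexing.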

Who Is This For?

NLP Quants

Train models to predict stock moves based on "CFO Confidence" or "Analyst Tone."

Learn More

Fundamental Analysts

Search across 10 years of transcripts instantly to find every time "Competition" was mentioned.

Learn More

RAG Developers

Build internal chatbots that answer questions like "What did the CEO say about AI capex?" with perfect citation.

Learn More

Read Between the Lines.

Turn unstructured voice into structured alpha.

Get Free API Key

Get Started

Download Sample JSON

Get Started

FAQs

Our team of experienced financial advisors is here to provide personalized guidance and support.

Contact us

How far back do transcripts go?

We have full coverage of US Equities (Russell 3000) going back to 2008. Global coverage (Europe/APAC) typically starts around 2014.

How accurate is the text?

We use a hybrid "AI + Human-in-the-Loop" process. A specialized financial speech-to-text model generates the first draft, and human editors then verify proper nouns, specialized financial jargon (e.g., "EBITDA"), and speaker attribution, bringing accuracy above 99%.

Do you include the "Safe Harbor" statement?

By default, we flag and separate the standard "Safe Harbor" and "Forward-Looking Statements" legal disclaimer at the start of the call, so your NLP models don't waste tokens processing boilerplate legalese.

Can I get the slide deck images?

Yes. The API provides a direct link to the parsed PDF of the presentation deck. We also provide an OCR endpoint that returns the raw text content of each slide as a JSON object.

How do you handle multi-lingual calls?

For global companies (e.g., Toyota, Samsung) that hold earnings calls in their native language, we provide Dual-Channel Transcripts. You get the original native text and an English translation side-by-side. Our metadata flags these as translated: true, allowing you to decide whether to process the raw source or the translated version.

Can I search for specific financial concepts across thousands of transcripts?

Yes. Our "Concept Tagging" engine automatically tags transcripts with themes like "Guidance Raise," "Supply Chain Disruption," or "Share Buyback Announcement." Instead of writing complex keyword regex (e.g., "buyback" OR "repurchase"), you can simply query concept:share_buyback to instantly retrieve every relevant management discussion from the S&P 500 history.
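For comparison, the keyword approach that concept tagging is meant to replace might look like the sketch below. The pattern and the sample sentences are illustrative only. They are not drawn from real transcripts, and the pattern covers just a few word forms.

```python
import re

# Hand-written keyword pattern for buyback language. It has to enumerate
# each word form explicitly, which is the maintenance burden described above.
BUYBACK = re.compile(
    r"\b(?:buybacks?|repurchas(?:e|es|ed|ing))\b",
    re.IGNORECASE,
)


def mentions_buyback(text):
    """Return True if the text contains any of the enumerated keywords."""
    return BUYBACK.search(text) is not None


# Made-up segment texts for illustration.
segments = [
    "We repurchased $1.5 billion of shares this quarter.",
    "Freight remains a cyclical headwind.",
    "The board authorized a new buyback program.",
]

hits = [s for s in segments if mentions_buyback(s)]
print(len(hits))
```

A pattern like this misses paraphrases such as "returning capital to shareholders", which is the gap a concept-level tag is intended to close.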