Compute the Narrative

Earnings Call Transcripts & Presentations

Alpha isn't just in the numbers; it's in the tone, the hesitation, and the Q&A. We convert unstructured earnings calls and investor decks into machine-readable datasets. Access 15+ years of perfectly parsed transcripts, synchronized slide decks, and speaker-level sentiment scoring.

1Mn+ Transcripts
15+ Years Historical Data
Speaker-ID Separation
NLP-Ready JSON Structure

Why Use Nextmark Insider Data?

Ready for Your LLM. Stop spending 80% of your time cleaning PDFs. We provide the clean text, mapped metadata, and slide content you need to feed your RAG pipelines immediately.

Speaker & Role Mapping

Generic transcripts are messy. We explicitly separate "Management Remarks" from the "Q&A Session." We identify every speaker by Role (CEO, CFO) and Name, allowing you to track who said what. Run sentiment analysis specifically on the CFO's answers during the Q&A to spot hesitation.

Presentation Deck Parsing

Don't ignore the slides. We scrape and OCR the accompanying Earnings Presentation (PDF), extracting the text and tables from every slide. We link specific slide content to the timestamp in the transcript where it was discussed, giving you the full multimedia context.

NLP Optimized "Chunks"

Building a RAG bot? We offer a "Chunked" feed. Instead of one massive text blob, retrieve transcripts pre-split into semantic paragraphs with embedded metadata (Ticker, Quarter, Speaker). This drastically improves vector search accuracy for queries like "Show me all guidance updates."
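A minimal sketch of the chunking idea, in Python. The actual layout of the chunked feed is not shown on this page, so the `chunk_transcript` helper and the `metadata` key names below are assumptions for illustration. The input is assumed to follow the shape of the JSON in the Data Output Sample section.

```python
from typing import Dict, List


def chunk_transcript(record: Dict) -> List[Dict]:
    """Split one transcript record into one chunk per segment.

    Each chunk carries the call-level metadata (ticker, quarter) plus the
    speaker, so a vector store can filter on those fields at query time.
    The output layout is hypothetical, not the vendor's documented format.
    """
    chunks = []
    for position, segment in enumerate(record.get("segments", [])):
        chunks.append({
            "text": segment["text"],
            "metadata": {
                "ticker": record["ticker"],
                "quarter": record["quarter"],
                "speaker": segment.get("speaker_name"),
                "position": position,  # order of the chunk within the call
            },
        })
    return chunks


# Illustrative record, trimmed from the sample output format.
record = {
    "ticker": "UBER",
    "quarter": "2024-Q3",
    "segments": [
        {"speaker_name": "Dara Khosrowshahi",
         "text": "We are seeing unprecedented demand in the mobility segment..."},
        {"speaker_name": "Prashanth Mahendra-Rajah",
         "text": "Freight remains a cyclical headwind, but we expect..."},
    ],
}

chunks = chunk_transcript(record)
ceo_chunks = [c for c in chunks if c["metadata"]["speaker"] == "Dara Khosrowshahi"]
```

Keeping the metadata next to each chunk means a query can be narrowed by ticker, quarter, or speaker before any similarity search runs.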

Data Delivery Formats

Built for Your Stack

Text, Audio, or Vector. Choose the format that fits your research workflow.
REST API (JSON)

Example request: "Get me the Q&A session text from Microsoft's Q3 call."

Bulk Feed (CSV/Excel)

Download the entire history of S&P 500 transcripts for training a custom financial BERT model.

Vector DB (RAG-Ready)

Our pre-embedded feed allows you to query concepts ("Supply Chain Headwinds") without managing your own embedding model.

SQL Direct Connect

Plug our database directly into your internal warehouse (Snowflake or BigQuery).

Data Output Sample

Structured for Precise NLP Analysis
JSON
{
  "ticker": "UBER",
  "quarter": "2024-Q3",
  "date": "2024-11-05",
  "presentation_url": "https://nextmark.data/decks/uber_q3_24.pdf",
  "segments": [
    {
      "segment_type": "Management_Remarks",
      "speaker_name": "Dara Khosrowshahi",
      "speaker_role": "CEO",
      "text": "We are seeing unprecedented demand in the mobility segment...",
      "sentiment_score": 0.85,
      "linked_slide": 4
    },
    {
      "segment_type": "Q&A",
      "speaker_name": "Analyst (Goldman Sachs)",
      "text": "Can you elaborate on the margin compression in freight?",
      "sentiment_score": -0.12
    },
    {
      "segment_type": "Q&A_Response",
      "speaker_name": "Prashanth Mahendra-Rajah",
      "speaker_role": "CFO",
      "text": "Freight remains a cyclical headwind, but we expect...",
      "sentiment_score": 0.05
    }
  ]
}
Key Fields
  •  segment_type : Crucial for filtering. Many algo-traders ignore the scripted Management_Remarks segments (which are PR-polished) and focus entirely on the Q&A_Response segments, where management is more likely to slip up or reveal true sentiment.
  •  sentiment_score : A pre-calculated NLP score (-1.0 to +1.0) for that specific paragraph. This allows you to plot the "Emotional Arc" of the call—did the CFO sound confident at the start but defensive during the Q&A?
  •  linked_slide : Direct context. We map the spoken text to the specific slide number being presented, allowing your analysts to view the chart the CEO is describing in real-time.
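The filtering and "Emotional Arc" ideas in the key fields above can be sketched in a few lines of Python. This assumes a record shaped like the sample JSON. The helper names (`segments_of_type`, `sentiment_arc`, `mean_sentiment_by_type`) are hypothetical and not part of any Nextmark client library.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Record trimmed from the sample output above.
record = {
    "ticker": "UBER",
    "quarter": "2024-Q3",
    "segments": [
        {"segment_type": "Management_Remarks", "speaker_role": "CEO",
         "sentiment_score": 0.85, "linked_slide": 4},
        {"segment_type": "Q&A", "sentiment_score": -0.12},
        {"segment_type": "Q&A_Response", "speaker_role": "CFO",
         "sentiment_score": 0.05},
    ],
}


def segments_of_type(rec: Dict, segment_type: str) -> List[Dict]:
    """Return only the segments whose segment_type matches exactly."""
    return [s for s in rec["segments"] if s["segment_type"] == segment_type]


def sentiment_arc(rec: Dict) -> List[Tuple[int, str, float]]:
    """List (position, segment_type, score) in call order, ready for plotting."""
    return [(i, s["segment_type"], s["sentiment_score"])
            for i, s in enumerate(rec["segments"])]


def mean_sentiment_by_type(rec: Dict) -> Dict[str, float]:
    """Average the sentiment scores within each segment type."""
    scores = defaultdict(list)
    for s in rec["segments"]:
        scores[s["segment_type"]].append(s["sentiment_score"])
    return {segment_type: mean(vals) for segment_type, vals in scores.items()}


responses = segments_of_type(record, "Q&A_Response")
arc = sentiment_arc(record)
averages = mean_sentiment_by_type(record)
```

Comparing `averages["Management_Remarks"]` with `averages["Q&A_Response"]` is one simple way to quantify a shift between the scripted and unscripted parts of a call.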

Who Is This For?

NLP Quants

Train models to predict stock moves based on "CFO Confidence" or "Analyst Tone."

Fundamental Analysts

Search across 10 years of transcripts instantly to find every time "Competition" was mentioned.

RAG Developers

Build internal chatbots that answer questions like "What did the CEO say about AI capex?" with perfect citation.

Read Between the Lines.

Turn unstructured voice into structured alpha.

FAQs

Our team of experienced financial advisors is here to provide personalized guidance and support.

How far back do transcripts go?

We have full coverage of US Equities (Russell 3000) going back to 2008. Global coverage (Europe/APAC) typically starts around 2014.

How accurate is the text?

We use a hybrid "AI + Human-in-the-Loop" process. A specialized financial speech-to-text model generates the first draft, then human editors verify proper nouns, specialized financial jargon (e.g., "EBITDA"), and speaker attribution, for accuracy above 99%.

Do you include the "Safe Harbor" statement?

By default, we flag and separate the standard "Safe Harbor" and "Forward-Looking Statements" legal disclaimer at the start of the call, so your NLP models don't waste tokens processing boilerplate legalese.

Can I get the slide deck images?

Yes. The API provides a direct link to the parsed PDF of the presentation deck. We also provide an OCR endpoint that returns the raw text content of each slide as a JSON object.

How do you handle multi-lingual calls?

For global companies (e.g., Toyota, Samsung) that hold earnings calls in their native language, we provide Dual-Channel Transcripts. You get the original native text and an English translation side-by-side. Our metadata flags these as translated: true, allowing you to decide whether to process the raw source or the translated version.
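A hedged Python sketch of how a pipeline might branch on the `translated` flag. Only the flag itself is named on this page. The `text` and `text_native` keys and the `select_text` helper are assumptions for illustration, and the real Dual-Channel payload may be laid out differently.

```python
from typing import Dict


def select_text(segment: Dict, prefer_native: bool = False) -> str:
    """Pick which channel of a segment to process.

    Assumes (hypothetically) that "text" holds the English version and
    "text_native" holds the original-language version when translated is true.
    """
    if segment.get("translated") and prefer_native:
        return segment["text_native"]
    return segment["text"]


# Illustrative segments; the native text is a placeholder, not real data.
translated_segment = {
    "translated": True,
    "text": "Demand remains solid.",
    "text_native": "<original-language text>",
}
english_only_segment = {"translated": False, "text": "Margins improved."}

english = select_text(translated_segment)
native = select_text(translated_segment, prefer_native=True)
unchanged = select_text(english_only_segment, prefer_native=True)
```

Calls held in English fall through to `text` regardless of the preference, so the same function can be applied across a mixed-language corpus.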

Can I search for specific financial concepts across thousands of transcripts?

Yes. Our "Concept Tagging" engine automatically tags transcripts with themes like "Guidance Raise," "Supply Chain Disruption," or "Share Buyback Announcement." Instead of writing complex keyword regex (e.g., "buyback" OR "repurchase"), you can simply query concept:share_buyback to instantly retrieve every relevant management discussion from the S&P 500 history.
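To illustrate the difference between keyword matching and concept tags, here is a small local comparison in Python. It does not call the Nextmark API. The `concepts` list on each segment is a hypothetical stand-in for however the tags are actually delivered, and the sample sentences are invented for the example.

```python
import re
from typing import Dict, List

# Keyword approach: a regex that tries to enumerate the surface forms.
BUYBACK_PATTERN = re.compile(r"\b(?:buy-?backs?|repurchas\w*)\b", re.IGNORECASE)

# Invented segments; the "concepts" key is an assumed field name.
segments = [
    {"text": "The board authorized a $2 billion share repurchase.",
     "concepts": ["share_buyback"]},
    {"text": "We returned capital to shareholders through our ongoing program.",
     "concepts": ["share_buyback"]},
    {"text": "Freight remains a cyclical headwind.",
     "concepts": []},
]


def keyword_hits(segs: List[Dict]) -> List[Dict]:
    """Segments whose text literally contains a buyback-style keyword."""
    return [s for s in segs if BUYBACK_PATTERN.search(s["text"])]


def concept_hits(segs: List[Dict], concept: str) -> List[Dict]:
    """Segments carrying the given concept tag, whatever wording they use."""
    return [s for s in segs if concept in s.get("concepts", [])]


by_keyword = keyword_hits(segments)
by_concept = concept_hits(segments, "share_buyback")
```

The second segment discusses the buyback program without using either keyword, so the regex misses it while the tag lookup does not.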