Turn the global news cycle into a tradable time-series. Access 15+ years of institutional-grade news, mapped to 50,000+ tickers, and scored for sentiment with context-aware NLP. Don't just read the news—measure the market's reaction before it happens.
Ready for Your LLM. Stop spending 80% of your time cleaning PDFs. We provide the clean text, mapped metadata, and slide content you need to feed your RAG pipelines immediately.
Most news feeds are a firehose of unstructured noise. Nextmark converts millions of daily articles, press releases, and filings into structured, quantitative data. We use proprietary entity resolution to ensure that a story about "Apple" is mapped to $AAPL, not the fruit, and a story about "Paris" is mapped to the market, not the city.
Legacy sentiment tools use "bag-of-words" counting. Nextmark uses Transformer-based LLMs to understand context. We differentiate between a "Revenue Miss" (Negative) and a "Strategic Divestiture" (Neutral/Positive), giving you a sentiment score that correlates with price action, not just keyword density.
Built for backtesting. Our historical archives are stamped with the exact millisecond the news hit the wire, preventing look-ahead bias in your models. Train your algos on the reality of the past, not the revised history.
Real-time, low-latency lookups for live trading.
Push-based feed for immediate event processing.
Vector-ready JSONL feed for direct input into your models.
Historical dump for backtesting and heavy quantitative research.
{
  "timestamp_utc": "2023-10-12T14:30:00.000Z",
  "article_id": "nm_8829102",
  "source": "Global_Wire_Service",
  "headline": "TechCorp announces strategic delay in Q4 chipset rollout due to supply constraints.",
  "entities": [
    {
      "ticker": "TCP",
      "figi": "BBG000BLNNH6",
      "name": "TechCorp Inc.",
      "sentiment_score": -0.78,
      "sentiment_label": "Negative",
      "relevance_score": 100,
      "event_type": "Supply_Chain_Disruption"
    },
    {
      "ticker": "NVDA",
      "figi": "BBG000BBJQV0",
      "name": "NVIDIA Corp",
      "sentiment_score": -0.15,
      "sentiment_label": "Neutral_Negative",
      "relevance_score": 30,
      "event_type": "Sector_Read_Across"
    }
  ]
}

sentiment_score: A continuous float from -1.0 to +1.0 quantifying the polarity of the news event, suitable for regression modeling.
relevance_score: A 0-100 confidence metric indicating whether the ticker is the story's subject or just a footnote mention.
event_type: Auto-classification of the article into 50+ financial categories such as "M&A," "Guidance Update," or "Executive Change."
timestamp_utc: Millisecond-precision timestamp for when the news hit the wire, to prevent look-ahead bias in backtests.
ticker / figi: Robust entity mapping that handles share class distinctions and historical ticker changes automatically.
source_rank: A 1-5 credibility rating that lets you weight major wires over unverified blogs or social noise.
novelty_score: A uniqueness metric that filters out reposts and echoes so you trade only on breaking information.
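As a rough illustration of how a record like the one above might be consumed, the sketch below flattens one article into one row per entity, which is usually the shape a time-series model wants. The record is a trimmed-down, illustrative copy of the sample payload, and the helper name `flatten` is hypothetical, not part of any Nextmark client library.

```python
import json

# Trimmed-down record shaped like the sample payload (values are illustrative).
RAW = """
{
  "timestamp_utc": "2023-10-12T14:30:00.000Z",
  "article_id": "nm_8829102",
  "entities": [
    {"ticker": "TCP", "figi": "BBG000BLNNH6", "sentiment_score": -0.78,
     "event_type": "Supply_Chain_Disruption"},
    {"ticker": "NVDA", "figi": "BBG000BBJQV0", "sentiment_score": -0.15,
     "event_type": "Sector_Read_Across"}
  ]
}
"""


def flatten(record: dict) -> list:
    """Return one row per entity, copying article-level fields onto each row."""
    shared = {key: value for key, value in record.items() if key != "entities"}
    return [{**shared, **entity} for entity in record.get("entities", [])]


rows = flatten(json.loads(RAW))
for row in rows:
    print(row["timestamp_utc"], row["ticker"], row["sentiment_score"])
```

Each row carries both the article-level fields and the per-entity scores, so it can be appended directly to a per-ticker series.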
We do not use legacy "positive/negative" word counting (Bag-of-Words), which fails to capture financial nuance. Our engine utilizes domain-specific Transformer models (LLMs) trained on 15 years of financial text. This allows the model to understand that a phrase like "narrowing losses" is Positive, whereas a keyword search for "loss" would flag it as Negative. We score strictly on financial implication, not just linguistic polarity.
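To make the failure mode concrete, here is a minimal sketch of a naive word-counting scorer. The word lists and the function name `bag_of_words_score` are invented for illustration; the sketch shows only the legacy approach being criticized, not Nextmark's model.

```python
# Hypothetical keyword lists for a naive bag-of-words scorer.
NEGATIVE_WORDS = {"loss", "losses", "miss", "decline"}
POSITIVE_WORDS = {"gain", "gains", "beat", "growth"}


def bag_of_words_score(text: str) -> int:
    """Count positive minus negative keywords, ignoring all context."""
    tokens = [token.strip(".,;:!?\"'").lower() for token in text.split()]
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    return positives - negatives


headline = "TechCorp reports narrowing losses for the third straight quarter"
score = bag_of_words_score(headline)
print(score)  # negative, although the financial implication is positive
```

The counter sees "losses" and scores the headline negatively because it has no way to account for the modifier "narrowing".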
All historical data is stamped with the exact millisecond the article was ingested by our system (ingest_timestamp_utc), not the time the event occurred. Furthermore, our entity mapping respects the historical symbology of that specific date. If you query data from 2014 for Facebook, the API returns data mapped to $FB, ensuring your backtest reflects the exact reality of the market at that moment, with zero forward-looking leakage.
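One way to use an ingestion timestamp in a backtest is to allow a signal to trade only on the first bar that opens strictly after the article arrived. The sketch below assumes one-minute bars and ISO-8601 strings with millisecond precision; `parse_utc` and `first_tradable_bar` are hypothetical helper names.

```python
from bisect import bisect_right
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse an ISO-8601 UTC string; 'Z' is normalised for older Python versions."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def first_tradable_bar(ingest_ts, bar_opens):
    """Return the first bar opening strictly after ingestion, or None if there is none.

    bar_opens must be sorted in ascending order.
    """
    index = bisect_right(bar_opens, ingest_ts)
    return bar_opens[index] if index < len(bar_opens) else None


# Illustrative one-minute bar open times.
BAR_OPENS = [parse_utc(t) for t in (
    "2023-10-12T14:30:00.000Z",
    "2023-10-12T14:31:00.000Z",
    "2023-10-12T14:32:00.000Z",
)]

mid_bar = first_tradable_bar(parse_utc("2023-10-12T14:30:00.250Z"), BAR_OPENS)
on_open = first_tradable_bar(parse_utc("2023-10-12T14:31:00.000Z"), BAR_OPENS)
after_last = first_tradable_bar(parse_utc("2023-10-12T14:32:00.500Z"), BAR_OPENS)
print(mid_bar, on_open, after_last)
```

Using `bisect_right` is a deliberately conservative choice: an article ingested exactly on a bar's open is pushed to the following bar, so the model never acts on information it could not have had in time.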
We calculate a novelty_score for every incoming article by vectorizing the text and comparing it against a rolling 24-hour window of news for that specific ticker.
High Novelty (>0.8): Breaking news or a materially new update.
Low Novelty (<0.2): Syndicated re-prints or minor updates to an existing story.
This allows you to filter out the "echo chamber" and trade only on the initial signal.
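The exact vectorization and scoring formula are not published, so the following is only a toy sketch of the general idea: represent each article as a word-count vector, compare it with same-ticker articles from the previous 24 hours using cosine similarity, and report one minus the highest similarity. Every function name and the formula itself are assumptions made for illustration.

```python
import math
from collections import Counter
from datetime import datetime, timedelta


def vectorize(text: str) -> Counter:
    """Toy vectorizer: lowercase word counts (a stand-in for a real embedding)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def novelty_score(text, published, history, window=timedelta(hours=24)):
    """Assumed formula: 1 - max similarity to same-ticker articles inside the window.

    history is a list of (timestamp, text) pairs for one ticker.
    """
    vector = vectorize(text)
    recent = [vectorize(old) for ts, old in history
              if timedelta(0) <= published - ts <= window]
    if not recent:
        return 1.0
    return 1.0 - max(cosine(vector, other) for other in recent)


now = datetime(2023, 10, 12, 14, 30)
history = [
    (now - timedelta(hours=2), "TechCorp delays Q4 chipset rollout"),
    (now - timedelta(hours=30), "TechCorp names new chief financial officer"),
]
repost = novelty_score("TechCorp delays Q4 chipset rollout", now, history)
fresh = novelty_score("Regulator opens antitrust probe", now, history)
print(round(repost, 3), round(fresh, 3))
```

A verbatim repost scores close to zero and an unrelated story scores 1.0, which matches the low-novelty and high-novelty bands described above.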
We provide a relevance_score (0-100) for every entity detected. If a company is in the headline or the lead paragraph, it receives a score of 100. If a company is merely listed as a constituent in an ETF wrapper or mentioned in a peer comparison, it receives a score below 20. We recommend filtering your feed for relevance_score > 70 to remove noise from index rebalancing or sector roundups.
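The recommended filter can be expressed in a few lines. This sketch assumes the 0-100 scale described above and uses made-up entity rows; `filter_relevant` is a hypothetical helper name.

```python
# Illustrative entity rows on the 0-100 relevance scale.
entities = [
    {"ticker": "TCP", "relevance_score": 100},   # named in the headline
    {"ticker": "NVDA", "relevance_score": 30},   # sector read-across
    {"ticker": "XYZ", "relevance_score": 12},    # passing mention (made-up ticker)
]


def filter_relevant(rows, threshold=70):
    """Keep entities whose relevance_score is strictly above the threshold."""
    return [row for row in rows if row["relevance_score"] > threshold]


kept = filter_relevant(entities)
print([row["ticker"] for row in kept])
```

The threshold is a parameter so it can be tuned per strategy; a sector-level model may reasonably want to keep lower-relevance mentions.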
We map all data to the Bloomberg FIGI (Financial Instrument Global Identifier), which remains persistent regardless of ticker changes. Our API allows you to query by the current ticker (e.g., META) and automatically retrieve the full history of the previous ticker (FB), seamlessly stitched together. You do not need to maintain your own mapping table.
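For readers curious what such stitching involves, here is a minimal sketch of a date-ranged symbology table keyed by a persistent identifier. The table layout, the placeholder identifier, the date ranges, and the function names are all assumptions for illustration; they do not describe the actual API.

```python
from datetime import date

# Hypothetical symbology rows: (figi, ticker, first_day, last_day or None if current).
# The identifier is a placeholder, not a real FIGI, and the dates are illustrative.
SYMBOLOGY = [
    ("FIGI_PLACEHOLDER_1", "FB", date(2012, 5, 18), date(2022, 6, 8)),
    ("FIGI_PLACEHOLDER_1", "META", date(2022, 6, 9), None),
]


def figi_for(ticker):
    """Resolve a ticker to its persistent identifier (ignores ticker reuse)."""
    for figi, symbol, _start, _end in SYMBOLOGY:
        if symbol == ticker:
            return figi
    return None


def ticker_on(figi, day):
    """Return the ticker that the identifier traded under on a given day."""
    for row_figi, symbol, start, end in SYMBOLOGY:
        if row_figi == figi and start <= day and (end is None or day <= end):
            return symbol
    return None


figi = figi_for("META")
print(ticker_on(figi, date(2014, 3, 3)), ticker_on(figi, date(2023, 1, 3)))
```

Keying history on the identifier rather than the ticker is what lets a single query span a rename while still reporting the symbol that was valid on each date.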
Yes. In addition to the raw sentiment_score (-1 to +1), we provide a confidence_interval. If a news event is ambiguous or polarizing (e.g., a complex merger announcement where the model detects both high positive and high negative signals), the confidence_interval widens. Sophisticated models use this variance as a proxy for potential volatility, allowing you to size positions down when the model is less certain of the directional impact.
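As one possible way to act on this, the sketch below shrinks a position as the interval widens. It assumes the confidence_interval is delivered as a (low, high) pair on the same -1 to +1 scale as the score, which the text above does not specify; the linear scaling rule and the name `position_size` are likewise illustrative choices, not a recommendation.

```python
def position_size(score, ci_low, ci_high, base_size=100.0):
    """Scale a signed position by the score and by how tight the interval is.

    Assumes the interval lies within [-1, 1], so its maximum width is 2.0.
    """
    width = ci_high - ci_low
    certainty = max(0.0, 1.0 - width / 2.0)  # 1.0 = point estimate, 0.0 = no information
    return base_size * score * certainty


tight = position_size(-0.78, -0.88, -0.68)   # narrow interval, near-full size
wide = position_size(-0.78, -1.0, 1.0)       # interval spans the whole range
print(round(tight, 2), wide)
```

A linear rule is the simplest option; a strategy could just as well apply a hard cutoff and skip any event whose interval exceeds a chosen width.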