Data sources
EDINET: Japan's official corporate disclosure system
All structured filing data comes from EDINET, the Financial Services Agency's electronic disclosure system. EDINET is the regulatory source of truth for every listed company in Japan, covering securities reports, large shareholding filings, tender offers, extraordinary reports, treasury stock filings, and the rest of the document family.
Japan Finsight syncs with EDINET regularly and has ingested every filing since January 2018. Every EDINET-sourced value carries its filing identifier (edinet_document_id), so you can trace it to the source on EDINET's portal. LLM-extracted fields also include the source quote they were drawn from.
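As an illustration of tracing a value back to its filing, here is a minimal sketch that turns an edinet_document_id into a download URL. It assumes EDINET's document retrieval API (v2) rather than the portal's web viewer. The base URL, the `type` codes, and the `Subscription-Key` parameter reflect our understanding of that API and should be checked against the FSA's current specification. The helper name `edinet_document_url` is hypothetical and not part of Japan Finsight.

```python
from urllib.parse import urlencode

# Assumed base URL of the EDINET API v2; verify against the FSA specification.
EDINET_API_BASE = "https://api.edinet-fsa.go.jp/api/v2"


def edinet_document_url(edinet_document_id: str, subscription_key: str,
                        file_type: int = 2) -> str:
    """Build a download URL for a single filing (hypothetical helper).

    file_type is assumed to follow the EDINET API convention, where
    1 is the XBRL archive and 2 is the PDF.
    """
    doc_id = edinet_document_id.strip()
    if not doc_id:
        raise ValueError("edinet_document_id must be non-empty")
    query = urlencode({"type": file_type, "Subscription-Key": subscription_key})
    return f"{EDINET_API_BASE}/documents/{doc_id}?{query}"
```

The function only builds the URL and makes no network call, so it can be paired with whichever HTTP client you already use.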
Parser layer
The XBRL parsing that turns EDINET's CSV dumps into typed, structured data is handled by edinet-tools, an MIT-licensed Python library developed and maintained as a separate open-source project. By design, pip install edinet-tools gives you the same parsers Japan Finsight uses internally.
LLM-extracted fields
Several Japan Finsight tools add LLM-extracted fields on top of the deterministic XBRL parsing:
- get_capital_policy: target payout ratios and dividend skip reasons extracted from annual report text blocks, with source quotes embedded in the response.
- get_related_party_transactions: counterparties, transaction types, and amounts structured from filing text, with every quote verified against the source filing.
- get_holding_actions: activist-intent annotations on large-shareholding filings (Doc 350). A curated registry of known activist filers, plus extracted change proposals and desired outcomes.
- get_material_events: extraordinary report (Doc 180) classification into material-event categories.
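To make "verified against the source filing" concrete, the sketch below shows one way a reader could check that a returned source quote really appears in the filing text. It is not Japan Finsight's verification logic, which is not described here. The function names are hypothetical, and the normalization choices (NFKC folding plus whitespace removal) are assumptions suited to Japanese filings, where full-width characters and line wraps are common.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Fold full-width/half-width variants (NFKC) and drop all whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", "", text)


def quote_in_filing(quote: str, filing_text: str) -> bool:
    """Return True if the quote occurs in the filing after normalization."""
    needle = normalize(quote)
    # An empty quote proves nothing, so treat it as unverified.
    return bool(needle) and needle in normalize(filing_text)
```

Dropping whitespace entirely is lenient for English text, because it also ignores word boundaries. A stricter variant could collapse whitespace runs to a single space instead.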
Extraction policy: we extract only facts the company stated in the filing, and every extracted field carries the source quote it was drawn from. We do not derive judgments about the company. Empty results mean "the company didn't state this," not "the company doesn't do this."
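The policy above suggests treating an empty field as "not stated" rather than as a negative finding. The sketch below shows that reading in client code. The response shape (a dict whose fields are either empty or an object with `value` and `source_quote` keys) and the field name `target_payout_ratio` are assumptions made for illustration, not the documented schema of any Japan Finsight tool.

```python
from typing import Optional


def describe_field(response: dict, field: str) -> str:
    """Summarize one extracted field without over-reading an empty result."""
    extracted: Optional[dict] = response.get(field)
    if not extracted:
        # Empty means the filing did not state it, not that the
        # company does not do it.
        return f"{field}: not stated in the filing"
    return f"{field}: {extracted['value']} | quote: {extracted['source_quote']}"
```

For example, `describe_field({"target_payout_ratio": None}, "target_payout_ratio")` reports the field as not stated instead of implying the company has no payout target.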
Companion projects
- edinet-tools: the parser library (PyPI).
- kabu-agent: a companion AI agent for working with EDINET data (open source).
What we don't source from
- Bloomberg / FactSet / Refinitiv: none. All structured data is sourced directly from FSA EDINET.
- Live market data: not redistributed through public tools; commercial licensing required.
- TDNet: earnings short-reports (決算短信) are not yet ingested; planned for a future release.