Bulk-downloadable, pre-enriched sports datasets — for people who want to model, not scrape.
Raw Sports Vault is a premium baseball data library covering 2010 through 2026, with historical records dating back to 1871. More sports are coming in future releases. Today the catalog is seven bundles of pre-cleaned, pre-enriched datasets ranging from $59 (CSV/Excel of the historical record) to $449 (every dataset in every format, including a pre-loaded SQLite database).
The product isn't the data — the data is public. The product is what we did to it: cleaning, deduplication, schema standardization, leakage-checked feature engineering, and packaging in formats you can actually use.
If you've ever spent a weekend trying to merge a historical archive with pitch-by-pitch tracking and an odds feed, then given up, that's the problem we solved.
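To make "leakage-checked" concrete, here is a minimal sketch of the idea, not our production pipeline: a rolling feature is built only from games played before the row it describes, so it never contains the result you are trying to predict. The function name `add_prior_rolling_mean` and the field names (`date`, `team`, `runs`) are hypothetical and chosen for illustration only. They are not the schema of any bundle.

```python
from collections import defaultdict, deque


def add_prior_rolling_mean(rows, key="team", value="runs", window=3):
    """Attach a rolling mean of `value` over each team's previous `window` games.

    The current game is excluded from its own feature, which is the
    property a leakage check is meant to guarantee.
    Field names here are hypothetical, not the shipped schema.
    """
    history = defaultdict(lambda: deque(maxlen=window))
    out = []
    # ISO-8601 date strings sort chronologically as plain strings.
    for row in sorted(rows, key=lambda r: r["date"]):
        past = history[row[key]]
        feature = sum(past) / len(past) if past else None
        out.append({**row, f"{value}_prior_mean": feature})
        # Update history only AFTER the feature has been computed.
        past.append(row[value])
    return out


# Made-up illustrative rows, not real game data.
games = [
    {"date": "2024-04-01", "team": "AAA", "runs": 4},
    {"date": "2024-04-02", "team": "AAA", "runs": 6},
    {"date": "2024-04-03", "team": "AAA", "runs": 2},
]
features = add_prior_rolling_mean(games)
```

The first game has no prior history, so its feature is `None`. The common mistake is to append the current value before computing the mean, which quietly feeds the target into the feature.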
season - 1); historical records back to 1871 for the foundational tables.
Every byte we ship is sourced from a public origin. The work product we charge for is what's done in between: cleaning, joining, deduplicating, enriching, and reformatting. Sources:
If you're a source maintainer and have questions about how we use your data, email us — we'll happily walk through specifics.
What you're licensing is our compiled bundle. Use it for research, modeling, betting, fantasy, editorial, and internal analytics. Don't redistribute the raw files as a competing product. Keep your receipt email safe — it contains your download link. See the FAQ for full usage terms and policies.
Questions, custom dataset requests, broken-file reports, enterprise licenses, or feedback:
We answer within one business day. For download-link, receipt, or payment issues, contact payhip.com directly — we do not have access to your payment or account information.
Or grab the free sample first if you want to see the schema before committing.