Public company filings give you three things other sources can't: standardized financial data, exact event timing, and management commentary in their own words. These documents let you track revenue growth, spot liquidity stress, measure disclosure changes, and build datasets that work across industries and decades.
In this post, we'll show you which filings matter, where to find them, how to extract what you need, and which quality checks prevent bad data from ruining your analysis.
What Are the Periodic Financial Filings?
U.S. public companies file reports with the SEC on a fixed schedule. Each filing type serves a different purpose and follows its own structure. That structure makes it possible to pull the same variables from thousands of companies without rewriting your code every time.
• 10-K reports annual results. You get full audited financials, detailed footnotes, officer certifications, and the complete MD&A section that explains performance.
• 10-Q reports quarterly results. These include unaudited statements for the most recent quarter and year-to-date figures. Companies file three 10-Qs per year because the fourth quarter rolls into the 10-K.
• 8-K discloses specific events, generally within four business days of the event. Examples include executive departures, acquisition announcements, credit agreement changes, and bankruptcy filings. Companies file 8-Ks outside the regular calendar, so you capture information before the next 10-Q appears.
Each form uses XBRL tags or HTML tables. That consistency lets you automate extraction instead of reading thousands of PDFs manually.
Who Uses Filings and For Which Questions
Academic researchers, equity analysts, credit teams, and regulators all start with SEC filings. Each group pulls different sections depending on the question. A liquidity study needs cash flow statements. A governance paper needs proxy data and 8-K leadership changes. A sentiment analysis uses MD&A text.
| Research Type | Filing Section | Why It Matters |
| --- | --- | --- |
| Earnings quality | Income statement, cash flow | Spot accrual patterns and cash conversion gaps |
| Credit risk | Balance sheet, debt footnotes | Measure leverage ratios and covenant headroom |
| Event study | 8-K item codes | Isolate market reaction to specific announcements |
| Textual analysis | MD&A, risk factors | Track tone shifts and disclosure complexity over time |
| Valuation | All financials, segment data | Build multiples and compare peer groups |
You pick the filing that matches your timeline. Annual studies use 10-Ks. Quarterly event windows need 10-Qs. Announcement-day tests rely on 8-Ks.
Where to Get Filings and Which Format to Choose
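The SEC provides free access through EDGAR at sec.gov. You can search by company name, ticker, or CIK number. EDGAR hosts electronic filings from 1994 onward, although electronic filing was phased in and did not become mandatory for all domestic filers until 1996, so the earliest years are incomplete. Most filings are available as HTML, and financial statements have carried XBRL (XML) tags since the SEC phased in that requirement starting in 2009. Some older filings appear as plain text or scanned images.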
For larger projects, use bulk download feeds or the SEC API. Python libraries like sec-edgar-downloader and sec-api can take care of the download mechanics for you. Global Filings AI is another option: the platform lets you search financial filings fast and gives you structured data, bulk exports, and clean XBRL fields for research work.
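If you'd rather talk to the SEC endpoints directly, a minimal sketch might look like the following. It builds request objects for two public JSON feeds on data.sec.gov (a company's filing history and its XBRL "company facts"). The helper names and the contact string are placeholders made up for illustration, and nothing touches the network until you pass the request to `urllib.request.urlopen`.

```python
from urllib.request import Request

# Placeholder contact string: replace with your own name and email address.
USER_AGENT = "Example Research example@example.com"


def pad_cik(cik):
    """Return the CIK as the 10-digit, zero-padded string used in data.sec.gov URLs."""
    return str(int(cik)).zfill(10)


def submissions_url(cik):
    """URL of a company's filing history (JSON)."""
    return f"https://data.sec.gov/submissions/CIK{pad_cik(cik)}.json"


def company_facts_url(cik):
    """URL of the XBRL facts a company has reported (JSON)."""
    return f"https://data.sec.gov/api/xbrl/companyfacts/CIK{pad_cik(cik)}.json"


def build_request(url, user_agent=USER_AGENT):
    """Attach the declared User-Agent header to a request object (not sent yet)."""
    return Request(url, headers={"User-Agent": user_agent})


# 320193 is Apple's CIK.
req = build_request(submissions_url(320193))
print(req.full_url)  # https://data.sec.gov/submissions/CIK0000320193.json
```

Keeping URL construction separate from the actual download makes it easy to test your loop over CIKs before you send a single request.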
| Format | Best For | Limitations |
| --- | --- | --- |
| HTML | Reading in browser, simple scraping | Requires parsing nested tables |
| XBRL | Automated numeric extraction | Tagging inconsistencies across filers |
| Plain text | Older filings, keyword search | No structure for table extraction |
| PDF (via conversion) | Printing, manual review | Hard to automate reliably |
Choose XBRL when you need income statement and balance sheet line items at scale. Use HTML when you want MD&A text or tables that aren't tagged. Avoid PDFs unless you're only reading a handful of documents.
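To make the XBRL route concrete, here is a rough sketch that pulls one tagged line item out of a payload shaped like the SEC's "company facts" JSON. The sample payload is invented and heavily trimmed, and `extract_fact` is a hypothetical helper, so treat the field layout as an assumption to verify against a real response. Keep in mind that filers do not all use the same tag for the same concept (revenue, for example, may sit under `Revenues` or `RevenueFromContractWithCustomerExcludingAssessedTax`).

```python
# Invented, trimmed payload shaped like the SEC "company facts" JSON.
SAMPLE_FACTS = {
    "cik": 1234567,
    "entityName": "Example Corp",
    "facts": {
        "us-gaap": {
            "Revenues": {
                "units": {
                    "USD": [
                        {"end": "2022-12-31", "val": 1000, "fy": 2022, "fp": "FY",
                         "form": "10-K", "filed": "2023-02-15"},
                        {"end": "2023-03-31", "val": 260, "fy": 2023, "fp": "Q1",
                         "form": "10-Q", "filed": "2023-05-05"},
                    ]
                }
            }
        }
    },
}


def extract_fact(payload, tag, unit="USD", taxonomy="us-gaap", forms=("10-K", "10-Q")):
    """Return one row per reported value of `tag`, or [] if the filer never used it."""
    concept = payload.get("facts", {}).get(taxonomy, {}).get(tag, {})
    rows = []
    for item in concept.get("units", {}).get(unit, []):
        if item.get("form") not in forms:
            continue
        rows.append({
            "cik": payload.get("cik"),
            "tag": tag,
            "period_end": item["end"],
            "value": item["val"],
            "form": item["form"],
            "filed": item["filed"],
        })
    return rows


for row in extract_fact(SAMPLE_FACTS, "Revenues"):
    print(row["period_end"], row["form"], row["value"])
```

Returning an empty list for a missing tag, rather than raising an error, lets you record the gap and decide later whether to fall back to HTML.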
Data Extraction, Step by Step
A repeatable process saves time and cuts errors. You start by identifying companies, then pull filings for the period you need, extract numbers and text, clean the results, and export everything to your analysis tool.
Identify CIK: Every public company has a Central Index Key. Look it up on EDGAR or through an API. Store CIKs in a list so you can loop through filers.
Pull filings: Download 10-Ks, 10-Qs, or 8-Ks for your sample. Use EDGAR's index files if you need thousands of documents. Set a user-agent string in your headers to comply with SEC access rules.
Extract tables: Parse XBRL for tagged financials or use an HTML parser for untagged tables. Libraries like Beautiful Soup or lxml work well. Save each table as a dataframe with filing metadata attached.
Clean text: Strip HTML tags, remove exhibit headers, and split sections by heading. Tokenize if you're doing sentiment analysis. Keep paragraph structure intact if you're counting disclosure length.
Export results: Write to CSV for small datasets or Parquet for large ones. Include company identifiers, filing date, period end date, and the variables you extracted. Document your code so you can replicate the process later.
This workflow scales. Once it works for ten companies, it works for ten thousand.
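The steps above can be sketched in a few lines. This example assumes you have already downloaded one of EDGAR's quarterly `master.idx` index files, which list filings as pipe-delimited rows. The sample text below is invented (the companies and accession numbers are not real), and `parse_master_index` and `to_csv` are hypothetical helpers.

```python
import csv
import io

# Invented sample in the layout of an EDGAR master.idx file.
SAMPLE_INDEX = """\
Description:           Master Index of EDGAR Dissemination Feed

CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1234567|EXAMPLE CORP|10-K|2023-02-15|edgar/data/1234567/0001234567-23-000010.txt
1234567|EXAMPLE CORP|8-K|2023-03-01|edgar/data/1234567/0001234567-23-000014.txt
7654321|SAMPLE HOLDINGS INC|10-Q|2023-05-05|edgar/data/7654321/0007654321-23-000021.txt
"""

FIELDS = ["cik", "company", "form", "date_filed", "path"]


def parse_master_index(text, ciks=None, forms=("10-K", "10-Q", "8-K")):
    """Return filing rows for the requested form types and (optionally) CIKs."""
    rows = []
    for line in text.splitlines():
        parts = line.split("|")
        if len(parts) != 5 or not parts[0].isdigit():
            continue  # skip header, separator, and blank lines
        row = dict(zip(FIELDS, (p.strip() for p in parts)))
        if row["form"] not in forms:
            continue
        if ciks is not None and int(row["cik"]) not in ciks:
            continue
        row["url"] = "https://www.sec.gov/Archives/" + row["path"]
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize rows to CSV text with identifiers and filing metadata attached."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS + ["url"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


filings = parse_master_index(SAMPLE_INDEX, ciks={1234567}, forms=("10-K", "10-Q"))
print(to_csv(filings))
```

The same function works whether the CIK set holds ten companies or ten thousand; only the number of index files you feed it changes.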
Common Variables and Construction
Most empirical studies use a core set of financial variables. You find these in the income statement, balance sheet, and cash flow statement. Consistency matters because definitions vary slightly across filers.
Revenue: Also called net sales or total revenue. It appears at the top of the income statement. Use trailing twelve months for comparability when mixing 10-Qs and 10-Ks.
Operating income: Revenue minus cost of goods sold and operating expenses. This excludes interest and taxes, so it isolates business performance.
Assets: Total assets from the balance sheet. You need this for return ratios and leverage calculations. Total assets include goodwill and other intangibles, so consider stripping them out if you're studying asset-light or acquisitive firms.
Cash flow items: Operating, investing, and financing cash flows come from the statement of cash flows. Free cash flow equals operating cash flow minus capital expenditures.
MD&A sentiment: Measure tone using dictionaries like Loughran-McDonald or train a domain-specific model. Count positive and negative words, then scale by total words.
Disclosure length: Count words or characters in MD&A or risk factor sections. Length changes signal shifts in transparency or complexity.
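For the two text variables, a minimal sketch could look like this. The word lists are tiny made-up stand-ins, not the actual Loughran-McDonald dictionary, and the regex-based tag stripping is deliberately crude; a real pipeline would load the full dictionary and use a proper HTML parser.

```python
import html
import re

# Tiny illustrative word lists; stand-ins for a real dictionary such as Loughran-McDonald.
POSITIVE = {"improved", "strong", "growth"}
NEGATIVE = {"decline", "loss", "impairment"}


def clean_text(raw_html):
    """Drop tags, decode HTML entities, and collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", raw_html)
    return re.sub(r"\s+", " ", html.unescape(no_tags)).strip()


def tone_and_length(text):
    """Count dictionary hits and scale them by total words."""
    words = re.findall(r"[a-z]+", text.lower())
    total = len(words)
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return {
        "word_count": total,
        "positive_share": pos / total if total else 0.0,
        "negative_share": neg / total if total else 0.0,
        "net_tone": (pos - neg) / total if total else 0.0,
    }


sample = "<p>Revenue growth was strong &amp; margins improved, offset by an impairment loss.</p>"
metrics = tone_and_length(clean_text(sample))
print(metrics["word_count"], metrics["net_tone"])  # 11 words, net tone of 1/11
```

Scaling by total words matters because a longer MD&A will contain more negative words even if its tone hasn't changed.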
Define each variable the same way across your sample. Document adjustments for stock splits, currency changes, or fiscal year shifts.
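One way to keep definitions consistent is to put each one in a single function and reuse it everywhere. The sketch below covers trailing twelve months, the implied fourth quarter, and free cash flow. The function names and the numbers in the example are hypothetical, and the free cash flow helper assumes capital expenditures may arrive with either sign depending on the source.

```python
def trailing_twelve_months(last_fiscal_year, current_ytd, prior_year_ytd):
    """TTM = last full fiscal year + current year-to-date
    - the same year-to-date span one year earlier."""
    return last_fiscal_year + current_ytd - prior_year_ytd


def fourth_quarter(fiscal_year, nine_month_ytd):
    """No 10-Q covers Q4, so back it out of the 10-K and the third-quarter 10-Q."""
    return fiscal_year - nine_month_ytd


def free_cash_flow(operating_cash_flow, capital_expenditures):
    """Operating cash flow minus capex, treating capex as an outflow whatever its sign."""
    return operating_cash_flow - abs(capital_expenditures)


# Hypothetical figures: FY2022 revenue 1000, Q1 2023 revenue 260, Q1 2022 revenue 230.
print(trailing_twelve_months(1000, 260, 230))  # 1030
print(fourth_quarter(1000, 720))               # 280
print(free_cash_flow(150, -40))                # 110
```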
Data Quality Checklist
Bad data ruins results. A few checks catch most problems before they reach your regression.
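Confirm amendment flags: Companies file a 10-K/A or 10-Q/A when they revise a report. Check what the amendment changes before you swap it in: some amendments restate financials, while others only add exhibits or the Part III information omitted from the original 10-K. Use the amended figures when the financials were revised, and keep the original filing date if you're measuring market reaction.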
Check restatements: Look for restatement disclosures in footnotes or 8-K Item 4.02 filings. Exclude restated periods or adjust your sample.
Align fiscal periods: Some firms use calendar years, others use fiscal years ending in March or June. Match filing dates to the correct quarter or year.
Review tagging gaps: XBRL coverage isn't perfect. Companies sometimes omit tags or use custom extensions. Spot-check a few filings manually to confirm your parser works.
Run these checks after extraction and before analysis. Fix issues in your pipeline instead of your final dataset.
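A few of these checks are easy to automate. The sketch below flags amendments, maps a period end date to a fiscal quarter, and lists missing fields. All three helpers are hypothetical. The quarter logic assumes quarters end on month boundaries (so it will misfire for 52/53-week fiscal calendars) and labels a fiscal year by the calendar year in which it ends, which is only one of several conventions.

```python
import math
from datetime import date


def is_amendment(form_type):
    """EDGAR marks amended filings with an '/A' suffix, e.g. '10-K/A'."""
    return form_type.strip().upper().endswith("/A")


def fiscal_quarter(period_end, fye_month):
    """Map a period end date to (fiscal_year, quarter).

    Assumes month-end quarters and labels the fiscal year by the
    calendar year in which it ends.
    """
    months_into_year = (period_end.month - fye_month) % 12
    quarter = 4 if months_into_year == 0 else math.ceil(months_into_year / 3)
    fiscal_year = period_end.year if period_end.month <= fye_month else period_end.year + 1
    return fiscal_year, quarter


def missing_fields(row, required=("revenue", "total_assets")):
    """List required variables that came back empty, e.g. because a tag was omitted."""
    return [name for name in required if row.get(name) is None]


print(is_amendment("10-K/A"))                                   # True
print(fiscal_quarter(date(2023, 9, 30), fye_month=6))           # (2024, 1)
print(missing_fields({"revenue": 1000, "total_assets": None}))  # ['total_assets']
```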
Typical Pitfalls and Fixes
| Problem | Fix |
| --- | --- |
| Duplicate filings for the same period | Filter by file date and keep the latest non-withdrawn version |
| Missing XBRL tags for key items | Fall back to HTML parsing or exclude filers with incomplete tagging |
| Inconsistent fiscal period labels | Standardize periods using end date and duration fields |
| Special characters breaking text extraction | Use UTF-8 encoding and strip non-printable characters |
| Outliers from data entry errors | Winsorize at 1st and 99th percentiles or flag extreme values for review |
Every dataset has quirks. Build validation steps into your code and log warnings when something looks off.
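Three of the fixes in the table can be sketched with the standard library alone. The helper names and sample records are made up, filing dates are assumed to be ISO strings (so plain string comparison orders them correctly), and the percentile method is one reasonable choice among several.

```python
import statistics


def keep_latest(filings):
    """For each (cik, period_end) pair, keep the most recently filed record."""
    latest = {}
    for filing in filings:
        key = (filing["cik"], filing["period_end"])
        if key not in latest or filing["filed"] > latest[key]["filed"]:
            latest[key] = filing
    return list(latest.values())


def strip_nonprintable(text):
    """Remove control and other non-printable characters, keeping newlines and tabs."""
    return "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the given percentiles (needs at least two observations)."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, low), high) for v in values]


records = [
    {"cik": 1234567, "period_end": "2022-12-31", "filed": "2023-02-15", "form": "10-K"},
    {"cik": 1234567, "period_end": "2022-12-31", "filed": "2023-04-28", "form": "10-K/A"},
]
print(keep_latest(records)[0]["form"])             # 10-K/A
print(strip_nonprintable("Net\x00 sales\u200b"))   # Net sales
print(max(winsorize(list(range(1, 101)) + [10_000])))
```

Winsorizing changes the extreme values instead of dropping them, so your sample size stays the same; if you would rather inspect outliers by hand, flag them instead of clipping.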
Bottom Line
Periodic filings give you reliable, standardized data that you can track over time and compare across companies. They include financial statements, management discussion, and event disclosures in formats built for extraction. When you know which filing to use, where to get it, and how to clean what you pull, you build datasets that answer your research questions without guessing about data quality. Filings put primary sources in your hands. That's why they remain the foundation of empirical finance research.
Frequently Asked Questions
What's the difference between a 10-K and a 10-Q?
A 10-K covers the full fiscal year and includes audited financials. A 10-Q covers a single quarter and is unaudited.
Do I need to pay for SEC filings?
No. EDGAR provides free access to all public company filings. Paid services add parsing, cleaning, or API access.
How do I handle companies that change fiscal year ends?
Use the period end date field to align quarters correctly. Firms that change their fiscal year end file a transition report covering the stub period, which appears on EDGAR under form type 10-KT or 10-QT.
Can I trust XBRL tags across all companies?
Most large-cap firms tag consistently, but smaller companies use custom extensions. Validate extracted numbers against HTML tables for a subset of your sample.
What's an 8-K Item code?
Each 8-K lists Item codes that classify the event, such as Item 1.01 for entry into a material definitive agreement or Item 5.02 for director and officer changes. Filter by code to study specific events.
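As a rough illustration of filtering by code, the sketch below assumes filing metadata arrives as parallel lists with a comma-separated `items` string per filing, which is how the "recent filings" block of the SEC submissions feed appears to be laid out. The sample data is invented, and `filter_8k_by_item` is a hypothetical helper.

```python
# Invented sample: parallel lists, one entry per filing.
SAMPLE_RECENT = {
    "accessionNumber": ["0001234567-23-000014", "0001234567-23-000010", "0001234567-23-000007"],
    "form": ["8-K", "10-K", "8-K"],
    "filingDate": ["2023-03-01", "2023-02-15", "2023-01-20"],
    "items": ["5.02,9.01", "", "2.02,9.01"],
}


def filter_8k_by_item(recent, item_code):
    """Return 8-K filings whose item list contains `item_code` (e.g. '5.02')."""
    hits = []
    for accession, form, filed, items in zip(
        recent["accessionNumber"], recent["form"], recent["filingDate"], recent["items"]
    ):
        codes = {code.strip() for code in items.split(",") if code.strip()}
        if form == "8-K" and item_code in codes:
            hits.append({"accession": accession, "filed": filed, "items": sorted(codes)})
    return hits


for hit in filter_8k_by_item(SAMPLE_RECENT, "5.02"):
    print(hit["filed"], hit["items"])  # 2023-03-01 ['5.02', '9.01']
```

Matching on whole codes, not substrings, avoids false hits such as "1.01" matching inside a longer string.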
How far back does EDGAR coverage go?
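EDGAR has filings from 1994 forward in electronic form, though the earliest years are incomplete because electronic filing only became mandatory for all domestic filers in 1996. Earlier filings exist on microfiche or in physical archives at the SEC.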
Should I use filing date or period end date in event studies?
Use filing date for market reaction tests. Use period end date when matching financial statement data to other time-series variables.
How do I download thousands of filings without getting blocked?
Follow SEC fair access rules: declare a user-agent, limit requests to ten per second, and use off-peak hours. Bulk data files replace thousands of individual requests with a handful of downloads, which keeps you well under the limit.
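A simple way to stay under the request limit is to space out your calls. The `Throttle` class below is a hypothetical helper, not part of any SEC tooling. It accepts injectable `clock` and `sleep` functions so you can test it without waiting.

```python
import time


class Throttle:
    """Space successive calls at least 1 / max_per_second seconds apart."""

    def __init__(self, max_per_second=10, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now


throttle = Throttle(max_per_second=10)
for name in ["filing-1", "filing-2", "filing-3"]:  # placeholder work items
    throttle.wait()
    # download the filing here
    print("ready to fetch", name)
```

Setting `max_per_second` a little below ten leaves headroom for retries and clock jitter.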

© 2025 Global Filings. All rights reserved.


