Power Sector Data Crosswalk
October 2022 v0.3 Release
CAPD has added manual matches and the ability to include IDs from two additional datasets (the Facility Registry Service, or FRS, and the National Electric Energy Data System, or NEEDS) via optional flags in the crosswalk code on GitHub.
EPA’s Clean Air and Power Division (CAPD) and the U.S. Energy Information Administration (EIA) provide two of the most comprehensive and commonly used electric power sector datasets. These two datasets include information on emissions, electricity generation, fuel use, operations, and facility attributes of power plants across the United States. Generally, CAPD's Power Sector Emissions Data contains detailed hourly information on emissions and operation while EIA's data provides monthly and annual information on fuel use and net generation. Many researchers and data consumers find useful details in both datasets and can integrate the data in innovative ways to present informative analyses.
Merging the two datasets is difficult because each agency collects its data for a different purpose and in a different manner. CAPD's Power Sector Emissions Data focuses on combustion sources (e.g., boilers) while EIA's data focuses on electricity generators. By providing this crosswalk (a table that matches key EPA and EIA identifiers assigned to power plants and electric generating units), CAPD hopes to make it easier to integrate and use both datasets.
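As a rough illustration of how a crosswalk table can link the two datasets, the sketch below joins toy EPA-style emissions records (keyed by plant and combustion unit) to toy EIA-style generation records (keyed by plant and generator). The column names (`epa_plant_id`, `epa_unit_id`, `eia_plant_id`, `eia_generator_id`), the IDs, and all values are hypothetical and chosen only for this example; the headers in the published crosswalk files may differ.

```python
import csv
import io

# Toy crosswalk. Column names and IDs are hypothetical, not the real file's headers.
CROSSWALK_CSV = """epa_plant_id,epa_unit_id,eia_plant_id,eia_generator_id
3,1,3,1
3,2,3,2
10,CT1,10,GT1
10,CT2,10,GT1
"""

# Toy EPA-style records: emissions keyed by (plant ID, combustion unit ID).
epa_emissions = {
    ("3", "1"): 120.0,
    ("3", "2"): 95.5,
    ("10", "CT1"): 40.0,
    ("10", "CT2"): 42.5,
}

# Toy EIA-style records: net generation keyed by (plant ID, generator ID).
eia_generation = {
    ("3", "1"): 1000.0,
    ("3", "2"): 800.0,
    ("10", "GT1"): 700.0,
}


def link_records(crosswalk_text, emissions, generation):
    """Return one row per crosswalk match that has data in both datasets."""
    rows = []
    for match in csv.DictReader(io.StringIO(crosswalk_text)):
        epa_key = (match["epa_plant_id"], match["epa_unit_id"])
        eia_key = (match["eia_plant_id"], match["eia_generator_id"])
        if epa_key in emissions and eia_key in generation:
            rows.append(
                {
                    "epa_key": epa_key,
                    "eia_key": eia_key,
                    "emissions": emissions[epa_key],
                    "net_generation": generation[eia_key],
                }
            )
    return rows


linked = link_records(CROSSWALK_CSV, epa_emissions, eia_generation)
for row in linked:
    print(row)
```

In this toy data, two combustion units at plant 10 map to the same generator, so that generator's value appears on two linked rows. Summing a column across linked rows without first grouping by the appropriate key would double count it.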
The crosswalk files (csv and xlsx formats) and the R script that generates them are available on GitHub, a collaboration and version control platform. You can help improve the crosswalk by identifying mismatched units, adding new matches, and contributing updates to the R script. Please see “Contributing to the Crosswalk” for more information.
Data Sources
- Power Sector Emissions Data Guide (pdf)
- A guide to EIA electric power data
- Emissions & Generation Resource Integrated Database (eGRID)
- GitHub tutorial
Additional Data Sources
- eGRID Data Explorer
- EIA's Hourly Electric Grid Monitor
- EPA’s National Emissions Inventory (NEI)
- eGRID PM2.5 Data
- Homeland Infrastructure Foundation-Level Data (HIFLD)
How do I cite the Power Sector Data Crosswalk?
CAPD suggests the following citation for the Power Sector Data Crosswalk:
Huetteman, J., Tafoya, J., Johnson, T., and Schreifels, J. (2021). EPA-EIA Power Sector Data Crosswalk. Accessible at www.epa.gov/airmarkets/power-sector-data-crosswalk.