Utility Power Outage Scrapers
The Outage Detection System collects real-time power outage data directly from major utility companies. These scrapers provide high-confidence data with detailed geographic and temporal information.
Pacific Gas & Electric (PG&E)
Coverage: Northern and Central California
Customers Served: ~5.5 million
Refresh Rate: 10 minutes
PG&E provides outage data through their public API and alerts system. The scraper collects:
- Outage ID and location
- Affected customers
- Cause (when available)
- Crew status
- Estimated restoration time
Endpoints:
- Primary: apim.pge.com/cocoutage/outages/getOutagesRegions
- Fallback: pgealerts.alerts.pge.com/api/outages.json
Duke Energy
Coverage: North Carolina, South Carolina, Florida, Indiana, Ohio, Kentucky
Customers Served: ~8.2 million
Refresh Rate: 10 minutes
Duke Energy uses an ArcGIS-based outage map system. The scraper collects outages organized by region.
Data Fields:
- Geographic coordinates
- Customer counts
- ETR (estimated time of restoration) estimates
- Cause categories
- Crew dispatch status
Southern California Edison (SCE)
Coverage: Southern California (except Los Angeles DWP area)
Customers Served: ~5 million
Refresh Rate: 10 minutes
SCE provides outage data through their public API with fallback to ArcGIS layers.
Data Fields:
- City and county
- ZIP code
- Circuit ID
- Customer counts (affected and total)
- Outage start time
- Crew status
Consolidated Edison (Con Edison)
Coverage: New York City (Manhattan, Brooklyn, Queens, Bronx, Staten Island) and Westchester County
Customers Served: ~3.5 million
Refresh Rate: 10 minutes
Con Edison provides data through their storm center with borough-level granularity.
Borough to County Mapping:

| Borough | County |
|---------|--------|
| Manhattan | New York |
| Brooklyn | Kings |
| Queens | Queens |
| Bronx | Bronx |
| Staten Island | Richmond |
| Westchester | Westchester |
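The mapping above can be held as a simple lookup. This is a minimal sketch: the names `BOROUGH_TO_COUNTY` and `countyForBorough` are hypothetical and are not taken from the scraper's source.

```typescript
// Borough-to-county lookup copied from the mapping table above.
const BOROUGH_TO_COUNTY = new Map<string, string>([
  ['Manhattan', 'New York'],
  ['Brooklyn', 'Kings'],
  ['Queens', 'Queens'],
  ['Bronx', 'Bronx'],
  ['Staten Island', 'Richmond'],
  ['Westchester', 'Westchester'],
]);

// Hypothetical helper: returns the county for a borough name, or null when
// the name is not in the table. Surrounding whitespace is ignored.
function countyForBorough(borough: string): string | null {
  return BOROUGH_TO_COUNTY.get(borough.trim()) ?? null;
}
```

Returning `null` for an unknown name lets the caller decide whether to skip the record or keep it without a county.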
Florida Power & Light (FPL)
Coverage: Florida (eastern and southern regions)
Customers Served: ~5.7 million
Refresh Rate: 10 minutes
FPL provides county-level outage summaries through their storm center.
Endpoints:
- Primary: fplmaps.com/data/Customer_Data.json
- Fallback: fplmaps.com/data/thematic-2.json
Data Quality
All utility scrapers follow a consistent pattern:
- Primary endpoint is tried first
- Fallback endpoint is used if primary fails
- Data is normalized to a common schema
- Confidence score is calculated (base: 0.85)
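The primary-then-fallback pattern might look roughly like the sketch below. The names `Fetcher`, `FetchResult`, `fetchWithFallback`, and `withBaseConfidence` are hypothetical. This document does not describe how the confidence score is adjusted from its base, so the sketch attaches the base value of 0.85 unchanged.

```typescript
// Hypothetical types: the real adapter interfaces are not shown here.
type Fetcher<T> = () => Promise<T[]>;

interface FetchResult<T> {
  records: T[];
  source: 'primary' | 'fallback' | 'none';
}

// Base confidence stated in the list above.
const BASE_CONFIDENCE = 0.85;

// Try the primary endpoint first; call the fallback only if the primary throws.
// If both fail, return an empty result instead of propagating the error.
async function fetchWithFallback<T>(
  primary: Fetcher<T>,
  fallback: Fetcher<T>,
): Promise<FetchResult<T>> {
  try {
    return { records: await primary(), source: 'primary' };
  } catch (primaryError) {
    console.warn('primary endpoint failed', primaryError);
  }
  try {
    return { records: await fallback(), source: 'fallback' };
  } catch (fallbackError) {
    console.warn('fallback endpoint failed', fallbackError);
    return { records: [], source: 'none' };
  }
}

// Attach the base confidence to a normalized record (assumed behavior).
function withBaseConfidence<T extends object>(record: T): T & { confidence: number } {
  return { ...record, confidence: BASE_CONFIDENCE };
}
```

Passing the two fetchers in as functions keeps the ordering logic independent of any particular utility's endpoints.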
Error Handling
- Network errors: logged, and the scraper returns an empty array
- Parse errors: the affected record is skipped and the remaining records are processed
- API changes: fallback endpoints provide resilience
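The first two behaviors can be sketched as follows. `fetchOrEmpty` and `normalizeAll` are hypothetical helper names used only for illustration. The sketch assumes a parse failure shows up either as a thrown error or as a `null` return from the normalizer.

```typescript
// Network errors: log the failure and return an empty array.
async function fetchOrEmpty<T>(fetcher: () => Promise<T[]>): Promise<T[]> {
  try {
    return await fetcher();
  } catch (err) {
    console.error('network error while fetching outages', err);
    return [];
  }
}

// Parse errors: skip any record whose normalizer throws or returns null,
// so one bad record does not discard the rest of the batch.
function normalizeAll<R, N>(
  rawRecords: R[],
  normalize: (raw: R) => N | null,
): N[] {
  const normalized: N[] = [];
  for (const raw of rawRecords) {
    try {
      const result = normalize(raw);
      if (result !== null) normalized.push(result);
    } catch (err) {
      console.warn('skipping unparseable record', err);
    }
  }
  return normalized;
}
```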
Rate Limiting
Each utility scraper respects rate limits:
- 6 requests per minute per utility
- 10-minute refresh intervals
- Caching prevents duplicate requests
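One way to enforce these limits is a sliding-window limiter paired with a time-to-live cache. This is an illustrative sketch, not the framework's actual mechanism: `RateLimiter` and `TtlCache` are hypothetical names, and the current time is passed in explicitly so the logic is easy to test.

```typescript
// Sliding-window limiter: at most `limit` requests in any `windowMs` span.
class RateLimiter {
  private timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 6, windowMs = 60 * 1000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Records the request and returns true if it fits in the current window.
  tryAcquire(nowMs: number): boolean {
    this.timestamps = this.timestamps.filter((t) => nowMs - t < this.windowMs);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(nowMs);
    return true;
  }
}

// Single-value cache whose entry expires after `ttlMs` (10 minutes by default).
class TtlCache<T> {
  private entry: { value: T; storedAtMs: number } | null = null;
  private readonly ttlMs: number;

  constructor(ttlMs = 10 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Returns the cached value, or null if nothing is stored or it has expired.
  get(nowMs: number): T | null {
    if (this.entry === null || nowMs - this.entry.storedAtMs >= this.ttlMs) return null;
    return this.entry.value;
  }

  set(value: T, nowMs: number): void {
    this.entry = { value, storedAtMs: nowMs };
  }
}
```

A scraper built this way would check the cache first and call `tryAcquire(Date.now())` only on a cache miss, so repeated reads within the refresh interval never reach the utility's endpoint.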
Adding New Utilities
The scraper framework is extensible. To add a new utility:
- Create a new adapter extending `BaseAdapter`
- Implement `fetchRawOutages()` and `normalizeOutage()`
- Register it in `adapters/index.ts`
- Add it to the `wrangler.toml` configuration
Example adapter structure:
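    // Import paths below are assumptions; adjust them to the project's layout.
    import { BaseAdapter } from './base';
    import type { NormalizedOutage, RawOutage, SourceName } from '../types';

    export class NewUtilityAdapter extends BaseAdapter {
      readonly sourceName: SourceName = 'newutility';
      readonly refreshIntervalSeconds = 600; // 10 minutes

      async fetchRawOutages(): Promise<RawOutage[]> {
        // Fetch from the utility API; placeholder returns no records
        return [];
      }

      normalizeOutage(raw: RawOutage): NormalizedOutage | null {
        // Transform to the common schema; placeholder skips every record
        return null;
      }
    }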