Add South Carolina dataset exploration #120
Open
DTrim99 wants to merge 8 commits into PolicyEngine:main
Adds data exploration notebook and summary CSV for South Carolina (SC) dataset:
- Household and person counts (weighted)
- AGI distribution (median, average, percentiles) at household and person level
- Households with children breakdown
- Children by age group demographics
- Income bracket analysis

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add H.4216 reform analysis notebook using PolicyEngine microsimulation
- Include RFA official analysis data for comparison
- Add detailed comparison markdown explaining $159M difference:
  - PE shows +$40M revenue vs RFA's -$119M
  - Key difference: SCIAD phase-out treatment for upper-middle income
  - Implementation uses AGI - SCIAD vs federal taxable income

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Key findings:
- PE has 7.85x more $0 income returns vs RFA
- PE has ~50% fewer returns in $100k-$300k brackets
- PE has 1.9x more millionaire returns paying 78% higher avg tax
- Total baseline revenue similar ($6.52B vs $6.40B) but composition differs
- PE derives 48% of SC income tax from millionaires vs RFA's 15%

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

PE includes non-filers, which explains the 540k extra returns in the $0 bracket

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add implementation note about sc_additions bug fix
- Add RFA comparison section to notebook
- Update comparison markdown with post-fix accuracy (~93%)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add data_exploration_staging.ipynb for staging SC dataset
- Add sc_h4216_budget_impact.py for quick budget impact calculation
- Add staging dataset summary CSV
- Update reform analysis notebook with RFA comparison fixes
- Update tax impact CSV with corrected results (staging data)

Staging vs Production dataset comparison:
- Staging has 17% fewer households (more focused on filers)
- Staging median AGI is 39% higher (0k vs 3k)
- Budget impact with staging: -46.6M (5.21%) / -10.9M (5.39%)
- RFA estimate: -19.1M (93% accuracy with 5.39% rate)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
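The bracket-level findings in the commits above (weighted return counts per income bracket, and the share of income tax paid by each bracket) could be reproduced roughly as follows. This is only a sketch: the function name `bracket_summary`, the bracket edges, and the toy records are hypothetical and are not taken from the notebooks in this PR.

```python
import numpy as np


def bracket_summary(agi, tax, weight, edges):
    """Weighted return count, tax total, and tax share per AGI bracket.

    Bracket i covers edges[i] <= agi < edges[i + 1]; the last bracket is
    open-ended, and anything below edges[0] is folded into the first bracket.
    """
    agi = np.asarray(agi, dtype=float)
    tax = np.asarray(tax, dtype=float)
    weight = np.asarray(weight, dtype=float)
    n = len(edges)
    # np.digitize gives 1-based positions; shift to 0-based and clip the ends.
    idx = np.clip(np.digitize(agi, edges) - 1, 0, n - 1)
    returns = np.bincount(idx, weights=weight, minlength=n)
    revenue = np.bincount(idx, weights=weight * tax, minlength=n)
    total = revenue.sum()
    share = revenue / total if total else np.zeros(n)
    return returns, revenue, share


# Hypothetical edges: a $0 bracket, then <$100k, <$300k, <$1M, and $1M+.
edges = [0, 1, 100_000, 300_000, 1_000_000]
# Toy tax units, not real SC data.
returns, revenue, share = bracket_summary(
    agi=[0, 50_000, 150_000, 2_000_000],
    tax=[0, 1_000, 5_000, 100_000],
    weight=[10, 5, 2, 1],
    edges=edges,
)
```

Comparing `returns` and `share` bracket by bracket against the RFA table is what surfaces differences such as extra $0 returns or a larger millionaire share.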
Update: Staging Dataset Analysis & PR #7514 Fix
Changes
Results with PR #7514 Fix
Staging vs Production Dataset
The staging dataset better represents actual tax filers (fewer zero/low income units), which explains the improved alignment with RFA estimates.
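Because the datasets are weighted samples, comparing median AGI between staging and production means comparing weighted medians. The sketch below shows one common definition (the smallest value whose cumulative weight share reaches the target quantile). The helper name `weighted_quantile` and the toy numbers are assumptions for illustration; the notebooks in this PR may use a different convention or library.

```python
import numpy as np


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cum_share = np.cumsum(weights) / weights.sum()
    # First sorted position where the cumulative share is at least q.
    i = np.searchsorted(cum_share, q, side="left")
    return values[min(i, len(values) - 1)]


# Toy example: heavy weight on the largest record pulls the median upward.
unweighted = weighted_quantile([10, 20, 30], [1, 1, 1], 0.5)
weighted = weighted_quantile([10, 20, 30], [1, 1, 8], 0.5)
```

A dataset with fewer zero and low income units shifts the cumulative weight toward higher AGI values, which raises the weighted median in the same way.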
Summary
Key SC Statistics
H.4216 Tax Reform Analysis
Compares PolicyEngine microsimulation results against official RFA (Revenue & Fiscal Affairs) analysis.
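As a worked illustration of how the two estimates are compared, the sketch below computes the signed gap between a PolicyEngine estimate and the RFA estimate, plus a simple ratio. The helper names are hypothetical, and reading "accuracy" as the ratio of the PE estimate to the RFA estimate is an assumption rather than something this PR states. The first call uses the +$40M and -$119M figures from the commit message; the second uses made-up round numbers.

```python
def estimate_gap(pe_impact_m, rfa_impact_m):
    """Signed difference, PE minus RFA, in millions of dollars."""
    return pe_impact_m - rfa_impact_m


def ratio_to_official(pe_impact_m, rfa_impact_m):
    """PE estimate as a fraction of the RFA estimate (assumed accuracy metric)."""
    return pe_impact_m / rfa_impact_m


# Figures from the commit message: PE +$40M vs RFA -$119M.
gap = estimate_gap(40.0, -119.0)

# Hypothetical round numbers, only to show how the ratio behaves.
example_ratio = ratio_to_official(-93.0, -100.0)
```

With the commit-message figures, `gap` is 159.0, matching the $159M discrepancy discussed in this PR.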
Key Differences
The $159M discrepancy is primarily due to:
See h4216_analysis_comparison.md for detailed analysis.
Files Added
- us/states/sc/data_exploration.ipynb - SC dataset exploration
- us/states/sc/sc_dataset_summary_weighted.csv - Dataset summary
- us/states/sc/sc_h4216_reform_analysis.ipynb - H.4216 reform analysis
- us/states/sc/sc_h4216_tax_impact_analysis.csv - PE analysis results
- us/states/sc/rfa_h4216_analysis.csv - RFA official analysis
- us/states/sc/h4216_analysis_comparison.md - Comparison analysis
Test plan
🤖 Generated with Claude Code