Analysis Modules#
Tools for \(\Psi\) phenomenon detection, deduplication, and scaling relations analysis.
\(\Psi\) Phenomenon Detection#
Detect precursory scale increase using the rectangular algorithm (Christophersen et al., 2024).
Deduplication#
Two-stage deduplication to remove duplicate \(\Psi\) identifications.
- analysis.optimize_psi_results.optimize_psi_results(input_file='psi.ou4a', output_file='optimized_psi.ou4', t_decimals=3, mp_decimals=1, r_decimals=1, auto_relax=True, verbose=True, mode='round', t_tol=0.03, mp_tol=0.05, r_tol=0.25)[source]#
Apply Step 9 two-stage deduplication to Ψ identification results.
- This implements the deduplication procedure from Christophersen et al. (2024, SRL 95(6):3464-3481):
Step 9.1: For same (eq_name, tmin, tp, mp) → keep max sloperatio (ties: min ap) Step 9.2: For same (eq_name, tp, r) → keep min ap
- Two discretization modes:
mode=”round”: Discretize by rounding to decimal places (simple, controllable)
mode=”tolerance”: Discretize by binning with tolerance (RECOMMENDED, closer to physical resolution)
Output format: Fixed 14 columns in CANONICAL_14 order.
- Parameters:
input_file (str) – Input .ou4a file with raw Ψ identifications
output_file (str) – Output .ou4 file with deduplicated results
t_decimals (int) – Decimal places for time (round mode)
mp_decimals (int) – Decimal places for magnitude (round mode)
r_decimals (int) – Decimal places for slope ratio (round mode)
auto_relax (bool) – Automatically relax precision if no reduction (round mode only)
verbose (bool) – Print detailed progress information
mode (str) – “round” or “tolerance”
t_tol (float) – Time tolerance in years (tolerance mode, default 0.03 ≈ 11 days)
mp_tol (float) – Magnitude tolerance (tolerance mode)
r_tol (float) – Slope ratio tolerance (tolerance mode)
- Returns:
Tuple (n_initial, n_after_9.1, n_after_9.2) - counts at each stage
Scaling Relations#
Fixed-effects regression for initial parameter estimation.
- analysis.plot_relations.analyze_scaling_relations(psi_file='optimized_psi.ou4', output_prefix='scaling_final')[source]#
Analyze Ψ phenomenon scaling relations using fixed-effects regression.
This function implements the two-stage estimation procedure to estimate initial values for EEPAS model parameters. It removes the within-mainshock AP-TP trade-off using fixed-effects regression, then fits scaling relations.
- Parameters:
psi_file – Input file with deduplicated Ψ identifications (.ou4 format)
output_prefix – Prefix for output files (plots and CSV)
- Algorithm:
Read all Ψ identifications from file
Convert units (years → days, km²) and compute logarithms
Estimate fixed-effects common slopes (b_AT, b_TA) to remove trade-off
Project each mainshock’s identifications to representative point
Fit linear regressions on representative points: - log₁₀(AP) vs MP → b_A (spatial scaling), sigma_A - log₁₀(TP) vs MP → b_T, a_T (temporal scaling), sigma_T - Mm vs MP → b_M, a_M (magnitude scaling), sigma_M
Generate plots with 95% prediction intervals
- Output Files:
{output_prefix}_projected_points.csv: Representative points per mainshock
{output_prefix}_mp_relations.png: Plots of AP and TP vs MP (semi-log)
{output_prefix}_mm_mp.png: Plot of Mm vs MP (linear scale)
- Returns:
None (writes output files directly)
Note
Estimated parameters (b_A, sigma_A, a_T, b_T, sigma_T, a_M, b_M, sigma_M) can be used as initial values for EEPAS learning.
References
Christophersen et al. (2024). Algorithmic Identification of the Precursory Scale Increase Phenomenon in Earthquake Catalogs. SRL 95(6):3464-3481.
- analysis.plot_relations._fixed_effects_slope_safe(x, y, group, min_group_size=2)[source]#
Estimate common slope using fixed-effects regression (within-group demeaning).
Removes within-mainshock trade-off by demeaning x and y within each mainshock, then computing pooled slope across all demeaned observations.
- Parameters:
x – Independent variable (array-like)
y – Dependent variable (array-like)
group – Group identifiers (mainshock names, array-like)
min_group_size – Minimum group size to include (default: 2)
- Returns:
Common slope estimate (b_AT or b_TA). Falls back to pooled OLS if denominator is zero or no groups meet minimum size.
Utility Functions#
Dataset Extraction#
- analysis.dataset.extract_period_forecast(period_idx, forecast_data, num_magnitude_steps, magnitude_bins, num_regions, cell_bounds)[source]#
Extract earthquake forecast data for specified 3-month period. (This function is essentially the same as your version)
- Parameters:
period_idx – Period index (starting from 1) (int)
forecast_data – Complete forecast data matrix (ndarray)
num_magnitude_steps – Number of magnitude bins (int)
- magnitude_binslist
List of magnitude bin tuples [(m0, m1), …]
- num_regionsint
Number of spatial regions
- cell_boundsDataFrame
Already converted to lon/lat DataFrame
Returns: pd.DataFrame
Forecast data for the specified period
- analysis.dataset.create_subgrids_spatial(data, grid_resolution=0.1)[source]#
Divide large grid (possibly irregular lon/lat boundaries) into 0.1x0.1 sub-grids.
Using ‘Centroid-in-Bounding-Box’ method for spatial downscaling.
- Parameters:
data – Coarse grid forecast data (pd.DataFrame)
grid_resolution – Sub-grid resolution in degrees (default: 0.1) (float, optional)
Returns: pd.DataFrame
Downscaled forecast data with sub-grids
Time Conversion#
- analysis.decimal_time.decimal_time_precise(year, month, day, hour=0, minute=0, second=0, start=0)[source]#
Convert year-month-day-hour-minute-second to decimal time format, handling various cases including negative seconds
- Parameters:
year – Year (can be scalar or array)
month – Month (can be scalar or array)
day – Day (can be scalar or array)
hour – Hour, minute, second (can be scalar or array)
minute – Hour, minute, second (can be scalar or array)
second – Hour, minute, second (can be scalar or array)
start – Starting year (default is 0)
- Returns:
Decimal time format (relative to starting year)
Event Selection#
- analysis.select_m5plus.select_events_with_options(catalog_file, output_file, min_mag=6.0, start_year=1988, end_year=2025, max_depth=40.0, MC=2.5, exclude_921_aftershocks=True)[source]#
Select earthquakes from the catalog based on specified criteria, with optional exclusion of 921 earthquake aftershocks.
- Parameters:
catalog_file (str) – Input .mat file name.
output_file (str) – Output file name.
min_mag (float) – Minimum magnitude to filter.
start_year (int) – Starting year to filter (inclusive).
end_year (int) – Ending year to filter (exclusive).
max_depth (float) – Maximum depth to filter (exclusive), in kilometers.
MC (float) – A custom parameter, default is 2.5.
exclude_921_aftershocks (bool) – Whether to exclude 921 aftershocks, default is True.
Forecast Validation#
- analysis.analyze_forecast_lambda.analyze_forecast_results(results_dir, config_file)[source]#
Analyze PPE and EEPAS forecast Lambda sum for validation.
This function validates forecast results by checking if the integrated rate density (Lambda sum) is consistent with the observed event count. For a properly calibrated forecast, the Lambda sum should be close to the number of observed events in the forecast period.
- Parameters:
results_dir – Path to results directory containing forecast .mat files
config_file – Path to configuration JSON file
- Returns:
Dictionary containing ‘ppe_lambda_sum’ and ‘eepas_lambda_sum’
- Return type:
Notes
Forecast matrix format: (n_windows × n_mag_bins, n_cells + 1)
First column contains time_s index and should be excluded from Lambda sum
Lambda sum = Σ forecast[:, 1:] (exclude index column)
Expected: Lambda sum ≈ observed event count (within ±20%)
Forecast Format Conversion#
Convert EEPAS/PPE forecast matrices to PyCSEP-compatible format.
- class analysis.forecast_converter.EEPASForecastConverter(forecast_file, grid_file, num_regions=177, num_magnitude_steps=25, magnitude_min=5.0, magnitude_step=0.1, depth_min=0.0, depth_max=30.0, coordinate_transform=True, source_crs='EPSG:7794', target_crs='EPSG:4326', verbose=True)[source]#
EEPAS Forecast Format Converter
Features:
Load EEPAS/PPE MATLAB forecast files
Load grid definitions (CELLE_ter.mat) and perform coordinate transformation
Extract forecasts for specific time periods
Spatial downsampling (coarse grids → 0.1° sub-grids)
Export PyCSEP-compatible format
Examples
>>> # Basic usage >>> converter = EEPASForecastConverter( ... forecast_file='PREVISIONI_3m_EEPAS_2012_2022.mat', ... grid_file='CELLE_ter.mat' ... ) >>> >>> # Convert specific period >>> converter.convert_period( ... period=1, ... start_year=2012, ... output_file='forecast_period_1.dat' ... ) >>> >>> # Convert all periods (sum rates) >>> converter.convert_all_periods( ... start_year=2012, ... output_file='forecast_all_periods.dat' ... )
- __init__(forecast_file, grid_file, num_regions=177, num_magnitude_steps=25, magnitude_min=5.0, magnitude_step=0.1, depth_min=0.0, depth_max=30.0, coordinate_transform=True, source_crs='EPSG:7794', target_crs='EPSG:4326', verbose=True)[source]#
Initialize converter
- Parameters:
forecast_file – Path to EEPAS/PPE forecast .mat file
grid_file – Path to grid definition .mat file (CELLE_ter.mat)
num_regions – Number of spatial grids (default: 177 for Italy)
num_magnitude_steps – Number of magnitude bins (default: 25)
magnitude_min – Minimum magnitude (default: 5.0)
magnitude_step – Magnitude bin width (default: 0.1)
depth_min – Minimum depth in km (default: 0.0)
depth_max – Maximum depth in km (default: 30.0)
coordinate_transform – Enable coordinate transformation (default: True)
source_crs – Source coordinate system (default: ‘EPSG:7794’ for RDN2008)
target_crs – Target coordinate system (default: ‘EPSG:4326’ for WGS84)
verbose – Print verbose messages (default: True)
- extract_period(period_idx)[source]#
Extract forecast data for specific period
- Parameters:
period_idx – Period index (starting from 1)
- Returns:
Coarse-grid forecast data for the period
- Return type:
pd.DataFrame
- spatial_downsampling(data, grid_resolution=0.1)[source]#
Spatial downsampling: divide coarse grids into 0.1° × 0.1° sub-grids
Uses “centroid-in-bounding-box” method for spatial downsampling
- Parameters:
data – Coarse-grid forecast data (pd.DataFrame)
grid_resolution – Sub-grid resolution in degrees (default: 0.1)
- Returns:
Downsampled forecast data
- Return type:
pd.DataFrame
- aggregate_overlaps(data)[source]#
Aggregate overlapping grid points (duplicates from bounding box overlaps)
- Parameters:
data – Downsampled data (pd.DataFrame)
- Returns:
Aggregated data
- Return type:
pd.DataFrame
- convert_period(period, start_year=None, output_file=None, perform_downsampling=True, grid_resolution=0.1)[source]#
Convert specific period to PyCSEP format
- Parameters:
period – Period number (starting from 1)
start_year – Forecast start year (for time range calculation, optional)
output_file – Output file path (optional)
perform_downsampling – Enable spatial downsampling (default: True)
grid_resolution – Downsampling resolution in degrees (default: 0.1)
- Returns:
PyCSEP-format forecast data
- Return type:
pd.DataFrame
- convert_all_periods(start_period=1, end_period=None, start_year=None, output_file=None, perform_downsampling=True, grid_resolution=0.1)[source]#
Convert all periods and sum RATE
- Parameters:
start_period – Starting period (from 1)
end_period – Ending period (None = all)
start_year – Forecast start year (optional)
output_file – Output file path (optional)
perform_downsampling – Enable spatial downsampling
grid_resolution – Downsampling resolution in degrees
- Returns:
Summed forecast data for all periods
- Return type:
pd.DataFrame
- export_csep_format(data, output_file)[source]#
Export PyCSEP-compatible format
Format: LON_0 LON_1 LAT_0 LAT_1 Z_0 Z_1 MAG_0 MAG_1 RATE FLAG
- Parameters:
data – Forecast data (pd.DataFrame)
output_file – Output file path
- to_pycsep_forecast(data, start_date, end_date, name='EEPAS_Forecast')[source]#
Convert to PyCSEP GriddedForecast object
- Parameters:
data – Forecast data (pd.DataFrame)
start_date – Forecast start time (datetime or str)
end_date – Forecast end time (datetime or str)
name – Forecast name (str)
- Returns:
PyCSEP forecast object
- Return type:
csep.core.forecasts.GriddedForecast
Examples
>>> from datetime import datetime >>> forecast = converter.to_pycsep_forecast( ... data=period_data, ... start_date=datetime(2012, 1, 1), ... end_date=datetime(2012, 3, 31), ... name='EEPAS_2012_Q1' ... )
- calculate_period_dates(period, start_year, period_length_months=3)[source]#
Calculate start and end dates for specific period
- Parameters:
period – Period number (starting from 1)
start_year – Forecast start year
period_length_months – Months per period (default: 3)
- Returns:
(start_date, end_date) as datetime objects
- Return type:
Examples
>>> # 3-month periods >>> start, end = converter.calculate_period_dates(1, 2012, 3) >>> # 1-year periods >>> start, end = converter.calculate_period_dates(1, 2012, 12)
- analysis.forecast_converter.convert_eepas_forecast(forecast_file, grid_file, output_file, period=None, **kwargs)[source]#
Convenience function: quick EEPAS forecast conversion.
- Parameters:
- Returns:
Writes PyCSEP format file to output_file
- Return type:
None
Examples
>>> # Convert specific period >>> convert_eepas_forecast( ... 'PREVISIONI_3m_EEPAS_2012_2022.mat', ... 'CELLE_ter.mat', ... 'forecast_period_1.dat', ... period=1 ... ) >>> >>> # Convert all periods >>> convert_eepas_forecast( ... 'PREVISIONI_3m_EEPAS_2012_2022.mat', ... 'CELLE_ter.mat', ... 'forecast_all_periods.dat' ... )
PyCSEP Compatibility#
Patch pycsep for Shapely 2.x compatibility.
- analysis.patch_pycsep.patch_csep_regions()[source]#
Patch pycsep.core.regions for Shapely 2.x compatibility.
Automatically locates the pycsep installation and replaces deprecated Shapely API calls with compatible versions.
- Changes:
Old: joined_poly.boundary.xy
New: joined_poly.envelope.boundary.coords.xy
- Returns:
Prints status message and modifies pycsep installation file
- Return type:
None
- Raises:
Exception – If file cannot be read or written (permission issues)
References#
Christophersen, A., Rhoades, D. A., & Hainzl, S. (2024). Algorithmic Identification of the Precursory Scale Increase Phenomenon in Earthquake Catalogs. Seismological Research Letters, 95(6), 3464-3481.
See Also#
Core Modules - Core EEPAS modules
Examples and Tutorials - Analysis workflow examples