Analysis Modules#

Tools for \(\Psi\) phenomenon detection, deduplication, and scaling relations analysis.

\(\Psi\) Phenomenon Detection#

Detect precursory scale increase using the rectangular algorithm (Christophersen et al., 2024).


Deduplication#

Two-stage deduplication to remove duplicate \(\Psi\) identifications.

analysis.optimize_psi_results.optimize_psi_results(input_file='psi.ou4a', output_file='optimized_psi.ou4', t_decimals=3, mp_decimals=1, r_decimals=1, auto_relax=True, verbose=True, mode='round', t_tol=0.03, mp_tol=0.05, r_tol=0.25)[source]#

Apply Step 9 two-stage deduplication to Ψ identification results.

This implements the deduplication procedure from Christophersen et al. (2024, SRL 95(6):3464-3481):

Step 9.1: For same (eq_name, tmin, tp, mp) → keep max sloperatio (ties: min ap) Step 9.2: For same (eq_name, tp, r) → keep min ap

Two discretization modes:
  • mode=”round”: Discretize by rounding to decimal places (simple, controllable)

  • mode=”tolerance”: Discretize by binning with tolerance (RECOMMENDED, closer to physical resolution)

Output format: Fixed 14 columns in CANONICAL_14 order.

Parameters:
  • input_file (str) – Input .ou4a file with raw Ψ identifications

  • output_file (str) – Output .ou4 file with deduplicated results

  • t_decimals (int) – Decimal places for time (round mode)

  • mp_decimals (int) – Decimal places for magnitude (round mode)

  • r_decimals (int) – Decimal places for slope ratio (round mode)

  • auto_relax (bool) – Automatically relax precision if no reduction (round mode only)

  • verbose (bool) – Print detailed progress information

  • mode (str) – “round” or “tolerance”

  • t_tol (float) – Time tolerance in years (tolerance mode, default 0.03 ≈ 11 days)

  • mp_tol (float) – Magnitude tolerance (tolerance mode)

  • r_tol (float) – Slope ratio tolerance (tolerance mode)

Returns:

Tuple (n_initial, n_after_9.1, n_after_9.2) - counts at each stage


Scaling Relations#

Fixed-effects regression for initial parameter estimation.

analysis.plot_relations.analyze_scaling_relations(psi_file='optimized_psi.ou4', output_prefix='scaling_final')[source]#

Analyze Ψ phenomenon scaling relations using fixed-effects regression.

This function implements the two-stage estimation procedure to estimate initial values for EEPAS model parameters. It removes the within-mainshock AP-TP trade-off using fixed-effects regression, then fits scaling relations.

Parameters:
  • psi_file – Input file with deduplicated Ψ identifications (.ou4 format)

  • output_prefix – Prefix for output files (plots and CSV)

Algorithm:
  1. Read all Ψ identifications from file

  2. Convert units (years → days, km²) and compute logarithms

  3. Estimate fixed-effects common slopes (b_AT, b_TA) to remove trade-off

  4. Project each mainshock’s identifications to representative point

  5. Fit linear regressions on representative points: - log₁₀(AP) vs MP → b_A (spatial scaling), sigma_A - log₁₀(TP) vs MP → b_T, a_T (temporal scaling), sigma_T - Mm vs MP → b_M, a_M (magnitude scaling), sigma_M

  6. Generate plots with 95% prediction intervals

Output Files:
  • {output_prefix}_projected_points.csv: Representative points per mainshock

  • {output_prefix}_mp_relations.png: Plots of AP and TP vs MP (semi-log)

  • {output_prefix}_mm_mp.png: Plot of Mm vs MP (linear scale)

Returns:

None (writes output files directly)

Note

Estimated parameters (b_A, sigma_A, a_T, b_T, sigma_T, a_M, b_M, sigma_M) can be used as initial values for EEPAS learning.

References

Christophersen et al. (2024). Algorithmic Identification of the Precursory Scale Increase Phenomenon in Earthquake Catalogs. SRL 95(6):3464-3481.

analysis.plot_relations._fixed_effects_slope_safe(x, y, group, min_group_size=2)[source]#

Estimate common slope using fixed-effects regression (within-group demeaning).

Removes within-mainshock trade-off by demeaning x and y within each mainshock, then computing pooled slope across all demeaned observations.

Parameters:
  • x – Independent variable (array-like)

  • y – Dependent variable (array-like)

  • group – Group identifiers (mainshock names, array-like)

  • min_group_size – Minimum group size to include (default: 2)

Returns:

Common slope estimate (b_AT or b_TA). Falls back to pooled OLS if denominator is zero or no groups meet minimum size.


Utility Functions#

Dataset Extraction#

analysis.dataset.extract_period_forecast(period_idx, forecast_data, num_magnitude_steps, magnitude_bins, num_regions, cell_bounds)[source]#

Extract earthquake forecast data for specified 3-month period. (This function is essentially the same as your version)

Parameters:
  • period_idx – Period index (starting from 1) (int)

  • forecast_data – Complete forecast data matrix (ndarray)

  • num_magnitude_steps – Number of magnitude bins (int)

magnitude_binslist

List of magnitude bin tuples [(m0, m1), …]

num_regionsint

Number of spatial regions

cell_boundsDataFrame

Already converted to lon/lat DataFrame

Returns: pd.DataFrame

Forecast data for the specified period

analysis.dataset.create_subgrids_spatial(data, grid_resolution=0.1)[source]#

Divide large grid (possibly irregular lon/lat boundaries) into 0.1x0.1 sub-grids.

Using ‘Centroid-in-Bounding-Box’ method for spatial downscaling.

Parameters:
  • data – Coarse grid forecast data (pd.DataFrame)

  • grid_resolution – Sub-grid resolution in degrees (default: 0.1) (float, optional)

Returns: pd.DataFrame

Downscaled forecast data with sub-grids

Time Conversion#

analysis.decimal_time.decimal_time_precise(year, month, day, hour=0, minute=0, second=0, start=0)[source]#

Convert year-month-day-hour-minute-second to decimal time format, handling various cases including negative seconds

Parameters:
  • year – Year (can be scalar or array)

  • month – Month (can be scalar or array)

  • day – Day (can be scalar or array)

  • hour – Hour, minute, second (can be scalar or array)

  • minute – Hour, minute, second (can be scalar or array)

  • second – Hour, minute, second (can be scalar or array)

  • start – Starting year (default is 0)

Returns:

Decimal time format (relative to starting year)

analysis.decimal_time.ymd_time_precise(time, start)[source]#

Convert decimal time to year-month-day format, using datetime library

Parameters:
  • time – Decimal time (can be scalar or array)

  • start – Starting year

Returns:

Array containing year, month, day (each row corresponds to one time point)

Event Selection#

analysis.select_m5plus.select_events_with_options(catalog_file, output_file, min_mag=6.0, start_year=1988, end_year=2025, max_depth=40.0, MC=2.5, exclude_921_aftershocks=True)[source]#

Select earthquakes from the catalog based on specified criteria, with optional exclusion of 921 earthquake aftershocks.

Parameters:
  • catalog_file (str) – Input .mat file name.

  • output_file (str) – Output file name.

  • min_mag (float) – Minimum magnitude to filter.

  • start_year (int) – Starting year to filter (inclusive).

  • end_year (int) – Ending year to filter (exclusive).

  • max_depth (float) – Maximum depth to filter (exclusive), in kilometers.

  • MC (float) – A custom parameter, default is 2.5.

  • exclude_921_aftershocks (bool) – Whether to exclude 921 aftershocks, default is True.

Forecast Validation#

analysis.analyze_forecast_lambda.analyze_forecast_results(results_dir, config_file)[source]#

Analyze PPE and EEPAS forecast Lambda sum for validation.

This function validates forecast results by checking if the integrated rate density (Lambda sum) is consistent with the observed event count. For a properly calibrated forecast, the Lambda sum should be close to the number of observed events in the forecast period.

Parameters:
  • results_dir – Path to results directory containing forecast .mat files

  • config_file – Path to configuration JSON file

Returns:

Dictionary containing ‘ppe_lambda_sum’ and ‘eepas_lambda_sum’

Return type:

dict

Notes

  • Forecast matrix format: (n_windows × n_mag_bins, n_cells + 1)

  • First column contains time_s index and should be excluded from Lambda sum

  • Lambda sum = Σ forecast[:, 1:] (exclude index column)

  • Expected: Lambda sum ≈ observed event count (within ±20%)

Forecast Format Conversion#

Convert EEPAS/PPE forecast matrices to PyCSEP-compatible format.

class analysis.forecast_converter.EEPASForecastConverter(forecast_file, grid_file, num_regions=177, num_magnitude_steps=25, magnitude_min=5.0, magnitude_step=0.1, depth_min=0.0, depth_max=30.0, coordinate_transform=True, source_crs='EPSG:7794', target_crs='EPSG:4326', verbose=True)[source]#

EEPAS Forecast Format Converter

Features:

  1. Load EEPAS/PPE MATLAB forecast files

  2. Load grid definitions (CELLE_ter.mat) and perform coordinate transformation

  3. Extract forecasts for specific time periods

  4. Spatial downsampling (coarse grids → 0.1° sub-grids)

  5. Export PyCSEP-compatible format

Examples

>>> # Basic usage
>>> converter = EEPASForecastConverter(
...     forecast_file='PREVISIONI_3m_EEPAS_2012_2022.mat',
...     grid_file='CELLE_ter.mat'
... )
>>>
>>> # Convert specific period
>>> converter.convert_period(
...     period=1,
...     start_year=2012,
...     output_file='forecast_period_1.dat'
... )
>>>
>>> # Convert all periods (sum rates)
>>> converter.convert_all_periods(
...     start_year=2012,
...     output_file='forecast_all_periods.dat'
... )
__init__(forecast_file, grid_file, num_regions=177, num_magnitude_steps=25, magnitude_min=5.0, magnitude_step=0.1, depth_min=0.0, depth_max=30.0, coordinate_transform=True, source_crs='EPSG:7794', target_crs='EPSG:4326', verbose=True)[source]#

Initialize converter

Parameters:
  • forecast_file – Path to EEPAS/PPE forecast .mat file

  • grid_file – Path to grid definition .mat file (CELLE_ter.mat)

  • num_regions – Number of spatial grids (default: 177 for Italy)

  • num_magnitude_steps – Number of magnitude bins (default: 25)

  • magnitude_min – Minimum magnitude (default: 5.0)

  • magnitude_step – Magnitude bin width (default: 0.1)

  • depth_min – Minimum depth in km (default: 0.0)

  • depth_max – Maximum depth in km (default: 30.0)

  • coordinate_transform – Enable coordinate transformation (default: True)

  • source_crs – Source coordinate system (default: ‘EPSG:7794’ for RDN2008)

  • target_crs – Target coordinate system (default: ‘EPSG:4326’ for WGS84)

  • verbose – Print verbose messages (default: True)

extract_period(period_idx)[source]#

Extract forecast data for specific period

Parameters:

period_idx – Period index (starting from 1)

Returns:

Coarse-grid forecast data for the period

Return type:

pd.DataFrame

spatial_downsampling(data, grid_resolution=0.1)[source]#

Spatial downsampling: divide coarse grids into 0.1° × 0.1° sub-grids

Uses “centroid-in-bounding-box” method for spatial downsampling

Parameters:
  • data – Coarse-grid forecast data (pd.DataFrame)

  • grid_resolution – Sub-grid resolution in degrees (default: 0.1)

Returns:

Downsampled forecast data

Return type:

pd.DataFrame

aggregate_overlaps(data)[source]#

Aggregate overlapping grid points (duplicates from bounding box overlaps)

Parameters:

data – Downsampled data (pd.DataFrame)

Returns:

Aggregated data

Return type:

pd.DataFrame

convert_period(period, start_year=None, output_file=None, perform_downsampling=True, grid_resolution=0.1)[source]#

Convert specific period to PyCSEP format

Parameters:
  • period – Period number (starting from 1)

  • start_year – Forecast start year (for time range calculation, optional)

  • output_file – Output file path (optional)

  • perform_downsampling – Enable spatial downsampling (default: True)

  • grid_resolution – Downsampling resolution in degrees (default: 0.1)

Returns:

PyCSEP-format forecast data

Return type:

pd.DataFrame

convert_all_periods(start_period=1, end_period=None, start_year=None, output_file=None, perform_downsampling=True, grid_resolution=0.1)[source]#

Convert all periods and sum RATE

Parameters:
  • start_period – Starting period (from 1)

  • end_period – Ending period (None = all)

  • start_year – Forecast start year (optional)

  • output_file – Output file path (optional)

  • perform_downsampling – Enable spatial downsampling

  • grid_resolution – Downsampling resolution in degrees

Returns:

Summed forecast data for all periods

Return type:

pd.DataFrame

export_csep_format(data, output_file)[source]#

Export PyCSEP-compatible format

Format: LON_0 LON_1 LAT_0 LAT_1 Z_0 Z_1 MAG_0 MAG_1 RATE FLAG

Parameters:
  • data – Forecast data (pd.DataFrame)

  • output_file – Output file path

to_pycsep_forecast(data, start_date, end_date, name='EEPAS_Forecast')[source]#

Convert to PyCSEP GriddedForecast object

Parameters:
  • data – Forecast data (pd.DataFrame)

  • start_date – Forecast start time (datetime or str)

  • end_date – Forecast end time (datetime or str)

  • name – Forecast name (str)

Returns:

PyCSEP forecast object

Return type:

csep.core.forecasts.GriddedForecast

Examples

>>> from datetime import datetime
>>> forecast = converter.to_pycsep_forecast(
...     data=period_data,
...     start_date=datetime(2012, 1, 1),
...     end_date=datetime(2012, 3, 31),
...     name='EEPAS_2012_Q1'
... )
calculate_period_dates(period, start_year, period_length_months=3)[source]#

Calculate start and end dates for specific period

Parameters:
  • period – Period number (starting from 1)

  • start_year – Forecast start year

  • period_length_months – Months per period (default: 3)

Returns:

(start_date, end_date) as datetime objects

Return type:

tuple

Examples

>>> # 3-month periods
>>> start, end = converter.calculate_period_dates(1, 2012, 3)
>>> # 1-year periods
>>> start, end = converter.calculate_period_dates(1, 2012, 12)
analysis.forecast_converter.convert_eepas_forecast(forecast_file, grid_file, output_file, period=None, **kwargs)[source]#

Convenience function: quick EEPAS forecast conversion.

Parameters:
  • forecast_file (str) – EEPAS forecast .mat file path

  • grid_file (str) – Grid definition .mat file path

  • output_file (str) – Output PyCSEP format file path

  • period (int, optional) – Period number (None = all periods)

  • **kwargs – Additional parameters for EEPASForecastConverter

Returns:

Writes PyCSEP format file to output_file

Return type:

None

Examples

>>> # Convert specific period
>>> convert_eepas_forecast(
...     'PREVISIONI_3m_EEPAS_2012_2022.mat',
...     'CELLE_ter.mat',
...     'forecast_period_1.dat',
...     period=1
... )
>>>
>>> # Convert all periods
>>> convert_eepas_forecast(
...     'PREVISIONI_3m_EEPAS_2012_2022.mat',
...     'CELLE_ter.mat',
...     'forecast_all_periods.dat'
... )

PyCSEP Compatibility#

Patch pycsep for Shapely 2.x compatibility.

analysis.patch_pycsep.patch_csep_regions()[source]#

Patch pycsep.core.regions for Shapely 2.x compatibility.

Automatically locates the pycsep installation and replaces deprecated Shapely API calls with compatible versions.

Changes:
  • Old: joined_poly.boundary.xy

  • New: joined_poly.envelope.boundary.coords.xy

Returns:

Prints status message and modifies pycsep installation file

Return type:

None

Raises:

Exception – If file cannot be read or written (permission issues)


References#

Christophersen, A., Rhoades, D. A., & Hainzl, S. (2024). Algorithmic Identification of the Precursory Scale Increase Phenomenon in Earthquake Catalogs. Seismological Research Letters, 95(6), 3464-3481.


See Also#