Examining the \(\Psi\) Phenomenon in Italian Seismicity#
π Background: What is \(\Psi\)?#
:math:`Psi` is a precursory seismic activation pattern observed before large earthquakes. It represents an increase in the rate of smaller earthquakes (precursors) in the vicinity of an upcoming mainshock.
Physical Interpretation#
Stress accumulation: As tectonic stress builds up before a large earthquake, it can trigger smaller earthquakes in the surrounding region
Statistical significance: Not all small earthquakes before a large one are precursors - Ξ¨ analysis identifies statistically significant rate increases
EEPAS Foundation: The Ξ¨ phenomenon is the core motivation for the EEPAS model
π― This Notebookβs Objectives#
This comprehensive notebook demonstrates the complete Ξ¨ analysis workflow for Italian seismicity:
Part 1: Data Preparation#
Testing Region (R): 177 grid cells (CELLE_ter.mat) - where we evaluate forecasts
Neighborhood Region (N): CPTI15 polygon - larger region to avoid boundary effects
Catalog Filtering: Apply spatial, depth (β€40km), and temporal constraints
Part 2: Ξ¨ Identification#
Select Mβ₯5.45 mainshocks from 1990-2011 (training period)
Search for precursors in 400km radius before each mainshock
Optimize Ξ¨ parameters (
a_p, b_p, t_p, r_p) using grid searchApply significance tests (slope ratio, goodness-of-fit)
Part 3: Scaling Relations Analysis#
Magnitude scaling: How does precursor magnitude relate to mainshock magnitude?
Time scaling: How does the Ξ¨ duration relate to mainshock size?
Spatial scaling: How does the activation region size relate to mainshock?
π Workflow Context#
This notebook is Step 0 (preprocessing) in the EEPAS pipeline:
[Step 0] Ξ¨ Analysis (this notebook)
β Provides initial estimates for a_t, b_t
[Step 1] PPE Learning β a, d, s
[Step 2] Aftershock Fitting β Ξ½, ΞΊ
[Step 3] EEPAS Learning β am, bm, Sm, at, bt, St, ba, Sa, u
[Step 4] Forecast Generation
[Step 5] Evaluation (PyCSEP)
β οΈ Important Notes#
Testing vs. Neighborhood Region:
Testing (R): 177 cells - compute NLL here, evaluate forecasts here
Neighborhood (N): CPTI15 polygon - larger region including R + buffer
Why? Avoid boundary effects - precursors outside R can still trigger events in R
Coordinate System:
Uses RDN2008 (EPSG:7794) projection
Units: kilometers (km)
Must convert between lat/lon and projected coordinates
Time Representation:
Uses decimal years (e.g., 2009.2604 = April 5, 2009)
Critical for Ξ¨ analysis - need precise timing!
Letβs begin!
[1]:
!pip install pycsep -qq
ββββββββββββββββββββββββββββββββββββββββ 24.3/24.3 MB 25.5 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
ββββββββββββββββββββββββββββββββββββββββ 11.8/11.8 MB 99.9 MB/s eta 0:00:00
ββββββββββββββββββββββββββββββββββββββββ 14.5/14.5 MB 65.2 MB/s eta 0:00:00
ββββββββββββββββββββββββββββββββββββββββ 1.6/1.6 MB 59.7 MB/s eta 0:00:00
Building wheel for pycsep (setup.py) ... done
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-adk 1.20.0 requires sqlalchemy<3.0.0,>=2.0, but you have sqlalchemy 1.4.54 which is incompatible.
ipython-sql 0.5.0 requires sqlalchemy>=2.0, but you have sqlalchemy 1.4.54 which is incompatible.
[2]:
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
[3]:
!cp /content/drive/MyDrive/00_Image\ meeting/Researches/01_Scratchpad/earthquake/EEPAS_CODE/HORUS_40ter.mat /content/
!cp /content/drive/MyDrive/00_Image\ meeting/Researches/01_Scratchpad/earthquake/HORUS_Ita_Catalog/HORUS_Ita_Catalog.txt /content/
!cp /content/drive/MyDrive/00_Image\ meeting/Researches/01_Scratchpad/earthquake/EEPAS_CODE/CELLE_ter.mat /content/
!cp /content/drive/MyDrive/00_Image\ meeting/Researches/01_Scratchpad/earthquake/EEPAS_CODE/CPTI15.mat /content/
!cp /content/drive/MyDrive/00_Image\ meeting/Researches/01_Scratchpad/earthquake/EEPAS-master/analysis/dataset.py /content/
!cp /content/drive/MyDrive/00_Image\ meeting/Researches/01_Scratchpad/earthquake/EEPAS-master/analysis/select_m5plus.py /content/
!cp /content/drive/MyDrive/00_Image\ meeting/Researches/01_Scratchpad/earthquake/EEPAS-master/analysis/decimal_time.py /content/
!cp /content/drive/MyDrive/00_Image\ meeting/Researches/01_Scratchpad/earthquake/EEPAS-master/analysis/optimize_psi_results.py /content/
!cp /content/drive/MyDrive/00_Image\ meeting/Researches/01_Scratchpad/earthquake/EEPAS-master/analysis/optimize_psi_working.py /content/
!cp /content/drive/MyDrive/00_Image\ meeting/Researches/01_Scratchpad/earthquake/EEPAS-master/analysis/plot_relations.py /content/
!cp /content/drive/MyDrive/00_Image\ meeting/Researches/01_Scratchpad/earthquake/EEPAS-master/utils/coordinate_transform.py /content/
[4]:
import os
import pandas as pd
import numpy as np
from tqdm import tqdm
import scipy
import scipy.io
from scipy.io import loadmat, savemat
from matplotlib.patches import Rectangle, Polygon
import matplotlib.dates as mdates
from matplotlib.path import Path
import subprocess
import matplotlib.font_manager as fm
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from pyproj import Transformer
from decimal_time import decimal_time_precise
from optimize_psi_working import optimize_psi, _load_catalog
from optimize_psi_results import optimize_psi_results
from select_m5plus import select_events_with_options
from plot_relations import analyze_scaling_relations
[5]:
mat_data = scipy.io.loadmat("/content/CELLE_ter.mat")
aa = mat_data['CELLESD']
aa.shape, aa[:2]
[5]:
((177, 10),
array([[6.86787804e+03, 6.91030445e+03, 5.15785699e+03, 5.20028340e+03,
1.00000000e+00, 2.69000000e+02, 6.88909124e+03, 5.17907019e+03,
1.00000000e+00, 4.12925132e-03],
[6.91030445e+03, 6.95273085e+03, 5.15785699e+03, 5.20028340e+03,
1.00000000e+00, 2.99000000e+02, 6.93151765e+03, 5.17907019e+03,
2.00000000e+00, 2.45036680e-03]]))
[6]:
np.savetxt('output.csv', aa[:,:4], delimiter=',')
[7]:
mat_data = scipy.io.loadmat("/content/CPTI15.mat")
mat_data['cpti15']
[7]:
array([[1.63980000e+01, 3.50500000e+01, 3.87196399e+03, 7.40126811e+03],
[1.19620000e+01, 3.50500000e+01, 3.86922477e+03, 6.99653642e+03],
[6.79600000e+00, 3.88930000e+01, 4.31304309e+03, 6.54906317e+03],
[6.80700000e+00, 4.32580000e+01, 4.79729044e+03, 6.57893081e+03],
[6.19400000e+00, 4.48420000e+01, 4.97633921e+03, 6.54164299e+03],
[6.06700000e+00, 4.62310000e+01, 5.13119466e+03, 6.54304825e+03],
[7.19400000e+00, 4.67090000e+01, 5.17834900e+03, 6.63307178e+03],
[8.99400000e+00, 4.67240000e+01, 5.17318387e+03, 6.77055195e+03],
[1.06860000e+01, 4.75450000e+01, 5.26077193e+03, 6.90123231e+03],
[1.19570000e+01, 4.75550000e+01, 5.26104730e+03, 6.99676847e+03],
[1.48610000e+01, 4.65210000e+01, 5.15024019e+03, 7.21919918e+03],
[1.47940000e+01, 4.38640000e+01, 4.85521402e+03, 7.22427410e+03],
[1.79530000e+01, 4.16670000e+01, 4.62486146e+03, 7.49513393e+03],
[1.96010000e+01, 3.97210000e+01, 4.41964146e+03, 7.65108514e+03],
[1.63980000e+01, 3.50500000e+01, 3.87196399e+03, 7.40126811e+03]])
Testing Region vs Neighborhood Region#
[8]:
italy_polygon = mat_data['cpti15']
# Extract polygon coordinates
polygon_lon = italy_polygon[:, 0] # Longitude
polygon_lat = italy_polygon[:, 1] # Latitude
polygon_x_km = italy_polygon[:, 3] # X coordinate (eastward, km)
polygon_y_km = italy_polygon[:, 2] # Y coordinate (northward, km)
polygon_x_m = polygon_x_km * 1000 # Convert to meters
polygon_y_m = polygon_y_km * 1000 # Convert to meters
print("="*60)
print("Italy Analysis Polygon Information")
print("="*60)
print(f"Polygon vertices: {len(italy_polygon)}")
print(f"Longitude range: {polygon_lon.min():.2f}Β°E to {polygon_lon.max():.2f}Β°E")
print(f"Latitude range: {polygon_lat.min():.2f}Β°N to {polygon_lat.max():.2f}Β°N")
print(f"X range: {polygon_x_km.min():.2f} to {polygon_x_km.max():.2f} km")
print(f"Y range: {polygon_y_km.min():.2f} to {polygon_y_km.max():.2f} km")
# Read CSV
df = pd.read_csv('output.csv', header=None)
# Convert unit: km β m
data_m = df.values * 1000
print(f"\n{'='*60}")
print("Grid Cell Information")
print("="*60)
print(f"Total number of cells: {len(data_m)}")
print(f"X range (m): {data_m[:, 0].min():.0f} to {data_m[:, 1].max():.0f}")
print(f"Y range (m): {data_m[:, 2].min():.0f} to {data_m[:, 3].max():.0f}")
# Create coordinate transformer: EPSG:7794 β EPSG:4326 (WGS84)
transformer = Transformer.from_crs("EPSG:7794", "EPSG:4326", always_xy=True)
# Convert coordinates and save
print("\nConverting coordinates to lat/lon...")
cells_lonlat = []
for i in tqdm(range(len(data_m)), desc="Processing cells"):
x_min, x_max, y_min, y_max = data_m[i]
# Convert four corners to lat/lon
lon_min, lat_min = transformer.transform(x_min, y_min)
lon_max, lat_max = transformer.transform(x_max, y_max)
cells_lonlat.append({
'cell_id': i + 1,
'x_min_m': x_min,
'x_max_m': x_max,
'y_min_m': y_min,
'y_max_m': y_max,
'lon_min': lon_min,
'lon_max': lon_max,
'lat_min': lat_min,
'lat_max': lat_max,
'lon_center': (lon_min + lon_max) / 2,
'lat_center': (lat_min + lat_max) / 2
})
cells_df = pd.DataFrame(cells_lonlat)
cells_df.to_csv('italy_cells_coordinates.csv', index=False)
print(f"β Coordinates saved to 'italy_cells_coordinates.csv'")
# Define RDN2008 projection (EPSG:7794)
rdn2008 = ccrs.epsg(7794)
# ============= Figure 1: Main map with selected labels and polygon =============
print("\n" + "="*60)
print("Creating Figure 1: Main map with selected labels and polygon...")
print("="*60)
fig = plt.figure(figsize=(50, 16))
ax = plt.axes(projection=rdn2008)
# Set extent (in meters)
x_min, x_max = data_m[:, 0].min() - 50000, data_m[:, 1].max() + 50000
y_min, y_max = data_m[:, 2].min() - 400000, data_m[:, 3].max() + 900000
ax.set_extent([x_min, x_max, y_min, y_max], crs=rdn2008)
# Add map features
print("Adding map features...")
ax.add_feature(cfeature.LAND, facecolor='lightgray', alpha=0.3, zorder=0)
ax.add_feature(cfeature.OCEAN, facecolor='lightblue', alpha=0.3, zorder=0)
ax.add_feature(cfeature.COASTLINE, linewidth=1.2, edgecolor='black', zorder=1)
ax.add_feature(cfeature.BORDERS, linewidth=0.8, linestyle='--', edgecolor='gray', zorder=1)
italy_poly = Polygon(list(zip(polygon_x_m, polygon_y_m)),
closed=True,
facecolor='yellow',
alpha=0.15,
edgecolor='red',
linewidth=3,
transform=rdn2008,
zorder=1.5,
clip_on=False)
ax.add_patch(italy_poly)
# Draw grid cells
print("Drawing grid cells...")
for i in tqdm(range(len(data_m)), desc="Drawing cells"):
x_min_cell, x_max_cell, y_min_cell, y_max_cell = data_m[i]
rect = Rectangle(
(x_min_cell, y_min_cell),
x_max_cell - x_min_cell,
y_max_cell - y_min_cell,
linewidth=0.6,
edgecolor='black',
facecolor='none',
alpha=0.8,
transform=rdn2008,
zorder=2
)
ax.add_patch(rect)
# Label every 15 cells
if i % 15 == 0:
x_center = (x_min_cell + x_max_cell) / 2
y_center = (y_min_cell + y_max_cell) / 2
ax.text(x_center, y_center, str(i + 1),
fontsize=6, ha='center', va='center',
transform=rdn2008,
bbox=dict(boxstyle='round,pad=0.2', facecolor='white',
alpha=0.7, edgecolor='none'),
zorder=3)
# Add gridlines
gl = ax.gridlines(draw_labels=True, linewidth=0.5, color='gray',
alpha=0.5, linestyle='--', crs=ccrs.PlateCarree())
gl.top_labels = False
gl.right_labels = False
# Add legend
ax.legend(loc='upper right', fontsize=10)
ax.set_title(f'Italian Territory Grid Tessellation ({len(data_m)} cells)\n'
f'Projection: RDN2008 (EPSG:7794)\n'
f'Cell side = 30β2 km β 42.43 km',
fontsize=13, fontweight='bold', pad=15)
#plt.tight_layout()
print("Saving Figure 1...")
plt.savefig('italy_grid_with_polygon.png', dpi=300, bbox_inches='tight')
print("β Saved: italy_grid_with_polygon.png")
plt.show()
# ============= Figure 2: Map with all cell IDs and polygon =============
print("\n" + "="*60)
print("Creating Figure 2: Map with all cell IDs and polygon...")
print("="*60)
fig2 = plt.figure(figsize=(16, 16))
ax2 = plt.axes(projection=rdn2008)
ax2.set_extent([x_min, x_max, y_min, y_max], crs=rdn2008)
print("Adding map features...")
ax2.add_feature(cfeature.LAND, facecolor='wheat', alpha=0.3)
ax2.add_feature(cfeature.OCEAN, facecolor='azure', alpha=0.5)
ax2.add_feature(cfeature.COASTLINE, linewidth=1, edgecolor='black')
ax2.add_feature(cfeature.BORDERS, linewidth=0.6, linestyle=':', edgecolor='gray')
# Draw Italy polygon boundary
print("Drawing Italy polygon boundary...")
ax2.plot(polygon_x_m, polygon_y_m,
color='red', linewidth=3, linestyle='-',
transform=rdn2008, zorder=4, label='Italy analysis region')
ax2.fill(polygon_x_m, polygon_y_m,
color='yellow', alpha=0.1, transform=rdn2008, zorder=1.5)
# Draw all cells with numbers
print("Drawing all cells with labels...")
for i in tqdm(range(len(data_m)), desc="Drawing labeled cells"):
x_min_cell, x_max_cell, y_min_cell, y_max_cell = data_m[i]
# Cell border
rect = Rectangle(
(x_min_cell, y_min_cell),
x_max_cell - x_min_cell,
y_max_cell - y_min_cell,
linewidth=0.8,
edgecolor='black',
facecolor='white',
alpha=0.6,
transform=rdn2008
)
ax2.add_patch(rect)
# Label all cells (starting from 1)
x_center = (x_min_cell + x_max_cell) / 2
y_center = (y_min_cell + y_max_cell) / 2
ax2.text(x_center, y_center, str(i + 1),
fontsize=5, ha='center', va='center',
transform=rdn2008,
color='darkred', fontweight='bold')
# Add legend
ax2.legend(loc='upper right', fontsize=10)
ax2.set_title(f'Italian Analysis Region - All {len(data_m)} Cell IDs\n'
f'with Analysis Polygon Boundary (RDN2008 Projection)',
fontsize=14, fontweight='bold', pad=15)
plt.tight_layout()
print("Saving Figure 2...")
plt.savefig('italy_grid_all_numbers_with_polygon.png', dpi=300, bbox_inches='tight')
print("β Saved: italy_grid_all_numbers_with_polygon.png")
plt.show()
print("\n" + "="*60)
print("β All tasks completed!")
print("="*60)
print("\nGenerated files:")
print(" 1. italy_cells_coordinates.csv - Cell coordinates (lat/lon + projected)")
print(" 2. italy_grid_with_polygon.png - Main map with polygon boundary")
print(" 3. italy_grid_all_numbers_with_polygon.png - All cell IDs with polygon")
print(f"\nCell ID range: 1 to {len(data_m)}")
print(f"Polygon vertices: {len(italy_polygon)}")
============================================================
Italy Analysis Polygon Information
============================================================
Polygon vertices: 15
Longitude range: 6.07Β°E to 19.60Β°E
Latitude range: 35.05Β°N to 47.55Β°N
X range: 6541.64 to 7651.09 km
Y range: 3869.22 to 5261.05 km
============================================================
Grid Cell Information
============================================================
Total number of cells: 177
X range (m): 6570893 to 7546701
Y range (m): 4054770 to 5200283
Converting coordinates to lat/lon...
Processing cells: 100%|ββββββββββ| 177/177 [00:00<00:00, 82451.33it/s]
β Coordinates saved to 'italy_cells_coordinates.csv'
============================================================
Creating Figure 1: Main map with selected labels and polygon...
============================================================
Adding map features...
Drawing grid cells...
Drawing cells: 100%|ββββββββββ| 177/177 [00:00<00:00, 1422.82it/s]
/tmp/ipython-input-3694189746.py:135: UserWarning: No artists with labels found to put in legend. Note that artists whose label start with an underscore are ignored when legend() is called with no argument.
ax.legend(loc='upper right', fontsize=10)
Saving Figure 1...
/usr/local/lib/python3.12/dist-packages/cartopy/io/__init__.py:242: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_land.zip
warnings.warn(f'Downloading: {url}', DownloadWarning)
/usr/local/lib/python3.12/dist-packages/cartopy/io/__init__.py:242: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_ocean.zip
warnings.warn(f'Downloading: {url}', DownloadWarning)
/usr/local/lib/python3.12/dist-packages/cartopy/io/__init__.py:242: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_coastline.zip
warnings.warn(f'Downloading: {url}', DownloadWarning)
/usr/local/lib/python3.12/dist-packages/cartopy/io/__init__.py:242: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_cultural/ne_10m_admin_0_boundary_lines_land.zip
warnings.warn(f'Downloading: {url}', DownloadWarning)
β Saved: italy_grid_with_polygon.png
============================================================
Creating Figure 2: Map with all cell IDs and polygon...
============================================================
Adding map features...
Drawing Italy polygon boundary...
Drawing all cells with labels...
Drawing labeled cells: 100%|ββββββββββ| 177/177 [00:00<00:00, 719.94it/s]
Saving Figure 2...
β Saved: italy_grid_all_numbers_with_polygon.png
============================================================
β All tasks completed!
============================================================
Generated files:
1. italy_cells_coordinates.csv - Cell coordinates (lat/lon + projected)
2. italy_grid_with_polygon.png - Main map with polygon boundary
3. italy_grid_all_numbers_with_polygon.png - All cell IDs with polygon
Cell ID range: 1 to 177
Polygon vertices: 15
Transform coordinates#
[9]:
# 1. Define column names
# Note: Here we assume the 10th column of 'HORUS_Ita_Catalog_o.txt' is ML
columns = [
"Year", "Month", "Day", "Hour", "Minute", "Second",
"Latitude", "Longitude", "Depth", "Magnitude", # <-- Assuming the 10th column is Mw
"sigMw", "Geo-Ita", "Geo-CPTI15", "Ev. type", "Iside n."
]
# 2. Read the tab-separated .txt file
# Please replace 'HORUS_Ita_Catalog_o.txt' with your actual file path
# We skip the first header line (header=0, then skiprows=[0])
try:
df = pd.read_csv(
'HORUS_Ita_Catalog.txt',
delim_whitespace=True, # Use whitespace (including tab) as a delimiter
names=columns,
comment='#', # Assume comments start with # (if any)
skiprows=1 # Skip the original header line
)
# 3. Select the 10 fields required for your target format
target_columns = ['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second', 'Latitude', 'Longitude', 'Depth', 'Magnitude']
df_target = df[target_columns]
# 4. Convert the data to a NumPy array (often required for MATLAB format)
# and create a dictionary
mat_data = {
'HORUS': df_target.to_numpy()
# You can also pass the DataFrame directly, but converting to an array is closer to a MATLAB matrix
# mat_data_df = {'catalog_data': df_target}
}
# 5. Save as a .mat file
savemat("HORUS.mat", mat_data)
print("File 'HORUS.mat' saved successfully!")
print("\nFirst 5 records of the data:")
print(df_target.head())
except FileNotFoundError:
print("Error: 'HORUS_Ita_Catalog.txt' file not found.")
except Exception as e:
print(f"An error occurred during processing: {e}")
/tmp/ipython-input-863598739.py:13: FutureWarning: The 'delim_whitespace' keyword in pd.read_csv is deprecated and will be removed in a future version. Use ``sep='\s+'`` instead
df = pd.read_csv(
File 'HORUS.mat' saved successfully!
First 5 records of the data:
Year Month Day Hour Minute Second Latitude Longitude Depth \
0 1960.0 1.0 3.0 20.0 19.0 34.0 39.3000 15.3000 290.0
1 1960.0 1.0 4.0 9.0 20.0 0.0 43.1333 13.1667 0.0
2 1960.0 1.0 6.0 15.0 17.0 34.0 46.4833 12.7000 4.0
3 1960.0 1.0 6.0 15.0 20.0 53.0 46.4667 12.7000 0.0
4 1960.0 1.0 6.0 15.0 31.0 0.0 46.4333 12.7500 0.0
Magnitude
0 6.34
1 3.94
2 4.69
3 4.14
4 3.00
/tmp/ipython-input-863598739.py:13: DtypeWarning: Columns (13) have mixed types. Specify dtype option on import or set low_memory=False.
df = pd.read_csv(
[10]:
df_target
[10]:
| Year | Month | Day | Hour | Minute | Second | Latitude | Longitude | Depth | Magnitude | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1960.0 | 1.0 | 3.0 | 20.0 | 19.0 | 34.00 | 39.3000 | 15.3000 | 290.0 | 6.34 |
| 1 | 1960.0 | 1.0 | 4.0 | 9.0 | 20.0 | 0.00 | 43.1333 | 13.1667 | 0.0 | 3.94 |
| 2 | 1960.0 | 1.0 | 6.0 | 15.0 | 17.0 | 34.00 | 46.4833 | 12.7000 | 4.0 | 4.69 |
| 3 | 1960.0 | 1.0 | 6.0 | 15.0 | 20.0 | 53.00 | 46.4667 | 12.7000 | 0.0 | 4.14 |
| 4 | 1960.0 | 1.0 | 6.0 | 15.0 | 31.0 | 0.00 | 46.4333 | 12.7500 | 0.0 | 3.00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 493413 | 2024.0 | 6.0 | 23.0 | 4.0 | 45.0 | 40.63 | 42.8875 | 12.8883 | 11.9 | 0.66 |
| 493414 | 2024.0 | 6.0 | 23.0 | 4.0 | 51.0 | 14.70 | 46.1982 | 12.4182 | 9.3 | 1.29 |
| 493415 | 2024.0 | 6.0 | 23.0 | 6.0 | 26.0 | 39.32 | 38.0970 | 15.0805 | 9.5 | 1.74 |
| 493416 | 2024.0 | 6.0 | 23.0 | 7.0 | 7.0 | 42.56 | 46.1515 | 10.4637 | 8.7 | 1.49 |
| 493417 | 2024.0 | 6.0 | 23.0 | 7.0 | 51.0 | 56.07 | 42.9032 | 12.9262 | 11.0 | 1.57 |
493418 rows Γ 10 columns
[11]:
!python coordinate_transform.py \
--horus-in HORUS.mat \
--celle-in CELLE_ter.mat \
--horus-out HORUS_Italy_RDN2008.mat \
--celle-out CELLE_Italy_RDN2008.mat
============================================================
Coordinate Transformation Tool
============================================================
Target Coordinate System: RDN2008 Italy Zone (epsg:7794)
Description: Italy national projected coordinate system
Region: RDN2008 Italy Zone
============================================================
β Coordinate transformer created: epsg:4326 -> epsg:7794
Processing HORUS file: HORUS.mat
Original data shape: (493418, 10)
Coordinate range:
Latitude: [34.12, 48.80]Β°N
Longitude: [5.11, 20.00]Β°E
β οΈ Warning: 498 coordinates out of bounds for RDN2008 Italy Zone, transformation may be inaccurate
Invalid range: lat [34.12, 48.80]Β°N, lon [5.11, 19.30]Β°E
Expected range: lat [35, 48]Β°N, lon [6, 20]Β°E
Transformed coordinate range (km):
Easting (X): [6386.26, 7713.82] km
Northing (Y): [3775.51, 5408.56] km
β Data written to HORUS_Italy_RDN2008.mat
Variable name `HORUS` preserved, original lat/lon replaced with projected coordinates (km)
Processing CELLE file: CELLE_ter.mat
β Error processing CELLE file: 'Variable `CELLE` not found in CELLE_ter.mat'
============================================================
β Coordinate transformation completed!
============================================================
Obtain data in the testing region.
[12]:
# ============= 1. Reading Grid Cell Definitions =============
print("="*60)
print("Reading grid cell definitions...")
print("="*60)
cells_df = pd.read_csv('output.csv', header=None)
cells_df.columns = ['x_min_km', 'x_max_km', 'y_min_km', 'y_max_km']
cells_df['cell_id'] = range(1, len(cells_df) + 1)
print(f"Total grid cells: {len(cells_df)}")
print(f"X range: [{cells_df['x_min_km'].min():.2f}, {cells_df['x_max_km'].max():.2f}] km")
print(f"Y range: [{cells_df['y_min_km'].min():.2f}, {cells_df['y_max_km'].max():.2f}] km")
# ============= 2. Reading Earthquake Catalog A (Used for Filtering) =============
print("\n" + "="*60)
print("Reading earthquake catalog A...")
print("="*60)
mat_data_A = scipy.io.loadmat("/content/HORUS_Italy_RDN2008.mat")
var_name = "HORUS"
catalog_data_A = mat_data_A[var_name]
# First 10 columns
columns_A = ['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second',
'Latitude', 'Longitude', 'Depth', 'Magnitude']
df_catalog_A = pd.DataFrame(catalog_data_A[:, :10], columns=columns_A)
# Create datetime column
df_catalog_A['datetime'] = pd.to_datetime(
df_catalog_A[['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second']].astype(int),
errors='coerce'
)
df_catalog_A.dropna(subset=['datetime'], inplace=True)
print(f"Total earthquakes in catalog A: {len(df_catalog_A)}")
print(f"Longitude (x) range: [{df_catalog_A['Longitude'].min():.2f}, {df_catalog_A['Longitude'].max():.2f}] km")
print(f"Latitude (y) range: [{df_catalog_A['Latitude'].min():.2f}, {df_catalog_A['Latitude'].max():.2f}] km")
print(f"Magnitude range: [{df_catalog_A['Magnitude'].min():.2f}, {df_catalog_A['Magnitude'].max():.2f}]")
print(f"Depth range: [{df_catalog_A['Depth'].min():.2f}, {df_catalog_A['Depth'].max():.2f}] km")
print(f"Year range: [{df_catalog_A['Year'].min():.0f}, {df_catalog_A['Year'].max():.0f}]")
# ============= 3. Reading Earthquake Catalog B (Full Data) =============
print("\n" + "="*60)
print("Reading earthquake catalog B...")
print("="*60)
mat_data_B = scipy.io.loadmat("/content/HORUS.mat")
catalog_data_B = mat_data_B[var_name]
print(f"Total earthquakes in catalog B: {catalog_data_B.shape[0]}")
print(f"Number of columns in B: {catalog_data_B.shape[1]}")
# Check if the number of records in A and B are consistent
if catalog_data_A.shape[0] != catalog_data_B.shape[0]:
print(f"β Warning: A has {catalog_data_A.shape[0]} rows, B has {catalog_data_B.shape[0]} rows")
else:
print(f"β A and B have the same number of rows: {catalog_data_A.shape[0]}")
# ============= 4. Vectorized Filtering (Using A's Coordinates) =============
print("\n" + "="*60)
print("Filtering earthquakes within grid cells (vectorized)...")
print("="*60)
# Use A's coordinates for filtering
eq_y = df_catalog_A['Latitude'].values # y coordinate
eq_x = df_catalog_A['Longitude'].values # x coordinate
# Extract grid boundaries
cell_x_min = cells_df['x_min_km'].values
cell_x_max = cells_df['x_max_km'].values
cell_y_min = cells_df['y_min_km'].values
cell_y_max = cells_df['y_max_km'].values
# Vectorized computation
print("Computing spatial containment matrix...")
in_x = (eq_x[:, None] >= cell_x_min[None, :]) & (eq_x[:, None] <= cell_x_max[None, :])
in_y = (eq_y[:, None] >= cell_y_min[None, :]) & (eq_y[:, None] <= cell_y_max[None, :])
in_cell = in_x & in_y
print("Finding cell IDs for each earthquake...")
cell_ids = np.where(in_cell.any(axis=1),
in_cell.argmax(axis=1) + 1,
-1)
df_catalog_A['cell_id'] = cell_ids
# Get indices inside the grid (spatial filter)
spatial_mask = cell_ids > 0
print(f"\n{'='*60}")
print(f"β Spatial filtering completed!")
print(f"{'='*60}")
print(f"Total earthquakes: {len(df_catalog_A)}")
print(f"Earthquakes within grid cells: {spatial_mask.sum()}")
print(f"Earthquakes outside grid cells: {(~spatial_mask).sum()}")
print(f"Percentage within cells: {spatial_mask.sum()/len(df_catalog_A)*100:.2f}%")
# ============= 5. Adding Depth and Time Filters =============
print("\n" + "="*60)
print("Applying depth and time filters...")
print("="*60)
# Depth filter: Depth β€ 40 km
depth_mask = df_catalog_A['Depth'] <= 40.0
print(f"Earthquakes with depth β€ 40 km: {depth_mask.sum()} ({depth_mask.sum()/len(df_catalog_A)*100:.2f}%)")
# Time filter: Year <= 2021
time_mask = df_catalog_A['Year'] <= 2021
print(f"Earthquakes β€ 2021: {time_mask.sum()} ({time_mask.sum()/len(df_catalog_A)*100:.2f}%)")
# Combine all filter conditions
combined_mask = spatial_mask & depth_mask & time_mask
combined_indices = np.where(combined_mask)[0]
print(f"\n{'='*60}")
print(f"β All filters applied!")
print(f"{'='*60}")
print(f"Total earthquakes: {len(df_catalog_A)}")
print(f"After spatial filter: {spatial_mask.sum()}")
print(f"After depth filter (β€40km): {(spatial_mask & depth_mask).sum()}")
print(f"After time filter (β€2021): {combined_mask.sum()}")
print(f"Final percentage: {combined_mask.sum()/len(df_catalog_A)*100:.2f}%")
# ============= 6. Filtering A and B using Combined Criteria =============
print(f"\n{'='*60}")
print("Filtering both A and B using combined criteria...")
print(f"{'='*60}")
# Filter A
df_A_filtered = df_catalog_A[combined_mask].copy()
print(f"β Filtered A: {len(df_A_filtered)} earthquakes")
# Filter B (using the same indices)
catalog_B_filtered = catalog_data_B[combined_indices, :]
print(f"β Filtered B: {catalog_B_filtered.shape[0]} earthquakes, {catalog_B_filtered.shape[1]} columns")
# ============= 7. Statistics =============
print(f"\n{'='*60}")
print("Statistics of filtered earthquakes:")
print(f"{'='*60}")
print(f"Magnitude range: [{df_A_filtered['Magnitude'].min():.2f}, {df_A_filtered['Magnitude'].max():.2f}]")
print(f"Depth range: [{df_A_filtered['Depth'].min():.2f}, {df_A_filtered['Depth'].max():.2f}] km")
print(f"Year range: [{df_A_filtered['Year'].min():.0f}, {df_A_filtered['Year'].max():.0f}]")
print(f"Time range: {df_A_filtered['datetime'].min()} to {df_A_filtered['datetime'].max()}")
# Statistics of earthquake count per cell
cell_counts = df_A_filtered['cell_id'].value_counts().sort_index()
print(f"\nNumber of cells with earthquakes: {len(cell_counts)}")
print(f"Earthquakes per cell - mean: {cell_counts.mean():.1f}, median: {cell_counts.median():.1f}")
print(f"Max earthquakes in a single cell: {cell_counts.max()}")
# Display the top 10 cells with the most earthquakes
print("\nTop 10 cells with most earthquakes:")
print(cell_counts.head(10))
# ============= 8. Saving Results =============
print(f"\n{'='*60}")
print("Saving results...")
print(f"{'='*60}")
# Save A (CSV) - without cell_id
output_A_columns = ['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second',
'Latitude', 'Longitude', 'Depth', 'Magnitude']
df_A_filtered[output_A_columns].to_csv('HORUS_Italy_filtered.csv', index=False)
print("β Saved: HORUS_Italy_filtered.csv (without cell_id)")
# Save A (MAT) - without cell_id
output_A_array = df_A_filtered[output_A_columns].to_numpy()
mat_output_A = {
'HORUS': output_A_array,
'column_names': output_A_columns
}
scipy.io.savemat('HORUS_Italy_RDN2008_filtered.mat', mat_output_A)
print("β Saved: HORUS_Italy_RDN2008_filtered.mat (without cell_id)")
# Save B (MAT) - without adding cell_id
mat_output_B = {
'HORUS': catalog_B_filtered
}
scipy.io.savemat('HORUS_Italy_filtered.mat', mat_output_B)
print("β Saved: HORUS_Italy_filtered.mat (without cell_id)")
print(f" B columns: {catalog_data_B.shape[1]} (all original columns)")
# Save B (CSV) - if the number of columns is not too large
if catalog_B_filtered.shape[1] <= 20:
b_columns = [f'col_{i}' for i in range(catalog_B_filtered.shape[1])]
b_columns[0] = 'Year'
b_columns[6] = 'Latitude'
b_columns[7] = 'Longitude'
b_columns[8] = 'Depth'
df_B_filtered = pd.DataFrame(catalog_B_filtered, columns=b_columns)
df_B_filtered.to_csv('HORUS_Italy_filtered_full.csv', index=False)
print("β Saved: HORUS_Italy_filtered_full.csv")
else:
print(f"β B has {catalog_B_filtered.shape[1]} columns, skipping CSV export (too many columns)")
# Save cell statistics (Important!)
cell_stats = pd.DataFrame({
'cell_id': cell_counts.index,
'earthquake_count': cell_counts.values
})
cell_stats = cell_stats.merge(
cells_df[['cell_id', 'x_min_km', 'x_max_km', 'y_min_km', 'y_max_km']],
on='cell_id',
how='left'
)
cell_stats.to_csv('cell_earthquake_counts.csv', index=False)
print("β Saved: cell_earthquake_counts.csv")
print(f"\n{'='*60}")
print("β All tasks completed!")
print(f"{'='*60}")
print("\nFilter criteria applied:")
print(" 1. Spatial: Within 177 grid cells")
print(" 2. Depth: β€ 40 km")
print(" 3. Time: β€ 2021")
print("\nGenerated files:")
print(" 1. HORUS_Italy_filtered.csv - Filtered catalog (10 columns, no cell_id)")
print(" 2. HORUS_Italy_RDN2008_filtered.mat - Filtered catalog (MAT, no cell_id)")
print(" 3. HORUS_Italy_filtered.mat - Full filtered catalog (MAT, no cell_id)")
if catalog_B_filtered.shape[1] <= 20:
print(" 4. HORUS_Italy_filtered_full.csv - Full filtered catalog (CSV)")
print(f" {4 if catalog_B_filtered.shape[1] <= 20 else 3}. cell_earthquake_counts.csv - Statistics per cell")
# ============= 9. Display Preview =============
print(f"\n{'='*60}")
print("Preview of filtered data (without cell_id):")
print(f"{'='*60}")
print(df_A_filtered[output_A_columns].head(10))
print(f"\n{'='*60}")
print("Filtering summary by criteria:")
print(f"{'='*60}")
print(f"Original total: {len(df_catalog_A)}")
print(f"After spatial filter: {spatial_mask.sum()} (-{len(df_catalog_A) - spatial_mask.sum()})")
print(f"After depth filter: {(spatial_mask & depth_mask).sum()} (-{spatial_mask.sum() - (spatial_mask & depth_mask).sum()})")
print(f"After time filter: {combined_mask.sum()} (-{(spatial_mask & depth_mask).sum() - combined_mask.sum()})")
print(f"Final retained: {combined_mask.sum()} ({combined_mask.sum()/len(df_catalog_A)*100:.2f}%)")
============================================================
Reading grid cell definitions...
============================================================
Total grid cells: 177
X range: [6570.89, 7546.70] km
Y range: [4054.77, 5200.28] km
============================================================
Reading earthquake catalog A...
============================================================
Total earthquakes in catalog A: 493418
Longitude (x) range: [6386.26, 7713.82] km
Latitude (y) range: [3775.51, 5408.56] km
Magnitude range: [-1.13, 6.81]
Depth range: [-0.20, 630.80] km
Year range: [1960, 2024]
============================================================
Reading earthquake catalog B...
============================================================
Total earthquakes in catalog B: 493418
Number of columns in B: 10
β A and B have the same number of rows: 493418
============================================================
Filtering earthquakes within grid cells (vectorized)...
============================================================
Computing spatial containment matrix...
Finding cell IDs for each earthquake...
============================================================
β Spatial filtering completed!
============================================================
Total earthquakes: 493418
Earthquakes within grid cells: 464318
Earthquakes outside grid cells: 29100
Percentage within cells: 94.10%
============================================================
Applying depth and time filters...
============================================================
Earthquakes with depth β€ 40 km: 483478 (97.99%)
Earthquakes β€ 2021: 450431 (91.29%)
============================================================
β All filters applied!
============================================================
Total earthquakes: 493418
After spatial filter: 464318
After depth filter (β€40km): 457646
After time filter (β€2021): 418257
Final percentage: 84.77%
============================================================
Filtering both A and B using combined criteria...
============================================================
β Filtered A: 418257 earthquakes
β Filtered B: 418257 earthquakes, 10 columns
============================================================
Statistics of filtered earthquakes:
============================================================
Magnitude range: [-1.13, 6.81]
Depth range: [-0.20, 40.00] km
Year range: [1960, 2021]
Time range: 1960-01-04 09:20:00 to 2021-12-31 23:37:19
Number of cells with earthquakes: 177
Earthquakes per cell - mean: 2363.0, median: 496.0
Max earthquakes in a single cell: 79696
Top 10 cells with most earthquakes:
cell_id
1 312
2 112
3 44
4 99
5 687
6 179
7 2459
8 5291
9 675
10 77
Name: count, dtype: int64
============================================================
Saving results...
============================================================
β Saved: HORUS_Italy_filtered.csv (without cell_id)
β Saved: HORUS_Italy_RDN2008_filtered.mat (without cell_id)
β Saved: HORUS_Italy_filtered.mat (without cell_id)
B columns: 10 (all original columns)
β Saved: HORUS_Italy_filtered_full.csv
β Saved: cell_earthquake_counts.csv
============================================================
β All tasks completed!
============================================================
Filter criteria applied:
1. Spatial: Within 177 grid cells
2. Depth: β€ 40 km
3. Time: β€ 2021
Generated files:
1. HORUS_Italy_filtered.csv - Filtered catalog (10 columns, no cell_id)
2. HORUS_Italy_RDN2008_filtered.mat - Filtered catalog (MAT, no cell_id)
3. HORUS_Italy_filtered.mat - Full filtered catalog (MAT, no cell_id)
4. HORUS_Italy_filtered_full.csv - Full filtered catalog (CSV)
4. cell_earthquake_counts.csv - Statistics per cell
============================================================
Preview of filtered data (without cell_id):
============================================================
Year Month Day Hour Minute Second Latitude Longitude Depth \
1 1960.0 1.0 4.0 9.0 20.0 0.0 4771.022277 7094.785982 0.0
2 1960.0 1.0 6.0 15.0 17.0 34.0 5142.320681 7053.669572 4.0
3 1960.0 1.0 6.0 15.0 20.0 53.0 5140.478190 7053.685894 0.0
4 1960.0 1.0 6.0 15.0 31.0 0.0 5136.806207 7057.555763 0.0
5 1960.0 1.0 6.0 15.0 45.0 0.0 5136.806207 7057.555763 0.0
6 1960.0 1.0 6.0 18.0 29.0 0.0 5142.229703 7042.168970 0.0
7 1960.0 1.0 6.0 20.0 0.0 0.0 5136.806207 7057.555763 0.0
8 1960.0 1.0 6.0 22.0 20.0 0.0 5140.478190 7053.685894 0.0
9 1960.0 1.0 7.0 1.0 41.0 20.0 5140.478190 7053.685894 0.0
10 1960.0 1.0 7.0 1.0 42.0 22.0 5127.503317 7051.241153 0.0
Magnitude
1 3.94
2 4.69
3 4.14
4 3.00
5 3.00
6 3.00
7 3.00
8 2.70
9 3.80
10 3.80
============================================================
Filtering summary by criteria:
============================================================
Original total: 493418
After spatial filter: 464318 (-29100)
After depth filter: 457646 (-6672)
After time filter: 418257 (-39389)
Final retained: 418257 (84.77%)
Obtain data in neighborhood region
[13]:
# ============= 1. Reading Polygon Definition =============
print("="*60)
print("Reading polygon definition...")
print("="*60)
# Read the polygon for the Italy analysis region
mat_data = scipy.io.loadmat("/content/CPTI15.mat")
italy_polygon = mat_data['cpti15']
# Extract polygon coordinates (using km units)
polygon_x_km = italy_polygon[:, 3] # X coordinate (eastward)
polygon_y_km = italy_polygon[:, 2] # Y coordinate (northward)
# Create a matplotlib Path object to check if a point is inside the polygon
polygon_points = np.column_stack([polygon_x_km, polygon_y_km])
polygon_path = Path(polygon_points)
print(f"Polygon vertices: {len(italy_polygon)}")
print(f"X range: [{polygon_x_km.min():.2f}, {polygon_x_km.max():.2f}] km")
print(f"Y range: [{polygon_y_km.min():.2f}, {polygon_y_km.max():.2f}] km")
# ============= 2. Reading Grid Cell Definitions (For Statistics) =============
print("\n" + "="*60)
print("Reading grid cell definitions...")
print("="*60)
cells_df = pd.read_csv('output.csv', header=None)
cells_df.columns = ['x_min_km', 'x_max_km', 'y_min_km', 'y_max_km']
cells_df['cell_id'] = range(1, len(cells_df) + 1)
print(f"Total grid cells: {len(cells_df)}")
print(f"X range: [{cells_df['x_min_km'].min():.2f}, {cells_df['x_max_km'].max():.2f}] km")
print(f"Y range: [{cells_df['y_min_km'].min():.2f}, {cells_df['y_max_km'].max():.2f}] km")
# ============= 3. Reading Earthquake Catalog A (Used for Filtering) =============
print("\n" + "="*60)
print("Reading earthquake catalog A...")
print("="*60)
mat_data_A = scipy.io.loadmat("/content/HORUS_Italy_RDN2008.mat")
var_name = "HORUS"
catalog_data_A = mat_data_A[var_name]
# First 10 columns
columns_A = ['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second',
'Latitude', 'Longitude', 'Depth', 'Magnitude']
df_catalog_A = pd.DataFrame(catalog_data_A[:, :10], columns=columns_A)
# Create datetime column
df_catalog_A['datetime'] = pd.to_datetime(
df_catalog_A[['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second']].astype(int),
errors='coerce'
)
df_catalog_A.dropna(subset=['datetime'], inplace=True)
print(f"Total earthquakes in catalog A: {len(df_catalog_A)}")
print(f"Longitude (x) range: [{df_catalog_A['Longitude'].min():.2f}, {df_catalog_A['Longitude'].max():.2f}] km")
print(f"Latitude (y) range: [{df_catalog_A['Latitude'].min():.2f}, {df_catalog_A['Latitude'].max():.2f}] km")
print(f"Magnitude range: [{df_catalog_A['Magnitude'].min():.2f}, {df_catalog_A['Magnitude'].max():.2f}]")
print(f"Depth range: [{df_catalog_A['Depth'].min():.2f}, {df_catalog_A['Depth'].max():.2f}] km")
print(f"Year range: [{df_catalog_A['Year'].min():.0f}, {df_catalog_A['Year'].max():.0f}]")
# ============= 4. Reading Earthquake Catalog B (Full Data) =============
print("\n" + "="*60)
print("Reading earthquake catalog B...")
print("="*60)
mat_data_B = scipy.io.loadmat("/content/HORUS.mat")
catalog_data_B = mat_data_B[var_name]
print(f"Total earthquakes in catalog B: {catalog_data_B.shape[0]}")
print(f"Number of columns in B: {catalog_data_B.shape[1]}")
# Check if the number of records in A and B are consistent
if catalog_data_A.shape[0] != catalog_data_B.shape[0]:
print(f"β Warning: A has {catalog_data_A.shape[0]} rows, B has {catalog_data_B.shape[0]} rows")
else:
print(f"β A and B have the same number of rows: {catalog_data_A.shape[0]}")
# ============= 5. Spatial Filtering (Using Polygon) =============
print("\n" + "="*60)
print("Filtering earthquakes within polygon (vectorized)...")
print("="*60)
# Extract earthquake coordinates (km)
eq_x = df_catalog_A['Longitude'].values # x coordinate
eq_y = df_catalog_A['Latitude'].values # y coordinate
eq_points = np.column_stack([eq_x, eq_y])
# Use matplotlib Path to check if points are inside the polygon (very fast!)
print("Checking if earthquakes are within polygon...")
spatial_mask = polygon_path.contains_points(eq_points)
print(f"\n{'='*60}")
print(f"β Spatial filtering (polygon) completed!")
print(f"{'='*60}")
print(f"Total earthquakes: {len(df_catalog_A)}")
print(f"Earthquakes within polygon: {spatial_mask.sum()}")
print(f"Earthquakes outside polygon: {(~spatial_mask).sum()}")
print(f"Percentage within polygon: {spatial_mask.sum()/len(df_catalog_A)*100:.2f}%")
# ============= 6. Assigning Cell IDs to Earthquakes (For Statistics Only) =============
print("\n" + "="*60)
print("Assigning cell IDs to earthquakes (for statistics only)...")
print("="*60)
# Extract cell boundaries
cell_x_min = cells_df['x_min_km'].values
cell_x_max = cells_df['x_max_km'].values
cell_y_min = cells_df['y_min_km'].values
cell_y_max = cells_df['y_max_km'].values
# Vectorized computation (finding which cell each earthquake is inside)
print("Computing cell assignments...")
in_x = (eq_x[:, None] >= cell_x_min[None, :]) & (eq_x[:, None] <= cell_x_max[None, :])
in_y = (eq_y[:, None] >= cell_y_min[None, :]) & (eq_y[:, None] <= cell_y_max[None, :])
in_cell = in_x & in_y
cell_ids = np.where(in_cell.any(axis=1),
in_cell.argmax(axis=1) + 1,
-1) # -1 indicates not inside any cell
df_catalog_A['cell_id'] = cell_ids
# Count earthquakes inside the polygon but outside any cell
in_polygon_not_in_cell = spatial_mask & (cell_ids == -1)
print(f"Earthquakes in polygon but not in any cell: {in_polygon_not_in_cell.sum()}")
# ============= 7. Adding Depth and Time Filters =============
print("\n" + "="*60)
print("Applying depth and time filters...")
print("="*60)
# Depth filter: Depth β€ 40 km
depth_mask = df_catalog_A['Depth'] <= 40.0
print(f"Earthquakes with depth β€ 40 km: {depth_mask.sum()} ({depth_mask.sum()/len(df_catalog_A)*100:.2f}%)")
# Time filter: Year <= 2021
time_mask = df_catalog_A['Year'] <= 2021
print(f"Earthquakes β€ 2021: {time_mask.sum()} ({time_mask.sum()/len(df_catalog_A)*100:.2f}%)")
# Combine all filter conditions
combined_mask = spatial_mask & depth_mask & time_mask
combined_indices = np.where(combined_mask)[0]
print(f"\n{'='*60}")
print(f"β All filters applied!")
print(f"{'='*60}")
print(f"Total earthquakes: {len(df_catalog_A)}")
print(f"After spatial filter (polygon): {spatial_mask.sum()}")
print(f"After depth filter (β€40km): {(spatial_mask & depth_mask).sum()}")
print(f"After time filter (β€2021): {combined_mask.sum()}")
print(f"Final percentage: {combined_mask.sum()/len(df_catalog_A)*100:.2f}%")
# ============= 8. Filtering A and B using Combined Criteria =============
print(f"\n{'='*60}")
print("Filtering both A and B using combined criteria...")
print(f"{'='*60}")
# Filter A
df_A_filtered = df_catalog_A[combined_mask].copy()
print(f"β Filtered A: {len(df_A_filtered)} earthquakes")
# Filter B (using the same indices)
catalog_B_filtered = catalog_data_B[combined_indices, :]
print(f"β Filtered B: {catalog_B_filtered.shape[0]} earthquakes, {catalog_B_filtered.shape[1]} columns")
# ============= 9. Statistics =============
print(f"\n{'='*60}")
print("Statistics of filtered earthquakes:")
print(f"{'='*60}")
print(f"Magnitude range: [{df_A_filtered['Magnitude'].min():.2f}, {df_A_filtered['Magnitude'].max():.2f}]")
print(f"Depth range: [{df_A_filtered['Depth'].min():.2f}, {df_A_filtered['Depth'].max():.2f}] km")
print(f"Year range: [{df_A_filtered['Year'].min():.0f}, {df_A_filtered['Year'].max():.0f}]")
print(f"Time range: {df_A_filtered['datetime'].min()} to {df_A_filtered['datetime'].max()}")
# Statistics of earthquake count per cell (only count those with a cell_id > 0)
df_with_cell = df_A_filtered[df_A_filtered['cell_id'] > 0]
if len(df_with_cell) > 0:
cell_counts = df_with_cell['cell_id'].value_counts().sort_index()
print(f"\nNumber of cells with earthquakes: {len(cell_counts)}")
print(f"Earthquakes per cell - mean: {cell_counts.mean():.1f}, median: {cell_counts.median():.1f}")
print(f"Max earthquakes in a single cell: {cell_counts.max()}")
# Display the top 10 cells with the most earthquakes
print("\nTop 10 cells with most earthquakes:")
print(cell_counts.head(10))
else:
print("\nβ No earthquakes assigned to cells")
# ============= 10. Saving Results =============
print(f"\n{'='*60}")
print("Saving results...")
print(f"{'='*60}")
# Save A (CSV) - without cell_id
output_A_columns = ['Year', 'Month', 'Day', 'Hour', 'Minute', 'Second',
'Latitude', 'Longitude', 'Depth', 'Magnitude']
df_A_filtered[output_A_columns].to_csv('HORUS_Italy_polygon_filtered.csv', index=False)
print("β Saved: HORUS_Italy_polygon_filtered.csv (without cell_id)")
# Save A (MAT) - without cell_id
output_A_array = df_A_filtered[output_A_columns].to_numpy()
mat_output_A = {
'HORUS': output_A_array,
'column_names': output_A_columns
}
scipy.io.savemat('HORUS_Italy_RDN2008_polygon_filtered.mat', mat_output_A)
print("β Saved: HORUS_Italy_RDN2008_polygon_filtered.mat (without cell_id)")
# Save B (MAT) - without adding cell_id
mat_output_B = {
'HORUS': catalog_B_filtered
}
scipy.io.savemat('HORUS_Italy_polygon_filtered.mat', mat_output_B)
print("β Saved: HORUS_Italy_polygon_filtered.mat (without cell_id)")
print(f" B columns: {catalog_B_filtered.shape[1]} (all original columns)")
# Save B (CSV) - if the number of columns is not too large
if catalog_B_filtered.shape[1] <= 20:
b_columns = [f'col_{i}' for i in range(catalog_B_filtered.shape[1])]
b_columns[0] = 'Year'
b_columns[6] = 'Latitude'
b_columns[7] = 'Longitude'
b_columns[8] = 'Depth'
df_B_filtered = pd.DataFrame(catalog_B_filtered, columns=b_columns)
df_B_filtered.to_csv('HORUS_Italy_polygon_filtered_full.csv', index=False)
print("β Saved: HORUS_Italy_polygon_filtered_full.csv")
else:
print(f"β B has {catalog_B_filtered.shape[1]} columns, skipping CSV export (too many columns)")
# Save cell statistics (Important!)
if len(df_with_cell) > 0:
cell_stats = pd.DataFrame({
'cell_id': cell_counts.index,
'earthquake_count': cell_counts.values
})
cell_stats = cell_stats.merge(
cells_df[['cell_id', 'x_min_km', 'x_max_km', 'y_min_km', 'y_max_km']],
on='cell_id',
how='left'
)
cell_stats.to_csv('cell_earthquake_counts_polygon.csv', index=False)
print("β Saved: cell_earthquake_counts_polygon.csv")
print(f"\n{'='*60}")
print("β All tasks completed!")
print(f"{'='*60}")
print("\nFilter criteria applied:")
print(" 1. Spatial: Within Italy analysis polygon")
print(" 2. Depth: β€ 40 km")
print(" 3. Time: β€ 2021")
print("\nGenerated files:")
print(" 1. HORUS_Italy_polygon_filtered.csv - Filtered catalog A (10 columns, no cell_id)")
print(" 2. HORUS_Italy_RDN2008_polygon_filtered.mat - Filtered catalog A (MAT, no cell_id)")
print(" 3. HORUS_Italy_polygon_filtered.mat - Full filtered catalog B (MAT, no cell_id)")
if catalog_B_filtered.shape[1] <= 20:
print(" 4. HORUS_Italy_polygon_filtered_full.csv - Full filtered catalog B (CSV)")
if len(df_with_cell) > 0:
print(f" {4 if catalog_B_filtered.shape[1] <= 20 else 3}. cell_earthquake_counts_polygon.csv - Statistics per cell")
# ============= 11. Display Preview =============
print(f"\n{'='*60}")
print("Preview of filtered data (without cell_id):")
print(f"{'='*60}")
print(df_A_filtered[output_A_columns].head(10))
print(f"\n{'='*60}")
print("Filtering summary:")
print(f"{'='*60}")
print(f"Original total: {len(df_catalog_A)}")
print(f"After spatial filter (polygon): {spatial_mask.sum()} (-{len(df_catalog_A) - spatial_mask.sum()})")
print(f"After depth filter: {(spatial_mask & depth_mask).sum()} (-{spatial_mask.sum() - (spatial_mask & depth_mask).sum()})")
print(f"After time filter: {combined_mask.sum()} (-{(spatial_mask & depth_mask).sum() - combined_mask.sum()})")
print(f"Final retained: {combined_mask.sum()} ({combined_mask.sum()/len(df_catalog_A)*100:.2f}%)")
print(f"\nEarthquakes in polygon with cell assignment: {(df_A_filtered['cell_id'] > 0).sum()}")
print(f"Earthquakes in polygon without cell assignment: {(df_A_filtered['cell_id'] == -1).sum()}")
============================================================
Reading polygon definition...
============================================================
Polygon vertices: 15
X range: [6541.64, 7651.09] km
Y range: [3869.22, 5261.05] km
============================================================
Reading grid cell definitions...
============================================================
Total grid cells: 177
X range: [6570.89, 7546.70] km
Y range: [4054.77, 5200.28] km
============================================================
Reading earthquake catalog A...
============================================================
Total earthquakes in catalog A: 493418
Longitude (x) range: [6386.26, 7713.82] km
Latitude (y) range: [3775.51, 5408.56] km
Magnitude range: [-1.13, 6.81]
Depth range: [-0.20, 630.80] km
Year range: [1960, 2024]
============================================================
Reading earthquake catalog B...
============================================================
Total earthquakes in catalog B: 493418
Number of columns in B: 10
β A and B have the same number of rows: 493418
============================================================
Filtering earthquakes within polygon (vectorized)...
============================================================
Checking if earthquakes are within polygon...
============================================================
β Spatial filtering (polygon) completed!
============================================================
Total earthquakes: 493418
Earthquakes within polygon: 489616
Earthquakes outside polygon: 3802
Percentage within polygon: 99.23%
============================================================
Assigning cell IDs to earthquakes (for statistics only)...
============================================================
Computing cell assignments...
Earthquakes in polygon but not in any cell: 25298
============================================================
Applying depth and time filters...
============================================================
Earthquakes with depth β€ 40 km: 483478 (97.99%)
Earthquakes β€ 2021: 450431 (91.29%)
============================================================
β All filters applied!
============================================================
Total earthquakes: 493418
After spatial filter (polygon): 489616
After depth filter (β€40km): 479947
After time filter (β€2021): 438192
Final percentage: 88.81%
============================================================
Filtering both A and B using combined criteria...
============================================================
β Filtered A: 438192 earthquakes
β Filtered B: 438192 earthquakes, 10 columns
============================================================
Statistics of filtered earthquakes:
============================================================
Magnitude range: [-1.13, 6.81]
Depth range: [-0.20, 40.00] km
Year range: [1960, 2021]
Time range: 1960-01-04 09:20:00 to 2021-12-31 23:37:19
Number of cells with earthquakes: 177
Earthquakes per cell - mean: 2363.0, median: 496.0
Max earthquakes in a single cell: 79696
Top 10 cells with most earthquakes:
cell_id
1 312
2 112
3 44
4 99
5 687
6 179
7 2459
8 5291
9 675
10 77
Name: count, dtype: int64
============================================================
Saving results...
============================================================
β Saved: HORUS_Italy_polygon_filtered.csv (without cell_id)
β Saved: HORUS_Italy_RDN2008_polygon_filtered.mat (without cell_id)
β Saved: HORUS_Italy_polygon_filtered.mat (without cell_id)
B columns: 10 (all original columns)
β Saved: HORUS_Italy_polygon_filtered_full.csv
β Saved: cell_earthquake_counts_polygon.csv
============================================================
β All tasks completed!
============================================================
Filter criteria applied:
1. Spatial: Within Italy analysis polygon
2. Depth: β€ 40 km
3. Time: β€ 2021
Generated files:
1. HORUS_Italy_polygon_filtered.csv - Filtered catalog A (10 columns, no cell_id)
2. HORUS_Italy_RDN2008_polygon_filtered.mat - Filtered catalog A (MAT, no cell_id)
3. HORUS_Italy_polygon_filtered.mat - Full filtered catalog B (MAT, no cell_id)
4. HORUS_Italy_polygon_filtered_full.csv - Full filtered catalog B (CSV)
4. cell_earthquake_counts_polygon.csv - Statistics per cell
============================================================
Preview of filtered data (without cell_id):
============================================================
Year Month Day Hour Minute Second Latitude Longitude Depth \
1 1960.0 1.0 4.0 9.0 20.0 0.0 4771.022277 7094.785982 0.0
2 1960.0 1.0 6.0 15.0 17.0 34.0 5142.320681 7053.669572 4.0
3 1960.0 1.0 6.0 15.0 20.0 53.0 5140.478190 7053.685894 0.0
4 1960.0 1.0 6.0 15.0 31.0 0.0 5136.806207 7057.555763 0.0
5 1960.0 1.0 6.0 15.0 45.0 0.0 5136.806207 7057.555763 0.0
6 1960.0 1.0 6.0 18.0 29.0 0.0 5142.229703 7042.168970 0.0
7 1960.0 1.0 6.0 20.0 0.0 0.0 5136.806207 7057.555763 0.0
8 1960.0 1.0 6.0 22.0 20.0 0.0 5140.478190 7053.685894 0.0
9 1960.0 1.0 7.0 1.0 41.0 20.0 5140.478190 7053.685894 0.0
10 1960.0 1.0 7.0 1.0 42.0 22.0 5127.503317 7051.241153 0.0
Magnitude
1 3.94
2 4.69
3 4.14
4 3.00
5 3.00
6 3.00
7 3.00
8 2.70
9 3.80
10 3.80
============================================================
Filtering summary:
============================================================
Original total: 493418
After spatial filter (polygon): 489616 (-3802)
After depth filter: 479947 (-9669)
After time filter: 438192 (-41755)
Final retained: 438192 (88.81%)
Earthquakes in polygon with cell assignment: 418257
Earthquakes in polygon without cell assignment: 19935
π Understanding Testing Region (R) vs. Neighborhood Region (N)#
The code above visualizes two critical spatial concepts in EEPAS:
Testing Region (R) - 177 Grid Cells#
Purpose: Where we evaluate the model and compute negative log-likelihood (NLL)
Definition: 177 square cells, each ~42.43 km Γ 42.43 km
Mathematical: R = {cells where target events are summed and integrated}
Result: Only events inside these 177 cells are counted as targets
Neighborhood Region (N) - CPTI15 Polygon#
Purpose: Where we search for source events (precursors, PPE sources)
Definition: Polygon boundary encompassing Italy + buffer zone
Why larger than R?: Avoid boundary effects
Precursors outside R can still trigger events inside R
PPE kernels have spatial extent (d β 30 km)
Result: Events in (N Β R) can be sources but not targets
Key Statistics from Above#
Region |
Area |
Events (filtered) |
Percentage |
|---|---|---|---|
Testing (R) |
177 cells |
418,257 |
84.77% |
Neighborhood (N) |
Polygon |
438,192 |
88.81% |
N Β R (buffer) |
19,935 |
4.04% |
Important: The 19,935 events in the buffer zone are crucial for avoiding edge effects!
Lon/Lat Bounds (from output above)#
Testing cells: [6.55Β°E, 18.44Β°E] Γ [36.65Β°N, 47.01Β°N]
This covers most of mainland Italy and surrounding seas
\(\Psi\) Analysis#
[14]:
# --- Step 1: Read and Prepare Mainshock List ---
# Use testing Region
earthquake_catalog_file = '/content/HORUS_Italy_RDN2008_filtered.mat'
output_filename = 'selectm7plusbrun7.out'
magnitude_threshold = 5.45
year_start_threshold = 1990
year_end_threshold = 2012 # New end year parameter
mc_parameter = 2.95
# Set to True to exclude the 921 aftershocks, set to False otherwise
exclude_aftershocks_option = False
select_events_with_options(
catalog_file=earthquake_catalog_file,
output_file=output_filename,
min_mag=magnitude_threshold,
start_year=year_start_threshold,
end_year=year_end_threshold,
MC=mc_parameter,
exclude_921_aftershocks=exclude_aftershocks_option
)
# Use mc = 2.5 (This comment seems misplaced or incomplete in the original context)
# Read the mainshock list generated by select_m5plus
# Use the regular expression r'\s+' to handle multiple spaces as separators
try:
ca_data = pd.read_csv("selectm7plusbrun7.out", sep=r'\s+', header=None)
except FileNotFoundError:
print("Error: 'selectm7plusbrun7.out' file not found. Please run select_m5plus() first.")
exit()
Successfully loaded variable 'HORUS' from '/content/HORUS_Italy_RDN2008_filtered.mat'.
921 aftershock exclusion disabled.
Selected 11 M5.45+ earthquakes (year >= 1990 and < 2012, depth < 40.0km, 921 aftershocks not excluded) and wrote to 'selectm7plusbrun7.out'
[15]:
# Use neighborhood region
earthquake_catalog_file = '/content/HORUS_Italy_RDN2008_polygon_filtered.mat'
400km#
π Ξ¨ Analysis Results Summary#
The analysis above identified Ξ¨ precursory patterns before 11 mainshocks (M β₯ 5.45) in Italy (1990-2011).
Next Steps: Scaling Relations#
The generated plots (final_scaling_*.png) visualize three key relationships:
Magnitude Scaling (M_mainshock vs. M_precursor):
Used to estimate a_m, b_m in EEPAS
Formula: M_mainshock = a_m + b_mΒ·M_precursor
Time Scaling (logββ(T_precursor) vs. M_precursor):
Used to estimate a_t, b_t in EEPAS
Formula: logββ(T) = a_t + b_tΒ·M_precursor
Spatial Scaling (Ο_space vs. M_precursor):
Used to estimate b_a in EEPAS
Formula: ΟΒ² = Ο_AΒ²Β·10^(b_aΒ·m_i)
These initial estimates bootstrap the EEPAS learning process in Step 3!
[16]:
# Extract fields
na = ca_data[0].astype(str).values
catalog = ca_data[1].astype(str).values
y1 = ca_data[2].astype(float).values
y3 = ca_data[5].astype(float).values
y4 = ca_data[8].astype(float).values
m1 = ca_data[3].astype(float).values
m3 = ca_data[6].astype(float).values
m4 = ca_data[9].astype(float).values
d1 = ca_data[4].astype(float).values
d3 = ca_data[7].astype(float).values
d4 = ca_data[10].astype(float).values
long = ca_data[12].astype(float).values
lat = ca_data[11].astype(float).values
co = ca_data[13].astype(float).values
ms_h = ca_data[14].astype(float).values
ms_m = ca_data[15].astype(float).values
ms_s = ca_data[16].astype(float).values
m_m = ca_data[17].astype(float).values
# --- Step 2: Clean up old output files ---
output_files = ["testopt.ou4a", "testopt.ou5a", "optimized_psi.ou4", "optimized_psi.ou5"]
for f in output_files:
if os.path.exists(f):
os.remove(f)
print(f"Deleted old file: {f}")
# --- Step 3: Run PSI Analysis for Each Earthquake ---
print("\n=== Starting PSI Analysis ===")
print("Note: Using absolute time of mainshock instead of relative time")
for i in range(len(na)):
print(f"\nProcessing earthquake {i+1}/{len(na)}: {na[i]}")
# Set depth range
dp_rg = (-1, 40)
# Calculate absolute time (Important correction!)
# Mainshock time (including hour, minute, second)
t3_abs = decimal_time_precise(
int(y3[i]), int(m3[i]), int(d3[i]),
int(ms_h[i]), int(ms_m[i]), ms_s[i]
)
# Start and end times (year, month, day only)
t1_abs = decimal_time_precise(int(y1[i]), int(m1[i]), int(d1[i]))
t4_abs = decimal_time_precise(int(y4[i]), int(m4[i]), int(d4[i]))
# Calculate time window
tprior = t3_abs - t1_abs # Search time before mainshock
tpost = t4_abs - t3_abs # Time after mainshock (though not used by the algorithm)
print(f" Mainshock Absolute Time: {t3_abs:.4f}")
print(f" Search Time Window: {tprior:.2f} years")
print(f" Location: ({lat[i]:.2f}, {long[i]:.2f}), Magnitude: {m_m[i]}")
catsel = earthquake_catalog_file
# Execute PSI Analysis
try:
optimize_psi(
main_time=t3_abs, # Use absolute time!
tprior=tprior, # Length of time to search backward
tpost=tpost, # Time backward (unused)
loc=(long[i], lat[i]),
radius=400, # Use radius from the paper
depth_range=dp_rg,
catalog=_load_catalog(catsel),
heading="testopt", # Fixed output prefix
cutoff=co[i],
mm=m_m[i],
eq_name=na[i],
tp_ratio_min=0.4,
tp_ratio_max=0.6
)
print(f" β Analysis completed for {na[i]}")
except Exception as e:
print(f" β Error: {str(e)}")
print("\n=== PSI Analysis Completed ===")
# --- Step 4: Execute Result Optimization (Step 9) ---
print("\n=== Executing Step 9 Optimization (tolerance mode) ===")
n0, n91, n92 = optimize_psi_results(
input_file="testopt.ou4a",
output_file="optimized_psi.ou4",
mode="tolerance",
t_tol=0.01, # β 3.65 days β Equivalent to t_decimals=2
mp_tol=0.10, # β Equivalent to mp_decimals=1
r_tol=1.00, # β Equivalent to r_decimals=0 (integer)
verbose=True, auto_relax=False
)
print(f"[Step9] Original {n0} β After 9.1 {n91} β After 9.2 {n92}")
=== Starting PSI Analysis ===
Note: Using absolute time of mainshock instead of relative time
Processing earthquake 1/11: K7EQ1
Mainshock Absolute Time: 1990.3405
Search Time Window: 30.34 years
Location: (4502.14, 7327.87), Magnitude: 5.77
β Analysis completed for K7EQ1
Processing earthquake 2/11: K7EQ2
Mainshock Absolute Time: 1997.7342
Search Time Window: 30.73 years
Location: (4758.49, 7072.52), Magnitude: 5.66
β Analysis completed for K7EQ2
Processing earthquake 3/11: K7EQ3
Mainshock Absolute Time: 1997.7353
Search Time Window: 30.74 years
Location: (4757.56, 7069.50), Magnitude: 5.97
β Analysis completed for K7EQ3
Processing earthquake 4/11: K7EQ4
Mainshock Absolute Time: 1997.7643
Search Time Window: 30.76 years
Location: (4758.97, 7068.91), Magnitude: 5.47
β Analysis completed for K7EQ4
Processing earthquake 5/11: K7EQ5
Mainshock Absolute Time: 1997.7853
Search Time Window: 30.79 years
Location: (4744.67, 7073.29), Magnitude: 5.62
β Analysis completed for K7EQ5
Processing earthquake 6/11: K7EQ6
Mainshock Absolute Time: 1998.2779
Search Time Window: 30.28 years
Location: (5121.02, 7120.87), Magnitude: 5.64
β Analysis completed for K7EQ6
Processing earthquake 7/11: K7EQ7
Mainshock Absolute Time: 1998.6889
Search Time Window: 30.69 years
Location: (4437.01, 7336.47), Magnitude: 5.53
β Analysis completed for K7EQ7
Processing earthquake 8/11: K7EQ8
Mainshock Absolute Time: 2002.8313
Search Time Window: 30.83 years
Location: (4617.28, 7240.41), Magnitude: 5.74
β Analysis completed for K7EQ8
Processing earthquake 9/11: K7EQ9
Mainshock Absolute Time: 2002.8346
Search Time Window: 30.83 years
Location: (4619.89, 7236.17), Magnitude: 5.72
β Analysis completed for K7EQ9
Processing earthquake 10/11: K7EQ10
Mainshock Absolute Time: 2009.2604
Search Time Window: 30.26 years
Location: (4683.51, 7113.55), Magnitude: 6.29
β Analysis completed for K7EQ10
Processing earthquake 11/11: K7EQ11
Mainshock Absolute Time: 2009.2650
Search Time Window: 30.26 years
Location: (4679.33, 7122.35), Magnitude: 5.54
β Analysis completed for K7EQ11
=== PSI Analysis Completed ===
=== Executing Step 9 Optimization (tolerance mode) ===
=== PSI Result Optimization (Step 9) ===
Following the exact algorithm from the paper
Read 153 raw Ξ¨ identification records
Mode: tolerance
Step 9.1: For identical tmin, tp, mp β keep max sloperatio...
After first filter: 150 records
Step 9.2: For identical tp, sloperatio β keep min ap...
After second filter: 103 records
After optimization: 6 earthquakes have Ξ¨ identifications
Number of Ξ¨ identifications per earthquake (top 10):
eq_name
K7EQ2 28
K7EQ7 19
K7EQ8 17
K7EQ1 16
K7EQ10 12
K7EQ6 11
Name: count, dtype: int64
Saved to optimized_psi.ou4 (fixed 14 columns)
[Step9] Original 153 β After 9.1 150 β After 9.2 103
[17]:
# --- Step 5: Plot Scaling Relations ---
print("\n=== Generating Scaling Relations Charts ===")
analyze_scaling_relations(
psi_file="optimized_psi.ou4",
output_prefix="final_scaling"
)
print("\n=== All Procedures Completed ===")
=== Generating Scaling Relations Charts ===
============================================================
=== Ξ¨ Scaling Relations Analysis (Projected Means + Plots) ===
β
Successfully read 103 total Ξ¨ identification records.
βΉοΈ Number of mainshocks (unique eq_name) = 6
βΉοΈ Number of identifications per mainshock (top 10):
eq_name
K7EQ2 28
K7EQ7 19
K7EQ8 17
K7EQ1 16
K7EQ10 12
K7EQ6 11
Name: count, dtype: int64
β
Generated 6 mainshock representative points with trade-off removed (should equal number of mainshocks).
--- Initial Value Estimates for MLE ---
Relation 1 (Area): log10(A_P[kmΒ²]) = 0.3266 + 0.9258*M_P (RΒ² = 0.6962, p-value = 0.038860)
sigma_A initial value (from simplified estimation): 0.4806
(Formula: 0.33 * 10^(a_A / 2), where a_A = 0.3266)
Relation 2 (Time): log10(T_P[days]) = -1.0613 + 0.8518*M_P (RΒ² = 0.6794, p-value = 0.043622)
sigma_T initial value (residual std from log(T_p) vs M_P): 0.2016
Relation 3 (Mag): M_m = 2.3217 + 0.7636*M_P (RΒ² = 0.7731, p-value = 0.020980)
sigma_M initial value (residual std from M_m vs M_P): 0.1425
Saved M_P relation plots to final_scaling_mp_relations.png
Saved M_m vs M_P plot to final_scaling_mm_mp.png
=== All Procedures Completed ===