Model evaluation#
This example shows how to load model and observation data and evaluate model performance.
# # skip this if package has already been installed
# !pip install modvis
%matplotlib inline
%load_ext autoreload
%autoreload 2
import os
from modvis import ATSutils
from modvis import utils
from modvis import general_plots as gp
import matplotlib.pyplot as plt
import logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s: %(message)s')
# run_steadystate = "1-spinup_steadystate"
work_dir = "../../model2/"
rho_m = 55500 # moles/m^3, water molar density. Check this in the xml input file.
# model_dir = "../data/coalcreek"
Download the sample data when running on Google Colab
# import os
# if not os.path.exists(model_dir):
# !git clone https://github.com/pinshuai/modvis.git
# %cd ./modvis/examples/notebooks
Load model data#
This loads the water balance file generated by the ATS model (here, water_balance_computational_domain.csv). The file contains watershed-scale variables such as outlet discharge and ET.
run_dir = "3-transient"
model_dir = os.path.join(work_dir, run_dir)
logging.info(f"Loading data from {model_dir}")
2025-04-16 15:49:36,230 - root - INFO: Loading data from ../../model2/3-transient
simu_df = ATSutils.load_waterBalance(model_dir, WB_filename="water_balance_computational_domain.csv",
domain_names = None,
canopy = True, plot = False
)
simu_df['river discharge [mol d^-1]'].plot()
<AxesSubplot:xlabel='datetime'>
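The column plotted above is in mol d^-1, while the comparisons later in this example use m^3/d. The two units are related through the water molar density rho_m defined earlier. The snippet below is a minimal sketch of that conversion on synthetic values; the helper name `mol_per_day_to_m3_per_day` is hypothetical and not part of modvis, and the discharge numbers are made up for illustration.

```python
import pandas as pd

RHO_M = 55500.0  # mol/m^3, water molar density (same value as rho_m above)


def mol_per_day_to_m3_per_day(q_mol, rho_m=RHO_M):
    """Convert a molar flux [mol/d] to a volumetric flux [m^3/d].

    Hypothetical helper for illustration: [mol/d] / [mol/m^3] = [m^3/d].
    """
    return q_mol / rho_m


# Synthetic discharge values, not model output.
dates = pd.date_range("2015-10-01", periods=3, freq="D")
q_mol = pd.Series([5.55e8, 1.11e9, 2.775e8], index=dates,
                  name="river discharge [mol d^-1]")

q_m3 = mol_per_day_to_m3_per_day(q_mol).rename("discharge [m^3/d]")
print(q_m3)
```

If the molar density in the xml input file differs from 55500 mol/m^3, pass that value as `rho_m` instead.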
Load observation data#
Provide the USGS gage number (the sites argument) to download the streamflow data.
obs_df = utils.load_nwis(sites= "09111250", start = '2015-10-01', end = '2017-10-1')
obs_df
| | Discharge [m^3/d] |
|---|---|
| 2015-10-01 | 11645.7208 |
| 2015-10-02 | 11572.3234 |
| 2015-10-03 | 11449.9944 |
| 2015-10-04 | 11694.6524 |
| 2015-10-05 | 13921.0402 |
| ... | ... |
| 2017-09-27 | 5798.3946 |
| 2017-09-28 | 10593.6914 |
| 2017-09-29 | 10349.0334 |
| 2017-09-30 | 14679.4800 |
| 2017-10-01 | 16245.2912 |
732 rows × 1 columns
obs_df.plot()
<AxesSubplot:>
Streamflow comparison#
Compare simulated streamflow with observed USGS streamflow.
fig,ax = plt.subplots(1,1, figsize=(8,4))
simu_df['watershed boundary discharge [m^3/d]'].plot(color = 'c',ax=ax, label= "simu_Q")
obs_df['Discharge [m^3/d]'].plot(color = 'k', ax=ax, label = "obs_Q")
ax.set_ylabel("Discharge [m^3/d]")
ax.legend()
<matplotlib.legend.Legend at 0x40824a4610>
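Plotting two series on one axis works even when their date ranges differ, but computing metrics requires paired values on common dates. The sketch below shows one way to pair two daily series with pandas and then drop an initial period, similar in spirit to the start_date argument used later. The series here are synthetic, and this is not necessarily how modvis aligns data internally.

```python
import pandas as pd

# Synthetic daily series with partially overlapping date ranges.
obs = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0],
                index=pd.date_range("2015-10-01", periods=5, freq="D"),
                name="obs_Q")
sim = pd.Series([2.5, 3.5, 4.5, 5.5, 6.5],
                index=pd.date_range("2015-10-03", periods=5, freq="D"),
                name="simu_Q")

# Keep only the dates present in both series, then drop any missing values.
paired = pd.concat([obs, sim], axis=1, join="inner").dropna()

# Optionally discard an initial period (for example, model spin-up).
paired = paired.loc["2015-10-04":]
print(paired)
```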
Flow Duration Curve (FDC) comparison#
The flow duration curve (FDC) is a plot that shows the percentage of time that flow in a stream is likely to equal or exceed a specified value of interest. This percentage is also called the “exceedance probability”. For example, the curve can show the percentage of time river flow can be expected to exceed a design flow (e.g., 20 cfs), or the discharge that is equaled or exceeded some percent of the time (e.g., 80% of the time). See reference on how it’s calculated.
In model validation, comparing observed and simulated FDCs shows how well a hydrological model reproduces the full range of flows (low, median, and high).
fig, ax = gp.plot_FDC(dfs=[obs_df['Discharge [m^3/d]'], simu_df['watershed boundary discharge [m^3/d]']],
labels=['obs_Q','simu_Q'],
colors=['k', 'c'],
start_date="2016-10-01"
)
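To make the exceedance probability concrete, the sketch below computes a flow duration curve by hand: flows are sorted from largest to smallest, ranked, and each rank is converted to a percentage. It assumes the Weibull plotting position, rank / (n + 1); gp.plot_FDC may use a different formula, and the function name `flow_duration_curve` is hypothetical.

```python
import numpy as np
import pandas as pd


def flow_duration_curve(flow):
    """Return flows sorted in descending order with their exceedance probability [%].

    Assumes the Weibull plotting position: P = 100 * rank / (n + 1).
    """
    q = pd.Series(flow).dropna().sort_values(ascending=False).to_numpy()
    n = len(q)
    rank = np.arange(1, n + 1)  # rank 1 is the largest flow
    exceedance = 100.0 * rank / (n + 1)
    return pd.DataFrame({"exceedance [%]": exceedance, "flow": q})


# Synthetic flows for illustration only.
fdc = flow_duration_curve([10, 40, 20, 30])
print(fdc)
```

With four values, the largest flow is equaled or exceeded 20% of the time and the smallest 80% of the time. Plotting `flow` against `exceedance [%]`, usually with a logarithmic flow axis, gives the curve.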
One-to-one plot#
A one-to-one scatter plot of observed versus simulated discharge, annotated with performance metrics.
metrics = gp.one2one_plot(obs_df['Discharge [m^3/d]'], simu_df['watershed boundary discharge [m^3/d]'],
metrics=['logNSE', 'NSE', 'mKGE'],
# metrics='all',
decompose_KGE=True,
show_metrics=True,
show_density=False,
start_date="2016-10-01"
)
mKGE: 0.8541672906250747, cc: 0.9389278483467483, alpha: 1.131220526451959, beta: 1.0178478249023342
metrics
{'pearsonr': 0.9389278483467479,
'R^2': 0.8815855044010537,
'RMSE': 74282.28758848661,
'rRMSE': 0.5769357988711631,
'NSE': 0.8362800085409323,
'logNSE': 0.7721731425289238,
'bias': -2297.963247935326,
'pbias': 1.784782490233417,
'KGE': 0.8357640497479065,
'npKGE': 0.8540500499788206,
'mKGE': 0.8541672906250747}
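For reference, the sketch below writes out two of these metrics with NumPy using their commonly cited definitions: NSE as one minus the ratio of squared errors to the variance of the observations, and the modified KGE as one minus the Euclidean distance of (correlation, variability ratio, bias ratio) from the ideal point (1, 1, 1). These definitions and all function names are assumptions for illustration; the modvis implementation is not shown in this example and may differ in detail. The last lines combine the cc, alpha, and beta values printed above, which gives approximately the reported mKGE.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency (1 is a perfect fit)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)


def mkge_components(obs, sim):
    """Correlation, variability ratio (ratio of CVs), and bias ratio."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]
    beta = sim.mean() / obs.mean()
    gamma = (sim.std() / sim.mean()) / (obs.std() / obs.mean())
    return r, gamma, beta


def kge_from_components(r, variability, beta):
    """Combine the three components into a KGE-type score."""
    return 1.0 - np.sqrt((r - 1) ** 2 + (variability - 1) ** 2 + (beta - 1) ** 2)


# Components printed by the one-to-one plot above (cc, alpha, beta).
score = kge_from_components(0.9389278483467483,
                            1.131220526451959,
                            1.0178478249023342)
print(round(score, 4))
```

The printed components are also consistent with other entries in the dictionary: R^2 equals the squared Pearson correlation, and pbias equals (beta - 1) × 100.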