2019 Modeling work for WA EMD, covering parts of Island and Skagit Counties, including La Conner, WA.
Some of the directories containing big files are not included in the git repository but can be downloaded from the tar file found at http://depts.washington.edu/ptha/IslandSkagitTHA_2019/ and then moved to the appropriate location. These include:

- `topo/topofiles`
- `dtopo/dtopofiles`
- `info/fgmax_regions`
- `report/figs`
- `topo` : scripts and notebooks to manipulate DEMs and topo files
- `topo/topofiles` : topofiles to use in job runs
- `dtopo` : scripts and notebooks to manipulate dtopo files
- `dtopo/dtopofiles` : dtopofiles to use in job runs
- `info` : tide gauge locations, other general info
- `info/fgmax_regions` : contains Google Earth images for plotting purposes
- `new_code` : custom Python and Fortran code for this project
- `Runs` : directory containing input for all job runs
- `report` : directory containing latex files for the final report

Within `Runs` there are directories for each fgmax location, and within each of these there are subdirectories for each event: `SFL`, `CSZ_L1`, and `AK`.
The `topo/topofiles` directory contains topo files and also `.kml` and `.png` files that show the areas covered by each.

`topo/merge_PS_PT_overlap.py` was used to merge parts of the Puget Sound (PS) and Port Townsend (PT) DEMs together with an overlap DEM provided by Kelly Carignan at NCEI into a single 1/3" DEM covering the study region:

    topo/topofiles/PTPS_merged_13s.asc

This DEM is referred to as `PTPSm-DEM` in the report. A `.nc` (netCDF) version of this file is used by some of the Python scripts and Jupyter notebooks for plotting purposes. It can be created from the `.asc` file using the script `topo/convert_PTPS_merged_asc2nc.py`.
`WA_EMD_2018/topo/coarsen_PS_PT_SJdF.py` was previously used to coarsen the PS and PT DEMs (and also the Strait of Juan de Fuca DEM) from 1/3" to 2", and those topo files were taken from the 2018 work:

    topo/topofiles/PT_2sec_center.asc
    topo/topofiles/PS_2sec_center.asc
    topo/topofiles/SJdF_2sec_center.asc

Outside of the Strait and Puget Sound, 1 arcminute etopo1 topography was used. A single etopo1 topofile should have sufficed, but because the Alaska work was done later, two different files were used for different sources:

    etopo1_-137_-122_38_54_1min_nc.asc
    etopo1_-180_-66_-48_65_1min.tt3

A new, smaller version can instead be used for all runs:

    etopo1_-163_-122_38_63_1min.asc

This file was obtained from the NCEI thredds server using the script `topo/fetch_etopo1.py`.
The study region was first split into a set of polygons defined as Ruled Rectangles. The fgmax points for each region were selected to be all points in the corresponding polygon with DEM elevation lower than 15 m (including all water points in the polygon). The `params.py` file in each run directory has a pointer to the resulting file. Masks are used to specify any points that must be forced to be dry initially even if the topo elevation is below MHW.
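The selection rule described above can be sketched with numpy. This is a sketch only: the function name and toy DEM are illustrative, not the project's actual selection script.

```python
import numpy as np

def select_fgmax_points(Z, z_max=15.0):
    """Return a 1/0 mask marking fgmax points: all DEM points with
    elevation below z_max.  This automatically includes all water
    points, where Z < 0 (elevations relative to MHW)."""
    return np.where(Z < z_max, 1, 0)

# Toy 3x3 "DEM" in meters; negative values are offshore.
Z = np.array([[-5., 10., 20.],
              [-2., 14., 30.],
              [ 0.,  5., 16.]])
mask = select_fgmax_points(Z)   # 1 at the six points below 15 m
```

The resulting 1/0 array is exactly the topotype-3-style mask format used for the fgmax input files described below.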
A directory for an fgmax location (study region) has a name like `SW_Whidbey`. This directory contains a subdirectory for each event, such as `SFL`, and also a subdirectory `input_files` with the files needed to run the code. These files must be created by running the script `input_files/make_input_files.py`. This Python script was generated from the Jupyter notebook `make_input_files.ipynb` in the location directory, so alternatively you can run the notebook and then move the `.data` files generated to the `input_files` subdirectory.
Notes:

- Creating the input netCDF file (which will have results added to it later) requires `netCDF4` and supporting libraries. The conda environment defined by `environment_geomin.yml` should provide the minimum needed for running things.
- The file `topo/topofiles/PTPS_merged_13s.nc` is not included in the `big_files` tar file, but can be generated using the script `topo/convert_PTPS_merged_asc2nc.py`.
The files generated include (e.g., for location `SW_Whidbey`):

- `fgmax_pts_topostyle_SW_Whidbey.data` : The fgmax points as a topotype 3 file covering a rectangle, with 1/0 values indicating whether each point is an fgmax point or not.
- `RuledRegion_SW_Whidbey_ixy2.data` : Description of a ruled rectangle to be used as an AMR flagregion at the finest level, covering all fgmax points. Note: `ixy2` means `s=y` in the ruled region, while `ixy1` means `s=x`.
- `allow_wet_init_SW_Whidbey.data` : Optional topotype 3 file with 1/0 entries. If there is dry land below MHW in this study region, this file indicates which points are allowed to be initialized to wet (1) and which are required to be dry initially (0).
- `SW_Whidbey_input.nc` : A netCDF file containing the arrays indicating the `fgmax_pts` and `allow_wet_init`. After the code is run, this file is used as the basis for a netCDF file that also contains the model results.
These files are created by the Jupyter notebook `make_input_files.ipynb`, or by a Python script `input_files/make_input_files.py` created from this notebook (see below). Several figures (and png files) are also created. These are made in the main directory and then moved to `input_files` for safekeeping.
Within an event directory, such as `SW_Whidbey/SFL`, there are the following files:

- `Makefile` : Note that `EXE` might be set to `$(WA_EMD_2019)/xgeoclaw` so that a common executable can be used without recompiling in each event directory. Make sure you have this file and that it is up to date with the Fortran code in the `new_code` directory. `make .exe` should create it if necessary. Make sure `$CLAW` is pointing to `clawpack-v5.6.1`.
- `setrun.py` : This generally does not need to change, unless you want to modify how often checkpoint files are created, or something else not set in `params.py`.
- `params.py` : Most of the parameters that change from one location or event to another are collected in `params.py`. This is read in by `setrun` and also by `setplot` and by the `process_fgmax` notebook or script. If you change something in `params.py`, remake the data via `make data` before running the code again. Or use `make run` to do `make data` followed by `make output`.
If copying these files to a new directory to create a different event at this location, the following parameters in `params.py` are particularly important to modify:

- `loc` and `event` should be set for this location / event.
- `fgmax_extent` is used in `setplot` for zoomed views.
- `restart` : Set to `True` to do a restart, and set `restart_file` to point to the desired file to start from. Note that the files `_output/fort.tck*` give the times of each checkpoint file, while the data needed to restart is in the corresponding `_output/fort.chk*` file.
- `tfinal` and `num_output_times` control when the job ends (in seconds) and how many time frame files are produced along the way. If doing a restart, adjust these to the new (later) final time and the desired total number of output times (including those time frames already generated).
- `lower` and `upper` arrays specify the computational domain. Note that these are shifted left 1/6 arcsecond from values that should be chosen as multiples of 0.01 degree, in order to ensure that the finest computational grids at 1/3 arcsecond have cell centers aligned with the DEM grid points.
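The 1/6" shift can be checked with a little arithmetic. This is a sketch; the longitude value here is illustrative, not one of the project's actual domain edges.

```python
# A 1/3" cell is 1/10800 degree wide.  Shifting the domain edge left by
# half a cell (1/6") puts the first cell *center* exactly on the chosen
# multiple of 0.01 degree, where the DEM grid points lie.
third = 1.0 / (3 * 3600)        # 1/3 arcsecond in degrees
x_chosen = -122.50              # illustrative, a multiple of 0.01 degree
x_lower = x_chosen - third / 2  # shifted value to put in params.py
first_center = x_lower + third / 2   # lands back on x_chosen
```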
- `num_cells` sets the number of grid cells at the coarsest level (Level 1) over the computational domain. Usually far-field events (like Alaska) have a larger computational domain and coarser Level 1 grid than near-field events.
- `amr_levels_max` and `refinement_ratios` control the maximum number of AMR levels and the refinement factor from each level to the next.
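As a sketch of how these interact: the finest-grid cell size is the Level 1 cell size divided by the product of the refinement ratios. The values below are illustrative, not necessarily the project's actual settings; they are chosen so the finest level comes out at 1/3 arcsecond, matching the DEM resolution discussed above.

```python
# Illustrative: a 2-degree Level 1 cell refined by these factors
# reaches 1/3 arcsecond at Level 6.
coarse_dx_arcsec = 2.0 * 3600.0       # Level 1 cell size, in arcseconds
refinement_ratios = [4, 6, 5, 6, 30]  # example Level k -> k+1 factors
dx = coarse_dx_arcsec
for r in refinement_ratios:
    dx /= r                           # cell size at each finer level
# dx is now 1/3 arcsecond
```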
- `topofiles` is a list of the topo files used for this run.
- `dtopofiles` is a list containing a single entry for the dtopo file used for this run (the earthquake source). It should be consistent with how `event` is set and with the name of the directory you are working in.
- `regions` is a list of AMR flagging regions in the old style, i.e. each region is specified as a list of the form `[minlevel, maxlevel, t1, t2, x1, x2, y1, y2]`.
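For example, an old-style region allowing up to 3 levels over a coastal box for all times might look like this (the coordinate and level values are illustrative only):

```python
# Old-style AMR flagging region:
# [minlevel, maxlevel, t1, t2, x1, x2, y1, y2]  -- illustrative values
regions = []
regions.append([1, 3, 0.0, 1e9, -127.0, -122.0, 46.0, 49.0])
```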
- `flagregions` is a list of AMR flagging regions in the new style, i.e. each is an object of class `data_FlagRegions.FlagRegion`. The module `data_FlagRegions.py` is in the `new_code` directory for now; eventually this will be merged into `amrclaw/data.py` for more general usage. Note that `flagregion.spatial_region_file` points to a file that specifies a `RuledRectangle` that must be created to cover the spatial region desired. Files of this form covering the fgmax points are created in the notebook `make_input_files.ipynb` for this location, and should be used for refinement to the finest (1/3 arcsecond) level. Others are made in the notebook `topo/make_flag_regions.ipynb` that cover larger regions more efficiently than old-style rectangles would.
- `wave_tolerance` : The tolerance used to determine whether to flag a cell for refinement, if it is not forced to be refined or restricted from further refinement by one of the regions or flagregions. A cell is flagged for refinement if `abs(eta) > wave_tolerance`, where `eta = h + B` is the surface elevation computed from water depth `h` and GeoClaw topography `B`. This may need to be adjusted, e.g. made smaller if the tsunami is small in the region of interest.
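The flagging test described above can be sketched as follows (a sketch of the criterion only, not the actual GeoClaw flagging routine; sea level is taken as 0 here):

```python
def flag_for_refinement(h, B, wave_tolerance):
    """Flag a cell when the surface elevation eta = h + B deviates
    from sea level (0, i.e. MHW) by more than wave_tolerance."""
    eta = h + B
    return abs(eta) > wave_tolerance

# 10.2 m of water over topography at -10 m gives eta = 0.2 m:
flag_for_refinement(10.2, -10.0, wave_tolerance=0.1)   # True
flag_for_refinement(10.05, -10.0, wave_tolerance=0.1)  # False
```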
- `fgmax_files` : A list containing a single header file describing the fgmax parameters for this run. The header file is read in by the Fortran code and includes a path to the file that contains all the fgmax points. Note some changes from previous projects:
    - The header file is created directly from `params.py` with parameters set here. Previously there was a separate script `make_fgmax_header.py` that had to be run.
    - `fg.xy_fname` should be set to the absolute path of a file in the form of a topotype 3 topography file, with values 1/0 indicating whether a point on this rectangular grid is an fgmax point or not. This file is created by the `make_input_files` notebook or script in the location directory. In previous work we specified the fgmax points in a different format, a file with one line for each point, but specifying them as a topo-file mask works better in various ways. This style is specified by setting `fg.point_style = 10` and `fg.npts = 0`.
Other fgmax parameters may need to be adjusted:

- `fg.min_level_check` should generally be set to the finest level. The values at fgmax points are only monitored on this level (or finer).
- `fg.tstart_max` : The time to start monitoring fgmax values, which should generally be a bit after the finest level is first created in the fgmax region (e.g. set to `t_start + 20`). Note that cells are flagged for refinement at the next regrid time after `t_start`, but this may not be exactly at time `t_start`.
- `fg.dt_check` : How often to update the running maxima stored at the fgmax points. Usually 20 (seconds) is ok. Note that 1/3 arcsecond grids are generally updated with a time step `dt` of less than 1 second, but we don't need to update the fgmax points this frequently (since it is time-consuming to do so). You do need to update them frequently enough that a maximum is not missed by much, so an appropriate value depends on the location of the fgmax region and the nature of the source.
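Putting the fgmax settings above together, the relevant part of `params.py` might look something like this. This is a hedged sketch: the stand-in `FGmaxGrid` class, the path, and the level/time values are illustrative, not a verbatim copy of the project file.

```python
class FGmaxGrid:
    """Stand-in for the fgmax grid object configured in params.py."""
    pass

fg = FGmaxGrid()
fg.point_style = 10        # fgmax points given as a 1/0 topotype 3 mask
fg.npts = 0                # unused when point_style == 10
fg.xy_fname = '/full/path/to/fgmax_pts_topostyle_SW_Whidbey.data'
fg.min_level_check = 6     # illustrative: the finest AMR level
t_start = 0.0              # illustrative start time for this source
fg.tstart_max = t_start + 20.0  # begin monitoring once fine grids exist
fg.dt_check = 20.0         # update running maxima every 20 s
```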
- `gauges` : A list of gauges to monitor, each in the form of a list `[gaugeno, x, y, t1, t2]`. Some gauges are read in from a csv file listing gauges specified by DNR, and those found to lie in the region specified by `fgmax_extent` are automatically included.
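The automatic inclusion step can be sketched like this. The csv column names (`gaugeno`, `x`, `y`) and the sample data are hypothetical; check the actual DNR csv file for its layout.

```python
import csv
import io

def gauges_in_extent(csv_text, extent, t1=0.0, t2=1e9):
    """Keep only the gauges lying inside extent = [x1, x2, y1, y2],
    returning them in the [gaugeno, x, y, t1, t2] form used in params.py.
    (Column names 'gaugeno', 'x', 'y' are hypothetical.)"""
    x1, x2, y1, y2 = extent
    gauges = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        x, y = float(row['x']), float(row['y'])
        if x1 <= x <= x2 and y1 <= y <= y2:
            gauges.append([int(row['gaugeno']), x, y, t1, t2])
    return gauges

sample = "gaugeno,x,y\n101,-122.60,48.00\n102,-123.50,47.00\n"
gauges = gauges_in_extent(sample, extent=[-122.7, -122.5, 47.9, 48.1])
# only gauge 101 falls inside the extent
```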
- `variable_sea_level` : Set to `True` if the earthquake deformation affects the study location (i.e. for the SFL event in southern locations, but not for the CSZ or Alaska events). This controls how the finest grids are initialized, filling with water only up to sea level as adjusted by the dtopo file.
- `use_wet_mask` : Set to `True` if there are some areas in this location that are below MHW but should be initialized as dry because they are behind dikes or levees. If so, then you must also set `fname_wet_mask` to point to the file created by the `make_input_files` notebook or script.
- `t_stays_dry` : The last time for which the wet mask should be used when creating new AMR grids. After this time it is ignored. It should be set late enough that the finest grids around the fgmax points have been created properly. At later times the tsunami may have arrived, so continuing to apply the mask would be inappropriate and would also add cost to running the code. Usually `t_start + 120` seems fine.
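A sketch of how such a mask affects initialization (this illustrates the idea described above, not the actual GeoClaw initialization code):

```python
import numpy as np

def initial_depth(B, allow_wet, sea_level=0.0):
    """Fill with water up to sea_level only where allow_wet == 1;
    points with allow_wet == 0 start dry even if below MHW
    (e.g. behind dikes or levees)."""
    h = np.maximum(sea_level - B, 0.0)   # naive fill to sea level
    return np.where(allow_wet == 1, h, 0.0)

B = np.array([-2.0, -1.0, 1.0])      # topo: two points below MHW
allow_wet = np.array([1, 0, 1])      # middle point is behind a dike
h0 = initial_depth(B, allow_wet)     # -> [2.0, 0.0, 0.0]
```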
Some other parameters in `params.py` are only used in post-processing; see comments in this file.
After running GeoClaw, the results are in `_output` by default. This contains time frame results at the times listed in the `fort.t*` files, and also the fgmax results, in the files:

    _output/fort.FG1.valuemax
    _output/fort.FG1.aux1

These files list only the fgmax points, one per line.
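Since these files are plain text with one point per line, they can be read with `numpy.loadtxt`. The column layout below is an assumption for illustration (x and y first, a monitored maximum last); check the fgmax header file and the GeoClaw documentation for the actual ordering for your settings.

```python
import io
import numpy as np

# Two fake lines standing in for _output/fort.FG1.valuemax.
# Assumed layout (illustrative): x, y, level, max value.
sample = io.StringIO(
    "-122.60 48.00 6 1.25\n"
    "-122.59 48.00 6 0.80\n")
d = np.loadtxt(sample)
x, y, vmax = d[:, 0], d[:, 1], d[:, -1]
```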
The files can be post-processed using:

- `setplot.py` : For making plots of the time frames after running the code via `make plots` (or interactively with `Iplotclaw`).
- `process_fgmax.ipynb` : After running the code, this reads in the fgmax results (from the specified `outdir`) and produces plots of the maximum flow depth and speed. It also creates a netCDF file with all of the results as arrays on the topo grid used in creating the input netCDF file described above (copying the input file and adding the output). The resulting file has a name like `SW_Whidbey_results.nc`.
For this project, processing the results was automated using the Python script `new_code/process_fgmax_common.py`, which works for all locations/events.
The `.png` files created by `process_fgmax` (if `save_figs == True`) will be copied into `_plots/_other_figures/` so that they will be linked properly from `_plots/_PlotIndex.html` if you also run `make plots` to create the time frame plots and index in `_plots`. You can run `make plots` either before or after `process_fgmax` is run.

After producing the `.nc` file and `.png` files of the figures, you might want to move them (and also `_plots`) into a directory with a name like `SW_Whidbey_SFL_results_DATE` for safekeeping.
To turn a notebook (e.g. `process_fgmax.ipynb`) into a Python script that can be used to process the data without opening the notebook, do the following:

Either open the notebook and export it as a script using the "File --> Download as" menu, or run this command at the command line:

    jupyter nbconvert --to python process_fgmax.ipynb

Edit the resulting `process_fgmax.py` file to do the following:

- Remove any lines of the form `get_ipython().magic('...')`.
- Set desired values for `save_figs` and `save_fgmax_nc_file`, or other parameters depending on the notebook.

Then you can run the script from the command line using

    python process_fgmax.py
For `make_input_files.py` you can also do this:

    jupyter nbconvert --to python --TagRemovePreprocessor.enabled=True --TagRemovePreprocessor.remove_cell_tags="['hide-py']" make_input_files.ipynb

This will suppress cells tagged `hide-py`, removing the cells that make plots and producing only the input files themselves. (It also removes the magic command.) This only works if `jupyter nbconvert --version` shows version >= 5.3.0.

Move the resulting `make_input_files.py` to `input_files` if you want to be able to run the script and make the files in place, which is useful when running on a remote cluster.
To turn a notebook (e.g. `process_fgmax.ipynb`) into an html file for archiving or sharing, do the following:

Either open the notebook (and execute it if you want the output cells and plots to appear) and export it as html using the "File --> Download as" menu, or run this command at the command line:

    jupyter nbconvert --to html --execute make_input_files.ipynb

Note that the `--execute` flag will cause it to execute the notebook before converting. If the output is already in the notebook, or you want to create an html file of the input cells alone, leave this out. If a cell times out, you can add the argument `--ExecutePreprocessor.timeout=-1` for unlimited timeout.
See the detailed instructions in `README_install.txt`.

Our production runs were done on the University of Colorado HPC cluster using time provided by CSDMS. The main code repository was stored in the permanent home directory, but each `Makefile` also specifies a scratch directory where the output was directed.
Some notes on the workflow there:

- Within the repository: `git pull` to get recent changes.
- If input_data does not yet exist, submit a job to create it: `sbatch ~/make_input_data.sbatch`
- Modify the `Makefile` to set `SCRATCH` properly.
- Submit a job to run GeoClaw: `sbatch ~/run_geoclaw12.sbatch` (using 12 threads). If doing a restart, use the full path to the scratch directory for the checkpoint file.
- `_plots` should now exist with time frame plots. If not, `sbatch ~/make_plots.sbatch` will run `make plots` only.
- Copy `params.py` to the SCRATCH directory, and copy the location's `input_files` directory to the parent of the SCRATCH directory.
- To make fgmax plots in `_plots/_other_figures`, and the results `.nc` file: `sbatch ~/process.sbatch`. Also copy `_output/timing.*` to `_plots/_timing_figures`, and copy the `.nc` file to `_plots` if you want to download that too.
- Copy fgmax and gauge output out of the scratch directory for safekeeping: `python $WA19P/new_code/copy_output.py`
- Use `cp_output_cu.sh` to copy the `_plots` directory down.
(Somewhat out of date since the webpage was reorganized.)

Use `Runs/cp_plots.py` to copy most results from each fgmax region to `results_plots`. Before running this, make sure `_plots` in each directory has the most recent results (or is a symbolic link to the most recent results). Update `results_plots/index.html` if necessary.
Run `results_plots/collect_kmlfiles.py` to collect all the kml files and required png files from each `_plots/_other_figures` directory into a single `results_plots/kmlfiles` directory. Tar and gzip this directory to make `kmlfiles.tar.gz` and then:

    scp kmlfiles.tar.gz ptha@homer.u.washington.edu:public_html/WA_EMD_2019/results/
Use `Runs/all_gauges` to collect final gauge results from all runs. Note that this includes all gauges included in the run, even ones not to be used because a different run has better results for that gauge.

To collect desired gauge plots from the appropriate region:

    python results_plots/Gauges/make_gauge_html.py

This reads the region mapping from `info/gauges_regions.csv` and the most recent gauge png/csv/nc files from `Runs/all_gauges`, and puts them all in `results_plots/Gauges`. It also creates the index `results_plots/Gauges/index.html`.
Then:

    rsync -avz Gauges/ ptha@homer.u.washington.edu:public_html/WA_EMD_2019/results/results_plots/Gauges/

Or tar and gzip `results_plots/Gauges` to make `Gauges.tar.gz` and then:

    scp Gauges.tar.gz ptha@homer.u.washington.edu:public_html/WA_EMD_2019/results/