Parallel I/O output module enhancements#655
Parallel I/O output module enhancements#655SeanBryan51 wants to merge 31 commits intoparallelio-playgroundfrom
Conversation
4b2f171 to
8abda2c
Compare
|
Hi @Whyborn, I've opened this PR to highlight the output module specific diffs (includes the aggregators and the list data structure for output variables). The implementation still contains a bunch of TODO's, but it would be awesome if I could get your feedback at this stage. I've been testing it only in serial (testing in parallel will only be possible once the met data can be read in via parallel I/O) and currently gives the same answers as before for Qh. However, I did have to change the NetCDF format in the old output module from classic to NetCDF4/HDF5 to easily compare outputs between new and old. This was a hack as the new NetCDF API has the NetCDF4 format hardcoded whenever creating or opening new files. I'll make that configurable so that both classic and NetCDF4 is supported. |
|
First pass comments:
class(aggregator_t), dimension(:), allocatable :: Qh_aggregators
...
Qh_aggregators = create_aggregators(canopy%fh, ["mean", "max"])
call cable_output_add_variable(
...
aggregator=Qh_aggregators
...
)This way it makes the aggregators available for use where necessary (e.g. the
cable_output_shape_land_patch = create_decomposition(["land", "patch"])
cable_output_shape_land_patch_soil = create_decomposition(["land", "patch", "soil"])where the dimensions are defined earlier in initialisation with something like: call define_cable_dimension("soil", 6)
call define_cable_dimension("rad", 2)I think this makes the design more easily extensible? This makes it trivial for someone developing new code which may have new dimensions to output their own stuff.
type aggregator_store_t
class(aggregator_t), allocatable :: aggregators(:)
integer :: num_aggregators
end type aggregator_store_twhere
|
|
Thanks @Whyborn for your thoughts!
I agree that it would be nice if time related control were handled at a higher level (for example with the relevant science output module as you say). However, it wasn't obvious to me how this approach could allow for driving the intermediate aggregators required for
For two reasons:
Oh yep I remember you mentioning this in our discussions. I think this is something we could definitely add in the future - I was hesitant to introduce this now as I want to avoid diverging in functionality from the current output module which assumes a single aggregation method per variable. As for a list of objects being targets, I got around that problem by working with arrays of type aggregator_handle_t which contain a reference to an aggregator rather than the actual aggregator instance.
I like this suggestion, I agree it's definitely not that obvious what one needs to do to create a new
I agree, a timing module for wider use in the code would be great. The |
Does this allow aggregators to be driven independently? It seems to me like specifically doesn't allow this, because every aggregator which accumulates e.g. There has to be a way to trigger accumulation at specific times, for instances like this. This is why I think the aggregators have to be available to work with as standalone objects.
Yea I can see that, it might be more easily readable to have every call contain a instead of a construction like Although I don't see why the latter couldn't also be used to create the same documentation in the same way.
Yea that's what I meant, whether the
Are you sure? I thought arrays of polymorphic classes were part of the standard, I used them in some of my aggregator testing. You just need
There's the datetime-fortran which we already have a spack package for? I haven't actually looked at it's features yet though. |
|
Hi @Whyborn, I've made the updates we discussed last week. I think it's looking much better now with your suggestions on how aggregators are organised in the output module, thanks as always for the comments! Please let me know if there is anything else that catches your eye on the design. I'm going to try add support for specifying non-time varying data in the output and restart files. For this I'm thinking we could introduce a |
|
Just a couple of comments:
if (output%patch)
decomp = decomp_land_real32
reducer = "patch"
else
decomp = decomp_land_patch_real32
reducer = "none"
end if
call cable_output_add_variable(
name="Qh",
...
decomp=decomp,
reducer=reducer
)And then in the Then you wouldn't have to have all the buffers. The
|
I'm definitely happy to add the frequency limit to
That sounds good to me, I will rename the module containing the grid cell averaging stuff to be more general (and maybe put this under
The distinction of types for decompositions is required by PIO via the
I'm happy to introduce a string argument for reducer instead of the
I did consider using a function interface instead of a subroutine interface, however I opted for the subroutine approach with the temporary buffer as this avoids introducing a potentially large allocation when computing the grid cell average. I want to limit the number of unnecessary allocations and copy operations in the write procedures where possible. It might be possible to return a preallocated array from a function. I can look into this a bit more. Thank you again for the feedback on this! |
|
Thanks for the comments @ccarouge, I'll add in those suggestions! |
03eaa9b to
fb4febe
Compare
4724830 to
9263f1e
Compare
|
Just an update on this PR! I'm currently adding functionality for restarts to the new output module by adding a
More to come 🙂 |
Co-authored-by: Lachlan Whyborn <lachlan.s.whyborn@gmail.com>
…_daily aggregators to canopy_type
9263f1e to
3094a2b
Compare
|
Hi @Whyborn, apologies it has been a while since I've followed up on this! I had to stop myself from tweaking small things 🙃 The output module now has functionality for writing restarts and has gone through a pretty aggressive restructuring. Since a lot of the previous commits on the output module are no longer relevant with the restructuring, I've blown away most of commit history and grouped the commits into these categories:
Please let me know what you think! (at a time that's convenient of course, it's a pretty dense PR 😅 ) The next thing on the plate after this is would be to introduce working variables for all diagnostics, and then add all output and restart variables and test for (approximate) bitwise reproducibility. |
876c111 to
64b3cf4
Compare
…tructor syntax for strings
…structor syntax for strings
3fe2db2 to
90433a0
Compare
4b4facd to
a294717
Compare
…_output_utils_mod Introduce `grid_type` component and replace grid based `if (restart)` with `select case(grid_type)`
619998e to
07f646e
Compare
…alls to netCDF (#681) This change is a bug fix and also a refactor. Currently, enabling the land use scheme (via `l_landuse`) causes the following error when writing out the land use restart file: ``` writing cable restart: land use Error writing iveg parameter/variable to output file (SUBROUTINE write_output_p arameter_r1) NetCDF: Not a valid ID ``` This is because the incorrect netCDF file ID is used in `cable_output.F90` when writing the land use restart file: https://github.com/CABLE-LSM/CABLE/blob/b2d3f8df5c055a5942f1f5e86288f736582afd74/src/offline/cable_output.F90#L51 Rather than fixing this directly, this change replaces all calls to cable_write.F90 procedures with direct calls to netCDF. This is done so that in the future we can safely replace `cable_output.F90` and `cable_write.F90` with a renewed output module as part of the parallel I/O output module changes (see #641 and #655). Note that the land use scheme is not yet currently functional, even with these changes, as `landuse3.F90` is partially implemented. ## Type of change Please delete options that are not relevant. - [x] Bug fix - [x] Refactor ## Checklist - [x] The new content is accessible and located in the appropriate section - [x] I have checked that links are valid and point to the intended content - [x] I have checked my code/text and corrected any misspellings ## Testing - [x] Are the changes bitwise-compatible with the main branch? If working on an optional feature, are the results bitwise-compatible when this feature is off? If yes, copy benchcab output showing successful completion of the bitwise compatibility tests or equivalent results below this line. ``` 2026-02-05 14:20:00,805 - INFO - benchcab.benchcab.py:380 - Running comparison tasks... 2026-02-05 14:20:00,831 - INFO - benchcab.benchcab.py:381 - tasks: 168 (models: 2, sites: 42, science configurations: 4) 2026-02-05 14:22:56,227 - INFO - benchcab.benchcab.py:391 - 0 failed, 168 passed ``` <!-- readthedocs-preview cable start --> ---- 📚 Documentation preview 📚: https://cable--681.org.readthedocs.build/en/681/ <!-- readthedocs-preview cable end -->
Note that some biogeophysics diagnostic variables overlap with CASA-CNP diagnostic variables, e.g. `canopy%fgpp` and `canopy%fnpp`. When CASA-CNP is enabled, these diagnostic variables are re-calculated using CASA-CNP variables before writing the diagnostic variables to the output file. The overriding of diagnostic variables is done to match the current behaviour of the output module - there may be scope to enable both CASA and non-CASA variants of these diagnostics in the future.
…f life changes In src/offline/cable_mpimaster.F90 and src/offline/cable_serial.F90, daily aggregators are reset immediately after writing instead of in the next iteration as this breaks for the last ktau iteration. New components have been added to `cable_output_variable_t`. These include `scale`, `offset` which are useful for applying unit conversions, and also `field_name` and `netcdf_name` to facilitate both restart variable names (`field_name`) and output variable names (`netcdf_name`). All components of the cable_output_variable_t except `field_name` and `aggregator` now have defaults. This cleans up the interface for defining output variables. Improved validation checks on registered output variables. A new `coordinate_variables` component has been added to `cable_output_profile_t` such that coordinate variables can be treated differently from the requested outputs due to differing metadata requirements. Removed start_year hack from drivers by overriding cable_user%yearstart to syear for the default MetType case.
This change removes duplicate module variables for namelist options `l_casacnp`, `l_landuse`, `l_laiFeedbk`, and `l_vcmaxFeedbk`. Cherry-picked from #655 ## Type of change Please delete options that are not relevant. - [x] Bug fix ## Checklist - [x] The new content is accessible and located in the appropriate section - [x] I have checked that links are valid and point to the intended content - [x] I have checked my code/text and corrected any misspellings ## Testing - [x] Are the changes bitwise-compatible with the main branch? If working on an optional feature, are the results bitwise-compatible when this feature is off? If yes, copy benchcab output showing successful completion of the bitwise compatibility tests or equivalent results below this line. ``` 2026-02-24 11:49:25,946 - INFO - benchcab.benchcab.py:380 - Running comparison tasks... 2026-02-24 11:49:25,982 - INFO - benchcab.benchcab.py:381 - tasks: 168 (models: 2, sites: 42, science configurations: 4) 2026-02-24 11:52:24,750 - INFO - benchcab.benchcab.py:391 - 0 failed, 168 passed ``` Spatial runs are also bitwise compatible: ``` nccmp -d runs/spatial/tasks/crujra_access_R*_S0/archive/output000/cable_out.nc nccmp -d runs/spatial/tasks/crujra_access_R*_S1/archive/output000/cable_out.nc nccmp -d runs/spatial/tasks/crujra_access_R*_S2/archive/output000/cable_out.nc nccmp -d runs/spatial/tasks/crujra_access_R*_S3/archive/output000/cable_out.nc ``` <!-- readthedocs-preview cable start --> ---- 📚 Documentation preview 📚: https://cable--692.org.readthedocs.build/en/692/ <!-- readthedocs-preview cable end -->
📚 Documentation preview 📚: https://cable--655.org.readthedocs.build/en/655/