Releasing

Author

Zeb Becker

Published

June 25, 2026

Overview

To improve reproducibility and reliability, we use tagged releases in our GitHub repo to determine which version of the FEDS codebase our NRT data processing system uses. The NRT system does not automatically incorporate changes pushed to the main branch of the GitHub repository. For your changes to be reflected in the production system, you need to release a new tagged version as described on this page, then edit the variable feds_algo_version on the Prefect orchestrator that submits NRT jobs to DPS for data processing.

What happens when we do this? At a high level, publishing a new tagged release on GitHub triggers the CI workflow defined in release.yaml, which submits a job to DPS asking it to build a new image (“register an algorithm,” in MAAP parlance) with the version of the code that the new tag refers to baked in. This makes a new algorithm version available on DPS, for example eis-feds-dask-coordinator-v3:1.5.1. We can then submit jobs to DPS that reference this algorithm version, and DPS will use the image built during the release process to run them.

Importantly, making the new version of the algorithm available on DPS is not enough: we also need our orchestrator to request that version for new NRT jobs going forward. We do this by changing the Prefect variable feds_algo_version in our production job orchestrator (refer to the internal orchestration runbook for more info). This also makes it easy to roll back to a previous version: if your release 1.5.1 has a bug, change feds_algo_version back to 1.5.0 to return production to the previous version.
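The split between "released" and "in production" can be sketched in a few lines of Python. This is illustrative only: the `algorithm_id` helper and the plain `variables` dict standing in for the Prefect variable store are assumptions made for this example, not the orchestrator's actual code.

```python
# Illustrative sketch: a plain dict stands in for the Prefect variable store.
ALGO_NAME = "eis-feds-dask-coordinator-v3"


def algorithm_id(version: str, name: str = ALGO_NAME) -> str:
    """Build a '<name>:<version>' identifier like the example given above."""
    return f"{name}:{version}"


# Releasing 1.5.1 makes a new algorithm version available on DPS, but
# production keeps using whatever feds_algo_version currently says.
variables = {"feds_algo_version": "1.5.0"}
in_production = algorithm_id(variables["feds_algo_version"])

# Deploying the release means updating the variable ...
variables["feds_algo_version"] = "1.5.1"
deployed = algorithm_id(variables["feds_algo_version"])

# ... and rolling back means setting it to the previous value again.
variables["feds_algo_version"] = "1.5.0"
rolled_back = algorithm_id(variables["feds_algo_version"])

print(in_production, deployed, rolled_back, sep="\n")
```

The point of the sketch is that the release step and the deploy step are independent: a release only adds an option, and the variable alone selects which option production uses.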

The rest of this document describes acceptance testing and release procedures in more detail.

Summary/release checklist

Acceptance Testing

Release

Post-Release

Rollback (if needed)

Acceptance Testing

Run unit tests

Our automated test suite runs every time a PR is opened for the main branch of the repository, and PRs should never be merged with any failing tests.

See our page on Contributing for information on how to run these tests locally while you are developing.

Test conda environment resolution on MAAP Hub

(Optional) If you have changed project dependencies, confirm the environment still resolves and passes tests before you regenerate the lock. The DPS image is built by fireatlas/maap_runtime/run_dps_build.sh, which installs from the pinned env.lock.yml — so the goal here is to verify that the loose env.yml still solves and works, then regenerate env.lock.yml from that known-good environment.

Run this on the MAAP Hub (Linux): env.lock.yml is platform-specific and must be generated there. Note that the Hub uses slightly different paths than the DPS image (/srv/conda rather than /opt/conda).

git clone https://github.com/Earth-Information-System/fireatlas.git
cd fireatlas
git switch branch-to-test

# if you have a previous environment, remove it and start clean
conda env remove -n fire_env -y

# build from the loose spec to test that it still resolves
conda env create -f env.yml
conda activate fire_env

# install fireatlas plus dev/test extras
python -m pip install -e '.[dev]'

# make sure no conflicts
python -m pip check

# run unit tests
# Specify abs path to pytest executable in this conda env if needed,
# e.g. /srv/conda/envs/fire_env/bin/pytest for MAAP Hub
/srv/conda/envs/fire_env/bin/pytest
/srv/conda/envs/fire_env/bin/pytest --runslow

If everything passes, regenerate env.lock.yml from this working environment and commit it alongside your dependency change (see the lock-regeneration command in Contributing). DPS and CI build from that lock, so it must reflect the environment you just validated. Again, this must be done on a Linux platform such as MAAP Hub.

Run end-to-end tests

We also have a bash script that runs the algorithm end-to-end on the past ten days using both the branch you currently have checked out and the main branch, then automatically compares the outputs and makes a report available for manual inspection. This can take a while to run, so debug your code with the unit tests first.

Follow the directions above to set up your conda environment (including making sure that the optional test dependency group is installed!), then:

# compare main against currently checked out branch 
bash maap_runtime/compare_branches.sh 

# or, specify which branch to compare main against 
bash maap_runtime/compare_branches.sh name-of-branch-to-test

Register and build release candidate image

We can also register a new algorithm on DPS to test the build process before releasing a new version. Once all of your changes are ready and tested, on MAAP Hub (so that you are already authenticated to DPS), edit fireatlas/maap_runtime/coordinator/algorithm_config.yaml, changing the following fields:

# fields to change
algorithm_name: eis-feds-dask-coordinator-v3-candidate
algorithm_version: name-of-branch-to-test

Make sure to save these changes, but do NOT commit them to git: they are only needed to submit a one-off release candidate build to DPS.

Then, run fireatlas/maap_runtime/register_all.ipynb to submit the algorithm registration to DPS. This will have DPS build a new image based off of the latest code committed to your branch on GitHub. Click on the job_web_url in the response to view the build process and ensure it completes successfully.

Manual DPS run with candidate image

Finally, we can use the manual-v3 GitHub Action to submit a test run to the DPS algorithm we just registered. Set the parameter “algorithm name” to eis-feds-dask-coordinator-v3-candidate and “Branch name or version tag” to name-of-branch-to-test. These must exactly match the values you put in the YAML file above.
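Because the candidate name and branch must be identical in the config file and in the GitHub Action form, one option is to generate the edited config text from a single branch name rather than typing the values twice. The sketch below is a hypothetical helper, not part of the repository; it assumes the config contains exactly one top-level `algorithm_name:` line and one `algorithm_version:` line, and it only transforms text in memory (writing the result to disk is left to you).

```python
import re

CANDIDATE_NAME = "eis-feds-dask-coordinator-v3-candidate"


def as_candidate(config_text: str, branch: str) -> str:
    """Return config text with the two release-candidate fields rewritten."""
    replacements = {"algorithm_name": CANDIDATE_NAME, "algorithm_version": branch}
    for field, value in replacements.items():
        new_line = f"{field}: {value}"
        # A function replacement avoids backslash interpretation in the value.
        config_text, count = re.subn(
            rf"^{field}:.*$",
            lambda _match, line=new_line: line,
            config_text,
            flags=re.MULTILINE,
        )
        if count != 1:
            raise ValueError(f"expected exactly one {field} field, found {count}")
    return config_text


# Hypothetical, abridged config used only for illustration.
original = (
    "algorithm_name: eis-feds-dask-coordinator-v3\n"
    "algorithm_version: 1.5.0\n"
    "environment: ubuntu\n"
)
candidate = as_candidate(original, "name-of-branch-to-test")
print(candidate)
```

The same two strings (`CANDIDATE_NAME` and the branch name) are then the values to enter in the manual-v3 form.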

This will not copy the outputs to VEDA for ingest into our API database, but it will use production inputs AND outputs. If you run this using a regnm that is used in production, be mindful of the time of day, since you could cache an incomplete version of the allfires outputs for that region.

Use the MAAP console to inspect the logs from your run and ensure it succeeded. You can also manually inspect the FEDS outputs produced.

Backups

The general recovery strategy is that we can re-download the input data from the UMD FTP server (for monthly VIIRS active fire detections standard product) or FIRMS (as shown in fireatlas/notebooks/20_FIRMS_input_backfill) and re-generate known-good FEDS outputs from this in just a few hours.

In some cases, you may wish to make backup copies of key directories before releasing. For example:

aws s3 cp s3://maap-ops-workspace/shared/gsfc_landslides/FEDSoutput-v3/CONUS/2026/ s3://maap-ops-workspace/shared/zbecker/FEDSbackups/FEDSoutput-v3/CONUS_backup/2026/ --recursive

aws s3 cp s3://maap-ops-workspace/shared/gsfc_landslides/FEDSinput/VIIRS/VJ114IMGTDL/ s3://maap-ops-workspace/shared/zbecker/FEDSbackups/VJ114IMGTDL_backup/ --recursive

aws s3 cp s3://maap-ops-workspace/shared/gsfc_landslides/FEDSinput/VIIRS/VNP14IMGTDL/ s3://maap-ops-workspace/shared/zbecker/FEDSbackups/VNP14IMGTDL_backup/ --recursive

How To Release

In this context “releasing” means the following things:

  • tagging the algorithm with a certain semantic version (semver for short)

  • building an image from that tag, which an async task runner (currently only DPS) uses to run the regional algorithm jobs.

Most of this can be automated, but semver is largely about whether the newest set of changes packaged under a version is backward compatible, so a human needs to choose the version.

Choose a Version Number

Look at the current release tags and versions and decide if the minor or patch version should be incremented:

  • Are all the merged changes in this release just bug fixes? Then bump the patch (<major>.<minor>.<patch>) version by one.
  • Do any of the merged changes going out include new features? Then bump the minor (<major>.<minor>.<patch>) version by one and reset the patch version to 0.
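The two rules above can be written as a small helper. This is a minimal sketch, not a tool in the repository; `bump_version` is a name chosen for this example. It follows standard semver behaviour, where a minor bump resets the patch number to 0, and it deliberately rejects anything other than a patch or minor bump because those are the only cases this page describes.

```python
def bump_version(current: str, change: str) -> str:
    """Return the next version for a 'patch' or 'minor' release."""
    major, minor, patch = (int(part) for part in current.split("."))
    if change == "patch":
        # Bug fixes only: increment the last component.
        return f"{major}.{minor}.{patch + 1}"
    if change == "minor":
        # New features: increment the middle component and reset patch.
        return f"{major}.{minor + 1}.0"
    raise ValueError(f"unsupported change type: {change!r}")


print(bump_version("1.5.0", "patch"))
print(bump_version("1.5.1", "minor"))
```

Deciding which kind of change a release contains still requires a human; the helper only does the arithmetic once that decision is made.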

Create a new PR for DPS Jobs

Once the releaser has a version number, they will need to create a PR that updates the version in a couple of places:

  • the algorithm config algorithm_version in ./maap_runtime/coordinator/algorithm_config.yaml:
algorithm_description: "coordinator for all regional jobs, preprocess and FireForward steps"
algorithm_version: <NEW VERSION NUMBER HERE>
environment: ubuntu
  • (DEPRECATED) unfortunately all the scheduled jobs also pass this version to kick off jobs and therefore also need to be updated in ./.github/workflows/schedule-*.yaml:
- name: kick off the DPS job
  uses: Earth-Information-System/fireatlas/.github/actions/run-dps-job-v3@conus-dps
  with:
    algo_name: eis-feds-dask-coordinator-v3
    github_ref: <NEW VERSION NUMBER HERE>
    username: gcorradini
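Before merging, it can be worth confirming that the version in the algorithm config is the one you intend to tag. The sketch below is a hypothetical check, not something the repository provides: it reads `algorithm_version` out of the config text with a regular expression (assuming a single top-level `algorithm_version:` line) and compares it with the planned tag.

```python
import re


def config_version(config_text: str) -> str:
    """Extract the algorithm_version value from algorithm_config.yaml text."""
    match = re.search(
        r"^algorithm_version:[ \t]*(\S+)[ \t]*$", config_text, flags=re.MULTILINE
    )
    if match is None:
        raise ValueError("no algorithm_version field found")
    # Tolerate a quoted value such as "1.5.1".
    return match.group(1).strip("\"'")


def matches_tag(config_text: str, tag: str) -> bool:
    """True when the config version is exactly the planned release tag."""
    return config_version(config_text) == tag


# Abridged config text; 1.5.1 is a made-up version for illustration.
sample = (
    'algorithm_description: "coordinator for all regional jobs, '
    'preprocess and FireForward steps"\n'
    "algorithm_version: 1.5.1\n"
    "environment: ubuntu\n"
)
print(matches_tag(sample, "1.5.1"))
```

To run this against the real file, read `maap_runtime/coordinator/algorithm_config.yaml` into a string and pass it as `config_text`.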

Merge PR and Manually Release

Merge the above PR, then kick off a new release by doing the following:

  1. Go to https://github.com/Earth-Information-System/fireatlas/releases

  2. Click “Draft New Release”.

  3. Create a new tag for this release that matches the version chosen above.

  4. Click “Generate release notes”.

  5. Review the release notes and clean them up.

  6. Click “Publish release”.

Verify DPS Image Build

The most likely failure in this workflow is that the DPS image builder fails to build our image.

In the GitHub Actions release job you should be able to see something like this:

{
  "code": 200,
  "message": {
    "id": "ec3202d4adeb02f7d887d88d2af9784184e60344",
    "short_id": "ec3202d4",
    "created_at": "2024-07-30T20:34:28.000+00:00",
    "parent_ids": ["91dfb3a4edff20c7049825101f015b67c8a05d3a"],
    "title": "Registering algorithm: eis-feds-dask-coordinator-v3", 
    "message": "Registering algorithm: eis-feds-dask-coordinator-v3",
    "author_name": "root",
    "author_email": "root@845666954fdb",
    "authored_date": "2024-07-30T20:34:28.000+00:00",
    "committer_name": "root",
    "committer_email": "root@845666954fdb", 
    "committed_date": "2024-07-30T20:34:28.000+00:00",
    "trailers": {},
    "web_url": "https://repo.maap-project.org/root/register-job-hysds-v4/-/commit/ec3202d4adeb02f7d887d88d2af9784184e60344", 
    "stats": {
      "additions": 7,
      "deletions": 7, 
      "total": 14
    },
    "status": "created",
    "project_id": 3,
    "last_pipeline": {
      "id": 14293,
      "iid": 1332,
      "project_id": 3,
      "sha": "ec3202d4adeb02f7d887d88d2af9784184e60344",
      "ref": "main",
      "status": "created",
      "source": "push",
      "created_at": "2024-07-30T20:34:29.737Z",
      "updated_at": "2024-07-30T20:34:29.737Z", 
      "web_url": "https://repo.maap-project.org/root/register-job-hysds-v4/-/pipelines/14293"
    },
    "job_web_url": "https://repo.maap-project.org/root/register-job-hysds-v4/-/jobs/14578",
    "job_log_url": "https://repo.maap-project.org/root/register-job-hysds-v4/-/jobs/14578/raw"
  }
}

Click on job_web_url to view the DPS image build and ensure it succeeds.
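If you prefer to pull the link out of the response programmatically, a minimal sketch follows. The `build_job_url` helper is hypothetical, and the sample is an abridged copy of the example response above. Note that a `code` of 200 only indicates that the registration request was accepted; the build itself still has to be checked at the returned URL.

```python
import json


def build_job_url(response_text: str) -> str:
    """Return job_web_url from a registration response, or raise if not accepted."""
    response = json.loads(response_text)
    if response.get("code") != 200:
        raise RuntimeError(f"registration was not accepted: code={response.get('code')}")
    return response["message"]["job_web_url"]


# Abridged copy of the example response shown above.
sample = """
{
  "code": 200,
  "message": {
    "title": "Registering algorithm: eis-feds-dask-coordinator-v3",
    "status": "created",
    "job_web_url": "https://repo.maap-project.org/root/register-job-hysds-v4/-/jobs/14578"
  }
}
"""
print(build_job_url(sample))
```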

Deploy New Version to Production

At this point, we have released a new tagged version of the fireatlas code and used it to build a new image (that is, register a new algorithm) on DPS. However, we are still not using it in NRT production. The version of the code used in production depends ONLY on the Prefect variable feds_algo_version in our production job orchestrator. We MUST change this to the new version for the release to go into production.

Refer to the internal orchestration runbook for directions on how to do this.

Rollbacks and Troubleshooting

As mentioned above, the fastest way to roll back to a previous algorithm version in production is to change feds_algo_version back to the latest stable version. All new NRT jobs submitted after this point will use that version.

If input, intermediate, or output data become corrupted, you can delete them and either restore from backups made earlier, or re-download the input data from FIRMS (see fireatlas/notebooks/20_FIRMS_input_backfill) and re-generate the outputs for the current year to date by triggering a new NRT run. If you do this, be sure to delete all relevant preprocessed files as well, as these can cache stale data.