Technical Deep Dive · Field Notes · Data Engineering

Nimbus Forecast: From Manual yr.no Lookups to a Deployed Platform at DCCMS

April 22, 2026 · 4 min read

Jimmy Matewere

The first Sunday, they asked me to collect rainfall forecasts from yr.no for 99 stations. The workflow was manual: open the site, search each coordinate, read the 7-day totals, log the numbers. I got through the first 10 and was genuinely annoyed. Not frustrated in a general way, annoyed in the specific way you get when something feels like it has no business being done by hand. The site had an API. That was enough information. I finished the task, then I told myself I was never doing it again.

The first version of the script was messy. Timezone issues, day-boundary misattribution, summing logic that would quietly produce the wrong totals without signalling anything was wrong. It took several iterations before I trusted the output. By the time it was stable, it handled up to 10 days of forecast, checked whether an output file for that date range already existed before writing, and asked whether you wanted to start from today or tomorrow. Small decisions, but each one had a reason. The prompts existed because I'd run it wrong enough times that I'd automated the sanity checks I was doing manually.
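
The day-boundary problem can be illustrated with a small sketch. This is not the original Python script: it is written in TypeScript, and it assumes hourly values stamped in UTC, a fixed UTC+2 local offset, and that each value belongs to the hour starting at its timestamp. The names `HourlyRain`, `localDateKey` and `dailyTotals` are hypothetical, and the sample values are made up.

```typescript
// Hypothetical shape: one hourly precipitation value with a UTC timestamp.
// This is an illustration, not the yr.no response format.
interface HourlyRain {
  timeUtc: string; // ISO 8601 in UTC, e.g. "2026-04-21T22:00:00Z"
  mm: number;      // rainfall for the hour starting at timeUtc (assumption)
}

// Assumption: local time is a fixed UTC+2 with no daylight saving.
const LOCAL_OFFSET_HOURS = 2;

// Shift the instant by the offset, then read the date part of the ISO string.
// The shifted Date is only used as a carrier for the local calendar date.
function localDateKey(timeUtc: string, offsetHours: number): string {
  const shifted = new Date(Date.parse(timeUtc) + offsetHours * 60 * 60 * 1000);
  return shifted.toISOString().slice(0, 10);
}

// Sum hourly values per local calendar day instead of per UTC day.
function dailyTotals(
  hours: HourlyRain[],
  offsetHours: number = LOCAL_OFFSET_HOURS
): Record<string, number> {
  const out: Record<string, number> = {};
  for (const h of hours) {
    const key = localDateKey(h.timeUtc, offsetHours);
    out[key] = (out[key] ?? 0) + h.mm;
  }
  return out;
}

// Made-up sample: all three hours fall on 21 April in UTC,
// but two of them already belong to 22 April in local time.
const sampleHours: HourlyRain[] = [
  { timeUtc: "2026-04-21T21:00:00Z", mm: 1.5 }, // 23:00 local, 21 April
  { timeUtc: "2026-04-21T22:00:00Z", mm: 2.0 }, // 00:00 local, 22 April
  { timeUtc: "2026-04-21T23:00:00Z", mm: 0.5 }, // 01:00 local, 22 April
];
const rainByDay = dailyTotals(sampleHours);
```

Grouping by the UTC date would put all 4.0 mm on 21 April. Grouping by the shifted date splits it into 1.5 mm and 2.5 mm, which is the kind of quiet misattribution described above.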

Terminal output of the original Python script fetching 7-day rainfall forecasts from yr.no for 99 stations — The original script that replaced the manual yr.no lookups. This is what the platform made obsolete.

The multimodel version came after. YR, ECMWF, GFS, ICON. Four sources, one output. The three Open-Meteo models (ECMWF, GFS, ICON) were clean: API values matched their own platform exactly. YR was different. Days 1 to 3 of the daily breakdown would show small variances against what yr.no displayed on the site, usually under 2 mm, then stabilize from day 4 onward. I understood why eventually: the near-term forecast uses a denser hourly timestep, and if you query close to a model update cycle, you catch a partial run. Not a bug, just how the API works. Under 2 mm I could live with.

The scripts worked. People were using them. I still couldn't leave it there.

Part of it was that the script required a terminal. That's fine for me, less fine for colleagues who don't live in VS Code. Part of it was that the multimodel output was still a CSV that fed an Excel file that fed a bulletin. There were too many steps between the data and the decision. But if I'm honest, part of it was also that I wanted to see how far I could take it. The scripts were a solution. The platform was a question about what the solution could become.

The initial plan had a Python backend on GCP. I scrapped it quickly. What was happening under the hood was API calls, data extraction, storage, display. Next.js handles all of that. Adding GCP was adding infrastructure for infrastructure's sake, and GCP's free-tier cold-start delays would have made the thing slower. I stayed in TypeScript and never looked back.

The platform went through more iterations than I can clearly sequence. Async concurrency for the API calls, because fetching 99 stations sequentially was too slow. A mutex lock in the database to prevent two concurrent ingestion runs from deleting and rewriting simultaneously. The correct timezone handling, which turned out to be harder than expected: Node's Intl engine on some Vercel serverless runtimes injects invisible Unicode characters into date strings that silently break Postgres parameter binding. I ended up doing UTC+2 manually, just arithmetic, to get a string Postgres would accept without complaint.
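
Two of those pieces are small enough to sketch. The code below is an illustration under stated assumptions, not the platform's actual code: `toCatTimestamp` builds a UTC+2 timestamp string from plain arithmetic without touching Intl, and `mapWithConcurrency` caps how many requests are in flight at once. Both helper names, the `YYYY-MM-DD HH:MM:SS` format and the fixed offset are assumptions.

```typescript
// Assumption: Malawi stays on UTC+2 (CAT) all year, so a fixed offset is enough.
const CAT_OFFSET_MS = 2 * 60 * 60 * 1000;

// Zero-pad to two digits using plain string concatenation.
function pad2(n: number): string {
  return (n < 10 ? "0" : "") + String(n);
}

// Hypothetical helper: format an instant as "YYYY-MM-DD HH:MM:SS" in UTC+2.
// Shifting the epoch value and reading the UTC getters avoids Intl entirely,
// so the result contains only ASCII digits, hyphens, colons and one space.
function toCatTimestamp(instant: Date): string {
  const s = new Date(instant.getTime() + CAT_OFFSET_MS);
  const date = [s.getUTCFullYear(), pad2(s.getUTCMonth() + 1), pad2(s.getUTCDate())].join("-");
  const time = [pad2(s.getUTCHours()), pad2(s.getUTCMinutes()), pad2(s.getUTCSeconds())].join(":");
  return date + " " + time;
}

// Hypothetical helper: run `worker` over `items` with at most `limit` calls
// in flight. Results keep the input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0;
  // Each runner repeatedly claims the next unclaimed index until none remain.
  const runner = async (): Promise<void> => {
    while (next < items.length) {
      const i = next;
      next += 1;
      results[i] = await worker(items[i] as T);
    }
  };
  const runners: Promise<void>[] = [];
  const runnerCount = Math.min(Math.max(limit, 1), items.length);
  for (let k = 0; k < runnerCount; k += 1) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results;
}

// 22:30:05 UTC on 21 April is 00:30:05 on 22 April in UTC+2.
const catStamp = toCatTimestamp(new Date("2026-04-21T22:30:05Z"));
```

A call such as `mapWithConcurrency(stations, 8, fetchStation)` would fetch all stations with at most eight requests open at a time. Here `stations`, `fetchStation` and the limit of 8 are placeholders, not values from the platform.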

The map came together. The station detail drawer. The daily breakdown chart showing all four models side by side.

Nimbus Forecast data grid showing YR, ECMWF, GFS and ICON rainfall forecasts for all 99 stations with spread and mean columns — The core view: four models, spread, mean, and the horizon slider that became the Monday morning workflow.
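
For readers unfamiliar with the spread and mean columns, here is a minimal sketch of how such values could be derived per station. It assumes spread means the highest model total minus the lowest, which is one common definition and may not be the one the platform uses. The type names, the `summariseModels` function and the numbers are all illustrative.

```typescript
type ModelName = "YR" | "ECMWF" | "GFS" | "ICON";

// Rainfall totals in mm over the selected horizon, one value per model.
type ModelTotals = Record<ModelName, number>;

interface ModelSummary {
  mean: number;   // arithmetic mean of the four totals
  spread: number; // assumption: max total minus min total
}

function summariseModels(t: ModelTotals): ModelSummary {
  const values = [t.YR, t.ECMWF, t.GFS, t.ICON];
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const spread = Math.max(...values) - Math.min(...values);
  return { mean, spread };
}

// Made-up totals for one station, not real forecast data.
const exampleStation: ModelTotals = { YR: 12, ECMWF: 8, GFS: 20, ICON: 10 };
const stationSummary = summariseModels(exampleStation);
// stationSummary.mean is 12.5 and stationSummary.spread is 12
```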

At some point I ran validation across all 99 stations after a fresh ingestion and got 100% parity on the Open-Meteo models. YR matched within the margins I'd already accepted.
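
A parity check of this kind can be sketched as a tolerance comparison. The version below assumes validation means comparing each stored value against a reference value for the same station, with a tolerance of 0 mm for the Open-Meteo models and 2 mm for YR. The `ParityRow` shape, the `parityRate` function and the sample rows are hypothetical.

```typescript
interface ParityRow {
  station: string;   // hypothetical station identifier
  stored: number;    // mm as read back from the database
  reference: number; // mm from the reference source
}

// Fraction of rows whose stored value is within toleranceMm of the reference.
// An empty input counts as full parity.
function parityRate(rows: ParityRow[], toleranceMm: number): number {
  if (rows.length === 0) return 1;
  const matching = rows.filter(
    (r) => Math.abs(r.stored - r.reference) <= toleranceMm
  ).length;
  return matching / rows.length;
}

// Made-up rows for illustration only.
const openMeteoRows: ParityRow[] = [
  { station: "A", stored: 4.2, reference: 4.2 },
  { station: "B", stored: 0, reference: 0 },
];
const yrRows: ParityRow[] = [
  { station: "A", stored: 5.1, reference: 6.4 }, // 1.3 mm apart, accepted
  { station: "B", stored: 3.0, reference: 5.5 }, // 2.5 mm apart, rejected
];

const openMeteoParity = parityRate(openMeteoRows, 0); // exact match required
const yrParity = parityRate(yrRows, 2);               // within 2 mm
```

With a tolerance of exactly 0, values that went through different floating-point paths may fail to match even when they print the same, so a very small tolerance can be a safer choice in practice.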

Station detail drawer in Nimbus Forecast with grouped bar chart comparing four models across the daily breakdown — Drawer view for a single station. The spread between models is what forecasters actually check before the bulletin.

A few weeks ago it became part of the actual workflow: not a demo, not a pilot, but the actual Monday morning process. I was happy. The kind of happy that's quiet. My work was being used at DCCMS. The feedback that came back was useful, and it gave me a clearer picture of where the friction still was.

The Vale Logistics portal came from that read. Eleven railway corridor stations, multi-parameter, temperature and wind alongside the rainfall, a CSV export formatted exactly to match the bulletin template. That extension wasn't in any original plan. It came from seeing that the rainfall picture was useful but the workflow around it still had too many manual steps. I like that kind of problem. It tells you exactly where to go next.

Vale Logistics portal in Nimbus Forecast showing Day 0 summary for 11 railway corridor stations with temperature, wind and rainfall — The Nacala extension that came from a real friction point. Same platform, different parameters and export format.

The platform will keep growing. Five-day regional forecasts across Malawi's climate zones sit next on the list, though that one still waits on the same kind of manual step that started everything: someone has to assign all 99 stations to their forecast regions in the CMS before any code can be written. Some things still require a human.
