{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# add/remove param S04P merge example\n", "Demos doing a merge on S04P CTD data, this notebook was made to test the functions on a real dataset" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from zipfile import ZipFile\n", "from collections import defaultdict\n", "\n", "import xarray as xr\n", "import requests\n", "\n", "from cchdo.hydro.core import add_param, remove_param\n", "import cchdo.hydro.accessors" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The netCDF4 python library really wants things on disk to read, there are alternatives and ways around this, but it's easy to just write it out" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "ctd_dl = requests.get(\"https://cchdo.ucsd.edu/data/41655/320620180309_ctd.nc\")\n", "with open(\"320620180309_ctd.nc\", \"wb\") as f:\n", " f.write(ctd_dl.content)\n", "beamcp = requests.get(\"https://cchdo.ucsd.edu/data/14754/2018_S04P.zip\")\n", "with open(\"2018_S04P.zip\", \"wb\") as f:\n", " f.write(beamcp.content)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Load the data, note that I did some exploring of the beamcp input before finalizing this for the \"load step\"\n", "ctd_data = xr.load_dataset(\"320620180309_ctd.nc\")\n", "with ZipFile(\"2018_S04P.zip\") as zf:\n", " beamcp = zf.read(\"2018_S04P.txt\").decode(\"ascii\").splitlines()\n", " beam_cp_cells = [line.split() for line in beamcp]" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# split the profiles into... profiles\n", "profiles = defaultdict(list)\n", "last_profile = None\n", "for line in beam_cp_cells:\n", " if len(line) > 2:\n", " *_, station, cast = line\n", " last_profile = (station, cast)\n", " continue\n", " if last_profile is None:\n", " continue\n", " profiles[last_profile].append(line)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The remove and add param functions are what was being tested here, the incoming data repalces the raw data" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "removed_raw_beamcp = remove_param(\n", " ctd_data, \"CTDXMISS [VOLTS]\", delete_param=True, require_empty=False\n", ")\n", "new_ctd = add_param(removed_raw_beamcp, \"CTDBEAMCP [/METER]\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make the merge_fq structure for merging, note that the incoming values are kept as strings, this is so the extract precision functions can update the print format in the netCDF file" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "fq_json = []\n", "expocode = new_ctd.expocode[0].item()\n", "for (station, cast), profile in profiles.items():\n", " cast = int(cast)\n", " for row in profile:\n", " if row[1].startswith(\"-88\"):\n", " continue\n", " # in the recalibrated ODF file, the last pressure level of station 119 was dropped\n", " if station == \"119\" and row[0] == \"3210.0\":\n", " continue\n", " fq_json.append(\n", " {\n", " \"EXPOCODE\": expocode,\n", " \"STNNBR\": station,\n", " \"CASTNO\": cast,\n", " \"SAMPNO\": row[0],\n", " \"CTDBEAMCP [/METER]\": row[1],\n", " }\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "not quite planned, but need to fix the station ids to remove leading zeros,\n", "this was done in the origional merge by see in 2019" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "new_ctd[\"station\"][:] = np.strings.lstrip(new_ctd.station.values.astype(np.str_), \"0\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Do the actual merge, this took like 40 seconds on an m1, kinda long..." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 2 s, sys: 45.6 ms, total: 2.05 s\n", "Wall time: 2.07 s\n" ] } ], "source": [ "%%time\n", "merged_ctd = new_ctd.cchdo.merge_fq(fq_json)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "merged_ctd.attrs[\"comments\"] = (\n", " f\"Remerged CTDBEAMCP data into ODF resubmission\\n{merged_ctd.comments}\"\n", ")\n", "merged_ctd.attrs[\"cchdo_software_version\"] = \"hydro 1.0.2.9\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Write some output files to examine and share with colleagues" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "merged_ctd.to_netcdf(\"s04p_merged_ctd.nc\")\n", "merged_ctd.cchdo.to_exchange(\"s04p_merged_ct1.zip\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.14" } }, "nbformat": 4, "nbformat_minor": 4 }