Using JupyterLab to monitor coastal and marine environments in the Philippines on CopPhil

This notebook demonstrates how Sentinel-2 satellite imagery can be used to estimate and visualize key water quality indicators in Laguna de Bay, Philippines.

By applying spectral indices derived from Sentinel-2 reflectance bands, we can monitor parameters such as chlorophyll-a, cyanobacteria, turbidity, CDOM, DOC, and water color.

What this notebook covers

  • Connecting to the /eodata archive to access Sentinel-2 Level-2A products

  • Loading and preprocessing the required spectral bands

  • Calculating empirical water quality indices

  • Visualizing results with custom color scales and percentile-based clipping

  • Exploring how satellite-based monitoring can support environmental management ### Important Note The values derived in this notebook are based on empirical models. They provide useful relative indicators of water quality but cannot be considered absolute concentrations without in-situ calibration and validation. Ground measurements are essential for accurate monitoring and decision-making. ### References The empirical formulas for water quality proxies are adapted from the Sentinel Hub SE2WAQ.

Prerequisites

No. 1 Access to CopPhil site

You need a CopPhil hosting account, available at https://infra.copphil.philsa.gov.ph.

No. 2 Access to JupyterLab

JupyterLab used in this article is available here: https://jupyter.infra.copphil.philsa.gov.ph/hub/login?next=%2Fhub%2F.

No. 3 Working knowledge of JupyterLab

See article Introduction to JupyterLab on CopPhil

No. 4 Using API

This is an introductory example of obtaining EODATA using API calls, in Python:

Download satellite data using API on CopPhil

No. 5 Information on Sentinel-5P mission

Page Sentinel-5P mission shows basic information on Sentinel-5P mission, which is used in this article as a source of information.

No. 6 Have credentials to access EODATA ready

To obtain the necessary credentials to access EODATA, follow this article:

How to get credentials to access EODATA on CopPhil

You can generate up to 200 different keys per account.

Prepare your environment

import requests
import json
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os
import boto3
import rasterio
import matplotlib.colors as mcolors
We first establish a connection to the CopPhil S3 storage provided

by CloudFerro. This bucket (/eodata) hosts the Sentinel-2 archive, accessible directly as .SAFE folders.

If the connection succeeds, we can later open Sentinel-2 products directly from /eodata paths without downloading them locally.

#Enter your eodata credentials here:
access_key = ''
secret_key = ''
host = 'https://eodata.infra.copphil.philsa.gov.ph/'

# Initialize S3 resource
s3 = boto3.resource(
    's3',
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key,
    endpoint_url=host,
)

# Verify connection by accessing the 'eodata' bucket
bucket_name = 'eodata'
try:
    bucket = s3.Bucket(bucket_name)
    # Check if the bucket is accessible by listing its contents
    objects = list(bucket.objects.limit(1))  # Limit to 1 object for testing
    if objects:
        print(f"Connection to bucket '{bucket_name}' successful. Objects found.")
    else:
        print(f"Connection to bucket '{bucket_name}' successful, but no objects found.")
except Exception as e:
    print(f"Connection to bucket '{bucket_name}' failed:", e)
Connection to bucket 'eodata' successful. Objects found.

Search Sentinel-2 Products (Manila Bay)

We create a search query for the CopPhil OData API to find Sentinel-2 products that cover Manila Bay between the beggining of 2024 and today.

# search parameters: Sentinel-2 L2A over Manila Bay
catalogue_odata_url = "https://catalogue.infra.copphil.philsa.gov.ph/odata/v1"
collection_name = "SENTINEL-2"
product_type = "S2MSI2A"   # Level-2A (surface reflectance)

# AOI: Manila Bay
aoi = "POLYGON((121.05 14.65, 121.50 14.65, 121.50 14.15, 121.05 14.15, 121.05 14.65))"

# Time window: wide window to get more scenes
search_period_start = "2024-01-01T00:00:00.000Z"
search_period_end   = "2025-08-26T23:59:59.000Z"

# Optional: Cloud coverage set to <10% to get clearer images
# (cloudCover is a DoubleAttribute in the catalogue)
cloud_cover_max = 10.0

search_query = (
    f"{catalogue_odata_url}/Products?"
    f"$filter=Collection/Name eq '{collection_name}' "
    f"and Attributes/OData.CSC.StringAttribute/any(att:att/Name eq 'productType' "
    f"and att/OData.CSC.StringAttribute/Value eq '{product_type}') "
    f"and Attributes/OData.CSC.DoubleAttribute/any(att:att/Name eq 'cloudCover' "
    f"and att/OData.CSC.DoubleAttribute/Value lt {cloud_cover_max}) " #lt = less than
    f"and OData.CSC.Intersects(area=geography'SRID=4326;{aoi}') "
    f"and ContentDate/Start gt {search_period_start} "
    f"and ContentDate/Start lt {search_period_end}"
    f"&$top=1000"
)

print(search_query)
https://catalogue.infra.copphil.philsa.gov.ph/odata/v1/Products?$filter=Collection/Name eq 'SENTINEL-2' and Attributes/OData.CSC.StringAttribute/any(att:att/Name eq 'productType' and att/OData.CSC.StringAttribute/Value eq 'S2MSI2A') and Attributes/OData.CSC.DoubleAttribute/any(att:att/Name eq 'cloudCover' and att/OData.CSC.DoubleAttribute/Value lt 10.0) and OData.CSC.Intersects(area=geography'SRID=4326;POLYGON((121.05 14.65, 121.50 14.65, 121.50 14.15, 121.05 14.15, 121.05 14.65))') and ContentDate/Start gt 2024-01-01T00:00:00.000Z and ContentDate/Start lt 2025-08-26T23:59:59.000Z&$top=1000

Retrieve and Inspect Search Results

We send the query and parse the JSON response into a DataFrame for easier inspection. This gives us an overview of how many Sentinel-2 images from the search query are available.

response = requests.get(search_query).json()
result = pd.DataFrame.from_dict(response["value"])
print(f'Number of products which meets query filters: {len(result)}')
Number of products which meets query filters: 9

After confirming the images exist we can list paths to eodata:

products = json.loads(requests.get(search_query).text)

path_list = []
for item in products['value']:
    path_list.append(item['S3Path'])

path_list
['/eodata/Sentinel-2/MSI/L2A/2024/11/22/S2A_MSIL2A_20241122T022011_N0511_R003_T51PTS_20241122T053347.SAFE',
 '/eodata/Sentinel-2/MSI/L2A/2024/03/07/S2A_MSIL2A_20240307T021541_N0510_R003_T51PTR_20240307T053851',
 '/eodata/Sentinel-2/MSI/L2A/2024/02/26/S2A_MSIL2A_20240226T021651_N0510_R003_T51PTR_20240226T053853',
 '/eodata/Sentinel-2/MSI/L2A/2024/04/16/S2A_MSIL2A_20240416T021611_N0510_R003_T51PTS_20240416T075854',
 '/eodata/Sentinel-2/MSI/L2A/2024/04/11/S2B_MSIL2A_20240411T021529_N0510_R003_T51PTR_20240411T043123',
 '/eodata/Sentinel-2/MSI/L2A/2025/01/06/S2B_MSIL2A_20250106T022009_N0511_R003_T51PTS_20250106T041630.SAFE',
 '/eodata/Sentinel-2/MSI/L2A/2024/04/16/S2A_MSIL2A_20240416T021611_N0510_R003_T51PTR_20240416T075854',
 '/eodata/Sentinel-2/MSI/L2A/2024/02/26/S2A_MSIL2A_20240226T021651_N0510_R003_T51PUR_20240226T053853',
 '/eodata/Sentinel-2/MSI/L2A/2025/08/04/S2B_MSIL2A_20250804T021529_N0511_R003_T51PTS_20250804T041948.SAFE']

And input the image folder path into output_dir, for extraction of bands:

# Specify the directory where the files are located
output_dir = '/eodata/Sentinel-2/MSI/L2A/2025/01/06/S2B_MSIL2A_20250106T022009_N0511_R003_T51PTS_20250106T041630.SAFE'
# We can either use the full path above or use the path_list from the search results
#output_dir = path_list[x]  # Replace x with the index of the desired product

Loading Sentinel-2 Bands (20m Resolution)

This step defines a helper function to automatically locate and load

Sentinel-2 20m resolution bands from a .SAFE product. The function searches for JP2 files in the R20m folder and loads only the requested bands.

We then:

  • Specify the bands needed for water quality analysis (B01, B02, B03, B04, SCL).

  • Load them into memory as arrays.

  • Normalize reflectance values by dividing by 10,000 (converting digital numbers into surface reflectance).

This provides ready-to-use band arrays for further index calculations.

def find_and_load_s2_bands_20m(safe_folder, bands):
    """
    Find and load Sentinel-2 bands at 20m resolution (JP2 files) inside a .SAFE folder.
    Returns a dict {band_name: array}.
    """
    band_data = {}

    for root, dirs, files in os.walk(safe_folder): #walk through the directories
        if "R20m" in root:  # only look into the 20m folder
            for file_name in files:
                if file_name.lower().endswith(".jp2"):
                    for band in bands:
                        if f"{band}.jp2" in file_name or f"_{band}_" in file_name: # check if the band is in the filename
                            path = os.path.join(root, file_name) # full path to the file
                            with rasterio.open(path) as src:
                                band_data[band] = src.read(1)
                            print(f"Loaded {band}: {path}")
    return band_data

# Example usage
bands_needed = ("B01","B02","B03","B04","SCL")
s2_bands_20m = find_and_load_s2_bands_20m(output_dir, bands_needed)

# Access arrays directly
b01 = s2_bands_20m.get("B01")
b02 = s2_bands_20m.get("B02")
b03 = s2_bands_20m.get("B03")
b04 = s2_bands_20m.get("B04")

scl = s2_bands_20m.get("SCL")
print("Bands loaded:", list(s2_bands_20m.keys()))

#band normalization
b01 = b01 / 10000.0
b02 = b02 / 10000.0
b03 = b03 / 10000.0
b04 = b04 / 10000.0
Loaded B01: /eodata/Sentinel-2/MSI/L2A/2025/01/06/S2B_MSIL2A_20250106T022009_N0511_R003_T51PTS_20250106T041630.SAFE/GRANULE/L2A_T51PTS_A040927_20250106T022850/IMG_DATA/R20m/T51PTS_20250106T022009_B01_20m.jp2
Loaded B02: /eodata/Sentinel-2/MSI/L2A/2025/01/06/S2B_MSIL2A_20250106T022009_N0511_R003_T51PTS_20250106T041630.SAFE/GRANULE/L2A_T51PTS_A040927_20250106T022850/IMG_DATA/R20m/T51PTS_20250106T022009_B02_20m.jp2
Loaded B03: /eodata/Sentinel-2/MSI/L2A/2025/01/06/S2B_MSIL2A_20250106T022009_N0511_R003_T51PTS_20250106T041630.SAFE/GRANULE/L2A_T51PTS_A040927_20250106T022850/IMG_DATA/R20m/T51PTS_20250106T022009_B03_20m.jp2
Loaded B04: /eodata/Sentinel-2/MSI/L2A/2025/01/06/S2B_MSIL2A_20250106T022009_N0511_R003_T51PTS_20250106T041630.SAFE/GRANULE/L2A_T51PTS_A040927_20250106T022850/IMG_DATA/R20m/T51PTS_20250106T022009_B04_20m.jp2
Loaded SCL: /eodata/Sentinel-2/MSI/L2A/2025/01/06/S2B_MSIL2A_20250106T022009_N0511_R003_T51PTS_20250106T041630.SAFE/GRANULE/L2A_T51PTS_A040927_20250106T022850/IMG_DATA/R20m/T51PTS_20250106T022009_SCL_20m.jp2
Bands loaded: ['B01', 'B02', 'B03', 'B04', 'SCL']

Visualizing Sentinel-2 True Color (RGB)

We stack bands B04 (Red), B03 (Green), and B02 (Blue) into an RGB

composite to create a natural-color image.

To improve contrast, the values are stretched between the **2nd and

98th percentiles**, which enhances visibility by removing extreme dark/bright pixels.

The result is a true-color Sentinel-2 image at 20 m resolution, plotted with Matplotlib.

# Stack into RGB
rgb = np.dstack([b04, b03, b02])
# stretch between 2nd–98th percentile to imporve contrast
rgb_min, rgb_max = np.percentile(rgb, (2, 98))
rgb_stretched = np.clip((rgb - rgb_min) / (rgb_max - rgb_min), 0, 1)

# Plot
plt.figure(figsize=(9,9))
plt.imshow(rgb_stretched)
plt.title("Sentinel-2 True Color (B04, B03, B02) - 20m")
plt.axis("off")
plt.show()
../_images/Using-JupyterLab-to-monitor-coastal-and-marine-environments-in-the-Philippines_15_0.png

Extracting Water Pixels with the Scene Classification Layer (SCL)

Sentinel-2 Level-2A products include a Scene Classification Layer (SCL), which provides pixel-level labels such as vegetation, cloud, shadows, and water.

In this layer, water pixels are assigned the class value of 6.

Here we:

  • Create a water_mask by selecting all pixels where SCL == 6.

  • Print out the total number of water pixels found.

If the count is 0, the selected scene does not contain any detected water.

#Water pixels are classified with value 6 in the SCL band
water_mask = (scl == 6)
#print out number of water pixels, if it's 0, then the scene has no water.
print("Water pixels:", int(water_mask.sum()))
Water pixels: 7816831

Estimating Chlorophyll-a Concentration

Chlorophyll-a (Chl-a) is a key indicator of phytoplankton biomass and overall water quality. We calculate it here using an empirical model based on Sentinel-2 reflectance bands:

\[Chl\_a = 4.26 \times \left(\frac{B03}{B01}\right)^{3.94}\]
Steps performed: - Compute Chl-a concentration from the **Green

(B03)** and Coastal Aerosol (B01) bands.

  • Clip values to the 95th percentile to reduce the influence of extreme outliers.

  • Apply the water mask (from SCL) to exclude land and other non-water pixels.

  • Define a custom color scale and class breaks ([0, 5, 8, 11, 16]) based on the pixel value distribution in this scene.

  • Visualize the Chl-a concentration map with a colorbar for interpretation.

# Calculating the empirical models for Chlorophyl concentration
chl_a = 4.26 * np.power((b03 / b01), 3.94)
# Clip the values to the 95th percentile to avoid outliers messing up the color scale
chl_a = np.clip(chl_a, 0, np.nanpercentile(chl_a, 95))
# Apply SCL water mask
chl_a = np.where(water_mask, chl_a, np.nan)
scaleChl_a = [0, 5, 8, 11, 16]  #custom scale for this photo based on pixel value distribution
# Custom color scale
colorScale = [
   (73/255, 111/255, 242/255),   # blue
   (130/255, 211/255, 95/255),   # green
   (254/255, 253/255, 5/255),    # yellow
   (253/255, 0/255, 4/255),      # red
   (142/255, 32/255, 38/255),    # dark red
   (217/255, 124/255, 245/255)   # purple
]
custom_cmap = mcolors.ListedColormap(colorScale)
# Create norm to map the values to the corresponding bins
norm_chl  = mcolors.BoundaryNorm(scaleChl_a, custom_cmap.N)
# plot
fig, ax = plt.subplots(figsize=(8, 6))
im = ax.imshow(chl_a, cmap=custom_cmap, norm=norm_chl)
ax.set_title("Chl-a (µg/L)")
# Add colorbar
plt.colorbar(im, ax=ax, fraction=0.046, pad=0.04, label="Chl-a (µg/L)")
ax.axis("off")
plt.tight_layout()
plt.show()
../_images/Using-JupyterLab-to-monitor-coastal-and-marine-environments-in-the-Philippines_19_0.png

Calculating and Visualizing Multiple Water Quality Parameters

In addition to Chlorophyll-a, we calculate several other empirical water quality indicators derived from Sentinel-2 reflectance bands:

  • Cyanobacteria (Cya)

    \[Cya = 115530.31 \times \left(\frac{B03 \cdot B04}{B02}\right)^{2.38}\]
  • Turbidity

    \[Turb = 8.93 \times \frac{B03}{B01} - 6.39\]
  • CDOM (Colored Dissolved Organic Matter)

    \[CDOM = 537 \times e^{-2.93 \cdot \frac{B03}{B04}}\]
  • DOC (Dissolved Organic Carbon)

    \[DOC = 432 \times e^{-2.24 \cdot \frac{B03}{B04}}\]
  • Color (optical water color proxy)

    \[Color = 25366 \times e^{-4.53 \cdot \frac{B03}{B04}}\]

Processing steps:

  • Clip values at the 95th percentile to remove outliers.

  • Apply the SCL water mask to restrict analysis to water pixels.

  • Define custom bins for each parameter, based on the observed distribution.

  • Visualize all indices in a 2 × 3 subplot grid with consistent color scales.

This provides a comparative overview of different water quality proxies for the selected Sentinel-2 scene.

# Formulas for other water quality parameters
cya   = 115530.31 * np.power(((b03 * b04) / b02), 2.38)
turb  = 8.93 * (b03 / b01) - 6.39
cdom  = 537 * np.exp(-2.93 * (b03 / b04))
doc   = 432 * np.exp(-2.24 * (b03 / b04))
color = 25366 * np.exp(-4.53 * (b03 / b04))
# Clip the values to the 95th percentile to
cya   = np.clip(cya,   0, np.nanpercentile(cya, 95))
cdom  = np.clip(cdom,  0, np.nanpercentile(cdom, 95))
doc   = np.clip(doc,   0, np.nanpercentile(doc, 95))
color = np.clip(color, 0, np.nanpercentile(color, 95))
turb  = np.clip(turb,  0, np.nanpercentile(turb, 95))

# Apply water mask
cya    = np.where(water_mask, cya,    np.nan)
turb   = np.where(water_mask, turb,   np.nan)
cdom   = np.where(water_mask, cdom,   np.nan)
doc    = np.where(water_mask, doc,    np.nan)
color  = np.where(water_mask, color,  np.nan)

# Custom scales
scaleCya   = [12, 900, 1300, 2100, 5200]    # cyanobacteria
scaleTurb  = [0, 3, 4, 5, 6]                # turbidity
scaleCDOM  = [0, 17, 20, 25, 35]            # CDOM absorption
scaleDOC   = [0, 30, 35, 40, 50]            # dissolved organic carbon
scaleColor = [0, 120, 160, 210, 350]        # water color

# Norms
norm_cya  = mcolors.BoundaryNorm(scaleCya, custom_cmap.N)
norm_turb = mcolors.BoundaryNorm(scaleTurb, custom_cmap.N)
norm_cdom = mcolors.BoundaryNorm(scaleCDOM, custom_cmap.N)
norm_doc  = mcolors.BoundaryNorm(scaleDOC, custom_cmap.N)
norm_col  = mcolors.BoundaryNorm(scaleColor, custom_cmap.N)

# Plot all side by side in 2x3 grid
fig, axs = plt.subplots(2, 3, figsize=(18, 12))

im1 = axs[0,0].imshow(chl_a,  cmap=custom_cmap, norm=norm_chl);   axs[0,0].set_title("Chl-a (µg/L)")
im2 = axs[0,1].imshow(cya,    cmap=custom_cmap, norm=norm_cya);   axs[0,1].set_title("Cyanobacteria")
im3 = axs[0,2].imshow(turb,   cmap=custom_cmap, norm=norm_turb);  axs[0,2].set_title("Turbidity")
im4 = axs[1,0].imshow(cdom,   cmap=custom_cmap, norm=norm_cdom);  axs[1,0].set_title("CDOM")
im5 = axs[1,1].imshow(doc,    cmap=custom_cmap, norm=norm_doc);   axs[1,1].set_title("DOC")
im6 = axs[1,2].imshow(color,  cmap=custom_cmap, norm=norm_col);   axs[1,2].set_title("Color")

for ax, im, label in zip(axs.flat,
                         [im1, im2, im3, im4, im5, im6],
                         ["Chl-a (µg/L)", "Cya", "Turbidity", "CDOM", "DOC", "Color"]):
    plt.colorbar(im, ax=ax, fraction=0.046, pad=0.04, label=label)
    ax.axis("off")

plt.tight_layout()
plt.show()
../_images/Using-JupyterLab-to-monitor-coastal-and-marine-environments-in-the-Philippines_21_0.png

Next Steps and Improvements

While this notebook demonstrates the potential of Sentinel-2 imagery for monitoring water quality in Laguna de Bay, several important improvements can make the analysis more robust and actionable:

In-situ validation

The empirical indices used here must be calibrated and validated with ground-truth measurements to ensure quantitative accuracy.

Time-series analysis

Extending the workflow to multiple Sentinel-2 acquisitions would allow detection of seasonal trends, algal blooms, and long-term water quality changes.

Cloud and shadow masking

Refining the masking using the SCL layer or additional algorithms would improve reliability by excluding contaminated pixels.

This workflow provides a relative assessment of water quality proxies. For practical environmental management and decision-making, it should always be complemented with in-situ data collection and calibration.