From finding services to running analysis - everything you need to know!
Presented by Samuel Fooks, Flanders Marine Institute (VLIZ)
For all the code and examples, check out the workshop GitHub repository
- Find Services - Navigate to datalab.dive.edito.eu
- Configure & Launch - Choose RStudio, Jupyter, or VSCode
- Run Analysis - STAC search, Parquet reading, Zarr data
- Personal Storage - Connect, upload, and manage your data
- Live Demos - See it all in action!
Perfect for researchers who want to get started quickly!
EDITO = European Digital Twin of the Ocean
A European infrastructure that provides:
- Cloud-based analysis environments (RStudio, Jupyter, VSCode)
- Direct access to marine datasets (STAC, Parquet, Zarr)
- Personal S3 storage for your results
We'll look at three kinds of services: RStudio, Jupyter, and VSCode.
Website: datalab.dive.edito.eu
# Connect to EDITO STAC API
library(rstac)
library(arrow)
library(dplyr)

stac_endpoint <- "https://api.dive.edito.eu/data/"
collections <- stac(stac_endpoint) %>%
  rstac::collections() %>%
  get_request()

# Read biodiversity data (first 1000 rows) from a GeoParquet file
parquet_url <- "https://s3.waw3-1.cloudferro.com/emodnet/biology/eurobis_occurrence_data/eurobis_occurrences_geoparquet_2024-10-01.parquet"
biodiversity_data <- arrow::read_parquet(parquet_url) %>%
  head(1000)

# Filter for marine species of interest
marine_data <- biodiversity_data %>%
  filter(grepl("fish|mollusk|algae", scientificName, ignore.case = TRUE))
import pyarrow.parquet as pq
import s3fs
import pandas as pd

# Read parquet data directly from the public S3 bucket
parquet_url = "https://s3.waw3-1.cloudferro.com/emodnet/biology/eurobis_occurrence_data/eurobis_occurrences_geoparquet_2024-10-01.parquet"
s3_path = parquet_url.split('s3.waw3-1.cloudferro.com/')[-1]
fs = s3fs.S3FileSystem(endpoint_url="https://s3.waw3-1.cloudferro.com", anon=True)
parquet_file = pq.ParquetFile(s3_path, filesystem=fs)

# Read only the first row group and keep the first 1000 rows
biodiversity_data = parquet_file.read_row_groups([0]).to_pandas().head(1000)

# Filter and process (na=False skips rows with a missing scientificName)
marine_data = biodiversity_data[
    biodiversity_data['scientificName'].str.contains('fish|mollusk|algae', case=False, na=False)
]
processed_data = marine_data.groupby('scientificName').agg(
    {'decimalLatitude': 'mean', 'decimalLongitude': 'mean'}
)
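The agenda also lists STAC search and Zarr data, which the snippets above don't show. Below is a minimal Python sketch, assuming the EDITO catalogue exposes the standard STAC /search endpoint and that pystac-client, xarray, zarr, and fsspec are installed; the bounding box and the Zarr URL are illustrative placeholders to replace with real values and asset hrefs from the catalogue.

from pystac_client import Client
import xarray as xr

# Open the EDITO STAC catalogue (same endpoint as the rstac example above)
catalog = Client.open("https://api.dive.edito.eu/data/")

# Search for up to 10 items in an example bounding box (lon/lat order)
search = catalog.search(bbox=[-5.0, 51.0, 9.0, 60.0], max_items=10)
for item in search.items():
    print(item.id, list(item.assets.keys()))

# Open a Zarr asset lazily with xarray; the URL below is a placeholder,
# use an asset href returned by the search instead
zarr_url = "https://example-bucket.example-endpoint/dataset.zarr"
ds = xr.open_dataset(zarr_url, engine="zarr", chunks={})
print(ds)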
Your personal storage credentials are automatically available in EDITO services!
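As a quick sanity check before running the storage examples below, you can print which S3-related environment variables your service received. The first three names are the ones used in the examples; AWS_SESSION_TOKEN and AWS_DEFAULT_REGION are commonly set as well but are an assumption here and may be absent.

import os

# Show which S3 credential variables are available in this service
for var in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_S3_ENDPOINT",
            "AWS_SESSION_TOKEN", "AWS_DEFAULT_REGION"):
    print(var, "set" if os.getenv(var) else "missing")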
# Check credentials and save data
if (Sys.getenv("AWS_ACCESS_KEY_ID") != "") {
  # Process and save data
  processed_data <- marine_data %>%
    group_by(scientificName) %>%
    summarise(count = n())
  write.csv(processed_data, "marine_analysis.csv", row.names = FALSE)

  # Upload to personal storage (replace "your-bucket" with your bucket name)
  aws.s3::s3write_using(processed_data,
                        FUN = write.csv,
                        bucket = "your-bucket",
                        object = "marine_analysis.csv")
}
import boto3
import os

# Connect to your personal S3 storage using the injected credentials
s3 = boto3.client(
    "s3",
    endpoint_url=f"https://{os.getenv('AWS_S3_ENDPOINT')}",
    aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),
    aws_session_token=os.getenv('AWS_SESSION_TOKEN'),  # None if no session token is set
)

# Save locally and upload to storage (replace 'your-bucket' with your bucket name)
processed_data.to_csv('marine_analysis.csv', index=False)
s3.put_object(Bucket='your-bucket', Key='marine_analysis.csv',
              Body=processed_data.to_csv(index=False))
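To confirm the upload worked, here is a minimal sketch that lists the objects in the bucket. It reuses the s3 client created above, and 'your-bucket' remains a placeholder for your own bucket name.

# List objects in the bucket to verify the upload
response = s3.list_objects_v2(Bucket='your-bucket')
for obj in response.get('Contents', []):
    print(obj['Key'], obj['Size'])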
- Marine Data - Direct access to EDITO datasets
- Multiple Languages - R, Python, and more
- Interactive - Step-by-step guided workflows
Email: edito-infra-dev@mercator-ocean.eu
Documentation: EDITO Tutorials
Ready to dive into marine data analysis?