Supported file formats
This page documents the file formats found in the marine data archive and their support status in the SeisLabData extraction pipeline.
GDAL raster formats
These formats are read using GDAL raster drivers. The extractor can retrieve: bounding box, CRS/EPSG code, band count, pixel resolution, and nodata value.
GeoTIFF
| Property | Value |
|---|---|
| Extension(s) | .tif, .tiff |
| Media type | image/tiff |
| GDAL driver | GTiff |
| Categories | Bathymetry, Backscatter |
| Stages | QC, Processed data, Interpreted data |
Georeferenced raster format with embedded CRS and metadata. Widely supported across GIS tools. Used as primary file for processed bathymetry/backscatter grids and interpreted products.
XYZ grid
| Property | Value |
|---|---|
| Extension(s) | .xyz |
| Media type | text/plain |
| GDAL driver | XYZ |
| Categories | Bathymetry, Backscatter |
| Stages | Raw data, QC, Interpreted data |
Plain text point cloud format with X, Y, Z columns. Used as primary file for raw/QC bathymetry data and as secondary file in some stages. GDAL reads it as a gridded raster dataset.
ASCII Grid
| Property | Value |
|---|---|
| Extension(s) | .asc |
| Media type | text/plain |
| GDAL driver | AAIGrid |
| Categories | Bathymetry, Backscatter |
| Stages | Interpreted data |
ESRI ASCII raster format. Header defines grid dimensions, cell size, origin, and nodata value, followed by space-delimited cell values. Used as primary file for interpreted products.
Float Grid
| Property | Value |
|---|---|
| Extension(s) | .flt + .hdr |
| Media type | application/octet-stream |
| GDAL driver | EHdr |
| Categories | Bathymetry, Backscatter |
| Stages | Processed data, Interpreted data |
Binary raster format storing 32-bit IEEE floating-point values in row-major order. Requires a
companion .hdr header file containing grid metadata (ncols, nrows, cellsize, nodata_value,
byteorder). Used as secondary file for processed data and as primary/secondary for interpreted
products.
NetCDF
| Property | Value |
|---|---|
| Extension(s) | .nc, .nc4 |
| Media type | application/x-netcdf |
| GDAL driver | netCDF |
| Categories | Bathymetry, Backscatter |
| Stages | Processed data, Interpreted data |
Network Common Data Form, a self-describing binary format widely used for oceanographic, atmospheric, and climate data. GDAL can read NetCDF files as raster datasets, but only when they follow the CF (Climate and Forecast) conventions with properly defined coordinate variables, dimensions, and grid mapping attributes. Unstructured NetCDF files or files that do not follow the CF standard structure cannot be read by GDAL.
CSV
| Property | Value |
|---|---|
| Extension(s) | .csv |
| Media type | text/csv |
| GDAL driver | CSV |
| Categories | Bathymetry, Backscatter |
| Stages | Raw data, QC, Interpreted data |
Comma-separated or delimited text file containing point data (X, Y, Z or lon, lat, value). Similar to XYZ but with header row and flexible column naming. OGR reads it as a point vector dataset; can also be treated as raster via VRT.
OGR vector formats
These formats are read using OGR vector drivers. The extractor can retrieve: bounding box, CRS/EPSG code, feature count, and geometry type.
Shapefile
| Property | Value |
|---|---|
| Extension(s) | .shp + .shx + .dbf |
| Media type | application/x-shapefile |
| OGR driver | ESRI Shapefile |
| Categories | Bathymetry |
| Stages | Raw data, QC, Processed data, Interpreted data |
ESRI vector format storing point, line, or polygon geometries with attribute data. Companion
files .shx (index) and .dbf (attributes) are required. Used as secondary file across all
workflow stages.
CSV
| Property | Value |
|---|---|
| Extension(s) | .csv |
| Media type | text/csv |
| OGR driver | CSV |
| Categories | Bathymetry, Backscatter |
| Stages | Raw data, QC, Interpreted data |
Comma-separated or delimited text file containing point data with coordinate columns (X/Y, lon/lat). OGR reads it as a point vector dataset when columns with coordinates are identified. Also listed under GDAL raster formats as it can be treated as a gridded raster via VRT.
File Geodatabase
| Property | Value |
|---|---|
| Extension(s) | .gdb (directory) |
| Media type | application/x-filegdb |
| OGR driver | OpenFileGDB |
| Categories | Bathymetry, Backscatter |
| Stages | Raw data, QC, Processed data, Interpreted data |
ESRI File Geodatabase, a directory containing multiple database files that store vector and raster datasets. Used as secondary file for raw/QC/processed data and as primary/secondary for interpreted products. Read-only access via the OpenFileGDB driver (no ESRI license required).
GeoJSON
| Property | Value |
|---|---|
| Extension(s) | .geojson, .json |
| Media type | application/geo+json |
| OGR driver | GeoJSON |
| Categories | Bathymetry, Backscatter |
| Stages | Processed data, Interpreted data |
Open standard format for encoding geographic data structures using JSON. Supports point, line, polygon, and multi-geometry types with associated properties. Always uses WGS 84 (EPSG:4326) as its coordinate reference system.
GeoPackage
| Property | Value |
|---|---|
| Extension(s) | .gpkg |
| Media type | application/geopackage+sqlite3 |
| OGR driver | GPKG |
| Categories | Bathymetry, Backscatter |
| Stages | Processed data, Interpreted data |
OGC open standard based on SQLite. Can store both vector and raster data in a single file. Supports multiple layers, spatial indexes, and arbitrary CRS. Modern alternative to Shapefile and File Geodatabase.
KML/KMZ
| Property | Value |
|---|---|
| Extension(s) | .kml, .kmz |
| Media type | application/vnd.google-earth.kml+xml |
| OGR driver | KML / LIBKML |
| Categories | Bathymetry, Backscatter |
| Stages | Interpreted data |
Google Earth markup format for geographic visualization. KML is XML-based; KMZ is a compressed (ZIP) archive containing a KML file and optional resources. Supports point, line, and polygon geometries. Always uses WGS 84 (EPSG:4326). OGR can read both KML and KMZ via the KML or LIBKML drivers.
Specialized formats (future)
These formats require dedicated extractors that are not yet implemented. They will be added in future iterations.
CSAR
| Property | Value |
|---|---|
| Extension(s) | .csar |
| Media type | application/octet-stream |
| Categories | Bathymetry, Backscatter |
| Stages | Processed data |
CARIS Spatial Archive file. Proprietary raster format for gridded bathymetry and elevation data. Used as primary file for processed data. Requires CARIS software/license for full access.
KMALL
| Property | Value |
|---|---|
| Extension(s) | .kmall |
| Categories | Bathymetry, Backscatter |
| Stages | Raw data |
Binary datagram-based format from Kongsberg multibeam echosounder systems. Contains position
data (latitude, longitude, timestamp), depth measurements, and quality indicators. The #SPO
datagram provides position data; #MRZ provides depth/reflectivity.
A Python reader module is available on GitHub for parsing datagrams.
SEG-Y
| Property | Value |
|---|---|
| Extension(s) | .segy, .sgy |
| Categories | Seismic |
| Stages | Raw data |
Standard seismic data exchange format (SEG-Y Revision 2.0). Structure:
- Text File Header (3200 bytes)
- Binary Header (400 bytes)
- Extended Text Header (optional)
- Traces (individual seismic traces with 240-byte trace headers)
Python libraries: segyio (actively maintained, by Equinor) and segpy.
JSF
| Property | Value |
|---|---|
| Extension(s) | .jsf |
| Categories | Seismic |
| Stages | Raw data |
EdgeTech sonar data format. Binary format with defined header structure (JSFDefs.h). Used for sub-bottom profiler (SBP) raw data.
P1/11
| Property | Value |
|---|---|
| Extension(s) | .p111 |
| Categories | Geophysical position data |
| Stages | Raw data |
Geophysical position data exchange format. Can be imported using the SeisPos_Import QGIS plugin.
Other formats
HIPS Project
| Property | Value |
|---|---|
| Extension(s) | Directory (proprietary structure) |
| Categories | Bathymetry |
| Stages | Processed data |
CARIS HIPS and SIPS project directory. Proprietary format containing processed multibeam data. No automated extraction is planned, treated as an opaque directory.
XLS
| Property | Value |
|---|---|
| Extension(s) | .xls, .xlsx |
| Media type | application/vnd.ms-excel |
| Categories | Bathymetry |
| Stages | Raw data, QC, Processed data |
Spreadsheet/tabular data. Used as secondary file for metadata or ancillary information. No GDAL/OGR support — would require a dedicated reader if extraction is needed.