Using Cesium for display of remote parquet.
parquet
spatial
recipe
This page renders points from an iSamples parquet file on Cesium using point primitives.
Tip
Using a local cached file for faster performance
DuckDB-WASM running in the browser cannot access local files via file:// URLs due to browser security restrictions. However, you can use a local cached file when running quarto preview:
Local Development (recommended)
The repository includes a cached parquet file. To use it:
1. Ensure the file exists in docs/assets/oc_isamples_pqg.parquet (691MB)
   - The file must be in Quarto’s output directory docs/assets/, not just the source assets/ directory
   - If needed, copy: cp assets/oc_isamples_pqg.parquet docs/assets/
2. When running quarto preview, use the full localhost URL: http://localhost:4979/assets/oc_isamples_pqg.parquet (replace 4979 with your actual preview port)
Alternative: Python HTTP server
# In the directory containing your parquet file:
cd /Users/raymondyee/Data/iSample
python3 -m http.server 8000
Then use: http://localhost:8000/oc_isamples_pqg.parquet
Benefits of local cached file:
- Much faster initial load (no network transfer)
- Works offline
- Matches the notebook’s local file access pattern
Limitation: Only works during local development, not on published GitHub Pages.
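One way to work within that limitation is to choose the Parquet URL from the page's hostname, so the same page uses the cached copy under quarto preview and a remote copy once published. The sketch below is only illustrative: `resolveParquetPath` and its `remoteUrl` argument are hypothetical names that do not appear in this page's code, and the local path simply mirrors the docs/assets/ layout described above.

```javascript
// Hypothetical helper: use the locally served file during preview,
// and fall back to a remote URL everywhere else.
// `location` is any object with `hostname` and `origin`, e.g. window.location.
function resolveParquetPath(location, remoteUrl) {
  const localHosts = new Set(["localhost", "127.0.0.1"]);
  if (localHosts.has(location.hostname)) {
    // `origin` already includes the preview port, e.g. http://localhost:4979
    return `${location.origin}/assets/oc_isamples_pqg.parquet`;
  }
  return remoteUrl;
}
```

In a cell, this could be called as `resolveParquetPath(window.location, remoteUrl)`, where `remoteUrl` is whatever published location the page normally reads from.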
Warning
Heads up: first interaction may be slow
The first click or query can take a few seconds while the in‑browser database engine initializes and the remote Parquet file is fetched and indexed. Subsequent interactions are much faster because both the browser and DuckDB cache metadata and column chunks, so later queries reuse what was already loaded.
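To see the difference between the first and later interactions, a small timing wrapper can be put around any async call. This is a minimal sketch, not part of the page's code: `timed` is a hypothetical helper, and it relies only on the standard `performance.now()` clock.

```javascript
// Hypothetical helper: run an async function and report how long it took.
async function timed(label, run) {
  const start = performance.now();
  const result = await run();
  const ms = performance.now() - start;
  console.log(`${label}: ${ms.toFixed(0)}ms`);
  return { result, ms };
}
```

Wrapping the same query twice, for example `timed("first", () => db.query(sql))` followed by `timed("second", () => db.query(sql))`, should make the cold and warm timings visible in the console.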
Code
db = {
const instance = await DuckDBClient.of();
await instance.query(`create view nodes as select * from read_parquet('${parquet_path}')`)
return instance;
}
async function loadData(query, params = [], waiting_id = null, key = "default") {
// latest-only guard per key
loadData._latest = loadData._latest || new Map();
const requestToken = Symbol();
loadData._latest.set(key, requestToken);
// Get loading indicator
const waiter = waiting_id ? document.getElementById(waiting_id) : null;
if (waiter) waiter.hidden = false;
try {
// Run the (slow) query
const _results = await db.query(query, params);
// Ignore stale responses
if (loadData._latest.get(key) !== requestToken) return null;
return _results;
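} catch (error) {
// Log the error as well: the finally block below hides the waiter again,
// so the message written here is not guaranteed to stay visible
console.error(error);
if (waiter && loadData._latest.get(key) === requestToken) {
waiter.innerHTML = `<pre>${error}</pre>`;
}
return null;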
} finally {
// Hide the waiter (if there is one) only if latest
if (waiter && loadData._latest.get(key) === requestToken) {
waiter.hidden = true;
}
}
}
locations = {
// Performance telemetry
performance.mark('locations-start');
// Get loading indicator element for progress updates
const loadingDiv = document.getElementById('loading_1');
if (loadingDiv) {
loadingDiv.hidden = false;
loadingDiv.innerHTML = 'Loading geocodes...';
}
// Fast query: just get all distinct geocodes (no classification!)
const query = `
SELECT DISTINCT
pid,
latitude,
longitude
FROM nodes
WHERE otype = 'GeospatialCoordLocation'
`;
performance.mark('query-start');
const data = await loadData(query, [], "loading_1", "locations");
performance.mark('query-end');
performance.measure('locations-query', 'query-start', 'query-end');
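// loadData resolves to null on error or when a newer request superseded this one
if (!data) return [];
const queryTime = performance.getEntriesByName('locations-query')[0].duration;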
console.log(`Query executed in ${queryTime.toFixed(0)}ms - retrieved ${data.length} locations`);
// Clear the existing PointPrimitiveCollection
content.points.removeAll();
// Single color for all points (blue)
const defaultColor = Cesium.Color.fromCssColorString('#2E86AB');
const defaultSize = 4;
// Render points in chunks to keep UI responsive
const CHUNK_SIZE = 500;
const scalar = new Cesium.NearFarScalar(1.5e2, 2, 8.0e6, 0.2);
performance.mark('render-start');
for (let i = 0; i < data.length; i += CHUNK_SIZE) {
const chunk = data.slice(i, i + CHUNK_SIZE);
const endIdx = Math.min(i + CHUNK_SIZE, data.length);
// Update progress indicator
if (loadingDiv) {
const pct = Math.round((endIdx / data.length) * 100);
loadingDiv.innerHTML = `Rendering geocodes... ${endIdx.toLocaleString()}/${data.length.toLocaleString()} (${pct}%)`;
}
// Add points for this chunk
for (const row of chunk) {
content.points.add({
id: row.pid,
position: Cesium.Cartesian3.fromDegrees(
row.longitude, //longitude
row.latitude, //latitude
0 //elevation, m
),
pixelSize: defaultSize,
color: defaultColor,
scaleByDistance: scalar,
});
}
// Yield to browser between chunks to keep UI responsive
if (i + CHUNK_SIZE < data.length) {
await new Promise(resolve => setTimeout(resolve, 0));
}
}
performance.mark('render-end');
performance.measure('locations-render', 'render-start', 'render-end');
const renderTime = performance.getEntriesByName('locations-render')[0].duration;
// Hide loading indicator
if (loadingDiv) {
loadingDiv.hidden = true;
}
performance.mark('locations-end');
performance.measure('locations-total', 'locations-start', 'locations-end');
const totalTime = performance.getEntriesByName('locations-total')[0].duration;
console.log(`Rendering completed in ${renderTime.toFixed(0)}ms`);
console.log(`Total time (query + render): ${totalTime.toFixed(0)}ms`);
content.enableTracking();
return data;
}
function createShowPrimitive(viewer) {
return function(movement) {
// Get the point at the mouse end position
const selectPoint = viewer.viewer.scene.pick(movement.endPosition);
// Clear the current selection, if there is one and it is different to the selectPoint
if (viewer.currentSelection !== null) {
//console.log(`selected.p ${viewer.currentSelection}`)
if (Cesium.defined(selectPoint) && selectPoint !== viewer.currentSelection) {
console.log(`selected.p 2 ${viewer.currentSelection}`)
viewer.currentSelection.primitive.pixelSize = 4;
viewer.currentSelection.primitive.outlineColor = Cesium.Color.TRANSPARENT;
viewer.currentSelection.primitive.outlineWidth = 0;
viewer.currentSelection = null;
}
}
// If a point primitive was picked, show its label and highlight it
if (Cesium.defined(selectPoint) && selectPoint.hasOwnProperty("primitive")) {
//console.log(`showPrimitiveId ${selectPoint.id}`);
//const carto = Cesium.Cartographic.fromCartesian(selectPoint.primitive.position)
viewer.pointLabel.position = selectPoint.primitive.position;
viewer.pointLabel.label.show = true;
//viewer.pointLabel.label.text = `id:${selectPoint.id}, ${carto}`;
viewer.pointLabel.label.text = `${selectPoint.id}`;
selectPoint.primitive.pixelSize = 20;
selectPoint.primitive.outlineColor = Cesium.Color.YELLOW;
selectPoint.primitive.outlineWidth = 3;
viewer.currentSelection = selectPoint;
} else {
viewer.pointLabel.label.show = false;
}
}
}
class CView {
constructor(target) {
this.viewer = new Cesium.Viewer(
target, {
timeline: false,
animation: false,
baseLayerPicker: false,
fullscreenElement: target,
terrain: Cesium.Terrain.fromWorldTerrain()
});
this.currentSelection = null;
this.point_size = 1;
this.n_points = 0;
// https://cesium.com/learn/cesiumjs/ref-doc/PointPrimitiveCollection.html
this.points = new Cesium.PointPrimitiveCollection();
this.viewer.scene.primitives.add(this.points);
this.pointLabel = this.viewer.entities.add({
label: {
show: false,
showBackground: true,
font: "14px monospace",
horizontalOrigin: Cesium.HorizontalOrigin.LEFT,
verticalOrigin: Cesium.VerticalOrigin.BOTTOM,
pixelOffset: new Cesium.Cartesian2(15, 0),
// this attribute will prevent this entity clipped by the terrain
disableDepthTestDistance: Number.POSITIVE_INFINITY,
text:"",
},
});
this.pickHandler = new Cesium.ScreenSpaceEventHandler(this.viewer.scene.canvas);
// Can also do this rather than wait for the points to be generated
//this.pickHandler.setInputAction(createShowPrimitive(this), Cesium.ScreenSpaceEventType.MOUSE_MOVE);
this.selectHandler = new Cesium.ScreenSpaceEventHandler(this.viewer.scene.canvas);
this.selectHandler.setInputAction((e) => {
const selectPoint = this.viewer.scene.pick(e.position);
if (Cesium.defined(selectPoint) && selectPoint.hasOwnProperty("primitive")) {
mutable clickedPointId = selectPoint.id;
}
},Cesium.ScreenSpaceEventType.LEFT_CLICK);
}
enableTracking() {
this.pickHandler.setInputAction(createShowPrimitive(this), Cesium.ScreenSpaceEventType.MOUSE_MOVE);
}
}
content = new CView("cesiumContainer");
async function getGeoRecord(pid) {
if (pid === null || pid ==="" || pid == "unset") {
return "unset";
}
const q = `SELECT row_id, pid, otype, latitude, longitude FROM nodes WHERE otype='GeospatialCoordLocation' AND pid=?`;
const rows = await loadData(q, [pid], "loading_geo", "geo");
return rows && rows.length ? rows[0] : null;
}
async function get_samples_1(pid) {
if (pid === null || pid ==="" || pid == "unset") {
return [];
}
// Path 1: Direct event location - enhanced to match Eric's query structure
const q = `
SELECT
geo.latitude,
geo.longitude,
site.label AS sample_site_label,
site.pid AS sample_site_pid,
samp.pid AS sample_pid,
samp.alternate_identifiers AS sample_alternate_identifiers,
samp.label AS sample_label,
samp.description AS sample_description,
samp.thumbnail_url AS sample_thumbnail_url,
samp.thumbnail_url IS NOT NULL as has_thumbnail,
'direct_event_location' as location_path
FROM nodes AS geo
JOIN nodes AS rel_se ON (
rel_se.p = 'sample_location'
AND
list_contains(rel_se.o, geo.row_id)
)
JOIN nodes AS se ON (
rel_se.s = se.row_id
AND
se.otype = 'SamplingEvent'
)
JOIN nodes AS rel_site ON (
se.row_id = rel_site.s
AND
rel_site.p = 'sampling_site'
)
JOIN nodes AS site ON (
rel_site.o[1] = site.row_id
AND
site.otype = 'SamplingSite'
)
JOIN nodes AS rel_samp ON (
rel_samp.p = 'produced_by'
AND
list_contains(rel_samp.o, se.row_id)
)
JOIN nodes AS samp ON (
rel_samp.s = samp.row_id
AND
samp.otype = 'MaterialSampleRecord'
)
WHERE geo.pid = ?
AND geo.otype = 'GeospatialCoordLocation'
ORDER BY has_thumbnail DESC
`;
performance.mark('samples1-start');
const result = await loadData(q, [pid], "loading_s1", "samples_1");
performance.mark('samples1-end');
performance.measure('samples1-query', 'samples1-start', 'samples1-end');
const queryTime = performance.getEntriesByName('samples1-query')[0].duration;
console.log(`Path 1 query executed in ${queryTime.toFixed(0)}ms - retrieved ${result?.length || 0} samples`);
return result ?? [];
}
async function get_samples_2(pid) {
if (pid === null || pid ==="" || pid == "unset") {
return [];
}
// Path 2: Via site location - enhanced to match Eric's query structure
const q = `
SELECT
geo.latitude,
geo.longitude,
site.label AS sample_site_label,
site.pid AS sample_site_pid,
samp.pid AS sample_pid,
samp.alternate_identifiers AS sample_alternate_identifiers,
samp.label AS sample_label,
samp.description AS sample_description,
samp.thumbnail_url AS sample_thumbnail_url,
samp.thumbnail_url IS NOT NULL as has_thumbnail,
'via_site_location' as location_path
FROM nodes AS geo
JOIN nodes AS rel_site_geo ON (
rel_site_geo.p = 'site_location'
AND
list_contains(rel_site_geo.o, geo.row_id)
)
JOIN nodes AS site ON (
rel_site_geo.s = site.row_id
AND
site.otype = 'SamplingSite'
)
JOIN nodes AS rel_se_site ON (
rel_se_site.p = 'sampling_site'
AND
list_contains(rel_se_site.o, site.row_id)
)
JOIN nodes AS se ON (
rel_se_site.s = se.row_id
AND
se.otype = 'SamplingEvent'
)
JOIN nodes AS rel_samp ON (
rel_samp.p = 'produced_by'
AND
list_contains(rel_samp.o, se.row_id)
)
JOIN nodes AS samp ON (
rel_samp.s = samp.row_id
AND
samp.otype = 'MaterialSampleRecord'
)
WHERE geo.pid = ?
AND geo.otype = 'GeospatialCoordLocation'
ORDER BY has_thumbnail DESC
`;
performance.mark('samples2-start');
const result = await loadData(q, [pid], "loading_s2", "samples_2");
performance.mark('samples2-end');
performance.measure('samples2-query', 'samples2-start', 'samples2-end');
const queryTime = performance.getEntriesByName('samples2-query')[0].duration;
console.log(`Path 2 query executed in ${queryTime.toFixed(0)}ms - retrieved ${result?.length || 0} samples`);
return result ?? [];
}
async function get_samples_at_geo_cord_location_via_sample_event(pid) {
if (pid === null || pid ==="" || pid == "unset") {
return [];
}
// Eric Kansa's authoritative query from open-context-py
// Source: https://github.com/ekansa/open-context-py/blob/staging/opencontext_py/apps/all_items/isamples/isamples_explore.py
const q = `
SELECT
geo.latitude,
geo.longitude,
site.label AS sample_site_label,
site.pid AS sample_site_pid,
samp.pid AS sample_pid,
samp.alternate_identifiers AS sample_alternate_identifiers,
samp.label AS sample_label,
samp.description AS sample_description,
samp.thumbnail_url AS sample_thumbnail_url,
samp.thumbnail_url IS NOT NULL as has_thumbnail
FROM nodes AS geo
JOIN nodes AS rel_se ON (
rel_se.p = 'sample_location'
AND
list_contains(rel_se.o, geo.row_id)
)
JOIN nodes AS se ON (
rel_se.s = se.row_id
AND
se.otype = 'SamplingEvent'
)
JOIN nodes AS rel_site ON (
se.row_id = rel_site.s
AND
rel_site.p = 'sampling_site'
)
JOIN nodes AS site ON (
rel_site.o[1] = site.row_id
AND
site.otype = 'SamplingSite'
)
JOIN nodes AS rel_samp ON (
rel_samp.p = 'produced_by'
AND
list_contains(rel_samp.o, se.row_id)
)
JOIN nodes AS samp ON (
rel_samp.s = samp.row_id
AND
samp.otype = 'MaterialSampleRecord'
)
WHERE geo.pid = ?
AND geo.otype = 'GeospatialCoordLocation'
ORDER BY has_thumbnail DESC
`;
performance.mark('eric-query-start');
const result = await loadData(q, [pid], "loading_combined", "samples_combined");
performance.mark('eric-query-end');
performance.measure('eric-query', 'eric-query-start', 'eric-query-end');
const queryTime = performance.getEntriesByName('eric-query')[0].duration;
console.log(`Eric's query executed in ${queryTime.toFixed(0)}ms - retrieved ${result?.length || 0} samples`);
return result ?? [];
}
async function get_sample_data_via_sample_pid(sample_pid) {
if (sample_pid === null || sample_pid === "" || sample_pid === "unset") {
return null;
}
// Eric Kansa's query: Get full sample data including geo and site info
const q = `
SELECT
samp.row_id,
samp.pid AS sample_pid,
samp.alternate_identifiers AS sample_alternate_identifiers,
samp.label AS sample_label,
samp.description AS sample_description,
samp.thumbnail_url AS sample_thumbnail_url,
samp.thumbnail_url IS NOT NULL as has_thumbnail,
geo.latitude,
geo.longitude,
site.label AS sample_site_label,
site.pid AS sample_site_pid
FROM nodes AS samp
JOIN nodes AS samp_rel_se ON (
samp_rel_se.s = samp.row_id
AND
samp_rel_se.p = 'produced_by'
)
JOIN nodes AS se ON (
samp_rel_se.o[1] = se.row_id
AND
se.otype = 'SamplingEvent'
)
JOIN nodes AS geo_rel_se ON (
geo_rel_se.s = se.row_id
AND
geo_rel_se.p = 'sample_location'
)
JOIN nodes AS geo ON (
geo_rel_se.o[1] = geo.row_id
AND
geo.otype = 'GeospatialCoordLocation'
)
JOIN nodes AS site_rel_se ON (
site_rel_se.s = se.row_id
AND
site_rel_se.p = 'sampling_site'
)
JOIN nodes AS site ON (
site_rel_se.o[1] = site.row_id
AND
site.otype = 'SamplingSite'
)
WHERE samp.pid = ?
AND samp.otype = 'MaterialSampleRecord'
`;
const result = await loadData(q, [sample_pid], "loading_sample_data", "sample_data");
return result && result.length ? result[0] : null;
}
async function get_sample_data_agents_sample_pid(sample_pid) {
if (sample_pid === null || sample_pid === "" || sample_pid === "unset") {
return [];
}
// Eric Kansa's query: Get agent info (who collected/registered)
const q = `
SELECT
samp.row_id,
samp.pid AS sample_pid,
samp.alternate_identifiers AS sample_alternate_identifiers,
samp.label AS sample_label,
samp.description AS sample_description,
samp.thumbnail_url AS sample_thumbnail_url,
samp.thumbnail_url IS NOT NULL as has_thumbnail,
agent_rel_se.p AS predicate,
agent.pid AS agent_pid,
agent.name AS agent_name,
agent.alternate_identifiers AS agent_alternate_identifiers
FROM nodes AS samp
JOIN nodes AS samp_rel_se ON (
samp_rel_se.s = samp.row_id
AND
samp_rel_se.p = 'produced_by'
)
JOIN nodes AS se ON (
samp_rel_se.o[1] = se.row_id
AND
se.otype = 'SamplingEvent'
)
JOIN nodes AS agent_rel_se ON (
agent_rel_se.s = se.row_id
AND
list_contains(['responsibility', 'registrant'], agent_rel_se.p)
)
JOIN nodes AS agent ON (
list_contains(agent_rel_se.o, agent.row_id)
AND
agent.otype = 'Agent'
)
WHERE samp.pid = ?
AND samp.otype = 'MaterialSampleRecord'
`;
const result = await loadData(q, [sample_pid], "loading_agents", "agents");
return result ?? [];
}
async function get_sample_types_and_keywords_via_sample_pid(sample_pid) {
if (sample_pid === null || sample_pid === "" || sample_pid === "unset") {
return [];
}
// Eric Kansa's query: Get classification keywords and types
const q = `
SELECT
samp.row_id,
samp.pid AS sample_pid,
samp.alternate_identifiers AS sample_alternate_identifiers,
samp.label AS sample_label,
kw_rel.p AS predicate,
kw.pid AS keyword_pid,
kw.label AS keyword
FROM nodes AS samp
JOIN nodes AS kw_rel ON (
kw_rel.s = samp.row_id
AND
list_contains(['keywords', 'has_sample_object_type', 'has_material_category'], kw_rel.p)
)
JOIN nodes AS kw ON (
list_contains(kw_rel.o, kw.row_id)
AND
kw.otype = 'IdentifiedConcept'
)
WHERE samp.pid = ?
AND samp.otype = 'MaterialSampleRecord'
`;
const result = await loadData(q, [sample_pid], "loading_keywords", "keywords");
return result ?? [];
}
async function locationUsedBy(rowid){
if (rowid === undefined || rowid === null) {
return [];
}
const q = `select pid, otype from nodes where row_id in (select nodes.s from nodes where list_contains(nodes.o, ?));`;
return db.query(q, [rowid]);
}
mutable clickedPointId = "unset";
// Loading flags to control UI clearing while fetching
mutable geoLoading = false;
mutable s1Loading = false;
mutable s2Loading = false;
mutable combinedLoading = false;
// Precompute selection-driven data with loading flags
selectedGeoRecord = {
mutable geoLoading = true;
try {
return await getGeoRecord(clickedPointId);
} finally {
mutable geoLoading = false;
}
}
selectedSamples1 = {
mutable s1Loading = true;
try {
return await get_samples_1(clickedPointId);
} finally {
mutable s1Loading = false;
}
}
selectedSamples2 = {
mutable s2Loading = true;
try {
return await get_samples_2(clickedPointId);
} finally {
mutable s2Loading = false;
}
}
selectedSamplesCombined = {
mutable combinedLoading = true;
try {
return await get_samples_at_geo_cord_location_via_sample_event(clickedPointId);
} finally {
mutable combinedLoading = false;
}
}
md`Retrieved ${pointdata.length} locations from ${parquet_path}.`;
Code
viewof pointdata = {
const data_table = Inputs.table(locations, {
header: {
pid: "PID",
latitude: "Latitude",
longitude: "Longitude",
location_type: "Location Type"
},
});
return data_table;
}
The click point ID is “”.
1 getGeoRecord (selected)
Code
pid = clickedPointId
testrecord = selectedGeoRecord;
2 Samples at Location via Sampling Event (Eric Kansa’s Query)
This query implements Eric Kansa’s authoritative get_samples_at_geo_cord_location_via_sample_event function from open-context-py.
Query Strategy (Path 1 Only):
- Starts at a GeospatialCoordLocation (clicked point)
- Walks backward via sample_location edges to find SamplingEvents that reference this location
- From those events, finds MaterialSampleRecords produced by them
- Requires site context (INNER JOIN on sampling_site → SamplingSite)
Returns:
- Geographic coordinates: latitude, longitude
- Sample metadata: sample_pid, sample_label, sample_description, sample_alternate_identifiers
- Site context: sample_site_label, sample_site_pid
- Media: sample_thumbnail_url, has_thumbnail
Ordering: Prioritizes samples with images (ORDER BY has_thumbnail DESC)
Important: This query only returns samples whose sampling events directly reference this geolocation via sample_location (Path 1). Samples that reach this location only through their site’s site_location (Path 2) are not included. This means site marker locations may return 0 results if no events were recorded at that exact coordinate.
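Since the page also defines get_samples_1 (Path 1) and get_samples_2 (Path 2), one possible way to cover site marker locations is to merge both result sets and keep one row per sample. The sketch below is an assumption about how that could be done, not something the page currently does: `mergeSamplePaths` is a hypothetical name, and it assumes each row carries the `sample_pid`, `has_thumbnail`, and `location_path` columns selected by those two queries.

```javascript
// Hypothetical helper: combine Path 1 and Path 2 rows, one entry per sample_pid.
// Path 1 rows win when a sample is reachable through both paths.
function mergeSamplePaths(path1Rows, path2Rows) {
  const bySample = new Map();
  for (const row of [...path1Rows, ...path2Rows]) {
    if (!bySample.has(row.sample_pid)) {
      bySample.set(row.sample_pid, row);
    }
  }
  // Keep the ordering convention of the queries: samples with thumbnails first.
  return [...bySample.values()].sort(
    (a, b) => Number(Boolean(b.has_thumbnail)) - Number(Boolean(a.has_thumbnail))
  );
}
```

The `location_path` column on each merged row still records which path produced it, so the UI could label direct and via-site matches differently.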
3 Understanding Paths in the iSamples Property Graph
3.1 Why “Path 1” and “Path 2”?
These terms describe the two main ways to get from a MaterialSampleRecord to geographic coordinates. They’re not the only relationship paths in the graph, but they’re the most commonly used for spatial queries.
Path 1 (Direct Event Location)
MaterialSampleRecord
→ produced_by →
SamplingEvent
→ sample_location →
GeospatialCoordLocation
Path 2 (Via Sampling Site)
MaterialSampleRecord
→ produced_by →
SamplingEvent
→ sampling_site →
SamplingSite
→ site_location →
GeospatialCoordLocation
Key Differences:
- Path 1 is direct: Event → Location (3 nodes, 2 edges in total)
- Path 2 goes through Site: Event → Site → Location (4 nodes, 3 edges in total)
- Path 1 = “Where was this specific sample collected?”
- Path 2 = “What named site is this sample from, and where is that site?”
Important: The queries below use INNER JOIN for both paths, meaning samples must have connections through both paths to appear in results. Samples with only one path will be excluded.
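To make the two paths concrete, the sketch below walks a tiny in-memory stand-in for the nodes view. It assumes, as the SQL on this page does, that an edge is a row with a subject `s`, a predicate `p`, and a list of object row ids `o`. The sample rows and the `follow` and `walk` helpers are invented for illustration only.

```javascript
// Toy stand-in for the `nodes` view: entity rows have an otype,
// edge rows have s (subject row_id), p (predicate), o (list of object row_ids).
const nodes = [
  { row_id: 1, otype: "MaterialSampleRecord" },
  { row_id: 2, otype: "SamplingEvent" },
  { row_id: 3, otype: "SamplingSite" },
  { row_id: 4, otype: "GeospatialCoordLocation" }, // event location
  { row_id: 5, otype: "GeospatialCoordLocation" }, // site location
  { row_id: 10, s: 1, p: "produced_by", o: [2] },
  { row_id: 11, s: 2, p: "sample_location", o: [4] },
  { row_id: 12, s: 2, p: "sampling_site", o: [3] },
  { row_id: 13, s: 3, p: "site_location", o: [5] },
];

// Follow one predicate from a set of row ids to the row ids it points at.
function follow(rowIds, predicate) {
  return nodes
    .filter((n) => n.p === predicate && rowIds.includes(n.s))
    .flatMap((n) => n.o);
}

// Walk a chain of predicates starting from a single row id.
function walk(startRowId, predicates) {
  return predicates.reduce((ids, p) => follow(ids, p), [startRowId]);
}

const path1 = walk(1, ["produced_by", "sample_location"]);
const path2 = walk(1, ["produced_by", "sampling_site", "site_location"]);
console.log(path1, path2); // path1 is [4], path2 is [5]
```

An empty array at any step of `walk` corresponds to an INNER JOIN finding no match, which is why a sample missing one of the edges drops out of the query results entirely.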
3.2 Full Relationship Map (Beyond Path 1 and Path 2)
The iSamples property graph contains many more relationships than just the geographic paths:
                                           Agent
                                             ↑
                                             | {responsibility, registrant}
                                             |
MaterialSampleRecord ────produced_by──→ SamplingEvent ────sample_location──→ GeospatialCoordLocation
    |                                        |                                                  ↑
    |                                        |                                                  |
    | {keywords,                             └────sampling_site──→ SamplingSite ──site_location─┘
    |  has_sample_object_type,
    |  has_material_category}
    |
    └──→ IdentifiedConcept
Path Categories:
- PATH 1: MaterialSampleRecord → SamplingEvent → GeospatialCoordLocation (direct location)
- PATH 2: MaterialSampleRecord → SamplingEvent → SamplingSite → GeospatialCoordLocation (via site)
- AGENT PATH: MaterialSampleRecord → SamplingEvent → Agent (who collected/registered)
- CONCEPT PATH: MaterialSampleRecord → IdentifiedConcept (types, keywords - direct, no event!)
Key Insight: SamplingEvent is the central hub for most relationships, except concepts which attach directly to MaterialSampleRecord.
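The relationship map can also be written down as data, which makes it easy to check how many edges separate two node types. The sketch below encodes only the edges drawn in the diagram above. `SCHEMA_EDGES` and `shortestPath` are hypothetical names, and the breadth-first search is a generic technique rather than anything this page runs.

```javascript
// Edges copied from the relationship map: [from type, predicate, to type].
const SCHEMA_EDGES = [
  ["MaterialSampleRecord", "produced_by", "SamplingEvent"],
  ["SamplingEvent", "sample_location", "GeospatialCoordLocation"],
  ["SamplingEvent", "sampling_site", "SamplingSite"],
  ["SamplingSite", "site_location", "GeospatialCoordLocation"],
  ["SamplingEvent", "responsibility", "Agent"],
  ["SamplingEvent", "registrant", "Agent"],
  ["MaterialSampleRecord", "keywords", "IdentifiedConcept"],
  ["MaterialSampleRecord", "has_sample_object_type", "IdentifiedConcept"],
  ["MaterialSampleRecord", "has_material_category", "IdentifiedConcept"],
];

// Breadth-first search over node types. Returns the predicates along the
// shortest route from `from` to `to`, or null if no route exists.
function shortestPath(from, to) {
  const queue = [[from, []]];
  const seen = new Set([from]);
  while (queue.length > 0) {
    const [type, predicates] = queue.shift();
    if (type === to) return predicates;
    for (const [src, predicate, dst] of SCHEMA_EDGES) {
      if (src === type && !seen.has(dst)) {
        seen.add(dst);
        queue.push([dst, [...predicates, predicate]]);
      }
    }
  }
  return null;
}
```

Calling `shortestPath("MaterialSampleRecord", "GeospatialCoordLocation")` yields the Path 1 predicates, while the route to IdentifiedConcept has a single edge, reflecting that concepts do not go through SamplingEvent.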
3.3 Query Pattern Analysis (from Eric Kansa’s open-context-py)
The following analysis is based on Eric’s query functions that demonstrate different path traversal patterns:
3.3.1 1. get_sample_data_via_sample_pid - Uses BOTH Path 1 AND Path 2
MaterialSampleRecord (WHERE pid = ?)
  → produced_by → SamplingEvent
    ├─→ sample_location → GeospatialCoordLocation [Path 1]
    └─→ sampling_site → SamplingSite [Path 2]

Returns: sample metadata + lat/lon + site label/pid
Required: BOTH paths must exist (INNER JOIN)
3.3.2 2. get_sample_data_agents_sample_pid - Uses AGENT PATH
MaterialSampleRecord (WHERE pid = ?)
  → produced_by → SamplingEvent
    → {responsibility, registrant} → Agent

Returns: sample metadata + agent info (who collected/registered)
Independent of: Path 1 and Path 2 (no geographic data)
3.3.3 3. get_sample_types_and_keywords_via_sample_pid - Uses CONCEPT PATH
MaterialSampleRecord (WHERE pid = ?)
  → {keywords, has_sample_object_type, has_material_category} → IdentifiedConcept

Returns: sample metadata + classification keywords/types
Independent of: Path 1, Path 2, and SamplingEvent!
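Both the agent query and the concept query return one row per matching edge, with the edge label in a `predicate` column. For display it is often handy to bucket those rows by predicate. The helper below is a minimal sketch under that assumption: `groupByPredicate` is a hypothetical name, and the value column (`keyword` or `agent_name`) is taken from the SELECT lists of the two queries on this page.

```javascript
// Hypothetical helper: bucket result rows by their `predicate` column,
// collecting the value found under `valueKey` for each row.
function groupByPredicate(rows, valueKey) {
  const groups = {};
  for (const row of rows) {
    if (!(row.predicate in groups)) {
      groups[row.predicate] = [];
    }
    groups[row.predicate].push(row[valueKey]);
  }
  return groups;
}
```

For example, `groupByPredicate(rows, "keyword")` on the concept query's rows would separate keywords from has_material_category and has_sample_object_type values, and `groupByPredicate(rows, "agent_name")` would do the same for responsibility and registrant agents.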
3.3.4 4. get_samples_at_geo_cord_location_via_sample_event - REVERSE Path 1 + Path 2
GeospatialCoordLocation (WHERE pid = ?) ← START HERE (reverse!)
  ← sample_location ← SamplingEvent [Path 1 REVERSED]
    ├─→ sampling_site → SamplingSite [Path 2 enrichment]
    └─← produced_by ← MaterialSampleRecord [complete chain]

Returns: all samples at a given location + site info
Direction: geo → samples (opposite of other queries)
Summary Table:
| Function | Path 1 | Path 2 | Direction | Notes |
|---|---|---|---|---|
| get_sample_data_via_sample_pid | ✅ Required | ✅ Required | Forward | INNER JOIN - no row if either missing |
| get_sample_data_agents_sample_pid | ❌ N/A | ❌ N/A | N/A | Uses agent path instead |
| get_sample_types_and_keywords_via_sample_pid | ❌ N/A | ❌ N/A | N/A | Direct edges to concepts |
| get_samples_at_geo_cord_location_via_sample_event | ✅ Required | ✅ Required | Reverse | Walks from geo to samples |
6 Geographic Location Classification