-->


How to Link Datasets


Produced by Steve Nathans-Kelly

With tons of data pouring into organizations, some of it is likely to be duplicated, adding stress to systems already bursting with information. At KMWorld 2022, Bess Schrader, senior consultant at Enterprise Knowledge, discussed how to connect these datasets.

There are several strategies for linking information across multiple semantic datasets, Schrader explained. The ideal method is reusing URIs for institutional entities, which enables federated queries. Another option is to match on “hooks,” which are important properties such as key identifiers.
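As a rough illustration of matching on a hook, the sketch below pairs entities from two datasets that share the same key identifier. The records, the example.org URIs, the `project_code` property, and the `match_on_hook` helper are all hypothetical and are not taken from Schrader's talk.

```python
# Hypothetical records from two datasets. The URIs and the "project_code"
# hook are invented for illustration only.
dataset_a = [
    {"uri": "http://example.org/a/project/1", "label": "Apollo", "project_code": "P-001"},
    {"uri": "http://example.org/a/project/2", "label": "Gemini", "project_code": "P-002"},
]
dataset_b = [
    {"uri": "http://example.org/b/proj/x9", "label": "Apollo Program", "project_code": "P-001"},
    {"uri": "http://example.org/b/proj/y3", "label": "Mercury", "project_code": "P-003"},
]


def match_on_hook(left, right, hook):
    """Pair entities whose hook property holds the same value."""
    # Index the right-hand dataset by hook value, skipping records without one.
    index = {rec[hook]: rec["uri"] for rec in right if rec.get(hook)}
    # Emit (left URI, right URI) for every left record whose hook is indexed.
    return [(rec["uri"], index[rec[hook]]) for rec in left if rec.get(hook) in index]


links = match_on_hook(dataset_a, dataset_b, "project_code")
for left_uri, right_uri in links:
    print(left_uri, "matches", right_uri)
```

Here the labels differ ("Apollo" versus "Apollo Program"), but the shared identifier is enough to link the two records.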

The next technique is matching on alternative labels or making educated guesses. The final option is a manual review.

“You need a human being to go in and look and say, ‘Yes, this is the same thing,’” Schrader said.

In the best-case scenario, owners or creators of semantic datasets reuse URIs between datasets at the time of creation, so there’s no guesswork involved in matching entities across datasets.
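To see why reused URIs remove the guesswork, consider the minimal sketch below, which models each dataset as a set of (subject, predicate, object) triples. The example.org URI, the property names, and the values are assumptions made up for illustration; a real system would typically use a triple store rather than Python sets.

```python
from collections import defaultdict

# Two hypothetical datasets that describe the same department with the same URI.
hr_triples = {
    ("http://example.org/org/finance", "label", "Finance"),
    ("http://example.org/org/finance", "headcount", "42"),
}
budget_triples = {
    ("http://example.org/org/finance", "label", "Finance"),
    ("http://example.org/org/finance", "budget", "1.2M"),
}

# Because the subject URI is shared, linking is just a set union:
# identical triples collapse and the rest attach to the same entity.
merged = hr_triples | budget_triples


def describe(triples, uri):
    """Collect every property value recorded for one URI."""
    props = defaultdict(set)
    for subject, predicate, obj in triples:
        if subject == uri:
            props[predicate].add(obj)
    return dict(props)


finance = describe(merged, "http://example.org/org/finance")
print(sorted(finance))
```

No matching logic is needed: facts from both sources land on one entity simply because both sources used the same identifier.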

Lacking reused URIs or institutional identifiers, organizations often have to devise their own matching logic to determine whether entities are the same.

Extraction, or label matching, against the data already in the graph helps with the transformation, allowing organizations to standardize and match references to a project in one system to the existing URI for that project, she said.
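One way label matching against an existing graph might look is sketched below: incoming project references are normalized and compared with the preferred and alternative labels already in the graph, and anything that does not resolve to exactly one URI is set aside for a human to review. The labels, the example.org URIs, and the helper functions are hypothetical, and the normalization rule is an assumption rather than a method described in the talk.

```python
import re

# Hypothetical labels already stored in the graph, keyed by project URI.
graph_labels = {
    "http://example.org/project/apollo": {
        "pref": "Apollo Program",
        "alt": ["Apollo", "Project Apollo"],
    },
    "http://example.org/project/gemini": {
        "pref": "Gemini Program",
        "alt": ["Gemini"],
    },
}


def normalize(text):
    """Lowercase and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def build_index(labels):
    """Map each normalized preferred or alternative label to its URIs."""
    index = {}
    for uri, names in labels.items():
        for name in [names["pref"], *names["alt"]]:
            index.setdefault(normalize(name), set()).add(uri)
    return index


def resolve(references, index):
    """Return confident matches plus references needing manual review."""
    matched, needs_review = {}, []
    for ref in references:
        candidates = index.get(normalize(ref), set())
        if len(candidates) == 1:
            matched[ref] = next(iter(candidates))
        else:
            # Zero or several candidates: leave the decision to a person.
            needs_review.append(ref)
    return matched, needs_review


index = build_index(graph_labels)
matched, needs_review = resolve(["project  APOLLO", "Gemini", "Skylab"], index)
print(matched)
print(needs_review)
```

Routing ambiguous or unmatched references to a review list keeps the automated step conservative, which fits the point that some matches still need a person to confirm them.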

