Big data: expediting and validating analyses

The data lake

“Data lake” is a relatively new term that refers to a heterogeneous set of data, which may include structured, unstructured and semi-structured data, stored in its native format. Putting data in a data lake is quick and requires no up-front processing. Another advantage is that all the enterprise data in the data lake is in one place, unlike the more typical situation in which it is scattered across different departments in an enterprise.

“A data lake is as much an approach as a technology,” says Kamran Khan, CEO of Search Technologies. “Putting data in a single store makes it easier to enrich or analyze. In addition, re-indexing the data is much faster.” A frequent problem, especially in large organizations, is that people do not want to tag their documents. “Using big data and statistical analyses, the information can be made useful even when this structure is not applied,” Khan adds.

Because no schema is applied to the data when it is put in the repository, the analyses that can be done later are less constrained. At the same time, the lack of structure leads to a lack of governance; for example, retention schedules that depend on certain metadata will not be applied. The data may be more accessible, but it is not managed, so data quality issues may arise.
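The trade-off between flexible analysis and weak governance can be illustrated with a minimal Python sketch. It is not drawn from any particular data lake product: the sample records, the field names (`type`, `amount`, `created`) and the helper functions are all hypothetical, chosen only to show a schema being applied when data is read rather than when it is stored, and a metadata-based retention rule missing items that lack the metadata.

```python
import json
from datetime import date

# Hypothetical raw "lake": each item is stored as-is, with no shared schema.
RAW_LAKE = [
    '{"type": "invoice", "amount": "120.50", "created": "2015-03-01"}',
    '{"type": "email", "body": "Quarterly numbers attached"}',
    'free-text memo with no structure at all',
]

def read_with_schema(raw_items, required_fields):
    """Apply a schema at read time: keep only items that parse and carry the fields."""
    matched = []
    for raw in raw_items:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unstructured item; this particular reader skips it
        if isinstance(record, dict) and all(f in record for f in required_fields):
            matched.append(record)
    return matched

def eligible_for_retention_review(raw_items, cutoff):
    """A retention rule keyed on 'created' never sees items that lack that field."""
    dated = read_with_schema(raw_items, ["created"])
    return [r for r in dated if date.fromisoformat(r["created"]) < cutoff]

# Different analyses can impose different schemas on the same raw data.
invoices = read_with_schema(RAW_LAKE, ["type", "amount"])

# Only items with a 'created' date can be evaluated against the retention rule.
expired = eligible_for_retention_review(RAW_LAKE, date(2016, 1, 1))

# Items without the metadata fall outside the retention rule entirely.
unmanaged = len(RAW_LAKE) - len(read_with_schema(RAW_LAKE, ["created"]))
```

In this toy example, two of the three items carry no `created` field, so the retention rule never considers them, even though they remain in the store and available for other analyses.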

“In the past, a KM solution would focus on finding information,” Khan says, “but now users want to analyze it. The future of KM will be to combine search and big data because what you can do with the two techs together is fantastic.” For many companies, data lakes will prove to be a useful component of a knowledge management strategy.



