Rounding up pertinent web content with Northern Light at KMWorld 2022
Web content is unruly, misbehaved, messy, scattered, and badly formatted (if formatted at all). It’s also informative, insightful, timely, and crucial for business analysis and technical research.
Web content that is “free” and “openly accessible” to the public is a fast-growing content category but presents copyright infringement pitfalls. Enterprises have a significant task on their hands when it comes to populating knowledge management portals with web content.
David Seuss, CEO of Northern Light, and Carl Prizzi, EVP of revenue at Hawksearch, presented strategies for addressing the various issues associated with incorporating web content into enterprise knowledge management systems during their session, “Beating Web Content Into Submission.”
Prizzi focused on how combining intelligent enterprise search with AI technologies such as natural language processing, semantic understanding, and machine learning can provide a dramatically improved search experience and significantly more insights for companies.
“Web content is as essential to business analytics as drummers are to rock ‘n roll,” Seuss said.
Northern Light aggregates and curates many collections of web content for its Fortune 500 market and competitive intelligence clients, Seuss explained.
He presented several examples of how to find relevant content and curate metadata. It is extremely difficult to extract metadata by automated means from untagged, unstructured HTML on a page, he explained.
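To make the difficulty concrete, here is a minimal Python sketch of automated metadata extraction. It is an illustration only, not Northern Light's method; the class name `MetaExtractor`, the function `extract_metadata`, the chosen meta keys, and both sample pages are hypothetical. When a page carries standard meta tags the fields come back populated, but when the same facts sit in untagged body text the extractor finds nothing.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collects <title> text and <meta name/property=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Accept both name="..." and Open Graph style property="..."
            key = attrs.get("name") or attrs.get("property")
            if key and attrs.get("content"):
                self.meta[key.lower()] = attrs["content"].strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.title = (self.title or "") + data.strip()


def extract_metadata(html):
    """Return bibliographic fields, or None where the page has no tag for them."""
    parser = MetaExtractor()
    parser.feed(html)
    return {
        "title": parser.meta.get("og:title") or parser.title,
        "author": parser.meta.get("author"),
        "published": parser.meta.get("article:published_time"),
    }


# Hypothetical sample pages: one tagged, one with the same facts left untagged.
tagged = """<html><head><title>Q3 Market Report</title>
<meta name="author" content="Jane Doe">
<meta property="article:published_time" content="2022-11-08">
</head><body><p>Report body</p></body></html>"""

untagged = "<html><body><div>Q3 Market Report by Jane Doe, Nov 8</div></body></html>"

print(extract_metadata(tagged))
print(extract_metadata(untagged))
```

The second call returns `None` for every field even though a human reader can see the title, author, and date in the text, which is the gap that curation has to close.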
If you don’t filter out the noise in web content, it’ll be difficult to find the useful information you are searching for, Seuss said. The value of filtering out the noise by curating the collection is immediately evident.
Content harvested from the web can be indexed for search. With it, you can provide search results that contain bibliographic metadata, display aggregated research and analysis, link to the content on the web in its native location, and make thumbnails of included images to decorate the search results.
You can’t serve a copy of the full text from your servers to your users, serve a copy of images from your servers in their original size and resolution, or transfer copies of web content to another company, among other restrictions, Seuss noted.
Machine learning influences relevancy and recommendation strategies, which enables organizations to provide more proactive, customer-centric solutions with less friction, Prizzi explained.
“There are new customer expectations, and the way people find information has evolved,” Prizzi said. “Highly curated experiences are what they’re craving.”
The customer-centric experience homes in on the customer, providing new reasons to engage with relevant content through experiences that are frictionless, omnichannel, data-driven, and highly personalized.
The Amazon algorithm is one such example of a highly personalized search engine, he explained, where content can be tailored to the end user.
There are three core concepts to utilizing machine learning in this way:
- Wisdom of the crowd: product discovery
- Wisdom of the content: content discovery
- Wisdom of you: AI-driven personalization
KMWorld returned in person to the J.W. Marriott in Washington, D.C., on November 7-10, with pre-conference workshops held on November 7.
KMWorld 2022 is a part of a unique program of five co-located conferences, which also includes Enterprise Search & Discovery, Office 365 Symposium, Taxonomy Boot Camp, and Text Analytics Forum.