Smarter Web scraping from Connotate
Connotate has significantly upgraded its technology with the introduction of Connotate4, which, the company explains, simplifies and streamlines the “Webdata” extraction process and ensures full coverage of a website. It adds that the key component of Connotate4 is a custom browser that uses the WebKit engine powering such browsers as Safari and Chrome.
Connotate states its core technology is based on visual abstraction techniques that allow machines to view Web pages as humans do, enabling high-volume extraction of data from Web pages to be automated through a point-and-click interface. Further, it says, because “agents” are not relying on HTML code to find the data to extract, they can easily adjust to moderate site changes without breaking.
In addition to the custom browser at the center of Connotate4, the company highlights the following capabilities:
- Inline data transformation within the agent development process is a powerful new capability that will ease data integration and customization.
- Enhanced change detection with highlighting can be requested during the agent development process via a point-and-click checkbox; detected changes are then highlighted at the character, word or phrase level.
- Parallel extraction tasks shorten completion times, allowing larger extractions to scale.
- Build-and-expand capabilities turn the act of reusing a single agent for related extraction tasks into a one-click event, allowing for faster Agent creation.
- A simplified user interface makes Agent development easier and faster.
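As a rough illustration of what highlighted change detection at the character or word level can look like, the following Python sketch compares two versions of an extracted value using the standard library's difflib module. It is not Connotate's implementation; the function name highlight_changes and the [[...]] markers are assumptions made for this example only.

```python
import difflib


def highlight_changes(old: str, new: str, level: str = "word") -> str:
    """Return `new` with changed or inserted spans wrapped in [[...]].

    Hypothetical helper for illustration; not part of any Connotate product.
    """
    if level == "char":
        a, b, joiner = list(old), list(new), ""
    elif level == "word":
        # Splitting on whitespace means runs of spaces are normalized.
        a, b, joiner = old.split(), new.split(), " "
    else:
        raise ValueError("level must be 'char' or 'word'")

    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    out = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(b[j1:j2])
        elif tag in ("replace", "insert"):
            # Wrap the new or changed span as one highlighted unit.
            out.append("[[" + joiner.join(b[j1:j2]) + "]]")
        # "delete": the text is gone from `new`, so nothing to highlight.
    return joiner.join(out)


if __name__ == "__main__":
    before = "Price: 19.99 USD in stock"
    after = "Price: 24.99 USD in stock"
    print(highlight_changes(before, after, level="word"))
    print(highlight_changes("19.99", "24.99", level="char"))
```

Word-level comparison marks the whole changed token, while character-level comparison narrows the highlight to the digits that actually differ. Phrase-level highlighting could be approximated the same way by splitting on sentence or clause boundaries instead of whitespace.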
According to Connotate, the new release allows its intelligent extraction Agents to access about 95 percent of Webdata. The company adds that the adaptive platform can quickly accommodate new Web properties and technologies as they emerge, which it says allows the platform to scale well beyond competing offerings. Existing customers of Connotate’s hosted solution will not be affected by the introduction of this new platform. On-premises customers will be migrated on an as-needed basis.