XML: a multipurpose solution

This article appears in the October 2005 issue [Volume 14, Issue 9]


For years, Extensible Markup Language (XML) has been touted as the great equalizer for content management, the technology that will let companies bring it all together. But partly because standards have taken a while to develop, its progress has been steady rather than swift. Now that Microsoft Word is XML-enabled, awareness of the format is likely to increase. XML blends flexibility with structure. When heterogeneous data needs to be transformed, integrated, presented and queried, XML is the way to go.

Quick solution for a data feed

Companies often begin their use of XML in response to a single business need and then expand into other areas. U.S. Financial Life Insurance was asked by its parent company, the MONY Group, to begin sending the data feed of its agents' commissions in an XML format. Up to that point, U.S. Financial Life had been sending the information in printed format from its IBM AS/400. As a relatively small company with limited IT staff, U.S. Financial Life didn't have extensive resources with which to create a solution.

"We wanted an XML tool that was effective and easy to implement, and that would allow us to expand our XML offerings in the future," says Erik Simmons, VP of management information systems at U.S. Financial Life. The company followed up on a previous contact with Whitehill, which produces xml Transport, a product that transforms data from a variety of formats into XML. The deployment of xml Transport was straightforward.

"The representative from Whitehill came on a Monday and installed the software," Simmons recounts. "For the next few days he worked with our programmer to produce a test feed, and we took over by the end of the week."

The MONY Group is now receiving the commission data in the format it needs for its financial systems, and the existing databases used by U.S. Financial Life were not disrupted. The next step for the company will be to redo its financial Web site using XML data as the content source. That change will allow agents to download pending cases rather than just viewing them online.
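As a rough illustration of what "transforming data into XML" can mean for a commission feed, the sketch below converts a flat, comma-separated export into an XML document using only the Python standard library. It is not Whitehill's product or U.S. Financial Life's actual feed; the field names, element names and sample values are all assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical flat export; field names and values are illustrative only.
FLAT_EXPORT = """agent_id,policy,commission
A100,P-0001,125.50
A200,P-0002,80.00
"""


def commissions_to_xml(flat_text: str) -> str:
    """Convert comma-separated commission rows into one XML document."""
    root = ET.Element("commissions")
    for row in csv.DictReader(io.StringIO(flat_text)):
        # One <commission> element per row, keyed by the agent identifier.
        record = ET.SubElement(root, "commission", agentId=row["agent_id"])
        ET.SubElement(record, "policy").text = row["policy"]
        ET.SubElement(record, "amount").text = row["commission"]
    return ET.tostring(root, encoding="unicode")


xml_feed = commissions_to_xml(FLAT_EXPORT)
print(xml_feed)
```

The source system is left untouched in this approach: the conversion reads an export and writes a new document, which is the property that let the existing databases continue to run undisturbed.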

"XML is ideally suited to situations where organizations have different systems," observes Stephen Brooks, director of product marketing at Whitehill. For example, in Canada, four major telecoms merged into one organization, Aliant Telecom. From among those four firms, Aliant selected the best product offered for various services: landline, cell and Internet.

"Aliant wanted a unified bill," Brooks says, "but did not want to physically integrate the information systems from each of the service providers." The solution was to transform the data using Whitehill's xml Transport so it could be integrated into one billing system. By using XML, Aliant was able to avoid re-engineering its existing information systems while producing a simplified bill for its customers.

Customizing training content

Prosoft Learning publishes training materials in information and communications technology. The company creates learning content and also distributes it through a content distribution division. An online product called CustomPREP allows purchasers to easily customize content, so that they select and print just what they need. The original CustomPREP application generated an MS Word document using Visual Basic for Office. However, Prosoft found Visual Basic for Office somewhat restrictive and difficult to work with. When Prosoft began to develop learning content for Office 2003, the company decided to convert its training materials to XML and provide the CustomPREP product as a Web-based tool.

After some investigation, Prosoft selected a solution from Xyleme (xyleme.com), which offers a family of products that provide authoring, conversion and storage of XML content.

"Our documents are developed in MS Word, and we are now able to convert to XML using the Xyleme Content Migration Platform," says Lindsay Miller, VP of operations and CIO at Prosoft. Content is then stored on the Xyleme Content Server. "Depending on the lessons that a customer selects," he continues, "related exercises and other materials are identified and included in the download."

In addition, page references that point to other locations in the document are updated automatically when the user's selections alter the document. Prosoft's clients have responded very favorably to the smoother process for customizing their training content.

Xyleme is focused specifically on managing XML content. "We specialize in content that is semi-structured, dynamic and needs to be managed at a very granular level," says Therese McGee, VP of marketing at Xyleme. Often, she says, companies have tried other solutions and come to the realization that they need XML.

One example is the development and management of derivatives contracts, which are sophisticated and complex financial products. "No two contracts are the same, so financial services companies found that elements of the contracts cannot be easily stored and managed in a highly structured relational database," McGee says. An approach using a native XML database provides traders with real-time business intelligence and flexible, global reporting across products and customers.

Let's get mechanical

A prime example of using XML to integrate data from multiple sources is shown in a new application rolled out recently by Volvo Cars. The Vehicle Information and Diagnostics for Aftersales (VIDA) is a service parts and information system based on the Enigma 3C Platform. Documents are converted from their original format, if necessary, and accessed through Enigma 3C as XML documents. The system provides mechanics with such diverse content as service manuals, electronic parts catalogs, drawings and service bulletins through a single interface. The vehicle identification number (VIN) is linked to each piece of relevant content, so a mechanic can locate information specific to that vehicle and order parts or identify required service actions.

"Manufacturers represent some of the largest publishers in the world," says John Snow, VP of marketing and business development at Enigma. "Properly maintaining a complex piece of equipment such as an automobile, aircraft or weapon system requires fast access to a wide variety of constantly changing information."

In addition, the data must be displayed in a way that facilitates the workflow of the mechanics and technicians who maintain the equipment. Use of XML allows information from many sources to be integrated into a single interface. "Information can be filtered so that only the relevant parts and procedures are displayed," adds Snow, "which makes life easier for technicians."
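The filtering Snow describes can be pictured with a small sketch. It assumes a hypothetical XML index in which each piece of content lists the vehicles it applies to; the element names, attribute names and placeholder VINs are invented for illustration and do not reflect Enigma's or Volvo's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical content index; element names and VIN placeholders are made up.
CONTENT = """
<library>
  <item type="bulletin" title="Brake recall notice">
    <appliesTo vin="VIN-EXAMPLE-0001"/>
  </item>
  <item type="manual" title="Timing belt replacement">
    <appliesTo vin="VIN-EXAMPLE-0001"/>
    <appliesTo vin="VIN-EXAMPLE-0002"/>
  </item>
  <item type="catalog" title="Diesel injector parts">
    <appliesTo vin="VIN-EXAMPLE-0002"/>
  </item>
</library>
"""


def titles_for_vehicle(xml_text, vin):
    """Return titles of the items whose appliesTo list includes this VIN."""
    root = ET.fromstring(xml_text)
    return [
        item.get("title")
        for item in root.findall("item")
        if any(a.get("vin") == vin for a in item.findall("appliesTo"))
    ]


print(titles_for_vehicle(CONTENT, "VIN-EXAMPLE-0001"))
```

Because manuals, bulletins and catalog entries share one structure, a single lookup serves every content type, which is the point of presenting them through one interface.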

XML also facilitates localization of content by streamlining the translation process from one language to another. In the case of Volvo's VIDA system, support is provided for 17 different languages, making the system a true global resource for 3,000 Volvo dealers and 12,000 independent motor traders.

XML and searching

Because XML has both structured and unstructured aspects, it offers the best of both worlds for searching. Queries typical of those run on relational databases can make use of the metadata, while text searches can also be carried out using proximity, truncation and other techniques commonly used in searching documents. Nerac, a research firm that provides information on intellectual property for its clients, decided to migrate to XML.

"We wanted to enhance the power of our search capability," says Gerri Potash, VP of client services for Nerac, "as well as to better integrate content from disparate sources." Some of the providers of patent information were beginning to deliver their content in XML, and Nerac wanted to get the best value from the content it was purchasing.

Nerac settled on MarkLogic Server, an XML content management product from Mark Logic, which queries and presents information using XQuery, an XML standard. Research reports prepared by Nerac typically consist of a summary of the results of the search process, along with an interpretation of the information by the search specialists, who are experienced scientists and engineers. Reports might provide information on the latest developments in a particular industry, a literature review of technical journals or a summary of a patent search.

Because the underlying content is XML, Nerac can easily customize the search results that go into a report; for example, some clients might want to review just abstracts and the authors' names.
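To make the "best of both worlds" idea concrete, here is a minimal sketch that filters on metadata (a year attribute), applies a text search with truncation and proximity, and returns only the author and abstract. It uses plain Python rather than XQuery or MarkLogic Server, and the sample records, element names and function names are assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample records; not real patents.
DOCS = """
<patents>
  <patent year="2003"><author>Lee</author>
    <abstract>A polymer coating for turbine blades that resists heat.</abstract></patent>
  <patent year="1998"><author>Ortiz</author>
    <abstract>Heat exchanger with a coating applied after assembly of the polymer housing.</abstract></patent>
  <patent year="2004"><author>Khan</author>
    <abstract>Polymer membranes for water filtration.</abstract></patent>
</patents>
"""


def words(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def near(tokens, first, second, max_gap):
    """Truncation plus proximity: prefixes match, positions at most max_gap apart."""
    pos_a = [i for i, t in enumerate(tokens) if t.startswith(first)]
    pos_b = [i for i, t in enumerate(tokens) if t.startswith(second)]
    return any(abs(a - b) <= max_gap for a in pos_a for b in pos_b)


def search(xml_text, min_year, first, second, max_gap=3):
    """Structured filter on the year attribute, then text search on the abstract."""
    hits = []
    for patent in ET.fromstring(xml_text).iter("patent"):
        if int(patent.get("year")) < min_year:
            continue
        abstract = patent.findtext("abstract", default="")
        if near(words(abstract), first, second, max_gap):
            # Return only the fields this hypothetical client asked for.
            hits.append((patent.findtext("author"), abstract))
    return hits


print(search(DOCS, 2000, "polymer", "coat"))
```

Loosening either constraint changes the result set independently: lowering min_year widens the structured filter, while raising max_gap relaxes the text match.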

"MarkLogic Server gives us a more powerful way to search and present the content that we have," Potash says. "It helps us zero in on the right content and get to the answers that differentiate us from other information services."

Customers want to be able to look at relevant information at a finer-grained level than the whole document, according to Max Schireson, VP of services and alliances at Mark Logic. A given document might have a lower level of relevance overall, but might have just the nugget the user needs. "People need to be able to retrieve a paragraph, section or table that zeroes in on just the content they are looking for," he says.

A medical researcher wanting to find information on tennis elbow, for example, does not want a list of textbook titles, but a compilation of excerpts that relate directly to the topic. With XML, the content can be located and then rendered in a way that is meaningful to the user.
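The tennis elbow example can be sketched in a few lines. The code below assumes a hypothetical markup in which books contain sections and sections contain paragraphs; it returns matching paragraphs with their context instead of a list of titles. The sample text and element names are invented, and a production system would use an XML query engine rather than a loop like this.

```python
import xml.etree.ElementTree as ET

# Invented sample content; structure and wording are for illustration only.
BOOKS = """
<library>
  <book title="Sports Medicine Handbook">
    <section heading="Shoulder injuries">
      <para>Rotator cuff tears are common in swimmers.</para>
    </section>
    <section heading="Elbow injuries">
      <para>Tennis elbow is an overuse injury of the forearm tendons.</para>
      <para>Rest is usually recommended.</para>
    </section>
  </book>
  <book title="Orthopedics Review">
    <section heading="Fractures">
      <para>Most wrist fractures heal in six weeks.</para>
    </section>
  </book>
</library>
"""


def excerpts(xml_text, phrase):
    """Return (book title, section heading, paragraph) for each matching paragraph."""
    phrase = phrase.lower()
    results = []
    for book in ET.fromstring(xml_text).iter("book"):
        for section in book.iter("section"):
            for para in section.iter("para"):
                if phrase in (para.text or "").lower():
                    results.append(
                        (book.get("title"), section.get("heading"), para.text)
                    )
    return results


for title, heading, text in excerpts(BOOKS, "tennis elbow"):
    print(f"{title} / {heading}: {text}")
```

The unit of retrieval here is the paragraph, but the same loop could stop at the section or table level, depending on how finely the content is tagged.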

XML not lost in translation

XML's ability to deliver information to multiple destinations also extends to translating content from one language to another. The XML tags create a structure in which translated text can replace the original version (for example, English), so that the same content is displayed in a different language.
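A minimal sketch of that substitution follows. It assumes each translatable element carries a hypothetical id attribute and that translations arrive as a simple id-to-text mapping; neither assumption comes from Ektron's products, and the sample page is invented.

```python
import xml.etree.ElementTree as ET

# Invented source page; the id attributes are an assumption of this sketch.
SOURCE = """
<page lang="en">
  <title id="t1">Welcome</title>
  <body>
    <para id="p1">Contact your dealer.</para>
    <para id="p2">Thank you.</para>
  </body>
</page>
"""

# Hypothetical translations keyed by element id.
FRENCH = {
    "t1": "Bienvenue",
    "p1": "Contactez votre concessionnaire.",
    "p2": "Merci.",
}


def localize(xml_text, lang, translations):
    """Return a copy of the page with text swapped in wherever an id matches."""
    root = ET.fromstring(xml_text)
    root.set("lang", lang)
    for element in root.iter():
        key = element.get("id")
        if key in translations:
            element.text = translations[key]
    return root


french_page = localize(SOURCE, "fr", FRENCH)
print(ET.tostring(french_page, encoding="unicode"))
```

The tags, attributes and nesting are unchanged by the swap, so whatever layout or styling is driven by the structure applies equally to every language version.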

"The steps involved in sending out text in one language and replacing it with another can be automated, which helps global companies reach their customers in many local languages," says Bill Rogers, CEO of Ektron.

Ektron has developed two tools: eWebEditPro+XML, which allows non-technical users to create and edit XML content, and eWebEditPro, an authoring tool for creating and editing Web content, including conversion of MS Word documents into HTML. Those tools are bundled into Ektron CMS400.NET, a native .NET content management application geared toward midsize companies and projects.

"Ektron is one of few CMS [content management systems] vendors now supporting XLIFF, the standard that is being adopted by most language translators," says Rogers. XLIFF (XML Localization Interchange File Format) provides a data structure with features that help translators, such as the ability to track multiple alternatives for a translation.

A translator might want to use several approaches sequentially: first, a program that matches a phrase to a previously translated text string in a database, and then a machine translation program that uses linguistic rules. The human translator can then select the best alternative, or translate the text directly. XLIFF presents the results of those operations without showing the complex structural formatting underlying the text that is displayed.
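The idea of tracking multiple alternatives can be sketched as follows. The fragment is modeled loosely on XLIFF 1.2 element names (trans-unit, source, alt-trans, target) and is simplified; the origin labels, numeric match-quality values and sample sentences are assumptions of this example, so the XLIFF specification should be consulted for the exact attribute rules.

```python
import xml.etree.ElementTree as ET

# Namespace used by XLIFF 1.2 documents.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Simplified, invented fragment; attribute values are illustrative.
XLIFF = """
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="welcome.html" source-language="en"
        target-language="fr" datatype="html">
    <body>
      <trans-unit id="1">
        <source>Thank you for your order.</source>
        <alt-trans origin="translation-memory" match-quality="100">
          <target>Merci pour votre commande.</target>
        </alt-trans>
        <alt-trans origin="machine-translation" match-quality="70">
          <target>Merci de votre ordre.</target>
        </alt-trans>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def alternatives(xml_text, unit_id):
    """List (origin, quality, text) for one unit, best candidate first."""
    root = ET.fromstring(xml_text)
    unit = root.find(f".//x:trans-unit[@id='{unit_id}']", NS)
    if unit is None:
        return []
    options = [
        (
            alt.get("origin"),
            int(alt.get("match-quality")),
            alt.findtext("x:target", namespaces=NS),
        )
        for alt in unit.findall("x:alt-trans", NS)
    ]
    return sorted(options, key=lambda option: option[1], reverse=True)


for origin, quality, text in alternatives(XLIFF, "1"):
    print(f"{quality:>3}  {origin}: {text}")
```

A translation tool would present this ranked list to the human translator, who picks one candidate or types a fresh translation.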


Judith Lamont is a research analyst with Zentek Corp., e-mail jlamont@sprintmail.com.

