Behind the scenes, XML sizzles
One of the requirements for effective knowledge management is data integration, which allows synthesis of input from disparate sources to create a meaningful picture of a customer or an enterprise. Yet the diverse repositories of data, which range from financial records to contracts and customer correspondence, have not been easy to integrate. The use of eXtensible Markup Language (XML) allows for interoperability across many applications. It offers new avenues for data exchange that support collaboration and e-commerce, but as a developing technology, it also faces some hurdles.
Among the most promising applications are:
- content management. Since XML is a file format designed for text, content management is a natural and relatively straightforward application. Technical documentation and any other content can be organized and tagged, for example, and then easily updated or repurposed.
- access to legacy data. Information from mainframes (which still store most electronic data, particularly in large corporations) can be extracted and presented in numerous ways, no matter what application was used originally to store it.
- e-commerce. XML is an ideal way to implement Web-based transactions. Marketplaces, catalogs, online purchasing and delivery can be managed over the Web. The ability of XML to handle transactions across multiple applications and enterprises is what sets it apart.
The most challenging application is e-commerce because it must reflect business processes, some of which are very complex. In addition, because e-commerce applications usually span multiple organizations, implementation requires more coordination. Nevertheless, the Gartner Group projects that 80% of B2B Web activity will be XML-based by 2003. A survey by IDG Research Services Group predicts that XML technology budgets will increase by 86% over last year.
Two essential ingredients for growth of XML-based applications are the stabilization of standards and the emergence of enabling technologies. Both have made significant progress in the past year. Many standards for vertical industries are being developed and accepted (see sidebar), and numerous software tools are emerging. Platforms such as Microsoft’s BizTalk Server 2000 and Lotus’ Domino 5.2 are providing the infrastructure for XML-based systems. Intel’s NetStructure 7280 XML and Director 7210 XML Accelerator are hardware devices designed to improve the processing of XML transactions by servers.
Tools of the trade
Some tools, such as those from Arbortext, are oriented around content management and B2B e-commerce solutions. Others, such as Business Engine’s product suite and Lotus Notes, focus on the collaborative opportunities provided by XML-based systems. UNIFACE, from Compuware, is among the niche products now emerging in the XML market; it generates XML on request. XMLSolutions, recently purchased by Vitria, makes a software product that translates EDI into XML.
IBM’s DB2 XML Extender enables the DB2 relational database to intelligently store and manage XML content. The tool allows DB2 to understand XML as a data type and builds a Data Access Definition that ties the XML structure to the database structure.
“XML is wonderful for information that has a high text content,” says IBM’s Jeff Jones, senior program manager with IBM Data Management Solutions, “since it can describe text intelligently. But storing the data in a file system is not as effective as using a relational database.”
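The idea of tying an XML structure to a database structure can be illustrated with a small sketch. It does not use DB2 or the XML Extender; it relies on Python's standard library and an in-memory SQLite database, and the element names, column names and the `MAPPING` dictionary are all invented for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical order document; the element names are illustrative only.
ORDER_XML = """
<orders>
  <order id="1001"><customer>Acme</customer><total>250.00</total></order>
  <order id="1002"><customer>Globex</customer><total>99.50</total></order>
</orders>
"""

# Hypothetical mapping from database column to a location in each <order>.
# A leading "@" means an attribute; anything else is a child element.
MAPPING = {"order_id": "@id", "customer": "customer", "total": "total"}


def extract(elem, path):
    """Return the attribute or child-element text named by path."""
    if path.startswith("@"):
        return elem.get(path[1:])
    return elem.findtext(path)


def load_orders(xml_text, conn):
    """Copy every <order> into the orders table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id TEXT, customer TEXT, total REAL)"
    )
    root = ET.fromstring(xml_text)
    rows = [
        tuple(extract(order, path) for path in MAPPING.values())
        for order in root.findall("order")
    ]
    columns = ", ".join(MAPPING)
    conn.executemany(f"INSERT INTO orders ({columns}) VALUES (?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
count = load_orders(ORDER_XML, conn)
result = conn.execute(
    "SELECT customer, total FROM orders ORDER BY order_id"
).fetchall()
```

Once the content is in tables, ordinary SQL queries can be run against it, which is the advantage Jones points to over keeping the XML in a file system.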
Implementation has proceeded more rapidly in some industries than in others. “Right now, the most active verticals in e-commerce are manufacturing companies,” says Don Nanneman, VP of Savvion, “because the supply chain is well defined.” Savvion’s BusinessManager, a workflow product, is built on Java and uses XML extensively as the page description and communications language within modules. Nanneman emphasizes the importance of viewing the entire supply chain process, rather than just what happens within an enterprise.
“The flexibility of XML is one of its greatest benefits,” says Chris Carfi, marketing director at Extricity, “but at the same time it poses a challenge, because everyone can define things their own way. You need to define the business processes carefully, for example, and decide how much you want to share with your partners.”
Extricity is a B2B software platform provider with a suite of products; its Extricity B2B product models business processes, manages integration into back-end systems and coordinates interactions with external organizations.
Even within a company, coordination isn’t necessarily assured. Kelly Ward, managing director of the Southeast region for Acuent, which provides e-business consulting and integration services, cites an example. In one company, five departments had each developed their own schemas. “Everyone thought their idea was the best,” says Ward. “The key issues were not technological but organizational.” Schemas provide a set of rules that define an XML document.
“XML offers a powerful and flexible means of transforming and integrating data sources to support e-business,” says Steven Wright, marketing director for the Critical Services Management Group at Candle Corporation. “However, this process depends on implementation and maintenance of standardized schemas.” He advises users to evaluate the schemas carefully to be sure they meet the company’s business needs.
Sunny Singh, CEO of Edifecs, agrees, “Standardization is very important but can be painstakingly slow.” One reason for the slow progress is that making the meanings consistent can be difficult. The tags might match, but the interpretation even of simple words such as “revenue” might not. Edifecs produces SpecBuilder, a schema authoring and management tool.
While the schemas must be compatible, data sources can vary. XML offers great promise in a data world that is very fragmented. More tools are emerging to translate to and from XML.
“We believe the world will be heterogeneous,” says Singh. “SpecBuilder can work with any file format, including XML, EDI and proprietary formats. For example, it can convert an EDI schema to XML or define an XML schema from scratch.”
He also recognizes the potential for XML in the e-commerce area. “Accessing legacy data and reconciling information across systems within an enterprise represent the low-hanging fruit for XML,” he says, “but the long-term growth will be from exchanging XML-based information with trading partners.”
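Converting EDI to XML, as several of the tools mentioned above do, can be pictured with a deliberately simplified sketch. It is not how SpecBuilder or any other product works. It assumes an X12-style message in which segments end with "~" and elements are separated by "*", and the sample message and the XML element names are hypothetical. Real EDI also has envelopes and sub-element separators that are ignored here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified X12-style fragment: two segments, "*" between elements.
EDI_TEXT = "BEG*00*SA*PO-77**20010501~PO1*1*10*EA*9.99~"


def edi_to_xml(edi_text, segment_sep="~", element_sep="*"):
    """Turn each EDI segment into a <segment> with positional <element> children."""
    root = ET.Element("transaction")
    for raw in edi_text.split(segment_sep):
        raw = raw.strip()
        if not raw:
            continue  # skip the empty piece after the final separator
        seg_id, *fields = raw.split(element_sep)
        segment = ET.SubElement(root, "segment", id=seg_id)
        for position, value in enumerate(fields, start=1):
            if value:  # empty EDI elements are simply omitted
                ET.SubElement(segment, "element", pos=str(position)).text = value
    return root


doc = edi_to_xml(EDI_TEXT)
xml_text = ET.tostring(doc, encoding="unicode")
```

The result only mirrors the EDI layout. Giving each position a meaningful tag name is the part that requires an agreed schema.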
And now the downside
Like many new technologies, XML has had its advantages touted while its limitations remain in the background, only to be discovered with experience. For example, the metadata contained in the XML tags provides the information that makes the contents meaningful, but it also adds to file size.
“If only a small amount of data is being moved, as in real-time transactions,” says Melody Huang, chief architect at Keane (keane.com), “the increment may not produce a noticeable difference in performance. But for batches of files containing many tags, transmission time could be affected substantially.” Rather than jump on the XML bandwagon, she advises, users should carefully consider the nature of the transactions they are planning and be sure there is a fit.
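The size penalty is easy to measure for a given data set. The sketch below, using only Python's standard library, writes the same 1,000 records once as XML and once as comma-delimited text and compares the byte counts. The record layout and tag names are hypothetical, and the ratio will vary with tag length and data content.

```python
import xml.etree.ElementTree as ET

# Hypothetical batch of 1,000 small records.
records = [
    {"sku": f"A{i:04d}", "qty": str(i % 10), "price": "9.99"}
    for i in range(1000)
]


def to_xml(rows):
    """Serialize the rows as <items><item><sku>..</sku>..</item>..</items>."""
    root = ET.Element("items")
    for row in rows:
        item = ET.SubElement(root, "item")
        for name, value in row.items():
            ET.SubElement(item, name).text = value
    return ET.tostring(root, encoding="utf-8")


def to_delimited(rows):
    """Serialize the same rows as comma-delimited lines with no tags."""
    return "\n".join(",".join(row.values()) for row in rows).encode("utf-8")


xml_bytes = to_xml(records)
flat_bytes = to_delimited(records)
overhead = len(xml_bytes) / len(flat_bytes)
print(f"XML: {len(xml_bytes)} bytes, delimited: {len(flat_bytes)} bytes, "
      f"ratio: {overhead:.1f}x")
```

Running a comparison like this on representative data is one way to check the fit Huang recommends before committing to XML for large batch transfers.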
A final hurdle to the adoption of XML is communication with potential users about the technology. On one software company’s Web site appears the following:
“In sharp contrast to most XML repositories that decompose the XML instance file into element content objects and store the markup as metadata in a relational or an object-oriented database, [anonymous product] doesn't touch the native XML instance, but rather parses it and builds indices based on markup.”
And who could argue with that? But on the other hand, a newcomer to XML won’t make much sense of it. The basic value of XML in terms of interoperability is easy to grasp, but the topic gets technical very quickly. Vendors and systems integrators should make a concerted effort to convey to their customers both the value and the limitations of XML, and develop tools to make it more accessible to users.
John Matranga, director of XMLabs, offers the following advice to prospective users of XML:
- Don’t be overly focused on any one use, but take the time to set a generic approach. Put an infrastructure in place that can handle many uses, and pick one use to test your architecture decisions.
- Keep an eye on what’s happening in the vertical industries with respect to standards.
- Don’t wait until all the vertical standards are finalized, because XML is still evolving. Plan on adjusting the mapping when final versions are ratified.
XMLabs, a division of Omicron, specializes in research and implementation of XML technology. Matranga notes that XML infrastructures have recently become more compatible, and will continue to become more so over time.
Microsoft BizTalk Server 2000 is an XML-based platform that allows companies to integrate applications or trading partner relationships by sharing data. One in a series of Microsoft’s .NET servers that includes Commerce Server 2000 and Exchange 2000, BizTalk Server is oriented toward business processes and sharing applications across organizations.
The BizTalk Editor provides a user-friendly way of creating schemas for business operations, and also handles translation from other formats such as EDI and text files into XML. The BizTalk Mapper can create a template that tells the server how to transform one XML document into a new XML document. That process allows many different input documents to be mapped into a standard target format. Although initial setup must be done by programmers (connecting the server to the existing or new infrastructure, for example), once it is in place, business analysts can set up maps with new partners.
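The notion of a map that turns one XML document into another can be sketched in a few lines. This is not BizTalk code and does not reflect how the BizTalk Mapper is implemented. It is a minimal illustration in Python's standard library, and the document, the tag names and `FIELD_MAP` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document as one trading partner might send it.
PARTNER_XML = (
    "<PurchaseOrder>"
    "<PONumber>PO-77</PONumber><Buyer>Acme</Buyer><Amount>120.00</Amount>"
    "</PurchaseOrder>"
)

# Hypothetical map: partner's element name -> name in the standard target format.
FIELD_MAP = {"PONumber": "OrderId", "Buyer": "CustomerName", "Amount": "Total"}


def transform(source_xml, field_map, target_root="Order"):
    """Build a new document by copying each mapped field under its new name."""
    source = ET.fromstring(source_xml)
    target = ET.Element(target_root)
    for src_name, dst_name in field_map.items():
        value = source.findtext(src_name)
        if value is not None:  # fields absent from the source are skipped
            ET.SubElement(target, dst_name).text = value
    return ET.tostring(target, encoding="unicode")


standard_xml = transform(PARTNER_XML, FIELD_MAP)
# standard_xml is:
# <Order><OrderId>PO-77</OrderId><CustomerName>Acme</CustomerName><Total>120.00</Total></Order>
```

Supporting a new partner in this sketch means supplying another map rather than changing the transformation code, which is the same division of labor the article describes between programmers and business analysts.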
A major feature called BizTalk Orchestration is designed to create workflows, and uses the XLANG language, which is an XML language designed for workflow descriptions. One of the key features of BizTalk Orchestration is its ability to separate the business process from its implementation (implementation being the steps needed to actually make the system work, such as messaging and security). The business analyst therefore can focus on the process, while the systems integrator addresses the implementation issues.
BizTalk Server 2000 runs on Windows 2000 Advanced Server and requires SQL Server 7.0 or later. It is priced at $4,995 per CPU for an edition that supports a limited number of applications and trading partners. An enterprise edition priced at $24,999 per CPU supports an unlimited number of applications and trading partners.
A look at some standards and definitions
In the electronics and computer industry, RosettaNet (rosettanet.org) is developing a common e-business language for the information technology, electronics and semiconductor industries. Partner Interface Processes (PIP) define business processes among members of those industries. The RosettaNet Partner Conference held in April focused on RosettaNet standards development, successful PIP implementations and emerging market opportunities.
The Chemical Industry Data Exchange (CIDX) announced early this year that it had completed a phase of the Chem eStandards initiative to develop XML-based standards for the chemical industry. More than 700 data elements have been defined, and nearly 50 transactions can be conducted. Document Type Definitions (DTDs), which define document structures, have been developed for such actions as request for quote, order cancel, payment, and shipment status.
Approval of the Technical Architecture Specification for ebXML was announced in February. ebXML is a set of specifications for an electronic business framework that allows enterprises to conduct business through the exchange of XML-based messages. The ebXML Messaging specification defines how common protocols such as SMTP, HTTP and FTP are used for data exchange. The Technical Architecture Specification provides the foundation for all other ebXML specifications. The goal is to deliver a complete suite of specifications this month (May).
The Simple Object Access Protocol (SOAP), a protocol for running applications over the Internet, was recently approved by the World Wide Web Consortium. Its role in running applications is somewhat similar to the role of the TCP/IP standard in transmitting data over the Internet, allowing applications to communicate with each other. While SOAP does not have universal support in the industry, it does make possible Web-based distributed computing through a relatively simple messaging-based solution.
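A SOAP message is itself an XML document: an Envelope containing a Body that carries the application's request or response. The sketch below builds and reads such a message with Python's standard library. The envelope namespace is the one used by SOAP 1.1; the `GetQuote` operation, the `urn:example:quotes` namespace and the function names are hypothetical, and nothing is sent over a network.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:quotes"  # hypothetical application namespace


def build_request(symbol):
    """Wrap a hypothetical GetQuote call in a SOAP Envelope and Body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}GetQuote")
    ET.SubElement(call, f"{{{APP_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")


def read_symbol(message):
    """Extract the Symbol argument the way a receiving application might."""
    root = ET.fromstring(message)
    path = f"{{{SOAP_NS}}}Body/{{{APP_NS}}}GetQuote/{{{APP_NS}}}Symbol"
    return root.findtext(path)


message = build_request("IBM")
symbol = read_symbol(message)
```

In practice the message would typically be posted to a server over HTTP, and the reply would come back in another envelope of the same shape.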
A schema is a set of rules for defining a particular type of XML document, including any constraints on the content of data elements.
A Document Type Definition (DTD) describes formally the information structure in an XML document.
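To make the idea of rules and content constraints concrete, here is a minimal sketch in Python. It is not a DTD or an XML Schema, and it does not use a validating parser. The rules are written as a plain dictionary, and the `requestForQuote` document, its element names and the constraints are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for a request-for-quote document: each required child
# element is paired with a constraint on its text content.
RULES = {
    "product": lambda text: len(text) > 0,
    "quantity": lambda text: text.isdigit() and int(text) > 0,
}


def validate(xml_text, root_name="requestForQuote"):
    """Return a list of rule violations; an empty list means the document passes."""
    errors = []
    root = ET.fromstring(xml_text)
    if root.tag != root_name:
        errors.append(f"root element must be <{root_name}>")
    for name, check in RULES.items():
        text = root.findtext(name)
        if text is None:
            errors.append(f"missing <{name}>")
        elif not check(text.strip()):
            errors.append(f"invalid content in <{name}>")
    return errors


good = "<requestForQuote><product>Resin</product><quantity>40</quantity></requestForQuote>"
bad = "<requestForQuote><product>Resin</product><quantity>many</quantity></requestForQuote>"

good_errors = validate(good)  # no violations
bad_errors = validate(bad)    # quantity is not a positive whole number
```

A real schema expresses the same kind of rules in a standard, shareable form, so that trading partners can check documents against one agreed definition.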
Judith Lamont is a research analyst with Zentek Corp., e-mail email@example.com.
XML--a government perspective
By Owen Ambur
In essence, KM comes down to two things: 1) profiling records (explicit knowledge) by associating metadata with them, and 2) profiling people (tacit or implicit knowledge). In a digitally networked environment, the terms "metadata" and "management" are virtually synonymous. The issue is: How much management (i.e., metadata) is warranted for any particular record, series of records, person or grouping of people?
The potential to use XML to classify government records was one of the original drivers that led to formation of the XML Working Group--see http://xml.gov/documents_completed.htm and specifically http://irm.fws.gov/xmlops.htm. The basic set of metadata that is applicable to all records created and received by U.S. federal agencies is set forth in DoD Std. 5015.2 (http://users.erols.com/ambur/RMmetadata.htm#5015.2).
Initiatives such as the White and Blue Pages, UDDI, HR-XML and the HR-DN are beginning to point the way toward profiling people to enable the efficient and effective sharing of implicit knowledge across organizational boundaries.
Owen Ambur is co-chair of the XML Working Group and a member of the Federal Information and Records Managers board of directors, e-mail firstname.lastname@example.org