Volume 20, Number 6 • March 22, 2004
Extended fixup language

The notorious complexity and inefficiency of XML have opened up some intriguing and perhaps lucrative opportunities for technology startups

One of the most pleasing things about technological progress, at least from the perspective of the early-stage investor, is that it is rife with imperfections. It is, in fact, fraught with mistakes, misguided assumptions, and miscalculations, some big, some small, many of which translate into splendid opportunities for creating new companies and products. No firm, no engineer, no computer scientist can predict the precise consequences of this new protocol or that new product line. And so, there are always bothersome yet potentially lucrative potholes, detours, dead-ends, and downed bridges on the road that leads to the future.

We doubt that anything as quantifiable or formal as Darwinian evolution is at work here, but the comparison is an obvious and appealing one. New ecological niches are constantly opening up as organisms, and products, compete with each other to either occupy or eliminate those niches. It's thought, for example, that certain dinosaur species originally developed wispy protrusions on their skin as a mechanism for dissipating excess internal heat. Over time, as the surrounding environment changed under stress from various factors, these protrusions evolved into something radically new and beneficial to descendant species in a quite unpredictable way: They became what we now recognize as feathers and eventually helped their owners to achieve flight.

A prime example of this phenomenon can be seen today in that somewhat woolly but formidable species of software called XML, which is enjoying tremendous acceptance in enterprise and Web-based computing. We've been keeping a keen eye on this so-called eXtensible Markup Language ever since it hit the market six or seven years ago.
Only in the past couple of years, though, with the advent of Web services, have we seen any truly promising opportunities for startups to make money directly from the XML boom. XML editors and parsing programs were early to appear, but as products they were hardly the seeds of defensible businesses. Lately, however, the rising use of XML in Web services has revealed some unforeseen shortcomings in the language and caused a growing stress on the surrounding IT environment. The result: some brand-new and financially appealing inventions, which we'll take a look at in this letter.

Lingua franca

The great promise of XML has always been to do for computer-to-computer communications what HTML did for the Web, namely provide a standardized way for data to be self-describing and therefore more widely useful across differing systems. On the Web, HTML tags describe to any properly configured browser program, regardless of brand, an author's intentions for how a particular Web page's text and graphics should be displayed on a screen: text font, color, visual layout, and so forth.

XML takes this tagging idea a major step further by indicating something about the meaning of specific data elements. In short, XML tags can provide rich layers of context that software applications can, with the proper setup, use to interpret the data supplied by other machines. Within an enterprise, for example, XML tags could help two customer relationship management (CRM) systems share information about specific individuals or accounts, even though the individual records and tables in the two apps' databases are organized quite differently. On the Web, XML tags can help companies transact business with each other electronically, in much the same way that formally structured EDI (electronic data interchange) messages have done in the past but with considerably more flexibility and scope.

Loosey goosey

Intensive use of XML, however, raises some stiff technical challenges.
It's both exceedingly complex and exceedingly verbose. Despite all the talk of its enjoying support from IBM, Microsoft, Sun Microsystems, and just about every other computer company on the planet, the XML standard doesn't actually define, in any unambiguous way, a language or data structure for all to share and use. Instead, the specs simply lay out a formalized way in which any party, supplier or user, can construct their own language - their own set of tags, or meta-data, that is, for describing the meaning of specific data elements. That's not a trivial step forward, by any means, but it hardly solves, once and for all, the problem of getting disparate computer systems to share data with each other.

In fact, the XML spec was adopted and then vigorously promoted by the industry's major players without first having been forged into a proper, well-defined standard, critics say. It was derived from a 30-year-old document markup language called SGML, which was never intended to be used in representing data as such, only documents intended for printing or display on a screen. As a result, right from the start, the computer industry has been essentially jury-rigging XML to handle tasks it's not actually designed for.

There is tremendous room for interpretation within the formal XML spec, to the point where a single field of data and its value can be represented in a good 20 different ways, all of them mutually incompatible. The spec more or less describes how tags are to relate to each other - namely in a nested hierarchy - but it doesn't say anything about how to arrange the branches of that hierarchy or what meaning to assign each of them. The naming and organization of these tags is left up to each user.

Word play

The result, as is increasingly obvious, is that myriad XML dialects have flourished around the world, all erected on a flawed and ill-defined foundation.
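For readers who like to see the point concretely, here is a minimal sketch, in Python, of how one fact can be written in several well-formed but mutually incompatible ways. The three documents, their tag names, and the reader functions are all hypothetical, invented for illustration; they are not drawn from any real industry dialect.

```python
import xml.etree.ElementTree as ET

# Three hypothetical, well-formed encodings of the same fact:
# customer 42 is named "Acme". None of these comes from a real dialect.
AS_ATTRIBUTES = '<customer id="42" name="Acme"/>'
AS_CHILD_ELEMENTS = '<customer><id>42</id><name>Acme</name></customer>'
AS_GENERIC_FIELDS = (
    '<record type="customer">'
    '<field key="id">42</field>'
    '<field key="name">Acme</field>'
    '</record>'
)


def read_attributes(doc):
    """Reader for the dialect that stores values as attributes."""
    root = ET.fromstring(doc)
    return {"id": root.get("id"), "name": root.get("name")}


def read_child_elements(doc):
    """Reader for the dialect that stores values as child elements."""
    root = ET.fromstring(doc)
    return {"id": root.findtext("id"), "name": root.findtext("name")}


def read_generic_fields(doc):
    """Reader for the dialect that uses generic key/value fields."""
    root = ET.fromstring(doc)
    return {field.get("key"): field.text for field in root.findall("field")}


if __name__ == "__main__":
    # Each reader recovers the data from its own dialect...
    print(read_attributes(AS_ATTRIBUTES))
    print(read_child_elements(AS_CHILD_ELEMENTS))
    print(read_generic_fields(AS_GENERIC_FIELDS))
    # ...but finds nothing when handed a document in another dialect.
    print(read_child_elements(AS_ATTRIBUTES))
```

Every one of these documents is legal XML, yet a program written for one layout comes up empty on the others. Multiply that by every field in every record and the translation burden becomes clear.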
Differences abound not only between disparate industries but between companies within those industries and even between brands of software installed within a single company. There have been numerous attempts, some of them quite successful, to hammer out dialects for specific industries, such as financial services, autos, and electronics, but there still remain significant differences between this system's and that system's definition of even such common items as Customer, Order, and Volume Discount, for instance.

In the box

To patch over these differences and exploit XML effectively turns out to require extensive work in translating between XML vocabularies, a non-trivial task. Great amounts of computing energy must be spent to grind through XML documents that now contain perhaps five times more characters of XML meta-data than characters of the original data itself. This ratio will likely increase over time, too, as XML is used more extensively and documents and messages are intended to carry around more context with them wherever they go.

These problems, and others, have opened the door to a handful of startups including Conformative Systems, DataPower Technology, and Sarvega. Together, these firms have established the market for a new kind of network infrastructure box, an appliance or add-in card that's designed to speed up a highly specialized task that would otherwise get done in software and consume most of the cycles available in traditional application servers. The trouble with the latter approach is that it forces customers to beef up their app servers or suffer serious declines in transaction-processing rates. So, an alternative is called for.

Climbing trees

Within the increasingly large, XML-tagged documents are intricate hierarchical structures of tags and data that must be navigated with great precision when each document is being evaluated.
It's not enough to simply search the document for a certain string of characters that represent a given XML tag and then map that string to a corresponding tag in the target XML dialect. That's a job that could be sped up by using specialized pattern-recognition hardware, perhaps based on the kind of application-specific integrated circuits (ASICs) that have been developed for searching text documents or inspecting data packets at high speed. In fact, with XML, it's necessary to keep track of exactly where in the hierarchical structure each tag is located - in which position on which sub-branch, for instance - because that location and its relation to other tags indicates its meaning. And that calls for some specialized hardware of its own.

The preferred way of handling such translations is a scripting-based language called XSLT, which is part of yet another language called XSL. Programmers use XSL to define and work with XML stylesheets. The first XML boxes that market pioneers DataPower and Sarvega developed relied on standard Intel-Linux boxes to run performance-tuned XSLT programs that would evaluate and transform XML documents and messages at high speed. They simply offloaded XML processing from app servers, which provided a significant performance advantage.

In some cases, the overhead incurred by analyzing and transforming XML documents from one format to another as they move between apps can consume as much as 80% of the processing power available in a datacenter's app servers. So taxing is XML processing that one particularly powerful application of the technology - validating XML documents to make sure they conform to a given schema - is generally avoided by most organizations.

Home baking

Clearly, adding app server capacity would help. But as the volume of XML-tagged traffic grows by leaps and bounds, and as the complexity of the XML being used continues to grow, it looks increasingly attractive to apply specialized silicon to the job.
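To illustrate why a tag's position matters more than its spelling, consider a rough sketch in Python. It is a stand-in for the idea only: it is not XSLT, and it is certainly not how any of the appliances discussed here work internally. The source document, the target tag names, and the PATH_TO_TARGET_TAG table are all hypothetical.

```python
import xml.etree.ElementTree as ET

# A hypothetical source document in which the tag <name> appears twice,
# with a different meaning on each branch of the hierarchy.
SOURCE = (
    "<order>"
    "<customer><name>Acme</name></customer>"
    "<item><name>Widget</name></item>"
    "</order>"
)

# Hypothetical mapping from a full path in the source vocabulary to a
# tag in the target vocabulary. Keying on "name" alone could not work,
# because both paths end in the same tag.
PATH_TO_TARGET_TAG = {
    "order/customer/name": "BuyerName",
    "order/item/name": "ProductName",
}


def translate(source_xml):
    """Build a flat target document by walking the source tree while
    tracking the path from the root down to each element."""
    source_root = ET.fromstring(source_xml)
    target_root = ET.Element("PurchaseOrder")

    def walk(element, path):
        target_tag = PATH_TO_TARGET_TAG.get(path)
        if target_tag is not None:
            child = ET.SubElement(target_root, target_tag)
            child.text = (element.text or "").strip()
        for sub_element in element:
            walk(sub_element, path + "/" + sub_element.tag)

    walk(source_root, source_root.tag)
    return target_root


if __name__ == "__main__":
    print(ET.tostring(translate(SOURCE), encoding="unicode"))
```

A plain string search for "name" would find both elements and have no way to tell buyer from product. Only the path through the tree settles the question, and carrying that path along for every element is the bookkeeping that makes XML transformation so expensive at volume.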
And that's what several of the XML appliance companies have set out to develop. Fortunately, as we understand it, XSLT is what's called a declarative programming language, in contrast to procedural languages like Java and C, so it can take good advantage of parallel-processing techniques. What's more, it appears that building ASICs will be more effective than just ganging together multiple Pentiums or some other general-purpose processor. That approach would consume lots of physical space and electrical power and generate excessive heat.

The startups are not describing their planned ASICs in much detail, except to say that they will use parallel processing architectures specially designed for the tasks of parsing and validating XML documents. The earliest-stage of the startups, Conformative Systems, tells us it will use its ASIC in an appliance designed solely for the task of translating between XML vocabularies and other data formats. Transforming XML documents is a function that will be increasingly important in business-to-business e-commerce as purchase orders, logistics schedules, and other elements required to execute a long, extended B2B transaction move back and forth between suppliers, customers, and other business partners.

Bridging the gap

The kinds of enterprise application integration (EAI) facilities provided by firms such as WebMethods and Vitria can be of help here, but they are typically designed to aid businesses that already are aware of each other and have sat down to jointly hammer out point-to-point links between their respective applications. The promise of XML and the kind of Web services it makes possible is that they will enable companies to do business with each other with much less prearrangement. (A startup called Cast Iron Systems has developed what it sometimes calls EAI-in-a-box: a self-contained appliance designed to translate data exchanged by disparate enterprise apps.
Its main advantage is that it can be remotely configured and managed, and thus installed in far-off locations, and it pays for itself in 6 months or less.)

Conformative sees its technology helping not only with translating between XML vocabularies but also in transforming data extracted from a relational database into XML, or vice-versa. That's a task that Oracle and other database software makers are currently handling with their own code, executing on the same server as the database management system itself. But as the volume and complexity of XML traffic increase - to as much as 20% of all LAN traffic by 2006, according to our friends at Gartner - this arrangement will likely bring the database server to its knees. Likewise, Conformative's product is being designed to handle two-way transformations between legacy apps and XML.

Fast talkers

Eventually, we can imagine, the day will come when XML-tagged data predominates in enterprise computing, torrents of the stuff moving across enterprise data buses, or backbones. Conformative, for one, sees an opportunity there in providing not only a stand-alone XML processing appliance but also a plug-in card that would provide its ASIC's abilities directly within individual servers. Its appliance, the firm tells us, will be able to keep up with 1-Gbps Ethernet network streams, while the first plug-in card will handle volumes as high as 250 Mbps. The company maintains that XML processing will turn out to be much like graphics processing, a function that will benefit from special-purpose chips and won't be easily subsumed into the server's main processor.

DataPower, now 5 years old, and Sarvega, founded in 1999, first caught our attention several years ago with appliances designed to speed the routing of XML-formatted messages within a network of servers.
By inspecting the tags on an incoming message, for instance, the appliance might determine the kind of transaction it embodied or the specific customer who sent it and thereby send it to an appropriate server. This kind of load balancing and content-based routing seemed to hold considerable promise, as we explored here. (Please see "Getting real," June 24, 2002.) But since then, both DataPower and Sarvega have started to pay more attention to the problem of securing XML-based transactions.

Zip your lip

Security in the context of XML entails a number of functions, all dependent on parsing and analyzing the XML-defined structure in a message or document. The security issue has taken on particular importance as organizations start to expose the data and functions of their most vital applications in the form of XML-based Web services. In theory, hackers could purposely inject malicious XML tags into an otherwise valid Web services message, causing all kinds of mischief. What's more, because of the great complexity of most XML-based tagging schemes, legitimate programmers may easily make mistakes when structuring an XML document and that, too, can wreak havoc on the receiving end. There are even distributed denial-of-service (DDoS) attacks that rely heavily on XML.

To block these and other XML-rooted security problems, it can be helpful to validate incoming and outgoing XML structures to make sure they adhere to an organization's chosen definitions and specifications. In some cases, the XML will contain digital signatures and other encryption-related features, the processing of which may be offloaded from an app server, too.

Sarvega plans to unveil its first line of XML security appliances, called Guardian, today. The company sees the products appealing to online retailers, telecom providers, government agencies, and other users that deal in significant quantities of XML data.
The new products make use of a dual-Pentium architecture and a third party's crypto-processing chip to handle digital signatures. Though the company has not disclosed plans to develop its own ASIC, it does point out that one of its investors is Intel, with which it is working closely. Sarvega tells us it has just won a deal to supply an unidentified European government agency with more than 100 of its XML appliances.

Over the horizon

DataPower, meanwhile, plans to unveil an XML-focused ASIC in the second quarter of this year. The chip will be used to speed up both DataPower's XML accelerator box and its security gateway product, which are now Pentium-based, and will eventually show up on a PCI card for insertion into standard server chassis. The company is selling direct and is enlisting a growing list of resellers, systems integrators, and OEMs. Among DataPower's customers are JP Morgan and the Dept. of Defense.

We think that the big question hovering over these XML accelerator companies is whether or not their endeavors will attract the attention of Cisco Systems, Nortel, and other established networking equipment makers. By some lights, those companies are likely to acquire one or more of the startups as a way to break into what's being called the applications-aware networking market. This is a well-set pattern in networking, seen previously when layer 2 and 3 startups building routers, switches, load balancers, and other gear were snapped up at lush premiums. We can't see why this new market should be any different.

XML does seem to be here to stay, even if it is not the most efficient software scheme ever conceived. As our analyst friend Tom Rhinelander, head of New Rowley Group, says, XML looks like software's answer to duct tape: hardly perfect, hardly elegant, but quite useful in a million and one situations.

Fattened up

Indeed, all indications are that the use of XML will continue to accelerate in coming years. The reasons are many.
We're not sure we go along with the idea, put forth by more than a few industry watchers on the Web, that makers of servers and networking gear have conspired to increase XML's notorious bloat - the relatively large size of XML-tagged data - because it helps them sell more hardware. We can believe, however, that the extraordinary complexity of XML serves the interests of certain large computer companies, such as IBM, Microsoft, and Sun.
By being as complex and riddled with incompatibilities as it is, the XML scene is somewhat frightening to users and to smaller software companies. They can easily recognize the great benefits that XML promises to deliver, but they cannot master the technology without a fair amount of difficulty and cost. There are too many moving parts, as it were. So, they tend to migrate to one of the over-arching software schemes that the big companies are making available, each of which comes with a heavy dose of XML: Microsoft's .Net setup, Sun's J2EE, or SAP's Netweaver framework, for instance. Under each of those umbrellas users can find a broad range of tools, components, and architectures to harness enough of XML's promise without excessive hassle. Thus, a complex and difficult XML is just what these big firms like: A nominally open standard that has enough slack in it to permit a good deal of "proprietary-ness" - the source of most profits in computing - to be introduced.

Dipping in

We're aware of some interesting schemes afoot to help make working with XML easier for developers. If they gain traction, they should not only ease the pain of using XML effectively but in so doing also increase its usage. One of the most intriguing approaches is a programming language called Water, being commercialized by a Draper Fisher Jurvetson-backed startup called Clear Methods. It has been designed to handle and work with XML in a much more native way than most commercially important languages.

Consider that Microsoft's .Net comprises more than a dozen different programming languages, ranging from C# to HTML, Cascading Style Sheets, and Active Server Pages (ASP). Microsoft would have each one used to handle a different aspect of a task within a large, complex information system. Yet none of these languages is XML-native, so to speak.
Programmers working with each language must devise all sorts of complex routines to handle XML in a local way, but little of their code can be shared with others on the team.

Multi-tongue

Water attempts to solve this problem by providing what appears to be a fairly advanced programming language that has been designed to offer the best of several existing paradigms. We're not able to describe its workings or contours in too much detail, but suffice it to say that Water combines ideas from a powerful functional language called ML, from Smalltalk (the dynamic object-oriented language developed at Xerox), and from other languages such as LISP, Scheme, Self, and even HTML. One of the company's founders helped construct the widely-regarded Dylan language at Apple Computer.

While that may sound like a hodgepodge that only a computer scientist could appreciate, Clear Methods tells us it can get newbies up and running on Water (so to speak) with only a day or two of training. That's largely because the language's syntax is based on HTML, with which a great many people are already familiar. Yet, Water and its runtime environment, called SteamXML Engine, are particularly useful in modeling complex systems and in rapid prototyping. In brief, Water combines advanced object-oriented facilities with native XML data structures and HTML-like presentation.

Pain reliever

Clear Methods sees major opportunity in selling its software to corporate IT departments that have run into difficulties when using traditional software tools and languages to build critical Web services. The Water language, taking its name from that common liquid's ubiquity and fluidity, may appeal to corporations wishing to set up rich e-commerce links to their suppliers and customers but wanting to avoid the rigidity of using XML with traditional programming languages.

Be sharp

Having a soft spot for advanced programming languages, we wish Clear Methods the best.
We're concerned, though, that the firm, like any startup with "different" software tools, will have a tough time convincing Corporate America to adopt its wares. A few good showcase applications could do the trick, though, as may the firm's use of an open-source model and strict use of certain well-regarded standards. One potential threat looming on the horizon appears to be a new language that Microsoft researchers are reportedly developing, called X# or, most recently, Xen. We've seen it described as a new version of the firm's C# object-based language with built-in XML and SQL database facilities.

Looking forward, we doubt very much that any competing technology will knock XML out of the picture, no matter how ill-conceived or inefficient it may turn out to be. And that, we can only guess, will create even more opportunities for startups. In computing, after all, inefficiency and complexity often serve as the best mothers of invention an entrepreneur could ask for.

Private Profile: UPEK

STMicro Spinout Targets Growing Fingerprint Sensor Market

Though countries like Malaysia have been using a "smart" national ID card for the past few years, this year is the first that the U.S. will establish a similar system. In just a few months the U.S. government will begin to issue smart ID cards that can identify card carriers by their fingerprints. The fingerprint sensors and software that both governments rely on come from UPEK, a new spinout from semiconductor giant STMicroelectronics that has just closed $20 million in venture capital.
The company is backed by a mix of investors from the U.S., Europe and Asia. San Francisco-based Sofinnova Ventures led the round, but it also included Sofinnova Partners of France, Diamondhead Ventures of Menlo Park, Earlybird Venture Capital of Germany, and EDBV Management and Green Dot Capital of Singapore.

Though UPEK has just raised a Series A round, the business has been a unit of STMicroelectronics since 1999 and a research effort since 1997. STMicro spun out the business at this time because although the group already had customers, biometrics remains an early stage market and, more importantly, is not part of STMicro's core business strategy. The semiconductor company will now control roughly a third of the equity in UPEK and will supply silicon and components to the spinout.

Berkeley, Calif.-based UPEK sells fingerprint sensor systems to the U.S. and Malaysian governments for several hundred dollars each, but the firm will be focusing primarily on notebooks and the nascent cell phone market this year. In all of these markets UPEK's sensor is designed to replace passwords and better protect consumer data such as address books and other electronic files.

Alan Kramer, CEO of UPEK, said the company will ship about a million fingerprint sensors this year, including 100,000 in notebooks, which currently bring in about half of the firm's revenue. Micron PC has incorporated the sensors into its notebooks and IBM will launch a notebook with UPEK's technology later this year. But these volumes could pale in comparison to those of a sensor incorporated into cell phones. "The real wild card is how quickly and steadily the cell phones come online," says Mr. Kramer, who has led the STMicro division since its inception.

UPEK will introduce its first fingerprint sensor for cell phones in Japan this year. But another startup, Authentec, originally a spinout from Harris Semiconductor, already has a sizable presence in that market.
Of the 1.5 million sensors that Authentec shipped last year, sensors for cell phones accounted for just over half of the company's revenue. This year Authentec aims to ship more than 3 million fingerprint sensors and expects cell phones to be a big component again, according to Authentec CEO Scott Moody. The company is planning for a big revenue ramp next year, with breakeven at the end of this year.

Based in Melbourne, Fla., Authentec has been selling its sensors to an NTT DoCoMo supplier and has another four design wins for cell phones in Japan. Mr. Moody believes the sensors in cell phones may help bring about the growth of mobile commerce. "[You can] turn your cell phone into a debit card [or] a credit card," Mr. Moody says. For the price of $6 per sensor for a cell phone - and dropping - Authentec expects to see more customers not only in Asia but also in Europe and the U.S., perhaps as early as next year.

In addition, Authentec is targeting access control for homes, buildings and automobiles, though consumers won't see fingerprint sensors for their cars until at least 2006. The firm is already working with Delphi Automotive, a large automotive supplier. Samsung, American Power Conversion, Fellows, Targus, SCM Microsystems, IBM, and Precise Biometrics are also customers.

Though UPEK and Authentec do target some different markets, they have both decided that getting fingerprint sensors into cell phones is a top priority. Because of the high volumes of cell phones, the cellular market is crucial for UPEK, Authentec, Infineon, Atmel, and any other firm with high ambitions for the biometrics market. Though Authentec is the first company to tap the cell phone market, it may simply have opened up doors for its competitors. "Given [UPEK's] parentage, they're probably more likely to be able to get those kinds of deals than a pure start-up because they've got the connections and relationships already," says our friend Marlene Bourne, an analyst with In-Stat/MDR.
Though UPEK won't make predictions about its cell phone business, it does have a backup strategy that its investors like. "The company's products are very well diversified. UPEK is the only company with products and technology and customers serving all these markets," says Eric Buatois, managing director of Sofinnova Ventures and a member of UPEK's board.

UPEK's Mr. Kramer points out that his company sells not just the sensor for notebooks but also the software for the sensor, meaning high profit margins without the need for high volumes. The same is true for UPEK's products for governments; the high price for those systems means UPEK doesn't need high volumes. Mr. Kramer says UPEK aims to break even by the end of the year, though mid-2005 is more likely. If the cell phone sensor market takes off, those milestones will be a lot easier to hit.
120 Wooster Street • New York, NY 10012 212-343-1900 • Fax: 212-343-1915 computerletter.com