It's 14 years since the world's first ever website, info.cern.ch, was posted online by Sir Tim Berners-Lee. What had started in 1980 as a "play project" through which Sir Tim and fellow researchers at Switzerland's CERN laboratory could share and collaborate on research projects resulted in the creation of the World Wide Web.
But despite the Web's success since, Sir Tim believes software developers are still only scratching the surface of its potential. That's because Hypertext Markup Language (HTML), the computer language of the Web, was designed by humans, to be read by humans.
To a computer, most of the information posted online is meaningless. A search for "Jaguar", for example, will bring up information on the car, the animal, and millions of pages related to neither. Great if you have the time to sift through the results, but where there are billions of different data to analyse, it's far less laborious to let a computer programme do the work. For that, phase two of Sir Tim's vision, the Semantic Web, must be built: a Web designed for machines.
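To make the "Jaguar" problem concrete, here is a minimal sketch in Python of what machine-readable description might look like. It assumes facts are stored as simple subject, predicate, object statements, which is the general shape RDF uses; the identifiers (`ex:jaguar_cars`, `ex:panthera_onca`) and the `find` helper are hypothetical, invented for illustration, and are not part of any real vocabulary or library.

```python
# Each fact is a (subject, predicate, object) statement.
# All identifiers here are hypothetical, not real URIs or a real RDF vocabulary.
TRIPLES = [
    ("ex:jaguar_cars", "label", "Jaguar"),
    ("ex:jaguar_cars", "type", "ex:CarMaker"),
    ("ex:panthera_onca", "label", "Jaguar"),
    ("ex:panthera_onca", "type", "ex:Animal"),
]


def find(label, wanted_type):
    """Return subjects that carry the label and are declared to be of wanted_type."""
    labelled = {s for s, p, o in TRIPLES if p == "label" and o == label}
    typed = {s for s, p, o in TRIPLES if p == "type" and o == wanted_type}
    return sorted(labelled & typed)


if __name__ == "__main__":
    print(find("Jaguar", "ex:Animal"))
    print(find("Jaguar", "ex:CarMaker"))
```

Because each "Jaguar" carries an explicit type, a programme can ask for the animal or the car maker directly instead of leaving a person to sift through both.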
"What the Web's been good at is presenting information to people, but where it falls down is the interpretation of that information," says Dr John Davies, manager of Next Generation Web Research at BT. "People still have to visit the Web page and do a lot of the interpretation themselves. What the Semantic Web is all about is describing that information so that it is more amenable to machine processing."
"The reality is that things like the Semantic Web don’t get out into the world unless people can see what it offers"As Sir Tim told the W3C conference earlier this year: "The day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machine talking to machine, leaving humans to provide the inspiration and intuition." So, to take a commercial example: a company, such as Amazon, might develop a programme that could purchase your favourite artist's CD for you, on the day of its release, having already checked your bank balance to see if you can afford it. It could then check your personal calendar and automatically book concert tickets for the same band as soon as they are available.
Advocates such as Dan Zambonini, technical director of internet consultancy Box UK, are positively evangelical. "The Semantic Web could save lives," he says, "because it could enable the identification of otherwise undetected patterns in large-scale, distributed data sets. It could help find medical cures and aid other problems in life sciences. It could help detect and contain viruses and outbreaks. And it might help analyse geological or meteorological data and limit the destruction of natural disasters."
But let's not get ahead of ourselves. W3C, the organisation Sir Tim has headed up since 1994, has been pushing for adoption of a standardised Semantic Web language for the past 12 years. In the same way that keeping his Web invention patent-free allowed a rush of private companies to commercialise and improve his creation, Sir Tim is hoping the adoption of Semantic open language standards, such as RDF and XML (see box P68), will enable companies to semantically "tag" their online data, opening up the Web to a whole host of new opportunities. But progress has been painfully slow.
According to Nigel Shadbolt, professor of artificial intelligence at Southampton University's School of Electronics and Computer Science, the Semantic Web is spreading slowly but steadily, particularly in cases where the analysis of large databases requires automation. "The standards have taken a while to arrive," he says, "but incubator communities for the Semantic Web (people in areas like life sciences, defence and engineering) are starting to appear."
John Davies agrees that "information intensive" sectors, such as health or pharmaceuticals, are the most likely early adopters of Semantic technology. But he doesn't expect take-up to be restricted to data-heavy applications. In addition to his role at BT, he is also director of the Semantically Enabled Knowledge Technologies (SEKT) project, an academic-industry partnership set up to help put the European IT industry at the centre of Semantic and knowledge-management technology.
One of SEKT's goals is to develop working commercial applications of Semantic technology. These include: efforts to enhance the information contained in BT's digital library, linking its two main databases to the wider Web and also making it easier to search; and an attempt to help Siemens Business Services consultants share information across the global workforce.
SEKT's work shows the flexibility of Semantic Web problem solving. Davies says he is working on problems of a similar kind in his role at BT, particularly using medical vocabulary. For example, if a hospital nurse inputs "almond allergy" into an electronic patient record, "what the system needs to know is that an allergy is a type of disease and that an almond is a kind of nut, because in the system it only knows about nut allergies; it has no record of almond allergies," he says. "So [Semantic Web technology] can infer that because the nurse has entered 'almond allergy' that actually means it's a nut allergy. It's a simple reasoning example, but also a potentially life-saving one."
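The reasoning Davies describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SEKT or BT implementation: the `KIND_OF` table, the `KNOWN_ALLERGIES` set and both function names are hypothetical, and a real system would more likely express the "kind of" links in a Semantic Web language and hand them to a reasoner.

```python
# Hypothetical "is a kind of" statements, echoing the example in the text.
KIND_OF = {
    "almond": "nut",
    "nut allergy": "allergy",
    "allergy": "disease",
}

# Allergy records the (hypothetical) patient system already knows about.
KNOWN_ALLERGIES = {"nut allergy"}

SUFFIX = " allergy"


def ancestors(term):
    """Follow the 'kind of' links upward, yielding each broader term in turn."""
    while term in KIND_OF:
        term = KIND_OF[term]
        yield term


def resolve_allergy(entry):
    """Map an entry such as 'almond allergy' onto an allergy the system knows."""
    if entry in KNOWN_ALLERGIES:
        return entry
    if not entry.endswith(SUFFIX):
        return None
    substance = entry[: -len(SUFFIX)]
    # Try each broader category of the substance: almond -> nut -> ...
    for broader in ancestors(substance):
        candidate = broader + SUFFIX
        if candidate in KNOWN_ALLERGIES:
            return candidate
    return None


if __name__ == "__main__":
    print(resolve_allergy("almond allergy"))
    print("disease" in ancestors("nut allergy"))
```

The nurse's entry never appears in the system's records, yet the two stated facts (an almond is a kind of nut, an allergy is a type of disease) are enough for the programme to file it under the nut allergy it does know.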
Semantic technologies are developing mostly in the business-to-business space, but there are signs that consumer applications aren't too far behind. The first to make an impression may well be Garlik, a personal data firm led by Tom Ilube, former CTO at Egg and founder of software company Lost Wax, and Mike Harris, founder of Egg and First Direct. Southampton University's Nigel Shadbolt is the company's CTO.
Ilube is hoping Garlik will help internet users regain control over their digital identities and advise them on the action they need to take to get something corrected. "There's a lot of information about you scattered around the Internet," he says. "On the whole, people are concerned for their own digital imprint, but often for their children's too, especially when they are young children." Semantic Web technology is at the heart of Garlik's model. But because Ilube and his team are slightly ahead of the curve, they've had to create the technology architecture themselves, from the ground up.
It's likely that this technology's tipping point will be supplied by the private sector. Once companies begin to see the commercial benefits of using Semantic technology for their own applications, more programmers will begin to use Semantic languages as standard. "The reality," says Ilube, "is that things like the Semantic Web just don't get out into the world unless people can see what it offers, or unless there is a commercial reason to do it. What you'll get is islands starting to join up with each other. That's the way the Semantic Web will emerge."
One of those larger "islands" is Oracle. The software company is the first to offer an RDF-compliant database. Andy Cleverly, director of technology, EMEA, at Oracle, says there are plenty of companies now starting to use the technology to integrate large sets of analogous enterprise data. "This is one area where it can make a difference to a company's ability to integrate different types of information," he says, "especially companies that are going through change, for example through mergers and acquisitions. Say a manufacturing company wants to buy another manufacturing company. Their parts catalogues may call the same product different things. How do you know these parts are exactly the same?"
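One way to picture an answer to Cleverly's question is to record explicit "these two identifiers mean the same part" statements and let a programme follow the chain. The Python sketch below assumes exactly that and nothing more; it is not Oracle's product or API, and the part identifiers and function names are hypothetical. The idea is in the spirit of the `owl:sameAs` property from the W3C's OWL language, here approximated with a small union-find structure.

```python
# Hypothetical part identifiers from merged companies' catalogues.
# Each pair asserts that the two names refer to the same physical part.
SAME_AS = [
    ("acme:bolt-M8", "globex:fastener-0042"),
    ("globex:fastener-0042", "initech:P-778"),
]


def build_root_finder(pairs):
    """Merge every asserted pair into one group; return a function that
    maps an identifier to the representative of its group."""
    parent = {}

    def root(x):
        parent.setdefault(x, x)  # unseen identifiers form their own group
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the chain as we walk it
            x = parent[x]
        return x

    for a, b in pairs:
        parent[root(a)] = root(b)
    return root


def same_part(a, b, pairs=SAME_AS):
    """True if a and b are linked by any chain of 'same as' statements."""
    root = build_root_finder(pairs)
    return root(a) == root(b)


if __name__ == "__main__":
    print(same_part("acme:bolt-M8", "initech:P-778"))
    print(same_part("acme:bolt-M8", "acme:nut-M8"))
```

Nobody ever stated that the first and third identifiers match, but because each was linked to the second, the programme can treat all three as one part.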
Other technology companies are working on tools that their customers can use to build applications that will eventually push the Semantic Web into the wider domain. "Companies like Adobe, with its Extensible Metadata Platform, are starting to emerge with offerings that will help organisations extend their existing sites to the Semantic Web," says Richard Edwards, senior research analyst at Butler Group. Researchers at HP Labs, he adds, have been developing Jena, a Java-based open source Semantic Web toolkit.
But will that be enough, asks Zambonini. "We need a Microsoft to move in," he says. "The idea is out there that we need this thing to happen but nobody has any direction. If somebody like Google supported it then everybody would rush to put their data in RDF format."
But Zambonini can't see that happening any time soon. "Google has already said it's not interested in dealing with data that's meant for computers, it's more interested in dealing with data designed for humans: in other words, HTML rather than RDF."
Ilube believes the private sector's focus is company development rather than technology adoption. "Garlik's mission is not to create the Semantic Web," he says. "We like the tool and the philosophy, but it's people like us, using it for their own purposes, that will cause others to start using it as well."


