Bill Gates displays a paranoid tendency common among technology industry
billionaires. “In this business, by the time you realise you’re in
trouble, it’s too late to save yourself,” he once said. “Unless you’re
running scared all the time, you’re gone.”
Those words came in an
interview with Playboy magazine in 1994 – 10 years before Sergey Brin
and Larry Page, two new rock stars of the tech world, sat down for
their own heart-to-heart with the same magazine.
Tech fashions – and fortunes – shift with great speed. The Microsoft Mr Gates founded might not yet be on the scrapheap of history but, as its unsolicited takeover offer for Yahoo makes clear, even seemingly dominant companies find it hard to keep pace in the latest and most promising tech markets.
A
decade ago, who could have imagined that the feared monopolist of the
software business would be so roundly beaten in online search and
advertising by Google
that it would have to mount a hostile bid for another distant also-ran
to try to catch up? A decade from now, as the editors at Playboy stroke
the egos of some new Silicon Valley hotshot, will the Google founders’
playful interview (to which Mr Brin, hot off the company volleyball
court, went shoeless) be just a quaint memory?
Predicting where
the next big disruptive change in the technology industry will come
from is a perilous business. Google’s rise has been as much a result of
its business model innovation as its technological supremacy. By using
advertising to support its internet services, it may eventually be able
to pull the rug from under Microsoft in more traditional software
markets.
It seems a fair bet, though, that some of the biggest fortunes will continue to be made in Google’s area of focus:
finding and manipulating information gathered from the world wide web.
To hear the optimists in Silicon Valley describe it, a new wave of
technology is on the way that will leave Google’s early advances in its
wake.
Imagine, for instance, being able to ask a computer, “Where
should I go on holiday?” and receiving an answer that is as suitable as
anything you could have come up with yourself. That level of
computer-generated reasoning is on the horizon, says Nova Spivack, one
of the entrepreneurs involved. It may still take 15 years or more to be
fully realised, but between now and then lies a series of breakthroughs
that will revolutionise the way we draw information from the web, he
adds.
This technology draws its inspiration, and some of its
techniques, from a field that has provided more than its fair share of
disappointments over the years: artificial intelligence (AI). Based on
a collection of technologies that includes natural language processing,
image recognition and expert systems (programs that try to emulate the
skills of experts), AI is a 50-year-old dream that was meant to lead to
intelligent machines.
“I had some hope you could just put
everything into some big neural network that would just start to think
– but it doesn’t take long working in AI to realise it’s much more
complex than that,” says Danny Hillis, founder of Thinking Machines, a
company whose rise and fall in the 1980s came to symbolise both the
unbounded optimism and the failed hopes of the AI movement.
“I’ve
shifted over time from trying to make machines smarter to trying to
get machines to make people smarter,” Mr Hillis says now. That more
modest goal lies at the heart of the latest movement, with its
pragmatic emphasis on melding approaches from AI with new core
technologies that are changing the web.
As Google shows, being able to return a string of websites in
response to a query can give rise to a multi-billion dollar business.
With so much at stake, even small incremental improvements on the road
to AI may create big business opportunities. “It isn’t about being
perfect,” says Barney Pell, chief executive of Powerset, an ambitious
new search company. “It’s about being able to differentiate enough to
make a commercial product. People are realising that the goals of AI
may be way out, but in the field of AI the time is here for really
exciting applications.”
“There are vast areas of human activity
that are slowly being chipped away at,” agrees Mike Lynch, who heads
Autonomy, another search technology company. “Even automating a tiny
part of the problem can have a high economic impact.”
The
movement already has a name: web 3.0. Venture capital is drifting in,
even though no one seems too sure exactly how to define the field and
there are still sharp disagreements among the experts about the
effectiveness of some of the technologies. “When we started, it was
largely a science project,” says Mr Spivack, who has raised $20m (£10m,
€15m), a sign of the sudden interest of the financiers. Referring to
recent developments in online social networking, he adds: “These are
not little Facebook applications – these are significant technology
investments.”
The basic
building block for this new technology movement is something known as
the “semantic web”. This has become one of the most controversial, and
misused, terms in the internet industry, conjuring up as it does a
vague promise that meaning will somehow become part of the medium.
Yet
to suggest that computers will be able to determine meaning raises a
thorny question: whether meaning itself has an independent existence or
is something that arises only in the mind of the person perceiving it.
Terms such as “meaning” and “understanding” are so closely linked to
human intelligence that it is hard to conceive of their corollaries in
a computer-mediated world.
In reality, the semantic web is based
on a defined and narrow – even if still highly ambitious – set of
goals. It is the brainchild of Sir Tim Berners-Lee, who invented the
present web, a collection of documents connected by links using
hypertext mark-up language. Tracing those links, companies such as
Google are able to identify documents that are likely to be most
relevant to a particular search – though they can only point to the
document, not dig deeper to find the actual information that is being
sought.
To overcome this, Sir Tim imagined a new web formed by
linking the data contained inside the documents. That way the data, not
just the documents, would become accessible to machines. Riding this
network of links, computers would be able to follow related ideas from
one website to another and draw together related information. A
reference to Sir Tim in Wikipedia, the collaborative online
encyclopaedia, could for instance be connected directly to his name in
this article on FT.com and to his personal social network on Facebook.
“If
you put data on the web about yourself in this form, I can pull data
about you,” he says. Subject to privacy and other restrictions, the web
itself would in effect become one vast social network, tracing links
between people, or between people and things, that were previously
invisible.
This semantic web is the product of a set of core
standards promoted by the World Wide Web Consortium, the organisation
that Sir Tim leads. “It’s happening – it has just taken a long time to
build,” he says. “HTML is a really simple language. All this data stuff
is more complicated. It just takes more design work.”
Now, nearly
seven years after he outlined the idea, some supporters say enough
pieces are in place to make the first semantic web services a reality.
“A bunch of people have started making applications that share data
across the web,” says Thinking Machines’ Mr Hillis. Linking information
in this way is a first step. The next will be to write software that
can find and manipulate the data, opening the way to that automated
advice on holiday destinations.
Standing in the way of this grand
vision, however, are some very big obstacles. This is not just a matter
of technology: at a deeper level, it touches on philosophical questions
about the nature of language and meaning.
At the heart of the
problem is the need to make information on the web “understandable” to
machines, so that it can be extracted, processed and made useful. To
make this possible, machine-readable “tags” need to be attached to each
piece of data to describe what type of information it represents – a
person’s name, for instance, or a day of the week. A computer that
reads the tag knows to treat the first item as a name and can then
match it against the same name found in other sources.
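As a rough illustration of that matching step, the sketch below uses Python and entirely hypothetical data: the tag names ("person_name", "weekday", "city") and the two sources are invented for the example and do not follow any actual semantic web standard.

```python
from collections import defaultdict

# Hypothetical tagged items from two sources; tag names are illustrative only.
source_a = [
    {"type": "person_name", "value": "Tim Berners-Lee"},
    {"type": "weekday", "value": "Friday"},
]
source_b = [
    {"type": "person_name", "value": "Tim Berners-Lee"},
    {"type": "city", "value": "London"},
]


def match_by_tag(tag_type, *sources):
    """Return values tagged `tag_type` that appear in more than one source."""
    seen = defaultdict(set)
    for index, source in enumerate(sources):
        for item in source:
            if item["type"] == tag_type:
                # Record which sources mention this value under this tag.
                seen[item["value"]].add(index)
    return sorted(value for value, where in seen.items() if len(where) > 1)


print(match_by_tag("person_name", source_a, source_b))
```

Because both sources label the same string as a person's name, the program can pair them without "understanding" anything about the person; an untagged or differently tagged mention would simply go unmatched.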
Attaching
these tags to every piece of information on the web is in itself a huge
task. “Tagging is a complete non-starter: no one has the time to do
it,” says Mr Lynch of Autonomy. At Powerset, Mr Pell calls this a
“chicken and egg problem”. Without new semantic services capable of
using it, there is no incentive to undertake the laborious work of
tagging data, but creating the services is pointless unless the data
exist in the first place. To overcome this, computers are being
enlisted to “read” text and apply tags automatically.
Yet the
process of tagging, or categorising, the world’s information may be
beyond the capabilities of even the human brain. “Information is
relative; it’s not objective,” says Mr Lynch. “The possibility that the
person tagging and the person reading it mean the same thing is very
small.” Context and subjective judgment play too big a role in how
language is used, he adds.
To try to overcome the problem, the
semantic web depends on a set of “ontologies”, or dictionaries that
help to create common definitions that can be universally applied.
These may oversimplify the great complexities of meaning, but they are
designed to establish a basic common level of understanding about
language to allow machines to do their work. The word “city”, for
instance, conjures up different ideas in the heads of city planners,
local politicians or sewerage experts, says Mr Hillis. But for most
purposes, a lowest common denominator definition will do: for a city,
they “all agree more or less on what it is”.
To create those common ways of looking at the world, however,
means crossing some deep political, philosophical and cultural divides.
In areas such as religion, for instance, the meaning of words is
closely tied to a broader world view. “Who’s going to set all the
rules?” asks Robert Cailliau, one of the developers of the world wide
web. “You can say two plus two equals four. But there are things like
the Bible and the Koran that also set out the rules about how you
should see the world.”
Some of the early web 3.0 companies are
setting out to stamp their mark on this process, sensing the chance to
put themselves at the centre of a new global information network by
defining the standards that bring meaning to the cacophony.
“We’re
trying to create a useful point of view,” says Mr Hillis, whose latest
company is seeking to build what it calls an “open, shared database of
the world’s knowledge”. Investors including Goldman Sachs have put more
than $50m into the company. Known as Freebase, it has a database
designed to operate similarly to Wikipedia. It tries to outline
standard definitions that are then made available for anyone to access
and link their own data to over the web.
A reference to London in
a web document, for instance, might be linked back to the Freebase
definition of London: this could then be connected to any other
instances of the word London on the web that are connected to the
Freebase definition. Freebase hopes that outlining this lowest common
denominator of meaning to help link data could make it part of the web
3.0 foundations.
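The linking idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the identifier "topic/london", the document names and the data layout are hypothetical and are not drawn from Freebase itself.

```python
# Hypothetical shared definitions; the identifier scheme is invented.
shared_definitions = {
    "topic/london": "London, capital of the United Kingdom",
}

# Each mention states which shared definition, if any, it refers to.
mentions = [
    {"document": "travel-guide", "text": "London", "refers_to": "topic/london"},
    {"document": "news-story", "text": "London", "refers_to": "topic/london"},
    {"document": "novel-review", "text": "London", "refers_to": None},  # unlinked
]


def documents_linked_to(definition_id, mentions):
    """List documents whose mention points at the given shared definition."""
    return [m["document"] for m in mentions if m["refers_to"] == definition_id]


print(documents_linked_to("topic/london", mentions))
```

Only the two documents that point at the shared definition are connected to each other; the third uses the same word but stays outside the network until someone, or some software, adds the link.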
Meanwhile, technologies first developed for use
in AI are being brought to bear. Chief among these is natural language
processing, or teaching software to discern the meaning in a piece of
text. Views about this technology differ sharply. Mr Lynch, for
instance, declares it a “dead duck: the world is just too complex”. The
fundamental ambiguity of language, and its dependence on context for
meaning, make it impossible to automate the process of extracting
meaning from text, he says.
Even simple words or concepts can
mean very different things to different people and their meaning
changes depending on the circumstances in which they are used, says Mr
Lynch. While the human mind can make the necessary adjustments,
computers that follow strict rules about language find it hard to grasp
the many context-specific meanings.
Although the companies trying
to employ natural language processing admit it is far from perfect,
they maintain that technical advances in recent years have at least
given it a level of practical application. By using software to “read”
text, services such as Powerset and Mr Spivack’s Twine aim to add tags
to data automatically. The natural language approach also raises the
possibility of new applications, for example being able directly to
answer questions posed by a user – which has long been a dream in web
search.
Powerset has become the most visible champion of this approach.
The plunging cost of computing and the wealth of data available on the
web have combined to breathe new life into this technology, according
to Mr Pell. “One of the big problems was just a lack of computing
resources,” he says of earlier attempts. Also, refining a natural
language search engine requires “a tremendous amount of ‘tuning’; you
need data to improve these systems”. Thanks to the explosion of
information on the web, data are not in short supply.
Powerset is
using technology licensed from Parc – the famed Silicon Valley research
laboratories formerly owned by Xerox – to try to solve the problems of
natural language processing. The software is based on similar ideas to
those in quantum physics, says Mr Pell. A number of potential meanings
for all the elements in the text are allowed to co-exist as equally
accurate during the “reading”, until the most likely answer is singled
out at the end.
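The general idea of holding several readings open and choosing only at the end can be shown with a toy Python sketch. It is not Powerset's or Parc's method: the words, candidate senses, scores and compatibility bonus below are all made up for the example.

```python
from itertools import product

# Hypothetical candidate senses for each word, with made-up prior scores.
candidates = {
    "bass": {"fish": 0.5, "instrument": 0.5},
    "played": {"performed_music": 0.7, "took_part_in_game": 0.3},
}

# Hypothetical bonus applied when two senses fit well together.
compatibility = {
    ("instrument", "performed_music"): 2.0,
}


def best_reading(words):
    """Score every combination of senses and pick the best one at the end."""
    best, best_score = None, float("-inf")
    for senses in product(*(candidates[word] for word in words)):
        score = 1.0
        for word, sense in zip(words, senses):
            score *= candidates[word][sense]
        # Reward pairs of senses that are known to go together.
        for i in range(len(senses)):
            for j in range(i + 1, len(senses)):
                score *= compatibility.get((senses[i], senses[j]), 1.0)
        if score > best_score:
            best, best_score = dict(zip(words, senses)), score
    return best


print(best_reading(["bass", "played"]))
```

Taken alone, "bass" is equally likely to be a fish or an instrument; only when every combination has been scored together does the musical reading come out on top. A real system would need far more efficient search than this brute-force enumeration.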
Even supporters of this type of natural language
analysis limit their claims for the technology, though they say it does
not need to be perfect to be useful. According to Mr Spivack, an
accuracy level of 70 per cent in analysing and tagging text has its
uses.
Combining this approach with other techniques of data
analysis can lift the accuracy level further. One method relies on
statistics – predicting the meaning of a word based on the
probabilities of its proximity to other words in the text. “It treats
language as a mathematical problem,” says Mr Lynch, whose company uses
this method in preference to natural language. As words do not appear
in random sequences, the fact that one word has been used in a sentence
increases the chance that a particular other word will also turn up.
“Meaning depends on your viewpoint – it’s not absolute,” he says.
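A minimal sketch of the statistical approach, in Python, might estimate how often one word turns up given that another is present. The four sentences are invented, and counting co-occurrence per sentence is a simplifying assumption, not a description of Autonomy's software.

```python
# Tiny invented corpus; a real system would use vastly more text.
sentences = [
    "the bank raised interest rates",
    "the bank approved the loan",
    "we walked along the river bank",
    "the river flooded the bank",
]


def conditional_probability(word, given, sentences):
    """Estimate P(`word` appears in a sentence | `given` appears in it)."""
    containing = [s.split() for s in sentences if given in s.split()]
    if not containing:
        return 0.0  # `given` never occurs, so there is no evidence either way
    return sum(word in words for words in containing) / len(containing)


print(conditional_probability("river", "bank", sentences))
print(conditional_probability("bank", "river", sentences))
```

In this corpus half the sentences containing "bank" also contain "river", while every sentence containing "river" also contains "bank". Numbers of this kind, gathered at scale, are what let software guess which sense of an ambiguous word is in play without any rules of grammar.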
While
none of the semantic techniques has been perfected, some are reaching a
level of sophistication that could lead to practical applications, at
least in the eyes of the investors who are backing the start-ups.
“That’s the difference now – people are building artefacts that are
actually useful,” says Mr Hillis.
So what will these artefacts
produce? Most expect the impact of the technology to be felt in stages.
The early advances are likely to be “incremental improvements, and at
first they won’t be that noticeable”, says Mr Spivack. For instance, a
wide range of web services should start to become “smarter”: search
engines should return higher quality results, and services that rely on
personalisation should make better guesses about your preferences,
while targeted advertising systems should become more accurate.
The
existing big names on the web, including Google, should benefit from
these improvements – though entrepreneurs who are pushing the
boundaries of semantic web technology, like Mr Spivack, hope that they
can come up with advances that are distinctive enough to set them apart
from older sites that have not mastered the approaches.
Connecting
related data across the web may also usher in new types of service. A
common example used by the web 3.0 visionaries again involves planning
a holiday: a semantic web browser would be able to find and draw
together travel schedules, hotel details, weather forecasts and other
information needed to plan a trip.
Further in the future, adding
a degree of reasoning to the software may enable it to filter and
select information. That may start off simply – acting on your behalf,
for instance, a software agent sets out across the web to compare
prices for a product and identify the lowest. Eventually it may lead to
making decisions on your behalf. As Eric Schmidt, Google’s chief
executive, told the FT last year: “The goal is to enable Google users
to be able to ask the question, such as ‘What shall I do tomorrow?’ and
‘Which job shall I take?’”
This fuller version of artificial
intelligence is still over the horizon but the path towards it is “a
continuum”, says Mr Hillis. Contrary to the early dreams of AI, he
adds, it will not be intelligent machines that provide many of the
advances but dumb machines throwing up apparently smart answers by
using tricks that the human brain cannot match.
The current kings
of Silicon Valley certainly have no intention of being left behind. As
Mr Brin said in that 2004 Playboy interview: “It’s credible to imagine
a leap as great as that from hunting through library stacks to a Google
session, when we leap from today’s search engines to having the
entirety of the world’s information as just one of our thoughts.” But
in the race to get to that point, Google is assured of many rivals.