Saturday, February 19, 2011

IBM Watson's brain is similar to a human's

The IBM supercomputer Watson, which shellacked its top human rivals on Jeopardy during this week's exhibition match, is powered by a cluster of 90 servers and network-attached storage (NAS) holding 21.6TB of data.

In the end, though, its brain has only 80% of the processing power of a human brain.

Tony Pearson, a master inventor and senior consultant at IBM, explained that after configuring its backend storage as RAID and culling the data to be loaded into the memory of the cluster's servers, Watson uses only about 1TB to process real-time answers to Jeopardy questions.

Pearson cited an estimate by technology futurist and author Ray Kurzweil that the human brain can hold approximately 1.25TB of data and performs at approximately 100 teraflops. By comparison, Watson is an 80-teraflop system with 1TB of memory.

"So is 80% human," said Pearson. "Yes, we could have dealt with many other information. We could have put more memory on each server, but once we have the answers to three seconds, we don't need to go beyond ".

Pearson explained that reaching the three-second response threshold was simply a matter of math.

The original algorithm, running single-threaded on one processor core, took two hours to scan memory and produce an answer. So IBM's technologists simply divided those two hours across 2,880 CPU cores, which yielded the ability to answer questions in three seconds.
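As a rough sanity check on that math (a sketch that assumes an ideal, evenly divided workload and uses only the figures quoted in this article), the arithmetic works out like this:

```python
# Back-of-the-envelope check of the parallelization math described above.
# Assumes a perfectly parallel workload and the article's figures:
# a two-hour single-core scan spread across 2,880 cores.
single_core_seconds = 2 * 60 * 60         # two hours on one core
servers = 90
cores_per_server = 32                     # four 8-core processors per server
total_cores = servers * cores_per_server  # 2,880 cores in the cluster

ideal_response_seconds = single_core_seconds / total_cores
print(total_cores)                        # 2880
print(round(ideal_response_seconds, 1))   # 2.5 -- comfortably inside the 3-second target
```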

If IBM's Watson had been just another human Jeopardy contestant, viewers probably would have tuned out after such an overwhelming victory. However, interest in the man-versus-machine battle gave the show its highest ratings in nearly six years.

Competitions between humans and computers have long captured the public's imagination. Remember the 1996 chess match between IBM's Deep Blue computer and world champion Garry Kasparov?

In this case, however, Watson has more in common with humans than Deep Blue does. Like us, it uses only a small percentage of its vast storage capacity to answer questions.

Behind the simple avatar on the monitor that represented Watson as a Jeopardy contestant are 90 IBM Power 750 Express servers powered by 8-core processors, four in each machine for a total of 32 cores per machine. The servers are virtualized using a KVM (Kernel-based Virtual Machine) implementation, creating a server cluster with a total processing capacity of 80 teraflops. A teraflop is one trillion operations per second.

On top of the processing power, each server carries 160GB of DRAM, giving the complete machine nearly 15TB of memory.
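Tallying those per-server numbers (just simple arithmetic on the specs quoted above) shows where the cluster-wide totals come from:

```python
# Cluster-wide totals derived from the per-server specs in the article.
servers = 90
cores_per_server = 4 * 8          # four 8-core processors per Power 750
dram_per_server_gb = 160

print(servers * cores_per_server)           # 2880 cores in total
print(servers * dram_per_server_gb / 1000)  # 14.4 TB of DRAM, i.e. "nearly 15TB"
```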

On the computer's backend is IBM's SONAS (Scale Out Network Attached Storage), which is built on the General Parallel File System (GPFS). SONAS is a Linux-based, scale-out, clustered NAS file system that IBM released almost exactly a year ago.

The clustered storage model provides massive throughput thanks to the larger port count that comes from pooling many storage servers together: the drives and processors all work on the same task, and they can all share the same data through a single global namespace. In other words, all of the disk drives appear as one large pool of storage capacity from which Watson can draw.
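As a purely conceptual sketch of what a single global namespace means (this is not how GPFS or SONAS is implemented, just an illustration of the idea), you can picture one lookup table that hides which physical node holds each file:

```python
# Toy illustration of a global namespace over clustered storage.
# Purely conceptual: real GPFS/SONAS stripes data and metadata across
# nodes with far more sophistication than a single dictionary.

class ClusteredNamespace:
    def __init__(self):
        # path -> (storage node, data); every client sees the same view
        self._files = {}

    def write(self, path: str, node: str, data: bytes) -> None:
        self._files[path] = (node, data)

    def read(self, path: str) -> bytes:
        # Callers never need to know which node actually holds the file.
        node, data = self._files[path]
        return data

pool = ClusteredNamespace()
pool.write("/corpus/part-001", node="storage-node-17", data=b"...")
print(pool.read("/corpus/part-001"))   # same result from any client
```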

Watson's SONAS is populated with 48 450GB Serial ATA (SATA) hard drives for a total of 21.6TB of capacity in a mirrored RAID 1 configuration; that leaves 10.8TB of raw data that Watson uses each time it boots up. Three terabytes of that, however, are used for the operating system and applications.
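The disk arithmetic behind those figures (a sketch using only the drive count, drive size and RAID level quoted above) is straightforward:

```python
# Capacity math for Watson's SONAS backend, per the article's figures.
drives = 48
drive_size_gb = 450

raw_tb = drives * drive_size_gb / 1000   # 48 x 450GB = 21.6 TB raw
usable_tb = raw_tb / 2                   # RAID 1 keeps two copies of everything: 10.8 TB
                                         # (3 TB of which holds the OS and applications)
print(raw_tb, usable_tb)                 # 21.6 10.8
```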

But it's not the disk-based storage that makes Watson's SONAS so darned fast; it's the CPU and memory. Each time Watson is booted up, 10.8TB of data is automatically loaded into Watson's 15TB of RAM, and of that, only about 1TB is parsed for use in answering Jeopardy questions, Pearson said.

In case you're wondering, 1TB of capacity is still fairly significant: it can hold 220 million pages of text, or 111 DVDs.

"The remarkable thing is that you can get all the answers with such a small set of data," said John Webster, an analyst with research firm Evaluator Group. "After more iterations of load and test and test and loading and updating the database on IBM SONAS, came up with a version of the database that would generate the dataset that you have loaded into memory."

Enter Australian programmer and Samba developer Andrew Tridgell.

Tridgell created the computer algorithms running atop Watson's hardware that cull out the data set. He developed the open-source Clustered Trivial Database (CTDB), which the Samba file protocol uses to share memory across Watson's 90 servers.

More importantly, CTDB ensures that none of the servers step on each other as they update information after a Jeopardy show.
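The coordination problem CTDB solves can be sketched in general terms. The toy below is only a conceptual illustration of lock-protected updates to shared state, not CTDB's actual interface or protocol; CTDB does this across separate machines rather than threads on one box.

```python
# Conceptual sketch: several "servers" updating a shared record without
# stepping on each other, using a lock. Threads stand in for cluster nodes
# purely to illustrate the idea of coordinated updates.
import threading

shared_store = {"answer_stats": 0}
store_lock = threading.Lock()

def post_show_update(server_id: int, increments: int) -> None:
    for _ in range(increments):
        with store_lock:                 # only one updater at a time
            shared_store["answer_stats"] += 1

threads = [threading.Thread(target=post_show_update, args=(i, 1000)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_store["answer_stats"])      # 8000 every time, no lost updates
```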

During the show, Watson is read-only, meaning nothing gets written to the backend SONAS. After the show, Watson goes offline and the computer scientists get to work updating information and debugging, trying to figure out why it gave incorrect answers, such as choosing Toronto in response to a clue about U.S. cities.

"I'm sure they're scratching their head about that," said Pearson.

Lucas Mearian covers storage, disaster recovery and business continuity, financial services infrastructure and healthcare IT for Computerworld. Follow Lucas on Twitter at @lucasmearian or subscribe to Lucas's RSS feed. His e-mail address is lmearian@computerworld.com.

Learn more about mainframes and supercomputers in Computerworld's Mainframes and Supercomputers Topic Center.


For more enterprise computing news, visit Computerworld. Story copyright © 2010 Computerworld Inc. All rights reserved.
