may not know that Toronto isn't located in the United States, but Retrevo has designed similar artificial intelligence to analyze more than 50 million real-time data points to provide shoppers with the most comprehensive, unbiased information regarding what consumer electronics to purchase. In an interview I conducted with Aditya Vailaya, Retrevo's chief scientist, we gain insight into how AI is being used in this field.
Retrevo is a consumer electronics shopping site and virtual shopping companion that helps shoppers make confident buying decisions. It reports real-time trends and demand for consumer electronics and gadgets. As a pattern recognition and data mining specialist, Vailaya is one of the wizards behind the curtain perfecting Watson-like technology to help electronics shoppers decide what to buy, when to buy and where to buy.
I thought it might be interesting to quiz Vailaya about the computer that beat man on the game-show Jeopardy and see how artificial intelligence can help us make informed decisions going forward.
1- Give us a sense, in layman's terms, of how Watson works.
Watson is the first-of-its-kind natural language question-and-answer system. It is manually trained to extract concepts and relationships from a large corpus of text related to a topic and store them in its data bank. Watson then takes user input as natural language text, extracts concepts and relationships (subject-verb-object) and searches for the best set of answers that match most of the concepts and relationships in the input. Watson also generates a confidence score with its answers, which it uses to decide whether to buzz or not.
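As a rough illustration of the matching-and-confidence idea described above (not IBM's actual implementation), here is a minimal Python sketch. The toy data bank, the hand-written triples, the overlap-based confidence score and the 0.5 buzz threshold are all assumptions made up for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A subject-verb-object relationship extracted from text."""
    subject: str
    verb: str
    obj: str


# Hypothetical toy data bank: candidate answer -> triples "learned" about it.
DATA_BANK = {
    "Chicago": {
        Triple("largest airport", "named for", "World War II hero"),
        Triple("second largest airport", "named for", "World War II battle"),
    },
    "Toronto": {
        Triple("largest airport", "named for", "prime minister"),
    },
}


def rank_answers(clue_triples, data_bank):
    """Rank candidates by the fraction of clue triples they match."""
    if not clue_triples:
        return []
    scored = []
    for answer, known in data_bank.items():
        confidence = len(clue_triples & known) / len(clue_triples)
        scored.append((answer, confidence))
    # Highest confidence first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def should_buzz(ranked, threshold=0.5):
    """Buzz only if the top answer clears the (assumed) threshold."""
    return bool(ranked) and ranked[0][1] >= threshold


clue = {
    Triple("largest airport", "named for", "World War II hero"),
    Triple("second largest airport", "named for", "World War II battle"),
}
ranked = rank_answers(clue, DATA_BANK)
print(ranked, should_buzz(ranked))
```

A real system would extract the triples from free text and use far richer statistical scoring. The sketch only shows how a ranked answer list and a confidence threshold fit together.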
2- Pertaining to Watson's performance on Jeopardy, where do you think Watson excelled? And where did it fall short?
Watson defeated human Jeopardy champions hands down. That sums up its excellence. The key strengths of Watson are its ability to parse natural language clues, process them very quickly, and search for an answer within a short time with a sufficiently high degree of accuracy. It is the combination of all these pieces that allowed Watson to achieve this feat, and IBM researchers and their collaborators should be commended.
Watson suffers from the same shortcomings as any other machine learning system: “Garbage in, garbage out.” In other words, these systems only learn what they are fed. Watson will learn incorrect facts if some of the input data are incorrect. Moreover, the system cannot generate an answer if it hasn’t been trained on data related to the question.
The most important challenges in building a system like Watson are collecting an input set that is both valid and comprehensive, processing the input data into pieces of information (or features) that can be mathematically processed (which is where the computers’ prowess lies), and tuning the parameters of the underlying mathematical/statistical models to find optimal machine performance with the desired accuracy and efficiency. It took a number of researchers a few years to train Watson to play Jeopardy, and it would take them a few years to train Watson in a different domain, such as health care or law.
3- I noticed with one question "What was the decade Oreo Cookies were introduced" - Watson duplicated the same incorrect answer as Jennings. Doesn't Watson have the ability to "search" for a new answer in his data bank, once he "learns" that his first choice answer was incorrect? Why not?
Watson did not have the ability to listen to other contestants: it didn’t have ears. Hence, it could not employ real-time input to modify its answers. Watson generates a set of potential answers ranked by confidence. If Watson could listen to real-time user inputs (such as other contestants), then it could skip any of its top answers that have already been given and are known to be incorrect. A challenge here is whether Watson can understand other answers in a very limited amount of time. In a different scenario, where timing is not as much of a constraint, Watson can incorporate additional negative clues and re-process for an answer.
The challenge for a show like Jeopardy would be to come up with an answer in the few milliseconds that it needs to answer after another contestant has answered incorrectly. One can foresee its uses where Watson is an expert consultant and has the luxury to answer in a less restrictive time frame.
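The idea of skipping answers that are already known to be incorrect can be sketched in a few lines of Python. This is a hypothetical illustration, not a description of Watson's internals: the function name, the confidence values and the 0.5 threshold are invented for the example.

```python
def best_remaining_answer(ranked, ruled_out, threshold=0.5):
    """Return the top-ranked answer not yet ruled out, or None.

    `ranked` is a list of (answer, confidence) pairs sorted by
    descending confidence; `ruled_out` holds answers that other
    contestants have already given incorrectly.
    """
    rejected = {answer.strip().lower() for answer in ruled_out}
    for answer, confidence in ranked:
        if answer.strip().lower() in rejected:
            continue  # Known to be wrong, so move to the next candidate.
        # The first surviving candidate is the best one left;
        # answer only if it clears the (assumed) confidence threshold.
        return answer if confidence >= threshold else None
    return None


# Made-up confidence values for the Oreo clue, for illustration only.
ranked = [("1920s", 0.57), ("1910s", 0.55), ("1900s", 0.10)]
print(best_remaining_answer(ranked, ruled_out=[]))
print(best_remaining_answer(ranked, ruled_out=["1920s"]))
```

With no negative clues the sketch returns its top guess. Once "1920s" is ruled out, it falls through to the next candidate, which is the behaviour Watson lacked on the show because it could not hear the other contestants.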
4- What are you working on specifically at Retrevo that relates to the artificial intelligence that is similar to Watson? How does your work differ from what IBM is doing?
Retrevo is a consumer electronics company helping users across the life cycle of products. We help users research, buy, and support their products. We employ statistical machine learning algorithms to identify product features from a vast range of sources across the web. We use a multi-core distributed system to crawl the web for data related to consumer products, such as product specifications, prices, expert and user reviews, current and past sentiment around the products, and the velocity of sentiment and data around the products. We then extract information from these unstructured and semi-structured data. We have trained a number of machine learning algorithms to process these crawled pages and extract concepts and relationships along with a confidence score for the extracted information.
Machines are further trained to process all information related to products in a particular category to generate a Value Map for the respective category. The Value Map is a real-time comparison of products in a two-dimensional space displaying the inherent value of all products in a category as computed by our system against their respective current price.
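To make the two-dimensional comparison concrete, here is a minimal Python sketch of a value-versus-price map. It is only an illustration under stated assumptions: the `Product` fields, the 0-100 value score and the median-based quadrants are hypothetical and are not taken from Retrevo's system, which computes value with its own trained models.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Product:
    name: str
    value_score: float  # Hypothetical computed value, 0-100.
    price: float        # Current price in dollars.


def value_map(products):
    """Place each product in a quadrant relative to category medians."""
    mid_value = median(p.value_score for p in products)
    mid_price = median(p.price for p in products)
    placed = {}
    for p in products:
        value_side = "high value" if p.value_score >= mid_value else "low value"
        price_side = "low price" if p.price <= mid_price else "high price"
        placed[p.name] = f"{value_side}, {price_side}"
    return placed


# Invented example category with three products.
category = [
    Product("Camera A", 90, 200.0),
    Product("Camera B", 40, 500.0),
    Product("Camera C", 60, 300.0),
]
for name, quadrant in value_map(category).items():
    print(name, "->", quadrant)
```

Products that land in the "high value, low price" quadrant are the ones such a map would surface as the best buys in their category.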
We differ from Watson in that we employ our technologies in a different space and to a different end. Unlike Watson, we do not interact with our customers using natural language. We aren’t in a QA (question and answer) space, where there is one correct answer. Instead, our focus is on rating consumer electronics products. Our goal is to find the best value products and present the user with all the relevant information to make an informed buying decision. We train machines to extract relevant information and classify it into a product-centered taxonomy, and then use this information to generate recommendations for individual products.
5- Man vs. Machine is a theme played out in a number of Hollywood movies, in which machines eventually overtake man. What are the truths and fallacies in this thesis?
We should be talking about “Man and Machines” instead of pitting them against each other. Machines are and will remain inanimate objects for the foreseeable future. They simply perform tasks that they are trained for and do not have any inherent motivation to learn on their own. Their main strengths are in performing complex mathematical tasks, working on repetitive tasks without complaining, and reliably generating the same answer each time. Try and teach that to humans! Machines can be trained to perform better than humans in particular tasks, but that requires countless human hours of training.
6- Since Retrevo is a consumer electronics shopping and review site helping shoppers decide what gadgets to buy, how does your work in artificial intelligence fit into its services?
We employ a large number of statistical machine learning algorithms to process raw data related to products and classify them into a comprehensive product-centered taxonomy. Further statistical models are built on top of the classified data to generate holistic, objective, and fact-based product recommendations. It has taken us over four years to train our system to work with a very high degree of confidence in classifying product related data and automatically generating product recommendations. We process over 50 million data points on a daily basis to generate recommendations for over 20,000 products.
7- How many other Watsons are there in existence today? And when will this type of product be available for consumers to purchase?
Watson is unique in terms of its achievements in natural language Q&A systems. However, there are innumerable systems that solve limited problems in specific areas, for example our work at Retrevo. Other areas of AI use are robotics, vision, speech recognition, autonomous navigation systems, and biotechnology, to name a few. The greatest limitation of AI is that it requires huge amounts of resources, both highly specialized engineers and mathematicians and processing power, to train a machine in a particular problem domain. People not knowledgeable in the field of AI are currently unable to train machines to perform at consistently satisfying levels. These systems are thus restricted to particular domains and are slow to reach mass-market adoption.
* * *
For other posts on Watson, artificial intelligence and how it's being applied in other fields, particularly Web 3.0, please see the following.