Siri – Natural Language HCI

Siri could be the biggest change in the way we use computers. For years Microsoft has been trying to re-invent the way we use computers, either through touchscreens with styluses or through voice control. The biggest snag was that they tried to apply these ideas to their existing operating system. Trying to control, by voice, an operating system that was designed for a mouse and keyboard was never going to be a success.
This is where Siri comes in. I want to make it clear that I do not give credit to Apple for this, but rather to the team that invented Siri, which Apple bought. At the end of April 2010 Apple bought Siri for an estimated $200 million. I would like to tip this as the deal of the decade, though only time will tell.
For years teachers have been talking about natural language interfaces, but there have been no real examples to show. Microsoft had a go, but it all depended on keywords and the order in which you said them. For example:
Command: Open MS Word. Dictate: Chapter 1
Siri changes all of this. Finally we can talk to a computer in language that is easy to say and, more importantly, in a way that is a lot more natural. It is so natural that I am guessing almost everybody would be able to use it (apart from people who have difficulty with speech).
This is the pinnacle of making computers easy to use. Now, we all know that we cannot expect Siri to do anything overly clever at the moment. We will not be able to program spreadsheets using our voice any time soon. However, this is the point: changing the way we use computers is important, not only because technology changes but because the way we use technology is changing.
In previous years people used computers to carry out business tasks, to play games, to create websites and to surf the net. In order to do any of these things you had to learn certain rules: formulas, controls, or how to write a good set of search criteria and look through a list of 10 blue links. Times are changing. We now want computers to help run our lives and make it easy to communicate and socialise. Most importantly, the people who want to use computers are changing. They are no longer willing to learn special commands; they want it to be plain and simple.
I can only imagine what will happen with Siri and the various other natural language and voice control software out there, but hopefully they will force the big companies to change their approach when designing software.

When I explain the topic of Human Computer Interfaces I always start by repeating the classic quote attributed to Thomas J. Watson, the president of IBM, in 1943.

“I think there is a world market for maybe five computers.”

This might have been true in 1943, as only a few people in the world would have known how to use a computer. However, if technology firms wanted to sell computers in the millions, they had to make them easy to use. This led to the invention of the classic operating system that we see today. Lots of people know how to use it, and therefore computers sell in their millions. If, however, firms want to sell computers in the billions, they need to re-think just how easy they are to use. Siri makes things very easy to use and is therefore an interesting step towards a future in which people are more reliant on computers. It does this simply by cutting all the steps down to a single sentence:
“Remind me to call after work to see mum in 3 days' time”
This will allow many more people to use technology and, more importantly, more people will grow to depend on it.

If this is going to completely change the world, though, it will not be with Apple. Apple runs a closed system and wants complete control. However, in order to make this completely successful they will need to open up an API. This would allow developers to write their own instructions that Siri can use to search different sources of information and use different tools. Only when this is enabled will we see just how creative people can be with voice commands.
Imagine a world where you see an advert for a TV program and you can simply say:
“Record ‘The World Cup Final’ on Friday” – using a TV guide API
Or imagine being able to say:
“I have got toothache, book me in for a dentist appointment sometime when I am not at work within the next week.” – using a combination of timetables and a dentist API.
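To make the idea of an open API a little more concrete, here is a minimal sketch in Python of how a developer might plug a new instruction into a voice assistant. Everything in it is hypothetical: the voice_command decorator, the dispatch function and the record_programme handler are invented for illustration and are not part of any real Siri or Apple API. A simple pattern match stands in for the natural language understanding that a real assistant would provide, and the handler only reports what it would do rather than calling a real TV guide service.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical plug-in registry: each entry pairs a phrase pattern with a handler.
Handler = Callable[..., str]
_registry: List[Tuple[re.Pattern, Handler]] = []


def voice_command(pattern: str) -> Callable[[Handler], Handler]:
    """Register a handler for utterances matching `pattern` (invented API)."""
    def decorator(func: Handler) -> Handler:
        _registry.append((re.compile(pattern, re.IGNORECASE), func))
        return func
    return decorator


# A developer-written instruction. The regular expression is only a stand-in
# for real language understanding; the named groups become handler arguments.
@voice_command(r"record '?(?P<title>.+?)'? on (?P<day>\w+)")
def record_programme(title: str, day: str) -> str:
    # A real plug-in would call a TV guide service here; this one only reports.
    return f"Scheduled recording of {title} on {day}"


def dispatch(utterance: str) -> Optional[str]:
    """Pass the utterance to the first registered handler whose pattern matches."""
    for pattern, handler in _registry:
        match = pattern.search(utterance)
        if match:
            return handler(**match.groupdict())
    return None  # No plug-in understood the request.
```

With this sketch, dispatch("Record 'The World Cup Final' on Friday") hands the title and the day to record_programme, while a sentence that no plug-in recognises returns None. The dentist example would need several such plug-ins working together, which is exactly why the assistant itself, and not each developer, has to do the hard work of understanding the sentence.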
One day we will not be programming good-looking websites but clever audio interfaces. This is the future for casual computing.

