When I’m wearing my analyst hat, I’m constantly asked if “this the year for…” Is it the year for VoiceXML? The year for Speech Recognition? The year for speaker verification/voice biometrics? The year for VoIP? For the past year, I’d answer every question the same way, “2007 should be a big year” because the robustness of the technology, combined with a maturity of the vendors in the Conversational Access Technologies (CAT) arena lent towards the adoption of all of these technologies.
I still believe that 2007 is the year when we do turn that corner, hit the end of the runway and take off, cross the chasm and meet up with every other business cliché that describes what happens when the latent need for solutions breaks through the fear factor of being an early adopter and sales start to ramp up. However, next year’s growth is not due the technology or the vendors, or even cost avoidance. The next year’s growth will be based on meeting federal mandates such as FFIEC.
The first generation of Conversational Access Technologies were found in the financial industry, which brought us the first widespread use of IVRs for handling self service for credit cards. It’s the financial industry that will also drive adoption of the next generation of CAT.
The trigger, as mentioned in previous reports and advisories authored by (and for) Opus Research, is the FFIEC guidance. The guidance stated that in 2006, financial institutions needed to implement multi-factor authentication for the web. In 2007, this extends into telephony channels as well.
Early implementers of multi-factor security at banks primarily went down one of two paths: One-Time-Password generating tokens and Shared Secrets.
One-Time-Password generating tokens were obvious for many banks, as internally they have been used for years to restrict internal access to secure platforms. Solutions such as RSA’s SecurID and Verisign’s VIP generate a new numeric PIN every 60 seconds. A user would log into a website with their UserID and password, then enter the generated PIN and get access. It’s a very straightforward solution, though it has been considered expensive as each user needs to get their own token which displays the one-time password. RSA is considered the market leader in hardware based OTP technology.
Shared “Secret” Information makes up the other predominant solution for handling verification. There are three major categories of shared secrets:
- Self-Supplied Secrets: in this case, the system asks you, at the point of registration, to answer a number of questions (What city were you born in, what is your favorite color) and at login, you will be asked to answer one or more of these questions.
- Historical Data: in this case, the system uses historical information ranging from “what was the amount of your last deposit” to “when did you pay off your car loan” to “what was your address in January 2001, gleaned from a number of public and internal databases. You don’t pre-answer any question.
- Photo Preferences: also pushed to market by RSA as a result of its PassMark acquisition, this method has you pick a preferred photo out of a selection of up to thousands of photos and at login, you’ll have to select that photo again to log in.
The failure of shared “secret” information is that it is rarely secret, and more important, the more this “secret” information is used, the less secure it becomes.
For example, when it comes to self-supplied secrets, the most common questions are easily found on the web or in publicly accessible databases: birthdate, mother’s maiden name, pet name, etc. How did Paris Hilton’s Sidekick get “hacked”? Someone figured out that she used her dog’s name. More important, is the fact that most websites and services ask the same information: birthdate, street where you grew up, mother’s maiden name, favorite pet – which means that your information is more and more out in the open. Historical data is also challenging – when I went online to request a copy of my credit report, it took me five minutes to figure out if I ever had a student loan from a specific bank, as the lender has changed multiple times based on consolidations and one bank selling the loan to another bank.
However the largest challenge with shared “secret” information is that this information is very much only applicable for the web. Securing a phone transaction with a picture is ineffective, and being able to speak freeform text to answer a historical or shared secret question isn’t technologically feasible. The only option would be to present a multiple choice for the user to answer, but best practices and common sense rule out any security method where a potential answer is given at the time of the challenge.
This is why 2007 becomes the year of the CAT.
The mandate for banks to implement multi-factor authentication for the web left the field wide open for vendors to propose “creative” solutions to achieve FFIEC compliance. However, once voice is thrown into the mix, the list drops dramatically. With voice, there are two available methods for authenticating: touchtone and voice. This leaves two methods for strong authentication: speaker verification and one-time pins.
Now, though I am the CTO of a voice biometrics firm, from a functionality perspective – both solutions solve the problem. Asking a user to input a one-time numeric PIN generated by a hardware token or to leave a voiceprint to gain entry both satisfy the requirements for multi-factor authentication.
More importantly, both solutions can be easily implemented for web and voice, assuming that the bank has a strategy for implementing a well thought out CAT infrastructure.
Implementing one-time numeric PINs for the voice follows the web in a CAT environment. In this case, the voice application would ask the user for the OTP, pass it over to the appropriate authentication system and get a response back regarding the user passing or failing the authentication request. Since the OTP system is already integrated to a web process (typically a web service/SOAP call), the voice application can make the same call (simplified with the use of a VoiceXML 2.1 request) and parse the same response to gain access.
Implementing voice biometrics for a web process, however is more challenging, but still easily handled. The typical process, as shown by vendors ranging from Authentify to VxV Solutions (my company) show a process where a web user, after starting the login process, is instructed to call a phone number and authenticate his or her voice, either receiving back a one-time pin (also called a Soft-OTP) or being redirected back to the web application after passing the biometric claim.
In both cases, the key is that the bank can now standardize their processes for handling both web and voice transactions. However, standardizing the processes doesn’t necessarily mean standardizing the method. It is expected that banks, and enterprises in general, will support multiple authentication methods based on the user’s needs and status. For example, shipping an OTP token with the bank’s name engraved on the back may cost upwards of $30 per user, but for key clients, the cost may be mitigated by the fact that it is a very fast way to log in. Conversely, a voice biometric solution is typically much cheaper, though less convenient for web users as it requires the user to make a phone call to enable their web session.
What is expected is the growth of a new range of multi-factor brokerage services, such as Ping Identity’s PingLogin solution: designed to let a user select the preferred method of providing multiple factors. In this case, consider a preferred bank customer. He (or she) may have an OTP token provided by the bank and a fingerprint scanner at home. The bank may have also enrolled a voiceprint. When the user logs into the website from work, he could use a voiceprint or OTP – when calling in, he could use the same voiceprint or OTP, but when logging in from home, all three methods could be used. Fingerprint, in this case, would most likely be the fastest and least obtrusive.
Instead of integrating each of these solutions into the voice and web applications, and requiring separate dedicated logic, the authentication broker would simply determine which methods are available, which can be used based on the mode, and then allow the user to select the method of his or her choice.
Again, the benefit of this type of broker is now exponentially increased based on the implementation of CAT. Common SOAP interfaces and easy integration into voice and web applications allows for this choice of flexible multi-factor authentication.
If 2007 is the year that CAT turns the corner, or crosses the chasm, or whatever we’re calling it these days – I’m looking towards 2008 to be the year of federated security. You can’t have all of these banks making investment in strong, multi-factor authentication without someone finding a way where they can monetize the implementations – and leveraging these internal identity databases and authentication methods lends towards these FFIEC compliant banks looking towards becoming independent, trusted Identity Providers (IdP). The currently blog-centric OpenID movement shows the beginnings of a decentralized security model where a user could use an identity at their bank to get into their healthcare account, or into their cable system to get their latest bill. Adding trusted Identity Providers helps move the focus of OpenID from blogs to transactional accounts such as banking and finance.