Category Archives: Identity 2.0

OpenID and Promiscuity

Yesterday, Paul Madsen posted a small comment regarding the OpenID plugin for MediaWiki.

The plugin, as created, lets Relying Parties (in this case, the MediaWiki sites) build OpenID whitelists and blacklists. For example, the Relying Party could choose to accept OpenID identity URLs only from idp.myvauthid.com, or to reject all OpenID identity URLs from LiveJournal.
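To make the idea concrete, here is a minimal sketch in Python of how a Relying Party might apply such lists. It is not the plugin's actual code: the function names are hypothetical, and it assumes that filtering is done on the host name of the identity URL, with the blacklist taking precedence over the whitelist.

```python
from urllib.parse import urlparse


def provider_host(identity_url: str) -> str:
    """Return the lowercase host of an OpenID identity URL ('' if none)."""
    # Identity URLs are often typed without a scheme; assume http in that case.
    if "://" not in identity_url:
        identity_url = "http://" + identity_url
    return (urlparse(identity_url).hostname or "").lower()


def host_matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def is_accepted(identity_url: str, whitelist=(), blacklist=()) -> bool:
    """Decide whether a Relying Party should accept this identity URL.

    Assumed policy: a blacklist match always rejects; if a whitelist is
    configured, only hosts on it are accepted; otherwise accept everything.
    """
    host = provider_host(identity_url)
    if not host:
        return False
    if any(host_matches(host, d) for d in blacklist):
        return False
    if whitelist:
        return any(host_matches(host, d) for d in whitelist)
    return True
```

With an empty whitelist and blacklist, the function accepts every identity URL, which is the "promiscuous" behavior Paul describes as the OpenID default.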

Paul posits that this sort of model is antithetical to the whole idea of OpenID – that a Relying Party is bound by best practices to accept any OpenID from any OpenID identity provider.

However, when it comes to security – especially with the implementation of the OpenID AQE – giving Relying Parties the ability to specifically trust, or more importantly the ability NOT to trust an ID provider is critical. In a completely decentralized authentication model, such as OpenID, there will be the need to block rogue ID providers – especially once comment spammers figure out how easy it would be to set up an OpenID ID provider that always confirms an identity without requiring authentication.

As OpenID grows beyond wikis and blogs and becomes an identity system used for handling more secure or transactional data, the need to be able to trust specific Identity Providers becomes key. Methods such as the MediaWiki plugin may break part of the original vision of the standard, but they do provide a gateway towards OpenID’s future.

When Biometrics Goes Wrong

File this under the category: just because you can do something doesn’t mean you necessarily should do something.

There’s been a good amount of recent chatter online regarding the upcoming IE and Firefox plugins for the Polar Rose photo indexing service.

Similar to the initial intent of almost-acquired-by-Google Riya, Polar Rose’s goal is to create a search engine for faces in photos. As quoted from the company website:

“Polar Rose makes photos searchable by analyzing their content and recognizing the people in them.”

This works by getting people to identify people in their own photos and photos from friends (see this example from Flickr) – essentially using people to train the system. Using standard techniques from biometric facial recognition software usually used by law enforcement, these static photos are then used to create a three dimensional model. This is done by using a couple of standard position points (typically pupils, tip of nose, curvature of the mouth) as references and then calculating the position of the face.

By using meta-data encoded in the picture, it’s also possible to adjust the facial image based on the date and age of the photo. So, a clean-shaven picture of you in the summer can be correlated with a later picture of you taken in the winter with a full cold-defying beard.

If you’re not concerned yet, think about the ramifications of this – photos posted online with plausible anonymity are no longer anonymous at all.

As a biometrics researcher, it’s been important for me to stay on the side of authentication, not identification. Authentication means that there’s a claimed identity that you start with, and then the biometric system confirms it. For example, with the vAuth(tm) platform that my company has created, you have to provide your user ID (or an alternate identifier), then you have to provide voice samples to confirm your identity claim. Consumer/corporate fingerprint systems work the same way – you present an identity claim (login ID, ID number, ID card) then validate that you are the person you claim to be.

Systems like Polar Rose’s are identification systems – they take a photo, extract the face, compare it against the corpus of collected faces and try to identify who it is. Technology like this is better handled by law enforcement, not put in the hands of the general public.
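The difference between the two modes can be sketched in a few lines of Python. This is a toy illustration only: the feature vectors, the cosine-similarity scoring and the 0.95 threshold are all assumptions made for the example, not how vAuth, Polar Rose or any real biometric engine works.

```python
import math


def cosine_similarity(a, b):
    """Similarity between two feature vectors (1.0 means identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def authenticate(claimed_id, sample, enrolled, threshold=0.95):
    """1:1 check: does the sample match the template of the CLAIMED identity?"""
    template = enrolled.get(claimed_id)
    if template is None:
        return False
    return cosine_similarity(sample, template) >= threshold


def identify(sample, enrolled, threshold=0.95):
    """1:N search: scan the whole corpus and return the best match, if any."""
    best_id, best_score = None, threshold
    for person_id, template in enrolled.items():
        score = cosine_similarity(sample, template)
        if score >= best_score:
            best_id, best_score = person_id, score
    return best_id
```

Note that `authenticate` never reveals anything unless the caller already supplies an identity claim, while `identify` turns an anonymous sample into a name. That asymmetry is the whole privacy problem.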

There are millions of pseudo-anonymous pictures online. For example, I have a couple hundred pictures on Flickr. Some of these have my first name. None of them have my last name. Now, let’s say that one of my friends is a Polar Rose user – he or she could start identifying my face using my complete name – without my knowledge or permission. Now, let’s say someone else stumbles across a photo of me from somewhere else, like a candid from a cocktail night (where I am not mentioned in the photo commentary or credits) and is also a Polar Rose user. That person uses the plugin to see who I am and gets my full name. That person is now a single Google search away from knowing where I work, where I live, where I eat and drink from possible blog entries.

It’s a slippery slope to the erosion of privacy. Embarrassing photo from the office holiday party make it online? That photo of you in the skimpy bikini? The one from your buddy’s bachelor party? These are perfect examples of photos where you don’t care if they’re online because they’re pseudo-anonymous, meaning that if someone knows you in real life, they might be able to identify you from the photo. However, the majority of people who might see the picture don’t know who you are aside from whatever information you choose to attach to the picture. Consider these pictures to be generally available.

The ethics of using biometrics for identification are complex and murky. Putting a potential stalker’s tool into the hands of miscreants, stalkers and pedophiles isn’t something to be taken lightly – and a simple license agreement stating that the tool can’t be used for stalking just isn’t going to be enough.

Strong Auth Drives Conversational Access

When I’m wearing my analyst hat, I’m constantly asked if “this is the year for…” Is it the year for VoiceXML? The year for Speech Recognition? The year for speaker verification/voice biometrics? The year for VoIP? For the past year, I’d answer every question the same way, “2007 should be a big year,” because the robustness of the technology, combined with the maturity of the vendors in the Conversational Access Technologies (CAT) arena, lent itself to the adoption of all of these technologies.

I still believe that 2007 is the year when we do turn that corner, hit the end of the runway and take off, cross the chasm and meet up with every other business cliché that describes what happens when the latent need for solutions breaks through the fear factor of being an early adopter and sales start to ramp up. However, next year’s growth is not due to the technology or the vendors, or even cost avoidance. The next year’s growth will be based on meeting federal mandates such as FFIEC.

The first generation of Conversational Access Technologies was found in the financial industry, which brought us the first widespread use of IVRs for handling self service for credit cards. It’s the financial industry that will also drive adoption of the next generation of CAT.

The trigger, as mentioned in previous reports and advisories authored by (and for) Opus Research, is the FFIEC guidance. The guidance stated that in 2006, financial institutions needed to implement multi-factor authentication for the web. In 2007, this extends into telephony channels as well.

Early implementers of multi-factor security at banks primarily went down one of two paths: One-Time-Password generating tokens and Shared Secrets.

One-Time-Password generating tokens were obvious for many banks, as internally they have been used for years to restrict internal access to secure platforms. Solutions such as RSA’s SecurID and Verisign’s VIP generate a new numeric PIN every 60 seconds. A user would log into a website with their UserID and password, then enter the generated PIN and get access. It’s a very straightforward solution, though it has been considered expensive as each user needs to get their own token which displays the one-time password. RSA is considered the market leader in hardware based OTP technology.
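For readers who want to see the mechanics, below is a minimal sketch of a time-based one-time PIN. It follows the open RFC 6238 (TOTP) construction rather than RSA's or Verisign's own algorithms, which I am not reproducing here; the 60-second step matches the interval mentioned above, and the function names and six-digit length are assumptions for illustration.

```python
import hashlib
import hmac
import struct


def one_time_pin(secret: bytes, unix_time: int, step: int = 60, digits: int = 6) -> str:
    """Derive a numeric PIN from a shared secret and the current time window."""
    # Token and server both count how many 'step'-second windows have elapsed.
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): pick 4 bytes at an offset given by the MAC.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_pin(secret: bytes, submitted: str, unix_time: int,
               step: int = 60, digits: int = 6) -> bool:
    """Server-side check: recompute the PIN for this window and compare."""
    expected = one_time_pin(secret, unix_time, step, digits)
    return hmac.compare_digest(expected, submitted)
```

The token and the server never exchange the secret at login time. They only need the same secret and roughly synchronized clocks, which is why the PIN can be typed into a web form or keyed in on a phone equally well.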

Shared “Secret” Information makes up the other predominant solution for handling verification. There are three major categories of shared secrets:

  • Self-Supplied Secrets: in this case, the system asks you, at the point of registration, to answer a number of questions (What city were you born in, what is your favorite color) and at login, you will be asked to answer one or more of these questions.
  • Historical Data: in this case, the system uses historical information ranging from “what was the amount of your last deposit” to “when did you pay off your car loan” to “what was your address in January 2001,” gleaned from a number of public and internal databases. You don’t pre-answer any question.
  • Photo Preferences: also pushed to market by RSA as a result of its PassMark acquisition, this method has you pick a preferred photo out of a selection of up to thousands of photos and at login, you’ll have to select that photo again to log in.

The failure of shared “secret” information is that it is rarely secret, and more important, the more this “secret” information is used, the less secure it becomes.

For example, when it comes to self-supplied secrets, the most common questions are easily found on the web or in publicly accessible databases: birthdate, mother’s maiden name, pet name, etc. How did Paris Hilton’s Sidekick get “hacked”? Someone figured out that she used her dog’s name. More important is the fact that most websites and services ask for the same information: birthdate, street where you grew up, mother’s maiden name, favorite pet – which means that your information is more and more out in the open. Historical data is also challenging – when I went online to request a copy of my credit report, it took me five minutes to figure out if I ever had a student loan from a specific bank, as the lender has changed multiple times based on consolidations and one bank selling the loan to another bank.

However, the largest challenge with shared “secret” information is that it is really only applicable to the web. Securing a phone transaction with a picture is ineffective, and being able to speak freeform text to answer a historical or shared secret question isn’t technologically feasible. The only option would be to present a multiple choice for the user to answer, but best practices and common sense rule out any security method where a potential answer is given at the time of the challenge.

This is why 2007 becomes the year of the CAT.

The mandate for banks to implement multi-factor authentication for the web left the field wide open for vendors to propose “creative” solutions to achieve FFIEC compliance. However, once voice is thrown into the mix, the list drops dramatically. With voice, there are two available methods for authenticating: touchtone and voice. This leaves two methods for strong authentication: speaker verification and one-time PINs.

Now, though I am the CTO of a voice biometrics firm, from a functionality perspective – both solutions solve the problem. Asking a user to input a one-time numeric PIN generated by a hardware token or to leave a voiceprint to gain entry both satisfy the requirements for multi-factor authentication.

More importantly, both solutions can be easily implemented for web and voice, assuming that the bank has a strategy for implementing a well thought out CAT infrastructure.

Implementing one-time numeric PINs for the voice follows the web in a CAT environment. In this case, the voice application would ask the user for the OTP, pass it over to the appropriate authentication system and get a response back regarding the user passing or failing the authentication request. Since the OTP system is already integrated to a web process (typically a web service/SOAP call), the voice application can make the same call (simplified with the use of a VoiceXML 2.1 request) and parse the same response to gain access.
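As a rough sketch of that reuse, the Python below has a web handler and a voice handler call the same check. Everything here is hypothetical: `check_otp` stands in for the real web service call, and the `issued` dictionary stands in for the authentication back end.

```python
def check_otp(user_id: str, pin: str, issued: dict) -> bool:
    """Stand-in for the shared authentication call (a web service in practice)."""
    return issued.get(user_id) == pin


def web_login(form: dict, issued: dict) -> bool:
    """Web channel: the PIN arrives as a typed form field."""
    return check_otp(form["user_id"], form["otp"].strip(), issued)


def voice_login(user_id: str, dtmf: str, issued: dict) -> bool:
    """Voice channel: the PIN arrives as touchtone input, e.g. '492817#'."""
    # Keep only the digits; drop the terminating '#' or any stray '*'.
    pin = "".join(ch for ch in dtmf if ch.isdigit())
    return check_otp(user_id, pin, issued)
```

The only channel-specific work is normalizing the input. The pass/fail decision lives in one place, so the web and voice applications cannot drift apart.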

Implementing voice biometrics for a web process is more challenging, but still easily handled. In the typical process, as shown by vendors ranging from Authentify to VxV Solutions (my company), a web user, after starting the login process, is instructed to call a phone number and authenticate his or her voice, either receiving back a one-time PIN (also called a Soft-OTP) or being redirected back to the web application after passing the biometric claim.

In both cases, the key is that the bank can now standardize their processes for handling both web and voice transactions. However, standardizing the processes doesn’t necessarily mean standardizing the method. It is expected that banks, and enterprises in general, will support multiple authentication methods based on the user’s needs and status. For example, shipping an OTP token with the bank’s name engraved on the back may cost upwards of $30 per user, but for key clients, the cost may be mitigated by the fact that it is a very fast way to log in. Conversely, a voice biometric solution is typically much cheaper, though less convenient for web users as it requires the user to make a phone call to enable their web session.

What is expected is the growth of a new range of multi-factor brokerage services, such as Ping Identity’s PingLogin solution, designed to let a user select the preferred method of providing multiple factors. In this case, consider a preferred bank customer. He (or she) may have an OTP token provided by the bank and a fingerprint scanner at home. The bank may have also enrolled a voiceprint. When the user logs into the website from work, he could use a voiceprint or OTP – when calling in, he could use the same voiceprint or OTP, but when logging in from home, all three methods could be used. Fingerprint, in this case, would most likely be the fastest and least obtrusive.

Instead of integrating each of these solutions into the voice and web applications, and requiring separate dedicated logic, the authentication broker would simply determine which methods are available, which can be used based on the mode, and then allow the user to select the method of his or her choice.
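A minimal sketch of that broker logic, in Python, might look like the following. It is not PingLogin's design: the mode-to-method table, the `fingerprint_scanner` device name and the fallback rule are all assumptions chosen to mirror the preferred-customer example above.

```python
# Which methods each access mode could support in principle (assumed mapping).
MODE_CAPABLE = {
    "web": {"otp", "voiceprint", "fingerprint"},
    "phone": {"otp", "voiceprint"},
}


def usable_methods(enrolled, mode, local_devices=frozenset()):
    """Methods the user has enrolled AND can actually use right now."""
    candidates = set(enrolled) & MODE_CAPABLE.get(mode, set())
    # A fingerprint only works where a scanner is attached (e.g. at home).
    if "fingerprint" in candidates and "fingerprint_scanner" not in local_devices:
        candidates.discard("fingerprint")
    return sorted(candidates)


def choose_method(enrolled, mode, preferred, local_devices=frozenset()):
    """Honor the user's preference when possible, else fall back to any option."""
    options = usable_methods(enrolled, mode, local_devices)
    if preferred in options:
        return preferred
    return options[0] if options else None
```

For the customer above, `usable_methods` returns the OTP and voiceprint from work or over the phone, and all three methods from the home machine with the scanner attached.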

Again, the benefit of this type of broker is now exponentially increased based on the implementation of CAT. Common SOAP interfaces and easy integration into voice and web applications allow for this choice of flexible multi-factor authentication.

If 2007 is the year that CAT turns the corner, or crosses the chasm, or whatever we’re calling it these days – I’m looking towards 2008 to be the year of federated security. You can’t have all of these banks making investments in strong, multi-factor authentication without someone finding a way to monetize the implementations – and leveraging these internal identity databases and authentication methods points these FFIEC compliant banks towards becoming independent, trusted Identity Providers (IdP). The currently blog-centric OpenID movement shows the beginnings of a decentralized security model where a user could use an identity at their bank to get into their healthcare account, or into their cable system to get their latest bill. Adding trusted Identity Providers helps move the focus of OpenID from blogs to transactional accounts such as banking and finance.

An appropriate metaphor?

There’s been some chatter lately in Identity 2.0 circles regarding Cardspace. Cardspace is Microsoft’s latest attempt at controlling, or rather participating in, the nascent user-centric identity space.

The Cardspace metaphor seems harmless enough – you create a number of identity “cards”. The idea is that, just as you keep an employee ID and a driver’s license with your home information in your wallet, you can have a number of virtual IDs (I’ll call them InfoCards in reference to Cardspace’s internal project name), which you store in your browser.

Actually, to take it a level deeper, there are two types of cards you can have: managed and self-asserted. The managed cards are like the driver’s licenses and employee IDs of the Cardspace arena. Managed cards have the information on them locked by the issuing party. Just as you have to get your address on a driver’s license changed by the motor vehicles department, when you change your information on a managed InfoCard, you have to get the InfoCard provider to change the data. In contrast, there are self-asserted InfoCards, the equivalent of a business card. Anybody can go to Kinkos and get a business card printed up asserting that the holder is a certain person, at a certain address and working for a certain company. There’s no real trust involved here – it’s more of a way to speed up the exchange of commonly needed addresses so you can conduct basic business.

Now, in theory, the metaphor makes sense. I have business cards for my company, the firm I occasionally consult for, and personal business cards that I give to friends and people in the bar industry. I also have a driver’s license, which is my primary form of validated identity credential – and I keep both in my wallet.

This is where the Cardspace metaphor starts to break down. I can choose which wallet I put my cards in, and I can put that wallet into any pair of pants I own (or even a jacket pocket). Cardspace is Windows-centric, and not just Windows, but Internet Explorer only. Yes, there are homegrown plugins for Firefox, but they’re not technically supported by Microsoft. Though I have been able to jerry-rig Cardspace to work in Firefox on my Mac, it’s less than optimal.

Now, since the cards – even the managed ones – are loaded into the browser, they’re not very portable. If I only have one or two machines, managing these cards is not horribly challenging – but I have three different identities on my computer (home, work, development) and each of these has three browsers (Firefox, Safari, Opera) – that’s nine browsers. Then add in the Parallels environment and the three browsers there (Firefox, IE, Opera) and I suddenly have 12 different browsers where I could have to store these cards. Plus, there’s the browser on my smartphone – and I haven’t even started to think about what happens when I occasionally use a colleague’s machine or even a shared terminal at an airport or hotel.

It’s the equivalent of saying that I can only carry my driver’s license in a State of California issued wallet, and if I use another wallet, my license may not fit or possibly will not always give the same information… and I’ll need another wallet for each pair of pants.

Even if I was primarily a PC user, the model still doesn’t work.

Dick Hardt, one of the real evangelists for the user-centric identity movement, has introduced Sxipper, for populating “business card data” into web forms and using OpenID for the authentication method. The challenge to make Sxipper truly useful will be the ability to store the ID information within a secure Sxipper server, so that when I log into Sxipper from any browser, it fetches the current information – and if there are any changes locally, it synchronizes back to the central server. Though I haven’t broached it with Dick, I would expect this sort of architecture will be made available as an option as the product matures. As a user, I may choose to keep some identities local, some cached to a local machine, and some only fetched as needed from the network – a perfect solution for shared or public machines.

Because Microsoft owns so many desktops, Cardspace could become the dominant, if flawed, metaphor for the next wave of identity. Then again, it might just be a house of cards.