This is the first installment of Snickerdoodle’s Digital Identity thought leadership series.
In Web2, digital identity is a victim of silos. Our digital profiles, each of which represents but a fraction of our entire digital identity, are scattered across the web and housed by various institutions, from academia to social media sites. These data bundles allow each institution to fulfill its mandate, whether that is provisioning financial services or certifying graduation from higher education. The result is not only fragmentation due to the cordoning-off of digital identity systems, but also a reliance on gatekeepers to repeatedly generate electronic identity attributes like credit history, transcripts, and patient records for users.
The shortcomings of a Web2 identity model become even more stark when measured against the Digital Identity Alliance’s emerging consensus around ideal identity product requirements. Digital identity solutions should align with the ‘four Ps’: private, portable, persistent, and personal. This framework captures an ethos somewhat similar to that of the self-sovereign identity (SSI) movement, which advocates for a decentralized identity management system that can operate independently of public or private actors and incorporates 12 guiding principles in its design.
The Four Ps
In this instance, private means control over one’s identity, data, and its permissioning (when it is shared and with whom). This kind of control has never truly existed in the Web2 era. Data ownership is relegated to the likes of corporations and governments, not users. In fact, data monopolies like Meta build moats around their data arsenals. They then extract value from these data arsenals – in the case of Meta, to the tune of 117 billion dollars, 98 percent of which is generated by identity-driven targeted advertising – and bury the ability to opt out of commercialized third-party data sharing, or rationalize exceptions, in the T&Cs. From the outset, Web2 has treated data not as an extension of users’ identities, but as a means to generate revenue or to charge fees for exporting identity credentials (think of the costs associated with sending transcripts and test scores or renewing a passport, for example).
A portable identity is one where the underlying data is both accessible to the identity holder across several mediums and easily shared with relevant third parties. Portability implies customer-centric design, a value not currently practiced by the architects of Web2. Because data resides in the centralized databases of each institution, users are at their mercy for data transfer. In contexts where institutions have invested little in bureaucratic digitization, like many healthcare or local government providers, data disclosure to third parties is a particularly painful process. On the other hand, digitization is no panacea either. Even though regulations like GDPR and CCPA specify that exported personal data must be ‘useable’ in a structured and machine-readable format, the downloaded files often resemble something out of a data dump hellscape. The regulatory threshold for data usability is clearly subjective and set low. For user-approved third parties looking to tease out the relevant identity information, data cleaning and engineering are necessary.
When an individual’s identity is persistent, it exists indefinitely beyond the confines of the institutions that may have originally captured or credentialed that identity data. Lack of persistence is a problem that plagues identity management systems in general and has been replicated in Web2. If my credit card provider closes up shop tomorrow, I lose my credit and account transaction history (ironically, some data would be salvaged and live on through the credit bureaus, but I would have little recourse to amend or correct it). If my country collapses or descends into civil war, I would need to register for a new sovereign identity with the UNHCR upon escape. And while Web2 may only reproduce existing issues around identity persistence, the more important takeaway is that a centralized internet cannot fundamentally fix them (more on this later).
An identity qualifies as personal if it is unique to the identity holder. Due to the siloed nature of Web2, many identity profiles cannot claim to be truly unique (although they will have a unique identifier, like an email or account number, to help facilitate authorization and access rights). Because financial identity in the United States, for example, hinges so heavily on Social Security numbers, data leaks of the magnitude of the Equifax hack have unleashed rampant identity fraud. Social engineering and phishing, meanwhile, usually entail some form of identity imitation, with social media data or other personal information repurposed. However, if all of these profiles existed in one place at the user level, it would be much easier for users to differentiate and distinguish themselves as the ‘unique’ identity. This is in stark contrast to the current paradigm, where impersonators only need specific datasets to achieve their ends.
Now, astute readers might ask, do we need to migrate to a decentralized, permissionless, and trustless internet in order to reform identity and better align its application with the four Ps? Many industry stakeholders, including the Digital Identity Alliance, might argue no – in fact, there are members of its coalition that are incentivized to marginally improve identity without truly disrupting the underbelly of the data-sharing economy. But I intentionally zeroed in on the four Ps to illustrate the limitations of such improvements even against the backdrop of a Web2-friendly framework (versus that of SSI), and to show that, inevitably, we will hit walls that I believe only Web3 can tear down.
Most of the innovation in Web2 centers around portability, specifically around the introduction of centralized or federated identity management systems. In a centralized identity system, one identity provider – commonly a government agency – oversees identity data and user authentication. This centralized database is accessible to permissioned relying parties, who receive the underlying data attributes from the centralized identity provider. A federated identity system, alternatively, consists of a consortium of industry-specific identity providers who enroll users and collect similar identity data. The final repository of identity information is usually aggregated in one database. This consortium can then authenticate users and transfer data attributes to desired relying parties. (As a side note, federated authentication services are different from authorization for service provision. In many federated identity models, users can access the services of the whole consortium after onboarding with just one partnering institution.)
Centralized and federated identity models, however, are just reductions in gatekeeping via slightly larger enclosures of fenced-in data. The trade-off for that modest value-add in portability of narrow datasets is not always compelling: the bilateral or multilateral coordination demanded to establish the necessary processes and secure technical infrastructure can be extensive, although basic APIs are fairly approachable. In Web2, that process must then be replicated for every dataset custodian. Unsurprisingly, with so many different identity providers solving data interoperability in their own kingdoms and for their own pain points, little collective thought has been given to standards for data presentation or formatting.
Somehow, that data flow circumvents the user entirely, with data passed between institutions and corporations. Web2 data vaults or lockers are an attempt to give users some semblance of privacy and control over their fragmented and often unusable data, but they are a tangential appendage to Web2 data streams. Users are not a pillar or linchpin, but an inconvenient afterthought. As long as that holds true, the application of ‘private’ will always be an illusion. Whether or not identity providers are upholding the integrity of consensual data sharing – and have not deliberately obscured the choice to opt out – is based on trust and trust alone. Regulations like the CCPA and GDPR can do their best to insert consumer rights into the conversation, but if the endless discoveries of data privacy violations by data monopolies are any indication, lawmakers are stuck in a loop of reining in the excesses of Web2 instead of resetting the system.
Web3 architecture informed by SSI has flipped this paradigm on its head, placing the user at the heart of the data ecosystem. Through data wallets that rely on either local or decentralized cloud storage, users can lock away data contained within self-imported data attestations, non-fungible tokens (NFTs), soulbound tokens (SBTs), or cryptographically signed verifiable credentials (the mechanics of these data vessels and the consequences for identity applications will be further explored in the next identity series installment). This change in structure and data flow has five major implications:
1. With data living within a user’s wallet, data sharing is no longer dictated by trust. To initiate an exchange of data, users must opt in. Users wield full control over data disclosure. This is substantially different from the Web2 opt-out model.
2. The justification for large centralized databases held by enterprises is much reduced. Decentralized, trustless, and permissionless architecture enables ongoing access to specific identity data when given the green light by the user. Data hacks would have to be orchestrated at the individual level, which is time-consuming and ultimately exposes a much smaller scope of data.
3. Decentralized, trustless, and permissionless architecture demands open standards. The presentation of data represented by non-fungible tokens or verifiable credentials has already undergone a period of technical consensus building and standardization. While there is still progress to be made, the building blocks that will expedite asynchronous data interoperability and portability have begun to crystallize.
4. Due to the immutable nature of blockchain, there is a tamper-proof record of which entity issued which data schemas to which users. Even if the entity dissolves, the relationship between the identity provider and user has been archived. As long as the entity was a legitimate identity provider, there is no reason the data now stored in users’ wallets should not remain persistent.
5. Digital identity profiles that were previously siloed now exist harmoniously in a user’s wallet. Each wallet would truly correspond to a unique digital identity.
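To make the opt-in and persistence points more concrete, here is a minimal Python sketch of how a user-held wallet might behave. Everything in it is an illustrative assumption rather than a description of any real wallet or protocol: the `Credential`, `IssuanceLedger`, and `Wallet` names are hypothetical, the in-memory ledger merely stands in for an append-only on-chain record, and a plain SHA-256 digest stands in for the digital signatures that real verifiable credentials rely on.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Credential:
    """Hypothetical attestation issued to a holder by an identity provider."""
    issuer: str
    holder: str
    attribute: str
    value: str

    def digest(self) -> str:
        # Canonical JSON so the same credential always hashes identically.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


@dataclass
class IssuanceLedger:
    """Stand-in for an append-only, tamper-evident record of issuances."""
    digests: List[str] = field(default_factory=list)

    def record(self, credential: Credential) -> None:
        self.digests.append(credential.digest())

    def was_issued(self, credential: Credential) -> bool:
        # Verification needs only the ledger, not a live issuer,
        # so it still works if the issuer has since dissolved.
        return credential.digest() in self.digests


@dataclass
class Wallet:
    """User-held store: nothing leaves it unless the user has opted in."""
    holder: str
    credentials: Dict[str, Credential] = field(default_factory=dict)
    grants: Set[Tuple[str, str]] = field(default_factory=set)

    def store(self, credential: Credential) -> None:
        self.credentials[credential.attribute] = credential

    def opt_in(self, relying_party: str, attribute: str) -> None:
        self.grants.add((relying_party, attribute))

    def revoke(self, relying_party: str, attribute: str) -> None:
        self.grants.discard((relying_party, attribute))

    def request(self, relying_party: str, attribute: str) -> Optional[Credential]:
        # Default-deny: no explicit grant means no disclosure.
        if (relying_party, attribute) not in self.grants:
            return None
        return self.credentials.get(attribute)
```

In this sketch, a relying party calling `request` gets nothing back until the user has called `opt_in` for that specific party and attribute, and `was_issued` can confirm a credential against the ledger without contacting the issuer. A production design would differ in important ways; for instance, it would avoid publishing digests of low-entropy personal data and would verify issuer signatures instead.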
Even when graded against a more forgiving framework, Web2 fails identity. Web3, on the other hand, has the potential to reboot it. In our next installment, I will outline the different approaches to digital identity in Web3 and suggest a hybrid that maximizes adaptability.