Who owns The Social Graph and Social Media ? A Treatise on Content, Data, and Relationships.

Author: Jonathan Vanasco
Contact: jonathan@findmeon.com
Date: 2008-06-30
Copyright: © 2008 Jonathan Vanasco
Version: 0.2 Preview (may change)

Who owns The Social Graph and Social Media ?

I've spent the past few weeks in non-stop meetings with corporate executives, startup teams, ad+interactive agency management and everyone in between trying to get the 'Social Media Standards' project ( http://SocialMediaStandards.org ) up and running.

A question that constantly comes up (ad nauseam) is that of ownership: who owns the media and data fostered by our new Social Media culture, and the social relations they digitally represent.

Notice my careful phrasing here - 'fostered by' - it's a key concept that we'll get to later on.

Depending on what an organization does, and why they ultimately care about Social Media, they will either insist that this information is wholly public or mandate that it is wholly private; there is no middle ground. However, no matter which of these beliefs is embraced, interests will always claim something to the effect of "the user owns their data," and hide behind the user's "best interests" whenever confronted for detailed reasoning.

I've been fortunate enough to sit down with some exceedingly brilliant ( and often key ) people for off-the-record discussions concerning the Social Media ownership debate. With these valued conversations and my own experience in the industry through FindMeOn behind me, I'd like to provide others with some insight, dispel some myths, and suggest new avenues for dialogue.

I've created this document as a primer for my clients, friends, and the industry at large to fully understand the Social Network arena. While this essay is designed to be an exhaustive compendium of Social Media issues and solutions, it is by no means complete.

The social media landscape is continually evolving as new technologies are created and consumer culture shifts to adapt. Social applications ( and the total number of accounts ) are rapidly proliferating, as is the total number of consumers who use these tools across properties. The interconnectivity of accounts and data is inevitable. Operating a social network means that you will soon be involved in data portability ( if you aren't already ) -- either as a source, transport layer, or consumer of information.

The porting of data raises serious legal questions that vary across implementations -- and creates the potential for numerous liabilities if the execution is wrong. The revelations and recommendations offered below are meant to start dialogues and identify shortcomings -- not provide a legally sound social media strategy.

In Preparation

To take a fair and balanced examination of the Social Media landscape, we need to drop all preconceived notions in the arena.

If you're reading this document from the corporate viewpoint, you must abandon all mandates that "the network" owns all social data. Similarly, if reading this as a user enthusiast, you must drop the supposition that everything belongs to the user.

These are expectations, wants, and beliefs -- not necessarily facts.

Look Beyond the Spin

Just about every tech company and startup today has a high-profile Social Media initiative, and someone on staff who straddles the line between product manager and 'evangelist' -- often having both on their business card. Any time a social media or portability conference has a panel or presentation, one of these roles will inevitably ask or-be-asked-by-a-counterpart - "Why won't you open up?", "What can we do to get you involved", "How can you oppose what the users want?!".

It's the job of evangelists and product managers to stage questions like these, and make it seem like their company and product are at the forefront of user-oriented product design. As these events are covered by bloggers - not reporters - we can expect zero objectivity in our news. Journalists attempt to follow a code of ethics; industry bloggers are deeply involved in 'the scene', often having their subjects as corporate friends/foes, drinking buddies, or significant others.

While many efforts are indeed honest and earnest, as we look past the spin many prove to be simple corporate posturing for the media and industry. Corporations are involved in social media for a reason - it affects their end business goals. As we approach this subject we're going to identify and analyze corporate posturing , disregarding press releases and media events from our debate.

Look Beyond the Culture

Social Network industry members ( developers, pundits, evangelists ) embrace ideologies and cultures of their own -- ones that are often not only cutting edge and ahead of the mainstream, but also tend to be philosophically different from the status quo. As industry professionals, they are far more accustomed to communicating and living their lives online -- exhibiting different (sub)cultural sensitivities to concepts of privacy than the general public does. This cultural divide is exemplified by the typical industry member's eagerness to showcase their online activities and content across networks, while the general public shows a predisposition of apprehension when joining even a single social network to share personal information online.

Isolate... But Still Value these Insider Opinions

The information and viewpoints of involved corporations and individuals shouldn't be discounted or ignored -- they offer a valuable expert perspective; they just need to be contextualized and understood as perspectives, not indisputable facts.

By understanding and examining their perspectives on this situation, we can distill key facts that characterize the social media landscape.

The Facts

Fact #1 - Engagement: It's what everyone really cares about

People like to frame the social media dialogue in terms of "Owning Consumer Generated Content" or "Owning the Social Graph" - but these are just facets of a much larger, and intrinsically different, concept.

The corporate interests in Social Media are overly simplistic - concerns lie with what best generates revenue and drives brands forward. Users are not factored into these equations; numbers reign supreme.

Corporations don't actually care about owning social media or graphs. Most corporations, at the policy making level, don't even know or care about what these things are.

What these interests do care about owning is engagement -- the user's attention span, dialogue/transactions, brand affinity and fostering positive experiences associated with their brand. This is what drives continued association with the brand, and creates repeat customers.

To own a higher percentage ( ideally a monopoly ) on a user's engagement with their properties, corporations at times will become an "Agent of Change" to try and capitalize on opening up a network. At other times a corporation must resist openness and lock down users and data. Given all current popular implementations, neither has a real advantage over the other to the end user. Whether a corporation is trying to tear down or reinforce the walls of social networks is irrelevant - they're simply looking for the best way to capture the user's affinity.

Corporate interests have three ways to manage engagement in the social media arena:

  • Retain user attention in a network ( i.e.: walled gardens )
  • Harness user attention during portability services ( i.e.: cross-network platforms )
  • Grab user attention by converting / porting people from other networks ( i.e.: the portability consumer )

There are different monetization strategies within each realm of Engagement; a specific corporation may target one or all three forms; each one has its own pros and cons for the end-consumer. These details are unimportant to the simple point: corporations don't actually care about ownership, they care about the Engagement. Attempts to own or liberate data are the symptoms and practices of Engagement management, not the real concern.

Fact #2 - Social Media is a complex pairing of Content and Activity

The classic question people ask in social media portability circles is "Who owns my 'friends list'?".

Few questions can elicit such passionate responses. Everyone has an opinion, and everyone's opinion strives to be loudest in the room. While consumers and corporations each rush to insist it is a simple problem, and the solution is that they own data, a careful examination shows the problem to be much more complex.

If we take a detailed look at the concepts and practices behind existing Social Media implementations, it becomes crystal clear that this question has no universal answer. It's almost a trick question -- one that has a different answer by design for each network, and has people dancing around more important topics.

At the heart of the problem is that when people talk about Social Media they all-too-often confuse Content and Activity with one another. This seemingly simple question is perhaps the best illustration of this fact.

To understand this, let's step back for a second and think about how we build our own 'Social Graphs' on the internet.

There are two fundamental ways that I can characterize someone as a 'friend' -- adding them -- to one of my generic social network accounts:

  • I physically enter in their details: name, address, email... or I 'import' them from another service or application. In both methods I provide the social network with data.
  • While using the social network's service, I see a profile or dialog box that offers me an opportunity to 'Add to My Friends' by clicking a button.

In the first example I am actively creating content of a sort. I am inputting contact details into a computer system for it to index and manage for me. I am providing the social network with data. I am figuratively telling this social network "I am friends with the person that this data is representative of".

I am doing something drastically different in the second example. I am not creating content. I am not inputting contact details. I am not providing the social network with data. The network is instead prompting me to respond: "Am I friends with X?". If I respond, then I am interacting with data already input into the system. I am merely providing the system with a confirmation of my relation to a specific existing 'node'. I do not input my own content, instead I rely on content already existing within the system. Instead of providing the social network with data, I provide it with a boolean response: Yes, No; 1, 0. Not an email address, not a name; I send a one or a zero.

The existence of this dichotomy makes the concept of ownership in my social graph incredibly complex: portions of this graph exist because I actively entered them in myself -- providing records, creating nodes, compiling a list of 'email addresses' or otherwise; while the rest represents nodes I have associated myself to, that each contain content created by others.
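To make the dichotomy concrete, here is a minimal sketch of the two ways of adding a friend. The class, method, and field names are all hypothetical and do not describe any real network's data model; the only point is that the first method stores data supplied by me, while the second stores nothing but a confirmed link to a node that already exists.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """A toy social network; every name here is hypothetical."""
    # node id -> content entered by that node's owner
    profiles: dict = field(default_factory=dict)
    # user id -> contact records the user typed in or imported
    uploaded: dict = field(default_factory=dict)
    # (user id, node id) pairs created by clicking a button
    links: set = field(default_factory=set)

    def add_by_content(self, user_id, record):
        # Method 1: the user provides the network with data.
        self.uploaded.setdefault(user_id, []).append(dict(record))

    def add_by_activity(self, user_id, node_id, answer):
        # Method 2: the user only answers "Am I friends with X?".
        # The answer is a boolean; no name or email address is sent.
        if answer and node_id in self.profiles:
            self.links.add((user_id, node_id))


net = Network(profiles={10001: {"name": "Adam"}})
net.add_by_content(1, {"name": "Brian", "email": "brian@example.com"})
net.add_by_activity(1, 10001, True)
```

After both calls, `net.uploaded` holds a record I authored, while `net.links` holds only a pair of ids pointing at content someone else entered.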

_img/2-AddFriend-Facebook-Import.png

A Facebook.com prompt to upload social network contacts.

_img/2-AddFriend-Facebook-Import.png

A Facebook.com prompt to automatically import social contacts from other services.

_img/2-AddFriend-MySpace.png

A MySpace.com prompt to 'click to add' a person as a contact.

This key difference between content and activity is extremely important to the Social Media debate, as it raises numerous concerns in copyright , privacy, and contract law.

If we take the distillation a step further, we can better understand these new - extremely important - concerns.

Once I become 'friends' with a contact ( possibly requiring them to confirm our relation ), I become privileged to access their content: the name, address, phone number, photographs, postings, etc that they have generated.

Across the dozens of Social Networks I belong to, seldom have I ever actually entered in the data on my contacts. I've mostly only clicked "Add as a friend" or "Yes, I do know them!". The actual list of contacts and data entry that I have curated through my social networks has essentially been nothing more than a series of numbers that can be read "I am friends with user ids 10001, 100923, 209213".

Whether I own the data of that activity/listing myself is irrelevant to this current point - the actual value of information I want from this "Social Graph" is not the information that I had entered ; it is not the data or content that belonged to me and was entered into the system for storage; it is the names, addresses, and content that was entered by other users of the system and I have associated myself to.

This is the key point: the 'Social Graph' which I have assembled through social networks is not built through content I have created or uploaded - but comprised of nodes of information that are created by other people or entities and I have referenced and incorporated into my own view.

Is a friends list a simple list of node ids? Is it the content that I created? Is it the content that I referenced? What about content that I referenced that belongs to others?

When someone asks "Who owns the friends list?", the response isn't "The user" or "The network" -- the response is "What exactly is a friends list?" This question can not be met with an answer, only dialogue.

Possible Components of a portable Social Media Friends List

  • the base listing of node ids
  • the nodes + content that the user uploaded
  • the nodes + limited content that the user refers to / incorporates
  • the full content of referenced nodes
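As a rough illustration of why "friends list" has no single meaning, the sketch below returns four different exports for the same account, one per component listed above. The ids, field names, and the choice of which fields count as "limited" are assumptions made up for this example.

```python
# Hypothetical in-network data; ids and field names are invented.
PROFILES = {
    10001: {"name": "Adam", "hometown": "Boston", "posts": ["hello world"]},
    10002: {"name": "Brian", "hometown": "Austin", "posts": []},
}
MY_LINKS = [10001, 10002]  # created by clicking "Add as a friend"
MY_UPLOADS = [{"name": "Carol", "email": "carol@example.com"}]  # typed in by me
LIMITED_FIELDS = ("name",)  # assumed definition of "limited content"


def export_friends_list(level):
    """Return one of four possible readings of 'my friends list'."""
    if level == "ids":
        # the base listing of node ids
        return list(MY_LINKS)
    if level == "uploaded":
        # the nodes + content that the user uploaded
        return [dict(record) for record in MY_UPLOADS]
    if level == "limited":
        # the nodes + limited content that the user refers to
        return [
            {"id": i, **{k: PROFILES[i][k] for k in LIMITED_FIELDS}}
            for i in MY_LINKS
        ]
    if level == "full":
        # the full content of referenced nodes (entered by other people)
        return [{"id": i, **PROFILES[i]} for i in MY_LINKS]
    raise ValueError(f"unknown level: {level}")
```

Each level raises a different ownership question: only the "uploaded" export consists of material I entered myself.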

Fact #3 - Portability efforts are complicated by basic User Privacy concerns

The phrase "Social Graph" contains a word that people often forget about during discussions -- 'Graph'. At the risk of sounding inane - let's discuss the 'Social Graph' in terms of it being an actual graph that we can plot.

Imagine what things would look like if we logged onto one of our networks and tried to plot the Social Graph of our Friends List onto a sheet of Graph Paper:

  • We draw a dot on the paper to represent our account, and label it with our network identifier ( our url, numeric id, etc )
  • We draw 2 dots which each represent one of our friends, and label each with their unique identifier
  • We draw a line from our dot to each of our friends' dots, illustrating and characterizing our friendships

That is literally our Graph within the network: points represent people, lines illustrate relationships. This illustration is incredibly simple, and complete - it is our social graph.

_img/1-GraphOnPaper.png

Assuming that this network is happy and eager to provide us with any portability tools we want, let's think about how we can 'free' this graph for use on other sites and applications.

The first thing we can do is to request a copy of our graph. The network gladly complies, sending us a machine-readable list that looks a little something like this:

  • Follows: 10001
  • Follows: 10002
  • FollowedBy: 10009
  • FollowedBy: 10011
  • Follows_AND_FollowedBy: 10023
  • Follows_AND_FollowedBy: 10024

At its core, that list is our graph: the 'dots' that we are related to, and the type of line that relates us to them.

Unfortunately, this doesn't do much for our needs -- we just see some numbers that only exist and make sense within the network that we're on. We can't actually do anything with this information. We can't really 'port' this graph to another network or application ( unless these other people claim their ids on the networks, so they know what the numbers mean ).
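A small sketch shows how little the bare list carries. It parses the export into labelled edges and then tries to resolve them on a destination network. The `CLAIMED_ON_DESTINATION` mapping and the function names are hypothetical; they stand in for the case where a person has explicitly claimed their old id on the new network.

```python
EXPORT = """\
Follows: 10001
Follows: 10002
FollowedBy: 10009
FollowedBy: 10011
Follows_AND_FollowedBy: 10023
Follows_AND_FollowedBy: 10024
"""

# Hypothetical: ids that people have claimed on the destination network.
CLAIMED_ON_DESTINATION = {10023: "edward-on-thatspace"}


def parse_graph(text):
    """Turn 'Relation: id' lines into (relation, id) edges from our own dot."""
    edges = []
    for line in text.splitlines():
        if not line.strip():
            continue
        relation, node_id = line.split(":")
        edges.append((relation.strip(), int(node_id)))
    return edges


def port(edges, claimed):
    """Split edges into ids the destination can resolve and ids it cannot."""
    resolved = {n: claimed[n] for _, n in edges if n in claimed}
    unresolved = [n for _, n in edges if n not in claimed]
    return resolved, unresolved


edges = parse_graph(EXPORT)
resolved, unresolved = port(edges, CLAIMED_ON_DESTINATION)
```

In this made-up case only one of the six relationships survives the move; the other five ids are just numbers to the destination network.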

The information we need isn't just our 'Social Graph' - but also the information behind the Social Graph -- what it represents.

Since we're on very good terms with this network - and they're committed to giving us any and all tools we need - we decide to ask them for some more information to help us describe these accounts.

They gladly respond with this packet of information:

  • Follows: 10001 { name:'Adam' }
  • Follows: 10002 { name:'Brian' }
  • FollowedBy: 10009 { name:'Christina' }
  • FollowedBy: 10011 { name:'Danielle' }
  • Follows_AND_FollowedBy: 10023 { name:'Edward' }
  • Follows_AND_FollowedBy: 10024 { name:'Faith' }

Now this helps a little -- but not a lot. The network has given us the names of these people - but there are millions of people out there with the same name. We don't have any real way to associate these numbers with who these people really are.

We need more information - we need to know how to resolve these numbers to real people. So we need to ask the network for a little more data that can help us actually port this graph to another application and migrate our relationships.

We're looking for some sort of universal or global identifier - something that can help us look these people up or invite them, or connect with them in some other way. We need something like... something like... their email address!

We ask the network , who we're on great terms with and who is FULLY committed to portability , to help us out by giving us the email addresses of our connections.

The network gladly responds with this packet of information

  • No
  • No way.
  • Seriously?
  • Are you Fucking kidding me? Really. Are you?

Whoa! What happened? Were they annoyed by our constant requests for more info? Did we piss them off by trying to get at something they don't want to port?

No. Not at all. Unfortunately too many people think that. In our example our network is fully committed to opening and porting the graph for us -- but it can't under any sort of reasonable circumstance.

A lot of people in the Data Portability movement believe that this response can only be born out of the network trying to retain the Social Graph, and this is due much in part to believing in a phrase I placed emphasis on above:

We don't have any real way to associate these numbers with who these people really are.

This concept is at the root of a huge cultural discordance between the technology community and general population. On most social networks, we don't know who these people really are -- we know their ids, we know some curated facets of their lives, but we know these people digitally -- we know them online, not in real life.

Proponents of Data Portability tend to live outwardly public lives - showcasing to the world all of their online identities, offline contact methods, and blogging about their daily activities. They are clusters of people who have hundreds ( and often thousands ! ) of contacts on each network -- of likeminded people who want to share personal information with them.

The average internet person is far different - with dozens-to-hundreds of accounts as contacts, and their profiles are quite different as well. On most social networks, detailed information such as a person's email-address/instant-messenger-account/telephone-number/etc isn't considered viewable content -- even by contacts or friends. The few networks that do feature this data (like Facebook) allow the owner of an account to decide who their contact information can be exposed to -- and it is often shielded.

_img/3-Facebook-Privacy.png

A Facebook.com privacy setting, where email can be shown to users.

_img/3-Facebook-Privacy-Custom.png

A Facebook.com custom privacy setting, allowing for granular editing.

The most popular social networks, like MySpace, do not allow this information to be shared -- by design. Their actions aren't about 'locking down' users to a network -- they are about giving people the flexibility of adding contacts and meeting new people without fully exposing themselves.
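The behaviour described here can be sketched as a per-field visibility check. This is an illustration under assumed names only: the three visibility levels, the field names, and the default of shielding anything the owner has not explicitly exposed are my own choices, not the settings of Facebook, MySpace, or any other network.

```python
def visible_fields(profile, settings, viewer_is_friend):
    """Return only the fields the account owner has exposed to this viewer.

    settings maps a field name to "everyone", "friends", or "nobody".
    Fields with no setting are treated as "nobody" (shielded by default).
    """
    exposed = {}
    for key, value in profile.items():
        setting = settings.get(key, "nobody")
        if setting == "everyone" or (setting == "friends" and viewer_is_friend):
            exposed[key] = value
    return exposed


profile = {
    "name": "Danielle",
    "hometown": "Chicago",
    "email": "danielle@example.com",
}
# The owner never opted in to sharing email, so it stays shielded.
settings = {"name": "everyone", "hometown": "friends"}

as_friend = visible_fields(profile, settings, viewer_is_friend=True)
as_stranger = visible_fields(profile, settings, viewer_is_friend=False)
```

Under these settings even a confirmed friend never receives the email address, which is why a portability export built on top of the same check cannot hand it over either.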

While Social Media Industry Insiders strive for the epitome of connectivity and self-promotion with the mentality of "publishing" their lives and contact info online, their end-consumers are apprehensive and wary of these concepts.

Imagine if befriending someone on MySpace meant sharing your email address with them. How would your use of that network be altered? Would you still add all the random bands / friend requests -- knowing that they now have your email and can now send you all sorts of emails, marketing materials or even sell/distribute your contact information? Would you still make new random friends knowing they can immediately do a websearch with your email address, or input it into other networks and find more information about you and your family?

If MySpace announced that subscriber email addresses would be made visible to contacts at noon tomorrow , would your friends-list shrink from 500 to 25 people overnight?

A key concept that too-few people understand is that Social Networks safeguard email addresses and detailed contact information out of common-sense user privacy concerns. This isn't about networks trying to retain users, or hiding behind Terms of Use or Privacy Policies , or even something that should be addressed by a new era of portability friendly privacy policies. This is something necessary and inherent to the functionality and success of social media: being friends on ThisSpace doesn't equate to being friends on ThatSpace or TheOtherSpace.

While working on our First-Generation Portability Solution a few years ago, my company FindMeOn did a lot of focus-group testing to gauge users' feelings towards exposing themselves across networks. The results weren't pretty. We found that most people weren't just wary of sharing information across networks, but of sharing information within networks too.

The average non-industry person we focus-grouped was glad that MySpace didn't share their email address with contacts, but didn't like how the privacy settings were 'all or nothing' for sharing hometown or other info. Most importantly, many said that they would be wary of using services like MySpace if their personal information or contact methods were shared with others.

We weren't able to address intra-network concerns, but we took these notions to heart in concepting our inter-network framework. People were wary of letting their Social, Professional and Familial spheres collide with one another online across networks - so we designed and patented a 'switchboard' system of sharing network awareness and then customizing profiles and data syndication feeds based not only on who the information would be shared with, but the networks they would be exported to.

_img/3-FindMeOn-Privacy-Networks.png

A FindMeOn.com privacy setting screen, allowing network awareness to be configured. On this screen: network identities can be marked as Public, Private, or Friends-Only; settings can be overridden to always expose or hide an identity to the perspective of another social network.

_img/3-FindMeOn-Privacy-Profile.png

A FindMeOn.com privacy setting screen, allowing profile attributes to be configured. Each profile attribute can be hidden-from or exposed-to the perspective of a specific social network.
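The idea of network awareness can be illustrated with a short sketch. To be clear, this is not FindMeOn's implementation: the data layout, the function name, and the network names are hypothetical, and the rules are only a plausible reading of the first screen described above (a default of Public, Private, or Friends-Only per identity, plus per-network overrides that always expose or always hide it).

```python
# Hypothetical settings: one entry per network identity the user holds.
IDENTITIES = {
    "ThisSpace": {"default": "public", "overrides": {"ThatSpace": "hide"}},
    "ThatSpace": {"default": "friends-only", "overrides": {}},
    "TheOtherSpace": {"default": "private", "overrides": {"ThisSpace": "expose"}},
}


def identities_visible_from(viewing_network, viewer_is_friend):
    """List the identities shown to someone arriving from viewing_network."""
    visible = []
    for name, rule in IDENTITIES.items():
        override = rule["overrides"].get(viewing_network)
        if override == "hide":
            continue  # always hidden from this network's perspective
        if override == "expose":
            visible.append(name)  # always shown to this network
        elif rule["default"] == "public":
            visible.append(name)
        elif rule["default"] == "friends-only" and viewer_is_friend:
            visible.append(name)
        # "private" with no override is never shown
    return visible


from_this = identities_visible_from("ThisSpace", viewer_is_friend=False)
from_that = identities_visible_from("ThatSpace", viewer_is_friend=True)
```

With these settings a stranger arriving from ThisSpace sees the ThisSpace and TheOtherSpace identities, while a friend arriving from ThatSpace sees only the ThatSpace identity, keeping the two spheres apart.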

Casual social network users and industry veterans often told us: "We love this control over sharing our own information online" and "If only every network had these controls inside". ( They also often said "If only this were more user friendly." Actually, they said that last bit a whole lot! )

Fact #4 - Portability efforts are complicated by Copyright Concerns

The portability debate is even more complicated as 'viewable' information can be usable or not due to copyright concerns.

We earlier realized how much of the information and data on the nodes of your own social graph was entered by other parties -- and you just created an association to that material on your graph through Activity - not content creation. While you may own the link to the data by importing it onto your graph/account, the data that you link to may actually belong to someone else.

The question naturally arises: "Who might that data belong to?"

There are four roles that can effectively own information on your social graph:

  • You
  • The Content Creator
  • The Network
  • No-one

To better understand how these roles relate to Data Portability, we need a quick legal primer. I am not a lawyer , nor do I play one on TV -- but I did a fair share of pre-Law in undergrad, most of my friends have JDs, and my company FindMeOn is actively working with 5 of the country's best law firms. I may be an armchair 'legalist' - but I'm a decent one.

In 1991 the landmark US Supreme Court case "Feist v. Rural" set current precedent on 'information' copyright in regards to 'originality'. Rural Telephone Service was a small telephone company that sued Feist Publications for copying their local telephone listings and printing them in a regional phone book. The Supreme Court reasoned that Rural's content was not copyrightable, and Feist broke no law.

The court described the essence of their new precedent on US Copyright Law as follows ( summarized from some articles and Wikipedia ):

  • Copyright was designed to promote and encourage creativity and creative expression
  • as such, information itself is not copyrightable
  • collections of information can be copyrightable if there are creative aspects to the collection itself: how the collection is edited/curated or presented
  • in situations where collections are copyrighted, the information is not copyrighted -- just the collection

Assuming that a Social Network is willing to provide us with all the tools and information we want, let's apply these legal interpretations of case law to social media.

  • Copyright does not extend to 'contact' information on social networks or otherwise: pure data is not copyrightable, only created works or creatively curated lists.
  • It would be difficult to argue creative exercise in assembling a friends list, so a user claiming copyright ownership over their graph is not likely to succeed.
  • The network would be hard-pressed to find a way to claim any copyright or other ownership on this information.

So if we're talking about "who owns the social graph?" in terms of copyright , copyright law is wholly inapplicable to most of the argument -- the data just isn't copyrightable.

Does the user own copyright on this data? No. Does the network own copyright on this data? No.

Neither the user nor the network can own copyright on the basic information that exists in lists of nodes / contacts -- this information is simple raw data.

If we assume that there are no privacy settings in place, and we're able to access the email address or other raw information of users on Social Media websites, that information would not be copyrightable either.

So in a very legalist manner, from the perspective of copyright, the core of information powering the social graph is not ownable -- it's just raw data.

However 9 out of 10 times the information I want... what is of value to me... that which exists beyond the basic nodes of a "friends list" and the raw data that it relates to - may indeed fall under copyright or ownership issues.

More importantly, accessing this data is a completely different story.

Fact #5 - Portability efforts are really complicated by Contract Law

Corporate lawyers are smart -- perhaps too smart: where copyright law will fail, they use contract law to succeed.

Feist vs. Rural set the standard on what is copyrightable and what is not -- however just because data is accessible doesn't mean that it is reusable or redistributable.

In 2000 a startup called Gigmania billed itself as the premier source of live music information, offering hordes of concert data to fans online. Instead of compiling or sourcing the information themselves, the Gigmania team had another idea - they data-scraped the entirety of Pollstar's website to redistribute their information as their own.

Gigmania's management thought they were savvy and covered -- they knew about Feist and that raw data was not copyrightable; they also figured they'd never get caught by Pollstar. But Pollstar caught them, and sued.

As a security measure commonplace in the publishing industry, Pollstar published fictitious concert listings of fictitious bands in fictitious cities -- which started showing up on the Gigmania website. When Pollstar started seeing a competitor showcasing information they made up, they figured they had a good copyright case.

So Pollstar sued Gigmania and won, but not on their original suit. The original claims of copyright infringement were dismissed after lengthy debates on timeliness and "hot news" -- but the court did find Gigmania guilty of violating contract law: Pollstar had a link to their Terms Of Service page on their website, which explicitly prohibited commercial use and redistribution. By repeatedly visiting the site, Gigmania was found to have accepted the terms of use contract.

Court cases prior-to and since Pollstar v. Gigmania have set boundaries on what constitutes a valid Terms of Service assent ( based on linkage and usage and context and other complicated stuff ). Fortunately the rationales are mostly irrelevant: when consumers sign-up for and use social networks, they almost always end up clicking a box that says "I have read and agree to the Terms Of Service" - and then become bound by that usage agreement. Large web-services firms also routinely send startups 'cease and desist' letters revoking authorization to use the network.

The key point here is that even if we have no hindrances from privacy policies or copyright, a cleverly worded Terms of Service agreement can completely lock down a social graph.

At this point, our debate needs to shift from the legal standpoint -- where we have no standing -- to the ethical one.

As a consumer, there's a lot of information that matters to me on social networks:

  • raw data that I've entered ( contact info )
  • creative content that I've entered ( comments, posts, pictures )
  • information that I've collected through activity
  • information that I've generated through activity

In the legal sense, unless the Terms Of Use on a network has a hidden clause that re-assigns copyright from me to the network ( some networks do try to pull that ), I either own much of this information or no one does; and contract law defines how I can access and use it through the network.

In spirit, in a truly fair world, that data is all mine to access, use, and distribute as I see fit. The network is essentially just a middle man, who associates the numbers in my list with the data on other people's lists and gives me an interface to store content. I exclaim, "It's a moral outrage that the network will not let me port this information!"

But that is my view as a consumer.

Corporations have a different opinion:

The corporation, as a social network, offered me certain packaged services and abilities -- for no direct monetary compensation (though many earn income through advertising to me or selling information based on my profile). Dissatisfied with the service, or lured by a shiny new one somewhere else, I now want to leave ( or use to a lesser extent ) the service, and expect the network to provide me with the ability -- again, for free -- to ease my discontinuation of the service. The network exclaims, "Are they kidding me? We're supposed to do R&D and hire people for this? And fund/maintain the software and hardware! What are they thinking?"

Users have a valid ethical point in maintaining access to this data -- but corporations do too. Contract law may come off as an 'evil' to many consumers, but it is a business necessity to the network operators. Beyond 'content', much of the data that social networks and applications generate through activity is the essence of the network and their business logic. Should a network be expected to fund a user moving to a competitor -- or export their own prized metrics, recommendations and filterings?

Fact #6 - Don't Forget, Social Media is both Social and Media: It's interconnected by definition

Social Media has turned publishing into a dynamic dialog: people no longer publish solitary postings, but engage in constant discourse with others -- making their primary content contributions and comments/replies important elements of a larger discussion.

Often associated with the concept of portability is that of destruction -- that which happens when a user decides to deactivate an account.

While the desire of a single user to terminate their account and delete activity is a fair and obvious expectation, it is now balanced by the social aspect of online media which lends to certain expectations of other users that have already seen or even interacted with content.

Some concerns believe that once an account is deactivated, all information should be removed. Others believe this to be incompatible with new concerns created by social interdependencies.

What happens to in-network messages when one participant destructs their account?

  • Are both sides of the conversation totally destroyed ?
  • Is only the destructed account side destroyed ?
  • Are both sides maintained in the remaining user's inbox and sent folders ?

What happens to threaded conversations that destructed accounts participated in, or media that they commented on in concert with others ?

  • Are their nodes in the conversation removed ?
  • Are their nodes in the conversation turned blank ?
  • Are the entire conversations destroyed ?

What happens to media that destructed accounts published, and others commented on ?

  • Is the entire history of the media and comments destroyed ?
  • Is the media destroyed but comments left ?

Beyond these user-experience based questions, the culture and architecture of the internet has fostered certain technological behaviors to improve usability and experience that raise new concerns about removed content. Information is rarely published from the original source, instead being distributed by cache servers which lighten the load. These cache servers lend to multiple copies of content being 'out of sync' with the master, which can require time for deletions and modifications to effectively propagate across computer networks. While these servers are sometimes operated-by or on-behalf-of the social network, they are more often operated by telecommunications and networking companies striving to increase efficiency. The destruction of accounts and removal of media brings into question to what extent networks are obligated to update content across the distribution infrastructure -- which they may not even control.

Fact #7 - True portability requires a public identifier, universal opt-in, or a third party service.

It's a simple desire -- people want to be able to migrate friendships from one social network to another. In a more perfect world this would be met with simple delivery too -- but the correct execution on this want has proven to be everything but easy.

Public Identifiers are not a solution - at all

The immediate urge many have when trying to port relations across networks is to use a universal public identifier -- like an email address or OpenID endpoint. By reducing all social network relations to their corresponding email addresses, we have a simple and straightforward way to download our social graph and import it onto a new network.

Unfortunately this idea is quite awful and grossly inept, born of incompetence and irresponsibility. It relies entirely on assumptions that are incorrect, and raises serious privacy concerns.

  • People do not use one and only one email address for all social networking. Many use a mixture of email addresses as they add accounts over time and shift between primary addresses; many also use a home email address for some networks and work/school for others.
  • Social Network Friendships do not equate to 'email' friendships. Being friends with someone on MySpace does not mean you want to expose your email address to them.
  • Social Network relations are not all created equal. Being friends with someone on one network does not mean wanting to befriend or expose yourself to them on another.
    • Users are increasingly worried about trying to maintain the character of their contact lists. A sense of awkwardness is created when someone is issued a Facebook or LinkedIn connection request from an old friend or casual acquaintance. "Do I have to add them?" "Can I reject/ignore them?" "What will happen?" Users must balance the feelings of others against their own desires to safeguard their own contact information or the identities of their friends.
    • This isn't just a matter of creating friendships, but also one of account knowledge / existence. Many users cringe at the thought of someone being able to divine their Facebook account or Flickr family photos through information shared on their MySpace or Blog identities.
  • On the technical level, portability strategies that consist of a universal identifier are extremely dangerous to user privacy. By reducing all accounts to a single universal identifier, one can trace the relations of all social network accounts to that single identifier - and query multiple networks to find identities through that identifier. This doesn't solve social network issues, it creates them by collapsing the social graph and conflating user identities. Casual, professional and familial facets of an identity collide; employers can discover unprofessional 'off hour' activities, families can see unapproved behavior documented.
    • Several high-profile cases have occurred in the past year where employees have lost their jobs because of MySpace or Facebook activity that was linked to their work accounts; a number of gay students have been inadvertently 'outed' to conservative families through information shared online as well.
  • Connecting with someone through a Social Network means connecting with that person within the context of the network - not in all of the universe. This is not a brand issue , this is a context issue.
    • Imagine signing up for a new professional social network where you post your resume and say "looking for new projects". If your boss were to join that network, and an automatic friendship migration happened -- which many Social Media Industry Professionals are advocating -- your boss would instantly be notified that you were looking for a new job as your friendship is ported and they are directed to your profile.
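The collapse described in the list above can be made concrete with a small sketch. It is only an illustration under assumed data: the network names come from this essay, but the account records, field names, and the `collapse_by_identifier` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical account records, as three separate networks might hold them.
# One person has used the same email address on all three.
accounts = [
    {"network": "MySpace", "account": "party_dave", "email": "dave@example.com"},
    {"network": "LinkedIn", "account": "David Smith, CPA", "email": "dave@example.com"},
    {"network": "Flickr", "account": "smith_family_pics", "email": "dave@example.com"},
    {"network": "MySpace", "account": "jenny", "email": "jenny@example.com"},
]


def collapse_by_identifier(records):
    """Group every account that shares a universal identifier (here: email)."""
    merged = defaultdict(list)
    for record in records:
        merged[record["email"]].append((record["network"], record["account"]))
    return dict(merged)


profiles = collapse_by_identifier(accounts)

# A single lookup now exposes the casual, professional and family facets together.
print(profiles["dave@example.com"])
```

Nothing in the sketch is clever, and that is the point: once every network keys its users to the same identifier, anyone holding that identifier can perform this join.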

Social Network Industry members are consummate networkers and social media advocates -- they're on dozens of networks and services, actively promoting their corporate brands to the masses, and personal brands ( their own different identities ) against one another. These individuals want to drive consumers back to a single point with all the information about them, their corporation, their products - it's a cultural and philosophical positioning.

But this mindset is, simplest put, not how the general public feels about networking.

If you look at a Social Network Industry Member's online profiles, you'll find them generally jam-packed with personal info and linkage, touting their presences on each network against each other, and showing all of their contact info.

This is a stark contrast to the profiles of random members in the General Population -- who leave most fields with personal info and contact methods blank -- making heavy use of privacy settings and 'optional' fields. Again looking at MySpace - the world's most successful network - vast percentages of users share the bare minimum of information required, leaving their colleges/hometowns/employers/interests/age/sometimes-everything-optional blank.

Universal Opt-In is too complex and laborious

Universal opt-in is the most secure and privacy minded way to approach portability, but far too complex and laborious.

The underlying concept is fairly simple - users provide each network with a listing of the external identities they want contacts to know about, and can configure which contacts can see which identities. This is dubbed "Universal Opt-In" as every person on every network would have to opt-in to this sort of program for it to work.

When a user wants to migrate their graph, they would only need to migrate the node-ids of their contacts -- the new network would be able to relate those node-ids to users on their network who have claimed them. Since the network can do all of this behind the scenes, it would be able to issue friend-requests to its users without exposing their identity or existence on the network to the requester -- safeguarding their privacy should they not want to be known to them there.
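A minimal sketch of that behind-the-scenes matching might look like the following. The node-id format, the claims registry, and the `migrate_graph` function are all assumptions made for illustration; they are not part of any existing network's API.

```python
# Hypothetical claims registry held privately by the NEW network:
# external node-id -> local user who opted in and claimed that identity.
claimed_nodes = {
    "myspace:1001": "alice_new",
    "myspace:1002": "bob_new",
}

# Node-ids of the migrating user's contacts, exported from the old network.
imported_contacts = ["myspace:1001", "myspace:1002", "myspace:1003"]


def migrate_graph(requester, contact_node_ids, claims):
    """Queue friend requests for claimed nodes without revealing matches."""
    pending = []  # visible only to the network and to each recipient
    for node_id in contact_node_ids:
        local_user = claims.get(node_id)
        if local_user is not None:
            pending.append({"to": local_user, "from": requester})
    # The requester receives the same receipt whether or not anyone matched.
    receipt = {"submitted": len(contact_node_ids)}
    return pending, receipt


pending, receipt = migrate_graph("carol_new", imported_contacts, claimed_nodes)
print(receipt)
```

The requester only ever sees the receipt. Whether `myspace:1003` is absent from the network or simply chose not to claim that identity is indistinguishable from the outside, which is what protects the contact's privacy.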

Third Party Services

After working through these same privacy and technology issues, I launched a Third-Party-Services solution through FindMeOn.com in mid-2006. Any "Third Party" approach would offer a combination of the technologies we initially promoted:

  • A third party can manage "Universal Opt-In" in a single, central, place -- essentially tracking and replicating nodes on social graphs across multiple networks. We call this Social Mapping.
    • Information can be opted-in to the social map, or can be derived from publicly published information ( i.e.: rel="me" or other public assertions of Intra-Personal identity )
  • A third party can utilize people search technology based on public data to suggest matches and relations. FindMeOn experimented with this and found it very useful for sanitizing plaintext Intra-Personal identity assertions ( vs machine readable ones ) and advertising technology -- however using People Search for Social Portability and "you are connected to this person through _____" dialogs brings up numerous privacy concerns. Using People Search in this context hinges on the commitment of a corporation to not perform the best-possible-match, and instead dumb-down their technology to make the best matches possible based on limited data sets that ensure privacy.

A drawback of third party services is that they may be covered by patents or pending patent applications, resulting in licensing costs or legal dilemmas on implementation.

Recommendations

I've created the following Recommendation Guidelines in developing Privacy and Terms of Service agreements for Social Media Strategies.

These solutions to Social Media problems are embodied in four tenets: Simplification, Clarification, Transparency, and Consumer Privacy.

Recommendation - Data Ownership : Clear Ownership Guidelines

Networks and Users should have a clear and simple understanding on data ownership.

Users should own their uploaded and created content, and networks should receive the right to publish that content in the original context for the originally intended use and reasonable advertising.

Networks must abandon attempts to assume complete ownership -- especially in efforts to gain the abilities to sell T-Shirts, take out ads using the likenesses of their subscribers, or otherwise exploit their userbase. Efforts like that do nothing short of abusing users' trust, create animosity/worry/bad press, and cost the firm countless dollars to maintain legal teams and contracts.

The majority of users, if asked to be featured in a media campaign for a brand they like, will gladly jump at the opportunity for little or no compensation. Users like the attention and popularity advertising can bring them, the 'cool factor' of affinity with brands they admire, and are eager for the chance of having their content featured. Instead of spending hundreds of thousands of dollars on legal agreements and PR campaigns to explain them, networks should just use simple, clear, standardized TOS agreements -- then send a quick email offering a chance to be featured in exchange for signing a simple, clear, standardized model release or licensing form. It's simple, it's fair, it's the way business should be done. Networks should put an end to consumer-antagonistic small text in legal documents immediately.

Networks should clearly state who owns activity-based content, such as metrics.

Recommendation - Data Access (Exporting Content & Portability): Freely Export Fair Content to User

Users feel they should own their content, whether they created it, uploaded data, or even 'click-incorporated' non-ownable content.

Let's forget about legal issues, let's talk about ethics: Anything a user puts into your system or generates through direct action, they should be able to pull out. Networks should just give this content to them.

Networks should allow content to be freely exported to users. If people want the ability to move their content and associations between networks , networks should just let them do it.

A network should not be required or expected to ease the business models of its competitors or violate the privacy of its own users.

The corporate interests that are pushing for interoperation are smaller networks looking to co-opt users, service providers looking to monetize the transition, and large shops like Google and Comcast who can use the content within larger advertising and data mining services.

There exists no legal or moral reason why a network should be expected to service these competitive corporate interests.

There exists every moral reason for a network to allow their users to download in-network messages, postings, comments, photos, etc - in some sort of readable format. This should be considered Fair Content -- data and content they directly created or were given.

Finding balance is simple: there is no need for a global standard. Networks should not be required to make APIs or data available to machines or other networks, and they should be allowed to throttle the frequency and bandwidth of this sort of action within reasonable limits unless payment is received.

The goals of portability are not Corporate Based but user based.

If a user wants to transfer or link their accounts, they should be able to easily do that -- however that ability does not require one network opening up to others or 'account transportation' specialists. Users can easily download data themselves and upload information to other services themselves. Expecting or mandating the outgoing network to handle this is unnecessary and slightly ridiculous.

Networks should simply offer the ability to download user materials for their own use. The other networks and groups can handle easing the import issues as they see fit -- and they gladly will.

Users, developers, and networks must acknowledge that this sort of transfer is happening at the cost of bandwidth, server resources, and development/product-management hours of the outgoing network. While this is the user's data in spirit or law , the outgoing network is effectively paying for the standardization and transfer of this data.

Networks also track other data - favorites, listening/usage patterns , recommendations, etc -- information and content that is tied into their business logic. It's not a reasonably fair expectation of the user to demand this content be portable too. It would be a great statement of a network to allow the export of this data - and many do already - but expecting this to be exportable is a stretch of reason. Business logic content is not fair content in terms of portability, and people shouldn't expect it as such.

Recommendation - Account Destruction : Retraction Prohibitive Licensing Agreements

User Generated Content that is posted online should be thought of similarly to how Open Source projects are distributed online: Once a given item is published and in accordance with a specific license or terms-of-service, that item is effectively 'put in the wild' where anyone has access to view, cache, and possibly redistribute.

The copyright holder of an Open Source project can not retroactively rescind a license - they may only create new licensing terms for future downloads/versions.

For example: if I were to publish the source code to an application that I release under the liberal MIT license, I can not change my mind, re-release the software under the GPL and force a retroactive relicensing through an alert to downloaders "I changed my mind! This product is now under the GPL". I may only:

  • stop distributing the version packaged with the MIT license
  • apply the new license to new downloads and future versions

However-- anyone who has downloaded my software under the original MIT license has the right to use and redistribute the software under the original MIT license with which it was obtained.

Similarly, once I publicly distribute content online and allow people to access it -- literally downloading exact copies for their own use -- I lose certain behaviors that I may be accustomed to. This is not to suggest that I no longer own the content or the credit for it -- I most certainly do, and enjoy the full rights as such -- it just means that I have publicly published this content and it is 'in the wild'.

Social Networks should require users to either grant irrevocable licenses or use Creative Commons licenses to ensure communities can still engage with content after their creators have left or expressed the desire to destroy the content.

People must remember that if they were publishing these same contents in a 'letter to the editor' of a magazine or a book, they would exist hundreds of years later in archives and libraries. Completely erasing public digital footprints is a silly expectation when one is publishing content to the world or engaging in community dialog.

Networks should also not have to worry about the complexities and legal ramifications of retracting content throughout caching distribution chains. Retraction Prohibitive Licensing Agreements would eliminate that need, while ensuring continuity of experience for other users.

Users should maintain the ability to destroy their profile and content that has not been referred to / interacted with - but they must realize that in most cases they are publicly publishing content -- and you can not unsay words or undo actions.
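As a rough sketch of how a network might apply this rule when an account is destroyed: the `Item` record, its `interactions` counter, and the `destroy_account` function are hypothetical names chosen for illustration, and a real system would need a richer notion of "interacted with".

```python
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    owner: str
    interactions: int  # comments, replies, or references made by OTHER users


def destroy_account(owner, items):
    """Split the owner's items into those deleted and those retained."""
    deleted, retained = [], []
    for item in items:
        if item.owner != owner:
            continue  # other people's content is never touched
        if item.interactions > 0:
            retained.append(item.item_id)  # stays up under the original license
        else:
            deleted.append(item.item_id)
    return deleted, retained


items = [
    Item("photo-1", "dave", 0),
    Item("post-7", "dave", 12),
    Item("post-9", "jenny", 3),
]
deleted, retained = destroy_account("dave", items)
print(deleted, retained)
```

Here the untouched photo disappears with the account, while the post that twelve other people replied to remains part of the community's conversation.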

Content retraction and destruction of Social Media should be handled by Time Machines , Shooting Stars and Fairy Godmothers -- not network policies or obligations.

Recommendation - Social Graph Portability : Allow Portability ; Enforce Strict User Privacy

Networks should allow users to download and migrate their Social Graphs -- but not at the cost of privacy to other users in the system.

If email addresses or other contact information are viewable to users of a social network -- by default or through explicit opt-in -- that information should be made portable.

Under no conditions should any form of contact information or the global/unique identifier of a user be made visible to other users of a system or 3rd party networks without the explicit knowledge and consent of the user.

In practice, this means that sites like MySpace.com that do not publish the email addresses of contacts should never publish lists of email addresses to ease portability -- doing so would grossly violate the consumer's expectation of privacy. Additionally, sites should not ease portability attempts by indicating the presence of a user through non-publicly published information. For example: if my AOL IM address is marked as 'friends only' or 'unsearchable' on Facebook, non-friends should not be able to search for me through that address on the network and derive my in-network account identifier or presence. The network could issue me a request/warning behind-the-scenes that someone was searching for me by that method - but it should never make my private/privileged association with that data public , or even mention that I belong.
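The search rule in that example can be sketched as follows. The profile store, the visibility labels, and the `search` function are assumptions invented for this illustration, not any network's actual implementation.

```python
# Hypothetical profile store: each identifier carries its own visibility setting.
profiles = {
    "user_42": {
        "aim": ("coolguy99", "friends_only"),
        "email": ("me@example.com", "public"),
    },
}


def search(field, value, store):
    """Return (results shown to the searcher, users to notify privately)."""
    visible, notify = [], []
    for user_id, fields in store.items():
        stored_value, visibility = fields.get(field, (None, None))
        if stored_value != value:
            continue
        if visibility == "public":
            visible.append(user_id)
        else:
            # Never confirm the match; optionally warn the account owner.
            notify.append(user_id)
    return visible, notify


shown, warned = search("aim", "coolguy99", profiles)
print(shown, warned)
```

A search on the private AIM address shows the searcher exactly what a search on a nonexistent address would: nothing. Only the account owner learns that someone was looking.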

Corporations have recently rushed towards something I call "Data Sportability" -- the implementation of hastily conceived portability programs in an effort to be ahead of the competition. These programs have only resulted in glorious sounding press-releases and huge privacy violations and gaffes. Before offering new portability measures, social media properties should:

  • thoroughly test new systems for security
  • notify all users of the new systems and policies in advance
  • require users to opt-in
  • provide users with a set of "Common-Sense Best-Practices" in safeguarding personal data and maintaining privacy with the new features

The recommendation to achieve social graph portability remains twofold:

1- Allow users to export access-cleared information (email address, etc) and search/add contacts on new networks based on the viewability of that information on the new network. If only nodes are available, networks should encourage users to list their nodes on other networks for discovery.

2- Use third-party tools ( such as FindMeOn's technologies, wink wink ) that allow users to centrally manage this sort of node information and privacy settings and port it across networks.

The adding of friends across networks should always be a manual opt-in task -- unless both parties have explicitly declared that they want their social graphs on each network migrated. If two contacts on MySpace were to port their graphs to Facebook, Facebook should prompt the users to decide which contacts to request friendship with -- not automatically create an association. If both contacts explicitly declare, in the import action or in their data source, that they want the entirety or sections of their graph automatically added without confirmation, the network should feel free to honor that request. If only one contact requests that behavior, the network should never honor the request.
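The rule in the preceding paragraph reduces to a simple two-sided check. The function name and the returned labels below are hypothetical; the sketch only shows that automatic connection requires consent from both contacts.

```python
def resolve_import(a_wants_auto, b_wants_auto):
    """Decide how to handle one imported relationship between two contacts."""
    if a_wants_auto and b_wants_auto:
        return "auto_connect"
    # One-sided or absent consent always falls back to a manual prompt.
    return "prompt_manual_request"


print(resolve_import(True, True))
print(resolve_import(True, False))
```

The default is the manual path; automation is the exception that both parties must ask for.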

Copyright & Licensing

Copyright 2008 Jonathan Vanasco

This essay is released under the Creative Commons Attribution-No Derivative Works 3.0 United States License.


This Document was authored in reStructuredText