Social Butterfly: Databases, Graphs and Connection-based Marketing

by Lisabet Sarai

“You’re nobody till somebody friends you…”

That could be the theme song of the twenty-first century. The explosion of social networking has transformed entertainment, commerce, and most certainly, publishing. To someone (like me!) who remembers the world BF (Before Facebook) the proliferation of social networking platforms can be bewildering and disturbing. However, the revealed wisdom from the MPs (Media Pundits) states that you cannot succeed as an author without a well-orchestrated plan for marketing yourself through Facebook, Twitter, Google+, Triberr, LinkedIn, Goodreads, and so on ad nauseam.

This so-called truth generates huge amounts of anxiety in many authors I know. One of my objectives in this Naughty Bits column is to allay some of that anxiety. Another is to give you an idea of how social network platforms work. In fact, if you’ve been following my columns, you already know most of the basic principles. I’ll be talking about a couple of additional technical concepts on which social networks depend. Finally, I want to suggest that core concepts associated with social media marketing do not depend on today’s platforms. You can apply what I call connection-based marketing even if you’re as Facebook-phobic as I am.

Let me start with my own revealed truth: the purpose of social networks is to make money for the people who run them. This money can come from selling ads to sponsors or services to members. It can come from selling data about members to third parties. The money may also come from sources (for instance, investors) who view the attention of a large group of members – the “eyeballs”, in media parlance – as a pool for future sales of products or services.

Facebook may make it possible for families to keep in closer touch, sharing photos, news, and love over long distances. Twitter may help emergency responders find affected individuals in an earthquake or assist political activists in opposing what they perceive as tyranny. There is no question that social networking can have social benefits. Just don’t forget: that’s not the fundamental objective, at least for the big, for-profit sites. At the end of the day, it’s all about money.

Furthermore, the more members a network has, and the more active they are, the larger the potential payoff.

I will leave you to work out the implications on your own. Let’s move on to the tech stuff.

How Do Social Networks Work?

At the most basic level, Facebook is nothing more than a very complicated web site. Like any other web site, it’s based on HTTP and HTML – and you already know how they work (see “HTML 101: Web Basics for Authors”). Of course, social network sites involve huge amounts of active content (see “Did the earth move? A quick tour of active web content”). The specific approaches used will vary, but most of the capabilities of FB and similar sites depend on server-side programs. That is, most of the computations are happening somewhere inside FB’s computers, which create new content and send it back to your browser. The responsive user interface, which pops up a dialog when your mouse moves over a certain area, or provides a list of possible completions when you start to type in a search string, most likely depends on JavaScript programs which run inside your browser and manipulate the appearance of the page.

What distinguishes a social network from other web applications? The core idea behind social networking is the notion of connections between users – “friending”. Much of the appeal of social networking derives from these connections. When you add some information to your page, or engage in other activities on the site, people whom you have friended may be notified – and vice versa. In many cases you can also add information or record your opinions on the pages belonging to your friends. You can share all sorts of material, including images, videos or links, and you can make recommendations to the people with whom you are connected. The possible interactions aren’t limited by time and distance, and in fact go far beyond the capabilities of face-to-face conversation.

Furthermore, the site itself constantly suggests new users whom you might want to add as your friends. These are frequently individuals who have some indirect relationship with you, via a common friend, workplace, educational institution, and so on. Sometimes these suggestions will “magically” unearth someone from your past, someone you haven’t heard from in decades. Since having a large number of friends is often viewed as a measure of status or popularity, members are quite likely to act on at least some of these suggestions.

In order to handle these social functions, social networking platforms rely on two types of computational structures: relational data bases and graphs.

Relational Data Bases

Even if you’re seriously tech-challenged, you probably have an intuitive idea about data bases. A data base is simply a collection of information, stored in some persistent form. Your address book could be considered to be a data base. For our Oh Get a Grip blog, we have a data base of previous topics with the year we used them, so that we can avoid repetitions.

Many authors, including me, maintain a spreadsheet to keep track of work we’ve submitted. These are data bases, too. Mine has columns labeled Title, Submitted To, Submission Date, Status, and Publication Date.

Many data bases are like my spreadsheet. They can be viewed as a set of rows, each of which represents an example of a particular sort of object or thing – in computer jargon, an entity. Each column in the row stores the value of some property or attribute of the entity, so a row in some sense “describes” a specific entity. In my submissions data base, each row is a “submission”, that is, an event in which I sent a book or story to someone in an attempt to get it published. As I get news about a particular submission (for instance, if a story is rejected), I try to remember to update the relevant column in that row.

Social network applications (as well as ecommerce sites and many other types of computer programs) use a somewhat more complex variant on this scheme, called a relational data base (or RDB). As suggested by the name, relational data bases don’t just store information about entities. They also keep track of relationships between entities, especially entities that represent different kinds of objects or concepts. So usually a relational data base has more than one set of rows (more than one table, in RDB terminology). Relationships are represented by matching up information in the different tables.

Let’s consider an example. Suppose we have a table of information about authors. (I apologize for making up some of the information below about my good friends Ashley Lister and M. Christian…!)

AuthorID  Name           Nationality  FirstPubbed  BirthYear
A1000     Lisabet Sarai  American     1999         1953
A1001     Ashley Lister  British      1997         1965
A1002     M. Christian   American     1993         1962
A1003     ….

This is fine, but we’d really like to know more about these authors. For example, what genres do they write? We can create a second table (let’s call it AuthorGenre) to store that information.

AuthorID Genre
A1000 Erotica
A1000 GLBT
A1000 BDSM
A1000 Paranormal
A1000 ScienceFiction
A1000 Romance
A1001 Erotica
A1001 Horror
A1001 Nonfiction
A1001 Humor
A1001 ScienceFiction
A1001 Mystery
A1002 BDSM
A1002 Horror
A1002 GLBT
A1002 SocialCommentary
A1002 LiteraryFiction

Why don’t we just store the genre information as part of the Authors table? Mainly because many authors write in more than one genre. We could create columns named Genre1, Genre2, etc. but how many slots should we provide? Storing the relationship between author and genre in a separate table allows us to have an unlimited number of genres for any author. Furthermore, we can add new genres at any time. Suppose that I wrote a novel targeted at young adults. We’d just add one more row to the AuthorGenre table, as follows.

AuthorID Genre
A1000 YoungAdult

And why are we using these funny “author ID” strings like “A1000” instead of the author’s name? The primary reason is that author names aren’t guaranteed to be unique. There could be another author somewhere named “Lisabet Sarai”. (Although I hope not!) If we added her to our data base, she’d get a new row in the Authors table, with a brand new, unique AuthorID value. There will be no possibility that anyone will confuse her sweet romances with my steamy smut.
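For readers who like to see things run, here is a minimal sketch of the two tables so far, using the sqlite3 module that ships with Python. The column types, the in-memory data base, and the namesake's ID "A9999" are assumptions made purely for illustration; a real site would choose its own.

```python
import sqlite3

# In-memory data base: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# AuthorID is the primary key, so two authors may share a name but never an ID.
conn.execute("""
    CREATE TABLE Authors (
        AuthorID    TEXT PRIMARY KEY,
        Name        TEXT,
        Nationality TEXT,
        FirstPubbed INTEGER,
        BirthYear   INTEGER
    )
""")
conn.execute("CREATE TABLE AuthorGenre (AuthorID TEXT, Genre TEXT)")

conn.executemany(
    "INSERT INTO Authors VALUES (?, ?, ?, ?, ?)",
    [
        ("A1000", "Lisabet Sarai", "American", 1999, 1953),
        ("A1001", "Ashley Lister", "British", 1997, 1965),
        ("A1002", "M. Christian", "American", 1993, 1962),
    ],
)
conn.executemany(
    "INSERT INTO AuthorGenre VALUES (?, ?)",
    [("A1000", genre) for genre in
     ["Erotica", "GLBT", "BDSM", "Paranormal", "ScienceFiction", "Romance"]],
)

# A new genre is just one more row; the Authors table is untouched.
conn.execute("INSERT INTO AuthorGenre VALUES ('A1000', 'YoungAdult')")

# A second author with the same name gets her own (hypothetical) ID.
conn.execute("INSERT INTO Authors (AuthorID, Name) VALUES ('A9999', 'Lisabet Sarai')")

genres = [row[0] for row in conn.execute(
    "SELECT Genre FROM AuthorGenre WHERE AuthorID = 'A1000' ORDER BY Genre")]
namesakes = [row[0] for row in conn.execute(
    "SELECT AuthorID FROM Authors WHERE Name = 'Lisabet Sarai' ORDER BY AuthorID")]

print(genres)
print(namesakes)
```

Because the genres hang off the ID rather than the name, the namesake starts out with no genres at all, and nothing written about A1000 can leak over to her.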

So what else might we want to store about our authors? Well, clearly it would be nice to know something about the books they’ve written. Let’s add two new tables, Books and BookAuthor.

BookID  Title                                   YearPublished  Pages  PrimaryGenre  SecondaryGenre  ….
B1000   Raw Silk                                1999           220    Erotica       BDSM
B1001   Bodies of Light                         2011           70     Romance       ScienceFiction
B1002   Quarantine                              2012           198    Romance       GLBT
B1003   Swingers: Female Confidential           2010           167    Nonfiction
B1004   Death by Fiction                        2010           360    Mystery       Horror
B1005   The Bachelor Machine                    2003           270    Erotica       ScienceFiction
B1006   Finger’s Breadth                        2011           284    Erotica       Horror
B1007   Coming Together Presents: M. Christian  2010           210    Erotica       GLBT
B1008   ….

 

BookID  AuthorID
B1000   A1000
B1001   A1000
B1002   A1000
B1003   A1001
B1004   A1001
B1005   A1002
B1006   A1002
B1007   A1002
B1007   A1000

The Books table records information about individual books. The BookAuthor table stores the relationship between books and authors.

Why don’t we just store the authors in the book table? Well, many books have more than one author. Since I edited and wrote the introduction to Coming Together Presents: M. Christian, I’ve put two rows in the BookAuthor table for that book, one linking the book to M. Christian, and one to me.

There’s another, more fundamental reason for storing relationships separately, however. Relational data bases, if structured correctly, make it fairly easy to do complicated searches that combine information from multiple tables. In techie terms, this is called “querying the data base”. For instance, I might want to know the names of all the authors in the data base who have written science fiction books. We could execute this query in several steps as follows:

  • Find the BookID for every book in the Books table that has “ScienceFiction” as the primary or secondary genre.
  • For each BookID returned from the first search, find the corresponding AuthorID values in the BookAuthor table.
  • Look up each of these AuthorID values in the Authors table and print the Name.
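The three steps above can be sketched in plain Python, with each table held as a simple list or dictionary. This is only an illustration of the step-by-step approach, not how a data base engine works internally. The variable names are made up, only a few rows are included, and M. Christian's book is linked to A1002, his AuthorID in the Authors table.

```python
# Each table is a list of rows; each row is a tuple of column values.
books = [  # (BookID, Title, PrimaryGenre, SecondaryGenre)
    ("B1000", "Raw Silk", "Erotica", "BDSM"),
    ("B1001", "Bodies of Light", "Romance", "ScienceFiction"),
    ("B1004", "Death by Fiction", "Mystery", "Horror"),
    ("B1005", "The Bachelor Machine", "Erotica", "ScienceFiction"),
]
book_author = [  # (BookID, AuthorID)
    ("B1000", "A1000"),
    ("B1001", "A1000"),
    ("B1004", "A1001"),
    ("B1005", "A1002"),
]
authors = {  # AuthorID -> Name
    "A1000": "Lisabet Sarai",
    "A1001": "Ashley Lister",
    "A1002": "M. Christian",
}

# Step 1: BookIDs whose primary or secondary genre is ScienceFiction.
scifi_books = {book_id for (book_id, _title, primary, secondary) in books
               if "ScienceFiction" in (primary, secondary)}

# Step 2: AuthorIDs linked to those books in the BookAuthor table.
scifi_author_ids = {author_id for (book_id, author_id) in book_author
                    if book_id in scifi_books}

# Step 3: look up each AuthorID and collect the names.
names = sorted(authors[author_id] for author_id in scifi_author_ids)
print(names)  # ['Lisabet Sarai', 'M. Christian']
```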

Relational data bases provide the ability to make this kind of request in a single “question”, using something called Structured Query Language, or SQL. In SQL, we could specify this query as follows:

SELECT DISTINCT Authors.Name
FROM Authors, BookAuthor, Books
WHERE Authors.AuthorID = BookAuthor.AuthorID
  AND BookAuthor.BookID = Books.BookID
  AND (Books.PrimaryGenre = 'ScienceFiction'
       OR Books.SecondaryGenre = 'ScienceFiction')

Don’t worry too much if you don’t follow this. The point is that by matching up the values of columns in different tables, we can pull out exactly the information we want from our four tables. For example, we could answer the following questions:

  • Who wrote BDSM books that were published between 1999 and 2003?
  • What is the longest (in terms of pages) book that was written by Ashley Lister?
  • What genres do M. Christian and Lisabet Sarai have in common?

It turns out that this works much better if we store all relationships between our primary entities (Books, Authors and Genres) in separate tables from the entities themselves.
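If you'd like to try queries like these yourself, the sketch below loads a subset of the sample rows into an in-memory SQLite data base through Python's sqlite3 module and asks the science fiction question plus the three questions listed above. The column names YearPublished, PrimaryGenre and SecondaryGenre, the helper ask(), and the choice of SQLite are assumptions for illustration; the SQL on a real site would differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Authors (AuthorID TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE AuthorGenre (AuthorID TEXT, Genre TEXT);
    CREATE TABLE Books (BookID TEXT PRIMARY KEY, Title TEXT, YearPublished INTEGER,
                        Pages INTEGER, PrimaryGenre TEXT, SecondaryGenre TEXT);
    CREATE TABLE BookAuthor (BookID TEXT, AuthorID TEXT);
""")
conn.executemany("INSERT INTO Authors VALUES (?, ?)", [
    ("A1000", "Lisabet Sarai"), ("A1001", "Ashley Lister"), ("A1002", "M. Christian"),
])
conn.executemany("INSERT INTO AuthorGenre VALUES (?, ?)", [
    ("A1000", "Erotica"), ("A1000", "GLBT"), ("A1000", "BDSM"),
    ("A1000", "Paranormal"), ("A1000", "ScienceFiction"), ("A1000", "Romance"),
    ("A1002", "BDSM"), ("A1002", "Horror"), ("A1002", "GLBT"),
    ("A1002", "SocialCommentary"), ("A1002", "LiteraryFiction"),
])
conn.executemany("INSERT INTO Books VALUES (?, ?, ?, ?, ?, ?)", [
    ("B1000", "Raw Silk", 1999, 220, "Erotica", "BDSM"),
    ("B1001", "Bodies of Light", 2011, 70, "Romance", "ScienceFiction"),
    ("B1003", "Swingers: Female Confidential", 2010, 167, "Nonfiction", None),
    ("B1004", "Death by Fiction", 2010, 360, "Mystery", "Horror"),
    ("B1005", "The Bachelor Machine", 2003, 270, "Erotica", "ScienceFiction"),
])
conn.executemany("INSERT INTO BookAuthor VALUES (?, ?)", [
    ("B1000", "A1000"), ("B1001", "A1000"), ("B1003", "A1001"),
    ("B1004", "A1001"), ("B1005", "A1002"),
])


def ask(sql):
    """Run one query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# The table-matching part shared by the first three queries.
JOINED = """
    FROM Authors, BookAuthor, Books
    WHERE Authors.AuthorID = BookAuthor.AuthorID
      AND BookAuthor.BookID = Books.BookID
"""

scifi_authors = ask(f"""
    SELECT DISTINCT Authors.Name {JOINED}
      AND (Books.PrimaryGenre = 'ScienceFiction'
           OR Books.SecondaryGenre = 'ScienceFiction')
    ORDER BY Authors.Name
""")

bdsm_authors = ask(f"""
    SELECT DISTINCT Authors.Name {JOINED}
      AND (Books.PrimaryGenre = 'BDSM' OR Books.SecondaryGenre = 'BDSM')
      AND Books.YearPublished BETWEEN 1999 AND 2003
""")

longest_by_ashley = ask(f"""
    SELECT Books.Title, Books.Pages {JOINED}
      AND Authors.Name = 'Ashley Lister'
    ORDER BY Books.Pages DESC
    LIMIT 1
""")

shared_genres = ask("""
    SELECT Genre FROM AuthorGenre, Authors
    WHERE AuthorGenre.AuthorID = Authors.AuthorID AND Authors.Name = 'M. Christian'
    INTERSECT
    SELECT Genre FROM AuthorGenre, Authors
    WHERE AuthorGenre.AuthorID = Authors.AuthorID AND Authors.Name = 'Lisabet Sarai'
    ORDER BY Genre
""")

print(scifi_authors, bdsm_authors, longest_by_ashley, shared_genres)
```

Every question reuses the same matching of ID columns across tables; only the final conditions change.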

Of course, we could expand our data base by adding more tables. Suppose we added a Readers table, and then a BookReader table, that linked each reader to the books he or she had read. Then we could ask:

  • Who has read BDSM books by Lisabet Sarai?

For each reader R returned by the first query, we could then ask

  • What BDSM books by Lisabet Sarai has reader R not read?

As you see, I’ve already used our data base to do targeted marketing, by discovering readers who seem to like BDSM and have sampled some of my work in that genre but not all of it. Imagine how much more intelligent our suggestions could be if we also knew what ratings the reader had assigned to each book she had read, or which books she had “liked”.
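Here is a rough sketch of that two-step targeting, again with Python's sqlite3 module. The readers and their reading histories are entirely hypothetical. Because the sample Books table lists only one BDSM title of mine, the sketch runs the same pattern with the Romance genre instead. The function names readers_of and unread_titles are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Authors (AuthorID TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE Books (BookID TEXT PRIMARY KEY, Title TEXT,
                        PrimaryGenre TEXT, SecondaryGenre TEXT);
    CREATE TABLE BookAuthor (BookID TEXT, AuthorID TEXT);
    CREATE TABLE Readers (ReaderID TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE BookReader (BookID TEXT, ReaderID TEXT);

    INSERT INTO Authors VALUES ('A1000', 'Lisabet Sarai'), ('A1001', 'Ashley Lister');
    INSERT INTO Books VALUES
        ('B1000', 'Raw Silk', 'Erotica', 'BDSM'),
        ('B1001', 'Bodies of Light', 'Romance', 'ScienceFiction'),
        ('B1002', 'Quarantine', 'Romance', 'GLBT'),
        ('B1004', 'Death by Fiction', 'Mystery', 'Horror');
    INSERT INTO BookAuthor VALUES
        ('B1000', 'A1000'), ('B1001', 'A1000'), ('B1002', 'A1000'), ('B1004', 'A1001');

    -- Hypothetical readers and reading histories, invented for this sketch.
    INSERT INTO Readers VALUES
        ('R1', 'Reader One'), ('R2', 'Reader Two'), ('R3', 'Reader Three');
    INSERT INTO BookReader VALUES
        ('B1001', 'R1'), ('B1001', 'R2'), ('B1002', 'R2'), ('B1004', 'R3');
""")

# All books by a given author in a given genre (primary or secondary).
BOOKS_BY = """
    SELECT Books.BookID AS BookID, Books.Title AS Title
    FROM Books, BookAuthor, Authors
    WHERE Books.BookID = BookAuthor.BookID
      AND BookAuthor.AuthorID = Authors.AuthorID
      AND Authors.Name = :author
      AND (Books.PrimaryGenre = :genre OR Books.SecondaryGenre = :genre)
"""


def readers_of(author, genre):
    """First query: who has read at least one such book?"""
    sql = f"""
        SELECT DISTINCT ReaderID FROM BookReader
        WHERE BookID IN (SELECT BookID FROM ({BOOKS_BY}))
        ORDER BY ReaderID
    """
    return [row[0] for row in conn.execute(sql, {"author": author, "genre": genre})]


def unread_titles(reader_id, author, genre):
    """Second query: which such books has this reader not read yet?"""
    sql = f"""
        SELECT Title FROM ({BOOKS_BY})
        WHERE BookID NOT IN
              (SELECT BookID FROM BookReader WHERE ReaderID = :reader)
        ORDER BY Title
    """
    params = {"author": author, "genre": genre, "reader": reader_id}
    return [row[0] for row in conn.execute(sql, params)]


suggestions = {
    reader: unread_titles(reader, "Lisabet Sarai", "Romance")
    for reader in readers_of("Lisabet Sarai", "Romance")
}
print(suggestions)  # {'R1': ['Quarantine'], 'R2': []}
```

Reader R1 gets a recommendation, R2 has already read everything in the genre, and R3 never shows up because she has read nothing of mine.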

I think you can see how relational data bases can support some of the features of social networks, like finding and suggesting other people who went to school with you (Facebook) or suggesting new books by authors you’ve already read (Goodreads). But what about the connections between people? How does Facebook figure out that I might know Bob, from the fact that Bob is friends with Alice and Alice is my friend?

We could store friend connections in a relational data base. We would have a Member table that assigned PersonIDs to every member of the site, and then a Friends table that held pairs of PersonID values.

Member

PersonID Name … (more columns)
P00001 Bob
P00002 Alice
P00003 Fred
P00004 Lisabet
P00005 June

Friends

Person1 Person2
P00001 P00002
P00001 P00004
P00002 P00004
P00002 P00005
P00005 P00003
…. …..

The Friends table would quickly become pretty enormous, however. If there are 1000 members of the network, there could be as many as 499,500 rows in the table (one for every possible pair of members), or 999,000 if each friendship is recorded in both directions. More important, however, is the fact that this structure makes it difficult (complicated and time consuming) to answer questions like the following:

  • Who is friends with both Alice and June?
  • Which of Lisabet’s friends has friends who are not also friends of Lisabet?
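To give a feel for the awkwardness, here is one way the first question could be asked of the Member and Friends tables, sketched with Python's sqlite3 module. Since each friendship is stored only once, the query first has to build a both-directions view of the table (called Pairs here, a made-up name) and then match that view against itself. This is just one possible formulation, not necessarily what any real site does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Member (PersonID TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE Friends (Person1 TEXT, Person2 TEXT);
    INSERT INTO Member VALUES
        ('P00001', 'Bob'), ('P00002', 'Alice'), ('P00003', 'Fred'),
        ('P00004', 'Lisabet'), ('P00005', 'June');
    INSERT INTO Friends VALUES
        ('P00001', 'P00002'), ('P00001', 'P00004'), ('P00002', 'P00004'),
        ('P00002', 'P00005'), ('P00005', 'P00003');
""")

MUTUAL_FRIENDS = """
    WITH Pairs(Person, Friend) AS (
        -- Each friendship is stored once, so list it in both directions.
        SELECT Person1, Person2 FROM Friends
        UNION
        SELECT Person2, Person1 FROM Friends
    )
    SELECT m.Name
    FROM Pairs AS x, Pairs AS y, Member AS m
    WHERE x.Friend = y.Friend          -- the same person is a friend of both
      AND m.PersonID = x.Friend
      AND x.Person = (SELECT PersonID FROM Member WHERE Name = ?)
      AND y.Person = (SELECT PersonID FROM Member WHERE Name = ?)
    ORDER BY m.Name
"""


def friends_of_both(name1, name2):
    """Names of members who are friends with both name1 and name2."""
    return [row[0] for row in conn.execute(MUTUAL_FRIENDS, (name1, name2))]


print(friends_of_both("Alice", "June"))  # [] (nobody in the sample data)
print(friends_of_both("Bob", "Alice"))   # ['Lisabet']
```

It works, but the query has to scan and match the whole Friends table against itself, and the second question in the list would need yet another layer of matching.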

Instead of using a relational data base, this sort of network of connections is usually represented as a graph.

A Brief Introduction to Graphs

A graph is a structure for holding information about connections. A graph is composed of nodes (also called vertices) and links (also called edges). (I’ll use the node/link terminology from now on.)

A node is an object, person, thing or concept. A link indicates a relationship or connection of some type between two nodes.

You could use a graph to represent a subway system, in which case the nodes would be subway stations and the links would be the tracks running between them.

[Figure: Boston Subway]

Note that junction stations, where you can switch from one subway line to another (e.g. Park Street, Downtown Crossing), will have more incoming/outgoing links than stations that belong to a single line (e.g. Arlington, South Station).
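As a small illustration, the sketch below builds a graph for a hand-picked fragment of the Boston system and counts the links at each station. The fragment, the variable names and the way lines are listed are choices made for this example. Stations at the edge of the fragment (Copley, Broadway and so on) show fewer links here than they have in the real system.

```python
from collections import defaultdict

# A small fragment of three lines, listed station by station.
lines = {
    "Green": ["Copley", "Arlington", "Boylston", "Park Street", "Government Center"],
    "Red": ["Charles/MGH", "Park Street", "Downtown Crossing", "South Station", "Broadway"],
    "Orange": ["State", "Downtown Crossing", "Chinatown"],
}

# For every station (node), the set of stations one stop away (its links).
subway = defaultdict(set)
for stations in lines.values():
    for here, there in zip(stations, stations[1:]):
        subway[here].add(there)
        subway[there].add(here)  # tracks run both ways

link_counts = {station: len(neighbors) for station, neighbors in subway.items()}

for station in ["Park Street", "Downtown Crossing", "Arlington", "South Station"]:
    print(station, link_counts[station])
```

In this fragment the two junction stations each have four links, while Arlington and South Station have two.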

A graph might represent something more abstract, like the steps in some process. The graph below represents the dependencies a guy faces when getting dressed.

[Figure: Getting Dressed]

The nodes are the various items of clothing (which are also stages in getting dressed). The links (arrows) mean that our hero must put on the item at the start of the arrow before the item at the end of the arrow. If he dons his shoes before his pants, he’s going to have problems! Notice that in this case, unlike the subway example, links are directional. Also there is more than one sequence of steps that will work; that is, there is more than one way to “traverse the graph”.
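Finding a workable order in a graph like this is a standard operation, and Python (version 3.9 or later) includes a helper for it in its graphlib module. The dependencies listed below are assumed for illustration and may not match the diagram exactly; only the pants-before-shoes rule comes straight from the text.

```python
from graphlib import TopologicalSorter

# For each item: the items that must already be on (assumed dependencies).
must_come_first = {
    "pants": {"underwear"},
    "shoes": {"pants", "socks"},
    "belt": {"pants", "shirt"},
    "tie": {"shirt"},
    "jacket": {"tie", "belt"},
}

# One valid way to traverse the graph; graphlib picks among the possibilities.
order = list(TopologicalSorter(must_come_first).static_order())


def is_valid(sequence):
    """True if every item appears after all the items it depends on."""
    position = {item: index for index, item in enumerate(sequence)}
    return all(
        position[before] < position[after]
        for after, prerequisites in must_come_first.items()
        for before in prerequisites
    )


# A different sequence that also works, and one that does not.
shirt_first = ["shirt", "tie", "underwear", "pants", "belt", "jacket", "socks", "shoes"]
shoes_first = ["shoes", "underwear", "pants", "socks", "shirt", "tie", "belt", "jacket"]

print(order)
print(is_valid(order), is_valid(shirt_first), is_valid(shoes_first))
```

The first two sequences pass the check and the shoes-first one fails, which is the "more than one way to traverse the graph" point in miniature.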

In a social network, the nodes are members. The links indicate friend relationships. The diagram below shows a graph that holds the same information as our Friends table in the previous section.

[Figure: Social Network]

I’ve drawn the links as bidirectional. In most social networks, if Lisabet is Bob’s friend, that means that Bob is also Lisabet’s friend.

When we store graphs in a computer, we don’t represent the relationships one by one, as we did in our Friends table. Instead, for each node (each member, in our social network), we store a list of all the other nodes that are adjacent – that is, directly connected – to that node. It turns out that this makes it much easier to follow a path through the set of connections. You don’t need to know the details, but it is simple and efficient to start at one node (e.g. Lisabet) and find all the friends of her friends.
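A minimal sketch of this idea in Python, using the same five members as the Friends table: each member maps to the set of members directly connected to them. The dictionary-of-sets layout and the function names are assumptions for illustration. Real platforms use far more elaborate storage, but the principle is the same.

```python
# The same friendships as the Friends table, by name.
friend_pairs = [
    ("Bob", "Alice"), ("Bob", "Lisabet"), ("Alice", "Lisabet"),
    ("Alice", "June"), ("June", "Fred"),
]

# Adjacency lists: for each member, everyone directly connected to them.
adjacent = {}
for a, b in friend_pairs:
    adjacent.setdefault(a, set()).add(b)
    adjacent.setdefault(b, set()).add(a)  # friendship goes both ways


def friends_of_friends(member):
    """Members two links away who are not already friends of `member`."""
    direct = adjacent[member]
    reachable = set()
    for friend in direct:
        reachable |= adjacent[friend]
    return reachable - direct - {member}


def friends_with_outside_friends(member):
    """Which of member's friends have friends the member doesn't have?"""
    return sorted(
        friend for friend in adjacent[member]
        if adjacent[friend] - adjacent[member] - {member}
    )


print(friends_of_friends("Lisabet"))            # {'June'}
print(adjacent["Alice"] & adjacent["June"])     # set(): nobody is friends with both
print(friends_with_outside_friends("Lisabet"))  # ['Alice']
```

Each question becomes a couple of lookups and set operations starting from one node, with no need to scan every friendship in the network.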

Real social networks probably use multiple sets of graphs for different kinds of relationships. They may also add numerical weights to the links to indicate the strength of a relationship, based on the frequency of interaction. For example, if Lisabet frequently visits Alice’s page or comments on Alice’s photos, but rarely interacts with Bob, the system might be more likely to suggest that Lisabet add Alice’s friends to her own set of connections than Bob’s friends. There are also computational techniques to identify mutually connected clusters in graphs, such as the group of Bob, Alice and Lisabet. A social network might use this kind of information to decide what email notifications to send to whom.
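The sketch below illustrates both ideas on our tiny network. Everything numerical in it is invented: the interaction counts, the scoring rule (add up the weights of the friends who link you to a candidate), and an extra hypothetical member, Carol, who is friends only with Bob so that Bob has someone to offer. The cluster finder simply looks for trios in which everyone is connected to everyone else. Real systems use more sophisticated techniques.

```python
from itertools import combinations

# Adjacency lists for the sample network, plus hypothetical member Carol.
adjacent = {
    "Bob": {"Alice", "Lisabet", "Carol"},
    "Alice": {"Bob", "Lisabet", "June"},
    "Lisabet": {"Bob", "Alice"},
    "June": {"Alice", "Fred"},
    "Fred": {"June"},
    "Carol": {"Bob"},
}

# Invented link weights: how often Lisabet interacts with each friend.
weight = {("Lisabet", "Alice"): 12, ("Lisabet", "Bob"): 1}


def suggestions(member):
    """Rank friends-of-friends by the total weight of the connecting friends."""
    scores = {}
    for friend in adjacent[member]:
        strength = weight.get((member, friend), 0)
        for candidate in adjacent[friend] - adjacent[member] - {member}:
            scores[candidate] = scores.get(candidate, 0) + strength
    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Clusters of three: every pair within the trio is directly connected.
triangles = [
    trio for trio in combinations(sorted(adjacent), 3)
    if all(b in adjacent[a] for a, b in combinations(trio, 2))
]

print(suggestions("Lisabet"))  # ['June', 'Carol']
print(triangles)               # [('Alice', 'Bob', 'Lisabet')]
```

June outranks Carol because she comes recommended by Alice, the friend Lisabet actually interacts with.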

Marketing Using Social Networks

Why have social networks become important tools for marketing? I believe there are two reasons (other than hype):

  • The number of people who might potentially see your marketing material is very large – in the hundreds of millions.
  • The connections between individuals and the ease with which information can be shared mean that one direct marketing impression may result in many indirect impressions. If someone sees your book cover, likes it, and sends the link to her friends, your marketing activity has a larger chance to make an impact on your popularity and sales.

Social networks facilitate connection-based or referral-based marketing. They can be a highly efficient method for diffusing information about you and your work. However, to market effectively using social media, you need to attract the attention of the right people in the first place: people who might like your books and whose friends are also target readers.

Ultimately, there’s nothing special about social network platforms that helps with this critical point. If anything, all the random noise on Facebook — the billions of transactions that occur daily — may interfere with your message. You need to focus your own efforts on the readers most likely to buy your books and then spread the word. But how do you find them?

If I knew the answer to this question, I’d be a lot more popular than I am. However, I believe that you can use the concept of connection-based marketing outside of the social networking environment. The basic message needs to be: if you like my books, tell your friends.

I blog frequently, both at Beyond Romance and as a guest at other blogs. There’s a cadre of readers who follow me around, leaving comments – especially when I’m running a contest! They’ve been known to rave about my books, and of course, I’m delighted when they do so, because the other readers see their opinions and consider giving my books a try.

(I had a reader comment recently: “If you published your Grocery List I’d buy because that’s how talented a Writer you are.” Now how do I find more like her?)

I do at least two or three giveaways a month, where the prize is some title from my back list. I’ve tried contests where you get extra credit for getting someone else to enter. I’m always hoping to spark the interest of a new reader. And I do add fans, slowly, but surely, even though I have neither a personal nor a fan page on Facebook.

There’s another reason I haven’t invested much time or energy in social media. Today’s hot site can easily become tomorrow’s digital ghost town. My abandoned page on MySpace is testimony to the fickleness of the crowd. Since its disastrous IPO, there have been some indications that Facebook is losing popularity. What will be the next big thing? I won’t venture to predict that. However, I’m certain that whatever it is, there will be plenty of self-styled experts exhorting us poor harried authors to jump onto the bandwagon.

It’s up to you to decide whether or not to listen. Either way, I hope that this column has helped demystify the technology and made you realize there’s no magic going on – just the usual bits.

Lisabet Sarai
August 2012


“Naughty Bits: The Erotogeek’s Guide for the Technologically Challenged Author” © 2012 Lisabet Sarai. All rights reserved. Content may not be copied or used in whole or part without written permission from the author.
