We are all connected. Whether you want to or not, you have probably been interacting with other individuals throughout your life. In fact, the average person knows about 1,000 people over the course of a lifetime (Jordan Peterson). Think about what that means: if you take each person you know and look at who they know, you could reach 1,000,000 people (each of your 1,000 knows 1,000), and if you do it a third time you reach 1,000,000,000 people, which is somewhere around 12 to 15% of the world population, depending on the population estimate you use! This is an upper bound, of course, since in practice many of those acquaintances overlap.
We are much more connected than we think, especially today in the era of social media, where a piece of content you produce could go viral very quickly. By the arithmetic above, if a friend of a friend shares your content, you could potentially reach that same share of the world’s population, assuming that each of those people has an account on the platform where you published the content.
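To make the back-of-the-envelope arithmetic explicit, here is a small Python sketch. It assumes that nobody's acquaintances overlap (so the result is an upper bound) and uses a world population of 8 billion, which is my assumption and not a number from the text; with it, the third hop comes out at 12.5%, the same ballpark as the figure above. The names `upper_bound_reach`, `ACQUAINTANCES_PER_PERSON` and `WORLD_POPULATION` are made up for this illustration.

```python
ACQUAINTANCES_PER_PERSON = 1_000   # figure quoted in the text
WORLD_POPULATION = 8_000_000_000   # assumption, not taken from the text


def upper_bound_reach(hops: int, per_person: int = ACQUAINTANCES_PER_PERSON) -> int:
    """People reachable within `hops` steps if no two circles overlap."""
    return per_person ** hops


for hops in (1, 2, 3):
    reach = upper_bound_reach(hops)
    share = reach / WORLD_POPULATION
    print(f"{hops} hop(s): {reach:,} people, {share:.2%} of the assumed population")
```

Real networks are far more clustered than this, so the true reach after three hops is smaller; the point is only how fast the number grows.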
All this connectivity between individuals, items, business constructs, and points of interest that are connected or interact with each other is the subject of a field called social network analysis (SNA), which is derived from the mathematical branch of graph theory. In social network analysis, our goal is to analyze a graph in order to get insights about our points of interest. To use SNA, we must first define three key terms:
1. Node – an instance of the object we are looking at. It could be any object you can think of that interacts with other objects, for example: students, teachers, universities.
2. Edge – a connection between two nodes. The edge represents the interaction itself and is drawn as an arc.
3. Graph – a collection of nodes and edges that represent our world of interest.
Each node and edge can carry any attributes we wish. For example, if a node represents a person, its attributes could be: name, weight, height, age, eye color. For an edge, if the nodes are people, the relation could be friendship, and the attributes of this relation could be: the number of years they have known each other, the place where they first met, the number of fights they have had.
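One way to picture these three definitions is with plain Python dictionaries. This is only a minimal sketch: the people, the university and all attribute values are invented for illustration, and `neighbors` is a hypothetical helper, not part of any library.

```python
# Nodes: node id -> attributes of that node. Nodes may be of different kinds.
NODES = {
    "alice": {"kind": "person", "age": 34, "eye_color": "brown"},
    "bob": {"kind": "person", "age": 29, "eye_color": "green"},
    "state_university": {"kind": "university"},
}

# Edges: an unordered pair of node ids -> attributes of that relation.
EDGES = {
    frozenset({"alice", "bob"}): {"relation": "friends", "years_known": 7, "fights": 2},
    frozenset({"alice", "state_university"}): {"relation": "studies_at"},
}


def neighbors(node_id: str) -> list:
    """Return, in sorted order, every node that shares an edge with `node_id`."""
    return sorted(
        other
        for pair in EDGES
        if node_id in pair
        for other in pair - {node_id}
    )


print(neighbors("alice"))  # ['bob', 'state_university']
print(EDGES[frozenset({"alice", "bob"})]["years_known"])  # 7
```

The graph is simply the two dictionaries taken together. A dedicated graph library would give you the same structure with many ready-made algorithms on top.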
We can see in the image that we have multiple nodes, and between each pair of nodes that interact we have an edge. By looking at the image we can get a visual sense of how complex a social network might get. Another thing we can see is that the nodes in the graph don’t have to represent the same kind of object: we can have nodes that represent different kinds of objects as long as there is a connection between them (an edge).
In one of my projects I had to build a look-alike model to recommend new connections to existing nodes. Without getting into specific details, we used the power of SNA for that, and the results were great. Now, to get a firm understanding of one of the uses of SNA, let’s look at the following example. Let’s say we manage a social media platform and we have a network of friends. Our goal is to offer each person in the network a list of suggested friends. One simple, naïve and minimal solution could be the following:
1. Define our graph as follows:
a. Nodes – each node represents a person.
b. Edges – an edge between two people means that they are friends.
c. Edge attribute – the number of likes they have given each other’s content.
2. For each person, score each of their friends by the number of mutual friends the two share, so that we know how many mutual friends the person has with everyone in their friends list.
3. Sort each friends list in descending order, so that the first friend is the one the person shares the most mutual friends with.
4. For a specific person, P, suggest friends as follows:
a. Take the first friend in P’s friends list (the sorted list from steps 2 and 3); let’s call this friend A.
b. Suggest the first person in A’s friends list (the sorted list built for A in steps 2 and 3) who is not P and is not already a friend of P; let’s call that person AA.
c. Remove AA from A’s list.
d. Repeat steps b-c until we have gone through all the friends in A’s list.
e. Remove A from P’s list, since A has now been processed.
f. Repeat steps a-e until we have gone through the entire friends list of P.
g. Repeat steps a-f for every person in the network.
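The steps above can be sketched in a few lines of Python. Treat this as one possible reading of the algorithm under a few assumptions of mine: the toy network and the names `FRIENDS`, `rank_friends` and `suggest_friends` are invented, ties in the ranking are broken alphabetically, a person who is reachable through several friends is suggested only once, and the likes attribute from step 1 is not used. Instead of removing entries from the lists, the code simply iterates over them in order, which visits the same people in the same order.

```python
from typing import Dict, List, Set

# Hypothetical toy network: each person maps to the set of their friends.
# Friendship is mutual, so every edge appears in both directions.
FRIENDS: Dict[str, Set[str]] = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol", "erin"},
    "carol": {"alice", "bob", "frank"},
    "dave": {"alice"},
    "erin": {"bob"},
    "frank": {"carol"},
}


def rank_friends(friends: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Steps 2-3: sort each person's friends by mutual-friend count, descending."""
    ranked = {}
    for person, own_friends in friends.items():
        ranked[person] = sorted(
            own_friends,
            # Most mutual friends first; ties broken by name (an assumption).
            key=lambda f: (-len(own_friends & friends[f]), f),
        )
    return ranked


def suggest_friends(
    person: str,
    friends: Dict[str, Set[str]],
    ranked: Dict[str, List[str]],
) -> List[str]:
    """Step 4: walk the ranked friends of `person`, then each friend's ranked friends."""
    suggestions: List[str] = []
    for friend in ranked[person]:          # friend "A" in the text
        for candidate in ranked[friend]:   # candidate "AA" in the text
            already_known = candidate == person or candidate in friends[person]
            if not already_known and candidate not in suggestions:
                suggestions.append(candidate)
    return suggestions


RANKED = rank_friends(FRIENDS)
# Step 4g: repeat for every person in the network.
ALL_SUGGESTIONS = {p: suggest_friends(p, FRIENDS, RANKED) for p in FRIENDS}
print(ALL_SUGGESTIONS["alice"])  # ['erin', 'frank']
```

In this toy network alice is already friends with bob, carol and dave, so the only people left to suggest are erin (through bob) and frank (through carol).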
At the end of this algorithm we will have a list of suggested friends for each person in the network. Of course, we could take more parameters into consideration to make better and better suggestions; this is just a simple example of using SNA. You can see how you could implement this for your own business goals. There are R and Python packages that you can use for solving this kind of problem, like “networkx” for Python and “sna” for R. If you have big data you could use graph databases like Neo4j, or use GraphFrames with Spark over a cluster, which will enable you to solve SNA problems at scale.