Use AWK to find the most popular status
To do this analysis efficiently, we’ll use the command-line language awk, a tool that lets you filter, extract, and transform data files. awk is a very useful tool to have in your bag of tricks. To start, let’s look at a very simple awk program that outputs every line of our facebook.csv file, where we specify the file's delimiter (a comma) using the -F option. You should see the entire file printed to the screen. To output only the status ids (e.g., columns 1 and 2), use the dollar sign ($) to denote columns as follows:
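A minimal sketch of both commands, run against a small made-up stand-in file rather than the real dataset. The column names and rows below are assumptions for illustration only, and the toy file deliberately contains no quoted cells:

```shell
# Toy stand-in for the CSV file (hypothetical columns and rows).
sample=$(mktemp)
cat > "$sample" <<'EOF'
page_id,status_id,status_type
100,100_1,photo
100,100_2,video
EOF

# Print every line: -F sets the field separator, $0 is the whole record.
awk -F "," '{ print $0 }' "$sample"

# Print only columns 1 and 2: $1 and $2 are the first two fields.
# The comma inside print joins them with the output separator (a space by default).
awk -F "," '{ print $1, $2 }' "$sample"
```

On the real file, the same two commands would take the CSV's path in place of "$sample".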
However, since the dataset has quoted ("text") cells, which can themselves contain commas that awk -F "," would split on, we will use csvcut to extract the columns instead. We want to extract the status id column and columns 8-15 into a file called fbreactions.csv. The idea is to sum up all the reactions (columns 8 + ... + 15) on each FB status and then find the status that had the maximum number of reactions.
csvcut -c 2,8-15 facebookdata.csv > fbreactions.csv

To calculate the total number of reactions on each entry (status), all we need to do is horizontally add up the numbers in columns 8-15, which we can do easily with awk as follows:
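The row-wise sum can be sketched as below. This is an illustration on made-up rows, not necessarily the exact command used for the dataset. It assumes the layout that csvcut -c 2,8-15 produces, where the status id is field 1 and the eight count columns are fields 2 through 9; the header names c8 to c15 are placeholders.

```shell
# Toy stand-in for fbreactions.csv (hypothetical header and values).
reactions=$(mktemp)
cat > "$reactions" <<'EOF'
status_id,c8,c9,c10,c11,c12,c13,c14,c15
100_1,5,1,0,2,0,0,1,0
100_2,10,3,2,7,1,0,0,1
100_3,2,0,0,0,0,0,0,0
EOF

# For each data row (NR > 1 skips the header), add fields 2..NF into total,
# print "status id,total", then reset total for the next row.
totals=$(awk -F "," 'NR > 1 {
    for (i = 2; i <= NF; i++) total += $i
    print $1 "," total
    total = 0
}' "$reactions")
printf '%s\n' "$totals"
```

Looping up to NF rather than a fixed column number keeps the sketch independent of how many count columns were extracted.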
Let's pay attention to the awk statement, which not only sums the columns across each row but also prints two outputs on each line (the status id and the total number of reactions on that row). Finally, at the end of each iteration, it resets the running sum with total=0.
To get the status with the maximum number of reactions, we next sort the status ids by the number of reactions (column 2) using the command sort -n -r -t"," -k 2, which tells the system to sort the piped (|) output numerically (-n), in reverse (descending) order (-r), on column 2 (-k 2), with fields delimited by a comma (-t","):
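A small sketch of the sort step, using made-up "status id,total" lines in place of the real awk output (the ids and totals here are assumptions for illustration):

```shell
# Hypothetical "status id,total" lines standing in for the summed output.
sorted=$(printf '100_1,9\n100_2,24\n100_3,2\n' | sort -n -r -t"," -k 2)

# Full ranking, largest total first.
printf '%s\n' "$sorted"

# The first line is the status with the most reactions.
printf '%s\n' "$sorted" | head -n 1
```

Without -n the totals would be compared as text, so "9" would rank above "24"; the numeric flag is what makes the ordering meaningful.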

The final output tells us that the status id 7331091005_10154089857531006 had the maximum number of reactions, a total of 668121. If we now use grep, we can easily find the message that had the largest number of reactions.
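As a rough sketch, the lookup could work like this on a toy file (the rows and messages below are made up; only the id and file name in the final comment come from the text above):

```shell
# Toy stand-in for the full CSV (hypothetical columns and messages).
data=$(mktemp)
cat > "$data" <<'EOF'
page_id,status_id,status_message
100,100_1,"First post, with a comma"
100,100_2,"Second post"
EOF

# -F treats the pattern as a fixed string rather than a regular expression.
match=$(grep -F "100_2" "$data")
printf '%s\n' "$match"

# On the real data the equivalent call would be:
# grep -F "7331091005_10154089857531006" facebookdata.csv
```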

Remarkably, using AWK we've now found that the status "LeBron and the Cavs are tired of being bullied" was the most popular one on that FB page!