Big Data: Google and Dengue Tracking

By Wilson Chua

The image shows Dengue-infected cities in the country.

Is FOI alive?

On Sept. 12, 2016, I sent a formal FOI (Freedom of Information) request to the Office of the Secretary of Health (DOH) for more recent data on Dengue infections in Pangasinan. As of this writing, the DOH has neither replied to nor acknowledged the FOI email request. What options do we have if we cannot get access to the data we need?

Plan B

Our workaround is patterned after “Google Flu Trends”. Google showed (at least in the initial years) that there was a correlation between searches for flu-related words and actual outbreaks of flu in the US.

Data Understanding

Google tracks your search history. Through Google Trends, it exposes data on searches for the “dengue” keyword by users in the Philippines going back to 2004. Every time you enter “Dengue” in the search field, Google counts that search and stores it on its servers. If a million users search Google for “Dengue”, Google records those million searches.

The great thing is that Google allows us to view and extract that data. If we take this dataset and plot it together with the published dengue cases in the Philippines, this is what we get:

[Figure: Dengue-PH-2016. Google search trends for “Dengue” plotted against reported dengue cases in the Philippines.]

The X axis plots the week numbers by year, while the Y axis plots the number of searches and the actual dengue cases. The blue line shows the Google search trend for “Dengue”, and the orange dots represent actual reported cases (based on data published online).

Although I would have wanted more data points for the comparison, notice that there is “some” correlation between the actual number of reported dengue cases and the search trend on Google for “Dengue”. We can also drill down to the cities in the Philippines where the most “Dengue” keyword searches occurred.
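One way to put a number on that “some” correlation is a Pearson coefficient computed over the weeks where both a search value and a reported case count exist. The snippet below is only a minimal sketch: the function name, the week labels and every number in the two dictionaries are made-up placeholders for illustration, not the data behind the chart.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# PLACEHOLDER values, for illustration only (not real search or case data).
search_interest = {"2016-W30": 40, "2016-W31": 55, "2016-W32": 70, "2016-W33": 62}
reported_cases = {"2016-W30": 900, "2016-W32": 1500, "2016-W33": 1300}

# Case counts are published less often than search data,
# so compare only the weeks present in both series.
weeks = sorted(set(search_interest) & set(reported_cases))
r = pearson(
    [search_interest[w] for w in weeks],
    [reported_cases[w] for w in weeks],
)
print(f"Pearson r over {len(weeks)} shared weeks: {r:.2f}")
```

A value near 1 means searches and cases rise and fall together. With only a handful of shared weeks, the result should be read as a hint rather than as proof.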

In the graphic, Baguio City and Imus have some of the highest numbers of searches for “Dengue”. Could this mean that there is a spike in Dengue cases in Baguio City and Imus? “Confirmed,” say my Facebook friends Anton Raphael Orpilla and Humprey Cogay, based on their personal experiences.

How to Do This Yourself

Google search history for “Dengue” can help us plot trends both over time and geographically, down to the nearest city. If another epidemic, say Zika, starts to spread and the DOH is slow to publish its actual data, you can go to Google Trends and look up the keyword searches for “Zika”. Restrict the geography to “Philippines” and limit the time range to the “Last 90 days”. The resulting Google graph will give you a rough idea of how concerned people are about Zika.
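If you download the graph as a CSV file, a few lines of Python can read it and find the day with the highest search interest. This is a rough sketch that assumes the export starts with a short preamble, followed by a two-column header and one row per day, and that very low values may appear as “<1”; check your own file, since the layout may differ. The sample text and the function name are hypothetical, and the numbers are placeholders rather than real search data.

```python
import csv
import io

# PLACEHOLDER sample imitating an assumed Google Trends CSV export layout.
SAMPLE_EXPORT = """Category: All categories

Day,zika: (Philippines)
2016-09-01,12
2016-09-02,15
2016-09-03,<1
2016-09-04,40
"""


def parse_trends_csv(text):
    """Return a list of (day, interest) tuples from exported CSV text."""
    rows = []
    header_seen = False
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # skip the preamble and blank lines
        if not header_seen:
            header_seen = True  # first two-column row is the header
            continue
        day, value = row
        # Treat "<1" (assumed marker for very low interest) as 0.
        rows.append((day, 0 if value.startswith("<") else int(value)))
    return rows


rows = parse_trends_csv(SAMPLE_EXPORT)
peak = max(rows, key=lambda r: r[1])
print(f"Peak interest of {peak[1]} on {peak[0]}")
```

To use it on a real download, replace SAMPLE_EXPORT with the contents of your file, for example by reading it with open(...).read().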

You have just done big data analytics using Google’s data and servers. Congratulations.

