Archive for the ‘Trends’ Category
Swine flu is in. In the rush to cover this latest possible pandemic, newswires are alive with activity, blogs and social networking sites are buzzing, and the CDC and WHO are back in the limelight. This despite the fact that the number of cases is limited (only 40 confirmed infections have occurred in the US).
The rush of news has been accompanied by a rush to track that news. The WSJ, amongst others, has a tracking website, including a map of infections in North America. Best of all, Google has a map showing how the infection is traveling.
This rush was started by Google Flu Trends, a website that tracks flu-related search queries to estimate influenza levels in different US states. Further studies suggested the same approach might work for other diseases as well.
Analyzing Google Trends
So how has Google Trends, the broader application of the Flu Trends concept, performed in the current scenario? A quick analysis shows that Google search activity did in fact increase over the past few days (see chart – source: Google Trends).
A quick analysis shows three items worth mentioning:
- First, while Google Trends does show an increase in search activity on “swine flu,” the first uptick in activity only occurred on April 23. By contrast, the first news stories appeared on April 21 when two cases were confirmed in California.
- Second, Google Trends reports that the majority of search queries were from New Zealand, USA, UK, Canada, and Australia. Only a very small minority were from Mexico. Yet, Mexico is the country supposedly at the heart of the pandemic.
Explaining the Discrepancies
I used a Google Trends-like methodology two years ago to track the evolution of climate change as an issue in news coverage. Having worked on that, I can propose a few general reasons why Google Trends is limited in this case.
First, it appears that Google Trends follows actual infections with some time lag. This should not be surprising, as people are not likely to search for a disease before having had some exposure to it. This does not mean that it is not a useful tool for tracking diseases over the long term. At the very least, the response time of a system based on Google Trends might be lower.
Second, the current scenario shows that Google Trends is highly susceptible to “noise.” Prior to this outbreak, swine flu was probably not a commonly known disease, and queries on it were extremely rare (if not non-existent). Thus, even the slightest uptick in search activity would show up as a major change. That uptick was provided by the highly charged media coverage of the subject. Given this, one wonders if the search queries are more “noise” and fewer people with a genuine interest in the subject. So, Google Trends is likely to be more accurate where general knowledge of a subject (the baseline) is high, and media coverage (noise) is low.
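The baseline effect can be made concrete with a little arithmetic. The sketch below uses made-up daily query counts (not Google Trends data) and a hypothetical helper, `relative_change`, to show how the same number of extra media-driven searches looks enormous for a rare term and negligible for a common one.

```python
def relative_change(baseline, new):
    """Fractional change of `new` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline


# Hypothetical numbers: the same 500 extra searches land on a rare term
# (10 queries a day) and on a common term (10,000 queries a day).
extra = 500
rare_term = relative_change(10, 10 + extra)
common_term = relative_change(10_000, 10_000 + extra)

print(f"rare term: {rare_term:.0%}, common term: {common_term:.0%}")
# prints "rare term: 5000%, common term: 5%"
```

With a near-zero baseline, almost any burst of curiosity registers as a spike, which is why a chart of relative search activity alone says little about how many people are actually affected.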
Finally, and most interestingly, why is it that most of the search queries came from the US, while Mexico is more exposed to the disease? Not surprisingly, this methodology only works where a large share of both the population and the media is online.
What Next for Google Trends?
When discussing why most search queries occurred in the US, it is worth noting another fact about the swine flu outbreak – that it has traveled extremely fast. Originating in Mexico, it has been carried to the USA, Spain, and New Zealand. This brings into question the validity of using the geographic source of search queries as a reliable indicator of where the disease actually is.
Still, it may also offer a way to enhance Google Trends. What if Google Trends data were combined with travel data on the number of people traveling from a “hotspot” of an infectious disease? It would be logical to assume that popular destinations, or ones which receive travel groups, would be the most likely next locations for further infections. Thus, a map could potentially be created of not only where the disease is generating interest, but where it might be headed.
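As a rough illustration of that idea, the sketch below ranks destinations by their share of travelers arriving from a hotspot and lists search interest alongside. Everything in it is an assumption made for illustration: the function name `exposure_ranking`, the placeholder destinations, and the numbers. It is not a description of anything Google does.

```python
def exposure_ranking(travelers_from_hotspot, search_interest):
    """Return (place, share_of_travelers, search_interest) rows,
    sorted by share of travelers arriving from the hotspot."""
    total = sum(travelers_from_hotspot.values())
    if total <= 0:
        raise ValueError("need at least one traveler")
    rows = [
        (place, inbound / total, search_interest.get(place, 0))
        for place, inbound in travelers_from_hotspot.items()
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Hypothetical weekly traveler counts and a 0-100 search-interest index.
travelers = {"Destination A": 6000, "Destination B": 3000, "Destination C": 1000}
interest = {"Destination A": 20, "Destination C": 90}

for place, share, index in exposure_ranking(travelers, interest):
    print(f"{place}: {share:.0%} of travelers, search interest {index}")
```

In this toy data, Destination C shows the most search interest but receives the fewest travelers, while Destination B receives many travelers and shows no interest yet. Keeping the two columns side by side separates "where people are searching" from "where the disease might be headed."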
Of course, Google does not have access to such data – though at some point it may decide to acquire a travel operator. But the general lesson is simply that to make Google Trends more useful, search query data needs to be looked at together with real-world data (such as travel data or hospital records).
It is still early days for the swine flu outbreak, but some commentators are already suggesting the “social web” has actually created hysteria rather than help track the disease. That may be true, but it is hardly a problem of the “social web.” As a reader on the FP pointed out, “Twitter is only a natural extension of a typical neighborhood.”
So, in this “typical neighborhood,” what the swine flu outbreak has done is illustrate where Google Trends does well – in tracking general interest amongst heavy Internet users. But it also exposes limitations – the methodology is (not surprisingly) susceptible to “noise” from media coverage and is biased towards countries and issues that are online. This does not mean that the idea itself is flawed. Just that it must be taken with a pinch of salt, and that it needs work – especially interfacing it with real-world data streams – to make it really useful.
First, a bit of housekeeping – we are tinkering with the look of the blog and considering moving it to another platform. If you have any feedback about what you like and don’t like, let us know.
Published today in the CMAJ, Early detection of disease outbreaks using the Internet, is worth skimming:
“The Internet…is revolutionizing how epidemic intelligence is gathered, and it offers solutions to some of these challenges. Freely available Web-based sources of information may allow us to detect disease outbreaks earlier with reduced cost and increased reporting transparency. Because Web-based data sources exist outside traditional reporting channels, they are invaluable to public health agencies that depend on timely information flow across national and subnational borders. These information sources, which can be identified through Internet-based tools, are often capable of detecting the first evidence of an outbreak, especially in areas with a limited capacity for public health surveillance.”
The limitations section includes the below list, but I wish they went into much more detail about what the internet is not good for (probably detecting trends among the elderly, for example) and gave more examples of misinterpreting the data. On a related note to using ICTs for surveillance, Jaspal wrote a fairly detailed post on Google Flu Trends that you should also check out.
The WHO has decided to focus this World Health Day on hospital infrastructure during times of emergency. The folks over at Global Health Progress have a good roundup of what some bloggers are saying, including health journalism folks as well as thoughts from the AvianFlu diary. I thought I would go off theme, briefly throw out some thoughts on the bigger picture, and encourage you to use this day to think about the future of global health in 10, 20 or 30 years. The world is in turmoil and we are questioning the fundamental nature of market-driven economies, so why not use this as an opportunity to do the same for global health in a forward-looking way? Think about where we are, and whether we are prioritizing the right things and moving in the right directions.
Approximately 10 (only TEN!) years ago there was no Google, Kiva, Gates Foundation or knowledge about the cost differences between generic and brand name drugs amongst major care organizations (absolutely stunning; see this great talk on the Future of Global Health by Jim Yong Kim and his discussion of how they reduced the price of treating MDR TB patients by 80-90% in 1999). Mobile phone penetration was less than 1% in developing countries and social entrepreneurship wasn’t hot; the vast majority of us probably hadn’t even heard of that term.
Where we were ten years ago is arguably a profoundly different world from where we are today and per the video below “we are living in exponential times“. To give you further inspiration to think differently today definitely watch the below (via 2173):
The acceleration of technology for social change and global health is going to increase. In this decade alone, the convergence of movements in philanthropy, entrepreneurship and technology, all enabled by the internet and mobile phone revolution, has allowed people to collaborate, innovate and communicate on an entirely different level. I don’t know what the future of global health is – but I wonder how open source collaborations will contribute to solutions and whether twittering for global health will be around in five years, and for whom and what purpose. Or will we just be doing more of the same? I wonder if we will be doing entire marketing and health education campaigns via mobile phones and how this will evolve. Will there be convergence of people and ideas working on global and domestic health? Will the flow of innovation and products from “South” to “North” become the next hot topic? I wonder if we will have a TED just for Global Health.
We might face a global crisis in 2030 but we will also be better equipped to face that crisis. Today is a day we should be thinking about what all the possibilities are and how we can get there in the fastest way possible. The last idea I will throw out as food for thought is to think about what have been the top 10 biggest developments in global health in the last decade and how will these shape the future?
For Feb 2009 TrendWatching.Com focuses on “generation G” – the giving, generous generation that they think is baked in due to the ubiquitous development of online culture. I don’t agree with everything they have spotted, but it’s a really interesting piece worth checking out:
“GENERATION G | Captures the growing importance of ‘generosity’ as a leading societal and business mindset. As consumers are disgusted with greed and its current dire consequences for the economy—and while that same upheaval has them longing more than ever for institutions that care—the need for more generosity beautifully coincides with the ongoing (and pre-recession) emergence of an online-fueled culture of individuals who share, give, engage, create and collaborate in large numbers.”
“In fact, for many, sharing a passion and receiving recognition have replaced ‘taking’ as the new status symbol. Businesses should follow this societal/behavioral shift, however much it may oppose their decades-old devotion to me, myself and I.”
Here is the outline of the piece:
1. Recession and consumer disgust
2. Longing for institutions that care
3. For individuals, giving is already the new taking and sharing is the new giving
8 Ways for corporations to join Generation G: co-donate, eco-generosity, free love… read the rest here.
A few days back Aman wrote a post about Google Flu Trends. Thought I’d add a few thoughts here after reading the draft manuscript that the Google-CDC team posted in advance of its publication in Nature.
By the way, here’s what Nature says: Because of the immediate public-health implications of this paper, Nature supports the Google and the CDC decision to release this information to the public in advance of a formal publication date for the research. The paper has been subjected to the usual rigor of peer review and is accepted in principle. Nature feels the public-health consideration here makes it appropriate to relax our embargo rule.
Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L. Detecting influenza epidemics using search engine query data. Draft manuscript for Nature. Retrieved 14 Nov 2008.
Assuming that few folks will read the manuscript or the article, here are some highlights. I should say I appreciated that the article was clearly written. If you need more context, check out Google Flu Trends How does this work?…
- Targets health-seeking behavior of Internet users, particularly Google users [not sure those are different anymore], in the United States for ILI (influenza-like illness)
- Compared to previous work attempting to link online activity to disease prevalence, benefits from volume: hundreds of billions of searches over 5 years
- Key result – reduced reporting lag to one day compared to CDC’s surveillance system of 1-2 weeks
- Spatial resolution based on IP address goes to nearest big city [for example my current IP maps to Oakland, California right now], but the system is right now only looking to the level of states – this is more detailed than the CDC’s reporting, which is based on 9 U.S. regions
- CDC data was used for model-building (a linear model fit on the log-odds of ILI visits and of ILI-related queries) as well as comparison [for stats nerds – the comparison was made with held-out data]
- Not all states publish ILI data, but they were still able to achieve a correlation of 0.85 in Utah without training the model on that state’s data
- There have been attempts to look at disease outbreaks of enterics and arboviruses, but without success.
- For those familiar with GPHIN and Healthmap, two other online systems, the major difference is in the data being examined – Flu Trends looks at search terms while the other systems rely on news sources, websites, official alerts, and the like
- There is a possibility that this will not model a flu pandemic well since the search behavior used for modeling is based on the non-pandemic variety of flu
- The modeling effort was immense – “450 million different models to test each of the candidate queries”
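To make the model-building step more tangible, here is a minimal sketch of fitting a straight line on log-odds and checking it against held-out weeks. It assumes the general form logit(P) = b0 + b1 × logit(Q), where P is the share of physician visits for ILI and Q is the share of searches that are ILI-related. The weekly numbers are invented for illustration, the helper names are mine, and none of this is the authors' code or data.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map log-odds back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x with one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Invented weekly values: share of searches that are ILI-related (Q)
# and share of physician visits that are for ILI (P).
query_share = [0.0010, 0.0014, 0.0021, 0.0030, 0.0042, 0.0035, 0.0024, 0.0016]
ili_share = [0.010, 0.013, 0.019, 0.027, 0.036, 0.031, 0.022, 0.015]

# Fit on the first five weeks, then compare predictions with the last three.
train, held_out = slice(0, 5), slice(5, 8)
b0, b1 = fit_line(
    [logit(q) for q in query_share[train]],
    [logit(p) for p in ili_share[train]],
)
predicted = [inv_logit(b0 + b1 * logit(q)) for q in query_share[held_out]]
r = pearson(predicted, ili_share[held_out])
print(f"b0={b0:.2f}, b1={b1:.2f}, held-out correlation={r:.3f}")
```

The point of holding out the last three weeks is that the correlation is measured on data the fit never saw, which is the same idea behind the comparison with held-out CDC data mentioned in the list above.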
So what does this mean for developing world applications?
Here’s what the authors say: “Though it may be possible for this approach to be applied to any country with a large population of web search users, we cannot currently provide accurate estimates for large parts of the developing world. Even within the developed world, small countries and less common languages may be challenging to accurately survey.”
The key is whether there are detectable changes in search in response to disease outbreaks. This is dependent on Internet volume, health-seeking search behavior, and language. And if there is no baseline data, like with CDC surveillance data, then what is the best strategy for model-building? How valid will models be from one country to another? That probably depends on the countries. Is it perhaps possible to have a less refined output, something like a multi-level warning system for decision makers to follow up with on-the-ground resources? Or should we be focusing on news+ like GPHIN and Healthmap?
Another thought is that we could mine SMS traffic for detecting disease outbreaks. The problem becomes more complicated, since we’re now looking at data that is much more complex than search queries. And there is often segmentation due to the presence of multiple phone providers in one area. Even if the data were anonymized, this raises huge privacy concerns. Still, it could be a way to tap into areas with low Internet penetration and to provide detection based on very real-time data.
In case you missed this in the NY Times today – fascinating experiment with a new Google tool on the frontiers of disease surveillance and global health trends. Remains to be seen how useful this will be and lots of validation needs to be done, but this is yet another example of people outside of traditional health/public health communities who are on the leading edge of public health innovation:
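One way to picture a "less refined output" is a coarse level derived from how far current search volume sits above its usual level, with no attempt to estimate actual case counts. The sketch below is only that: the function name, the level names, and the 1.5x and 3x thresholds are arbitrary assumptions, not values from the paper or from any deployed system.

```python
def warning_level(current_volume, usual_volume, watch=1.5, alert=3.0):
    """Map the ratio of current to usual search volume onto a coarse level.

    The thresholds are illustrative defaults, not calibrated values.
    """
    if usual_volume <= 0:
        raise ValueError("usual_volume must be positive")
    ratio = current_volume / usual_volume
    if ratio >= alert:
        return "alert"
    if ratio >= watch:
        return "watch"
    return "normal"


# Hypothetical weekly counts of health-related searches in one region.
print(warning_level(120, 100))  # prints "normal"
print(warning_level(200, 100))  # prints "watch"
print(warning_level(450, 100))  # prints "alert"
```

A scheme like this needs only the region's own search history rather than surveillance data for calibration, but it inherits the noise problem: a burst of media coverage would trip the same thresholds as a real outbreak, so a level change is a prompt to check on the ground, not a finding.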
“What if Google knew before anyone else that a fast-spreading flu outbreak was putting you at heightened risk of getting sick? And what if it could alert you, your doctor and your local public health officials before the muscle aches and chills kicked in? That, in essence, is the promise of Google Flu Trends.
“Google Flu Trends (www.google.org/flutrends) is the latest indication that the words typed into search engines like Google can be used to track the collective interests and concerns of millions of people, and even to forecast the future.”
We have discussed before how data indexed on the web can be used for all sorts of fascinating things. We had previous posts on global health job trends and also on publications that use the terms global health and private sector. The graphs below show a large increase in both areas; however, there are dozens of caveats with this kind of trend analysis and the below graphs have to be taken with a grain of salt:
1. Global Health Job Trends (see the full post)
2. Trends: Development/Global Health in the Business Press (see the full post)