The recent tragedy in Japan got me thinking about the role of data in a disaster. How do we use data to prepare for these events? Are we using all the data available to us to understand a disaster's impact, and how are we using data after the disaster has occurred? Do we have the right data and the right tools to save more lives, or are we already doing all we can?
The United States Geological Survey (USGS) is using historical data and predictive analytics to estimate the probability of an earthquake in San Francisco. The amount of data collected by USGS is incredible, and the uses are limited only by our imagination (and our understanding of geology). There are RSS feeds reporting every earthquake in the country in near real time and Google Earth mashups visualizing this data. Soil type and water table data are available to help us understand how far and where seismic waves will travel, giving us insight into how much things will shake.
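To make the idea of working with such a feed concrete, here is a minimal sketch in Python that picks the stronger events out of an Atom-style earthquake feed. The sample feed is made up for illustration and is not real USGS data. The title format ("M 5.4 - place") and the function name strong_quakes are assumptions as well, so check the actual feed before relying on any of this.

```python
import re
import xml.etree.ElementTree as ET

# Standard Atom namespace, needed to look up <entry> and <title> elements.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample feed for illustration only; not real earthquake data.
SAMPLE_FEED = """<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>M 2.1 - example location A</title>
    <updated>2011-03-20T10:00:00Z</updated>
  </entry>
  <entry>
    <title>M 5.4 - example location B</title>
    <updated>2011-03-20T10:05:00Z</updated>
  </entry>
</feed>"""


def strong_quakes(feed_xml, min_magnitude):
    """Return (magnitude, title) pairs for entries at or above min_magnitude.

    Assumes each entry title starts with "M <number>"; entries whose
    titles do not match that pattern are skipped.
    """
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        match = re.match(r"M\s*(\d+(?:\.\d+)?)", title)
        if match is None:
            continue
        magnitude = float(match.group(1))
        if magnitude >= min_magnitude:
            results.append((magnitude, title))
    return results


if __name__ == "__main__":
    for magnitude, title in strong_quakes(SAMPLE_FEED, 5.0):
        print(magnitude, title)
```

A real version would download the feed on a schedule and pass its text to strong_quakes in place of the sample string.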
Collecting and analyzing all this data helps architects build stronger buildings and guides urban planners to favor one location over another. This data allows us to make smarter preventive decisions, and yet even with all of it we can only estimate the probability that an earthquake will occur within a 20- to 30-year window. This is a great start, but we simply don't have enough data to be more precise in our predictions. As we collect more data and refine our models, accuracy will improve, but as with most things geological, we must be patient.
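A small calculation shows why forecasts are stated as a probability over a window of decades. The sketch below assumes a simple Poisson model, where earthquakes arrive at a constant average rate. That is a textbook simplification, not the method USGS uses, and the 100-year recurrence interval is a made-up number for illustration.

```python
import math


def prob_at_least_one(mean_recurrence_years, window_years):
    """Probability of at least one event in the window under a Poisson model.

    mean_recurrence_years is the assumed average time between events;
    the event rate is taken as constant, which real faults need not obey.
    """
    if mean_recurrence_years <= 0:
        raise ValueError("mean_recurrence_years must be positive")
    if window_years < 0:
        raise ValueError("window_years must not be negative")
    rate = 1.0 / mean_recurrence_years
    # P(no events in the window) = exp(-rate * window), so take the complement.
    return 1.0 - math.exp(-rate * window_years)


if __name__ == "__main__":
    # Hypothetical 100-year average recurrence, chosen only for illustration.
    for window in (1, 10, 30):
        p = prob_at_least_one(mean_recurrence_years=100.0, window_years=window)
        print(f"{window:>2}-year window: {p:.1%}")
```

With that made-up recurrence interval, the chance of at least one event is about 26% over 30 years but only about 1% in any single year. The same model that gives a meaningful number over decades says very little about next week.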
Using the Data We Have
Disasters like the shooting at Virginia Tech have driven us to organize and deliver data in new ways. Why didn't every student receive a warning text message and an email during the shootings? In hindsight it seems so obvious; of course the university had the data. The reality was that even though they had the data, it wasn't readily available and there was no process in place to put it to use. Learning to manage and use the data we are collecting is as important as our ability to collect and store it. IDC claimed in 2007 that the world had generated 161 exabytes of information the previous year. How much of that is being used four years later, and how much more would be used if we could easily access it? Taking it one step further, even if we understand the data and know how to access it, can we do so in a timely manner?
There are people in every profession across the world who collect and analyze massive amounts of data. This data helps keep the power on, warns us when bad weather approaches, shuts down reactors during emergencies and keeps our financial markets in check. All of this is made possible by the timely collection and analysis of large amounts of data. The difficult part is knowing how and when to use the data we collect. Sometimes it takes a disaster before we learn how to use information to prevent one.