Local Data on a Global Scale

Posted by Daniel D. Gutierrez on Fri, Apr 11, 2014

With the help of technology and a hankering for local Big Data, Hyperlocal Data lets you be there without going there.

In a world where most things are only a click away, humans have a newfound appreciation for local knowledge. Farmers congregate at the local grain elevator to talk corn prices. Local growers and bakers gather to sell their goods (and eye the competition) at local farmers’ markets. You can judge a melon’s quality by the sound it makes when you tap it or a pineapple’s ripeness by how easy it is to pluck out a top leaf. There’s a lot of value in being in a place, in person, using all your senses. 

And while technology will never be able to replace the in-person experience, we are seeing impressive new ways companies are culling local knowledge and images to provide more color (literally and figuratively) around the goods they’re interested in. It’s called hyperlocal data collection and analysis, and it offers local views on a global scale using pictures and notes taken at various locations around the world. Hyperlocal data has two major dimensions: geography and time.

Now, with modern Big Data technologies, putting hyperlocal data to use is rapidly becoming a reality. Backed by a small army of data collectors around the world, technology companies are seeing a surge of interest in hyperlocal data. They are organizing this information and delivering it to those who can analyze it. We can now quickly obtain global data and trends about specific crops like corn, potatoes, or tomatoes and tie this information to weather forecasts and rainfall totals to yield actionable intelligence that grocery chain buyers or even stockbrokers could use. Hyperlocal data collection can be seen as a probing global inflation monitor, a method of viewing food security, or a way a manufacturer can assess what kind of shelf space its products are getting.

Creating a well-rounded index starts with the data collectors. These contributors around the globe use a smartphone application to photograph food and other goods in public markets. The photos are analyzed for pricing information, quality, and product placement. Then that information is matched to other data to build real-time indexes. Those indexes are then sold to large consumer product industry players, hedge funds, and financial industry giants — just about any entity hungry for insightful data.

This means a bakery owner can look through pictures to gauge the price of wheat, see whether a certain location has a good supply of raisins, or learn that nuts from her usual source are of substandard quality, and then make decisions accordingly. With this knowledge early or in near-real time, business owners (and others) can cushion the blow of sudden price spikes or product shortages, or have enough time to plan an alternative offering without affecting current revenue or local food security.

Hyperlocal Data — From Theory to Reality
One company that is building a network of hyperlocal data is Premise, a company that’s a cross between Google Street View and the Consumer Price Index. Premise is striving to build the machinery to improve global economic transparency. It’s recruiting hyperlocal data collectors who are paid to visit a handful of stores daily and take pictures of food, drinks, and other products. Currently, Premise employs 700 people in 25 developing countries who are helping build a real-time stock market of world prices.

There are already a number of actual and potential success stories. Premise claims to have spotted broad national inflation in India months ahead of the government by looking at onion prices in a couple of markets. Consumers of hyperlocal data can find novel patterns and relationships among different kinds of information using unsupervised machine learning techniques like clustering, and then make intelligent and timely decisions. Ultimately, many businesses could act like Amazon by making real-time predictions and instantaneous price adjustments.
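
To make the clustering idea concrete, here is a minimal sketch of k-means, one common unsupervised technique, grouping markets by price and rainfall. Everything in it is an assumption made for illustration: the market figures are invented, the function name k_means is hypothetical, and nothing here reflects how Premise or its customers actually analyze data.

```python
from math import dist

# Hypothetical markets: (price per kg, weekly rainfall in mm).
# These numbers are invented for illustration only.
MARKETS = [
    (1.0, 80.0),
    (3.0, 10.0),
    (1.2, 75.0),
    (2.8, 12.0),
    (1.1, 90.0),
    (3.1, 8.0),
]

def k_means(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) with deterministic initialization:
    the first k points serve as the starting centroids."""
    centroids = [tuple(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignments, centroids

if __name__ == "__main__":
    labels, centers = k_means(MARKETS, k=2)
    print(labels)
    print(centers)
```

In this toy data the wet, low-price markets and the dry, high-price markets fall into separate groups. A real analysis would normally rescale the features first, since rainfall in millimeters otherwise dominates the distance calculation.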

Crowdsourced Data Collection
The crowdsourced workers for Premise are recruited by managers in each country and receive eight to ten cents per picture. Premise also gathers geography and time information from the smartphones, along with notes on things like conditions at the marketplace. Price data extracted from the photos gets merged with price information obtained from other sources such as thousands of e-commerce sites. The data is then aggregated in the form of national inflation indexes and price maps for specific markets. 
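
As a rough illustration of how individual price observations might roll up into an index, the sketch below takes the median price per product and period, then averages the price changes with an unweighted geometric mean (a Jevons-style index). The article does not describe Premise's actual method, so the formula choice, the function names, and the sample prices are all assumptions.

```python
from collections import defaultdict
from math import exp, log
from statistics import median

# Hypothetical observations: (period, product, price in local currency).
# These values are invented for illustration only.
OBSERVATIONS = [
    ("2014-01", "onion", 20.0),
    ("2014-01", "onion", 22.0),
    ("2014-01", "rice", 40.0),
    ("2014-02", "onion", 30.0),
    ("2014-02", "onion", 33.0),
    ("2014-02", "rice", 40.0),
]

def median_prices(observations):
    """Group prices by (period, product) and take the median of each group."""
    grouped = defaultdict(list)
    for period, product, price in observations:
        grouped[(period, product)].append(price)
    return {key: median(prices) for key, prices in grouped.items()}

def jevons_index(observations, base_period, period):
    """Unweighted geometric mean of price relatives, scaled so base = 100."""
    prices = median_prices(observations)
    # Only products seen in both periods can be compared.
    products = sorted(
        product
        for (p, product) in prices
        if p == base_period and (period, product) in prices
    )
    if not products:
        raise ValueError("no products observed in both periods")
    log_relatives = [
        log(prices[(period, product)] / prices[(base_period, product)])
        for product in products
    ]
    return 100.0 * exp(sum(log_relatives) / len(log_relatives))

if __name__ == "__main__":
    print(round(jevons_index(OBSERVATIONS, "2014-01", "2014-02"), 2))
```

Using the median within each group keeps a single mis-read price tag from skewing the result, and the geometric mean treats a doubling and a halving symmetrically. A production index would also weight products by their share of consumer spending.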

Subscribers of the Premise hyperlocal data service include hedge funds and large consumer goods companies like Procter & Gamble. Subscriptions to the service range from $1,500 to more than $15,000 a month. There is also a version that offers free data to educational institutions and nonprofit groups. 

Another crowdsourced resource is Numbeo, which touts itself as the world’s largest database of user-contributed data about cities and countries worldwide. Numbeo provides current information on world living conditions, including cost of living, housing indicators, health care, traffic, crime, and pollution from over 4,000 cities. 

The Value of Hyperlocal Data
Premise’s photos create a portrait of local consumer culture that offers unique details for producers, investors, and policymakers. For example, if an analyst wants to track all of the Walmarts in Mexico City to gauge consumer activity, the pictures can do that. The crowdsourced work force can take pictures at those Walmarts that will tell how many shoppers are exiting and entering at specific times of the day, how many of them are carrying bags, the length of the check-out lines, the number of people who walked past the butcher counter, how many customers bought meat, and so on.

Premise can also build custom indexes that reveal the very different lives of each country's citizens. Hyperlocal data can be tapped to understand how the poor are getting by in Rio, how the rich are faring in Mumbai, or what's really happening behind the inexplicable official figures in China. In the thousands of photos uploaded to Premise each day, literally seeing where shampoo is out of stock in, say, Hong Kong, or which Buenos Aires bodegas have lines stretching out the door, can help policymakers see the beginnings of scarcity.

Open Government Economic Data
Hyperlocal data can be combined with open government data to expand the data’s reach. Getting a more holistic view of economic trends requires a large multinational network that combines open government indicators with hyperlocal data. Currently, 40 national governments offer data on population, land use, and other matters, and they’re using hyperlocal data (in part) to do so. The U.S. government alone has 90,925 sets of open data. Further, government data can be paired with sensors on smartphones, jet engines, and even electric vehicle charging stations. These data generators upload information from across the physical world into the cloud and contribute to valuable analytics.

Traditionally, government and private data weren’t collected particularly fast or well, but recent drops in sensor prices coupled with pervasive connectivity have enabled change. Furthermore, real-time analytics using the data has finally become a reality with the advent of new technologies such as Apache Spark, which handles rapid cluster computing.

Hyperlocal data represents an entirely new level of insight into global economic data. Forward-thinking companies that use this data in an intelligent manner, possibly coupled with open government data, stand to benefit in significant ways. Premise is even working on photo-recognition algorithms using machine-learning methods like neural networks that can tell when store shelves are emptying of staples quickly or when produce has turned bad across an entire neighborhood, potentially signaling a food security crisis. As the hyperlocal data is sifted, cleaned, and aggregated and then sold to hedge funds, government agencies, consumer goods companies, and NGOs, the process becomes very compelling. Big Data meets analytics meets crowdsourced econometrics. 


Daniel D. Gutierrez is a Los Angeles–based data scientist working for a broad range of clients through his consultancy AMULET Analytics. He’s been involved with data science and Big Data since long before they came into vogue, so imagine his delight when the Harvard Business Review deemed “data scientist” the sexiest profession for the 21st century. He is also a recognized Big Data journalist and is working on a new machine-learning book due out later this year.

Topics: Big Data