Welcome to my blogspot

Share your thoughts

Monday, October 5, 2009

Operating a Data Warehouse with Hive, Amazon Elastic MapReduce and Amazon SimpleDB

This article shows how to use Amazon Elastic MapReduce and Hive to process logs uploaded to Amazon S3 from a fleet of servers that serve online advertising. The processed results are stored as a collection of relational tables persisted in Amazon S3 and queryable with Hive. Summaries of the data are pushed to Amazon SimpleDB, where monitoring tools can access them.

http://ow.ly/sHp9
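
To make the last step of that pipeline a little more concrete, here is a minimal Python sketch of the summary stage only. It is not the article's code: the tab-separated log format, the function names, and the item naming scheme are all my own assumptions for illustration. It counts events per hour and per ad, then shapes the counts into item/attribute dictionaries of the kind a key-value store like SimpleDB holds. The counts are zero-padded strings because SimpleDB stores every value as a string and compares them lexicographically.

```python
from collections import Counter

# Hypothetical log format (an assumption, not taken from the article):
# ISO timestamp <TAB> ad id <TAB> event type
SAMPLE_LOG = """\
2009-10-05T14:03:11\tad-17\timpression
2009-10-05T14:09:45\tad-17\tclick
2009-10-05T14:30:02\tad-42\timpression
2009-10-05T15:01:00\tad-17\timpression
"""


def summarize(log_text):
    """Count events per (hour, ad id, event type)."""
    counts = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        timestamp, ad_id, event = line.split("\t")
        hour = timestamp[:13]  # keep the date and hour, e.g. "2009-10-05T14"
        counts[(hour, ad_id, event)] += 1
    return counts


def to_items(counts):
    """Shape counts into {item name: attributes} with string-only values."""
    items = {}
    for (hour, ad_id, event), n in sorted(counts.items()):
        # Hypothetical item name: one item per hour and ad.
        item = items.setdefault(f"{hour}/{ad_id}", {"hour": hour, "ad_id": ad_id})
        # Zero-pad so that string comparison matches numeric order.
        item[event + "s"] = f"{n:08d}"
    return items


ITEMS = to_items(summarize(SAMPLE_LOG))
for name, attrs in ITEMS.items():
    print(name, attrs)
```

In the setup the article describes, the aggregation would be done by Hive queries over the tables in S3 rather than in plain Python, and the resulting items would then be written to SimpleDB with an AWS client library; both of those parts are left out of this sketch.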
