This article shows how to use Amazon Elastic MapReduce and Hive to process logs uploaded to Amazon S3 by a fleet of servers that serve online advertising. The processed results are stored as a collection of relational tables that persist in Amazon S3 and can be queried with Hive. Summaries of the data are pushed to Amazon SimpleDB, where monitoring tools can access them.
http://ow.ly/sHp9
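
To make the "summaries" step concrete, here is a minimal Python sketch of the kind of per-ad rollup such a pipeline might produce before pushing it to a key-value store. It is not the article's code: the tab-separated log layout, the field names, and the helper names `summarize` and `to_items` are all assumptions made for illustration, and the real job in the article runs in Hive on Elastic MapReduce rather than in a local script.

```python
from collections import defaultdict

# Hypothetical log layout (an assumption, not the article's format):
# timestamp <TAB> ad_id <TAB> event, where event is "impression" or "click".
SAMPLE_LINES = [
    "2009-04-13T08:00:01\tad-1\timpression",
    "2009-04-13T08:00:02\tad-1\tclick",
    "2009-04-13T08:00:03\tad-2\timpression",
    "2009-04-13T08:00:04\tad-1\timpression",
    "this line is malformed and gets skipped",
]

EVENTS = ("impression", "click")


def summarize(lines):
    """Count impressions and clicks per ad, skipping malformed lines."""
    counts = defaultdict(lambda: {event: 0 for event in EVENTS})
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # wrong number of fields
        _timestamp, ad_id, event = fields
        if event in EVENTS:
            counts[ad_id][event] += 1
    return {ad_id: dict(c) for ad_id, c in counts.items()}


def to_items(summary):
    """Flatten the summary into item-name -> string attributes.

    SimpleDB stores attribute values as strings, so the counts are
    converted here. Actually writing the items is left out of this sketch.
    """
    return {
        ad_id: {event: str(n) for event, n in counts.items()}
        for ad_id, counts in summary.items()
    }


if __name__ == "__main__":
    for ad_id, attrs in sorted(to_items(summarize(SAMPLE_LINES)).items()):
        print(ad_id, attrs)
```

Keeping the aggregation separate from the string conversion means the same counts could be written to a Hive table or to a monitoring store without recomputing them.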