Apache Flume: Distributed Log Collection for Hadoop - Second Edition
Main Author: | |
---|---|
Format: | eBook |
Language: | English |
Published: | Packt Publishing, 2015 |
Edition: | Second Edition. |
Subjects: | |
Summary: | Design and implement a series of Flume agents to send streamed data into Hadoop. In detail: Apache Flume is a distributed, reliable, and available service used to efficiently collect, aggregate, and move large amounts of log data. It is used to stream logs from application servers to HDFS for ad hoc analysis. This book starts with an architectural overview of Flume and its logical components. It explores channels, sinks, and sink processors, followed by sources and channels. By the end of this book, you will be fully equipped to construct a series of Flume agents to dynamically transport your stream data and logs from your systems into Hadoop. This is a step-by-step book that guides you through the architecture and components of Flume, covering different approaches that are then pulled together as a real-world, end-to-end use case, gradually going from the simplest to the most advanced features. What you will learn: understand the Flume architecture, and how to download and install open source Flume from Apache; follow along a detailed example of transporting weblogs in Near Real Time (NRT) to Kibana/Elasticsearch and archival in HDFS; learn tips and tricks for transporting logs and data in your production environment; understand and configure the Hadoop Distributed File System (HDFS) Sink; use a morphline-backed Sink to feed data into Solr; create redundant data flows using sink groups; configure and use various sources to ingest data; inspect data records and move them between multiple destinations based on payload content; transform data en route to Hadoop and monitor your data flows. |
---|---|
ISBN: | 9781784392178 1784392170 |