Hadoop: the definitive guide

Subtitle: None

Author: Tom White

Classification number:

ISBN: 9787564138936

Summary

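Hadoop: The Definitive Guide (reprint edition, revised 3rd edition) covers: using the Hadoop Distributed File System (HDFS) to store large datasets; running distributed computations with MapReduce; using Hadoop's data and I/O building blocks for compression, data integrity, serialization (including Avro), and persistence; learning the common pitfalls and advanced features needed to write practical MapReduce programs; designing, building, and administering a dedicated Hadoo...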

Table of Contents

Foreword
Preface
1. Meet Hadoop
Data!
Data Storage and Analysis
Comparison with Other Systems
Relational Database Management System
Grid Computing
Volunteer Computing
A Brief History of Hadoop
Apache Hadoop and the Hadoop Ecosystem
Hadoop Releases
What's Covered in This Book
Compatibility
2. MapReduce
A Weather Dataset
Data Format
Analyzing the Data with Unix Tools
Analyzing the Data with Hadoop
Map and Reduce
Java MapReduce
Scaling Out
Data Flow
Combiner Functions
Running a Distributed MapReduce Job
Hadoop Streaming
Ruby
Python
Hadoop Pipes
Compiling and Running
3. The Hadoop Distributed Filesystem
The Design of HDFS
HDFS Concepts
Blocks
Namenodes and Datanodes
HDFS Federation
HDFS High-Availability
The Command-Line Interface
Basic Filesystem Operations
Hadoop Filesystems
Interfaces
The Java Interface
Reading Data from a Hadoop URL
Reading Data Using the FileSystem API
Writing Data
Directories
Querying the Filesystem
Deleting Data
Data Flow
Anatomy of a File Read
Anatomy of a File Write
Coherency Model
Data Ingest with Flume and Sqoop
Parallel Copying with distcp
Keeping an HDFS Cluster Balanced
Hadoop Archives
Using Hadoop Archives
Limitations
4. Hadoop I/O
Data Integrity
Data Integrity in HDFS
LocalFileSystem
ChecksumFileSystem
Compression
Codecs
Compression and Input Splits
Using Compression in MapReduce
Serialization
The Writable Interface
Writable Classes
Implementing a Custom Writable
Serialization Frameworks
Avro
Avro Data Types and Schemas
In-Memory Serialization and Deserialization
Avro Datafiles
Interoperability
Schema Resolution
Sort Order
Avro MapReduce
Sorting Using Avro MapReduce
Avro MapReduce in Other Languages
File-Based Data Structures
SequenceFile
MapFile
5. Developing a MapReduce Application
The Configuration API
Combining Resources
Variable Expansion
Setting Up the Development Environment
Managing Configuration
GenericOptionsParser, Tool, and ToolRunner
Writing a Unit Test with MRUnit
Mapper
Reducer
Running Locally on Test Data
Running a Job in a Local Job Runner
Testing the Driver
Running on a Cluster
Packaging a Job
Launching a Job
The MapReduce Web UI
Retrieving the Results
Debugging a Job
Hadoop Logs
Remote Debugging
Tuning a Job
Profiling Tasks
MapReduce Workflows
Decomposing a Problem into MapReduce Jobs
JobControl
Apache Oozie
6. How MapReduce Works
Anatomy of a MapReduce Job Run
Classic MapReduce (MapReduce 1)
YARN (MapReduce 2)
Failures
Failures in Classic MapReduce
Failures in YARN
Job Scheduling
The Fair Scheduler
The Capacity Scheduler
Shuffle and Sort
The Map Side
The Reduce Side
Configuration Tuning
Task Execution
The Task Execution Environment
Speculative Execution
Output Committers
Task JVM Reuse
Skipping Bad Records
7. MapReduce Types and Formats
MapReduce Types
The Default MapReduce Job
Input Formats
Input Splits and Records
Text Input
Binary Input
Multiple Inputs
Database Input (and Output)
Output Formats
Text Output
Binary Output
Multiple Outputs
Lazy Output
Database Output
8. MapReduce Features
Counters
Built-in Counters
User-Defined Java Counters
……
9. Setting Up a Hadoop Cluster
10. Administering Hadoop
11. Pig
12. Hive
13. HBase
14. ZooKeeper
15. Sqoop
16. Case Studies
A. Installing Apache Hadoop
B. Cloudera's Distribution Including Apache Hadoop
C. Preparing the NCDC Weather Data
Index
