Hadoop: The Definitive Guide

Tom White

Publisher

O'Reilly Media, Inc.

Publication date

2009-01-01

ISBN

9780596521998

Rating

★★★★★

Tags

web programming

Book description

Apache Hadoop is ideal for organizations with a growing need to store and process massive application datasets. Hadoop: The Definitive Guide is a comprehensive resource for using Hadoop to build reliable, scalable, distributed systems. Programmers will find details for analyzing large datasets with Hadoop, and administrators will learn how to set up and run Hadoop clusters. The book includes case studies that illustrate how Hadoop solves specific problems.

Organizations large and small are adopting Apache Hadoop to deal with huge application datasets. Hadoop: The Definitive Guide provides you with the key for unlocking the wealth this data holds. Hadoop is ideal for storing and processing massive amounts of data, but until now, information on this open-source project has been lacking -- especially with regard to best practices. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems. Programmers will find details for analyzing large datasets with Hadoop, and administrators will learn how to set up and run Hadoop clusters.

With case studies that illustrate how Hadoop solves specific problems, this book helps you:

* Learn the Hadoop Distributed File System (HDFS), including ways to use its many APIs to transfer data

* Write distributed computations with MapReduce, Hadoop's most vital component

* Become familiar with Hadoop's data and IO building blocks for compression, data integrity, serialization, and persistence

* Learn the common pitfalls and advanced features for writing real-world MapReduce programs

* Design, build, and administer a dedicated Hadoop cluster

* Use HBase, Hadoop's database for structured and semi-structured data

And more. Hadoop: The Definitive Guide is still in progress, but you can get started on this technology with the Rough Cuts edition, which lets you read the book online or download it in PDF format as the manuscript evolves.

User reviews

If Hadoop sucks, it must be Java's fault!

Comprehensive and informative, though outdated.

Too much detail. English edition, third edition.

The raunchiest technical book I have ever read, although the configurations covered in the third edition are already outdated.

Introduction to Hadoop: http://proquest.safaribooksonline.com/book/software-engineering-and-development/9781449328917

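MapReduce is covered in considerable detail; for the other components and frameworks you may need to find dedicated books and dig deeper. It works as an introduction to big data frameworks.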

Third edition

Can the reviews for the English and Chinese editions be separated? The end of an era.

Read it around 2013. What I learned from it back then came in handy in my later work.

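Skimmed through it. It covers the open-source implementation of Google's "troika" of papers, and explains things in more detail than the papers do.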