Guide to High Performance Distributed Computing
Details
This timely text/reference describes the development and implementation of large-scale distributed processing systems using open source tools and technologies. Comprehensive in scope, the book presents state-of-the-art material on building high performance distributed computing systems, providing practical guidance and best practices as well as describing theoretical software frameworks.
Features:
- Describes the fundamentals of building scalable software systems for large-scale data processing in the new paradigm of high performance distributed computing
- Presents an overview of the Hadoop ecosystem, followed by step-by-step instruction on its installation, programming and execution
- Reviews the basics of Spark, including resilient distributed datasets, and examines Hadoop streaming and working with Scalding
- Provides detailed case studies on approaches to clustering, data classification and regression analysis
- Explains the process of creating a working recommender system using Scalding and Spark
- Provides a guide to the distributed computing technologies of Hadoop and Spark, from the perspective of industry practitioners
- Supports the theory with case studies taken from a range of disciplines, including data mining, machine learning, graph processing and image processing
- Supplies working source code to aid understanding through step-by-step implementation
- Includes supplementary material: sn.pub/extras
Contents
Part I: Programming Fundamentals of High Performance Distributed Computing
- Introduction
- Getting Started with Hadoop
- Getting Started with Spark
- Programming Internals of Scalding and Spark
Part II: Case studies using Hadoop, Scalding and Spark
- Case Study I: Data Clustering using Scalding and Spark
- Case Study II: Data Classification using Scalding and Spark
- Case Study III: Regression Analysis using Scalding and Spark
- Case Study IV: Recommender System using Scalding and Spark
Further Information
- General Information
- GTIN 09783319383477
- Publisher Springer International Publishing
- Number of pages 324
- Reading motivation Understanding
- Genre IT Encyclopedias
- Edition Softcover reprint of the original 1st edition 2015
- Weight 493 g
- Subtitle Case Studies with Hadoop, Scalding and Spark
- Dimensions H 235 mm x W 155 mm x D 18 mm
- Year 2016
- EAN 9783319383477
- Format Paperback
- ISBN 3319383477
- Publication date 6 October 2016
- Title Guide to High Performance Distributed Computing
- Authors Anil Kumar Muppalla, K. G. Srinivasa
- Language English