Johns Hopkins University
Big Data Processing Using Hadoop Specialization

Master Big Data Processing with Hadoop. Gain hands-on experience with Hadoop tools and techniques to efficiently process, analyze, and manage big data in real-world applications.

Instructor: Karthik Shyamsunder

Get in-depth knowledge of a subject
Intermediate level
3 months at 5 hours a week
Flexible schedule: learn at your own pace

What you'll learn

  • Gain expertise in core Hadoop ecosystem components (HDFS, YARN, and MapReduce) for storing, processing, and managing big data.

  • Learn to set up, configure, and utilize tools like Hive, Pig, HBase, and Spark for efficient data analysis, processing, and real-time management.

  • Develop advanced programming techniques for MapReduce, optimization methods, and parallelism strategies to handle large-scale data sets effectively.

  • Understand the architecture and functionality of Hadoop and its components, applying them to solve complex data challenges in real-world scenarios.

Details to know

Shareable certificate (can be added to your LinkedIn profile)
Taught in English
Recently updated: January 2025

Advance your subject-matter expertise

  • Learn in-demand skills from university and industry experts
  • Master a subject or tool with hands-on projects
  • Develop a deep understanding of key concepts
  • Earn a career certificate from Johns Hopkins University

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV

Share it on social media and in your performance review

Specialization - 4 course series

What you'll learn (course 1 of 4)

  • Define Big Data, explore its relevance in analytics and data science, and understand trends shaping modern data processing technologies.

  • Examine Hadoop architecture, its ecosystem, and subprojects, distinguishing distributions and their roles in Big Data solutions.

  • Acquire practical skills to install, configure, and run Hadoop on a Linux virtual machine, enabling effective Big Data processing.

Skills you'll gain

Apache Hadoop, Big Data, Distributed Computing, Linux, Data Infrastructure, Scalability, Analytics, Open Source Technology, Software Installation, Data Science, System Configuration, Data Processing

What you'll learn (course 2 of 4)

  • Understand HDFS architecture, components, and how it ensures scalability and availability for big data processing.

  • Learn to configure Hadoop for Java programming and perform file CRUD operations using HDFS APIs.

  • Master advanced HDFS programming concepts like compression, serialization, and working with specialized file structures like Sequence and Map files.

Skills you'll gain

Apache Hadoop, File Systems, Scalability, Data Storage, Distributed Computing, Data Structures, Infrastructure Architecture, Systems Architecture, Java Programming, Application Programming Interface (API), File Management, Big Data, Development Environment, Data Processing

What you'll learn (course 3 of 4)

  • Learn the fundamentals of YARN and MapReduce architectures, including how they work together to process large-scale data efficiently.

  • Understand and implement Mapper and Reducer parallelism in MapReduce jobs to improve data processing efficiency and scalability.

  • Apply optimization techniques such as combiners, partitioners, and compression to enhance the performance and I/O operations of MapReduce jobs.

  • Explore advanced concepts like multithreading, speculative execution, input/output formats, and how to avoid common MapReduce anti-patterns.
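
To make the roles of combiners and partitioners concrete, here is a minimal sketch in plain Python rather than the Hadoop Java API. It is an in-memory simulation of a word-count job, not the course's own code: the function names (mapper, combiner, partitioner, reducer, run_job) and the sample input are hypothetical, and the CRC32-based partitioner is only a stand-in for a hash partitioner.

```python
from collections import defaultdict
from zlib import crc32


def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Pre-aggregate a single mapper's output before it is shuffled."""
    partial = defaultdict(int)
    for key, value in pairs:
        partial[key] += value
    return list(partial.items())


def partitioner(key, num_reducers):
    """Route a key to a reducer; CRC32 keeps the choice deterministic."""
    return crc32(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Sum all values that arrived for one key."""
    return key, sum(values)


def run_job(splits, num_reducers=2, use_combiner=True):
    """Simulate map -> (combine) -> partition/shuffle -> reduce in memory."""
    # partitions[r] maps each key to the list of values sent to reducer r.
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    shuffled = 0  # number of records that cross the simulated shuffle
    for split in splits:  # each split stands in for one mapper task
        pairs = [pair for line in split for pair in mapper(line)]
        if use_combiner:
            pairs = combiner(pairs)
        for key, value in pairs:
            partitions[partitioner(key, num_reducers)][key].append(value)
            shuffled += 1
    counts = {}
    for part in partitions:  # each partition stands in for one reducer task
        for key, values in part.items():
            word, total = reducer(key, values)
            counts[word] = total
    return counts, shuffled


# Hypothetical input: two splits, as if handled by two mapper tasks.
splits = [["big data big ideas", "data data"], ["big hadoop"]]
counts, with_combiner = run_job(splits, use_combiner=True)
_, without_combiner = run_job(splits, use_combiner=False)
print(counts)
print(with_combiner, without_combiner)
```

On this sample input the final counts are the same either way, but the combiner cuts the records crossing the shuffle from 8 to 5, which is the I/O saving the optimization is aiming for. In a real job the same idea only applies when the reduce operation is associative and commutative, as summation is here.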

Skills you'll gain

Apache Hadoop, Data Processing, Distributed Computing, Data Architecture, Software Architecture, Performance Tuning, Scalability, Programming Principles, Debugging

What you'll learn (course 4 of 4)

  • Learn to set up and configure Hive, Pig, HBase, and Spark for efficient big data analysis and processing within the Hadoop ecosystem.

  • Master Hive’s SQL-like queries for data retrieval, management, and optimization using partitions and joins to enhance query performance.

  • Understand Pig Latin for scripting data transformations, including the use of operators like join and debug to process large datasets effectively.

  • Gain expertise in NoSQL databases with HBase for real-time read/write operations, and use Spark’s core programming model for fast data processing.

Skills you'll gain

Data Analysis Software, Data Transformation, NoSQL, Apache Hadoop, Query Languages, Data Processing, Apache Spark, Apache Hive, SQL, Data Management, Data Manipulation, Big Data, Databases

Instructor

Karthik Shyamsunder
Johns Hopkins University
4 courses, 431 learners
