Cloudera Courses and Certifications
Developer Training for Spark and Hadoop
This four-day hands-on training course delivers the key concepts and expertise participants need to ingest and process data on a Hadoop cluster using the most up-to-date tools and techniques. Working with Hadoop ecosystem projects such as Spark, Hive, Flume, Sqoop, and Impala, participants prepare for the real-world challenges faced by Hadoop developers. They learn to identify which tool is the right one for a given situation and gain hands-on experience developing with those tools.
- This course is designed for developers and engineers who have programming experience.
Through instructor-led discussion and interactive, hands-on exercises, participants learn Apache Spark and how it integrates with the entire Hadoop ecosystem, including:
- How data is distributed, stored, and processed in a Hadoop cluster
- How to use Sqoop and Flume to ingest data
- How to process distributed data with Apache Spark
- How to model structured data as tables in Impala and Hive
- How to choose the best data storage format for different data usage patterns
- Best practices for data storage
Read the entire course outline for more details.
- Apache Spark examples and hands-on exercises are presented in Scala and Python, so the ability to program in one of those languages is required
- Basic familiarity with the Linux command line is assumed; basic knowledge of SQL is helpful
- Prior knowledge of Hadoop is not required