Job Description
Design and implement distributed data processing pipelines using Spark, Hive, Sqoop, Python, and other tools and languages prevalent in the Hadoop ecosystem. Design and implement end-to-end solutions. Build utilities, user-defined functions, and frameworks to better enable data flow patterns. Research, evaluate, and utilize new technologies/tools/frameworks centered around Hadoop and other elements in the Big Data space. Define and build data acquisitions and consumption strategies. Build and incorporate automated unit tests, participate in integration testing efforts. Work with teams to resolve operational and performance issues. Work with architecture/engineering leads and other teams to ensure quality solutions are implemented and engineering best practices are defined and adhered to.
Qualifications
6+ years’ experience in large-scale software development 1+ year experience in Hadoop Strong Java programming, Python, shell scripting, and SQL Strong development skills around Hadoop, Spark, MapReduce, and Hive Strong understanding of Hadoop internals Good understanding of file formats including JSON, Parquet, Avro, and others Experience with databases like Oracle Experience with performance/scalability tuning, algorithms, and computational complexity Experience (at least familiarity) with data warehousing, dimensional modeling, and ETL development Ability to understand ERDs and relational database schemas Proven ability to work with cross-functional teams to deliver appropriate resolution
Please send resumes to sankar@Aroghia.com
Additional Information
All your information will be kept confidential according to EEO guidelines.