ETL Hadoop Developer

Posted 13 Sep 2018

University of California Office of the President

San Francisco, CA United States

etl hadoop xml

Requisition Number: 20180395

Appointment Type: Staff - Career

Personnel Program: MSP

Work Hours: Monday-Friday, 8:00am-5:00pm

Percentage of Time: 100

Organizational Area: Information Technology Services - Data Warehouse and Corporate Systems

Location: Oakland, CA

Posting Salary: Salary commensurate with experience

Position Summary:

Join a small team of experts and make a significant contribution to the University of California. Take ownership and build your career on this complex, cross-functional enterprise system.

The Enterprise Risk Management Information System (ERMIS) and the Medical Center Data Management System (DMS) were implemented to manage risk strategically in order to reduce the chance of loss, create greater financial stability, and protect UC resources. The data warehouse serves as the data repository for risk- and controls-related information from claims data, corporate data, human capital, risk data, and many other information sources.

Under the direction of the ERMIS Supervisor and other members of the ITS Data Services team, the incumbent is responsible for the development, support, and maintenance of the current platform, and for the design and implementation of new Extract, Transform, and Load (ETL) processes that load data into the ERMIS data warehouse from various sources. The incumbent is also responsible for the ongoing operational stability of the ETL processes, ensuring they are properly monitored and audited to provide data integrity, accuracy, and timely delivery. The Sr. Developer/Lead is responsible for leading and mentoring developers; the work includes data analysis, source-to-target data mappings, job scheduling, and development and testing of ETL programs.


Duty 1: Ensure that all development standards are being followed and work closely with the BI Functional Lead and the Solutions Architect. Serve as a hands-on Sr. Developer/Lead to a team of ETL developers. Oversee code review functions for application programs. Mentor developers in technical matters regarding the implementation of ITS data standards, guidelines, and industry best practices.

Understand and apply industry practices, community standards, and department/unit policies and procedures in depth.

Develop load and transformation processes in support of the requirements, validate that they meet business and technical specifications, manage ongoing maintenance of the system and data, and make recommendations for process improvements to optimize data movement from source to target.

Provide the ERMIS Supervisor/Manager with status and progress reports on development and maintenance activities as required. Participate in project planning and estimation.

Function: Technical Leadership

Percent: 40

Duty 2: Provide production and operational support to existing ETL jobs. Monitor and manage production ETL jobs to verify execution and measure performance, assure ongoing data quality, optimize the system for scalability and performance, and identify improvement opportunities for key ETL processes.

Develop and execute ETL test plans. Ensure that any issues raised during testing are addressed prior to migration into production.

Function: Production and Operational Support

Percent: 40

Duty 3: Perform application upgrades, maintenance, and patches. Handle resource and capacity planning and performance tuning. Provide access security rights. Work with the SDSC data center on code migration and version control across environments.

Function: DataStage Administration

Percent: 20

Job Requirements

Bachelor's degree in computer science, information management or a related field.

Minimum of 10 years of experience in data-related work, with at least 4 years as a senior ETL developer/lead for large, complex decision support systems with an emphasis on Hadoop implementation. Experience leading a small team in the design, development, testing, and implementation of ETL solutions using enterprise ETL tools.

Experience with data warehouse and big data applications using Hadoop and relational database management systems required.

Strong knowledge of programming and scripting languages such as Java, Python, Scala, and Unix shell.

Experience with major big data technologies, specifically the Cloudera distribution of Hadoop, Hive, Impala, MapReduce, Spark, SQL/HQL, UDFs, Maven, Jenkins (continuous integration), XML (parsing), Hive XML SerDe, and Solr.

Extensive experience in developing, testing, and implementing ETL processes, data profiling, data quality, metadata management, and data intake for large, complex data systems. Skill in modifying, performance-tuning, and documenting existing application ETL code. Ability to analyze, verify, and document the accuracy of the developed ETL code through self-directed testing.

Ability to write technical documentation in a clear and concise manner. Ability to write a high-level design/functional design. Ability to interpret data modeling/data workflow diagrams (conceptual, logical, and physical).

Ability to plan and organize technical work and deliverables. Ability to follow guidelines and adhere to the established software development standards and conventions.

Self-motivated and independent. Able to work with minimal supervision and to work well with stakeholders and project staff. Ability to prioritize and multitask across numerous work streams.

Strong interpersonal skills; ability to work on cross-functional teams. Strong verbal and written communication skills, with an ability to express complex technical concepts in business terms and complex business concepts in technical terms. Ability to lead teams to consensus decisions on complex business and technical data challenges.

Deep knowledge of best practices, gained through relevant experience across data-related disciplines and technologies, particularly for enterprise-wide data architectures and data warehousing/BI.

Able to work independently and as part of a team. Demonstrated problem-solving skills. Ability to learn effectively and meet deadlines. Demonstrated skill leading technical teams, including organizing workflow and scheduling assignments.

License Certifications:

Cloudera Certification


Experience with DataStage, DB2, Pig, HBase, Kafka, Oozie, Flume, and ZooKeeper is a plus.

Experience with enterprise risk management data is preferred.


How to Apply

For complete job description and application instructions, visit:

About us

The University of California, one of the largest and most acclaimed institutions of higher learning in the world, is dedicated to excellence in teaching, research, and public service. The University of California Office of the President is the corporate headquarters for the ten campuses, five medical centers, and three Department of Energy national laboratories. The University enrolls premier students from California, the nation, and the world.

The University of California is an Equal Opportunity/Affirmative Action Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, age or protected veteran status.


Job Source: Stackoverflow