We are seeking a talented, self-directed Data Engineer to design, develop, implement, test, document, and operate large-scale, high-volume, high-performance data structures for our business stakeholders. Implement data structures using best practices in data modeling and ETL/ELT processes.
Gather business and functional requirements and translate them into robust, scalable, operable solutions that work well within the overall data architecture. Analyze source data systems and drive best practices in source teams.
Participate in the full development life cycle, end-to-end, from design, implementation and testing, to documentation, delivery, support, and maintenance. Produce comprehensive, usable dataset documentation and metadata.
Evaluate and make decisions on dataset implementations designed and proposed by peer data engineers, as well as on the use of new or existing software products and tools.
- Design and implement data pipelines on the Hadoop platform
- Understand business requirements and solution designs, and develop and implement solutions that adhere to big data architectural guidelines and address business needs
- Fine-tune new and existing data pipelines
- Schedule and maintain data pipelines
- Drive optimization, testing and tooling to improve data quality
- Assemble large, complex data sets that meet functional / non-functional business requirements
- Identify, design, and implement internal process improvements, such as automating manual processes and optimizing data delivery
- Build robust and scalable data infrastructure (both batch processing and real-time) to support needs from internal and external users
- Work with data scientists and the business analytics team to assist with data ingestion and data-related technical issues
The ideal candidate should have/be:
- Bachelor's degree in IT, Computer Science, Software Engineering, Business Analytics or equivalent.
- Minimum 4 years of experience in data warehousing / distributed systems such as Hadoop
- Experience with relational SQL and NoSQL databases
- Experience building and optimizing big data pipelines, architectures, and data sets
- Strong proficiency in Scala or Python
- Experience with ETL and/or data wrangling tools in big data environments
- Ability to troubleshoot and resolve complex query performance issues on the Spark platform
- Knowledgeable about structured and unstructured data design/modeling, data access, and data storage techniques
- Experience working in a DevOps environment
- Highly organized, self-motivated, proactive, and able to plan effectively
- Ability to analyze and understand complex problems
- Ability to explain technical information in business terms
- Ability to communicate clearly and effectively, both verbally and in writing
- Strong in user requirements gathering, maintenance, and support
- Solid experience managing users and vendors
- Familiar with Agile methodology
For more information about this role, please contact our Singapore office: Spencer Ogden Energy Pte Ltd (Agency License Number: 13C6321).