Company Introduction
 
Design and Develop ETL Processes:
-Lead the design and implementation of ETL processes using batch and streaming tools to extract, transform, and load data from various sources into GCP.
-Collaborate with stakeholders to gather requirements and ensure that ETL solutions meet business needs.
Data Pipeline Optimization:
-Optimize data pipelines for performance, scalability, and reliability, ensuring efficient data processing workflows.
-Monitor and troubleshoot ETL processes, proactively addressing issues and bottlenecks.
Data Integration and Management:
-Integrate data from diverse sources, including databases, APIs, and flat files, ensuring data quality and consistency.
-Manage and maintain data storage solutions in GCP (e.g., BigQuery, Cloud Storage) to support analytics and reporting.
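To illustrate this kind of task, below is a minimal sketch of loading a flat file from Cloud Storage into BigQuery with the google-cloud-bigquery client; the project, bucket, dataset, and table names are hypothetical placeholders, not part of any actual environment.

```python
# A minimal sketch, assuming hypothetical project, bucket, and table names.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # hypothetical project ID

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,  # let BigQuery infer the schema for this sketch
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

# Load a flat file from Cloud Storage into a BigQuery table.
load_job = client.load_table_from_uri(
    "gs://example-bucket/exports/orders.csv",   # hypothetical source file
    "example-project.analytics.orders",         # hypothetical destination table
    job_config=job_config,
)
load_job.result()  # wait for the load job to complete

table = client.get_table("example-project.analytics.orders")
print(f"Loaded {table.num_rows} rows")
```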
GCP Dataflow Development:
-Write Apache Beam-based Dataflow jobs for data extraction, transformation, and analysis, ensuring optimal performance and accuracy (see the pipeline sketch after this section).
-Collaborate with data analysts and data scientists to prepare data for analysis and reporting.
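The sketch below shows the general shape of such an Apache Beam pipeline targeting the Dataflow runner; the bucket, table, schema, and transforms are hypothetical placeholders rather than any actual job.

```python
# A minimal Apache Beam sketch, assuming hypothetical bucket and table names;
# real jobs would use the actual sources, schema, and business transforms.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def parse_line(line):
    """Turn a CSV line into a dict matching the destination schema."""
    user_id, amount = line.split(",")
    return {"user_id": user_id, "amount": float(amount)}

options = PipelineOptions(
    runner="DataflowRunner",              # or DirectRunner for local testing
    project="example-project",            # hypothetical project ID
    region="us-central1",
    temp_location="gs://example-bucket/tmp",
)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "Read" >> beam.io.ReadFromText("gs://example-bucket/input/*.csv")
        | "Parse" >> beam.Map(parse_line)
        | "FilterValid" >> beam.Filter(lambda row: row["amount"] > 0)
        | "Write" >> beam.io.WriteToBigQuery(
            "example-project:analytics.transactions",   # hypothetical table
            schema="user_id:STRING,amount:FLOAT",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```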
Automation and Monitoring:
-Implement automation for ETL workflows using tools like Apache Airflow or Cloud Composer, enhancing efficiency and reducing manual intervention (a minimal DAG sketch follows this section).
-Set up monitoring and alerting mechanisms to ensure the health of data pipelines and compliance with SLAs.
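The following is a minimal sketch of an Airflow DAG of the kind run on Cloud Composer, with retries and a task SLA configured; the DAG name, schedule, and placeholder tasks are hypothetical.

```python
# A minimal Airflow / Cloud Composer sketch; task names and schedule are
# placeholders, and real workflows would chain the actual ETL steps.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "retries": 2,                          # retry transient failures automatically
    "retry_delay": timedelta(minutes=5),
    "sla": timedelta(hours=2),             # surface SLA misses in Airflow alerts
}

with DAG(
    dag_id="daily_etl_example",            # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    extract = BashOperator(task_id="extract", bash_command="echo extract")
    transform = BashOperator(task_id="transform", bash_command="echo transform")
    load = BashOperator(task_id="load", bash_command="echo load")

    extract >> transform >> load           # define the dependency chain
```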
-Reliability Strategy: Develop and implement strategies to improve the reliability and performance of applications and infrastructure, focusing on service level objectives (SLOs) and service level indicators (SLIs); a worked example follows this list.
-Incident Management: Oversee incident response processes, ensuring timely resolution of incidents and minimizing downtime. Conduct post-mortem analyses to identify root causes and implement preventive measures.
-Automation and Tooling: Drive the automation of operational tasks and processes, leveraging tools and technologies to enhance efficiency and reduce manual intervention.
-Monitoring and Alerting: Establish comprehensive monitoring and alerting systems to proactively identify and address performance issues, ensuring system health and availability.
-Documentation: Maintain clear and comprehensive documentation of systems, processes, and procedures to facilitate knowledge sharing and compliance.
-Security and Compliance: Ensure that all systems and processes adhere to security best practices and regulatory requirements, collaborating with security teams as needed.
-Continuous Improvement: Identify opportunities for process improvements and lead initiatives to enhance the overall reliability and performance of systems and services.
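As a worked example of the SLO/SLI arithmetic referenced above, the short sketch below computes an availability SLI and error-budget consumption from made-up request counts.

```python
# A small worked sketch of SLO/SLI arithmetic, using made-up numbers.
slo_target = 0.999                    # 99.9% availability objective
total_requests = 10_000_000           # requests observed this period (hypothetical)
failed_requests = 4_200               # requests that violated the SLI (hypothetical)

sli = (total_requests - failed_requests) / total_requests   # measured availability
error_budget = (1 - slo_target) * total_requests            # allowed failures: 10,000
budget_consumed = failed_requests / error_budget            # fraction of budget spent

print(f"SLI = {sli:.4%}, error budget consumed = {budget_consumed:.0%}")
# SLI = 99.9580%, 42% of the error budget spent -> still within the SLO.
```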
-Bachelor's degree in Computer Science, Information Technology, or a related field.
-Proven experience in designing and managing ETL solutions, including data modeling, data warehousing, and SQL development.
-Proven experience in Site Reliability Engineering, DevOps, or a related field, with a strong understanding of system architecture and cloud technologies.
-Experience with cloud-based solutions, especially GCP; cloud-certified candidates are preferred.
-Experience and knowledge of big data processing in batch and streaming modes, with proficiency in the big data ecosystem, e.g. Hadoop, HBase, Hive, MapReduce, Kafka, Flink, Spark, etc.
-Familiarity with Java and Python for data manipulation on cloud/big data platforms.
-Experience with incident management and response, including post-mortem analysis and root cause identification.
-Excellent problem-solving skills and the ability to work under pressure in a fast-paced environment.
-Strong communication skills, both verbal and written, with the ability to convey technical concepts to diverse audiences.
-Proven experience in AI and machine learning, preferably in the banking or financial services industry.