Company Profile
Design and Build Data Processing Systems:
• Collaborate with cross-functional teams to understand data requirements and design efficient data pipelines.
• Implement data ingestion, transformation, and enrichment processes using GCP services (such as BigQuery, Dataflow, and Pub/Sub).
• Ensure scalability, reliability, and performance of data processing workflows.
Data Ingestion and Processing:
• Collect and ingest data from various sources (both batch and real-time) into GCP.
• Cleanse, validate, and transform raw data to ensure its quality and consistency.
• Optimize data processing for speed and efficiency.
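The cleanse-and-validate step above can be sketched in plain Python. The record schema (`user_id`, `timestamp`, `amount`) and the validation rules are hypothetical, for illustration only; they are not taken from any specific pipeline:

```python
from datetime import datetime
from typing import Optional

def cleanse_record(raw: dict) -> Optional[dict]:
    """Validate and normalize one raw record; return None if it fails checks.

    The field names and rules here are illustrative assumptions.
    """
    if not raw.get("user_id") or "timestamp" not in raw:
        return None  # reject records missing required fields
    try:
        ts = datetime.fromisoformat(raw["timestamp"])
    except (TypeError, ValueError):
        return None  # reject unparseable timestamps
    return {
        "user_id": str(raw["user_id"]).strip(),
        "event_time": ts.isoformat(),
        "amount": round(float(raw.get("amount", 0.0)), 2),
    }
```

In a real GCP pipeline, logic like this would typically live inside a Dataflow/Beam transform rather than a standalone function.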
Data Storage and Management:
• Choose appropriate storage solutions (e.g., Bigtable, Cloud Storage) based on data characteristics and access patterns.
• Create and manage data warehouses, databases, and data lakes.
• Define data retention policies and archival strategies.
Data Preparation for Analysis:
• Prepare data for downstream analytics, reporting, and machine learning.
• Collaborate with data scientists and analysts to understand their requirements.
• Ensure data is accessible, well-organized, and properly documented.
Automation and Monitoring:
• Automate data workflows using tools like Apache Airflow or Cloud Composer.
• Monitor data pipelines, troubleshoot issues, and proactively address bottlenecks.
• Implement alerting mechanisms for data anomalies.
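As a minimal, stdlib-only sketch of the anomaly-alerting idea above, one could flag points that deviate sharply from a trailing window. The window size and threshold are illustrative assumptions, not tuned values:

```python
import statistics
from typing import List, Tuple

def detect_anomalies(values: List[float], window: int = 5,
                     threshold: float = 3.0) -> List[Tuple[int, float]]:
    """Flag points deviating more than `threshold` standard deviations
    from the mean of the trailing `window` observations.
    """
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mean = statistics.mean(history)
        spread = statistics.stdev(history) or 1e-9  # avoid divide-by-zero
        if abs(values[i] - mean) > threshold * spread:
            alerts.append((i, values[i]))
    return alerts
```

In production, such a flag would feed an alerting channel (for example, a Cloud Monitoring alerting policy) rather than being returned as a list.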
Security and Compliance:
• Apply security best practices to protect sensitive data.
• Ensure compliance with industry regulations (e.g., GDPR, HIPAA) and internal policies.
• Collaborate with security teams to address vulnerabilities.
Cloud Infrastructure Management:
• Manage and monitor Google Cloud Platform (GCP) services and components.
• Design and implement automated deployment pipelines for application releases.
• Ensure high availability, scalability, and security of cloud resources.
CI/CD Pipeline Implementation:
• Build and maintain continuous integration and continuous deployment (CI/CD) pipelines.
• Collaborate with developers to streamline application deployment processes.
• Automate testing, deployment, and rollback procedures.
Infrastructure as Code (IaC):
• Use tools like Terraform to define and manage infrastructure.
• Maintain version-controlled infrastructure code.
• Ensure consistency across development, staging, and production environments.
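As one hedged illustration of the Terraform-based approach described above, a versioned Cloud Storage bucket with an archival lifecycle rule might be declared as follows. The project ID, region, and bucket name are placeholders, not values from this posting:

```hcl
provider "google" {
  project = "example-project"  # placeholder project ID
  region  = "us-central1"      # placeholder region
}

resource "google_storage_bucket" "data_lake" {
  name          = "example-data-lake-bucket"  # placeholder bucket name
  location      = "US"
  force_destroy = false

  versioning {
    enabled = true
  }

  lifecycle_rule {
    condition {
      age = 365  # objects older than one year
    }
    action {
      type          = "SetStorageClass"
      storage_class = "ARCHIVE"
    }
  }
}
```

Keeping definitions like this in version control is what lets the same configuration be promoted consistently across development, staging, and production.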
Monitoring and Troubleshooting:
• Monitor system performance, resource utilization, and application health.
• Troubleshoot and resolve issues related to cloud infrastructure and deployment pipelines.
• Implement proactive monitoring and alerting mechanisms.
Security and Compliance:
• Apply security best practices to protect cloud resources.
• Ensure compliance with industry standards and internal policies.
• Collaborate with security teams to address vulnerabilities.
Documentation and Knowledge Sharing:
• Document data pipelines, architecture, and processes.
• Share knowledge with team members through documentation, training sessions, and workshops.
What are we looking for?
• Bachelor’s degree in Computer Science, Information Systems, or a related field.
• Minimum of 3 years of industry experience, including at least 1 year designing and managing solutions using Google Cloud.
• Familiarity with GCP services (BigQuery, Dataflow, Pub/Sub, etc.) and related technologies.
• Experience with data modeling, ETL processes, data warehousing, and SQL.
• Familiarity with GCP services (Compute Engine, Kubernetes Engine, Cloud Storage, etc.).
• Exposure to containerization (Docker, Kubernetes) and microservices architecture.
• GenAI prompt engineering, infrastructure as code, mastery of CI/CD pipelines, containerization and orchestration, IAM, compliance awareness, design for scalability, cost optimization, and multi-cloud and hybrid strategies.
What additional skills will be good to have?
• Proficiency in Python for data manipulation and scripting.
• Strong knowledge of programming languages such as Java.
• Knowledge of Terraform for infrastructure as code (IaC).
• Familiarity with Jenkins for continuous integration and deployment.
• Strong understanding of DevOps principles and practices.
• Experience with GKE (Google Kubernetes Engine) for container orchestration.
• Understanding of event streaming platforms (e.g., Kafka, Google Cloud Pub/Sub).
• Strong problem-solving skills and attention to detail.
• Proven experience in AI and machine learning, preferably in the banking or financial services industry.