Data Engineer

Moscow, Russia
Middle • Senior
Analytics, Data Science, Big Data • OLAP • Engineer • Developer • Python • Hadoop • ClickHouse • MapReduce • Vertica
Remote work
3 to 5 years of work experience
From 370,000 to 430,000 ₽
Resume file available (protected)
About me

Currently a Data Engineer.

My competencies and experience

  1. RUBBLES, Senior Data Engineer | Moscow, Russia 
    • Development, implementation, and maintenance of ETL pipelines in Airflow, including integration with various data sources such as Firebird, MSSQL, and PostgreSQL. 
    • Management of a multi-tiered data architecture in a centralized PostgreSQL-based data warehouse, ensuring reliable data storage and access. 
    • Data processing, filtering, and transformation to create data marts, providing high-quality datasets for machine learning and analytics needs. 
    • Full migration of all ETL processes to the Kubernetes (k8s) environment, including integration with Nexus and transitioning to KubernetesPodOperator for enhanced solution flexibility and scalability (a sketch of this setup follows this list). 
    • Support of a FastAPI-based web server for internal analysts, reducing average data access time by 30 minutes. 
    • Development of a monitoring and alerting system to track DAG execution in Airflow, as well as the state and performance of the web server, including metrics such as health checks, RPS, and the number of successful (200) requests, using Prometheus (see the second sketch after this list). 
    • Optimization of complex SQL queries in PostgreSQL using tools like EXPLAIN, ANALYZE, VACUUM, and others, reducing execution time for some queries from 20 to 5 minutes. 
    • Optimization of Docker image storage on the file system to improve resource efficiency.
    • Execution of ad-hoc tasks requiring rapid analysis and decision-making. 
    • Participation in technical task discussions and preparation of technical documentation (specifications) for development, coordinating requirements with stakeholders.
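
A minimal sketch of the kind of Airflow DAG and KubernetesPodOperator setup described above; the DAG id, schedule, namespace, and Nexus image path are illustrative assumptions, not taken from a real project.

```python
# Hypothetical sketch: running an ETL step as a Kubernetes pod from Airflow.
# DAG id, schedule, namespace, and the Nexus image path are illustrative only.
from datetime import datetime

from airflow import DAG
# Import path may differ between provider versions (older releases use
# airflow.providers.cncf.kubernetes.operators.kubernetes_pod).
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(
    dag_id="etl_build_data_marts",      # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="0 3 * * *",               # nightly; Airflow <2.4 uses schedule_interval
    catchup=False,
) as dag:
    build_marts = KubernetesPodOperator(
        task_id="build_data_marts",
        name="build-data-marts",
        namespace="airflow",
        # Image pulled from a private Nexus registry (path is an assumption).
        image="nexus.example.internal/etl/build-marts:latest",
        cmds=["python", "-m", "etl.build_marts"],
        get_logs=True,
    )
```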
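A second minimal sketch, showing how a FastAPI service can expose the Prometheus metrics mentioned above (health check, request counts by status code); the endpoint and metric names are assumptions for illustration.

```python
# Hypothetical sketch: counting requests by HTTP status and exposing /metrics
# for Prometheus to scrape. Endpoint and metric names are illustrative.
from fastapi import FastAPI, Request
from prometheus_client import Counter, make_asgi_app

app = FastAPI()

REQUESTS = Counter(
    "http_requests_total",
    "HTTP requests processed, labelled by status code",
    ["status"],
)

@app.middleware("http")
async def count_requests(request: Request, call_next):
    response = await call_next(request)
    # Increments e.g. the "200" series for successful requests.
    REQUESTS.labels(status=str(response.status_code)).inc()
    return response

@app.get("/health")
async def health():
    # Simple health check endpoint for the alerting system.
    return {"status": "ok"}

# Expose all registered metrics at /metrics.
app.mount("/metrics", make_asgi_app())
```

RPS can then be derived in Prometheus with an expression such as rate(http_requests_total[1m]).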
     
  2. MAKING SCIENCE, Senior Data Engineer | Madrid, Spain 
    • Completing the entire ETL cycle, including design, documentation, implementation, testing, and deployment. 
    • Improvement and optimization of the image generation service, reducing its run time from 3 minutes to 1 minute. 
    • Development and testing of new features for the image generation service, with deployment to production via Terraform. 
    • Conducting code reviews and introductory lessons for interns.
     
  3. BERESNEV Games, Middle Data Engineer | Prague, Czech Republic
    • Maintaining and building ELT pipelines, from source to output, using Airflow, Python, and Docker. 
    • Monitoring, optimizing, and maintaining the technical platform (ClickHouse) for scalability using Grafana. 
    • Optimized the main data pipeline from 3 hours to 30 minutes using indexes, projections, and a variety of MergeTree engines in ClickHouse (a sketch follows this list). 
    • Extracting, loading, transforming, and ensuring the quality of data from a variety of sources: APIs (Facebook, Google, App Store, Firebird, TikTok, Pangle, etc.) and Amazon S3. 
    • Designing and developing clear, impactful dashboards in Tableau, which help BI analysts access and understand the data and save 3-5 hours. 
    • Developing a FastAPI web service for advertising monetization analytics ('waterfall'), which saved analysts more than 5 hours per week. 
    • Providing a single source of truth for core KPIs and managing data quality together with the business, which increased the analytics department's performance by 20%. 
    • Created a Python library to parse and reformat data from external vendors, reducing the error rate in the data pipeline by 30%. 
    • Administering, maintaining, supporting, and improving the ClickHouse cluster.
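
A minimal sketch of the kind of ClickHouse optimization mentioned above, using the clickhouse-driver client; the table, columns, projection, and connection details are hypothetical.

```python
# Hypothetical sketch: a MergeTree table with an ordering key suited to the main
# queries, plus a projection pre-aggregated for a common report.
# Table name, columns, and connection details are illustrative.
from clickhouse_driver import Client

client = Client(host="localhost")

client.execute("""
    CREATE TABLE IF NOT EXISTS events
    (
        event_date Date,
        app_id     LowCardinality(String),
        country    LowCardinality(String),
        revenue    Float64
    )
    ENGINE = MergeTree
    ORDER BY (app_id, event_date)
""")

# A projection stores the data pre-aggregated by day and country,
# so the daily report no longer scans raw rows.
client.execute("""
    ALTER TABLE events
    ADD PROJECTION daily_revenue
    (
        SELECT event_date, country, sum(revenue)
        GROUP BY event_date, country
    )
""")
client.execute("ALTER TABLE events MATERIALIZE PROJECTION daily_revenue")
```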
     
  4. OZON, Junior Data Engineer | Moscow, Russia
    • Designing ETL (Extract, Transform, Load) tasks, maintaining data integrity, and verifying pipeline stability (a sketch of such a check follows this list). 
    • Designing, improving, and maintaining large (billions of rows) data marts on Vertica/ClickHouse, which were used by the entire BI department and helped increase KPI metrics by 10%. 
    • Deploying and maintaining data services such as Airflow using Docker. 
    • Communication with the BI analytics department, discussion of tasks, and development of technical specifications. 
    • Designing data warehouses (snowflake/star schemas).
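
A minimal sketch of the kind of data-integrity check implied above: comparing row counts between a source table and a loaded data mart before publishing; the table names, tolerance, and DB-API cursors are assumptions.

```python
# Hypothetical sketch: verify that a loaded data mart matches its source
# within a small tolerance before downstream tasks run.
# Cursors follow the Python DB-API; table names and tolerance are illustrative.

def row_count(cursor, table: str) -> int:
    cursor.execute(f"SELECT count(*) FROM {table}")
    return cursor.fetchone()[0]

def check_integrity(source_cur, target_cur,
                    source_table: str, target_table: str,
                    tolerance: float = 0.001) -> None:
    src = row_count(source_cur, source_table)
    dst = row_count(target_cur, target_table)
    # Fail the pipeline loudly instead of silently publishing a short mart.
    if src == 0 or abs(src - dst) / src > tolerance:
        raise ValueError(
            f"Row count mismatch: {source_table}={src}, {target_table}={dst}"
        )
```

In an Airflow setup this kind of check would typically run as a separate task between the load and publish steps.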

