Pearson Data Engineer in Poznan, Poland

Data Engineer


Responsibilities:
  • Designing and implementing ETL processes in third-party cloud solutions (mostly AWS)

  • Migrating legacy processes to cloud services to minimise technical debt

  • Maintaining existing ETL processes

  • Integrating data from a variety of sources

  • Contributing to Pearson's long-term digital transformation strategy


Requirements:
  • Good knowledge of Python and popular libraries and tools, e.g. virtualenv, pip, argparse, logging, datetime, imp, psycopg2, mysql-connector, pymongo, boto3, pandas, numpy, Anaconda

  • Awareness of best practices in software development

  • Ability to write clean and efficient code

  • Good knowledge of SQL and relational databases

  • Experience with Git and a Git-based code review process

  • Hands-on experience with Big Data technologies like Spark and cloud services like AWS or Google Cloud Platform

  • At least a basic understanding of AWS service architecture

  • Knowledge of data modeling

  • Experience working in a Linux environment

  • Good spoken English

Nice to have:
  • Good knowledge of AWS services: S3, EC2, EMR, Athena, Lambda, SNS, Glue

  • Knowledge of NoSQL databases

  • Experience working with TensorFlow or similar technologies, e.g. PyTorch

  • Experience with Jupyter/Zeppelin notebooks

  • Experience working in an international environment

Primary Location: PL-PL-Poznan

Work Locations: PL-Poznan-77 Dabrowskiego, Dąbrowskiego 77, Poznan 60-529

Job: Research and Development

Organization: Global Product

Job Type: Standard

Shift: Day Job

Job Posting: Aug 7, 2018

Job Unposting: Ongoing

Schedule: Full-time Regular

Req ID: 1811273

Equal Opportunity Employer Minorities/Women/Protected Veterans/Disabled