N-iX is looking for an experienced Middle Data Engineer to join an ambitious team of Data Scientists and Data Engineers with a unique blend of Business, Scientific, and Machine Learning expertise. The team seeks to unleash the full potential of our data assets by bringing technology, data, and insights to the forefront of decision-making via fit-for-purpose data analytics solutions.
About the role:
The Data Engineer is a key technical role that enables the application of advanced analytics techniques across clinical development processes. The Data Engineer enables data flow, data wrangling, and data modeling in support of advanced analytics techniques such as Machine Learning and Visual Analytics, and leverages standards and best practices to ensure consistency. The role is accountable for gathering cross-functional input and for ensuring data integrity and quality prior to analysis.
Responsibilities:
- Design, develop, test, and document complex programs from approved specifications, along with their subsequent iterations, using the operating model process and best practices
- Create and maintain optimal data pipeline architecture
- Assemble large, complex data sets that meet functional / non-functional business requirements
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL/NoSQL and AWS ‘big data’ technologies
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores
Requirements:
- 2+ years of experience with Python, Spark (PySpark), and SQL
- Hands-on experience with at least one relational and one NoSQL database or data store, such as Postgres, S3, Redshift/Spectrum, Cassandra, DynamoDB, etc.
- Experience with at least one data pipeline and workflow management tool: Airflow, Oozie, etc.
- Good communication and presentation skills: the ability to explain complex problems and the solutions applied, and to present technical solutions comfortably to a non-technical audience
- BS or MS in statistics, mathematics, analytics, data/computer science, or an equivalent field
- Upper-intermediate or higher level of English
Nice to have:
- Biotech / Pharma experience is a plus
- Experience with other big data technologies: Hadoop, HDFS, EMR, Cloudera, etc.
- Experience with SOAP, REST, JSON, and XML technologies, and knowledge of Unix/Linux shell scripting
- Experience with AWS cloud services as a platform and as an infrastructure
- Experience with stream-processing systems: Storm, Spark Streaming, Kinesis, etc.
- Understanding of statistics and machine learning concepts
- Implementation experience with DevOps CI/CD automation
We offer:
- Flexible working hours
- A competitive salary and good compensation package
- Best hardware
- A masseur and a corporate doctor
- Healthcare & sport benefits
- An inspiring and comfy office
- Challenging tasks and innovative projects
- Meetups and events for professional development
- An individual development plan
- Mentorship program
- Corporate events and outstanding parties
- Exciting team-building events
- Memorable anniversary presents
- A fun zone where you can play video games, foosball, ping pong, and more