Senior Big Data Engineer
Final is a world leader in the development of trading algorithms and trade execution technologies. Our multi-disciplinary teams have developed a unique and highly successful HFT platform based on machine learning algorithms that delivers excellent results. In a world increasingly dominated by learning machines and artificial intelligence, we at Final are especially proud of our humans. Our elite team of exceptional people is the soul of our company, and it is our top priority to provide them with a professionally fulfilling environment that supports a healthy work-life balance. Our employees are encouraged to pursue their passions outside of work, and we are proud to offer them a variety of opportunities, multiple resources and an agile work environment that promotes their well-being.
We are searching for an innovative and experienced Senior Big Data Engineer to join our big data group.
As a Senior Big Data Engineer, you will:
- Lead the architecture and development of mission-critical, diverse and large-scale data pipelines across both public cloud and on-prem solutions.
- Derive policies for data storage and versioning across complex, petabyte-scale storage solutions spanning multiple data lakes.
- Design and develop smart ML infrastructure in order to build Agile ML processes (MLOps).
- Work with Data Scientists to understand needs and design ML infrastructure solutions.
- Be a part of a collaborative heterogeneous data team which includes developers, data scientists, data engineers, MLOps and data scouts.
Requirements:
- BSc / MSc degree in Computer Science, Engineering, Mathematics or Statistics
- At least 5 years of experience working as a Data Engineer, Data Architect or ML Engineer
- At least 3 years of experience working in high-level software development, preferably in Python
- Proven hands-on experience in designing, developing and optimizing complex solutions that move and/or manipulate large volumes of data
- Sound understanding of different big data file formats such as Parquet, Arrow, HDF5, etc.
- Experience with Docker, Linux, Kubernetes, and CI/CD tools and concepts.
- Experience with data pipelining tools such as Airflow, Kubeflow, MLflow and similar technologies.
- Hands-on experience with large-scale distributed computing systems and related technologies (e.g., Spark, Hive)
- Understanding of ML concepts and processes.
- Results-oriented, self-motivated, a fast learner with a can-do attitude and a great team player
Advantage:
- Experience with and understanding of various storage solutions (NFS, S3, software-defined distributed storage, etc.)
- Experience working with data versioning tools such as DVC, lakeFS, Git LFS or similar technologies.
- Experience working with AWS data processing tools and concepts.
- Experience with large, offline on-premises production deployments.
- Experience with Pandas, NumPy and Jupyter Notebook.
- Hands-on experience in lower-level programming languages such as C++ or Rust.