Senior Data Engineer

Job Title:

Senior Data Engineer

Location:

Remote (Preference for Ukraine-based candidates)

Reporting To:

Engineering Manager

About us:

Are you passionate about safeguarding countries, societies, and businesses from online manipulation and disinformation? Look no further! Our rapidly scaling company is passionately committed to this crucial mission, offering you a privileged opportunity to collaborate with some of the world's most influential organizations, including NATO and the EU. As a Ukrainian team, we are determined to deliver meaningful change.

If you're ready to join a dynamic team working towards a common vision of a safer digital world, we invite you to be a part of our journey. Shape the future with us and help defend against online threats with purpose and innovation.

Role Overview:

We are seeking a seasoned Senior Data Engineer with deep mastery of data warehousing and processing to enhance our data infrastructure. You will be responsible for designing and implementing the data architecture of a high-load system, working with textual and media content, its vectorized representations (embeddings), real-time statistical data, and graph data (connections between different actors).

Your data solutions will feed our customer-facing AI platform, enhancing our search, AI data enrichment, RAG, and more.

If you are passionate about data engineering, thrive in optimizing data workflows, and excel in architecting scalable solutions for intricate systems, this opportunity is tailor-made for you.

Responsibilities:

• Implement and support a data lake, data warehouse, and related data stores for various types of data: texts, vectors, statistics, and graphs.
• Implement ETL pipelines to populate the different stores with enriched data and guarantee data consistency.
• Implement tools for standardized internal and external querying of data in various formats and locations, including full-text search, vector similarity search, analytical aggregations, and graph search.
• Collaborate with the data-fetching team on the data format and ingestion process.
• Break down complex problems into executable tasks.
• Monitor system performance and ensure that data consistency and update latency meet the product requirements.

Required Skills and Abilities:

• Expertise in the Python programming language and data-processing libraries such as pandas and NumPy.
• Experience with Elasticsearch or a similar full-text search technology.
• Production experience with embeddings and high-load vector stores (Milvus, Qdrant, Pinecone, or similar).
• Proficiency in SQL and analytical databases (PostgreSQL, Aurora, Snowflake, Redshift, DynamoDB).
• Experience in building RAG pipelines.
• Experience deploying Docker containers to Kubernetes (K8s).
• Strong sense of ownership and the ability to tackle complex, abstract problems.
• Ability to write efficient, scalable code and unit tests to ensure consistency across the codebase.
• Team spirit and the ability to collaborate effectively with others.
• Ability to articulate a clear strategy, then map and execute the steps needed to accelerate the company toward its strategic goals.

Minimum Qualifications:

• Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
• Minimum of 5 years' experience as a Data Engineer or in a similar role, particularly on data-processing and storage tasks, with a strong understanding of complex data architectures.
• Experience building data pipelines, with an understanding of ingestion and transformation of structured, semi-structured, and unstructured data across cloud services.
• Experience implementing data solutions that operate under high load and process large volumes of data.
• Experience with text embedding models and embedding stores.
• Familiarity with containerization (Docker) and orchestration systems (Kubernetes), Linux, and shell scripting.

Preferred Qualifications:

• Experience working with media data.
• Experience with streaming pipelines.
• Experience with GCP and Google BigQuery.
• Experience in a product company or startup.
• Experience with Neo4j or a similar graph database.

What We Offer:

• Opportunity to present your product to prestigious clients, such as governments and leading NGOs, and help them combat information threats and security challenges.
• A chance to grow your career and step into a leadership role.
• Autonomy and freedom to run experiments and bring your own ideas to life.
• As a key contributor, you'll also be rewarded through our stock compensation program.
• Flexibility of fully remote work, with the option to use our vibrant co-working space in Kyiv, Ukraine.
