About the Role
We are looking for an experienced Data Service Module Engineer to develop and deploy the data service module for the HPC modeling project.
This role focuses on implementing high-performance data storage and retrieval systems using HDF5 or similar technologies, with parallel and concurrent I/O capabilities.
The ideal candidate will have expertise in designing scalable data services optimized for HPC or distributed workflows, ensuring low latency and high throughput.
Key Responsibilities
- Module Development and Deployment:
  - Design and implement the data service module using HDF5 for efficient data storage and retrieval.
  - Develop parallel and concurrent I/O mechanisms to optimize performance for large-scale datasets.
  - Ensure the module is tightly integrated with HPC and visualization workflows.
- Performance Optimization:
  - Optimize I/O operations for CPU/GPU-based workflows to minimize bottlenecks.
  - Implement caching, compression, and other strategies to enhance performance.
- Data Management:
  - Design data structures and schemas suitable for storing 3D grid data and other simulation outputs.
  - Ensure data integrity and consistency during concurrent read/write operations.
- Testing and Validation:
  - Develop and execute test cases to validate module performance and reliability under various load conditions.
  - Conduct benchmarking to ensure scalability across different hardware configurations.
- Documentation and Support:
  - Document the architecture, APIs, and usage guidelines for the data service module.
  - Provide technical support to the development and visualization teams for data integration.
Qualifications
Education
Bachelor’s or Master’s degree in Computer Science, Software Engineering, or related fields.
Experience
- 3+ years of experience developing and deploying data services for HPC or similar systems.
- Proven expertise with HDF5 or similar formats for parallel I/O operations.
- Equivalent experience in distributed systems is also acceptable.
Technical Skills
- Programming: Strong proficiency in at least one of C++, Python, Go, or Fortran.
- HDF5 Expertise: In-depth knowledge of HDF5 APIs and advanced features like parallel HDF5.
- Parallel I/O: Experience with MPI I/O, POSIX I/O, or similar frameworks for concurrent/parallel data access.
- Performance Optimization: Skills in profiling and optimizing I/O operations for large datasets.
- Proficiency in SQL and experience with any RDBMS.
- Nice to have: knowledge of at least one orchestration and scheduling tool (e.g., Airflow, Prefect, Dagster).
Soft Skills
- Strong problem-solving skills and ability to work in a multidisciplinary team.
- Excellent communication skills for cross-team collaboration and documentation.
Preferred Qualifications
- Familiarity with data formats used in scientific computing, 3D visualization, and simulation workflows.
We offer
- Flexible working format: remote, office-based, or a mix of both
- A competitive salary and good compensation package
- Personalized career growth
- Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
- Active tech communities with regular knowledge sharing
- Education reimbursement
- Memorable anniversary presents
- Corporate events and team buildings
- Other location-specific benefits
- Note: benefits are not applicable for freelancers
Seniority level
Not Applicable
Employment type
Full-time
Job function
- Information Technology and Engineering