Designs and maintains the company’s data infrastructure, including database systems, data pipelines, and secure data handling. Ensures that raw biological data is transformed into structured, usable formats for analysis and product development.
- Design and implement database schemas for longitudinal and multi-dimensional biological data
- Build and maintain data ingestion and processing pipelines (ETL workflows)
- Clean, standardize, and organize raw data from multiple sources into structured formats
- Ensure data integrity, consistency, and scalability of the system
- Implement data access controls and security measures to protect sensitive data
- Manage cloud infrastructure (e.g., AWS, GCP) for data storage and processing
- Support internal teams by enabling efficient data querying and access
- Contribute to backend systems and APIs as needed for future product development
- Strong proficiency in SQL and database design
- Experience with Python or similar languages for data processing
- Familiarity with cloud platforms (AWS, GCP, or Azure)
- Understanding of data security and access control principles
- Ability to build practical, scalable systems in early-stage environments
- Master's degree in a field related to data science and database development, or 3+ years of related experience
- Experience building a database or data infrastructure from raw data
- Language requirement: English (required), Mandarin (preferred, not required)
- Ability to work full-time, in person, with our team in NYC
- Please send resume to: careers@spearmint.bio
Pay: $97,702.72 - $147,663.49 per year
Benefits:
- 401(k)
- Dental insurance
- Health insurance
- Paid time off
- Parental leave
- Relocation assistance
- Vision insurance
People with a criminal record are encouraged to apply
Language:
- English (Required)
- Mandarin (Preferred)
Ability to Commute:
- New York, NY 10019 (Required)
Work Location: In person