SemanticBits is looking for a talented Data Engineer who is eager to apply computer science, software engineering, databases, and distributed/parallel processing frameworks to prepare big data for use by data analysts and data scientists. You will deliver data acquisition, transformation, cleansing, conversion, compression, and loading of data into data and analytics models. You will work in partnership with data scientists and analysts to understand use cases, data needs, and outcome objectives. You are a practitioner of advanced data modeling and optimization of data and analytics solutions at scale. You are an expert in data management, data access (big data, data marts, etc.), programming, and data modeling, and you are familiar with analytic algorithms and applications such as machine learning.
SemanticBits is a leading company specializing in the design and development of digital health services, and the work we do is just as unique as the culture we’ve created. We develop cutting-edge solutions to complex problems for commercial, academic, and government organizations. The systems we develop are used in finding cures for deadly diseases, improving the quality of healthcare delivered to millions of people, and revolutionizing the healthcare industry on a nationwide scale. There is a meaningful connection between our work and the real people who benefit from it. As such, we create an environment in which new ideas and innovative strategies are encouraged. We are an established company with the mindset of a startup, and we are confident that we offer an employment experience unlike any other and that we set our employees up for professional success every day.
Requirements:
- Bachelor’s degree in Computer Science (or a related field)
- Three or more years of experience in data engineering
- At least two years working with Scala and Spark
- Strong knowledge of computer science fundamentals: object-oriented design and programming, data structures, algorithms, databases (SQL and relational design), networking
- Demonstrable experience engineering scalable data processing pipelines
- Demonstrable expertise with Scala, Spark, and wrangling of data formats such as Parquet, CSV, XML, and JSON
- Experience with the following technologies is highly desirable: Teradata, AWS EMR, AWS EC2, AWS S3, Airflow, SAS, Hadoop, Java, Spring Boot, Angular
- Experience with Agile methodology and test-driven development
- Excellent command of written and spoken English
- Self-driven problem solver
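To give a concrete sense of the data wrangling the requirements above describe, here is a minimal plain-Scala sketch of CSV cleansing and transformation. It is illustrative only: a production pipeline would use Spark DataFrames, and the `Claim` record and its fields are hypothetical.

```scala
// Hypothetical record for one row of ingested data.
case class Claim(id: String, amount: Double)

object Wrangle {
  // Parse one CSV line into a Claim, silently dropping malformed rows (cleansing).
  def parse(line: String): Option[Claim] =
    line.split(",", -1) match {
      case Array(id, amt) if id.trim.nonEmpty =>
        scala.util.Try(Claim(id.trim, amt.trim.toDouble)).toOption
      case _ => None
    }

  // Transform a batch: keep well-formed rows with positive amounts,
  // and return them alongside a running total.
  def load(lines: Seq[String]): (Seq[Claim], Double) = {
    val rows = lines.flatMap(parse).filter(_.amount > 0)
    (rows, rows.map(_.amount).sum)
  }
}
```

For example, `Wrangle.load(Seq("a,10.5", "b,-1", ",3", "c,oops", "d,2.5"))` keeps only the `a` and `d` rows; the rest are rejected as negative, missing an id, or non-numeric.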
Benefits:
- Generous base salary
- Three weeks of PTO
- Excellent health benefits program (medical, dental, and vision)
- Education and conference reimbursement
- 401(k) retirement plan with a company contribution of 3% of base salary, irrespective of the employee’s contribution
- 100% paid short-term and long-term disability
- 100% paid life insurance
- Flexible Spending Account (FSA)
- Casual working environment
- Flexible working hours