Are you a high-performing data engineer or data scientist looking to take part in cutting-edge research and production projects? Do you enjoy reading and investigating advancements in applied machine learning architectures and solution white papers? Would you like to take part in, or drive, the creation of publishable advancements in machine learning across various disciplines? You could be a great match for a Machine Learning Data Engineer role at Capital One's Center for Machine Learning (C4ML).
C4ML is a highly technical team focused on consulting, research, and building machine learning products for the enterprise. We have executive support at the highest level to act as a catalyst for machine learning across Capital One. As a Machine Learning Data Engineer in C4ML, you will work on one of our cutting-edge products at the intersection of machine learning, distributed computing, and DevOps. You will use technologies like AWS, Kubernetes, Docker, TensorFlow, Spark, and Kafka to build a containerized platform for deploying distributed frameworks. Our goal is to handle the machine learning and big data infrastructure needs of the entire company. You will also have opportunities to work in the consulting and research branches of the team.
What you will bring to the role:
- Excellent communication skills, evidenced by multiple white papers (internal and proprietary, or externally published).
- Demonstrated ability to build full-stack systems architected for speed and distributed computing.
- Demonstrated ability to quickly learn new tools and paradigms to deploy cutting-edge solutions.
- Experience mentoring junior engineers.
- Adept at simultaneously working on multiple projects, meeting deadlines, and managing expectations.
What you will do in the role:
- Act as an advisor to various lines of business to help create or improve projects.
- Develop both deployment architecture and scripts for automated system deployment in AWS.
- Code new machine learning paradigms, sometimes from first principles, for integration into production systems.
- Learn from and work with subject matter experts to create large-scale deployments using newly researched methodologies.
- Construct data staging layers and fast real-time systems to feed machine learning algorithms.
- Create white papers, attend conferences, and contribute to open source software.
Basic Qualifications:
- Bachelor’s Degree or Military Experience
- At least 2 years of experience designing and building full-stack solutions utilizing distributed computing.
- At least 2 years of experience working with Python, Scala, or Java.
- At least 2 years of experience with distributed file systems or multi-node database paradigms.
Preferred Qualifications:
- Master’s Degree or PhD
- At least 2 years of experience deploying production applications to a cloud services provider, such as AWS.
- At least 2 years of experience with machine learning or deep learning frameworks, such as TensorFlow, PyTorch or H2O.
- At least 2 years of experience with distributed data movement frameworks, such as Spark, Kafka, or Dask.
- At least 3 years of experience with a container orchestration platform, such as Kubernetes.
- At least 5 years of experience with CI/CD technologies, such as Ansible, CloudFormation, or Jenkins.
- At least 5 years of experience leading teams in code development.
At this time, Capital One will not sponsor a new applicant for employment authorization for this position.