The Client provides expert advisory and implementation services for open source big data solutions. The Client is the first and only pure-play big data services firm, and their Data Scientists and Engineers are trusted advisors to the world's most innovative companies. Their experienced teams combine a distinctive methodology with a proven framework that includes tested design patterns and pre-built components to help clients build applications faster. The Client helps customers leverage Big Data analytics by integrating open source platforms, such as Hadoop, NoSQL and Streaming Engines, with best-of-breed data warehousing environments. Service offerings include a Big Data roadmap, Data Engineering, Data Lake and Analytic Operations, Training and ongoing Big Data Solution Support.
The Client's Data Science team delivers insights and value to clients from heterogeneous data sets with solutions that integrate into engineering and decision-making processes. Additionally, the team enables big data analytics for clients through advisory services including use case prioritisation, tool selection and training, and capability definition. Their success as a services firm relies on their experts' ability to be more than technologists and statisticians.
The Senior Data Scientist will be responsible for utilising advanced statistical and machine learning methods to answer business questions and deliver insightful solutions to complex problems. The ideal candidate has excellent interpersonal and communication skills and can interact with business and technology stakeholders where necessary.
Specific Responsibilities
The Senior Data Scientist will:
Customer workshops:
Agile cross-functional teamwork:
Documentation and coding standards:
The following is a list of relevant skills expected of the successful candidate:
Must
Data analysis and visualisation tools and workbenches
Analysing structured, semi-structured and unstructured data
Data query languages, e.g. SQL, HiveQL or similar
Should
Spark
Scikit-learn and Pandas
Desirable
Distributed systems
Hadoop ecosystem
Cloud-based machine-learning APIs