Big Data Solutions Architect
This opportunity has been closed.
The position is no longer available. We will continue working to offer you better-suited opportunities.
Description
We are looking for a Big Data Engineer who will work on collecting, storing, processing, and analyzing huge sets of data. The primary focus will be on choosing the optimal solutions for these purposes, then implementing, maintaining, and monitoring them. You will also be responsible for integrating them with the architectures used across our clients.
In this role, you will lead a team of data engineers and data scientists to:
- Assess the existing big data ecosystem: architecture, data sourcing, and the systems integration required to monetise the data
- Understand the specific requirements for evolving the existing big data ecosystem, and perform gap analysis and roadmap definition
- Implement specific big data use cases by sourcing internal and external data, building data layers, developing analytics algorithms, models, and visualisations, and overseeing the implementation of third-party tools to monetise data
Specifically:
- Select and integrate any Big Data tools and frameworks required to provide the requested capabilities
- Gather and process raw data at scale (including writing scripts, web scraping, calling APIs, writing SQL queries, etc.)
- Process unstructured data into a form suitable for analysis, and then execute the analysis
- Support business decisions with ad hoc analysis as needed
- Work closely with the client engineering team to integrate your innovations and algorithms into production systems
Professional background
- Experience processing large amounts of structured and unstructured data. MapReduce experience is a plus
- Proficient understanding of distributed computing principles
- Management of a Hadoop cluster, with all included services
- Proficiency with Hadoop v2, MapReduce, HDFS
- Experience with building stream-processing systems, using solutions such as Storm or Spark Streaming
- Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
- Experience with Spark
- Experience with integration of data from multiple data sources
- Experience with NoSQL databases, such as HBase, Cassandra, MongoDB
- Knowledge of various ETL techniques and frameworks, such as Flume
- Experience with various messaging systems, such as Kafka or RabbitMQ
- Experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O
- Good understanding of Lambda Architecture, along with its advantages and drawbacks
- Experience with any of Cloudera, MapR, or Hortonworks