Main Functions & Responsibilities (duties to be performed by this position)
General (Architect)
Finding the best technical solution among the available options to solve business problems or meet business requirements
Selecting the technology stack for the solution/project
Specify the structure, characteristics, behavior and related aspects of the technology solution to project stakeholders or for project proposals
Lead solution design activities and support end-to-end project/product life cycle
Manage architectural design and lead key decision making to ensure smooth integration of bespoke systems and successful delivery of all projects
Undertake solution sizing, capacity planning, effort estimations, performance benchmarking, high availability and business continuity assurance
Engage with clients to understand business requirements and propose solutions to fulfill their business objectives
General (Engineer and Architect)
Providing specifications according to which the solution is defined, managed, developed and delivered
Defining features, phases, integrations and solution requirements
Conduct systems design, feasibility and cost studies to recommend cost-effective solutions
Develop solution design and project proposal presentations, and present them to clients in order to satisfy client requirements
Act as a bridge between the development team and clients on technical requirements, and keep the solution aligned throughout the project lifecycle in coordination with project managers
Specific (Engineer and Architect)
Selecting and integrating the big data tools and frameworks needed to provide the required capabilities in technical solutions
Design and develop solutions in a Hadoop environment for large-scale (petabyte-scale) data analysis
Design the infrastructure required for optimal extraction, transformation, and loading of data from a variety of data sources
Collecting, parsing, managing, analyzing and visualizing large sets of data
Implementing ETL processes, including preprocessing, using relevant technologies
Develop solutions that meet functional and non-functional business requirements using large, complex data sets
Develop code for the map, shuffle, reduce and related steps to process large data sets in the Hadoop ecosystem
Optimize code and Hadoop clusters to improve processing, querying and analysis performance
Monitoring performance and advising on any necessary infrastructure changes
Collaborate with other development and research teams
Installation, configuration and management of Hadoop clusters
Develop prototypes and proofs of concept for the designed solutions
Undertake comparative analysis of designed solutions against alternative designs
Any other task assigned by the Supervisor
Job Specification
Minimum 3 years of experience in software engineering/development
Minimum 2 years of development experience in big data analytics
Minimum 2 years of design and architecture experience in big data systems (for Architect role only)
Hands-on experience with Hadoop and technologies in the Hadoop ecosystem
Experience in pre-processing data and designing efficient ETL workflows in the Hadoop ecosystem
Prior experience working with large-scale data infrastructures
Experience with non-relational and relational databases
Ability to independently undertake design and development of Hadoop clusters
Experience with the Hadoop ecosystem, including MapReduce, HDFS, Hive, Spark, Pig, Impala, etc.
Experience with Cloudera/MapR/Hortonworks will be considered a plus
Experience in designing Hadoop clusters for specific big data analytics of real-world data will be considered a big plus
Proficient understanding of distributed computing principles
Ability to design solutions independently based on high-level architecture
Experience in installation, configuration and management of Hadoop clusters
Knowledge of ETL techniques and frameworks, e.g. Flume
Experience in collecting, parsing, managing, analyzing and visualizing large sets of data
Ability to develop prototypes and proofs of concept for the designed solutions
Extensive knowledge of programming languages (e.g. Java, C++, PHP, Ruby, Python and/or R) and computing environments (e.g. Linux, cloud environments)