- New York, NY
- Job Category
- Technical and Skilled Trades
- Job Type
The Big Data Hadoop Architect / Lead position will be part of the Insight Core Hadoop platform team within Global Banking and Markets. The role is expected to lead deliverables around platform design & configuration, capacity planning, incident management, monitoring, and business continuity planning.
•Responsible for developing, enhancing, modifying and/or maintaining a multi-tenant big data platform
•Functionally lead a team of developers located on and off shore and collaborate with Product Owners, Quants, and other technology teams to deliver data/applications/tools
•Work closely with the Business Stakeholders, Management Team, Development Teams, Infrastructure Management and support partners
•Use your in-depth knowledge of development tools and languages towards design and development of applications to meet complex business requirements
•Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
•Design and implement scalable data platforms for our customer facing services
•Deploy and scale Hadoop infrastructure
•Hadoop / HDFS maintenance and operations
•Data cluster monitoring and troubleshooting
•Hadoop capacity planning
•OS integration and application installation
•Partner with program management, network engineering, site reliability operations, and other related groups
•Willingness to participate in a 24x7 on-call rotation for escalations
•Bachelor’s Degree in Information/Computer Science or related field OR equivalent professional experience
•Deep understanding of UNIX and network fundamentals
•Expertise with Hadoop and its ecosystem: Hive, Pig, Spark, HDFS, HBase, Oozie, Sqoop, Flume, Zookeeper, Kerberos, Sentry, Impala, etc.
•Experience designing multi-tenant, containerized Hadoop architectures for memory/CPU management/sharing across different LOBs
•5+ years managing clustered services, secure distributed systems, production data stores
•3+ years of experience administering and operating Hadoop clusters
•Cloudera CDH4/CDH5 cluster management and capacity planning experience
•Ability to rapidly learn new software languages, frameworks and APIs
•Experience scripting for automation and config management (Chef, Puppet)
•Multi-datacenter, multi-tenant deployment experience a plus
•Strong troubleshooting skills with exposure to large scale production systems
•Hands-on development experience and high proficiency in Java / Python
•Skilled in data analysis, profiling, data quality and processing to create visualizations
•Experience working with Agile Methodology
•Good SQL knowledge and query-authoring experience with relational databases, as well as working familiarity with a variety of databases (Hive, Impala, Kudu a plus).
•Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
•Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
•Strong analytic skills related to working with unstructured datasets.
•Build processes supporting data transformation, data structures, metadata, dependency and workload management.
•Experience supporting and working with cross-functional teams in a dynamic environment.