Role: Teradata Lead
Band: C2
Experience level: Minimum 10 years
Job Description:
This role will lead DBA teams of DBAs at multiple experience levels, covering a mix of Teradata, Oracle, and SQL databases.
Skill Set:
Minimum 10 years of relevant database and data warehouse experience.
Hands-on experience administering Teradata.
Lead performance analysis and capacity planning, and support batch operations and users with their jobs.
Drive implementation of standards and best practices to optimize database utilization and availability.
Hands-on with AWS Cloud infrastructure services such as EC2, S3, and networking services.
Proficient in Linux system administration relevant to Teradata management.
Teradata Specific (Mandatory)
Manage and operate 24x7 production and development databases to ensure maximum availability of system resources.
Responsible for day-to-day DBA operational activities such as system monitoring, user management, space management, troubleshooting, and batch/user support.
Perform DBA tasks in the key areas of performance management and reporting, and workload management using TASM.
Manage production/development databases in areas such as capacity planning, performance monitoring and tuning, backup/recovery strategies, space/user/security management, and problem determination and resolution.
Experience with Teradata workload management, monitoring, and query optimization.
Expertise in system monitoring using Viewpoint and logs.
Proficient in analysing and optimizing performance at different levels.
Ability to produce advanced system-level capacity reports and perform root-cause analysis (a minimal monitoring sketch follows this list).
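To make the monitoring and space-management duties above concrete, here is a minimal sketch (not part of the role description) that pulls a per-database space report from the DBC data dictionary using the teradatasql Python driver. The host name, credentials, and the 90% alert threshold are illustrative assumptions.

```python
# Minimal sketch: per-database space report from the Teradata data dictionary.
# Host, credentials, and the 90% alert threshold are placeholder assumptions.
import teradatasql

QUERY = """
SELECT DatabaseName,
       SUM(MaxPerm)     AS MaxPermBytes,
       SUM(CurrentPerm) AS CurrentPermBytes
FROM DBC.DiskSpaceV
GROUP BY DatabaseName
ORDER BY 3 DESC
"""

def space_report(host: str, user: str, password: str, threshold: float = 0.90) -> None:
    """Print databases whose used perm space exceeds the given threshold."""
    with teradatasql.connect(host=host, user=user, password=password) as con:
        cur = con.cursor()
        cur.execute(QUERY)
        for name, max_perm, current_perm in cur.fetchall():
            if max_perm and current_perm / max_perm >= threshold:
                print(f"{name.strip()}: {current_perm / max_perm:.0%} of perm space used")

if __name__ == "__main__":
    space_report("tdprod.example.com", "dbc_monitor", "********")
```

The same report is typically wired into a scheduler or Viewpoint alerting rather than run ad hoc; the query itself is standard DBC dictionary usage.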
Oracle Specific (Optional)
Database administration: installation of Oracle software on Unix/Linux platforms.
Database lifecycle management: database creation, setup, and decommissioning.
Database event/alert monitoring, space management, and user management.
Database upgrades, migrations, and cloning.
Database backup, restore, and recovery using RMAN.
Set up and maintain high-availability and disaster-recovery solutions.
Proficient in standby databases and Data Guard technology.
Hands-on with Oracle Enterprise Manager Cloud Control (OEM CC).
Mandatory Certification:
- Teradata Vantage Certified Administrator
- ITIL Foundation
Similar jobs
Hiring Developers with multiple skill combinations and 3 to 6 years of experience.
Hands-on experience with SQL, Java, and Python.
Knowledge of tools such as dbt, ADF, Snowflake, and Databricks would be an added advantage for our current project.
ML and AWS experience would be a plus.
We need people who can work from our Chennai branch.
Do share your profile at gayathrirajagopalan@jmangroup.com.
Candidates should hold a degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field. They should also have experience using the following software/tools:
● Experience with big data tools: Hadoop, Hive, Spark, Kafka, etc.
● Experience querying multiple SQL/NoSQL databases, including Oracle, MySQL, MongoDB, etc.
● Experience with Redis, RabbitMQ, and Elasticsearch is desirable.
● Strong experience with object-oriented/functional/scripting languages: Python (preferred), Core Java, JavaScript, Scala, shell scripting, etc.
● Must be able to debug complex code; experience with ML/AI algorithms is a plus.
● Experience with a version control tool such as Git is mandatory.
● Experience with AWS cloud services: EC2, EMR, RDS, Redshift, S3
● Experience with stream-processing systems: Storm, Spark Streaming, etc. (a minimal sketch follows this list)
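As a concrete illustration of the stream-processing bullet above, the following is a minimal Spark Structured Streaming sketch; the broker list, topic name, and S3 paths are placeholder assumptions, not project specifics.

```python
# Minimal sketch: consume a Kafka topic with Spark Structured Streaming and
# land the raw events on S3 as Parquet. Brokers, topic, and paths are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-ingest-sketch").getOrCreate()

events = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
         .option("subscribe", "clickstream")
         .option("startingOffsets", "latest")
         .load()
         .select(col("key").cast("string"), col("value").cast("string"), "timestamp")
)

query = (
    events.writeStream
          .format("parquet")
          .option("path", "s3a://example-bucket/raw/clickstream/")
          .option("checkpointLocation", "s3a://example-bucket/checkpoints/clickstream/")
          .outputMode("append")
          .start()
)
query.awaitTermination()
```

The same pattern applies to Storm or Kafka Streams; Structured Streaming is shown here only because Spark already appears in the stack above.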
We are looking for an exceptionally talented Lead Data Engineer with exposure to implementing AWS services to build data pipelines, API integrations, and data warehouse designs. A candidate with both hands-on and leadership capabilities will be ideal for this position.
Qualification: At least a bachelor’s degree in Science, Engineering, or Applied Mathematics; a master’s degree is preferred.
Job Responsibilities:
• Total 6+ years of experience as a Data Engineer and 2+ years of experience in managing a team
• Minimum 3 years of AWS Cloud experience.
• Well versed in languages such as Python, PySpark, SQL, Node.js, etc.
• Extensive experience in the Spark ecosystem, covering both real-time and batch processing
• Experience with AWS Glue, EMR, DMS, Lambda, S3, DynamoDB, Step Functions, Airflow, RDS, Aurora, etc. (a minimal Glue job sketch follows this list)
• Experience with modern database systems such as Redshift, Presto, Hive, etc.
• Has built data lakes on S3 and/or Apache Hudi
• Solid understanding of data warehousing concepts
• Good to have: experience with tools such as Kafka or Kinesis
• Good to have: AWS Developer Associate or Solutions Architect Associate certification
• Experience managing a team
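For the AWS Glue item above, here is a minimal sketch of a Glue ETL job script; the catalog database, table name, partition column, and output path are illustrative assumptions.

```python
# Minimal sketch of an AWS Glue ETL job: read a catalogued table, drop rows with
# null business keys, and write partitioned Parquet to S3. The catalog database,
# table name, partition column, and output path are placeholder assumptions.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])

glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Source table registered in the Glue Data Catalog (assumed to exist).
orders = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="orders"
)

# Basic cleanup using the underlying Spark DataFrame API.
cleaned = orders.toDF().dropna(subset=["order_id"])

cleaned.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/curated/orders/"
)

job.commit()
```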
Data Engineer_Scala
Job Description:
We are looking for a Big Data Engineer who has worked across the entire ETL stack: someone who has ingested data in batch and live-stream formats, transformed large daily volumes of data, built data warehouses to store the transformed data, and integrated different visualization dashboards and applications with those data stores. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.
Responsibilities:
- Develop, test, and implement data solutions based on functional / non-functional business requirements.
- You will be required to code in Scala and PySpark daily, on cloud as well as on-prem infrastructure
- Build data models to store the data in the most optimized manner
- Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Implement the ETL process and an optimal data pipeline architecture
- Monitor performance and advise on any necessary infrastructure changes.
- Create data tools that help analytics and data science team members build and optimize our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Proactively identify potential production issues and recommend and implement solutions
- Must be able to write quality code and build secure, highly available systems.
- Create design documents that describe the functionality, capacity, architecture, and process.
- Review peers' code and pipelines before deployment to production, checking for optimization issues and adherence to code standards
Skill Sets:
- Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
- Proficient understanding of distributed computing principles
- Experience working with batch-processing and real-time systems using open-source technologies such as NoSQL stores, Spark, Pig, Hive, and Apache Airflow.
- Has implemented complex projects dealing with considerable data volumes (petabyte scale).
- Optimization techniques (performance, scalability, monitoring, etc.)
- Experience with integration of data from multiple data sources
- Experience with NoSQL databases such as HBase, Cassandra, MongoDB, etc.
- Knowledge of various ETL techniques and frameworks, such as Flume
- Experience with various messaging systems, such as Kafka or RabbitMQ
- Creation of DAGs for data engineering (a minimal Airflow sketch follows this list)
- Expert at Python/Scala programming, especially for data engineering/ETL purposes
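As referenced in the DAG item above, here is a minimal Airflow 2.x sketch of a daily extract-transform-load DAG; the task bodies, dag_id, and schedule are placeholders rather than an actual pipeline.

```python
# Minimal sketch of a daily ETL DAG (Airflow 2.x style). Task bodies are stubs;
# dag_id, schedule, and start_date are placeholder assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull raw data from the source system")


def transform():
    print("apply business rules / build the model tables")


def load():
    print("load the transformed data into the warehouse")


with DAG(
    dag_id="daily_etl_sketch",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load
```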
● Able to contribute to gathering functional requirements, developing technical specifications, and test-case planning
● Demonstrating technical expertise and solving challenging programming and design problems
● 60% hands-on coding with architecture ownership of one or more products
● Ability to articulate architectural and design options, and educate development teams and business users
● Resolve defects/bugs during QA testing, pre-production, production, and post-release patches
● Mentor and guide team members
● Work cross-functionally with various Bidgely teams including product management, QA/QE, various product lines, and/or business units to drive forward results
Requirements
● BS/MS in computer science or equivalent work experience
● 8-12 years’ experience designing and developing applications in Data Engineering
● Hands-on experience with big data ecosystems.
● Past experience with Hadoop, HDFS, MapReduce, YARN, AWS Cloud, EMR, S3, Spark, Cassandra, Kafka, ZooKeeper
● Expertise with any of the following object-oriented languages: Java/J2EE, Scala, Python
● Ability to lead and mentor technical team members
● Expertise with the entire Software Development Life Cycle (SDLC)
● Excellent communication skills: demonstrated ability to explain complex technical issues to both technical and non-technical audiences
● Expertise in the software design/architecture process
● Expertise with unit testing & Test-Driven Development (TDD)
● Business acumen - strategic thinking & strategy development
● Experience with Cloud or AWS is preferable
● A good understanding of, and the ability to develop, software, prototypes, or proofs of concept (POCs) for various Data Engineering requirements.
● Experience with Agile Development, SCRUM, or Extreme Programming methodologies
● Able to contribute to gathering functional requirements, developing technical specifications, and project & test planning
● Demonstrating technical expertise and solving challenging programming and design problems
● Roughly 80% hands-on coding
● Generate technical documentation and PowerPoint presentations to communicate architectural and design options, and educate development teams and business users
● Resolve defects/bugs during QA testing, pre-production, production, and post-release patches
● Work cross-functionally with various Bidgely teams including product management, QA/QE, various product lines, and/or business units to drive forward results
Requirements
● BS/MS in computer science or equivalent work experience
● 2-4 years’ experience designing and developing applications in Data Engineering
● Hands-on experience with big data ecosystems.
● Hadoop, HDFS, MapReduce, YARN, AWS Cloud, EMR, S3, Spark, Cassandra, Kafka, ZooKeeper
● Expertise with any of the following object-oriented languages: Java/J2EE, Scala, Python
● Strong leadership experience: leading meetings, presenting if required
● Excellent communication skills: demonstrated ability to explain complex technical issues to both technical and non-technical audiences
● Expertise in the software design/architecture process
● Expertise with unit testing & Test-Driven Development (TDD)
● Experience with Cloud or AWS is preferable
● A good understanding of, and the ability to develop, software, prototypes, or proofs of concept (POCs) for various Data Engineering requirements.
Minimum of 4 years’ experience working on DW/ETL projects and expert hands-on working knowledge of ETL tools.
Experience with Data Management & data warehouse development
Star schemas, Data Vaults, RDBMS, and ODS
Change data capture (CDC)
Slowly changing dimensions (a minimal SCD Type 2 sketch follows this list)
Data governance
Data quality
Partitioning and tuning
Data Stewardship
Survivorship
Fuzzy Matching
Concurrency
Vertical and horizontal scaling
ELT, ETL
Spark, Hadoop, MPP, RDBMS
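For the change data capture and slowly changing dimension items above, here is a minimal PySpark sketch of an SCD Type 2 refresh. The table paths, the customer_id business key, and the effective_from/effective_to/is_current tracking columns are illustrative assumptions, and the staging extract is assumed to contain only new or changed rows (for example, CDC output).

```python
# Minimal sketch of a Slowly Changing Dimension Type 2 refresh in PySpark (3.1+).
# Paths, the customer_id business key, and the tracking columns are assumptions;
# stg is assumed to hold only new or changed customer rows (e.g. a CDC extract).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()

dim = spark.read.parquet("s3a://warehouse/dim_customer/")      # existing dimension
stg = spark.read.parquet("s3a://staging/customers_changed/")   # new + changed rows only
load_date = F.current_date()

current = dim.filter(F.col("is_current") == 1)

# Expire the current version of every business key that arrives in the staging extract.
expired = (current.join(stg.select("customer_id"), "customer_id", "left_semi")
                  .withColumn("effective_to", load_date)
                  .withColumn("is_current", F.lit(0)))

# Every staged row becomes the new current version of its key.
new_rows = (stg.withColumn("effective_from", load_date)
               .withColumn("effective_to", F.lit(None).cast("date"))
               .withColumn("is_current", F.lit(1)))

# Keep history rows and current rows that this load did not touch.
untouched = dim.join(expired.select("customer_id", "effective_from"),
                     ["customer_id", "effective_from"], "left_anti")

result = untouched.unionByName(expired).unionByName(new_rows, allowMissingColumns=True)
result.write.mode("overwrite").parquet("s3a://warehouse/dim_customer_scd2/")
```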
Experience with DevOps architecture, implementation, and operation
Hands-on working knowledge of Unix/Linux
Building complex SQL queries; expert SQL and data analysis skills, with the ability to debug and fix data issues.
Complex ETL program design and coding
Experience in shell scripting and batch scripting.
Good communication (oral and written) and interpersonal skills
Work closely with business teams to understand their needs, participate in requirements gathering, create artifacts, and seek business approval.
Help the business define new requirements; participate in end-user meetings to derive and define business requirements; propose cost-effective solutions for data analytics; and familiarize the team with customer needs, specifications, design targets, and techniques to support task performance and delivery.
Propose sound designs and solutions, and ensure adherence to design and standards best practices.
Review and propose industry-best tools and technologies for ever-changing business rules and data sets. Conduct proofs of concept (POCs) with new tools and technologies to derive convincing benchmarks.
Prepare the plan, design and document the architecture, High-Level Topology Design, Functional Design, and review the same with customer IT managers and provide detailed knowledge to the development team to familiarize them with customer requirements, specifications, design standards and techniques.
Review code developed by other programmers, mentor, guide and monitor their work ensuring adherence to programming and documentation policies.
Work with functional business analysts to ensure that application programs are functioning as defined.
Capture user feedback/comments on the delivered systems and document them for client and project manager review. Review all deliverables for quality adherence before final delivery to the client.
Technologies (Select based on requirement)
Databases - Oracle, Teradata, Postgres, SQL Server, Big Data, Snowflake, or Redshift
Tools – Talend, Informatica, SSIS, Matillion, Glue, or Azure Data Factory
Utilities for bulk loading and extracting
Languages – SQL, PL-SQL, T-SQL, Python, Java, or Scala
JDBC/ODBC, JSON
Data virtualization and data services development
Service Delivery - REST, Web Services
Data Virtualization Delivery – Denodo
ELT, ETL
Cloud certification Azure
Complex SQL Queries
Data ingestion, data modeling (domain), consumption (RDBMS)
ABOUT EPISOURCE:
Episource has devoted more than a decade to building risk-adjustment solutions that measure healthcare outcomes. As one of the leading companies in healthcare, we have helped numerous clients optimize their medical records, data, and analytics to enable better documentation of care for patients with chronic diseases.
The backbone of our consistent success has been our obsession with data and technology. At Episource, all of our strategic initiatives start with the question: how can data be “deployed”? Our analytics platforms and data lakes ingest huge quantities of data daily to help our clients deliver services. We have also built our own machine learning and NLP platform to add productivity and efficiency to our workflow. Combined, these form a foundation of tools and practices used by quantitative staff across the company.
What’s our poison you ask? We work with most of the popular frameworks and technologies like Spark, Airflow, Ansible, Terraform, Docker, ELK. For machine learning and NLP, we are big fans of keras, spacy, scikit-learn, pandas and numpy. AWS and serverless platforms help us stitch these together to stay ahead of the curve.
ABOUT THE ROLE:
We’re looking to hire someone to help scale Machine Learning and NLP efforts at Episource. You’ll work with the team that develops the models powering Episource’s product focused on NLP driven medical coding. Some of the problems include improving our ICD code recommendations, clinical named entity recognition, improving patient health, clinical suspecting and information extraction from clinical notes.
This is a role for highly technical data engineers who combine outstanding oral and written communication skills with the ability to code up prototypes and productionize them using a wide range of tools, algorithms, and languages. Most importantly, they need the ability to autonomously plan and organize their work based on high-level team goals.
You will be responsible for setting an agenda to develop and ship data-driven architectures that positively impact the business, working with partners across the company including operations and engineering. You will use research results to shape strategy for the company and help build a foundation of tools and practices used by quantitative staff across the company.
During the course of a typical day with our team, expect to work on one or more projects around the following:
1. Create and maintain optimal data pipeline architectures for ML
2. Develop a strong API ecosystem for ML pipelines
3. Build CI/CD pipelines for ML deployments using GitHub Actions, Travis, Terraform, and Ansible
4. Design and develop distributed, high-volume, high-velocity, multi-threaded event-processing systems
5. Apply software engineering best practices across the development lifecycle: coding standards, code reviews, source management, build processes, testing, and operations
6. Deploy data pipelines in production using infrastructure-as-code platforms
7. Design scalable implementations of the models developed by our Data Science teams
8. Big data and distributed ML with PySpark on AWS EMR, and more!
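For item 8 above, here is a minimal sketch of distributed batch scoring with PySpark on AWS EMR: a pre-trained scikit-learn model is broadcast to the executors and applied through a pandas UDF. The model path, feature column names, and bucket names are illustrative assumptions, not Episource specifics.

```python
# Minimal sketch of distributed batch inference with PySpark on AWS EMR.
# Model path, feature columns, and bucket names are placeholder assumptions;
# the model is assumed to be a binary scikit-learn classifier.
import joblib
import pandas as pd
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.appName("ml-batch-scoring-sketch").getOrCreate()

# Feature table produced by an upstream pipeline (assumed layout).
features = spark.read.parquet("s3://example-bucket/features/encounters/")

# Load the pre-trained model once on the driver and broadcast it so each
# executor can score its own partitions locally.
model_bc = spark.sparkContext.broadcast(joblib.load("/mnt/models/risk_model.joblib"))

@F.pandas_udf(DoubleType())
def score(age: pd.Series, num_codes: pd.Series) -> pd.Series:
    X = pd.DataFrame({"age": age, "num_codes": num_codes})
    return pd.Series(model_bc.value.predict_proba(X)[:, 1])

scored = features.withColumn("risk_score", score("age", "num_codes"))
scored.write.mode("overwrite").parquet("s3://example-bucket/scores/encounters/")
```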
BASIC REQUIREMENTS
- Bachelor’s degree or greater in Computer Science, IT, or related fields
- Minimum of 5 years of experience in cloud, DevOps, MLOps & data projects
- Strong experience with bash scripting, Unix environments, and building scalable/distributed systems
- Experience with automation/configuration management using Ansible, Terraform, or equivalent
- Very strong experience with AWS and Python
- Experience building CI/CD systems
- Experience with containerization technologies like Docker, Kubernetes, ECS, EKS, or equivalent
- Ability to build and manage application and performance monitoring processes
- We are looking for a Data Engineer with 3-5 years of experience in Python, SQL, AWS (EC2, S3, Elastic Beanstalk, API Gateway), and Java.
- The applicant must be able to perform data mapping (data type conversion, schema harmonization) using Python, SQL, and Java; a minimal sketch follows this list.
- The applicant must be familiar with, and have programmed, ETL interfaces (OAuth, REST API, ODBC) using the same languages.
- The company is looking for someone who shows an eagerness to learn and who asks concise questions when communicating with teammates.
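As referenced in the data-mapping requirement above, here is a minimal Python sketch of type conversion and schema harmonization across two hypothetical sources; the field names and converters are illustrative assumptions, not a company specification.

```python
# Minimal sketch of the data-mapping task described above: rename source fields
# to a canonical schema and coerce types. Field names and sources are assumptions.
from datetime import datetime
from typing import Any, Dict

# Target schema: canonical field name -> converter applied to the raw value.
TARGET_SCHEMA = {
    "customer_id": int,
    "signup_date": lambda v: datetime.strptime(v, "%Y-%m-%d").date(),
    "lifetime_value": float,
}

# Per-source mapping of raw field names onto the canonical names.
FIELD_MAPS = {
    "crm": {"CustID": "customer_id", "SignupDt": "signup_date", "LTV": "lifetime_value"},
    "billing": {"customer": "customer_id", "created": "signup_date", "total_spend": "lifetime_value"},
}

def harmonize(record: Dict[str, Any], source: str) -> Dict[str, Any]:
    """Rename a raw record's fields to the canonical schema and coerce types.
    Assumes every canonical field is present in the source record."""
    renamed = {FIELD_MAPS[source][k]: v for k, v in record.items() if k in FIELD_MAPS[source]}
    return {field: convert(renamed[field]) for field, convert in TARGET_SCHEMA.items()}

if __name__ == "__main__":
    print(harmonize({"CustID": "1042", "SignupDt": "2023-05-01", "LTV": "99.50"}, "crm"))
    print(harmonize({"customer": 77, "created": "2024-01-15", "total_spend": 10}, "billing"))
```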
Quant/Data Scientist
We are a nascent quantitative hedge fund led by an MIT PhD and Math Olympiad medallist, offering opportunities to grow with us as we build out the team. Our fund has world class investors and big data experts as part of the GP, top-notch ML experts as advisers to the fund, plus has equity funding to grow the team, license data and scale the data processing.
We are interested in researching and taking live a variety of quantitative strategies based on historical and live market data, alternative datasets, social media data (both audio and video), and stock fundamental data.
You would join, and if qualified lead, a growing team of data scientists and researchers, and be responsible for the complete lifecycle of quantitative strategy implementation and trading.
Requirements:
- At least 3 years of relevant ML experience
- Graduation date: 2018 or earlier
- 3-5 years of experience in high-level Python programming.
- Master's degree (or PhD) in a quantitative discipline such as Statistics, Mathematics, Physics, or Computer Science from a top university.
- Good knowledge of applied and theoretical statistics, linear algebra, and machine learning techniques.
- Ability to leverage financial and statistical insights to research, explore, and harness a large collection of quantitative strategies and financial datasets in order to build strong predictive models.
- Should take ownership of the research, design, development, and implementation of strategies, and communicate effectively with other teammates.
- Prior experience and good knowledge of the lifecycle and pitfalls of algorithmic strategy development and modelling.
- Good practical knowledge of financial statements, value investing, and portfolio and risk management techniques.
- A proven ability to lead and drive innovation to solve challenges and roadblocks in project completion.
- A valid GitHub profile with some activity.
Bonus to have:
- Experience in storing and retrieving data from large and complex time series databases
- Very good practical knowledge of time-series modelling and forecasting (ARIMA, ARCH, and stochastic modelling); a minimal forecasting sketch appears at the end of this list.
- Prior experience in optimizing and backtesting quantitative strategies, doing return and risk attribution, and feature/factor evaluation.
- Knowledge of the AWS/Cloud ecosystem is an added plus (EC2, Lambda, EKS, SageMaker, etc.)
- Knowledge of REST APIs and data extraction and cleaning techniques
- Good to have: experience in PySpark or other big data/parallel computing frameworks
- Familiarity with derivatives and knowledge of multiple asset classes in addition to equities.
- Any progress towards CFA or FRM is a bonus
- Average tenure of at least 1.5 years per company
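As a small illustration of the time-series modelling item in the bonus list above, here is a minimal ARIMA forecasting sketch using statsmodels on synthetic daily returns; the data and the (1, 0, 1) order are assumptions for demonstration only.

```python
# Minimal ARIMA forecasting sketch: fit a small model on synthetic daily returns
# and forecast the next five business days. Data and model order are assumptions.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(42)
dates = pd.date_range("2023-01-02", periods=250, freq="B")   # ~1 trading year
returns = pd.Series(0.0005 + 0.01 * rng.standard_normal(250), index=dates)

model = ARIMA(returns, order=(1, 0, 1))   # AR(1), no differencing, MA(1)
fitted = model.fit()

print(fitted.summary().tables[0])
print(fitted.forecast(steps=5))           # next five business-day return forecasts
```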