We are looking for an exceptionally talented Lead Data Engineer with experience implementing AWS services to build data pipelines, integrate APIs, and design data warehouses. A candidate with both hands-on and leadership capabilities will be ideal for this position.
Qualification: At least a bachelor's degree in Science, Engineering, or Applied Mathematics. A master's degree is preferred.
Job Responsibilities:
• Total 6+ years of experience as a Data Engineer and 2+ years of experience in managing a team
• Have minimum 3 years of AWS Cloud experience.
• Well versed in languages such as Python, PySpark, SQL, Node.js, etc.
• Has extensive experience in the real-time Spark ecosystem and has worked on both real-time and batch processing
• Has experience in AWS Glue, EMR, DMS, Lambda, S3, DynamoDB, Step Functions, Airflow, RDS, Aurora, etc.
• Experience with modern Database systems such as Redshift, Presto, Hive etc.
• Worked on building data lakes in the past on S3 or Apache Hudi
• Solid understanding of Data Warehousing Concepts
• Good to have experience on tools such as Kafka or Kinesis
• Good to have AWS Developer Associate or Solutions Architect Associate Certification
• Have experience in managing a team
About Top 3 Fintech Startup
Job Title – Data Scientist (Forecasting)
Anicca Data is seeking a Data Scientist (Forecasting) who is motivated to apply their skill set to solve complex and challenging problems. The role centers on applying deep learning models to real-world applications. The candidate should have experience in training and testing deep learning architectures and is expected to work on existing codebases or write optimized code at Anicca Data. The ideal addition to our team is a self-motivated, highly organized team player who thrives in a fast-paced environment, learns quickly, and works independently.
Job Location: Remote (for time being) and Bangalore, India (post-COVID crisis)
Required Skills:
- At least 3 years of experience in a Data Scientist role
- Bachelor's/Master's degree in Computer Science, Engineering, Statistics, Mathematics, or a similar quantitative discipline. A Ph.D. will add merit to the application
- Experience with large data sets, big data, and analytics
- Exposure to statistical modeling, forecasting, and machine learning. Deep theoretical and practical knowledge of deep learning, machine learning, statistics, probability, time series forecasting
- Training Machine Learning (ML) algorithms in areas of forecasting and prediction
- Experience in developing and deploying machine learning solutions in a cloud environment (AWS, Azure, Google Cloud) for production systems
- Research and enhance existing in-house, open-source models, integrate innovative techniques, or create new algorithms to solve complex business problems
- Experience in translating business needs into problem statements, prototypes, and minimum viable products
- Experience managing complex projects including scoping, requirements gathering, resource estimations, sprint planning, and management of internal and external communication and resources
- Write C++ and Python code along with TensorFlow, PyTorch to build and enhance the platform that is used for training ML models
Preferred Experience
- Worked on forecasting projects – both classical and ML models
- Experience with training time series forecasting methods such as Moving Average (MA) and Autoregressive Integrated Moving Average (ARIMA), as well as Neural Network (NN) models such as feed-forward NN and nonlinear autoregressive models
- Strong background in forecasting accuracy drivers
- Experience in Advanced Analytics techniques such as regression, classification, and clustering
- Ability to explain complex topics in simple terms, ability to explain use cases and tell stories
Job Title - Java + AWS Developer
Experience - 5+ Years
Location - Pune
Work Mode - Hybrid
Qualification - Any Computer/Engineering Degree
Job Description
- Experience with AWS services such as EKS, ECR, Aurora, S3, KVS, SQS
- Experience with the Java technology stack, including Java SE, Java EE, JDBC, Spring, Spring Boot, microservices, and Hibernate
- Experience with Eclipse and Git
- Experience with SQL and NoSQL databases and messaging systems
- Understanding of MQTT and AMQP, experience with RabbitMQ
- Understanding of CI/CD (continuous integration/continuous delivery) tools, frameworks and deployment processes
- Thorough understanding of OOP, SOLID, and RESTful services
- Thorough understanding of multi-threading best practices, especially with regard to Java
- Thorough understanding of database query optimization and Java code optimization
- Thorough understanding of dependency injection, cloud development and maintaining a large-scale cloud platform
Job Description: Data Engineer
We are looking for a curious Data Engineer to join our extremely fast-growing Tech Team at StanPlus.
About RED.Health (Formerly Stanplus Technologies)
Get to know the team:
Join our team and help us build the world’s fastest and most reliable emergency response system using cutting-edge technology.
Because every second counts in an emergency, we are building systems and flows with four nines (99.99%) of reliability to ensure that our technology is always there when people need it the most. We are looking for distributed systems experts who can help us perfect the architecture behind our key design principles: scalability, reliability, programmability, and resiliency. Our system features a powerful dispatch engine that connects emergency service providers with patients in real time.
Key Responsibilities
● Build Data ETL Pipelines
● Develop data set processes
● Strong analytic skills related to working with unstructured datasets
● Evaluate business needs and objectives
● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery
● Interpret trends and patterns
● Work with data and analytics experts to strive for greater functionality in our data system
● Build algorithms and prototypes
● Explore ways to enhance data quality and reliability
● Work with the Executive, Product, Data, and Design teams to assist with data-related technical issues and support their data infrastructure needs.
● Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
Key Requirements
● At least 3 years of proven experience as a data engineer, software developer, or in a similar role.
● Bachelor's / Master’s degree in data engineering, big data analytics, computer engineering, or related field.
● Experience with big data tools: Hadoop, Spark, Kafka, etc.
● Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
● Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
● Experience with Azure and AWS cloud services: EC2, EMR, RDS, Redshift
● Experience with BigQuery
● Experience with stream-processing systems: Storm, Spark-Streaming, etc.
● Experience with languages: Python, Java, C++, Scala, SQL, R, etc.
● Good hands-on experience with Hive and Presto.
- Big data developer with 8+ years of professional IT experience with expertise in Hadoop ecosystem components in ingestion, Data modeling, querying, processing, storage, analysis, Data Integration and Implementing enterprise level systems spanning Big Data.
- A skilled developer with strong problem solving, debugging and analytical capabilities, who actively engages in understanding customer requirements.
- Expertise in Apache Hadoop ecosystem components such as Spark, Hadoop Distributed File System (HDFS), MapReduce, Hive, Sqoop, HBase, ZooKeeper, YARN, Flume, Pig, NiFi, Scala and Oozie.
- Hands-on experience in creating real-time data streaming solutions using Apache Spark Core, Spark SQL and DataFrames, Kafka, Spark Streaming and Apache Storm.
- Excellent knowledge of Hadoop architecture and the daemons of Hadoop clusters, which include the NameNode, DataNode, ResourceManager, NodeManager and JobHistory Server.
- Worked on both Cloudera and Hortonworks Hadoop distributions. Experience in managing Hadoop clusters using the Cloudera Manager tool.
- Well versed in installation, Configuration, Managing of Big Data and underlying infrastructure of Hadoop Cluster.
- Hands on experience in coding MapReduce/Yarn Programs using Java, Scala and Python for analyzing Big Data.
- Exposure to Cloudera development environment and management using Cloudera Manager.
- Worked extensively on Spark using Scala on a cluster for computational analytics; installed it on top of Hadoop and built advanced analytical applications using Spark with Hive and SQL/Oracle.
- Implemented Spark applications in Python using the DataFrame and Spark SQL APIs for faster data processing; imported data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and loaded the results into HDFS.
- Used Spark Data Frames API over Cloudera platform to perform analytics on Hive data.
- Hands-on experience with Spark MLlib, used for predictive intelligence, customer segmentation and smooth maintenance in Spark Streaming.
- Experience in using Flume to load log files into HDFS and Oozie for workflow design and scheduling.
- Experience in optimizing MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Working on creating data pipelines for the ingestion, aggregation, and loading of consumer response data into Hive external tables at an HDFS location, serving as the feed for Tableau dashboards.
- Hands on experience in using Sqoop to import data into HDFS from RDBMS and vice-versa.
- In-depth Understanding of Oozie to schedule all Hive/Sqoop/HBase jobs.
- Hands on expertise in real time analytics with Apache Spark.
- Experience in converting Hive/SQL queries into RDD transformations using Apache Spark, Scala and Python.
- Extensive experience in working with different ETL tool environments like SSIS, Informatica and reporting tool environments like SQL Server Reporting Services (SSRS).
- Experience with the Microsoft cloud and with setting up clusters on Amazon EC2 and S3, including automating the setup and extension of clusters in the AWS cloud.
- Worked extensively on Spark using Python on a cluster for computational analytics; installed it on top of Hadoop and built advanced analytical applications using Spark with Hive and SQL.
- Strong experience and knowledge of real time data analytics using Spark Streaming, Kafka and Flume.
- Knowledge in installation, configuration, supporting and managing Hadoop Clusters using Apache, Cloudera (CDH3, CDH4) distributions and on Amazon web services (AWS).
- Experienced in writing Ad Hoc queries using Cloudera Impala, also used Impala analytical functions.
- Experience in creating DataFrames using PySpark and performing operations on them using Python.
- In-depth understanding of Hadoop architecture and its components, such as HDFS, the MapReduce programming paradigm, high availability and the YARN architecture.
- Established multiple connections to different Redshift clusters (Bank Prod, Card Prod, SBBDA Cluster) and provided access for pulling the information needed for analysis.
- Generated various kinds of knowledge reports using Power BI based on Business specification.
- Developed interactive Tableau dashboards to provide a clear understanding of industry specific KPIs using quick filters and parameters to handle them more efficiently.
- Well experienced in projects using JIRA, testing, and the Maven and Jenkins build tools.
- Experienced in designing, building, deploying and using almost all of the AWS stack (including EC2 and S3), focusing on high availability, fault tolerance, and auto-scaling.
- Good experience with use-case development, with Software methodologies like Agile and Waterfall.
- Working knowledge of Amazon's Elastic Compute Cloud (EC2) infrastructure for computational tasks and Simple Storage Service (S3) as a storage mechanism.
- Good working experience in importing data using Sqoop and SFTP from various sources such as RDBMS, Teradata, mainframes, Oracle and Netezza into HDFS, and performing transformations on it using Hive, Pig and Spark.
- Extensive experience in Text Analytics, developing different Statistical Machine Learning solutions to various business problems and generating data visualizations using Python and R.
- Proficient in NoSQL databases including HBase, Cassandra and MongoDB, and their integration with Hadoop clusters.
- Hands on experience in Hadoop Big data technology working on MapReduce, Pig, Hive as Analysis tool, Sqoop and Flume data import/export tools.
Title: Platform Engineer
Location: Chennai
Work Mode: Hybrid (Remote and Chennai Office)
Experience: 4+ years
Budget: 16 - 18 LPA
Responsibilities:
- Parse data using Python, create dashboards in Tableau.
- Utilize Jenkins for Airflow pipeline creation and CI/CD maintenance.
- Migrate Datastage jobs to Snowflake, optimize performance.
- Work with HDFS, Hive, Kafka, and basic Spark.
- Develop Python scripts for data parsing, quality checks, and visualization.
- Conduct unit testing and web application testing.
- Implement Apache Airflow and handle production migration.
- Apply data warehousing techniques for data cleansing and dimension modeling.
Requirements:
- 4+ years of experience as a Platform Engineer.
- Strong Python skills, knowledge of Tableau.
- Experience with Jenkins, Snowflake, HDFS, Hive, and Kafka.
- Proficient in Unix Shell Scripting and SQL.
- Familiarity with ETL tools like DataStage and DMExpress.
- Understanding of Apache Airflow.
- Strong problem-solving and communication skills.
Note: Only candidates willing to work in Chennai and available for immediate joining will be considered. Budget for this position is 16 - 18 LPA.
We are looking for a Big Data Engineer with Java experience for our Chennai location
Location : Chennai
Exp : 11 to 15 Years
Job description
Required Skill:
1. Candidate should have a minimum of 7 years of total experience
2. Candidate should have minimum 4 years of experience in Big Data design and development
3. Candidate should have experience in Java, Spark, Hive & Hadoop, Python
4. Candidate should have experience in any RDBMS.
Roles & Responsibility:
1. To create work plans, monitor and track the work schedule for on time delivery as per the defined quality standards.
2. To develop and guide the team members in enhancing their technical capabilities and increasing productivity.
3. To ensure process improvement and compliance in the assigned module, and participate in technical discussions or review.
4. To prepare and submit status reports for minimizing exposure and risks on the project or closure of escalation
Regards,
Priyanka S
7P8R9I9Y4A0N8K8A7S7
- Handling the survey scripting process using survey software platforms such as Toluna, QuestionPro and Decipher.
- Mining large & complex data sets using SQL, Hadoop, NoSQL or Spark.
- Delivering complex consumer data analysis using software such as R, Python and Excel.
- Working on basic statistical analysis such as t-tests and correlation.
- Performing more complex data analysis using machine learning techniques such as:
  - Classification
  - Regression
  - Clustering
  - Text Analysis
  - Neural Networks
- Creating interactive dashboards using Tableau or any other software you are able to use.
- Working on Statistical and mathematical modelling, application of ML and AI algorithms
What you need to have:
- Bachelor's or Master's degree in a highly quantitative field (CS, machine learning, mathematics, statistics, economics) or equivalent experience.
- An opportunity for someone who is eager to prove their data analytics skills with one of the biggest FMCG market players.
Responsibilities
Independent ability to execute upgrades, OS/DB migration projects and S/4 HANA conversions, and to resolve issues on your own
Lead the team technically on production support and resolve day-to-day technical issues
Gather requirements, then design, architect and implement optimal solutions for the SAP landscape based on the client's requirements
Should be willing to work in USA time zones as the project demands
SKILLS & EXPERIENCE
Complete architectural know-how of ABAP and JAVA stacks and of other BASIS activities
Very good knowledge and experience on Cloud architecture (AWS, Azure, GCP)
Ability to gather Client's business requirements and implement them with optimal technical solutions
Experience and deep knowledge of SAP systems, including S/4HANA, and the required architecture and infrastructure to support them
Provide recommendations and guidance on the SAP platform and tools
Should lead a BASIS team and act as a SPOC for customers for the production support and Project activities
Strong knowledge & hands-on experience in technical planning, implementation, upgrades, migration and maintenance of various SAP modules on premise and on cloud (preferable)
Must have strong SAP Basis skills on Operating Systems Linux/Unix, and Windows
Strong experience in most databases, such as Oracle, HANA, MSSQL, DB2, Sybase and MaxDB, including table administration, performance tuning, backup and restore, and point-in-time recovery on these databases
Knowledge of infrastructure, especially cloud infrastructure is a big plus
Experience in OS/DB/Infrastructure migration especially HANA migration using SUM/DMO from any source database like Oracle/SQL/DB2
Experience in Unicode conversion and S/4 HANA migrations
Hands on experience in SAP systems sizing including HANA landscape
Good experience in HANA installation, administration and maintenance in scale-up and scale-out landscapes
Expertise in configuration of SAP systems including but not limited to ECC, CRM, SRM, SCM, Portal, BW, PI, GRC, GTS, MDG, MDM, HCM, EWM, Solution Manager
Experience in most of these - TREX, Live cache, BEx reporting, NWDI, FIORI, Web dispatcher, BOBJ, BODS, IS, Content Server, SLT, Pre-calc, etc.
Experience in System refresh, system copy, Client copy, Client export/import
Experience in EHP upgrades, SP updates, Patches, kernel upgrades and so on
Experience with Solution manager 7.1/7.2 EWA configuration, Setup systems monitoring, Root cause analysis, Automated monitoring and alerting setup, E2E implementation of Solman
Experience in High availability setup in Unix and Windows with various clustering tools, and disaster recovery setup
Experience in security activities like central user administration, managing roles and profiles of various business systems, SSO configuration, integrating with active directory, etc.
Big plus to have automation and scripting skills on any of the BASIS activities like System Refresh, transports, monitoring, etc.
Mentor junior staff and train them in various support and project activities
Should possess strong problem solving and analytical skills and demonstrate strong technical leadership to solve customer problems
Excellent communication (verbal and written) & interpersonal skills
QUALIFICATION
15 years’ experience as SAP Basis Engineer
SAP Netweaver, HANA, OSDB migration, Cloud (AWS/Azure) certifications preferable
2+ full lifecycle SAP implementations, 2+ upgrades, 2+ migrations/conversions, 2+ support projects
Good infrastructure knowledge on Network, Storage, etc.
• Strong knowledge of SQL and ETL Testing
• Extensive experience in ETL/data warehouse back-end testing and Business Intelligence (BI) report testing
• Hands-on back-end testing skills and strong knowledge of RDBMS and testing methodologies.
• Expertise in test management and defect tracking tools, e.g. HP Quality Center, Jira
• Proficient in working with SDLC and Agile methodologies
• Excellent knowledge of database systems: Vertica/Oracle/Teradata
• Knowledge of security testing will be an added advantage.
• Experience in Business Intelligence testing of various reports using Tableau
• Strong comprehension, analytical, and problem-solving skills
• Good interpersonal and communication skills, quick learner, and good troubleshooting capabilities.
• Good knowledge of Python Programming language.
• Working knowledge of AWS
To be considered for the Senior Data Engineer position, a candidate must have a proven track record of architecting data solutions on current and advanced technical platforms. They must have the leadership ability to guide a team that delivers data-centric solutions with best practices and modern technologies in mind. They build collaborative relationships across all levels of the business and the IT organization. They possess analytical and problem-solving skills and can research and provide appropriate guidance for synthesizing complex information and extracting business value. They have the intellectual curiosity and ability to deliver solutions with creativity and quality, and they work effectively with business stakeholders and customers to obtain business value from the requested work. They can communicate technical results to both technical and non-technical users using effective storytelling techniques and visualizations, and they have a demonstrated ability to perform high-quality, innovative work both independently and collaboratively.