- 6+ years of recent hands-on Java development
- Developing data pipelines in AWS or Google Cloud
- Java, Python, JavaScript programming languages
- Great understanding of designing for performance, scalability, and reliability of data intensive application
- Hadoop MapReduce, Spark, Pig. Understanding of database fundamentals and advanced SQL knowledge.
- In-depth understanding of object oriented programming concepts and design patterns
- Ability to communicate clearly to technical and non-technical audiences, verbally and in writing
- Understanding of full software development life cycle, agile development and continuous integration
- Experience in Agile methodologies including Scrum and Kanban
About Intergral Add Science
Similar jobs
We are a rapidly expanding global technology partner, that is looking for a highly skilled Senior (Python) Data Engineer to join their exceptional Technology and Development team. The role is in Kolkata. If you are passionate about demonstrating your expertise and thrive on collaborating with a group of talented engineers, then this role was made for you!
At the heart of technology innovation, our client specializes in delivering cutting-edge solutions to clients across a wide array of sectors. With a strategic focus on finance, banking, and corporate verticals, they have earned a stellar reputation for their commitment to excellence in every project they undertake.
We are searching for a senior engineer to strengthen their global projects team. They seek an experienced Senior Data Engineer with a strong background in building Extract, Transform, Load (ETL) processes and a deep understanding of AWS serverless cloud environments.
As a vital member of the data engineering team, you will play a critical role in designing, developing, and maintaining data pipelines that facilitate data ingestion, transformation, and storage for our organization.
Your expertise will contribute to the foundation of our data infrastructure, enabling data-driven decision-making and analytics.
Key Responsibilities:
- ETL Pipeline Development: Design, develop, and maintain ETL processes using Python, AWS Glue, or other serverless technologies to ingest data from various sources (databases, APIs, files), transform it into a usable format, and load it into data warehouses or data lakes.
- AWS Serverless Expertise: Leverage AWS services such as AWS Lambda, AWS Step Functions, AWS Glue, AWS S3, and AWS Redshift to build serverless data pipelines that are scalable, reliable, and cost-effective.
- Data Modeling: Collaborate with data scientists and analysts to understand data requirements and design appropriate data models, ensuring data is structured optimally for analytical purposes.
- Data Quality Assurance: Implement data validation and quality checks within ETL pipelines to ensure data accuracy, completeness, and consistency.
- Performance Optimization: Continuously optimize ETL processes for efficiency, performance, and scalability, monitoring and troubleshooting any bottlenecks or issues that may arise.
- Documentation: Maintain comprehensive documentation of ETL processes, data lineage, and system architecture to ensure knowledge sharing and compliance with best practices.
- Security and Compliance: Implement data security measures, encryption, and compliance standards (e.g., GDPR, HIPAA) as required for sensitive data handling.
- Monitoring and Logging: Set up monitoring, alerting, and logging systems to proactively identify and resolve data pipeline issues.
- Collaboration: Work closely with cross-functional teams, including data scientists, data analysts, software engineers, and business stakeholders, to understand data requirements and deliver solutions.
- Continuous Learning: Stay current with industry trends, emerging technologies, and best practices in data engineering and cloud computing and apply them to enhance existing processes.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
- Proven experience as a Data Engineer with a focus on ETL pipeline development.
- Strong proficiency in Python programming.
- In-depth knowledge of AWS serverless technologies and services.
- Familiarity with data warehousing concepts and tools (e.g., Redshift, Snowflake).
- Experience with version control systems (e.g., Git).
- Strong SQL skills for data extraction and transformation.
- Excellent problem-solving and troubleshooting abilities.
- Ability to work independently and collaboratively in a team environment.
- Effective communication skills for articulating technical concepts to non-technical stakeholders.
- Certifications such as AWS Certified Data Analytics - Specialty or AWS Certified DevOps Engineer are a plus.
Preferred Experience:
- Knowledge of data orchestration and workflow management tools
- Familiarity with data visualization tools (e.g., Tableau, Power BI).
- Previous experience in industries with strict data compliance requirements (e.g., insurance, finance) is beneficial.
What You Can Expect:
- Innovation Abounds: Join a company that constantly pushes the boundaries of technology and encourages creative thinking. Your ideas and expertise will be valued and put to work in pioneering solutions.
- Collaborative Excellence: Be part of a team of engineers who are as passionate and skilled as you are. Together, you'll tackle challenging projects, learn from each other, and achieve remarkable results.
- Global Impact: Contribute to projects with a global reach and make a tangible difference. Your work will shape the future of technology in finance, banking, and corporate sectors.
They offer an exciting and professional environment with great career and growth opportunities. Their office is located in the heart of Salt Lake Sector V, offering a terrific workspace that's both accessible and inspiring. Their team members enjoy a professional work environment with regular team outings. Joining the team means becoming part of a vibrant and dynamic team where your skills will be valued, your creativity will be nurtured, and your contributions will make a difference. In this role, you can work alongside some of the brightest minds in the industry.
If you're ready to take your career to the next level and be part of a dynamic team that's driving innovation on a global scale, we want to hear from you.
Apply today for more information about this exciting opportunity.
Onsite Location: Kolkata, India (Salt Lake Sector V)
Location: Pune/Nagpur,Goa,Hyderabad/
Job Requirements:
- 9 years and above of total experience preferably in bigdata space.
- Creating spark applications using Scala to process data.
- Experience in scheduling and troubleshooting/debugging Spark jobs in steps.
- Experience in spark job performance tuning and optimizations.
- Should have experience in processing data using Kafka/Pyhton.
- Individual should have experience and understanding in configuring Kafka topics to optimize the performance.
- Should be proficient in writing SQL queries to process data in Data Warehouse.
- Hands on experience in working with Linux commands to troubleshoot/debug issues and creating shell scripts to automate tasks.
- Experience on AWS services like EMR.
Experience : 3 to 7 Years
Number of Positions : 20
Job Location : Hyderabad
Notice : 30 Days
1. Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> Quick sight
2. Experience in developing lambda functions with AWS Lambda
3. Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark
4. Should be able to code in Python and Scala.
5. Snowflake experience will be a plus
Hadoop and Hive requirements as good to have or understanding of is enough.
Python + Data scientist : |
• Build data-driven models to understand the characteristics of engineering systems |
• Train, tune, validate, and monitor predictive models |
• Sound knowledge on Statistics |
• Experience in developing data processing tasks using PySpark such as reading, merging, enrichment, loading of data from external systems to target data destinations |
• Working knowledge on Big Data or/and Hadoop environments |
• Experience creating CI/CD Pipelines using Jenkins or like tools |
• Practiced in eXtreme Programming (XP) disciplines |
Company Profile:
Easebuzz is a payment solutions (fintech organisation) company which enables online merchants to accept, process and disburse payments through developer friendly APIs. We are focusing on building plug n play products including the payment infrastructure to solve complete business problems. Definitely a wonderful place where all the actions related to payments, lending, subscription, eKYC is happening at the same time.
We have been consistently profitable and are constantly developing new innovative products, as a result, we are able to grow 4x over the past year alone. We are well capitalised and have recently closed a fundraise of $4M in March, 2021 from prominent VC firms and angel investors. The company is based out of Pune and has a total strength of 180 employees. Easebuzz’s corporate culture is tied into the vision of building a workplace which breeds open communication and minimal bureaucracy. An equal opportunity employer, we welcome and encourage diversity in the workplace. One thing you can be sure of is that you will be surrounded by colleagues who are committed to helping each other grow.
Easebuzz Pvt. Ltd. has its presence in Pune, Bangalore, Gurugram.
Salary: As per company standards.
Designation: Data Engineering
Location: Pune
Experience with ETL, Data Modeling, and Data Architecture
Design, build and operationalize large scale enterprise data solutions and applications using one or more of AWS data and analytics services in combination with 3rd parties
- Spark, EMR, DynamoDB, RedShift, Kinesis, Lambda, Glue.
Experience with AWS cloud data lake for development of real-time or near real-time use cases
Experience with messaging systems such as Kafka/Kinesis for real time data ingestion and processing
Build data pipeline frameworks to automate high-volume and real-time data delivery
Create prototypes and proof-of-concepts for iterative development.
Experience with NoSQL databases, such as DynamoDB, MongoDB etc
Create and maintain optimal data pipeline architecture,
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
Evangelize a very high standard of quality, reliability and performance for data models and algorithms that can be streamlined into the engineering and sciences workflow
Build and enhance data pipeline architecture by designing and implementing data ingestion solutions.
Employment Type
Full-time
About Us |
|
upGrad is an online education platform building the careers of tomorrow by offering the most industry-relevant programs in an immersive learning experience. Our mission is to create a new digital-first learning experience to deliver tangible career impact to individuals at scale. upGrad currently offers programs in Data Science, Machine Learning, Product Management, Digital Marketing, and Entrepreneurship, etc. upGrad is looking for people passionate about management and education to help design learning programs for working professionals to stay sharp and stay relevant and help build the careers of tomorrow.
|
As an experienced Data Scientist you’ll join a team of data scientists, analysts, and software engineers
working to push the boundaries of data science in health care. We like to experiment, iterate, and
innovate with technology, from developing new algorithms specific to health care’s challenges, to
bringing the latest machine learning practices and applications developed in other industries into the
health care world. We know that algorithms are only valuable when powered by the right data, so we
focus on fully understanding the problems we need to solve, and truly understanding the data behind
them before launching into solutions – ensuring that the solutions we do land on are impactful and
powerful
Essential functions
• Research, conceptualize, and implement analytical approaches and predictive modeling to
evaluate scenarios, predict utilization and clinical outcomes, and recommend actions to impact
results.
• Manage and execute on the entire model development process, including scope definition,
hypothesis formation, data cleaning and preparation, feature selection, model implementation
in production, validation and iteration, using multiple data sources.
• Provide guidance on necessary data and software infrastructure capabilities to deliver a scalable
solution across partners and support the implementation of the team’s algorithms and models
• Contribute to the development and publication in major journals, conferences showcasing
leadership in healthcare data science.
• Work closely and collaborate with Data Scientists, Machine Learning engineers, IT teams and
Business stakeholders spread out across various locations in US and India to achieve business
goals
• Provide guidance to other Data Scientist and Machine Learning Engineers