
50+ Spark Jobs in India

Apply to 50+ Spark Jobs on CutShort.io. Find your next job, effortlessly. Browse Spark Jobs and apply today!

Mumbai
5 - 10 yrs
₹8L - ₹20L / yr
Data Science
Machine Learning (ML)
Natural Language Processing (NLP)
Computer Vision
recommendation algorithm
+6 more


Data Scientist – Delivery & New Frontiers Manager 

Job Description:   

We are seeking a highly skilled and motivated data scientist to join our Data Science team. The successful candidate will play a pivotal role in our data-driven initiatives and be responsible for designing, developing, and deploying data science solutions that drive business value for stakeholders. This role involves mapping business problems to formal data science solutions, working with a wide range of structured and unstructured data, architecture design, creating sophisticated models, setting up operations for the data science product with support from the MLOps team, and facilitating business workshops. In a nutshell, this person will represent data science and provide expertise across the full project cycle. Expectations of the successful candidate will be above those of a typical data scientist. Beyond technical expertise, problem solving in complex set-ups will be key to success in this role.

Responsibilities: 

  • Collaborate with cross-functional teams, including software engineers, product managers, and business stakeholders, to understand business needs and identify data science opportunities. 
  • Map complex business problems to data science problems and design data science solutions using the GCP/Azure Databricks platform. 
  • Collect, clean, and preprocess large datasets from various internal and external sources.  
  • Streamline the data science process, working with the Data Engineering and Technology teams. 
  • Manage multiple analytics projects within a function to deliver end-to-end data science solutions, create insights, and identify patterns.  
  • Develop and maintain data pipelines and infrastructure to support data science projects. 
  • Communicate findings and recommendations to stakeholders through data visualizations and presentations. 
  • Stay up to date with the latest data science trends and technologies, specifically for GCP companies 

 

Education / Certifications:  

Bachelor’s or Master’s degree in Computer Science, Engineering, Computational Statistics, or Mathematics. 

Job specific requirements:  

  • Brings 5+ years of deep data science experience 

  • Strong knowledge of machine learning and statistical modeling techniques in a cloud-based environment such as GCP, Azure, or AWS 

  • Experience with programming languages such as Python, R, Spark 
  • Experience with data visualization tools such as Tableau, Power BI, and D3.js 
  • Strong understanding of data structures, algorithms, and software design principles 
  • Experience with GCP platforms and services such as BigQuery, Cloud ML Engine, and Cloud Storage (a short BigQuery sketch follows this list) 
  • Experience configuring and setting up version control for code, data, and machine learning models using GitHub. 
  • Self-driven, able to work with cross-functional teams in a fast-paced environment, and adaptable to changing business needs. 
  • Strong analytical and problem-solving skills 
  • Excellent verbal and written communication skills 
  • Working knowledge of application architecture, data security, and compliance. 
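
To make the GCP expectation above concrete, here is a minimal, hedged sketch of querying BigQuery from Python with the official client library; the project ID and table name are hypothetical and not part of this posting.

```python
# Illustrative only: a small BigQuery query from Python.
# The project ID and table are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-analytics-project")  # hypothetical project

query = """
    SELECT customer_id, COUNT(*) AS orders
    FROM `my-analytics-project.sales.orders`   -- hypothetical table
    WHERE order_date >= '2024-01-01'
    GROUP BY customer_id
    ORDER BY orders DESC
    LIMIT 10
"""

for row in client.query(query).result():
    print(row.customer_id, row.orders)
```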


Databook

Posted by Nikhil Mohite
Mumbai
1 - 3 yrs
Upto ₹20L / yr (Varies)
Data engineering
Python
Apache Kafka
Spark
Amazon Web Services (AWS)
+1 more

Lightning Job By Cutshort ⚡

 

As part of this feature, you can expect status updates about your application and replies within 72 hours (once the screening questions are answered)

 

 

About Databook:-

- Great salespeople let their customers’ strategies do the talking.

 

Databook’s award-winning Strategic Relationship Management (SRM) platform uses advanced AI and NLP to empower the world’s largest B2B sales teams to create, manage, and maintain strategic relationships at scale. The platform ingests and interprets billions of financial and market data signals to generate actionable sales strategies that connect the seller’s solutions to a buyer’s financial pain and urgency.

 

The Opportunity

We're seeking Junior Engineers to support and develop Databook’s capabilities. Working closely with our seasoned engineers, you'll contribute to crafting new features and ensuring our platform's reliability. If you're eager to play a part in building the future of customer intelligence, with a keen eye towards quality, we'd love to meet you!

 

Specifically, you'll

- Participate in various stages of the engineering lifecycle alongside our experienced engineers.

- Assist in maintaining and enhancing features of the Databook platform.

- Collaborate with various teams to comprehend requirements and aid in implementing technology solutions.

 

Please note: As you progress and grow with us, you might be introduced to on-call rotations to handle any platform challenges.

 

Working Arrangements:

- This position offers a hybrid work mode, allowing employees to work both remotely and in-office as mutually agreed upon.

 

What we're looking for

- 1-2+ years experience as a Data Engineer

- Bachelor's degree in Engineering

- Willingness to work across different time zones

- Ability to work independently

- Knowledge of cloud (AWS or Azure)

- Exposure to distributed systems such as Spark, Flink or Kafka

- Fundamental knowledge of data modeling and optimizations

- Minimum of one year of experience using Python working as a Software Engineer

- Knowledge of SQL (Postgres) databases would be beneficial

- Experience with building analytics dashboards

- Familiarity with RESTful APIs and/or GraphQL is welcomed

- Hands-on experience with NumPy, Pandas and spaCy would be a plus (see the short example after this list)

- Exposure to or working experience with GenAI (LLMs in general) and LLMOps would be a plus

- Highly fluent in both spoken and written English
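
For illustration only (not part of the posting), a tiny pandas/NumPy sketch of the kind of dashboard-style aggregation a junior data engineer might write; the column names are hypothetical.

```python
# Hypothetical example: summarise revenue per account for a dashboard.
import numpy as np
import pandas as pd

events = pd.DataFrame({
    "account": ["acme", "acme", "globex", "globex", "initech"],
    "revenue": [120.0, 80.0, 210.0, np.nan, 95.0],
})

summary = (
    events.dropna(subset=["revenue"])            # drop rows with missing revenue
          .groupby("account", as_index=False)
          .agg(total_revenue=("revenue", "sum"))
          .sort_values("total_revenue", ascending=False)
)
print(summary)
```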

 

Ideal candidates will also have:

- Self-motivated with great organizational skills.

- Ability to focus on small and subtle details.

- Willingness to learn and adapt in a rapidly changing environment.

- Excellent written and oral communication skills.

 

Join us and enjoy these perks!

- Competitive salary with bonus

- Medical insurance coverage

- 5 weeks leave plus public holidays

- Employee referral bonus program

- Annual learning stipend to spend on books, courses or other training materials that help you develop skills relevant to your role or professional development

- Complimentary subscription to Masterclass

DataGrokr

Posted by Reshika Mendiratta
Bengaluru (Bangalore)
5yrs+
Upto ₹30L / yr (Varies)
Data engineering
Python
SQL
ETL
Data Warehouse (DWH)
+12 more

Lightning Job By Cutshort⚡

 

As part of this feature, you can expect status updates about your application and replies within 72 hours (once the screening questions are answered).


About DataGrokr

DataGrokr (https://www.datagrokr.com) is a cloud native technology consulting organization providing the next generation of big data analytics, cloud and enterprise solutions. We solve complex technology problems for our global clients who rely on us for our deep technical knowledge and delivery excellence.

If you are unafraid of technology, believe in your learning ability and are looking to work amongst smart, driven colleagues whom you can look up to and learn from, you might want to check us out. 


About the role

We are looking for a Senior Data Engineer to join our growing engineering team. As a member of the team,

• You will work on enterprise data platforms, architect and implement data lakes both on-prem and in the cloud.

• You will be responsible for evolving technical architecture, design and implementation of data solutions using a variety of big data technologies. You will work extensively on all major public cloud platforms - AWS, Azure and GCP.

• You will work with senior technical architects on our client side to evolve an effective technology architecture and development strategy to implement the solution.

• You will work with extremely talented peers and follow modern engineering practices using agile methodologies.

• You will coach, mentor and lead other engineers and provide guidance to ensure the quality and consistency of the solution.


Must-have skills and attitudes:

• Passion for data engineering and in-depth knowledge of some of the following technologies – SQL (expert level), Python (expert level), Spark (intermediate level), and the big data stack of AWS or GCP.

• Hands-on experience in data wrangling, data munging and ETL. Should be able to source data from anywhere and transform it into any shape using SQL, Python or Spark (a brief PySpark sketch follows this section).

• Hands-on experience working with variable data structures like XML, JSON, Avro, etc.

• Ability to create data models and architect data warehouse components

• Experience with version control (Git, Bitbucket, etc.)

• Strong understanding of Agile, experience with CI/CD pipelines and processes

• Ability to communicate with technical as well as non-technical audiences

• Collaboration with various stakeholders

• Experience leading scrum teams, participating in sprint grooming and planning sessions, and work/effort sizing and estimation
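
A brief, hedged PySpark sketch of the "source anything, reshape it" wrangling mentioned above; the S3 path and field names are hypothetical.

```python
# Illustrative sketch: flatten nested JSON into a tabular shape with PySpark.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("json-wrangling").getOrCreate()

# Hypothetical input path and schema.
orders = spark.read.json("s3a://example-bucket/raw/orders/*.json")

flat = orders.select(
    F.col("order_id"),
    F.col("customer.id").alias("customer_id"),   # nested struct field
    F.explode("items").alias("item"),            # one row per line item
).select("order_id", "customer_id", "item.sku", "item.qty")

flat.write.mode("overwrite").parquet("s3a://example-bucket/curated/order_items/")
```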


Desired Skills & Experience:

• At least 5 years of industry experience

• Working knowledge of any of the following – AWS Big Data stack (S3, Redshift, Athena, Glue, etc.), GCP Big Data stack (Cloud Storage, Workflow, Dataflow, Cloud Functions, BigQuery, Pub/Sub, etc.).

• Working knowledge of traditional enterprise data warehouse architectures and migrating them to the Cloud.

• Experience with data visualization tools (Tableau, Power BI, etc.)

• Experience with JIRA, Azure DevOps, etc.


How will DataGrokr support you in your growth:

• You will be groomed and mentored by senior leaders to take on leadership positions in the company

• You will be actively encouraged to attain certifications, lead technical workshops and conduct meetups to grow your own technology acumen and personal brand

• You will work in an open culture that promotes commitment over compliance, individual responsibility over rules and bringing out the best in everyone.

one-to-one, one-to-many, and many-to-many
Chennai
5 - 9 yrs
₹1L - ₹15L / yr
PowerBI
Python
Spark
Data Analytics
Databricks

Position Overview: We are seeking a talented Data Engineer with expertise in Power BI to join our team. The ideal candidate will be responsible for designing and implementing data pipelines, as well as developing insightful visualizations and reports using Power BI. Additionally, the candidate should have strong skills in Python, data analytics, PySpark, and Databricks. This role requires a blend of technical expertise, analytical thinking, and effective communication skills.

Key Responsibilities:

  1. Design, develop, and maintain data pipelines and architectures using PySpark and Databricks (a rough sketch follows this list).
  2. Implement ETL processes to extract, transform, and load data from various sources into data warehouses or data lakes.
  3. Collaborate with data analysts and business stakeholders to understand data requirements and translate them into actionable insights.
  4. Develop interactive dashboards, reports, and visualizations using Power BI to communicate key metrics and trends.
  5. Optimize and tune data pipelines for performance, scalability, and reliability.
  6. Monitor and troubleshoot data infrastructure to ensure data quality, integrity, and availability.
  7. Implement security measures and best practices to protect sensitive data.
  8. Stay updated with emerging technologies and best practices in data engineering and data visualization.
  9. Document processes, workflows, and configurations to maintain a comprehensive knowledge base.
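
A rough, illustrative PySpark sketch of the kind of Databricks ETL step referenced in item 1 above; the mount path, columns, and output table are hypothetical assumptions, not this employer's pipeline.

```python
# Hypothetical ETL step: aggregate raw sales into a table a Power BI report can read.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("sales-etl").getOrCreate()

raw = spark.read.option("header", True).csv("/mnt/raw/sales/*.csv")  # hypothetical mount

monthly = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .withColumn("month", F.date_trunc("month", F.to_timestamp("order_date")))
       .groupBy("month", "region")
       .agg(F.sum("amount").alias("total_sales"))
)

# Persist as a table that a Power BI dataset can query.
monthly.write.mode("overwrite").saveAsTable("analytics.monthly_sales")
```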

Requirements:

  1. Bachelor’s degree in Computer Science, Engineering, or related field. (Master’s degree preferred)
  2. Proven experience as a Data Engineer with expertise in Power BI, Python, PySpark, and Databricks.
  3. Strong proficiency in Power BI, including data modeling, DAX calculations, and creating interactive reports and dashboards.
  4. Solid understanding of data analytics concepts and techniques.
  5. Experience working with Big Data technologies such as Hadoop, Spark, or Kafka.
  6. Proficiency in programming languages such as Python and SQL.
  7. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud.
  8. Excellent analytical and problem-solving skills with attention to detail.
  9. Strong communication and collaboration skills to work effectively with cross-functional teams.
  10. Ability to work independently and manage multiple tasks simultaneously in a fast-paced environment.

Preferred Qualifications:

  • Advanced degree in Computer Science, Engineering, or related field.
  • Certifications in Power BI or related technologies.
  • Experience with data visualization tools other than Power BI (e.g., Tableau, QlikView).
  • Knowledge of machine learning concepts and frameworks.


Vola Finance

Posted by Reshika Mendiratta
Bengaluru (Bangalore)
3yrs+
Upto ₹20L / yr (Varies)
Amazon Web Services (AWS)
Data engineering
Spark
SQL
Data Warehouse (DWH)
+4 more

Lightning Job By Cutshort⚡

 

As part of this feature, you can expect status updates about your application and replies within 72 hours (once the screening questions are answered)


Roles & Responsibilities


Basic Qualifications:

● The position requires a four-year degree from an accredited college or university.

● Three years of data engineering / AWS Architecture and security experience.


Top candidates will also have:

Proven/strong understanding of and/or experience in many of the following:

● Experience designing Scalable AWS architecture.

● Ability to create modern data pipelines and data processing using AWS PaaS components (Glue, etc.) or open-source tools (Spark, HBase, Hive, etc.).

● Ability to develop SQL structures that support high volumes and scalability using RDBMS such as SQL Server, MySQL, Aurora, etc.

● Ability to model and design modern data structures, SQL/NoSQL databases, Data Lakes, Cloud Data Warehouse

● Experience in creating network architectures for secure, scalable solutions.

● Experience with message brokers such as Kinesis, Kafka, RabbitMQ, AWS SQS, AWS SNS, and Apache ActiveMQ (a short SQS sketch follows this list). Hands-on experience with AWS serverless architectures such as Glue, Lambda, Redshift, etc.

● Working knowledge of load balancers, AWS Shield, AWS GuardDuty, VPCs, subnets, network gateways, Route 53, etc.

● Knowledge of building disaster management systems and security log notification systems

● Knowledge of building scalable microservice architectures with AWS.

● Ability to create a framework for monthly security checks, and broad knowledge of AWS services

● Deploying software using CI/CD tools such as CircleCI, Jenkins, etc.

● ML/AI model deployment and production maintenance experience is mandatory.

● Experience with API tools such as REST, Swagger, Postman and Assertible.

● Version management tools such as GitHub, Bitbucket, GitLab.

● Debugging and maintaining software on Linux or Unix platforms.

● Test-driven development

● Experience building transactional databases.

● Python and PySpark programming experience.

● Must have experience engineering solutions in AWS.

● Working AWS experience; AWS certification is required prior to hiring

● Working in Agile Framework/Kanban Framework

● Must demonstrate solid knowledge of computer science fundamentals like data structures & algorithms.

● Passion for technology and an eagerness to contribute to a team-oriented environment.

● Demonstrated leadership on medium to large-scale projects impacting strategic priorities.

● Bachelor’s degree in Computer Science, Electrical Engineering, or a related field is required
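
For illustration only, a minimal boto3 sketch of working with one of the message brokers named above (Amazon SQS); the queue URL and payload are hypothetical.

```python
# Hypothetical example: send and receive a message on an SQS queue with boto3.
import json
import boto3

sqs = boto3.client("sqs", region_name="ap-south-1")
queue_url = "https://sqs.ap-south-1.amazonaws.com/123456789012/order-events"  # hypothetical

# Produce one event.
sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps({"order_id": 42, "status": "PAID"}))

# Consume up to 10 messages, deleting each after it is processed.
resp = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=5)
for msg in resp.get("Messages", []):
    print(json.loads(msg["Body"]))
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```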

DeepIntent

Posted by Indrajeet Deshmukh
Pune
3 - 5 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark
+5 more

About DeepIntent:

DeepIntent is a marketing technology company that helps healthcare brands strengthen communication with patients and healthcare professionals by enabling highly effective and performant digital advertising campaigns. Our healthcare technology platform, MarketMatch™, connects advertisers, data providers, and publishers to operate the first unified, programmatic marketplace for healthcare marketers. The platform’s built-in identity solution matches digital IDs with clinical, behavioural, and contextual data in real-time so marketers can qualify 1.6M+ verified HCPs and 225M+ patients to find their most clinically-relevant audiences and message them on a one-to-one basis in a privacy-compliant way. Healthcare marketers use MarketMatch to plan, activate, and measure digital campaigns in ways that best suit their business, from managed service engagements to technical integration or self-service solutions. DeepIntent was founded by Memorial Sloan Kettering alumni in 2016 and acquired by Propel Media, Inc. in 2017. We proudly serve major pharmaceutical and Fortune 500 companies out of our offices in New York, Bosnia and India.


What You’ll Do:

  • Establish formal data practice for the organisation.
  • Build & operate scalable and robust data architectures.
  • Create pipelines for the self-service introduction and usage of new data
  • Implement DataOps practices
  • Design, develop, and operate data pipelines which support data scientists and machine learning engineers.
  • Build simple, highly reliable data storage, ingestion, and transformation solutions which are easy to deploy and manage.
  • Collaborate with various business stakeholders, software engineers, machine learning engineers, and analysts.

Who You Are:

  • Experience in designing, developing and operating configurable data pipelines serving high volume and velocity data.
  • Experience working with public clouds like GCP/AWS.
  • Good understanding of software engineering, DataOps, data architecture, Agile and DevOps methodologies.
  • Experience building data architectures that optimize performance and cost, whether the components are prepackaged or homegrown.
  • Proficient with SQL, Java, Spring Boot, Python or a JVM-based language, Bash.
  • Experience with any of the Apache open-source projects such as Spark, Druid, Beam, Airflow, etc. and big data databases like BigQuery, ClickHouse, etc.
  • Good communication skills with the ability to collaborate with both technical and non-technical people.
  • Ability to Think Big, take bets and innovate, Dive Deep, Bias for Action, Hire and Develop the Best, Learn and be Curious

 

Publicis Sapient

Posted by Mohit Singh
Bengaluru (Bangalore), Pune, Hyderabad, Gurugram, Noida
5 - 11 yrs
₹20L - ₹36L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+7 more

Publicis Sapient Overview:

As a Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As a Senior Associate L2 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java, Scala, or Python, experience in data ingestion, integration, data wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. You are also required to have hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms.


Role & Responsibilities:

Your role is focused on Design, Development and delivery of solutions involving:

• Data Integration, Processing & Governance

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Implement scalable architectural models for data processing and storage

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode

• Build functionality for data analytics, search and aggregation

Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1. Overall 5+ years of IT experience with 3+ years in data-related technologies

2. Minimum 2.5 years of experience in Big Data technologies and working exposure to at least one cloud platform and its related data services (AWS / Azure / GCP)

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines (a minimal streaming sketch follows this list)

4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.

6. Well-versed and working knowledge of data platform-related services on at least one cloud platform, IAM and data security
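
As a minimal sketch of the real-time leg of the stack in item 3 (not Publicis Sapient's code), the snippet below reads a Kafka topic with Spark Structured Streaming; the broker, topic, and paths are hypothetical, and the spark-sql-kafka connector is assumed to be on the classpath.

```python
# Hypothetical streaming job: Kafka topic -> parquet landing zone.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-stream").getOrCreate()

events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
         .option("subscribe", "clickstream")                  # hypothetical topic
         .load()
         .select(F.col("value").cast("string").alias("payload"), "timestamp")
)

query = (
    events.writeStream.format("parquet")
          .option("path", "/data/landing/clickstream/")               # hypothetical path
          .option("checkpointLocation", "/data/checkpoints/clickstream/")
          .outputMode("append")
          .start()
)
query.awaitTermination()
```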


Preferred Experience and Knowledge (Good to Have):

# Competency

1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands-on experience

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation, etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing and microservices architectures

4. Performance tuning and optimization of data pipelines

5. CI/CD – infra provisioning on cloud, auto build & deployment pipelines, code quality

6. Cloud data specialty and other related Big Data technology certifications


Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes


Publicis Sapient

Posted by Mohit Singh
Bengaluru (Bangalore), Gurugram, Pune, Hyderabad, Noida
4 - 10 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more

Publicis Sapient Overview:

As a Senior Associate L1 in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.

Job Summary:

As a Senior Associate L1 in Data Engineering, you will do technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.

The role requires a hands-on technologist with a strong programming background in Java, Scala, or Python, experience in data ingestion, integration, data wrangling, computation, and analytics pipelines, and exposure to Hadoop ecosystem components. Hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms is preferable.


Role & Responsibilities:

Job Title: Senior Associate L1 – Data Engineering

Your role is focused on Design, Development and delivery of solutions involving:

• Data Ingestion, Integration and Transformation

• Data Storage and Computation Frameworks, Performance Optimizations

• Analytics & Visualizations

• Infrastructure & Cloud Computing

• Data Management Platforms

• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time

• Build functionality for data analytics, search and aggregation


Experience Guidelines:

Mandatory Experience and Competencies:

# Competency

1. Overall 3.5+ years of IT experience with 1.5+ years in data-related technologies

2. Minimum 1.5 years of experience in Big Data technologies

3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.

4. Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable

5. Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.


Preferred Experience and Knowledge (Good to Have):

# Competency

1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres) with hands-on experience

2. Knowledge of data governance processes (security, lineage, catalog) and tools like Collibra, Alation, etc.

3. Knowledge of distributed messaging frameworks like ActiveMQ / RabbitMQ / Solace, search & indexing and microservices architectures

4. Performance tuning and optimization of data pipelines

5. CI/CD – infra provisioning on cloud, auto build & deployment pipelines, code quality

6. Working knowledge of data platform-related services on at least one cloud platform, IAM and data security

7. Cloud data specialty and other related Big Data technology certifications


Job Title: Senior Associate L1 – Data Engineering

Personal Attributes:

• Strong written and verbal communication skills

• Articulation skills

• Good team player

• Self-starter who requires minimal oversight

• Ability to prioritize and manage multiple tasks

• Process orientation and the ability to define and set up processes

Arting Digital
Posted by Pragati Bhardwaj
Bengaluru (Bangalore)
4 - 6 yrs
₹10L - ₹15L / yr
databricks
Python
Spark
SQL
AWS Lambda

Title:- Senior Data Engineer 


Experience: 4-6 yrs

Budget: 24-28 lpa

Location: Bangalore 

Mode of Work: Work from office

Primary Skills: Databricks, Spark, PySpark, SQL, Python, AWS

Qualification: Any Engineering degree


Responsibilities:

• Design and build reusable components, frameworks and libraries at scale to support analytics products.

• Design and implement product features in collaboration with business and technology stakeholders.

• Anticipate, identify and solve issues concerning data management to improve data quality.

• Clean, prepare and optimize data at scale for ingestion and consumption.

• Drive the implementation of new data management projects and the restructuring of the current data architecture.

• Implement complex automated workflows and routines using workflow scheduling tools.

• Build continuous integration, test-driven development and production deployment frameworks.

• Drive collaborative reviews of design, code, test plans and dataset implementation performed by other data engineers in support of maintaining data engineering standards.

• Analyze and profile data for the purpose of designing scalable solutions.

• Troubleshoot complex data issues and perform root cause analysis to proactively resolve product and operational issues.

• Mentor and develop other data engineers in adopting best practices.

 

Qualifications:

 

Primary skillset:


• Experience working with distributed technology tools for developing batch and streaming pipelines using:

  o SQL, Spark, Python, PySpark [4+ years],

  o Airflow [3+ years] (a minimal DAG sketch follows this section),

  o Scala [2+ years].


• Able to write code which is optimized for performance.

• Experience in a cloud platform, e.g., AWS, GCP, Azure, etc.

• Able to quickly pick up new programming languages, technologies, and frameworks.

• Strong skills building positive relationships across Product and Engineering.

• Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders.

• Experience with creating/configuring Jenkins pipelines for a smooth CI/CD process for managed Spark jobs, building Docker images, etc.

• Working knowledge of data warehousing, data modelling, governance and data architecture.
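
A minimal, hedged Airflow sketch of the scheduled-workflow experience listed above; the DAG ID and task bodies are hypothetical placeholders.

```python
# Hypothetical DAG: two Python tasks chained on a daily schedule.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull data from the source system")

def load():
    print("write data to the warehouse")

with DAG(
    dag_id="daily_ingest",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task
```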

 

Good to have:


• Experience working with data platforms, including EMR, Airflow, Databricks (Data Engineering & Delta Lake components, and Lakehouse Medallion architecture), etc.

• Experience working in Agile and Scrum development processes.

• Experience in EMR/EC2, Databricks, etc.

• Experience working with data warehousing tools, including SQL databases, Presto, and Snowflake.

• Experience architecting data products in Streaming, Serverless and Microservices architectures and platforms.

Quinnox

Posted by MidhunKumar T
Bengaluru (Bangalore), Mumbai
10 - 15 yrs
₹30L - ₹35L / yr
ADF
azure data lake services
SQL Azure
azure synapse
Spark
+4 more

Mandatory Skills: Azure Data Lake Storage, Azure SQL databases, Azure Synapse, Databricks (PySpark/Spark), Python, SQL, Azure Data Factory.


Good to have: Power BI, Azure IaaS services, Azure DevOps, Microsoft Fabric


• Very strong understanding of ETL and ELT

• Very strong understanding of Lakehouse architecture

• Very strong knowledge of PySpark and Spark architecture (a brief Databricks sketch follows this list)

• Good knowledge of Azure Data Lake architecture and access controls

• Good knowledge of Microsoft Fabric architecture

• Good knowledge of Azure SQL databases

• Good knowledge of T-SQL

• Good knowledge of the CI/CD process using Azure DevOps

• Power BI
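
For illustration only (not Quinnox's code), a small PySpark snippet of the ADLS-to-Delta pattern implied by the skills above, as it might look in a Databricks notebook; the storage account, container, and column names are hypothetical, and `spark` is the session Databricks provides.

```python
# Hypothetical Databricks notebook cell: raw ADLS files -> cleaned Delta table.
from pyspark.sql import functions as F

raw_path = "abfss://raw@examplelake.dfs.core.windows.net/transactions/"              # hypothetical
curated_path = "abfss://curated@examplelake.dfs.core.windows.net/transactions_clean/"

df = spark.read.parquet(raw_path)   # `spark` is pre-created on Databricks clusters

clean = (
    df.dropDuplicates(["transaction_id"])
      .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
      .filter(F.col("amount") > 0)
)

clean.write.format("delta").mode("overwrite").save(curated_path)
```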

DeepIntent

Posted by Indrajeet Deshmukh
Pune
1 - 3 yrs
Best in industry
MongoDB
Big Data
Apache Kafka
Spring MVC
Spark
+1 more

With a core belief that advertising technology can measurably improve the lives of patients, DeepIntent is leading the healthcare advertising industry into the future. Built purposefully for the healthcare industry, the DeepIntent Healthcare Advertising Platform is proven to drive higher audience quality and script performance with patented technology and the industry’s most comprehensive health data. DeepIntent is trusted by 600+ pharmaceutical brands and all the leading healthcare agencies to reach the most relevant healthcare provider and patient audiences across all channels and devices. For more information, visit DeepIntent.com or find us on LinkedIn.


What You’ll Do:

  • Ensure timely and top-quality product delivery
  • Ensure that the end product is fully and correctly defined and documented
  • Ensure implementation/continuous improvement of formal processes to support product development activities
  • Drive the architecture/design decisions needed to achieve cost-effective and high-performance results
  • Conduct feasibility analysis, produce functional and design specifications of proposed new features.
  • Provide helpful and productive code reviews for peers and junior members of the team.
  • Troubleshoot complex issues discovered in-house as well as in customer environments.


Who You Are:

  • Strong computer science fundamentals in algorithms, data structures, databases, operating systems, etc.
  • Expertise in Java, Object Oriented Programming, Design Patterns
  • Experience in coding and implementing scalable solutions in a large-scale distributed environment
  • Working experience in a Linux/UNIX environment is good to have
  • Experience with relational databases and database concepts, preferably MySQL
  • Experience with SQL and Java optimization for real-time systems
  • Familiarity with version control systems Git and build tools like Maven
  • Excellent interpersonal, written, and verbal communication skills
  • BE/B.Tech./M.Sc./MCS/MCA in Computers or equivalent


The set of skills we are looking for:

  • MongoDB
  • Big Data
  • Apache Kafka 
  • Spring MVC 
  • Spark 
  • Java 


DeepIntent is committed to bringing together individuals from different backgrounds and perspectives. We strive to create an inclusive environment where everyone can thrive, feel a sense of belonging, and do great work together.

DeepIntent is an Equal Opportunity Employer, providing equal employment and advancement opportunities to all individuals. We recruit, hire and promote into all job levels the most qualified applicants without regard to race, color, creed, national origin, religion, sex (including pregnancy, childbirth and related medical conditions), parental status, age, disability, genetic information, citizenship status, veteran status, gender identity or expression, transgender status, sexual orientation, marital, family or partnership status, political affiliation or activities, military service, immigration status, or any other status protected under applicable federal, state and local laws. If you have a disability or special need that requires accommodation, please let us know in advance.

DeepIntent’s commitment to providing equal employment opportunities extends to all aspects of employment, including job assignment, compensation, discipline and access to benefits and training.

DeepIntent

Posted by Indrajeet Deshmukh
Pune
3 - 8 yrs
Best in industry
MongoDB
Big Data
Apache Kafka
Spring MVC
Spark
+1 more

With a core belief that advertising technology can measurably improve the lives of patients, DeepIntent is leading the healthcare advertising industry into the future. Built purposefully for the healthcare industry, the DeepIntent Healthcare Advertising Platform is proven to drive higher audience quality and script performance with patented technology and the industry’s most comprehensive health data. DeepIntent is trusted by 600+ pharmaceutical brands and all the leading healthcare agencies to reach the most relevant healthcare provider and patient audiences across all channels and devices. For more information, visit DeepIntent.com or find us on LinkedIn.


What You’ll Do:

  • Ensure timely and top-quality product delivery
  • Ensure that the end product is fully and correctly defined and documented
  • Ensure implementation/continuous improvement of formal processes to support product development activities
  • Drive the architecture/design decisions needed to achieve cost-effective and high-performance results
  • Conduct feasibility analysis, produce functional and design specifications of proposed new features.
  • Provide helpful and productive code reviews for peers and junior members of the team.
  • Troubleshoot complex issues discovered in-house as well as in customer environments.


Who You Are:

  • Strong computer science fundamentals in algorithms, data structures, databases, operating systems, etc.
  • Expertise in Java, Object Oriented Programming, Design Patterns
  • Experience in coding and implementing scalable solutions in a large-scale distributed environment
  • Working experience in a Linux/UNIX environment is good to have
  • Experience with relational databases and database concepts, preferably MySQL
  • Experience with SQL and Java optimization for real-time systems
  • Familiarity with version control systems Git and build tools like Maven
  • Excellent interpersonal, written, and verbal communication skills
  • BE/B.Tech./M.Sc./MCS/MCA in Computers or equivalent


The set of skills we are looking for:

  • MongoDB
  • Big Data
  • Apache Kafka 
  • Spring MVC 
  • Spark 
  • Java 


DeepIntent is committed to bringing together individuals from different backgrounds and perspectives. We strive to create an inclusive environment where everyone can thrive, feel a sense of belonging, and do great work together.

DeepIntent is an Equal Opportunity Employer, providing equal employment and advancement opportunities to all individuals. We recruit, hire and promote into all job levels the most qualified applicants without regard to race, color, creed, national origin, religion, sex (including pregnancy, childbirth and related medical conditions), parental status, age, disability, genetic information, citizenship status, veteran status, gender identity or expression, transgender status, sexual orientation, marital, family or partnership status, political affiliation or activities, military service, immigration status, or any other status protected under applicable federal, state and local laws. If you have a disability or special need that requires accommodation, please let us know in advance.

DeepIntent’s commitment to providing equal employment opportunities extends to all aspects of employment, including job assignment, compensation, discipline and access to benefits and training.


xyz


Agency job
via HR BIZ HUB by Pooja shankla
Bengaluru (Bangalore)
4 - 6 yrs
₹12L - ₹15L / yr
Java
Big Data
Apache Hive
Hadoop
Spark

Job Title: Big Data Developer

Job Description

Bachelor's degree in Engineering or Computer Science or equivalent OR Master's in Computer Applications or equivalent.

Solid software development experience and experience leading teams of engineers and scrum teams.

4+ years of hands-on experience working with MapReduce, Hive, and Spark (core, SQL and PySpark).

Solid data warehousing concepts.

Knowledge of the financial reporting ecosystem will be a plus.

4+ years of experience within Data Engineering / Data Warehousing using Big Data technologies will be an add-on.

Expert on the distributed ecosystem.

Hands-on experience with programming using Core Java or Python/Scala.

Expert on Hadoop and Spark architecture and their working principles.

Hands-on experience writing and understanding complex SQL (Hive/PySpark DataFrames) and optimizing joins while processing huge amounts of data (a short join-optimization sketch follows these requirements).

Experience in UNIX shell scripting.
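
A short, illustrative sketch of the join optimization mentioned above; table paths and names are hypothetical. Broadcasting the small dimension table avoids shuffling the large fact table.

```python
# Hypothetical example: broadcast the small table so the big one is not shuffled.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("join-optimization").getOrCreate()

transactions = spark.read.parquet("/data/warehouse/transactions/")  # large fact table
merchants = spark.read.parquet("/data/warehouse/merchants/")        # small dimension table

joined = transactions.join(broadcast(merchants), on="merchant_id", how="left")
joined.groupBy("merchant_name").count().show()
```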

Roles & Responsibilities

Ability to design and develop optimized Data pipelines for batch and real time data processing

Should have experience in analysis, design, development, testing, and implementation of system applications

Demonstrated ability to develop and document technical and functional specifications and analyze software and system processing flows.

Excellent technical and analytical aptitude

Good communication skills.

Excellent Project management skills.

Results driven Approach.

Mandatory Skills: Big Data, PySpark, Hive

Bengaluru (Bangalore), Hyderabad, Delhi, Gurugram
5 - 10 yrs
₹14L - ₹15L / yr
Google Cloud Platform (GCP)
Spark
PySpark
Apache Spark
"DATA STREAMING"

Data Engineering : Senior Engineer / Manager


As a Senior Engineer / Manager in Data Engineering, you will translate client requirements into technical design and implement components for data engineering solutions. You will utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions, and you will independently drive design discussions to ensure the necessary health of the overall solution.


Must Have skills :


1. GCP


2. Spark streaming : Live data streaming experience is desired.


3. Any one coding language: Java / Python / Scala



Skills & Experience :


- Overall experience of MINIMUM 5+ years with Minimum 4 years of relevant experience in Big Data technologies


- Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow and other components required in building end-to-end data pipelines. Working knowledge of real-time data pipelines is an added advantage.


- Strong experience in at least one of the programming languages Java, Scala, or Python; Java preferable.


- Hands-on working knowledge of NoSQL and MPP data platforms like HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.


- Well-versed and working knowledge with data platform related services on GCP


- Bachelor's degree and 6 to 12 years of work experience, or any combination of education, training and/or experience that demonstrates the ability to perform the duties of the position


Your Impact :


- Data Ingestion, Integration and Transformation


- Data Storage and Computation Frameworks, Performance Optimizations


- Analytics & Visualizations


- Infrastructure & Cloud Computing


- Data Management Platforms


- Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time


- Build functionality for data analytics, search and aggregation

Neysa Networks Pvt Ltd

Posted by Swapna Uchil
Mumbai
7 - 10 yrs
Best in industry
TensorFlow
PyTorch
Python
Hadoop
R Programming
+9 more

Day in the life...


As a machine learning engineer at Neysa, you would be required to

- Collaborate with network engineers and IT teams to identify network-related challenges and areas where ML can provide solutions. Understand the specific network remediation problems that need to be addressed.

- Develop ML-based models and algorithms specific to issues that affect computer networks, for example, congestion, security threats and human errors.

- Be comfortable handling multiple types and data sources and then pre-process and clean them for modelling purposes. 

- Develop machine learning models that analyse networking data to predict (and possibly prevent) issues, detect anomalies or optimise performance (a small anomaly-detection sketch follows this list). Choose the right approach for each, such as deep learning, reinforcement learning, or traditional statistical methods. 

- Train and evaluate the efficacy of your models, create performance metrics to assess robustness and effectiveness, and use feedback loops to make course corrections.

- Design solutions that can scale to handle large network environments efficiently. This typically means optimising execution latency and resource usage.  

- Integrate your model to work with running and maintaining a network. These could be with other machines, human operators, or both. 

- Document your design, architecture and your thought process. Work with the technical writers to make sure your message gets through.

- Stay updated with all that is changing in AI and ML.
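
A hedged sketch, not Neysa's actual approach: one common way to flag anomalous network samples is scikit-learn's IsolationForest; the feature names and values below are hypothetical.

```python
# Hypothetical anomaly detection over per-window network metrics.
import numpy as np
from sklearn.ensemble import IsolationForest

# Each row: [latency_ms, packet_loss_pct, throughput_mbps] for one time window.
samples = np.array([
    [12.0, 0.1, 940.0],
    [11.5, 0.0, 950.0],
    [13.2, 0.2, 930.0],
    [250.0, 8.5, 120.0],   # an obviously degraded window
])

model = IsolationForest(contamination=0.25, random_state=42).fit(samples)
print(model.predict(samples))   # -1 marks the anomalous window, 1 marks normal ones
```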

 

Must have skills

 

- You should have expertise in machine learning algorithms, data processing, and model development. It would be best to have proficiency in associated frameworks such as TensorFlow, PyTorch or scikit-learn.

- You should understand how computer networks work, how they are used, how they fail, and what happens if they fail.

- You should be proficient in programming languages like Python, R, Go, Lisp, etc. It would also help if you were comfortable with old-school shell scripting. 

- Experience with data processing tools like Hadoop, Spark or Kafka. 

- An above-average understanding of Linux and operating systems in general.

 

What separates the best from the rest

 

ML and AI are continuously evolving, and so is understanding how these technologies can be applied to solve real-world problems. To do your best, you may also need to

- Conceptualise ways to map new problems to existing methodologies or create new ones.

- Be prepared to iterate, reiterate and then iterate your approach.

- Be able to interact with subject matter experts in multiple fields to identify potential use cases for machine learning.

 

What you can expect

 

An environment where you can do your best work...

- The best equipment that complements your talents

- The best tools in the business for you to bring your creations to life

- A great environment

- Flexible work hours and flexible work locations

- The opportunity to make your mark and shape the future

- And have fun...

Hyderabad
6 - 15 yrs
₹11L - ₹15L / yr
Python
Spark
SQL Azure
Apache Kafka
MongoDB
+4 more

• 5+ years of experience designing, developing, validating, and automating ETL processes

• 3+ years of experience with traditional ETL tools such as Visual Studio, SQL Server Management Studio, SSIS, SSAS and SSRS

• 2+ years of experience with cloud technologies and platforms, such as Kubernetes, Spark, Kafka, Azure Data Factory, Snowflake, ML Flow, Databricks, Airflow or similar

• Must have experience with designing and implementing data access layers

• Must be an expert with SQL/T-SQL and Python

• Must have experience with Kafka

• Define and implement data models with various database technologies like MongoDB, CosmosDB, Neo4j, MariaDB and SQL Server (a brief MongoDB sketch follows the skills list below)

• Ingest and publish data from sources and to destinations via an API

• Exposure to ETL/ELT using Kafka or Azure Event Hubs with Spark or Databricks is a plus

• Exposure to healthcare technologies and integrations for FHIR API, HL7 or other HIE protocols is a plus


Skills Required :


Designing, Developing, ETL, Visual Studio, Python, Spark, Kubernetes, Kafka, Azure Data Factory, SQL Server, Airflow, Databricks, T-SQL, MongoDB, CosmosDB, Snowflake, SSIS, SSAS, SSRS, FHIR API, HL7, HIE Protocols
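
For illustration only, a minimal pymongo sketch of defining and querying a document model in MongoDB, one of the databases listed in the requirements; the connection string, collection, and fields are hypothetical.

```python
# Hypothetical document model: one patient with nested encounter records.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")   # hypothetical connection string
patients = client["clinical"]["patients"]

patients.insert_one({
    "patient_id": "P-1001",
    "encounters": [{"date": "2024-03-01", "code": "E11.9"}],   # nested documents
})

for doc in patients.find({"encounters.code": "E11.9"}, {"_id": 0, "patient_id": 1}):
    print(doc)
```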

NutaNXT Technologies

Posted by Jidnyasa S
Pune
6 - 9 yrs
₹15L - ₹28L / yr
Spark
Scala
Databricks
NOSQL Databases

DATA ENGINEERING CONSULTANT


About NutaNXT: NutaNXT is a next-gen Software Product Engineering services provider building ground-breaking products using AI/ML, Data Analytics, IoT, Cloud & new emerging technologies disrupting the global markets. Our mission is to help clients leverage our specialized Digital Product Engineering capabilities in Data Engineering, AI Automations, and Software Full Stack solutions and services to build best-in-class products and stay ahead of the curve. You will get a chance to work on multiple projects critical to NutaNXT's needs, with opportunities to learn, develop new skills, and switch teams and projects as you and our fast-paced business grow and evolve. Location: Pune. Experience: 6 to 8 years.


Job Description: NutaNXT is looking for someone to support the planning and implementation of data design services, provide sizing and configuration assistance, perform needs assessments, and deliver architectures for transformations and modernizations of enterprise data solutions using Azure cloud data technologies. As a Data Engineering Consultant, you will collect, aggregate, store, and reconcile data in support of the customer's business decisions. You will design and build data pipelines, data streams, data service APIs, data generators and other end-user information portals and insight tools.


Mandatory Skills: -


  1. Demonstrable experience in enterprise-level data platforms involving implementation of end-to-end data pipelines with Python or Scala; hands-on experience with at least one of the leading public cloud data platforms (ideally Azure)
  2. Experience with different databases (like column-oriented databases, NoSQL databases, RDBMS)
  3. Experience in architecting data pipelines and solutions for both streaming and batch integrations using tools/frameworks like Azure Databricks, Azure Data Factory, Spark, Spark Streaming, etc.
  4. Understanding of data modeling, warehouse design and fact/dimension concepts; good communication


Good To Have:


Certifications for any of the cloud services (Ideally Azure)

• Experience working with code repositories and continuous integration

• Understanding of development and project methodologies


Why Join Us?


We offer innovative work in the AI & Data Engineering space, a unique, diverse workplace environment, and continuous learning and development opportunities. These are just some of the reasons we're consistently recognized as one of the best companies to work for, and why our people choose to grow their careers at NutaNXT. We also offer a highly flexible, self-driven, remote work culture which fosters innovation, creativity and work-life balance, and market/industry-leading compensation, which we believe helps us consistently deliver for our clients and grow in the highly competitive, fast-evolving Digital Engineering space, with a strong focus on building advanced software products for clients in the US, Europe and APAC regions.

Compile

Posted by Sarumathi NH
Bengaluru (Bangalore)
7 - 10 yrs
Best in industry
Data Warehouse (DWH)
Informatica
ETL
Spark

You will be responsible for designing, building, and maintaining data pipelines that handle Real-world data at Compile. You will be handling both inbound and outbound data deliveries at Compile for datasets including Claims, Remittances, EHR, SDOH, etc.

You will

  • Work on building and maintaining data pipelines (specifically RWD).
  • Build, enhance and maintain existing pipelines in PySpark and Python, and help build analytical insights and datasets.
  • Scheduling and maintaining pipeline jobs for RWD.
  • Develop, test, and implement data solutions based on the design.
  • Design and implement quality checks on existing and new data pipelines (a small PySpark quality-check sketch follows this list).
  • Ensure adherence to security and compliance that is required for the products.
  • Maintain relationships with various data vendors and track changes and issues across vendors and deliveries.
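
A rough illustration of the quality checks mentioned above, not Compile's actual checks; the S3 path and column names are hypothetical.

```python
# Hypothetical quality gate: claim_id must be present and unique before publishing.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("rwd-quality-check").getOrCreate()

claims = spark.read.parquet("s3a://example-bucket/rwd/claims/")   # hypothetical path

checks = claims.agg(
    F.count("*").alias("row_count"),
    F.sum(F.col("claim_id").isNull().cast("int")).alias("null_claim_ids"),
    F.countDistinct("claim_id").alias("distinct_claim_ids"),
).collect()[0]

assert checks.null_claim_ids == 0, "claim_id must never be null"
assert checks.distinct_claim_ids == checks.row_count, "claim_id must be unique"
```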

You have

  • Hands-on experience with ETL processes (minimum of 5 years).
  • Excellent communication skills and ability to work with multiple vendors.
  • High proficiency with Spark and SQL.
  • Proficiency in data modeling, validation, quality checks, and data engineering concepts.
  • Experience in working with big-data processing technologies such as Databricks, dbt, S3, Delta Lake, Deequ, Griffin, Snowflake and BigQuery.
  • Familiarity with version control technologies and CI/CD systems.
  • Understanding of scheduling tools like Airflow/Prefect.
  • Minimum of 3 years of experience managing data warehouses.
  • Familiarity with healthcare datasets is a plus.

Compile embraces diversity and equal opportunity in a serious way. We are committed to building a team of people from many backgrounds, perspectives, and skills. We know the more inclusive we are, the better our work will be.         

Molecular Connections

Posted by Molecular Connections
Bengaluru (Bangalore)
8 - 10 yrs
₹15L - ₹20L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+4 more
  1. Big data developer with 8+ years of professional IT experience with expertise in Hadoop ecosystem components in ingestion, Data modeling, querying, processing, storage, analysis, Data Integration and Implementing enterprise level systems spanning Big Data.
  2. A skilled developer with strong problem solving, debugging and analytical capabilities, who actively engages in understanding customer requirements.
  3. Expertise in Apache Hadoop ecosystem components like Spark, Hadoop Distributed File System (HDFS), MapReduce, Hive, Sqoop, HBase, Zookeeper, YARN, Flume, Pig, NiFi, Scala and Oozie.
  4. Hands-on experience in creating real-time data streaming solutions using Apache Spark core, Spark SQL & DataFrames, Kafka, Spark Streaming and Apache Storm.
  5. Excellent knowledge of Hadoop architecture and the daemons of Hadoop clusters, which include Name Node, Data Node, Resource Manager, Node Manager and Job History Server.
  6. Worked on both Cloudera and Hortonworks Hadoop distributions. Experience in managing Hadoop clusters using the Cloudera Manager tool.
  7. Well versed in installation, Configuration, Managing of Big Data and underlying infrastructure of Hadoop Cluster.
  8. Hands on experience in coding MapReduce/Yarn Programs using Java, Scala and Python for analyzing Big Data.
  9. Exposure to Cloudera development environment and management using Cloudera Manager.
  10. Extensively worked on Spark using Scala on clusters for computation (analytics); installed it on top of Hadoop and performed advanced analytical applications by making use of Spark with Hive and SQL/Oracle.
  11. Implemented Spark using PYTHON and utilizing Data frames and Spark SQL API for faster processing of data and handled importing data from different data sources into HDFS using Sqoop and performing transformations using Hive, MapReduce and then loading data into HDFS.
  12. Used Spark Data Frames API over Cloudera platform to perform analytics on Hive data.
  13. Hands on experience in MLlib from Spark which are used for predictive intelligence, customer segmentation and for smooth maintenance in Spark streaming.
  14. Experience in using Flume to load log files into HDFS and Oozie for workflow design and scheduling.
  15. Experience in optimizing MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
  16. Working on creating data pipeline for different events of ingestion, aggregation, and load consumer response data into Hive external tables in HDFS location to serve as feed for tableau dashboards.
  17. Hands on experience in using Sqoop to import data into HDFS from RDBMS and vice-versa.
  18. In-depth Understanding of Oozie to schedule all Hive/Sqoop/HBase jobs.
  19. Hands on expertise in real time analytics with Apache Spark.
  20. Experience in converting Hive/SQL queries into RDD transformations using Apache Spark, Scala and Python.
  21. Extensive experience in working with different ETL tool environments like SSIS, Informatica and reporting tool environments like SQL Server Reporting Services (SSRS).
  22. Experience in Microsoft cloud and setting cluster in Amazon EC2 & S3 including the automation of setting & extending the clusters in AWS Amazon cloud.
  23. Extensively worked on Spark using Python on cluster for computational (analytics), installed it on top of Hadoop performed advanced analytical application by making use of Spark with Hive and SQL.
  24. Strong experience and knowledge of real time data analytics using Spark Streaming, Kafka and Flume.
  25. Knowledge in installation, configuration, supporting and managing Hadoop Clusters using Apache, Cloudera (CDH3, CDH4) distributions and on Amazon web services (AWS).
  26. Experienced in writing Ad Hoc queries using Cloudera Impala, also used Impala analytical functions.
  27. Experience in creating Data frames using PySpark and performing operation on the Data frames using Python.
  28. In depth understanding/knowledge of Hadoop Architecture and various components such as HDFS and MapReduce Programming Paradigm, High Availability and YARN architecture.
  29. Establishing multiple connections to different Redshift clusters (Bank Prod, Card Prod, SBBDA Cluster) and provide the access for pulling the information we need for analysis. 
  30. Generated various kinds of knowledge reports using Power BI based on Business specification. 
  31. Developed interactive Tableau dashboards to provide a clear understanding of industry specific KPIs using quick filters and parameters to handle them more efficiently.
  32. Well Experience in projects using JIRA, Testing, Maven and Jenkins build tools.
  33. Experienced in designing, built, and deploying and utilizing almost all the AWS stack (Including EC2, S3,), focusing on high-availability, fault tolerance, and auto-scaling.
  34. Good experience with use-case development, with Software methodologies like Agile and Waterfall.
  35. Working knowledge of Amazon's Elastic Compute Cloud (EC2) infrastructure for computational tasks and Simple Storage Service (S3) as a storage mechanism.
  36. Good working experience in importing data using Sqoop and SFTP from various sources like RDBMS, Teradata, Mainframes, Oracle and Netezza to HDFS, and performing transformations on it using Hive, Pig and Spark.
  37. Extensive experience in Text Analytics, developing different Statistical Machine Learning solutions to various business problems and generating data visualizations using Python and R.
  38. Proficient in NoSQL databases including HBase, Cassandra, MongoDB and its integration with Hadoop cluster.
  39. Hands on experience in Hadoop Big data technology working on MapReduce, Pig, Hive as Analysis tool, Sqoop and Flume data import/export tools.
Read more
A Product Based Client,Chennai
Chennai
4 - 8 yrs
₹10L - ₹15L / yr
Data Warehouse (DWH)
Informatica
ETL
Spark
PySpark
+2 more

Analytics Job Description

We are hiring an Analytics Engineer to help drive our Business Intelligence efforts. You will partner closely with leaders across the organization, working together to understand the how and why of people, team and company challenges, workflows and culture. The team is responsible for delivering data and insights that drive decision-making, execution, and investments for our product initiatives.

You will work cross-functionally with product, marketing, sales, engineering, finance, and our customer-facing teams, enabling them with data and narratives about the customer journey. You’ll also work closely with other data teams, such as data engineering and product analytics, to ensure we are creating a strong data culture at Blend that enables our cross-functional partners to be more data-informed.


Role: Data Engineer

Please find below the JD for the Data Engineer role.

Location: Guindy, Chennai

How you’ll contribute:

• Develop objectives and metrics, ensure priorities are data-driven, and balance short-term and long-term goals

• Develop deep analytical insights to inform and influence product roadmaps and business decisions and help improve the consumer experience

• Work closely with GTM and supporting operations teams to author and develop core data sets that empower analyses

• Deeply understand the business and proactively spot risks and opportunities

• Develop dashboards and define metrics that drive key business decisions

• Build and maintain scalable ETL pipelines via solutions such as Fivetran, Hightouch, and Workato

• Design our Analytics and Business Intelligence architecture, assessing and implementing new technologies that fit

• Work with our engineering teams to continually make our data pipelines and tooling more resilient


Who you are:

• Bachelor’s degree or equivalent required from an accredited institution with a quantitative focus such as Economics, Operations Research, Statistics, or Computer Science, OR 1-3 years of experience as a Data Analyst, Data Engineer, or Data Scientist

• Must have strong SQL and data modeling skills, with experience applying those skills to thoughtfully create data models in a warehouse environment

• A proven track record of using analysis to drive key decisions and influence change

• Strong storyteller with the ability to communicate effectively with managers and executives

• Demonstrated ability to define metrics for product areas, understand the right questions to ask, push back on stakeholders in the face of ambiguous, complex problems, and work with diverse teams with different goals

• A passion for documentation

• A solution-oriented growth mindset. You’ll need to be a self-starter and thrive in a dynamic environment

• A bias towards communication and collaboration with business and technical stakeholders

• Quantitative rigor and systems thinking

• Prior startup experience is preferred, but not required

• Interest or experience in machine learning techniques (such as clustering, decision trees, and segmentation)

• Familiarity with a scientific computing language, such as Python, for data wrangling and statistical analysis

• Experience with a SQL-focused data transformation framework such as dbt

• Experience with a Business Intelligence tool such as Mode/Tableau


Mandatory Skillset:

- Very strong in SQL
- Spark OR PySpark OR Python
- Shell Scripting


Read more
Epik Solutions
Sakshi Sarraf
Posted by Sakshi Sarraf
Bengaluru (Bangalore), Noida
4 - 13 yrs
₹7L - ₹18L / yr
Python
SQL
databricks
Scala
Spark
+2 more

Job Description:


As an Azure Data Engineer, your role will involve designing, developing, and maintaining data solutions on the Azure platform. You will be responsible for building and optimizing data pipelines, ensuring data quality and reliability, and implementing data processing and transformation logic. Your expertise in Azure Databricks, Python, SQL, Azure Data Factory (ADF), PySpark, and Scala will be essential for performing the following key responsibilities:


Designing and developing data pipelines: You will design and implement scalable and efficient data pipelines using Azure Databricks, PySpark, and Scala. This includes data ingestion, data transformation, and data loading processes.
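
As a hedged illustration of what such a pipeline skeleton can look like in PySpark on Databricks – ingest, transform, load – the sketch below uses placeholder ADLS paths and assumes storage authentication is already configured on the cluster:

```python
# Illustrative-only PySpark pipeline skeleton: ingest raw CSV, transform, and
# load to a curated zone. Paths are placeholders; authentication to the storage
# account (e.g. via cluster config or a service principal) is assumed.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("ingest-transform-load").getOrCreate()

raw_path = "abfss://raw@examplestorage.dfs.core.windows.net/sales/"          # placeholder
curated_path = "abfss://curated@examplestorage.dfs.core.windows.net/sales/"  # placeholder

# Ingest
raw_df = spark.read.option("header", True).csv(raw_path)

# Transform: basic typing, cleanup and de-duplication
clean_df = (
    raw_df
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
    .dropna(subset=["order_id"])
    .dropDuplicates(["order_id"])
)

# Load: partitioned Parquet in the curated zone
clean_df.write.mode("overwrite").partitionBy("order_date").parquet(curated_path)
```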


Data modeling and database design: You will design and implement data models to support efficient data storage, retrieval, and analysis. This may involve working with relational databases, data lakes, or other storage solutions on the Azure platform.


Data integration and orchestration: You will leverage Azure Data Factory (ADF) to orchestrate data integration workflows and manage data movement across various data sources and targets. This includes scheduling and monitoring data pipelines.


Data quality and governance: You will implement data quality checks, validation rules, and data governance processes to ensure data accuracy, consistency, and compliance with relevant regulations and standards.


Performance optimization: You will optimize data pipelines and queries to improve overall system performance and reduce processing time. This may involve tuning SQL queries, optimizing data transformation logic, and leveraging caching techniques.


Monitoring and troubleshooting: You will monitor data pipelines, identify performance bottlenecks, and troubleshoot issues related to data ingestion, processing, and transformation. You will work closely with cross-functional teams to resolve data-related problems.


Documentation and collaboration: You will document data pipelines, data flows, and data transformation processes. You will collaborate with data scientists, analysts, and other stakeholders to understand their data requirements and provide data engineering support.


Skills and Qualifications:


Strong experience with Azure Databricks, Python, SQL, ADF, PySpark, and Scala.

Proficiency in designing and developing data pipelines and ETL processes.

Solid understanding of data modeling concepts and database design principles.

Familiarity with data integration and orchestration using Azure Data Factory.

Knowledge of data quality management and data governance practices.

Experience with performance tuning and optimization of data pipelines.

Strong problem-solving and troubleshooting skills related to data engineering.

Excellent collaboration and communication skills to work effectively in cross-functional teams.

Understanding of cloud computing principles and experience with Azure services.


Read more
Bengaluru (Bangalore)
5 - 9 yrs
₹10L - ₹18L / yr
Machine Learning (ML)
Data Science
Natural Language Processing (NLP)
Computer Vision
recommendation algorithm
+10 more

Requirements

Experience

  • 5+ years of professional experience in implementing MLOps framework to scale up ML in production.
  • Hands-on experience with Kubernetes, Kubeflow, MLflow, SageMaker, and other ML model experiment management tools, covering training, inference, and evaluation (see the tracking sketch after this list).
  • Experience in ML model serving (TorchServe, TensorFlow Serving, NVIDIA Triton inference server, etc.)
  • Proficiency with ML model training frameworks (PyTorch, Pytorch Lightning, Tensorflow, etc.).
  • Experience with GPU computing to do data and model training parallelism.
  • Solid software engineering skills in developing systems for production.
  • Strong expertise in Python.
  • Building end-to-end data systems as an ML Engineer, Platform Engineer, or equivalent.
  • Experience working with cloud data processing technologies (S3, ECR, Lambda, AWS, Spark, Dask, ElasticSearch, Presto, SQL, etc.).
  • Having Geospatial / Remote sensing experience is a plus.
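
As referenced above, a minimal sketch of experiment tracking with MLflow; the experiment name, parameters and metric values are placeholders, and a default local tracking store is assumed:

```python
# Hedged sketch of MLflow experiment tracking; values are placeholders only.
import mlflow

mlflow.set_experiment("demo-experiment")   # created if it does not exist

with mlflow.start_run(run_name="baseline"):
    # Log hyperparameters of a hypothetical training run
    mlflow.log_param("learning_rate", 0.001)
    mlflow.log_param("batch_size", 32)

    # ... model training would happen here ...

    # Log evaluation metrics produced by that run
    mlflow.log_metric("val_loss", 0.42)
    mlflow.log_metric("val_accuracy", 0.87)
```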
Read more
BlueYonder
Bengaluru (Bangalore), Hyderabad
10 - 14 yrs
Best in industry
Java
J2EE
Spring Boot
Hibernate (Java)
Gradle
+13 more

·      Core responsibilities include analyzing business requirements and designs for accuracy and completeness, and developing and maintaining the relevant product.

·      BlueYonder is seeking a Senior/Principal Architect in the Data Services department (under the Luminate Platform) to act as one of the key technology leaders building and managing BlueYonder’s technology assets in the Data Platform and Services.

·      This individual will act as a trusted technical advisor and strategic thought leader to the Data Services department. The successful candidate will have the opportunity to lead, participate, guide, and mentor other people in the team on architecture and design in a hands-on manner. You are responsible for the technical direction of the Data Platform. This position reports to the Global Head, Data Services and will be based in Bangalore, India.

·      Core responsibilities include architecting and designing (along with counterparts and distinguished Architects) a ground-up, cloud-native (we use Azure) SaaS product in order management and micro-fulfillment.

·      The team currently comprises 60+ global associates across the US, India (COE) and the UK and is expected to grow rapidly. The incumbent will need to have leadership qualities to also mentor junior and mid-level software associates in our team. This person will lead the Data Platform architecture – streaming and bulk – with Snowflake/Elasticsearch/other tools.

Our current technical environment:

·      Software: Java, Spring Boot, Gradle, Git, Hibernate, REST API, OAuth, Snowflake

·      Application Architecture: Scalable, resilient, event-driven, secure multi-tenant microservices architecture

·      Cloud Architecture: MS Azure (ARM templates, AKS, HDInsight, Application Gateway, Virtual Networks, Event Hub, Azure AD)

·      Frameworks/Others: Kubernetes, Kafka, Elasticsearch, Spark, NoSQL, RDBMS, Spring Boot, Gradle, Git, Ignite

Read more
AdElement

at AdElement

2 recruiters
Sachin Bhatevara
Posted by Sachin Bhatevara
Pune
0 - 1 yrs
₹0.1L - ₹3L / yr
Java
Javascript
React.js
Angular (2+)
AngularJS (1.x)
+10 more

We are looking for computer science/engineering final-year students or fresh graduates who have a solid understanding of computer science fundamentals (algorithms, data structures, object-oriented programming) and strong Java programming skills. You will get to work on machine learning algorithms as applied to online advertising, or do data analytics. You will learn how to collaborate in small, agile teams, do rapid development and testing, and get to taste the invigorating feel of a start-up company.


Experience

None required


Required Skills

-Solid foundation in computer science, with strong competencies in data structures, algorithms, and software design

-Java / Python programming

-UI/UX HTML5 CSS3, Javascript

-MYSQL, Relational Databases

-MVC Framework, ReactJS


Optional Skills

-Familiarity with online advertising, web technologies

-Familiarity with Hadoop, Spark, Scala


Education

UG - B.Tech/B.E. - Computers; PG - M.Tech - Computers

Read more
Exponentia.ai

at Exponentia.ai

1 product
1 recruiter
Vipul Tiwari
Posted by Vipul Tiwari
Mumbai
4 - 6 yrs
₹12L - ₹19L / yr
ETL
Informatica
Data Warehouse (DWH)
databricks
Amazon Web Services (AWS)
+6 more

Job Description

Position: Sr Data Engineer – Databricks & AWS

Experience: 4 - 5 Years

 

Company Profile:


Exponentia.ai is an AI tech organization with a presence across India, Singapore, the Middle East, and the UK. We are an innovative and disruptive organization, working on cutting-edge technology to help our clients transform into the enterprises of the future. We provide artificial intelligence-based products/platforms capable of automated cognitive decision-making to improve productivity, quality, and economics of the underlying business processes. Currently, we are transforming ourselves and rapidly expanding our business.

Exponentia.ai has developed long-term relationships with world-class clients such as PayPal, PayU, SBI Group, HDFC Life, Kotak Securities, Wockhardt and Adani Group amongst others.

One of the top partners of Cloudera (leading analytics player) and Qlik (leader in BI technologies), Exponentia.ai has recently been awarded the ‘Innovation Partner Award’ by Qlik in 2017.

Get to know more about us on our website: http://www.exponentia.ai/ and Life @Exponentia.

 

​Role Overview: 


·         A Data Engineer understands the client requirements and develops and delivers the data engineering solutions as per the scope.

·         The role requires good skills in the development of solutions using various services required for data architecture on Databricks Delta Lake, streaming, AWS, ETL Development, and data modeling.

 

Job Responsibilities


•         Design of data solutions on Databricks including delta lake, data warehouse, data marts and other data solutions to support the analytics needs of the organization.

•         Apply best practices during design in data modeling (logical, physical) and ETL pipelines (streaming and batch) using cloud-based services.

•         Design, develop and manage the pipelining (collection, storage, access), data engineering (data quality, ETL, Data Modelling) and understanding (documentation, exploration) of the data.

•         Interact with stakeholders regarding data landscape understanding, conducting discovery exercises, developing proof of concepts and demonstrating it to stakeholders.

 

Technical Skills 


•         Has more than 2 years of experience in developing data lakes and data marts on the Databricks platform.

•         Proven skill sets in AWS Data Lake services such as - AWS Glue, S3, Lambda, SNS, IAM, and skills in Spark, Python, and SQL.

•         Experience in Pentaho

•         Good understanding of developing a data warehouse, data marts etc.

•         Has a good understanding of system architectures, and design patterns and should be able to design and develop applications using these principles.

 

Personality Traits


•         Good collaboration and communication skills

•         Excellent problem-solving skills to be able to structure the right analytical solutions.

•         Strong sense of teamwork, ownership, and accountability

•         Analytical and conceptual thinking 

•         Ability to work in a fast-paced environment with tight schedules.

•         Good presentation skills with the ability to convey complex ideas to peers and management.

 

Education:

 

BE / ME / MS/MCA.

    



Read more
iLink Systems

at iLink Systems

1 video
1 recruiter
Ganesh Sooriyamoorthu
Posted by Ganesh Sooriyamoorthu
Chennai, Pune, Noida, Bengaluru (Bangalore)
5 - 15 yrs
₹10L - ₹15L / yr
Apache Kafka
Big Data
Java
Spark
Hadoop
+1 more
  • KSQL
  • Data Engineering spectrum (Java/Spark)
  • Spark Scala / Kafka Streaming (see the streaming sketch after this list)
  • Confluent Kafka components
  • Basic understanding of Hadoop
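
A hedged sketch of Kafka consumption with Spark Structured Streaming in PySpark (the Scala equivalent follows the same API); the broker, topic and checkpoint values are placeholders and the Spark-Kafka connector package is assumed to be available:

```python
# Hedged sketch: consume a Kafka topic with Spark Structured Streaming.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-streaming-sketch").getOrCreate()

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")   # placeholder broker
    .option("subscribe", "events")                        # placeholder topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers key/value as binary; cast the value to string for processing.
parsed = events.select(F.col("value").cast("string").alias("payload"))

query = (
    parsed.writeStream
    .format("console")                                         # swap for a real sink
    .option("checkpointLocation", "/tmp/checkpoints/events")   # placeholder path
    .outputMode("append")
    .start()
)
query.awaitTermination()
```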


Read more
Kloud9 Technologies
Bengaluru (Bangalore)
3 - 6 yrs
₹5L - ₹20L / yr
Amazon Web Services (AWS)
Amazon EMR
EMR
Spark
PySpark
+9 more

About Kloud9:

 

Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.

 

Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.

 

At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.

 

Our sole focus is to provide cloud expertise to retail industry giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.

 

We are a cloud vendor that is both platform and technology independent. Our vendor independence not just provides us with a unique perspective into the cloud market but also ensures that we deliver the cloud solutions available that best meet our clients' requirements.


What we are looking for:

● 3+ years’ experience developing Data & Analytic solutions

● Experience building data lake solutions leveraging one or more of the following: AWS, EMR, S3, Hive & Spark

● Experience with relational SQL

● Experience with scripting languages such as Shell, Python

● Experience with source control tools such as GitHub and related dev process

● Experience with workflow scheduling tools such as Airflow (see the DAG sketch after this list)

● In-depth knowledge of scalable cloud

● Has a passion for data solutions

● Strong understanding of data structures and algorithms

● Strong understanding of solution and technical design

● Has a strong problem-solving and analytical mindset

● Experience working with Agile Teams.

● Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders

● Able to quickly pick up new programming languages, technologies, and frameworks

● Bachelor’s Degree in computer science
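
As referenced in the Airflow bullet above, a minimal DAG sketch; the task names and logic are hypothetical, and exact imports can vary slightly between Airflow versions:

```python
# A minimal Airflow DAG sketch for a daily job; tasks are placeholders that
# would be replaced by real extract/load/transform steps.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull data from the source system")   # placeholder logic


def load():
    print("write data to the data lake")        # placeholder logic


with DAG(
    dag_id="daily_ingest_sketch",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task
```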


Why Explore a Career at Kloud9:

 

With job opportunities in prime locations of US, London, Poland and Bengaluru, we help build your career paths in cutting edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates to deliver the best products and solutions to our customers.

Read more
TensorGo Software Private Limited
Deepika Agarwal
Posted by Deepika Agarwal
Remote only
5 - 8 yrs
₹5L - ₹15L / yr
Python
PySpark
apache airflow
Spark
Hadoop
+4 more

Requirements:

● Understanding our data sets and how to bring them together.

● Working with our engineering team to support custom solutions offered to product development.

● Filling the gap between development, engineering and data ops.

● Creating, maintaining and documenting scripts to support ongoing custom solutions.

● Excellent organizational skills, including attention to precise details

● Strong multitasking skills and ability to work in a fast-paced environment

● 5+ years of experience with Python for developing scripts.

● Know your way around RESTful APIs (able to integrate; not necessarily publish).

● You are familiar with pulling and pushing files from SFTP and AWS S3 (see the S3 sketch after this list).

● Experience with any cloud solutions including GCP / AWS / OCI / Azure.

● Familiarity with SQL programming to query and transform data from relational databases.

● Familiarity with Linux (and the Linux work environment).

● Excellent written and verbal communication skills

● Extracting, transforming, and loading data into internal databases and Hadoop

● Optimizing our new and existing data pipelines for speed and reliability

● Deploying product build and product improvements

● Documenting and managing multiple repositories of code

● Experience with SQL and NoSQL databases (Cassandra, MySQL)

● Hands-on experience in data pipelining and ETL (any of these frameworks/tools: Hadoop, BigQuery, Redshift, Athena)

● Hands-on experience in Airflow

● Understanding of best practices and common coding patterns around storing, partitioning, warehousing and indexing of data

● Experience in reading data from Kafka topics (both live stream and offline)

● Experience in PySpark and DataFrames
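
As referenced in the S3 bullet above, a small boto3 sketch of pulling and pushing files; bucket names and keys are placeholders and credentials are assumed to come from the environment:

```python
# Hedged sketch: pull and push files to S3 with boto3. Names are placeholders.
import boto3

s3 = boto3.client("s3")

# Pull: download an object to the local filesystem
s3.download_file("example-bucket", "incoming/events.csv", "/tmp/events.csv")

# ... transform /tmp/events.csv locally (e.g. with pandas or PySpark) ...

# Push: upload the processed file back under a different prefix
s3.upload_file("/tmp/events.csv", "example-bucket", "processed/events.csv")
```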

Responsibilities:

You’ll be:

● Collaborating across an agile team to continuously design, iterate, and develop big data systems.

● Extracting, transforming, and loading data into internal databases.

● Optimizing our new and existing data pipelines for speed and reliability.

● Deploying new products and product improvements.

● Documenting and managing multiple repositories of code.

Read more
Telstra

at Telstra

1 video
1 recruiter
Mahesh Balappa
Posted by Mahesh Balappa
Bengaluru (Bangalore), Hyderabad, Pune
3 - 7 yrs
Best in industry
Spark
Hadoop
NOSQL Databases
Apache Kafka

About Telstra

 

Telstra is Australia’s leading telecommunications and technology company, with operations in more than 20 countries, including In India where we’re building a new Innovation and Capability Centre (ICC) in Bangalore.

 

We’re growing, fast, and for you that means many exciting opportunities to develop your career at Telstra. Join us on this exciting journey, and together, we’ll reimagine the future.

 

Why Telstra?

 

  • We're an iconic Australian company with a rich heritage that's been built over 100 years. Telstra is Australia's leading Telecommunications and Technology Company. We've been operating internationally for more than 70 years.
  • International presence spanning over 20 countries.
  • We are one of the 20 largest telecommunications providers globally
  • At Telstra, the work is complex and stimulating, but with that comes a great sense of achievement. We are shaping tomorrow's modes of communication with our innovation-driven teams.

 

Telstra offers an opportunity to make a difference to the lives of millions of people by providing the choice of flexibility in work and a rewarding career that you will be proud of!

 

About the team

Being part of Networks & IT means you'll be part of a team that focuses on extending our network superiority to enable the continued execution of our digital strategy.

With us, you'll be working with world-leading technology and change the way we do IT to ensure business needs drive priorities, accelerating our digitisation programme.

 

Focus of the role

Any new engineer joining the data chapter will mostly be developing reusable data processing and storage frameworks that can be used across the data platform.

 

About you

To be successful in the role, you'll bring skills and experience in:-

 

Essential 

  • Hands-on experience in Spark Core, Spark SQL, SQL/Hive/Impala, Git/SVN/any other VCS, and data warehousing
  • Skilled in the Hadoop ecosystem (HDP/Cloudera/MapR/EMR, etc.)
  • Azure Data Factory/Airflow/Control-M/Luigi
  • PL/SQL
  • Exposure to NoSQL (HBase/Cassandra/GraphDB (Neo4j)/MongoDB)
  • File formats (Parquet/ORC/Avro/Delta/Hudi, etc.)
  • Kafka/Kinesis/Event Hub

 

Highly Desirable

Experience and knowledgeable on the following:

  • Spark Streaming
  • Cloud exposure (Azure/AWS/GCP)
  • Azure data offerings - ADF, ADLS2, Azure Databricks, Azure Synapse, Eventhubs, CosmosDB etc.
  • Presto/Athena
  • Azure DevOps
  • Jenkins/ Bamboo/Any similar build tools
  • Power BI
  • Prior experience in building or working in team building reusable frameworks,
  • Data modelling.
  • Data Architecture and design principles. (Delta/Kappa/Lambda architecture)
  • Exposure to CI/CD
  • Code Quality - Static and Dynamic code scans
  • Agile SDLC      

 

If you've got a passion to innovate, succeed as part of a great team, and looking for the next step in your career, we'd welcome you to apply!

___________________________

 

We’re committed to building a diverse and inclusive workforce in all its forms. We encourage applicants from diverse gender, cultural and linguistic backgrounds and applicants who may be living with a disability. We also offer flexibility in all our roles, to ensure everyone can participate.

To learn more about how we support our people, including accessibility adjustments we can provide you through the recruitment process, visit tel.st/thrive.

Read more
Kloud9 Technologies
manjula komala
Posted by manjula komala
Bengaluru (Bangalore)
3 - 6 yrs
₹18L - ₹27L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+6 more

About Kloud9:


Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.


Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. The E-commerce of any industry is limiting and poses a huge challenge in terms of the finances spent on physical data structures.


At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.


Our sole focus is to provide cloud expertise to retail industry giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers have been designing, building and implementing solutions for retailers for an average of more than 20 years.


We are a cloud vendor that is both platform and technology independent. Our vendor independence not just provides us with a unique perspective into the cloud market but also ensures that we deliver the cloud solutions available that best meet our clients' requirements.



What we are looking for:


●       3+ years’ experience developing Big Data & Analytic solutions

●       Experience building data lake solutions leveraging Google Data Products (e.g. Dataproc, AI Building Blocks, Looker, Cloud Data Fusion, Dataprep, etc.), Hive, Spark

●       Experience with relational SQL/NoSQL

●       Experience with Spark (Scala/Python/Java) and Kafka

●       Work experience with using Databricks (Data Engineering and Delta Lake components)

●       Experience with source control tools such as GitHub and related dev process

●       Experience with workflow scheduling tools such as Airflow

●       In-depth knowledge of any scalable cloud vendor (GCP preferred)

●       Has a passion for data solutions

●       Strong understanding of data structures and algorithms

●       Strong understanding of solution and technical design

●       Has a strong problem solving and analytical mindset

●       Experience working with Agile Teams.

●       Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders

●       Able to quickly pick up new programming languages, technologies, and frameworks

●       Bachelor’s Degree in computer science


Why Explore a Career at Kloud9:


With job opportunities in prime locations of US, London, Poland and Bengaluru, we help build your career paths in cutting edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with their creativity and innovative solutions. Our vested interest in our employees translates to deliver the best products and solutions to our customers!

Read more
Gurugram
1 - 7 yrs
₹7L - ₹35L / yr
SQL
Python
PySpark
ETL
Informatica
+11 more

About Us:

Cognitio Analytics is an award-winning, niche service provider that offers digital transformation solutions powered by AI and machine learning. We help clients realize the full potential of their data assets and the investments made in related technologies, be it analytics and big data platforms, digital technologies or your people. Our solutions include Health Analytics powered by Cognitio’s Health Data Factory that drives better care outcomes and higher ROI on care spending. We specialize in providing Data Governance solutions that help effectively use data in a consistent and compliant manner. Additionally, our smart intelligence solutions enable a deeper understanding of operations through the use of data science and advanced solutions like process mining technologies. We have offices in New Jersey and Connecticut in the USA and in Gurgaon in India.

 

What we're looking for:

  • Ability in data modelling, design, build and deploy DW/BI systems for Insurance, Health Care, Banking, etc.
  • Performance tuning of ETL process and SQL queries and recommend & implement ETL and query tuning techniques.
  • Develop and create transformation queries, views, and stored procedures for ETL processes, and process automations.
  • Translate business needs into technical specifications, evaluate, and improve existing BI systems.
  • Use business intelligence and visualization software (e.g., Tableau, Qlik Sense, Power BI etc.) to empower customers to drive their analytics and reporting
  • Develop and update technical documentation for BI systems.

 

 

Key Technical Skills:

  • Hands on experience in MS SQL Server & MSBI (SSIS/SSRS/SSAS), with understanding of database concepts, star schema, SQL Tuning, OLAP, Databricks, Hadoop, Spark, cloud technologies.
  • Experience in designing and building complete ETL / SSIS processes and transforming data for ODS, Staging, and Data Warehouse
  • Experience building self-service reporting solutions using business intelligence software (e.g., Tableau, Qlik Sense, Power BI etc.)
Read more
Noida
3 - 8 yrs
₹9L - ₹24L / yr
Amazon Web Services (AWS)
AWS Lambda
AWS CloudFormation
Spark
Python
Hi

About the Company :

Our Client enables enterprises in their digital transformation journey by offering Consulting & Implementation Services related to Data Analytics &Enterprise Performance Management (EPM).

Our Client delivers the best-suited solutions to our customers across industries such as Retail & E-commerce, Consumer Goods, Pharmaceuticals & Life Sciences, Real Estate & Senior Housing, Hi-tech, and Media & Telecom, as well as Manufacturing and Automotive clientele.

Our in-house research and innovation lab has conceived multiple plug-n-play apps, toolkits and plugins to streamline implementation and enable faster time-to-market.

 

Job Title – AWS Developer

Notice period – Immediate to 60 days

Experience – 3-8 years

Location – Noida, Mumbai, Bangalore & Kolkata

 

Roles & Responsibilities

  • Bachelor’s degree in Computer Science or a related analytical field or equivalent experience is preferred
  • 3+ years’ experience in one or more architecture domains (e.g., business architecture, solutions architecture, application architecture) 
  • Must have 2 years of experience in design and implementation of cloud workloads in AWS.
  • Minimum of 2 years of experience handling workloads in large-scale environments. Experience in managing large operational cloud environments spanning multiple tenants through techniques such as Multi-Account management, AWS Well Architected Best Practices. 
  • Minimum 3 years of microservice architectural experience. 
  • Minimum of 3 years of experience working exclusively designing and implementing cloud-native workloads. 
  • Experience with analysing and defining technical requirements & design specifications. 
  • Experience with database design with both relational and document-based database systems.
  • Experience with integrating complex multi-tier applications. 
  • Experience with API design and development.
  • Experience with cloud networking and network security, including virtual networks, network security groups, cloud-native firewalls, etc.
  • Proven ability to write programs using an object-oriented or functional programming language such as Python, working with Spark, AWS Glue and AWS Lambda (see the Lambda sketch below)
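
For illustration only, a minimal Python Lambda handler; the event shape shown is that of an S3 put notification and is an assumption here, not part of this role's actual codebase:

```python
# Hedged sketch of an AWS Lambda handler in Python, triggered by an S3 event.
import json


def lambda_handler(event, context):
    # Log the raw event for troubleshooting
    print(json.dumps(event))

    # For an S3 trigger, each record carries the bucket and object key
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(f"received object s3://{bucket}/{key}")

    return {"statusCode": 200, "body": "processed"}
```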

 

Job Specification

*Strong and innovative approach to problem solving and finding solutions.

*Excellent communicator (written and verbal, formal and informal).

*Flexible and proactive/self-motivated working style with strong personal ownership of problem resolution.

*Ability to multi-task under pressure and work independently with minimal supervision.

Regards
Team Merito

Read more
Pune
0 - 1 yrs
₹10L - ₹15L / yr
Java
J2EE
Spring Boot
Hibernate (Java)
SQL
+6 more
1. Work closely with senior engineers to design, implement and deploy applications that impact the business with an emphasis on mobile, payments, and product website development
2. Design software and make technology choices across the stack (from data storage to application to front-end)
3. Understand a range of tier-1 systems/services that power our product to make scalable changes to critical path code
4. Own the design and delivery of an integral piece of a tier-1 system or application
5. Work closely with product managers, UX designers, and end users and integrate software components into a fully functional system
6. Work on the management and execution of project plans and delivery commitments
7. Take ownership of product/feature end-to-end for all phases from the development to the production
8. Ensure the developed features are scalable and highly available with no quality concerns
9. Work closely with senior engineers for refining and implementation
10. Manage and execute project plans and delivery commitments
11. Create and execute appropriate quality plans, project plans, test strategies, and processes for development activities in concert with business and project management efforts
Read more
Pune
5 - 9 yrs
₹5L - ₹15L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more
This role is for a developer with strong core application or system programming skills in Scala and Java, and good exposure to concepts and/or technology across the broader spectrum. Enterprise Risk Technology covers a variety of existing systems and green-field projects.

Full-stack Hadoop development experience with Scala development.
Full-stack Java development experience covering Core Java (including JDK 1.8) and a good understanding of design patterns.

Requirements:
• Strong hands-on development in Java technologies.
• Strong hands-on development in Hadoop technologies like Spark and Scala, and experience with Avro.
• Participation in product feature design and documentation.
• Requirement break-up, ownership and implementation.
• Product BAU deliveries and Level 3 production defect fixes.

Qualifications & Experience
• Degree holder in a numerate subject.
• Hands-on experience with Hadoop, Spark, Scala, Impala, Avro and messaging systems like Kafka.
• Experience across a core compiled language – Java.
• Proficiency in Java-related frameworks like Spring, Hibernate, JPA.
• Hands-on experience with JDK 1.8 and a strong skillset covering Collections and Multithreading, with experience working on distributed applications.
• Strong hands-on development track record with end-to-end development cycle involvement.
• Good exposure to computational concepts.
• Good communication and interpersonal skills.
• Working knowledge of risk and derivatives pricing (optional).
• Proficiency in SQL (PL/SQL) and data modelling.
• Understanding of Hadoop architecture and the Scala programming language is good to have.
Read more
Hyderabad
4 - 7 yrs
₹14L - ₹25L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+5 more

Roles and Responsibilities

Big Data Engineer + Spark Responsibilities:
• At least 3 to 4 years of relevant experience as a Big Data Engineer.
• Minimum 1 year of relevant hands-on experience with the Spark framework.
• Minimum 4 years of application development experience using any programming language like Scala/Java/Python.
• Hands-on experience with major components in the Hadoop ecosystem like HDFS, MapReduce, Hive or Impala.
• Strong programming experience building applications/platforms using Scala/Java/Python.
• Experienced in implementing Spark RDD transformations and actions to implement business analysis.
• An efficient interpersonal communicator with sound analytical problem-solving skills and management capabilities.
• Strives to keep the slope of the learning curve high and is able to quickly adapt to new environments and technologies.
• Good knowledge of the agile methodology of software development.
Read more
Hyderabad
3 - 5 yrs
₹10L - ₹14L / yr
Microservices
Java
Ansible
Spring Boot
Spring MVC
+8 more

Roles and Responsibilities

Java + Microservices Developer Responsibilities:
• Hands-on experience of minimum 3-5 years in the development of scalable and extensible systems using Java.
• Hands-on experience with microservices.
• Experience with frameworks like Spring, Spring MVC, Spring Boot, Hibernate, etc.
• Good knowledge or hands-on experience of minimum 1 year in JavaScript.
• Good working exposure to any big data technologies like Hadoop, Spark, Scala, etc.
• Experience with Jenkins, Maven, Git.
• Solid and fluent understanding of algorithms and data structures.
• Excellent software design, problem-solving and analytical skills.
• Candidates graduated from good schools like IITs, NIITs, IIITs (preferred).
• Excellent communication skills.
• Experience in database technology such as SQL & NoSQL.
• Good understanding of Elastic Search, Redis, routines sync & async.
Read more
Hyderabad
7 - 12 yrs
₹12L - ₹24L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+5 more

Skills

• Proficient experience of minimum 7 years with Hadoop.
• Hands-on experience of minimum 2 years with AWS – EMR/S3 and other AWS services and dashboards.
• Good experience of minimum 2 years with the Spark framework.
• Good understanding of the Hadoop ecosystem including Hive, MR, Spark and Zeppelin.
• Responsible for troubleshooting and recommendations for Spark and MR jobs; should be able to use existing logs to debug issues.
• Responsible for implementation and ongoing administration of Hadoop infrastructure, including monitoring, tuning and troubleshooting.
• Triage production issues when they occur together with other operational teams.
• Hands-on experience troubleshooting incidents, formulating theories, testing hypotheses and narrowing down possibilities to find the root cause.
Read more
Consulting
Agency job
via Michael Page by Pratanu Chakraborty
Pune, Mumbai
6 - 8 yrs
₹5L - ₹20L / yr
Python
Spark
SQL
6-8 years of hands-on development experience using core Python
Hands-on experience with Spark and SQL
Good to have Java knowledge
Read more
Chennai, Hyderabad
5 - 10 yrs
₹10L - ₹25L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Bigdata with cloud:

 

Experience : 5-10 years

 

Location : Hyderabad/Chennai

 

Notice period : 15-20 days Max

 

1. Expertise in building AWS data engineering pipelines with AWS Glue -> Athena -> QuickSight

2. Experience in developing Lambda functions with AWS Lambda

3. Expertise with Spark/PySpark – the candidate should be hands-on with PySpark code and able to do transformations with Spark (see the Glue job sketch after this list)

4. Should be able to code in Python and Scala.

5. Snowflake experience will be a plus
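
As referenced in item 3 above, a hedged skeleton of an AWS Glue PySpark job that reads a catalogued table and writes Parquet for Athena to query; the database, table and S3 paths are placeholders:

```python
# Hedged AWS Glue PySpark job skeleton; all names and paths are placeholders.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])

sc = SparkContext()
glue_context = GlueContext(sc)
spark = glue_context.spark_session
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog (placeholder database/table)
source = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="orders"
)

# Simple transformation on the underlying Spark DataFrame
orders = source.toDF().dropDuplicates(["order_id"])

# Write Parquet to S3; an Athena table over this prefix can then query it
orders.write.mode("overwrite").parquet("s3://example-bucket/curated/orders/")

job.commit()
```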

Read more
Remote only
5 - 8 yrs
₹10L - ₹25L / yr
DevOps
Kubernetes
Docker
SAS
Apache Hive
+2 more

Must Have skills :

Experience in Linux Administration

Experience in building, deploying, and monitoring distributed apps using container systems (Docker) and container orchestration (Kubernetes, EKS)

Ability to read and understand code (Java / Python / R / Scala)

Experience AWS and tools

 

Nice to have skills:

Experience in SAS Viya administration

Experience managing large Big Data clusters

Experience in Big Data tools like Hue, Hive, Spark, Jupyter, SAS and R-Studio

Read more
Encubate Tech Private Ltd
Mumbai
5 - 6 yrs
₹15L - ₹20L / yr
Amazon Web Services (AWS)
Amazon Redshift
Data modeling
ETL
Agile/Scrum
+7 more

Roles and Responsibilities

Seeking an AWS Cloud Engineer / Data Warehouse Developer for our Data CoE team to help us configure and develop new AWS environments for our Enterprise Data Lake and migrate the on-premise traditional workloads to the cloud. Must have a sound understanding of BI best practices, relational structures, dimensional data modelling, structured query language (SQL) skills, data warehouse and reporting techniques.

 Extensive experience in providing AWS Cloud solutions for various business use cases.

 Creating star schema data models, performing ETLs and validating results with business representatives (a small PySpark sketch of this kind of star-schema load follows below).

 Supporting implemented BI solutions by: monitoring and tuning queries and data loads, addressing user questions concerning data integrity, monitoring performance and communicating functional and technical issues.

Job Description:

This position is responsible for the successful delivery of business intelligence information to the entire organization and is experienced in BI development and implementations, data architecture and data warehousing.
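
The PySpark sketch below illustrates the kind of star-schema load and validation mentioned above; all table, column and path names are assumptions for illustration only:

```python
# Hedged sketch: assemble a star-schema fact table in PySpark and run a simple
# row-count validation. Tables, columns and paths are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("star-schema-sketch").getOrCreate()

orders = spark.read.parquet("s3://example-bucket/staging/orders/")       # placeholder
dim_customer = spark.read.parquet("s3://example-bucket/dims/customer/")  # placeholder
dim_date = spark.read.parquet("s3://example-bucket/dims/date/")          # placeholder

fact_orders = (
    orders
    .join(dim_customer, "customer_id", "left")
    .join(dim_date, "order_date", "left")
    .select("order_id", "customer_key", "date_key", "amount")
)

# Basic validation: no fact rows should be lost or duplicated by the dimension joins
assert fact_orders.count() == orders.count(), "row count changed during joins"

fact_orders.write.mode("overwrite").parquet("s3://example-bucket/marts/fact_orders/")
```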

Requisite Qualification

Essential:
- AWS Certified Database Specialty, or
- AWS Certified Data Analytics

Preferred:
Any other Data Engineer certification

Requisite Experience

Essential: 4-7 yrs of experience

Preferred: 2+ yrs of experience in ETL & data pipelines

Skills Required

Special Skills Required

 AWS: S3, DMS, Redshift, EC2, VPC, Lambda, Delta Lake, CloudWatch, etc.

 Big data: Databricks, Spark, Glue and Athena

 Expertise in Lake Formation, Python programming, Spark, Shell scripting

 Minimum Bachelor’s degree with 5+ years of experience in designing, building, and maintaining AWS data components

 3+ years of experience in data component configuration, related roles and access setup

 Expertise in Python programming

 Knowledge of all aspects of DevOps (source control, continuous integration, deployments, etc.)

 Comfortable working with DevOps tools: Jenkins, Bitbucket, CI/CD

 Hands-on ETL development experience, preferably using SSIS

 SQL Server experience required

 Strong analytical skills to solve and model complex business requirements

 Sound understanding of BI best practices/methodologies, relational structures, dimensional data modelling, structured query language (SQL) skills, data warehouse and reporting techniques

Preferred Skills

 Experience working in the SCRUM environment.

 Experience in administration (Windows/Unix/Network/Database/Hadoop) is a plus.

 Experience in SQL Server, SSIS, SSAS, SSRS

 Comfortable with creating data models and visualization using Power BI

 Hands-on experience in relational and multi-dimensional data modelling, including multiple source systems from databases and flat files, and the use of standard data modelling tools

 Ability to collaborate on a team with infrastructure, BI report development and business analyst resources, and clearly communicate solutions to both technical and non-technical team members

Read more
Hyderabad
5 - 15 yrs
₹4L - ₹14L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark
+4 more
Big Data Engineer:

- Expertise in building AWS data engineering pipelines with AWS Glue -> Athena -> QuickSight.

- Experience in developing Lambda functions with AWS Lambda.

- Expertise with Spark/PySpark – the candidate should be hands-on with PySpark code and able to do transformations with Spark.

- Should be able to code in Python and Scala.

- Snowflake experience will be a plus.
Read more
Thoughtworks

at Thoughtworks

1 video
27 recruiters
Vidyashree Kulkarni
Posted by Vidyashree Kulkarni
Remote only
9 - 15 yrs
Best in industry
PySpark
Data engineering
Big Data
Hadoop
Spark
+4 more
Data Engineers develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions. You might spend a few weeks with a new client on a deep technical review or a complete organizational review, helping them to understand the potential that data brings to solve their most pressing problems. On other projects, you might be acting as the architect, leading the design of technical solutions or perhaps overseeing a program inception to build a new product. It could also be a software delivery project where you're equally happy coding and tech-leading the team to implement the solution.

Job responsibilities
  • You will partner with teammates to create complex data processing pipelines in order to solve our clients' most complex challenges
  • You will collaborate with Data Scientists in order to design scalable implementations of their models
  • You will pair to write clean and iterative code based on TDD
  • Leverage various continuous delivery practices to deploy, support and operate data pipelines
  • Advise and educate clients on how to use different distributed storage and computing technologies from the plethora of options available
  • Develop and operate modern data architecture approaches to meet key business objectives and provide end-to-end data solutions
  • Create data models and speak to the tradeoffs of different modeling approaches
  • Seamlessly incorporate data quality into your day-to-day work as well as into the delivery process (see the data-quality sketch after this list)
  • Assure effective collaboration between Thoughtworks' and the client's teams, encouraging open communication and advocating for shared outcomes
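
As referenced in the data-quality bullet above, a small, hedged example of the kind of check that can run inside a pipeline; the path and column names are assumptions:

```python
# Hedged sketch: simple data-quality checks on key columns with PySpark.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks-sketch").getOrCreate()
df = spark.read.parquet("/data/curated/customers/")   # placeholder path

null_ids = df.filter(F.col("customer_id").isNull()).count()
duplicate_ids = df.count() - df.dropDuplicates(["customer_id"]).count()

# Fail the pipeline run if either check finds problems
if null_ids or duplicate_ids:
    raise ValueError(
        f"data quality check failed: {null_ids} null ids, {duplicate_ids} duplicates"
    )
```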
Job qualifications

Technical skills

  • You have a good understanding of data modelling and experience with data engineering tools and platforms such as Kafka, Spark, and Hadoop
  • You have built large-scale data pipelines and data-centric applications using any of the distributed storage platforms such as HDFS, S3, NoSQL databases (Hbase, Cassandra, etc.) and any of the distributed processing platforms like Hadoop, Spark, Hive, Oozie, and Airflow in a production setting
  • Hands on experience in MapR, Cloudera, Hortonworks and/or cloud (AWS EMR, Azure HDInsights, Qubole etc.) based Hadoop distributions
  • You are comfortable taking data-driven approaches and applying data security strategy to solve business problems
  • Working with data excites you: you can build and operate data pipelines, and maintain data storage, all within distributed systems
  • You're genuinely excited about data infrastructure and operations with a familiarity working in cloud environments
  • Professional skills
  • You're resilient and flexible in ambiguous situations and enjoy solving problems from technical and business perspectives
  • An interest in coaching, sharing your experience and knowledge with teammates
  • You enjoy influencing others and always advocate for technical excellence while being open to change when needed
  • Presence in the external tech community: you willingly share your expertise with others via speaking engagements, contributions to open source, blogs and more
Read more
Hyderabad
4 - 8 yrs
₹6L - ₹25L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+4 more
  1. Expertise in building AWS data engineering pipelines with AWS Glue -> Athena -> QuickSight
  2. Experience in developing Lambda functions with AWS Lambda
  3. Expertise with Spark/PySpark – the candidate should be hands-on with PySpark code and able to do transformations with Spark
  4. Should be able to code in Python and Scala.
  5. Snowflake experience will be a plus

 

Read more
Miracle Software Systems, Inc
Ratnakumari Modhalavalasa
Posted by Ratnakumari Modhalavalasa
Visakhapatnam
3 - 5 yrs
₹2L - ₹4L / yr
Hadoop
Apache Sqoop
Apache Hive
Apache Spark
Apache Pig
+9 more
Position: Data Engineer

Duration: Full Time

Location: Visakhapatnam, Bangalore, Chennai

Years of experience: 3+ years

Job Description :

- 3+ Years of working as a Data Engineer with thorough understanding of data frameworks that collect, manage, transform and store data that can derive business insights.

- Strong communications (written and verbal) along with being a good team player.

- 2+ years of experience within the Big Data ecosystem (Hadoop, Sqoop, Hive, Spark, Pig, etc.)

- 2+ years of strong experience with SQL and Python (Data Engineering focused).

- Experience with GCP Data Services such as BigQuery, Dataflow, Dataproc, etc. is an added advantage and preferred.

- Any prior experience in ETL tools such as DataStage, Informatica, DBT, Talend, etc. is an added advantage for the role.
Read more
Hyderabad
3 - 7 yrs
₹1L - ₹15L / yr
Big Data
Spark
Hadoop
PySpark
Amazon Web Services (AWS)
+3 more

Big Data Developer

Exp: 3 to 7 yrs.
Job Location: Hyderabad
Notice: Immediate / within 30 days

1. Expertise in building AWS data engineering pipelines with AWS Glue -> Athena -> QuickSight
2. Experience in developing Lambda functions with AWS Lambda
3. Expertise with Spark/PySpark – the candidate should be hands-on with PySpark code and able to do transformations with Spark
4. Should be able to code in Python and Scala.
5. Snowflake experience will be a plus

Hadoop and Hive can be kept as good-to-have requirements – a working understanding is enough rather than a hard requirement.

Read more
Getinz

at Getinz

11 recruiters
kousalya k
Posted by kousalya k
Remote only
4 - 8 yrs
₹10L - ₹15L / yr
Penetration testing
Python
Powershell
Bash
Spark
+5 more
- 3+ years of Red Team experience
- 5+ years of hands-on experience with penetration testing would be an added plus
- Strong knowledge of programming or scripting languages, such as Python, PowerShell, Bash
- Industry certifications like OSCP and AWS are highly desired for this role
- Well-rounded knowledge of security tools, software and processes
Read more
HCL Technologies

at HCL Technologies

3 recruiters
Agency job
via Saiva System by Sunny Kumar
Delhi, Gurugram, Noida, Ghaziabad, Faridabad, Bengaluru (Bangalore), Hyderabad, Chennai, Pune, Mumbai, Kolkata
5 - 10 yrs
₹5L - ₹20L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more
Exp - 5+ years
Skills - Spark and Scala along with Azure
Location - Pan India

Looking for someone with Big Data experience along with Azure
Read more
Astegic

at Astegic

3 recruiters
Nikita Pasricha
Posted by Nikita Pasricha
Remote only
5 - 7 yrs
₹8L - ₹15L / yr
Data engineering
SQL
Relational Database (RDBMS)
Big Data
Scala
+14 more

WHAT YOU WILL DO:

● Create and maintain optimal data pipeline architecture.

● Assemble large, complex data sets that meet functional / non-functional business requirements.

● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.

● Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Spark, Hadoop and AWS 'big data' technologies (EC2, EMR, S3, Athena); a short Athena sketch follows this list.

● Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.

● Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.

● Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.

● Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.

● Work with data and analytics experts to strive for greater functionality in our data systems.
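
The short sketch below, referenced in the Athena bullet above, shows how a query might be kicked off from Python with boto3; the database, query and output location are placeholders:

```python
# Hedged sketch: start an Athena query from Python with boto3.
import boto3

athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString="SELECT event_type, count(*) FROM events GROUP BY event_type",
    QueryExecutionContext={"Database": "analytics_db"},                  # placeholder
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
)

# The execution id can be polled for status and used to fetch results later
print("query execution id:", response["QueryExecutionId"])
```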

REQUIRED SKILLS & QUALIFICATIONS:

● 5+ years of experience in a Data Engineer role.

● Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL), as well as working familiarity with a variety of databases.

● Experience building and optimizing 'big data' data pipelines, architectures and data sets.

● Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.

● Strong analytic skills related to working with unstructured datasets.

● Build processes supporting data transformation, data structures, metadata, dependency and workload management.

● A successful history of manipulating, processing and extracting value from large disconnected datasets.

● Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores.

● Strong project management and organizational skills.

● Experience supporting and working with cross-functional teams in a dynamic environment.

● Experience with big data tools: Hadoop, Spark, Pig, Vertica, etc.

● Experience with AWS cloud services: EC2, EMR, S3, Athena.

● Experience with Linux.

● Experience with object-oriented/object function scripting languages: Python, Java, Shell, Scala, etc.

PREFERRED SKILLS & QUALIFICATIONS:

● Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.

Read more
Agiletech Info Solutions pvt ltd
Chennai
4 - 8 yrs
₹4L - ₹15L / yr
ETL
Informatica
Data Warehouse (DWH)
Spark
SQL
+1 more
We are looking for a Data Engineer to join our growing team of analytics experts. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up.

The Data Engineer will support our software developers, database architects, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products.

Responsibilities for Data Engineer
• Create and maintain optimal data pipeline architecture.
• Assemble large, complex data sets that meet functional / non-functional business requirements.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
• Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS big data technologies.
• Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
• Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
• Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
• Work with data and analytics experts to strive for greater functionality in our data systems.

Qualifications for Data Engineer
• Experience building and optimizing big data ETL pipelines, architectures and data sets.
• Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL), as well as working familiarity with a variety of databases.
• Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
• Strong analytic skills related to working with unstructured datasets.
• Build processes supporting data transformation, data structures, metadata, dependency and workload management.
• A successful history of manipulating, processing and extracting value from large disconnected datasets.
Read more