Cutshort logo
Data Engineer
Information Solution Provider Company's logo

Data Engineer

Agency job
2 - 7 yrs
₹10L - ₹15L / yr
Delhi, Gurugram, Noida, Ghaziabad, Faridabad
Skills
Spark
skill iconScala
Hadoop
PySpark
Data engineering
Big Data
skill iconMachine Learning (ML)

Responsibilities:

 

  • Designing and implementing fine-tuned production ready data/ML pipelines in Hadoop platform.
  • Driving optimization, testing and tooling to improve quality.
  • Reviewing and approving high level & amp; detailed design to ensure that the solution delivers to the business needs and aligns to the data & analytics architecture principles and roadmap.
  • Understanding business requirements and solution design to develop and implement solutions that adhere to big data architectural guidelines and address business requirements.
  • Following proper SDLC (Code review, sprint process).
  • Identifying, designing, and implementing internal process improvements: automating manual processes, optimizing data delivery, etc.
  • Building robust and scalable data infrastructure (both batch processing and real-time) to support needs from internal and external users.
  • Understanding various data security standards and using secure data security tools to apply and adhere to the required data controls for user access in the Hadoop platform.
  • Supporting and contributing to development guidelines and standards for data ingestion.
  • Working with a data scientist and business analytics team to assist in data ingestion and data related technical issues.
  • Designing and documenting the development & deployment flow.

 

Requirements:

 

  • Experience in developing rest API services using one of the Scala frameworks.
  • Ability to troubleshoot and optimize complex queries on the Spark platform
  • Expert in building and optimizing ‘big data’ data/ML pipelines, architectures and data sets.
  • Knowledge in modelling unstructured to structured data design.
  • Experience in Big Data access and storage techniques.
  • Experience in doing cost estimation based on the design and development.
  • Excellent debugging skills for the technical stack mentioned above which even includes analyzing server logs and application logs.
  • Highly organized, self-motivated, proactive, and ability to propose best design solutions.
  • Good time management and multitasking skills to work to deadlines by working independently and as a part of a team.

 

Read more
Users love Cutshort
Read about what our users have to say about finding their next opportunity on Cutshort.
Subodh Popalwar's profile image

Subodh Popalwar

Software Engineer, Memorres
For 2 years, I had trouble finding a company with good work culture and a role that will help me grow in my career. Soon after I started using Cutshort, I had access to information about the work culture, compensation and what each company was clearly offering.
Companies hiring on Cutshort
companies logos

About Information Solution Provider Company

Founded
Type
Size
Stage
About
N/A
Company social profiles
N/A

Similar jobs

Databook
at Databook
5 candid answers
1 video
Nikhil Mohite
Posted by Nikhil Mohite
Mumbai
1 - 3 yrs
Upto ₹20L / yr (Varies
)
Data engineering
skill iconPython
Apache Kafka
Spark
skill iconAmazon Web Services (AWS)
+1 more

Lightning Job By Cutshort ⚡

 

As part of this feature, you can expect status updates about your application and replies within 72 hours (once the screening questions are answered)

 

 

About Databook:-

- Great salespeople let their customers’ strategies do the talking.

 

Databook’s award-winning Strategic Relationship Management (SRM) platform uses advanced AI and NLP to empower the world’s largest B2B sales teams to create, manage, and maintain strategic relationships at scale. The platform ingests and interprets billions of financial and market data signals to generate actionable sales strategies that connect the seller’s solutions to a buyer’s financial pain and urgency.

 

The Opportunity

We're seeking Junior Engineers to support and develop Databook’s capabilities. Working closely with our seasoned engineers, you'll contribute to crafting new features and ensuring our platform's reliability. If you're eager about playing a part in building the future of customer intelligence, with a keen eye towards quality, we'd love to meet you!

 

Specifically, you'll

- Participate in various stages of the engineering lifecycle alongside our experienced engineers.

- Assist in maintaining and enhancing features of the Databook platform.

- Collaborate with various teams to comprehend requirements and aid in implementing technology solutions.

 

Please note: As you progress and grow with us, you might be introduced to on-call rotations to handle any platform challenges.

 

Working Arrangements:

- This position offers a hybrid work mode, allowing employees to work both remotely and in-office as mutually agreed upon.

 

What we're looking for

- 1-2+ years experience as a Data Engineer

- Bachelor's degree in Engineering

- Willingness to work across different time zones

- Ability to work independently

- Knowledge of cloud (AWS or Azure)

- Exposure to distributed systems such as Spark, Flink or Kafka

- Fundamental knowledge of data modeling and optimizations

- Minimum of one year of experience using Python working as a Software Engineer

- Knowledge of SQL (Postgres) databases would be beneficial

- Experience with building analytics dashboard

- Familiarity with RESTful APIs and/or GraphQL is welcomed

- Hand-on experience with Numpy, Pandas, SpaCY would be a plus

- Exposure or working experience on GenAI (LLMs in general), LLMOps would be a plus

- Highly fluent in both spoken and written English language

 

Ideal candidates will also have:

- Self-motivated with great organizational skills.

- Ability to focus on small and subtle details.

- Are willing to learn and adapt in a rapidly changing environment.

- Excellent written and oral communication skills.

 

Join us and enjoy these perks!

- Competitive salary with bonus

- Medical insurance coverage

- 5 weeks leave plus public holidays

- Employee referral bonus program

- Annual learning stipend to spend on books, courses or other training materials that help you develop skills relevant to your role or professional development

- Complimentary subscription to Masterclass

Read more
dataeaze systems
at dataeaze systems
1 recruiter
Ankita Kale
Posted by Ankita Kale
Pune
1 - 5 yrs
₹3L - ₹10L / yr
ETL
Hadoop
Apache Hive
skill iconJava
Spark
+2 more
  • Core Java: advanced level competency, should have worked on projects with core Java development.

 

  • Linux shell : advanced level competency, work experience with Linux shell scripting, knowledge and experience to use important shell commands

 

  • Rdbms, SQL: advanced level competency, Should have expertise in SQL query language syntax, should be well versed with aggregations, joins of SQL query language.

 

  • Data structures and problem solving: should have ability to use appropriate data structure.

 

  • AWS cloud : Good to have experience with aws serverless toolset along with aws infra

 

  • Data Engineering ecosystem : Good to have experience and knowledge of data engineering, ETL, data warehouse (any toolset)

 

  • Hadoop, HDFS, YARN : Should have introduction to internal working of these toolsets

 

  • HIVE, MapReduce, Spark: Good to have experience developing transformations using hive queries, MapReduce job implementation and Spark Job Implementation. Spark implementation in Scala will be plus point.

 

  • Airflow, Oozie, Sqoop, Zookeeper, Kafka: Good to have knowledge about purpose and working of these technology toolsets. Working experience will be a plus point here.

 

Read more
xpressbees
Alfiya Khan
Posted by Alfiya Khan
Pune, Bengaluru (Bangalore)
6 - 8 yrs
₹15L - ₹25L / yr
Big Data
Data Warehouse (DWH)
Data modeling
Apache Spark
Data integration
+10 more
Company Profile
XpressBees – a logistics company started in 2015 – is amongst the fastest growing
companies of its sector. While we started off rather humbly in the space of
ecommerce B2C logistics, the last 5 years have seen us steadily progress towards
expanding our presence. Our vision to evolve into a strong full-service logistics
organization reflects itself in our new lines of business like 3PL, B2B Xpress and cross
border operations. Our strong domain expertise and constant focus on meaningful
innovation have helped us rapidly evolve as the most trusted logistics partner of
India. We have progressively carved our way towards best-in-class technology
platforms, an extensive network reach, and a seamless last mile management
system. While on this aggressive growth path, we seek to become the one-stop-shop
for end-to-end logistics solutions. Our big focus areas for the very near future
include strengthening our presence as service providers of choice and leveraging the
power of technology to improve efficiencies for our clients.

Job Profile
As a Lead Data Engineer in the Data Platform Team at XpressBees, you will build the data platform
and infrastructure to support high quality and agile decision-making in our supply chain and logistics
workflows.
You will define the way we collect and operationalize data (structured / unstructured), and
build production pipelines for our machine learning models, and (RT, NRT, Batch) reporting &
dashboarding requirements. As a Senior Data Engineer in the XB Data Platform Team, you will use
your experience with modern cloud and data frameworks to build products (with storage and serving
systems)
that drive optimisation and resilience in the supply chain via data visibility, intelligent decision making,
insights, anomaly detection and prediction.

What You Will Do
• Design and develop data platform and data pipelines for reporting, dashboarding and
machine learning models. These pipelines would productionize machine learning models
and integrate with agent review tools.
• Meet the data completeness, correction and freshness requirements.
• Evaluate and identify the data store and data streaming technology choices.
• Lead the design of the logical model and implement the physical model to support
business needs. Come up with logical and physical database design across platforms (MPP,
MR, Hive/PIG) which are optimal physical designs for different use cases (structured/semi
structured). Envision & implement the optimal data modelling, physical design,
performance optimization technique/approach required for the problem.
• Support your colleagues by reviewing code and designs.
• Diagnose and solve issues in our existing data pipelines and envision and build their
successors.

Qualifications & Experience relevant for the role

• A bachelor's degree in Computer Science or related field with 6 to 9 years of technology
experience.
• Knowledge of Relational and NoSQL data stores, stream processing and micro-batching to
make technology & design choices.
• Strong experience in System Integration, Application Development, ETL, Data-Platform
projects. Talented across technologies used in the enterprise space.
• Software development experience using:
• Expertise in relational and dimensional modelling
• Exposure across all the SDLC process
• Experience in cloud architecture (AWS)
• Proven track record in keeping existing technical skills and developing new ones, so that
you can make strong contributions to deep architecture discussions around systems and
applications in the cloud ( AWS).

• Characteristics of a forward thinker and self-starter that flourishes with new challenges
and adapts quickly to learning new knowledge
• Ability to work with a cross functional teams of consulting professionals across multiple
projects.
• Knack for helping an organization to understand application architectures and integration
approaches, to architect advanced cloud-based solutions, and to help launch the build-out
of those systems
• Passion for educating, training, designing, and building end-to-end systems.
Read more
Nascentvision
at Nascentvision
1 recruiter
Shanu Mohan
Posted by Shanu Mohan
Gurugram, Mumbai, Bengaluru (Bangalore)
2 - 4 yrs
₹10L - ₹17L / yr
skill iconPython
PySpark
skill iconAmazon Web Services (AWS)
Spark
skill iconScala
+2 more
  • Hands-on experience in any Cloud Platform
· Versed in Spark, Scala/python, SQL
  • Microsoft Azure Experience
· Experience working on Real Time Data Processing Pipeline
Read more
Networking & Cybersecurity Solutions
Bengaluru (Bangalore)
4 - 16 yrs
₹40L - ₹60L / yr
Spark
Apache Kafka
Data Engineer
Hadoop
Big Data
+3 more
  • Developing telemetry software to connect Junos devices to the cloud
  • Fast prototyping and laying the SW foundation for product solutions
  • Moving prototype solutions to a production cloud multitenant SaaS solution
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources
  • Build analytics tools that utilize the data pipeline to provide significant insights into customer acquisition, operational efficiency and other key business performance metrics.
  • Work with partners including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics specialists to strive for greater functionality in our data systems.

Qualification and Desired Experiences

  • Master in Computer Science, Electrical Engineering, Statistics, Applied Math or equivalent fields with strong mathematical background
  • 5+ years experiences building data pipelines for data science-driven solutions
  • Strong hands-on coding skills (preferably in Python) processing large-scale data set and developing machine learning model
  • Familiar with one or more machine learning or statistical modeling tools such as Numpy, ScikitLearn, MLlib, Tensorflow
  • Good team worker with excellent interpersonal skills written, verbal and presentation
  • Create and maintain optimal data pipeline architecture,
  • Assemble large, sophisticated data sets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Experience with AWS, S3, Flink, Spark, Kafka, Elastic Search
  • Previous work in a start-up environment
  • 3+ years experiences building data pipelines for data science-driven solutions
  • Master in Computer Science, Electrical Engineering, Statistics, Applied Math or equivalent fields with strong mathematical background
  • We are looking for a candidate with 9+ years of experience in a Data Engineer role, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:
  • Experience with big data tools: Hadoop, Spark, Kafka, etc.
  • Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
  • Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
  • Experience with AWS cloud services: EC2, EMR, RDS, Redshift
  • Experience with stream-processing systems: Storm, Spark-Streaming, etc.
  • Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.
  • Strong hands-on coding skills (preferably in Python) processing large-scale data set and developing machine learning model
  • Familiar with one or more machine learning or statistical modeling tools such as Numpy, ScikitLearn, MLlib, Tensorflow
  • Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
  • Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and find opportunities for improvement.
  • Strong analytic skills related to working with unstructured datasets.
  • Build processes supporting data transformation, data structures, metadata, dependency and workload management.
  • A successful history of manipulating, processing and extracting value from large disconnected datasets.
  • Proven understanding of message queuing, stream processing, and highly scalable ‘big data’ data stores.
  • Strong project management and interpersonal skills.
  • Experience supporting and working with multi-functional teams in a multidimensional environment.
Read more
Bengaluru (Bangalore)
2 - 5 yrs
₹30L - ₹40L / yr
Data Scientist
skill iconData Science
skill iconMachine Learning (ML)
  • You'd have to set up your own shop, work with design customers to find generalizable use cases, and build them out.
  • Ability to collaborate with cross-functional teams to build and ship new features
  • At least 2-5 years of experience
  • Predictive Analytics – Machine Learning Algorithms, Logistics & Linear Regression, Decision Tree, Clustering. 
  • Exploratory Data Analysis – Data Preparation, Data Exploration, and Data Visualization. 
  • Analytics Tools – R, Python, SQL, Power BI, MS Excel.
Read more
Fragma Data Systems
at Fragma Data Systems
8 recruiters
Priyanka U
Posted by Priyanka U
Remote only
4 - 10 yrs
₹12L - ₹23L / yr
Informatica
ETL
Big Data
Spark
SQL
Skill:- informatica with big data management
 
1.Minimum 6 to 8 years of experience in informatica BDM development
2. Experience working on Spark/SQL
3. Develops informtica mapping/Sql 
4. Should have experience in Hadoop, spark etc

Work days- Sun-Thu
Day shift
 
 
 
Read more
A Fintech startup
Agency job
via Success Pact by Priya Sariyal
Remote, Bengaluru (Bangalore)
3 - 15 yrs
₹16L - ₹22L / yr
skill iconData Science
XGBoost
Retail banking
Random boosting
Gradient boosting
+2 more
  • 3-5yrs of practical DS experience working with varied data sets. Working with retail banking is preferred but not necessary.
  • Need to be strong in concepts of statistical modelling – particularly looking for practical knowledge learnt from work experience (should be able to give "rule of thumb" answers)
  • Strong problem solving skills and the ability to articulate really well.
  • Ideally, the data scientist should have interfaced with data engineering and model deployment teams to bring models / solutions to "live" in production.
  • Strong working knowledge of python ML stack is very important here.
  • Willing to work on diverse range of tasks in building ML related capability on the Corridor Platform as well as client work.
  • Someone with strong interest in data engineering aspect of ML is highly preferred, i.e. can play dual role of Data Scientist as well as someone who can code a module on our Corridor Platform writing robust code.

Structured ML techniques for candidates:

 

  1. GBM
  2. XgBoost
  3. Random Forest
  4. Neural Net
  5. Logistic Regression
Read more
Data Team
Agency job
via Oceanworld by Chandan J
Remote only
8 - 12 yrs
₹10L - ₹20L / yr
Big Data
Data engineering
Hadoop
data engineer
Apache Hive
+1 more
Senior Data Engineer (SDE)

(Hadoop, HDFS, Kafka, Spark, Hive)

Overall Experience - 8 to 12 years

Relevant exp on Big data - 3+ years in above

Salary: Max up-to 20LPA 

Job location - Chennai / Bangalore / 

Notice Period - Immediate joiner / 15-to-20-day Max 

The Responsibilities of The Senior Data Engineer Are:

- Requirements gathering and assessment

- Breakdown complexity and translate requirements to specification artifacts and story boards to build towards, using a test-driven approach

- Engineer scalable data pipelines using big data technologies including but not limited to Hadoop, HDFS, Kafka, HBase, Elastic

- Implement the pipelines using execution frameworks including but not limited to MapReduce, Spark, Hive, using Java/Scala/Python for application design.

- Mentoring juniors in a dynamic team setting

- Manage stakeholders with proactive communication upholding TheDataTeam's brand and values

A Candidate Must Have the Following Skills:

- Strong problem-solving ability

- Excellent software design and implementation ability

- Exposure and commitment to agile methodologies

- Detail oriented with willingness to proactively own software tasks as well as management tasks, and see them to completion with minimal guidance

- Minimum 8 years of experience

- Should have experience in full life-cycle of one big data application

- Strong understanding of various storage formats (ORC/Parquet/Avro)

- Should have hands on experience in one of the Hadoop distributions (Hortoworks/Cloudera/MapR)

- Experience in at least one cloud environment (GCP/AWS/Azure)

- Should be well versed with at least one database (MySQL/Oracle/MongoDB/Postgres)

- Bachelor's in Computer Science, and preferably, a Masters as well - Should have good code review and debugging skills

Additional skills (Good to have):

- Experience in Containerization (docker/Heroku)

- Exposure to microservices

- Exposure to DevOps practices - Experience in Performance tuning of big data applications
Read more
Jiva adventures
at Jiva adventures
2 recruiters
Bharat Chinhara
Posted by Bharat Chinhara
Bengaluru (Bangalore)
1 - 4 yrs
₹5L - ₹15L / yr
skill iconData Science
skill iconPython
skill iconMachine Learning (ML)
Natural Language Processing (NLP)
Should be experienced in building Machine learning pipelines.   Should be proficient in Python and scientific packages like pandas, numpy, scikit, matplotlib, etc. Experience with techniques such as Data mining, Distributed Computing, Applied Mathematics and Algorthims, Probablity & statistics, Strong problem solving and conceptual thinking abilities Hands on experience in Model building Building highly customized and optimized data pipelines integrating third party API’s and inhouse data sources. Extracting features from text data using tools like Scapy Deep learning for NLP using any modern framework
Read more
Why apply to jobs via Cutshort
people_solving_puzzle
Personalized job matches
Stop wasting time. Get matched with jobs that meet your skills, aspirations and preferences.
people_verifying_people
Verified hiring teams
See actual hiring teams, find common social connections or connect with them directly. No 3rd party agencies here.
ai_chip
Move faster with AI
We use AI to get you faster responses, recommendations and unmatched user experience.
21,01,133
Matches delivered
37,12,187
Network size
15,000
Companies hiring
Did not find a job you were looking for?
icon
Search for relevant jobs from 10000+ companies such as Google, Amazon & Uber actively hiring on Cutshort.
companies logo
companies logo
companies logo
companies logo
companies logo
Get to hear about interesting companies hiring right now
Company logo
Company logo
Company logo
Company logo
Company logo
Linkedin iconFollow Cutshort
Users love Cutshort
Read about what our users have to say about finding their next opportunity on Cutshort.
Subodh Popalwar's profile image

Subodh Popalwar

Software Engineer, Memorres
For 2 years, I had trouble finding a company with good work culture and a role that will help me grow in my career. Soon after I started using Cutshort, I had access to information about the work culture, compensation and what each company was clearly offering.
Companies hiring on Cutshort
companies logos