Data Engineer

at Mobile Programming LLC

Posted by Apurva kalsotra
Mohali, Gurugram, Pune, Bengaluru (Bangalore), Hyderabad, Chennai
3 - 8 yrs
₹2L - ₹9L / yr (ESOP available)
Full time
Skills
Data engineering
Data engineer
Spark
Apache Spark
Apache Kafka
Python
SQL
Linux/Unix
Shell Scripting
DevOps
CI/CD
Docker
Kubernetes
Javascript
Scala
Data integration
Spark Kafka
Pentaho

Responsibilities for Data Engineer

  • Create and maintain optimal data pipeline architecture.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
  • Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
  • Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics experts to strive for greater functionality in our data systems.

Qualifications for Data Engineer

  • Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
  • Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Strong analytic skills related to working with unstructured datasets.
  • Experience building processes that support data transformation, data structures, metadata, dependency and workload management.
  • A successful history of manipulating, processing and extracting value from large disconnected datasets.
  • Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
  • Strong project management and organizational skills.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • We are looking for a candidate with 5+ years of experience in a Data Engineer role, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:

  • Experience with big data tools: Hadoop, Spark, Kafka, etc.
  • Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
  • Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
  • Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
  • Experience with stream-processing systems: Storm, Spark-Streaming, etc.
  • Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.

About Mobile Programming LLC

Founded 1998  •  Services  •  100-1000 employees  •  Profitable

Similar jobs

Senior Data Engineer

at ERUDITUS Executive Education

Founded 2010  •  Services  •  1000-5000 employees  •  Raised funding
ETL
Informatica
Data Warehouse (DWH)
Data Science
Data engineering
ETL management
Docker
Kubernetes
Apache Beam
Apache Kafka
Machine Learning (ML)
Virtualization
Remote only
4 - 9 yrs
₹30L - ₹40L / yr

Emeritus is committed to teaching the skills of the future by making high-quality education accessible and affordable to individuals, companies, and governments around the world. It does this by collaborating with more than 50 top-tier universities across the United States, Europe, Latin America, Southeast Asia, India and China. Emeritus’ short courses, degree programs, professional certificates, and senior executive programs help individuals learn new skills and transform their lives, companies and organizations. Its unique model of state-of-the-art technology, curriculum innovation, and hands-on instruction from senior faculty, mentors and coaches has educated more than 250,000 individuals across 80+ countries. Founded in 2015, Emeritus, part of Eruditus Group, has more than 2,000 employees globally and offices in Mumbai, New Delhi, Shanghai, Singapore, Palo Alto, Mexico City, New York, Boston, London, and Dubai. Following its $650 million Series E funding round in August 2021, the Company is valued at $3.2 billion, and is backed by Accel, SoftBank Vision Fund 2, the Chan Zuckerberg Initiative, Leeds Illuminate, Prosus Ventures, Sequoia Capital India, and Bertelsmann.

 

 

Key responsibilities of the role:

 

As a data engineer at Emeritus, you'll be working on a wide variety of data problems. At this fast-paced company, you will frequently have to balance achieving an immediate goal with building sustainable and scalable architecture. The ideal candidate gets excited about streaming data, protocol buffers and microservices. They want to develop and maintain a centralized data platform that provides accurate, comprehensive, and timely data to a growing organization.

 

Role & responsibilities

  • Develop ETL pipelines for data replication
  • Analyze, query and manipulate data according to defined business rules and procedures
  • Manage very large-scale data from a multitude of sources into appropriate sets for research and development for data science and analysts across the company
  • Convert prototypes into production data engineering solutions through rigorous software engineering practices and modern deployment pipelines
  • Resolve internal and external data exceptions in a timely and accurate manner
  • Improve multi-environment data flow quality, security, and performance

 

Skills & qualifications:

  • Must have experience with:
  1. Virtualization, containers, and orchestration (Docker, Kubernetes)
  2. Creating log ingestion pipelines (Apache Beam) for both batch and streaming processing (Pub/Sub, Kafka)
  3. Workflow orchestration tools (Argo, Airflow)
  4. Supporting machine learning models in production
  • Have a desire to continually keep up with advancements in data engineering practices
  • Strong Python programming and exploratory data analysis skills
  • Ability to work independently and with team members from different backgrounds
  • At least a bachelor's degree in an analytical or technical field. This could be applied mathematics, statistics, computer science, operations research, economics, etc. Higher education is welcome and encouraged.
  • 3+ years of work in software/data engineering.
  • Superior interpersonal skills, independent judgment, and complex problem-solving skills
  • Global orientation, experience working across countries, regions and time zones

 

 

 

Emeritus provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

 

Job posted by
Shikha G

Senior Machine Learning Engineer | Permanent WFH

at Delivery Solutions

Founded 2015  •  Product  •  100-500 employees  •  Profitable
Python
NumPy
pandas
SQL
Docker
Remote only
3 - 6 yrs
₹7L - ₹19L / yr
Title: Senior Software Engineer - AI/ML

  • Minimum 3 years of technical experience in AI/ML (you can include internships & freelance work towards this)
  • Excellent proficiency in Python (Numpy, Pandas)
  • Experience working with SQL/NoSQL databases
  • Experience working with AWS, Docker
  • Should have worked with a large set of data
  • Should be familiar with ML model building and deployment on AWS.
  • Good communication skills and very good problem-solving skills 

Perks & Benefits @Delivery Solutions: 

  • Permanent Remote work - (Work from anywhere)
  • Broadband reimbursement
  • Flexi work hours - (Login/Logout flexibility)
  • 21 Paid leaves in a year (Jan to Dec) and 7 COVID leaves
  • Two appraisal cycles in a year
  • Encashment of unused leaves on Gross
  • RNR - Amazon Gift Voucher
  • Employee Referral Bonus
  • Technical & Soft skills training
  • Sodexo meal card
  • Surprise gifts for birthdays, service anniversaries, new babies and weddings
  • Annual trip 
Job posted by
Ayyappan Paramasivam

Data Engineer & Sr Data Engineer

at Fragma Data Systems

Founded 2015  •  Products & Services  •  employees  •  Profitable
PySpark
Data engineering
Big Data
Hadoop
Spark
Python
Bengaluru (Bangalore)
2 - 10 yrs
₹5L - ₹15L / yr
Job Description:

Must Have Skills:
• Good experience in PySpark, including DataFrame core functions and Spark SQL.
• Good experience in SQL DBs; able to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations.
• Good at ELT architecture: business rules processing and data extraction from the data lake into data streams for business consumption.
• Good customer communication.
• Good analytical skills.
Job posted by
Vamsikrishna G

AI Engineer

at StatusNeo

Founded 2020  •  Products & Services  •  100-1000 employees  •  Profitable
Artificial Intelligence (AI)
Amazon Web Services (AWS)
Windows Azure
Hadoop
Scala
Python
Google Cloud Platform (GCP)
postgres
Gurugram, Hyderabad, Bengaluru (Bangalore)
1 - 3 yrs
₹3L - ₹12L / yr


  • Build data products and processes alongside the core engineering and technology team.
  • Collaborate with senior data scientists to curate, wrangle, and prepare data for use in their advanced analytical models.
  • Integrate data from a variety of sources, ensuring that they adhere to data quality and accessibility standards.
  • Modify and improve data engineering processes to handle ever larger, more complex, and more types of data sources and pipelines.
  • Use Hadoop architecture and HDFS commands to design and optimize data queries at scale.
  • Evaluate and experiment with novel data engineering tools, and advise information technology leads and partners about new capabilities to determine optimal solutions for particular technical problems or designated use cases.
Job posted by
Alex P

Senior Software Engineer

at Episource LLC

Founded 2008  •  Product  •  500-1000 employees  •  Profitable
Big Data
Python
Amazon Web Services (AWS)
Serverless
DevOps
Cloud Computing
Infrastructure
Solution architecture
CI/CD
Mumbai
5 - 12 yrs
₹18L - ₹30L / yr

ABOUT EPISOURCE:


Episource has devoted more than a decade to building solutions for risk adjustment to measure healthcare outcomes. As one of the leading companies in healthcare, we have helped numerous clients optimize their medical records, data, and analytics to enable better documentation of care for patients with chronic diseases.


The backbone of our consistent success has been our obsession with data and technology. At Episource, all of our strategic initiatives start with the question - how can data be “deployed”? Our analytics platforms and datalakes ingest huge quantities of data daily, to help our clients deliver services. We have also built our own machine learning and NLP platform to infuse added productivity and efficiency into our workflow. Combined, these build a foundation of tools and practices used by quantitative staff across the company.


What’s our poison, you ask? We work with most of the popular frameworks and technologies like Spark, Airflow, Ansible, Terraform, Docker, and ELK. For machine learning and NLP, we are big fans of Keras, spaCy, scikit-learn, pandas, and NumPy. AWS and serverless platforms help us stitch these together to stay ahead of the curve.


ABOUT THE ROLE:


We’re looking to hire someone to help scale Machine Learning and NLP efforts at Episource. You’ll work with the team that develops the models powering Episource’s product focused on NLP driven medical coding. Some of the problems include improving our ICD code recommendations, clinical named entity recognition, improving patient health, clinical suspecting and information extraction from clinical notes.


This is a role for highly technical data engineers who combine outstanding oral and written communication skills with the ability to code up prototypes and productionize them using a large range of tools, algorithms, and languages. Most importantly, they need to be able to autonomously plan and organize their work assignments based on high-level team goals.


You will be responsible for setting an agenda to develop and ship data-driven architectures that positively impact the business, working with partners across the company including operations and engineering. You will use research results to shape strategy for the company and help build a foundation of tools and practices used by quantitative staff across the company.


During the course of a typical day with our team, expect to work on one or more projects around the following:

1. Create and maintain optimal data pipeline architectures for ML

2. Develop a strong API ecosystem for ML pipelines

3. Build CI/CD pipelines for ML deployments using GitHub Actions, Travis, Terraform and Ansible

4. Design and develop distributed, high-volume, high-velocity multi-threaded event processing systems

5. Apply software engineering best practices across the development lifecycle: coding standards, code reviews, source management, build processes, testing, and operations

6. Deploy data pipelines in production using Infrastructure-as-Code platforms

7. Design scalable implementations of the models developed by our Data Science teams

8. Work on big data and distributed ML with PySpark on AWS EMR, and more!



BASIC REQUIREMENTS 


  1.  Bachelor’s degree or greater in Computer Science, IT or related fields

  2.  Minimum of 5 years of experience in cloud, DevOps, MLOps & data projects

  3. Strong experience with Bash scripting, Unix environments and building scalable/distributed systems

  4. Experience with automation/configuration management using Ansible, Terraform, or equivalent

  5. Very strong experience with AWS and Python

  6. Experience building CI/CD systems

  7. Experience with containerization technologies like Docker, Kubernetes, ECS, EKS or equivalent

  8. Ability to build and manage application and performance monitoring processes

Job posted by
Ahamed Riaz

Data Engineer

at Srijan Technologies

Founded 2002  •  Products & Services  •  100-1000 employees  •  Profitable
Big Data
Apache Kafka
Hadoop
Spark
Data engineering
Python
Scala
Kafka
Remote only
2 - 5 yrs
₹5L - ₹15L / yr
Job Description:
We are looking for a Data Engineer whose responsibilities include creating machine learning models and retraining systems. To do this job successfully, you need exceptional skills in statistics and programming. If you also have knowledge of data science and software engineering, your ultimate goal will be to shape and build efficient self-learning applications.


Technical Knowledge (Must Have)

  • Strong experience in SQL / HiveQL / AWS Athena.
  • Strong expertise in the development of data pipelines (SnapLogic is preferred).
  • Design, development, deployment and administration of data processing applications.
  • Good exposure to AWS and Azure cloud computing environments.
  • Knowledge of big data, AWS cloud architecture, best practices, security, governance, metadata management, data quality, etc.
  • Data extraction from various firm sources (RDBMS, unstructured data sources) and loading into the data lake following best practices.
  • Knowledge of Python.
  • Good knowledge of NoSQL technologies (Neo4j / MongoDB).
  • Experience/knowledge in SnapLogic (ETL technologies).
  • Working knowledge of Unix (AIX, Linux) and shell scripting.
  • Experience/knowledge in data modeling and database development.
  • Experience/knowledge in creating reports and dashboards in Tableau / Power BI.
Job posted by
Srijan Technologies

Machine Learning Engineer

at Centime

Agency job
via FlexAbility
Machine Learning (ML)
Artificial Intelligence (AI)
Deep Learning
Java
Python
Hyderabad
8 - 14 yrs
₹15L - ₹35L / yr

Required skills

  • Around 6 - 8.5 years of experience and around 4+ years in the AI / machine learning space
  • Extensive experience in designing large-scale machine learning solutions for ML use cases, large-scale deployments, and establishing a continuous automated improvement / retraining framework.
  • Strong experience in Python and Java is required.
  • Hands-on experience with scikit-learn, pandas, NLTK
  • Experience in handling time-series data and associated techniques like Prophet and LSTM
  • Experience in regression, clustering and classification algorithms
  • Extensive experience in building traditional machine learning models (SVM, XGBoost, decision trees) and deep neural network models (RNN, feedforward) is required.
  • Experience with AutoML tools like TPOT or others
  • Must have strong hands-on experience in deep learning frameworks like Keras, TensorFlow or PyTorch
  • Knowledge of capsule networks, reinforcement learning or SageMaker is a desirable skill
  • Understanding of the financial domain is a desirable skill

Responsibilities

  • Design and implement solutions for ML use cases
  • Productionize systems and maintain them
  • Lead and implement the data acquisition process for ML work
  • Learn new methods and models quickly and use them to solve use cases
Job posted by
srikanth voona

Data Scientist

at Societe Generale Global Solution Centre

Founded 2000  •  Products & Services  •  100-1000 employees  •  Profitable
Data Science
Python
R Programming
Machine Learning (ML)
Bengaluru (Bangalore)
3 - 7 yrs
₹12L - ₹18L / yr
• Model design, feature planning, system infrastructure, production setup and monitoring, and release management.
• Excellent understanding of machine learning techniques and algorithms, such as SVM, Decision Forests, k-NN, Naive Bayes, etc.
• Experience in selecting features, building and optimizing classifiers using machine learning techniques.
• Prior experience with data visualization tools, such as D3.js, GGplot, etc.
• Good knowledge of statistics, such as distributions, statistical testing, regression, etc.
• Adequate presentation and communication skills to explain results and methodologies to non-technical stakeholders.
• Basic understanding of the banking industry is a value add.

• Develop, process, cleanse and enhance data collection procedures from multiple data sources.
• Conduct and deliver experiments and proofs of concept to validate business ideas and potential value.
• Test, troubleshoot and enhance the developed models in a distributed environment to improve their accuracy.
• Work closely with product teams to implement algorithms with Python and/or R.
• Design and implement scalable predictive models and classifiers leveraging machine learning and data regression.
• Facilitate integration with enterprise applications using APIs to enrich implementations.
 
Job posted by
Bushra Syeda

Senior Data Scientist

at Dataweave Pvt Ltd

Founded 2011  •  Products & Services  •  100-1000 employees  •  Raised funding
Machine Learning (ML)
Python
Data Science
Natural Language Processing (NLP)
Deep Learning
Statistical Modeling
Image processing
Bengaluru (Bangalore)
6 - 10 yrs
Best in industry
About us
DataWeave provides Retailers and Brands with “Competitive Intelligence as a Service” that enables them to take key decisions that impact their revenue. Powered by AI, we provide easily consumable and actionable competitive intelligence by aggregating and analyzing billions of publicly available data points on the Web to help businesses develop data-driven strategies and make smarter decisions.

Data Science@DataWeave
We, the Data Science team at DataWeave (called Semantics internally), build the core machine learning backend and structured domain knowledge needed to deliver insights through our data products. Our underpinnings are: innovation, business awareness, long-term thinking, and pushing the envelope. We are a fast-paced lab within the org applying the latest research in Computer Vision, Natural Language Processing, and Deep Learning to hard problems in different domains.

How we work?
It's hard to tell what we love more, problems or solutions! Every day, we choose to address some of the hardest data problems that there are. We are in the business of making sense of messy public data on the web. At serious scale!

What do we offer?
- Some of the most challenging research problems in NLP and Computer Vision. Huge text and image datasets that you can play with!
- Ability to see the impact of your work and the value you're adding to our customers almost immediately.
- Opportunity to work on different problems and explore a wide variety of tools to figure out what really excites you.
- A culture of openness. Fun work environment. A flat hierarchy. Organization wide visibility. Flexible working hours.
- Learning opportunities with courses and tech conferences. Mentorship from seniors in the team.
- Last but not least, competitive salary packages and fast-paced growth opportunities.

Who are we looking for?
The ideal candidate is a strong software developer or a researcher with experience building and shipping production grade data science applications at scale. Such a candidate has keen interest in liaising with the business and product teams to understand a business problem, and translate that into a data science problem. You are also expected to develop capabilities that open up new business productization opportunities.


We are looking for someone with 6+ years of relevant experience working on problems in NLP or Computer Vision with a Master's degree (PhD preferred).


Key problem areas
- Preprocessing and feature extraction from noisy and unstructured data -- both text as well as images.
- Keyphrase extraction, sequence labeling, entity relationship mining from texts in different domains.
- Document clustering, attribute tagging, data normalization, classification, summarization, sentiment analysis.
- Image based clustering and classification, segmentation, object detection, extracting text from images, generative models, recommender systems.
- Ensemble approaches for all the above problems using multiple text and image based techniques.

Relevant set of skills
- Have a strong grasp of concepts in computer science, probability and statistics, linear algebra, calculus, optimization, algorithms and complexity.
- Background in one or more of information retrieval, data mining, statistical techniques, natural language processing, and computer vision.
- Excellent coding skills in multiple programming languages with experience building production-grade systems. Prior experience with Python is a bonus.
- Experience building and shipping machine learning models that solve real world engineering problems. Prior experience with deep learning is a bonus.
- Experience building robust clustering and classification models on unstructured data (text, images, etc). Experience working with Retail domain data is a bonus.
- Ability to process noisy and unstructured data to enrich it and extract meaningful relationships.
- Experience working with a variety of tools and libraries for machine learning and visualization, including NumPy, matplotlib, scikit-learn, Keras, PyTorch, TensorFlow.
- Use the command line like a pro. Be proficient in Git and other essential software development tools.
- Working knowledge of large-scale computational models such as MapReduce and Spark is a bonus.
- Be a self-starter—someone who thrives in fast paced environments with minimal ‘management’.
- It's a huge bonus if you have some personal projects (including open source contributions) that you work on during your spare time. Show off some of your projects you have hosted on GitHub.

Role and responsibilities
- Understand the business problems we are solving. Build data science capabilities that align with our product strategy.
- Conduct research. Do experiments. Quickly build throwaway prototypes to solve problems pertaining to the Retail domain.
- Build robust clustering and classification models in an iterative manner that can be used in production.
- Constantly think scale, think automation. Measure everything. Optimize proactively.
- Take end to end ownership of the projects you are working on. Work with minimal supervision.
- Help scale our delivery, customer success, and data quality teams with constant algorithmic improvements and automation.
- Take initiatives to build new capabilities. Develop business awareness. Explore productization opportunities.
- Be a tech thought leader. Add passion and vibrance to the team. Push the envelope. Be a mentor to junior members of the team.
- Stay on top of latest research in deep learning, NLP, Computer Vision, and other relevant areas.
Job posted by
Sanket Patil

Data Scientist

at A Fintech startup in Dubai

Agency job
via Jobbie
Data Science
Python
R Programming
Remote, Dubai, Bengaluru (Bangalore), Mumbai
2 - 18 yrs
₹14L - ₹38L / yr
RESPONSIBILITIES AND QUALIFICATIONS

The mission of the Marcus Surveillance Analytics team is to deliver a platform which detects security incidents which have a tangible business impact and actionable response. You will work alongside industry-leading technologists who have recently joined Goldman from across consumer security, technology, fintech, finance and quant firms. The role has a broad scope which will involve interacting with senior leaders of Goldman and the Consumer business on a regular basis. The position is hands-on and requires a driven and “take ownership” oriented individual who is intently focused on execution. You will work directly with developers, business leaders, vendors and partners in order to deliver security assets to the consumer business.

  • Develop a team, vision and platform which identifies/prioritizes actionable security & fraud risks which have tangible business impact across Goldman's consumer and commercial banking businesses.
  • Develop response and recovery technology and programs to ensure resilience from fraud and abuse events.
  • Manage, develop and operationalize analytics which discover security & fraud events and identify risks for all of Goldman's consumer businesses.
  • Partner with fraud / abuse operations and leadership to ensure consumer fraud rates are within industry norms and own outcomes related to fraud improvements.

Skills And Experience We Are Looking For

  • BA/BS degree in Computer Science, Cybersecurity, or other relevant Computer/Data/Engineering degrees
  • 2+ years of experience as a security professional or data analyst/scientist/engineer
  • Python, PySpark, R, Bash, SQL, Splunk (search, ES, UBA)
  • Experience with cloud infrastructure/big data tool sets
  • Visualization tools such as Tableau or D3
  • Research and development to create innovative predictive detections for security and fraud
  • Build a 24/7 real-time monitoring system with long-term vision for scaling to new lines of consumer businesses
  • Strong focus on customer experience and product usability
  • Ability to work closely with the business, fraud, and security incident response teams on creating actionable detections
Job posted by
Sourav Nandi