Senior Software Engineer - Data
Job Description:
We are looking for a tech-savvy Data Engineer to join our growing data team. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The Data Engineer will support our software developers, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. The hire must be self-directed and comfortable supporting the data needs of multiple teams, systems and products.
Data Engineer Job Responsibilities:
- Develop and maintain scalable data pipelines and build out new API integrations to support continuing increases in data volume and complexity (a minimal ingestion sketch follows this list).
- Implement processes and systems to monitor data accuracy, ensuring 100% data availability for key stakeholders and business processes that depend on it.
- Write unit/integration tests and document work.
- Perform data analysis required to troubleshoot data related issues and assist in the resolution of data issues.
- Design data integrations and a reporting framework.
- Work with stakeholders including the Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Design and evaluate open source and vendor tools for data lineage.
- Work closely with all business units and engineering teams to develop strategy for long term data platform architecture.
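To give a flavor of the API-integration work above, here is a minimal ingestion sketch. It assumes a hypothetical paginated JSON endpoint; the URL, page parameter, and record shape are illustrative, not any real service:

```python
# Minimal sketch of a paginated JSON API ingestion step.
# The endpoint, page parameter, and record shape are hypothetical.
import requests

BASE_URL = "https://api.example.com/v1/events"  # invented endpoint

def fetch_all_events(session: requests.Session) -> list[dict]:
    """Walk a paginated endpoint until it returns an empty page."""
    records, page = [], 1
    while True:
        resp = session.get(BASE_URL, params={"page": page}, timeout=30)
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            break
        records.extend(batch)
        page += 1
    return records

if __name__ == "__main__":
    with requests.Session() as session:
        events = fetch_all_events(session)
    print(f"fetched {len(events)} records")
```

In practice a step like this would land the records in staging storage rather than printing them, but the pagination-and-retry skeleton is the recurring pattern.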
Data Engineer Qualifications / Skills:
- 3+ years of Java development experience
- Experience with or knowledge of Agile Software Development methodologies
- Excellent problem-solving and troubleshooting skills
- Process-oriented with great documentation skills
- Experience with big data technologies such as Kafka and BigQuery
- Experience with AWS cloud services: EC2, RDS, etc.
- Experience with message queuing and stream-processing systems (a minimal Kafka sketch follows this list)
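As a concrete illustration of the stream-processing skills listed, here is a minimal consumer sketch using the kafka-python client. The broker address and topic name are assumptions for the example, not part of this role's actual stack:

```python
# Minimal stream-consumption sketch using the kafka-python client.
# Broker address and topic name are invented for illustration.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "page-views",                        # hypothetical topic
    bootstrap_servers="localhost:9092",  # assumed local broker
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    event = message.value
    # Downstream processing would go here; this sketch just echoes it.
    print(event)
```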
Education, Experience and Licensing Requirements:
- Degree in Computer Science, IT, or similar field; a Master’s is a plus
- 3+ years of hands-on development experience
- 3+ years of SQL experience (NoSQL experience is a plus)
- 3+ years of experience with schema design and dimensional data modeling (a toy star schema follows this list)
- Experience designing, building and maintaining data processing systems
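For the dimensional-modeling qualification above, this is a toy star schema sketched with sqlite3 so it runs anywhere; the table and column names are invented for illustration:

```python
# Toy star schema for dimensional modeling, using sqlite3 as a
# stand-in warehouse. Table and column names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (
        date_key  INTEGER PRIMARY KEY,
        full_date TEXT,
        month     INTEGER,
        year      INTEGER
    );
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        category     TEXT
    );
    CREATE TABLE fact_sales (
        date_key    INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        units_sold  INTEGER,
        revenue     REAL
    );
""")
print("tables:", [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")])
```

The fact table holds measures at the grain of one sale while the dimension tables carry descriptive attributes, which is the core star-schema trade-off: fast, simple joins at the cost of some redundancy.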
About DeepIntent
DeepIntent is the leading independent healthcare marketing technology company built purposefully to influence patient health and business outcomes. The DeepIntent Healthcare Marketing Platform is the first and only platform that uniquely combines real-world health data, premium media partnerships, and custom integrations to reach patients and providers across any device. This enables healthcare marketers to plan, activate, optimize and measure campaigns that drive measurable patient and business outcomes, all within a single platform. DeepIntent is leading the healthcare advertising industry with data-driven solutions built for the future. From day one, our mission has been to improve patient outcomes through the artful use of advertising, data science, and real-world clinical data.
This role will include the following responsibilities:
- Develop parsers for XML and JSON data sources/feeds (a minimal parsing sketch follows this section)
- Write automation scripts for product development
- Build API integrations for third-party products
- Perform data analysis
- Research machine learning algorithms
- Understand AWS cloud architecture and work with third-party vendors for deployments
- Resolve issues in the AWS environment
We are looking for candidates with:
Qualification: BE/BTech/BSc-IT/MCA
Programming Language: Python
Web Development: Basic understanding of Web Development. Working knowledge of Python Flask is desirable
Database & Platform: AWS/Docker/MySQL/MongoDB
A basic understanding of machine learning models and AWS fundamentals is recommended.
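As a sketch of the parser work mentioned in the responsibilities above, here is a minimal example that normalizes one XML feed and one JSON feed into a common record shape; both feed layouts are invented for illustration:

```python
# Sketch of parsing an XML feed and a JSON feed into a common
# record shape. The feed layouts are invented for illustration.
import json
import xml.etree.ElementTree as ET

XML_FEED = "<orders><order id='1' total='9.99'/></orders>"
JSON_FEED = '[{"id": "2", "total": 19.5}]'

def parse_xml(payload: str) -> list[dict]:
    root = ET.fromstring(payload)
    return [{"id": o.get("id"), "total": float(o.get("total"))}
            for o in root.iter("order")]

def parse_json(payload: str) -> list[dict]:
    return [{"id": str(o["id"]), "total": float(o["total"])}
            for o in json.loads(payload)]

print(parse_xml(XML_FEED) + parse_json(JSON_FEED))
```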
About the Startup!
BharatX is a startup trying to change how 250 million middle-class Indians get access to credit. We provide credit as a feature to the customers of other consumer-facing apps and platforms through a simple, plug-and-play integration of our APIs. Our offerings enable journeys like Postpaid on Uber/Ola, Pay after Trial on Lenskart/Meesho, Pay in 3 on Flipkart/BoAt, and Credit-Line on PhonePe/Gpay, all in a white-labelled and embedded manner!
Who We Are:
A team of young, ambitious, and bold people who love to dedicate their life's work to something meaningful for India and the world. We love to have a shit ton of fun and cut the bullshit corporate culture! We are not colleagues; we are a family, in it for the long run!
Folks who believe in us:
We have been fortunate to have many global VCs, founders, clients, angels, and industry veterans back us in our journey. We also have many mentors across the industry globally who work with us day in, day out on building BharatX. Some of our investors include:
A special shout-out to some of the clients of BharatX who have also chosen to back us; their vote of confidence in our product and vision is the most valuable to us.
What you will impact:
Data is the backbone of everything we do at BharatX, and you will be leading that effort. Your role will mainly revolve around gathering insights from the data points that the various systems at BharatX generate.
Data points that you could handle can include:
- Device metrics of different cohorts of users
- User behavioural data
- User interaction data (events triggered based on how users interact with various interfaces)
- Many more
And of course, you are welcome to introduce more data sources that we might have overlooked (a toy sketch of the kind of event analysis involved follows).
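A toy example of that kind of event analysis, assuming invented event names and a made-up interaction log:

```python
# Toy summary of user-interaction events with pandas.
# Event names and columns are invented for illustration.
import pandas as pd

events = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 2, 3],
    "event":   ["open", "apply", "open", "open", "apply", "open"],
})

# One simple cut: did each user who opened the flow go on to apply?
per_user = events.pivot_table(index="user_id", columns="event",
                              aggfunc="size", fill_value=0)
per_user["converted"] = per_user["apply"] > 0
print(per_user)
print("conversion rate:", per_user["converted"].mean())
```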
What you will learn:
How to get stuff done! You will solve real-world challenges that no experience or training can prepare you for; only your grit and passion for solving the problem will help you figure out how to deal with them. You will learn to think from a technical as well as a product point of view.
Key Responsibilities:
- Providing insights to optimise existing processes or design new ones
- Providing insights to gauge the performance of various products
- Drafting policies or algorithms that can add to our underwriting stack
What we look for:
We need people who are quick learners and bold enough to suggest crazy ideas. From a technical front, we need people who have:
- Demonstrable experience with SQL and Python/R
- Good knowledge of analytical concepts
- Ability to interact with all stakeholders and understand business requirements.
- Startup work experience (preferred).
We also encourage you to apply even if you took a break from professional life for whatever reason. For us, people and their skills matter more than their resumes.
What you get:
We don’t seek employees; we seek friends. If you are looking for an environment where the smart people around you help you achieve your goals without any corporate bureaucracy, encourage you to make mistakes, and give you complete ownership of your work, then this is the place for you! Here are some side perks that come with this job:
- Be a part of our 0 to 1 ride!
- Attractive compensation with the best-in-class ESOP Structure. You take care of doing your best work, and we will take care of making sure you never have to worry about money.
- Insurance for your entire family.
- Choose your own device (and keep it if you stay long enough).
- Unlimited paid time off with no questions asked.
- Encouragement to take time for your mental health and personal life.
- Maternity and Paternity Leave.
- A tight-knit and brutally honest team with no politics or hierarchy :D
Role & responsibilities:
- Developing ETL pipelines for data replication (an illustrative upsert sketch follows this list)
- Analyze, query and manipulate data according to defined business rules and procedures
- Organize very large-scale data from a multitude of sources into appropriate data sets for research and development by data scientists and analysts across the company
- Convert prototypes into production data engineering solutions through rigorous software engineering practices and modern deployment pipelines
- Resolve internal and external data exceptions in a timely and accurate manner
- Improve multi-environment data flow quality, security, and performance
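One common building block of the replication pipelines mentioned above is an idempotent upsert, so that replaying a batch never corrupts the target. A minimal sketch with sqlite3 standing in for the target store; the schema is invented:

```python
# Idempotent upsert, a building block of replication pipelines.
# sqlite3 stands in for the target store; the schema is invented.
import sqlite3

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

def replicate(rows: list[tuple[int, str]]) -> None:
    """Apply source rows so that re-running the batch is harmless."""
    target.executemany(
        "INSERT INTO users (id, email) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET email = excluded.email",
        rows,
    )

replicate([(1, "a@x.com"), (2, "b@x.com")])
replicate([(1, "a+new@x.com"), (2, "b@x.com")])  # replay is safe
print(list(target.execute("SELECT * FROM users")))
```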
Skills & qualifications:
- Must have experience with:
- virtualization, containers, and orchestration (Docker, Kubernetes)
- creating log-ingestion pipelines (Apache Beam) with both batch and streaming processing (Pub/Sub, Kafka); a minimal Beam sketch follows this list
- workflow orchestration tools (Argo, Airflow)
- supporting machine learning models in production
- Have a desire to continually keep up with advancements in data engineering practices
- Strong Python programming and exploratory data analysis skills
- Ability to work independently and with team members from different backgrounds
- At least a bachelor's degree in an analytical or technical field. This could be applied mathematics, statistics, computer science, operations research, economics, etc. Higher education is welcome and encouraged.
- 3+ years of work in software/data engineering.
- Superior interpersonal skills, independent judgment, and complex problem-solving ability
- Global orientation, experience working across countries, regions and time zones
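For the log-ingestion requirement above, here is a minimal batch pipeline with Apache Beam's Python SDK on the DirectRunner; the log format is invented for illustration, and a streaming version would swap the in-memory source for Pub/Sub or Kafka IO:

```python
# Minimal batch log-ingestion sketch with Apache Beam (DirectRunner).
# The log format is invented for illustration.
import apache_beam as beam

LOGS = [
    "2024-01-01 INFO started",
    "2024-01-01 ERROR disk full",
    "2024-01-02 ERROR timeout",
]

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "Read" >> beam.Create(LOGS)
        | "KeepErrors" >> beam.Filter(lambda line: " ERROR " in line)
        | "KeyByDate" >> beam.Map(lambda line: (line.split()[0], 1))
        | "CountPerDay" >> beam.CombinePerKey(sum)
        | "Print" >> beam.Map(print)
    )
```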
Data Engineers develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions. You might spend a few weeks with a new client on a deep technical review or a complete organizational review, helping them to understand the potential that data brings to solve their most pressing problems. On other projects, you might be acting as the architect, leading the design of technical solutions, or perhaps overseeing a program inception to build a new product. It could also be a software delivery project where you're equally happy coding and tech-leading the team to implement the solution.
You’ll spend time on the following:
- You will partner with teammates to create complex data processing pipelines in order to solve our clients’ most ambitious challenges
- You will collaborate with Data Scientists in order to design scalable implementations of their models
- You will pair to write clean and iterative code based on TDD (a tiny red-green example follows this list)
- Leverage various continuous delivery practices to deploy data pipelines
- Advise and educate clients on how to use different distributed storage and computing technologies from the plethora of options available
- Develop modern data architecture approaches to meet key business objectives and provide end-to-end data solutions
- Create data models and speak to the tradeoffs of different modeling approaches
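As a flavor of the test-first pairing mentioned above, here is a tiny red-green pair with pytest; the cleansing rule itself is invented for illustration:

```python
# A tiny test-first pair: the test pins the behaviour, the function
# follows. The cleansing rule is invented for illustration.

def normalise_email(raw: str) -> str:
    """Lower-case and trim an email address; reject empty input."""
    cleaned = raw.strip().lower()
    if not cleaned:
        raise ValueError("empty email")
    return cleaned

def test_normalise_email():
    assert normalise_email("  Alice@Example.COM ") == "alice@example.com"
```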
Here’s what we’re looking for:
- You have a good understanding of data modelling and experience with data engineering tools and platforms such as Kafka, Spark, and Hadoop (a small PySpark sketch follows this list)
- You have built large-scale data pipelines and data-centric applications using any of the distributed storage platforms such as HDFS, S3, and NoSQL databases (HBase, Cassandra, etc.) and any of the distributed processing platforms like Hadoop, Spark, Hive, Oozie, and Airflow in a production setting
- Hands-on experience with MapR, Cloudera, Hortonworks, and/or cloud-based Hadoop distributions (AWS EMR, Azure HDInsight, Qubole, etc.)
- You are comfortable taking data-driven approaches and applying data security strategy to solve business problems
- Working with data excites you: you can build and operate data pipelines, and maintain data storage, all within distributed systems
- Strong communication and client-facing skills with the ability to work in a consulting environment
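A minimal PySpark sketch of the kind of distributed-processing work described, using an invented dataset and a local session standing in for a production cluster:

```python
# Minimal PySpark aggregation sketch; the dataset is invented and
# runs locally, standing in for a production-scale job.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.master("local[*]").appName("demo").getOrCreate()

df = spark.createDataFrame(
    [("2024-01-01", "click"), ("2024-01-01", "view"), ("2024-01-02", "click")],
    ["day", "event"],
)
df.groupBy("day").agg(F.count("*").alias("events")).show()
spark.stop()
```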
A degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field is expected. Candidates should also have experience using the following software/tools:
● Experience with big data tools: Hadoop, Hive, Spark, Kafka, etc.
● Experience querying multiple SQL/NoSQL databases, including Oracle, MySQL, and MongoDB
● Experience with Redis, RabbitMQ, and Elasticsearch is desirable
● Strong experience with object-oriented/functional/scripting languages: Python (preferred), Core Java, JavaScript, Scala, shell scripting, etc.
● Strong skills in debugging complex code; experience with ML/AI algorithms is a plus
● Experience with a version control tool such as Git is mandatory
● Experience with AWS cloud services: EC2, EMR, RDS, Redshift, S3
● Experience with stream-processing systems: Storm, Spark Streaming, etc.
Good Python developers / Data Engineers / DevOps engineers
Experience: 1-8 years
Work location: Chennai / Remote support
Responsibilities:
2. Assemble large, complex data sets that meet business requirements
3. Identify, design, and implement internal process improvements
4. Optimize data delivery and re-design infrastructure for greater scalability
5. Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS technologies (a hedged boto3 sketch follows this list)
6. Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics
7. Work with internal and external stakeholders to assist with data-related technical issues and support data infrastructure needs
8. Create data tools for analytics and data scientist team members
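To illustrate the AWS side of the load step above, a hedged boto3 sketch; the bucket, key, and file names are placeholders, and AWS credentials are assumed to be configured in the environment:

```python
# Sketch of the "L" in an S3-based ETL step with boto3. Bucket,
# key, and file names are placeholders; AWS credentials are
# assumed to be configured in the environment.
import boto3

s3 = boto3.client("s3")

def load_to_s3(local_path: str, bucket: str, key: str) -> None:
    """Upload one transformed file to its warehouse staging prefix."""
    s3.upload_file(local_path, bucket, key)

# Example call (placeholder names, requires real credentials):
# load_to_s3("daily_metrics.csv", "example-bucket", "staging/daily_metrics.csv")
```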
Skills Required:
1. Working knowledge of ETL on any cloud (Azure / AWS / GCP)
2. Proficient in Python (Programming / Scripting)
3. Good understanding of any of the data warehousing concepts (Snowflake / AWS Redshift / Azure Synapse Analytics / Google BigQuery / Hive)
4. In-depth understanding of principles of database structure
5. Good understanding of any of the ETL technologies (Informatica PowerCenter / AWS Glue / Data Factory / SSIS / Spark / Matillion / Talend / Azure)
6. Proficient in SQL (query solving)
7. Knowledge of change management / version control tools (VSS / DevOps / TFS / GitHub / Bitbucket) and CI/CD (Jenkins)
The person holding this position is responsible for leading the solution development and implementing advanced analytical approaches across a variety of industries in the supply chain domain.
In this position, you act as an interface between the delivery team and the supply chain team, developing an effective understanding of the client's business and supply chain.
Candidates will be expected to lead projects across several areas such as
- Demand forecasting (a toy forecasting sketch follows this list)
- Inventory management
- Simulation & Mathematical optimization models.
- Procurement analytics
- Distribution/Logistics planning
- Network planning and optimization
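As a toy illustration of the demand-forecasting work listed above, a trailing moving-average forecast with pandas; the sales series is invented, and real engagements would use richer statistical models:

```python
# Toy demand forecast: a trailing moving average with pandas.
# The sales series is invented for illustration.
import pandas as pd

sales = pd.Series(
    [100, 120, 130, 125, 140, 150],
    index=pd.date_range("2024-01-01", periods=6, freq="MS"),
)

window = 3
forecast_next = sales.rolling(window).mean().iloc[-1]
print(f"naive {window}-month forecast: {forecast_next:.1f}")
```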
Qualification and Experience
- 4+ years of analytics experience in supply chain, preferably in hi-tech, consumer technology, CPG, automobile, retail, or e-commerce supply chains
- Master's in Statistics/Economics, an MBA, or an M.Sc./M.Tech in Operations Research/Industrial Engineering/Supply Chain
- Hands-on experience in delivery of projects using statistical modelling
Skills / Knowledge
- Hands-on experience with statistical modelling software such as R/Python and SQL.
- Experience with advanced analytics / statistical techniques (regression, decision trees, ensemble machine learning algorithms, etc.) will be considered an added advantage.
- Highly proficient with Excel, PowerPoint and Word applications.
- APICS-CSCP or PMP certification will be an added advantage
- Strong knowledge of supply chain management
- Working knowledge of linear/nonlinear optimization (a small SciPy example follows this list)
- Ability to structure problems through a data driven decision-making process.
- Excellent project management skills, including time and risk management and project structuring.
- Ability to identify and draw on leading-edge analytical tools and techniques to develop creative approaches and new insights to business issues through data analysis.
- Ability to liaise effectively with multiple stakeholders and functional disciplines.
- Experience in Optimization tools like Cplex, ILOG, GAMS will be an added advantage.
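A small linear-optimization example with SciPy, standing in for the CPLEX/GAMS-style modeling noted above; the coefficients are invented: maximize profit 3x + 2y under two capacity constraints:

```python
# Small linear-optimization example with SciPy. The numbers are
# invented: maximise profit 3x + 2y subject to capacity limits.
from scipy.optimize import linprog

# linprog minimises, so negate the profit coefficients.
result = linprog(
    c=[-3, -2],
    A_ub=[[1, 1], [2, 1]],   # labour and machine capacity rows
    b_ub=[10, 15],
    bounds=[(0, None), (0, None)],
)
print("plan:", result.x, "profit:", -result.fun)
```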
- Build a team with skills in ETL, reporting, MDM and ad-hoc analytics support
- Build technical solutions using the latest open-source and cloud-based technologies
- Work closely with offshore senior consultants, the onshore team, and the client's business and IT teams to gather project requirements
- Assist overall project execution from India, from project planning and team formation through system design and development, testing, UAT, and deployment
- Build demos and POCs in support of business development for new and existing clients
- Prepare project documents and PowerPoint presentations for client communication
- Conduct training sessions to train associates and help shape their growth