Pipelines should be optimised to handle both real time data, batch update data and historical data.
Establish scalable, efficient, automated processes for complex, large scale data analysis.
Write high quality code to gather and manage large data sets (both real time and batch data) from multiple sources, perform ETL and store it in a data warehouse.
Manipulate and analyse complex, high-volume, high-dimensional data from varying sources using a variety of tools and data analysis techniques.
Participate in data pipelines health monitoring and performance optimisations as well as quality documentation.
Interact with end users/clients and translate business language into technical requirements.
Acts independently to expose and resolve problems.
Job Requirements :-
2+ years experience working in software development & data pipeline development for enterprise analytics.
2+ years of working with Python with exposure to various warehousing tools
In-depth working with any of commercial tools like AWS Glue, Ta-lend, Informatica, Data-stage, etc.
Experience with various relational databases like MySQL, MSSql, Oracle etc. is a must.
Experience with analytics and reporting tools (Tableau, Power BI, SSRS, SSAS).
Experience in various DevOps practices helping the client to deploy and scale the systems as per requirement.
Strong verbal and written communication skills with other developers and business client.
Knowledge of Logistics and/or Transportation Domain is a plus.
Hands-on with traditional databases and ERP systems like Sybase and People-soft.
Who Are We?
Vahak (https://www.vahak.in) is India’s largest & most trusted online transport marketplace & directory for road transport businesses and individual commercial vehicle (Trucks, Trailers, Containers, Hyva, LCVs) owners for online truck and load booking, transport business branding and transport business network expansion. Lorry owners can find intercity and intracity loads from all over India and connect with other businesses to find trusted transporters and best deals in the Indian logistics services market. With the Vahak app, users can book loads and lorries from a live transport marketplace with over 5 Lakh + Transporters and Lorry owners in over 10,000+ locations for daily transport requirements.
Vahak has raised a capital of $5 Million in a “Pre Series A” round from RTP Global along with participation from Luxor Capital and Leo Capital. The other marquee angel investors include Kunal Shah, Founder and CEO, CRED; Jitendra Gupta, Founder and CEO, Jupiter; Vidit Aatrey and Sanjeev Barnwal, Co-founders, Meesho; Mohd Farid, Co-founder, Sharechat; Amrish Rau, CEO, Pine Labs; Harsimarbir Singh, Co-founder, Pristyn Care; Rohit and Kunal Bahl, Co-founders, Snapdeal; and Ravish Naresh, Co-founder and CEO, Khatabook.
Responsibilities for Data Analyst:
- Undertake preprocessing of structured and unstructured data
- Propose solutions and strategies to business challenges
- Present information using data visualization techniques
- Identify valuable data sources and automate collection processes
- Analyze large amounts of information to discover trends and patterns
- Mine and analyze data from company databases to drive optimization and improvement of product development, marketing techniques and business strategies.
- Proactively analyze data to answer key questions from stakeholders or out of self-initiated curiosity with an eye for what drives business performance.
Qualifications for Data Analyst
- Experience using business intelligence tools (e.g. Tableau, Power BI – not mandatory)
- Strong SQL or Excel skills with the ability to learn other analytic tools
- Conceptual understanding of various modelling techniques, pros and cons of each technique
- Strong problem solving skills with an emphasis on product development.
- Programming advanced computing, Developing algorithms and predictive modeling experience
- Experience using statistical computer languages (R, Python, SQL, etc.) to manipulate data and draw insights from large data sets.
- Advantage - Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages/drawbacks.
- Demonstrated experience applying data analysis methods to real-world data problems
o 3+ years of software engineering experience.
o Advanced knowledge of Python, with 2+ years in a production environment.
o Experience with practical applications of deep learning.
o Experience with agile, test-driven development, continuous integration, and automated testing.
o Experience with productionizing machine learning models and integrating into web- services.
o Experience with the full software development life cycle, including requirements collection, design, implementation, testing, and operational support.
o Excellent verbal and written communication, teamwork, decision making and influencing
o Hustle. Thrives in an evolving, fast paced, ambiguous work environment.
We are looking for a motivated data analyst with sound experience in handling web/ digital analytics, to join us as part of the Kapiva D2C Business Team. This team is primarily responsible for driving sales and customer engagement on our website. This channel has grown 5x in revenue over the last 12 months and is poised to grow another 5x over the next six. It represents a high-growth, important part of our overall e-commerce growth strategy.
The mandate here is to run an end-to-end sustainable e-commerce business, boost sales through marketing campaigns, and build a cutting edge product (website) that optimizes the customer’s journey as well as increases customer lifetime value.
The Data Analyst will support the business heads by providing data-backed insights in order to drive customer growth, retention and engagement. They will be required to set-up and manage reports, test various hypotheses and coordinate with various stakeholders on a day-to-day basis.
Strategy and planning:
● Work with the D2C functional leads and support analytics planning on a quarterly/ annual basis
● Identify reports and analytics needed to be conducted on a daily/ weekly/ monthly frequency
● Drive planning for hypothesis-led testing of key metrics across the customer funnel
● Interpret data, analyze results using statistical techniques and provide ongoing reports
● Analyze large amounts of information to discover trends and patterns
● Work with business teams to prioritize business and information needs
● Collaborate with engineering and product development teams to setup data infrastructure as needed
Reporting and communication:
● Prepare reports / presentations to present actionable insights that can drive business objectives
● Setup live dashboards reporting key cross-functional metrics
● Coordinate with various stakeholders to collect useful and required data
● Present findings to business stakeholders to drive action across the organization
● Propose solutions and strategies to business challenges
● Bachelor’s/ Masters in Mathematics, Economics, Computer Science, Information Management, Statistics or related field
● High proficiency in MS Excel and SQL
● Knowledge of one or more programming languages like Python/ R. Adept at queries, report writing and presenting findings
● Strong analytical skills with the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy - working knowledge of statistics and statistical methods
● Ability to work in a highly dynamic environment across cross-functional teams; good at
coordinating with different departments and managing timelines
● Exceptional English written/verbal communication
● A penchant for understanding consumer traits and behavior and a keen eye to detail
Good to have:
● Hands-on experience with one or more web analytics tools like Google Analytics, Mixpanel, Kissmetrics, Heap, Adobe Analytics, etc.
● Experience in using business intelligence tools like Metabase, Tableau, Power BI is a plus
● Experience in developing predictive models and machine learning algorithms
1. Understand the business problem and translate these to data services and
2. Expertise in working on cloud application designs, cloud approval plans, and
systems required to manage cloud storage.
3. Explore new technologies and learn new techniques to solve business problems
4. Collaborate with different teams - engineering and business, to build better data
5. Regularly evaluate cloud applications, hardware, and software.
6. Respond to technical issues in a professional and timely manner.
7. Identify the top cloud architecture solutions to successfully meet the strategic
needs of the company.
8. Offer guidance in infrastructure movement techniques including bulk application
transfers into the cloud.
9. Manage team and handle delivery of 2-3 projects
JD | Data Architect 24-Aug-2021
Solving for better 3 of 7
Is Education overrated? Yes. We believe so. But there is no way to locate you
otherwise. So we might look for at least a computer science, computer engineering,
information technology, or relevant field along with:
1. Over 4-6 years of experience in Data handling
2. Hands-on experience of any one programming language (Python, Java, Scala)
3. Understanding of SQL is must
4. Big data (Hadoop, Hive, Yarn, Sqoop)
5. MPP platforms (Spark, Presto)
6. Data-pipeline & scheduler tool (Ozzie, Airflow, Nifi)
7. Streaming engines (Kafka, Storm, Spark Streaming)
8. Any Relational database or DW experience
9. Any ETL tool experience
10. Hands-on experience in pipeline design, ETL and application development
11. Hands-on experience in cloud platforms like AWS, GCP etc.
12. Good communication skills and strong analytical skills
13. Experience in team handling and project delivery
- Conducting advanced statistical analysis to provide actionable insights, identify trends, and measure performance
- Performing data exploration, cleaning, preparation and feature engineering; in addition to executing tasks such as building a POC, validation/ AB testing
- Collaborating with data engineers & architects to implement and deploy scalable solutions
- Communicating results to diverse audiences with effective writing and visualizations
- Identifying and executing on high impact projects, triage external requests, and ensure timely completion for the results to be useful
- Providing thought leadership by researching best practices, conducting experiments, and collaborating with industry leaders
What you need to have:
- 2-4 year experience in machine learning algorithms, predictive analytics, demand forecasting in real-world projects
- Strong statistical background in descriptive and inferential statistics, regression, forecasting techniques.
- Strong Programming background in Python (including packages like Tensorflow), R, D3.js , Tableau, Spark, SQL, MongoDB.
- Preferred exposure to Optimization & Meta-heuristic algorithm and related applications
- Background in a highly quantitative field like Data Science, Computer Science, Statistics, Applied Mathematics,Operations Research, Industrial Engineering, or similar fields.
- Should have 2-4 years of experience in Data Science algorithm design and implementation, data analysis in different applied problems.
- DS Mandatory skills : Python, R, SQL, Deep learning, predictive analysis, applied statistics
• 2+ years of experience in data engineering & strong understanding of data engineering principles using big data technologies
• Excellent programming skills in Python is mandatory
• Expertise in relational databases (MSSQL/MySQL/Postgres) and expertise in SQL. Exposure to NoSQL such as Cassandra. MongoDB will be a plus.
• Exposure to deploying ETL pipelines such as AirFlow, Docker containers & Lambda functions
• Experience in AWS loud services such as AWS CLI, Glue, Kinesis etc
• Experience using Tableau for data visualization is a plus
• Ability to demonstrate a portfolio of projects (GitHub, papers, etc.) is a plus
• Motivated, can-do attitude and desire to make a change is a must
• Excellent communication skills
o Strong Python development skills, with 7+ yrs. experience with SQL.
o A bachelor or master’s degree in Computer Science or related areas
o 5+ years of experience in data integration and pipeline development
o Experience in Implementing Databricks Delta lake and data lake
o Expertise designing and implementing data pipelines using modern data engineering approach and tools: SQL, Python, Delta Lake, Databricks, Snowflake Spark
o Experience in working with multiple file formats (Parque, Avro, Delta Lake) & API
o experience with AWS Cloud on data integration with S3.
o Hands on Development experience with Python and/or Scala.
o Experience with SQL and NoSQL databases.
o Experience in using data modeling techniques and tools (focused on Dimensional design)
o Experience with micro-service architecture using Docker and Kubernetes
o Have experience working with one or more of the public cloud providers i.e. AWS, Azure or GCP
o Experience in effectively presenting and summarizing complex data to diverse audiences through visualizations and other means
o Excellent verbal and written communications skills and strong leadership capabilities
Azure Data Lake, dataFactory, Databricks, Delta Lake
Essential Skills :
- Develop, enhance and maintain Python related projects, data services, platforms and processes.
- Apply and maintain data quality checks to ensure data integrity and completeness.
- Able to integrate multiple data sources and databases.
- Collaborate with cross-functional teams across, Decision Sciences, Search, Database Management. To design innovative solutions, capture requirements and drive a common future vision.
Technical Skills/Capabilities :
- Hands on experience in Python programming language.
- Understanding and proven application of Computer Science fundamentals in object oriented design, data structures, algorithm design, Regular expressions, data storage procedures, problem solving, and complexity analysis.
- Understanding of natural language processing and basic ML algorithms will be a plus.
- Good troubleshooting and debugging skills.
- Strong individual contributor, self-motivated, and a proven team player.
- Eager to learn and develop new experience and skills.
- Good communication and interpersonal skills.
About Company Profile :
Gauge Data Solutions Pvt Ltd :
- We are a leading company into Data Science, Machine learning and Artificial Intelligence.
- Within Gauge data we have a competitive environment for the Developers and Engineers.
- We at Gauge create potential solutions for the real world problems. One such example of our engineering is Casemine.
- Casemine is a legal research platform powered by Artificial Intelligence. It helps lawyers, judges and law researchers in their day to day life.
- Casemine provides exhaustive case results to its users with the use of cutting edge technologies.
- It is developed with the efforts of great engineers at Gauge Data.
- One such opportunity is now open for you. We at Gauge Data invites application for competitive, self motivated Python Developer.
Purpose of the Role :
- This position will play a central role in developing new features and enhancements for the products and services at Gauge Data.
- To know more about what we do and how we do it, feel free to read these articles:
- You can also visit us at https://www.casemine.com/.
- For more information visit us at: - www.gaugeanalytics.com
- Join us on LinkedIn, Twitter & Facebook