Key Responsibilities : ( Data Developer Python, Spark)
Exp : 2 to 9 Yrs
Development of data platforms, integration frameworks, processes, and code.
Develop and deliver APIs in Python or Scala for Business Intelligence applications build using a range of web languages
Develop comprehensive automated tests for features via end-to-end integration tests, performance tests, acceptance tests and unit tests.
Elaborate stories in a collaborative agile environment (SCRUM or Kanban)
Familiarity with cloud platforms like GCP, AWS or Azure.
Experience with large data volumes.
Familiarity with writing rest-based services.
Experience with distributed processing and systems
Experience with Hadoop / Spark toolsets
Experience with relational database management systems (RDBMS)
Experience with Data Flow development
Knowledge of Agile and associated development techniques including:
n
About Hiring for one of the MNC for India location
Similar jobs
Job Title -Data Scientist
Job Duties
- Data Scientist responsibilities includes planning projects and building analytics models.
- You should have a strong problem-solving ability and a knack for statistical analysis.
- If you're also able to align our data products with our business goals, we'd like to meet you. Your ultimate goal will be to help improve our products and business decisions by making the most out of our data.
Responsibilities
Own end-to-end business problems and metrics, build and implement ML solutions using cutting-edge technology.
Create scalable solutions to business problems using statistical techniques, machine learning, and NLP.
Design, experiment and evaluate highly innovative models for predictive learning
Work closely with software engineering teams to drive real-time model experiments, implementations, and new feature creations
Establish scalable, efficient, and automated processes for large-scale data analysis, model development, deployment, experimentation, and evaluation.
Research and implement novel machine learning and statistical approaches.
Requirements
2-5 years of experience in data science.
In-depth understanding of modern machine learning techniques and their mathematical underpinnings.
Demonstrated ability to build PoCs for complex, ambiguous problems and scale them up.
Strong programming skills (Python, Java)
High proficiency in at least one of the following broad areas: machine learning, statistical modelling/inference, information retrieval, data mining, NLP
Experience with SQL and NoSQL databases
Strong organizational and leadership skills
Excellent communication skills
Data Engineer
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lake
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices.
- Scripting language - Python & pyspark
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS
- Data ingestion from different data sources which exposes data using different technologies, such as: RDBMS, REST HTTP API, flat files, Streams, and Time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
- Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform
- Develop automated data quality check to make sure right data enters the platform and verifying the results of the calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Key participant in regular Scrum ceremonies with the agile teams
- Proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring best industry practices
QUALIFICATIONS
- 5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science or related discipline
- Advanced knowledge one of language: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with
- Data mining/programming tools (e.g. SAS, SQL, R, Python)
- Database technologies (e.g. PostgreSQL, Redshift, Snowflake. and Greenplum)
- Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools.
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques.
Familiarity and experience in the following is a plus:
- AWS certification
- Spark Streaming
- Kafka Streaming / Kafka Connect
- ELK Stack
- Cassandra / MongoDB
- CI/CD: Jenkins, GitLab, Jira, Confluence other related tools
We are #hiring for AWS Data Engineer expert to join our team
Job Title: AWS Data Engineer
Experience: 5 Yrs to 10Yrs
Location: Remote
Notice: Immediate or Max 20 Days
Role: Permanent Role
Skillset: AWS, ETL, SQL, Python, Pyspark, Postgres DB, Dremio.
Job Description:
Able to develop ETL jobs.
Able to help with data curation/cleanup, data transformation, and building ETL pipelines.
Strong Postgres DB exp and knowledge of Dremio data visualization/semantic layer between DB and the application is a plus.
Sql, Python, and Pyspark is a must.
Communication should be good
Ganit has flipped the data science value chain as we do not start with a technique but for us, consumption comes first. With this philosophy, we have successfully scaled from being a small start-up to a 200 resource company with clients in the US, Singapore, Africa, UAE, and India.
We are looking for experienced data enthusiasts who can make the data talk to them.
You will:
- Understand business problems and translate business requirements into technical requirements.
- Conduct complex data analysis to ensure data quality & reliability i.e., make the data talk by extracting, preparing, and transforming it.
- Identify, develop and implement statistical techniques and algorithms to address business challenges and add value to the organization.
- Gather requirements and communicate findings in the form of a meaningful story with the stakeholders
- Build & implement data models using predictive modelling techniques. Interact with clients and provide support for queries and delivery adoption.
- Lead and mentor data analysts.
We are looking for someone who has:
- Apart from your love for data and ability to code even while sleeping you would need the following.
- Minimum of 02 years of experience in designing and delivery of data science solutions.
- You should have successful projects of retail/BFSI/FMCG/Manufacturing/QSR in your kitty to show-off.
- Deep understanding of various statistical techniques, mathematical models, and algorithms to start the conversation with the data in hand.
- Ability to choose the right model for the data and translate that into a code using R, Python, VBA, SQL, etc.
- Bachelors/Masters degree in Engineering/Technology or MBA from Tier-1 B School or MSc. in Statistics or Mathematics
Skillset Required:
- Regression
- Classification
- Predictive Modelling
- Prescriptive Modelling
- Python
- R
- Descriptive Modelling
- Time Series
- Clustering
What is in it for you:
- Be a part of building the biggest brand in Data science.
- An opportunity to be a part of a young and energetic team with a strong pedigree.
- Work on awesome projects across industries and learn from the best in the industry, while growing at a hyper rate.
Please Note:
At Ganit, we are looking for people who love problem solving. You are encouraged to apply even if your experience does not precisely match the job description above. Your passion and skills will stand out and set you apart—especially if your career has taken some extraordinary twists and turns over the years. We welcome diverse perspectives, people who think rigorously and are not afraid to challenge assumptions in a problem. Join us and punch above your weight!
Ganit is an equal opportunity employer and is committed to providing a work environment that is free from harassment and discrimination.
All recruitment, selection procedures and decisions will reflect Ganit’s commitment to providing equal opportunity. All potential candidates will be assessed according to their skills, knowledge, qualifications, and capabilities. No regard will be given to factors such as age, gender, marital status, race, religion, physical impairment, or political opinions.
Job Overview
We are looking for a Data Engineer to join our data team to solve data-driven critical
business problems. The hire will be responsible for expanding and optimizing the existing
end-to-end architecture including the data pipeline architecture. The Data Engineer will
collaborate with software developers, database architects, data analysts, data scientists and platform team on data initiatives and will ensure optimal data delivery architecture is
consistent throughout ongoing projects. The right candidate should have hands on in
developing a hybrid set of data-pipelines depending on the business requirements.
Responsibilities
- Develop, construct, test and maintain existing and new data-driven architectures.
- Align architecture with business requirements and provide solutions which fits best
- to solve the business problems.
- Build the infrastructure required for optimal extraction, transformation, and loading
- of data from a wide variety of data sources using SQL and Azure ‘big data’
- technologies.
- Data acquisition from multiple sources across the organization.
- Use programming language and tools efficiently to collate the data.
- Identify ways to improve data reliability, efficiency and quality
- Use data to discover tasks that can be automated.
- Deliver updates to stakeholders based on analytics.
- Set up practices on data reporting and continuous monitoring
Required Technical Skills
- Graduate in Computer Science or in similar quantitative area
- 1+ years of relevant work experience as a Data Engineer or in a similar role.
- Advanced SQL knowledge, Data-Modelling and experience working with relational
- databases, query authoring (SQL) as well as working familiarity with a variety of
- databases.
- Experience in developing and optimizing ETL pipelines, big data pipelines, and datadriven
- architectures.
- Must have strong big-data core knowledge & experience in programming using Spark - Python/Scala
- Experience with orchestrating tool like Airflow or similar
- Experience with Azure Data Factory is good to have
- Build processes supporting data transformation, data structures, metadata,
- dependency and workload management.
- Experience supporting and working with cross-functional teams in a dynamic
- environment.
- Good understanding of Git workflow, Test-case driven development and using CICD
- is good to have
- Good to have some understanding of Delta tables It would be advantage if the candidate also have below mentioned experience using
- the following software/tools:
- Experience with big data tools: Hadoop, Spark, Hive, etc.
- Experience with relational SQL and NoSQL databases
- Experience with cloud data services
- Experience with object-oriented/object function scripting languages: Python, Scala, etc.
Work Timings:4:00PM to 11:30PM
Fulltime WFH
6+ Yrs in Data science
Strong Experience ML Regression, Classification, Anomaly detection, NLP, Deep learning, Predictive analytics, Predictive maintenance ,Python, Added advantage Data visualization
datasets
● Translate complex business requirements into scalable technical solutions meeting data design
standards. Strong understanding of analytics needs and proactive-ness to build generic solutions
to improve the efficiency
● Build dashboards using Self-Service tools on Kibana and perform data analysis to support
business verticals
● Collaborate with multiple cross-functional teams and work
Specialism- Advance Analytics, Data Science, regression, forecasting, analytics, SQL, R, python, decision tree, random forest, SAS, clustering classification
Senior Analytics Consultant- Responsibilities
- Understand business problem and requirements by building domain knowledge and translate to data science problem
- Conceptualize and design cutting edge data science solution to solve the data science problem, apply design thinking concepts
- Identify the right algorithms , tech stack , sample outputs required to efficiently adder the end need
- Prototype and experiment the solution to successfully demonstrate the value
Independently or with support from team execute the conceptualized solution as per plan by following project management guidelines - Present the results to internal and client stakeholder in an easy to understand manner with great story telling, story boarding, insights and visualization
- Help build overall data science capability for eClerx through support in pilots, pre sales pitches, product development , practice development initiatives
DataWeave provides Retailers and Brands with “Competitive Intelligence as a Service” that enables them to take key decisions that impact their revenue. Powered by AI, we provide easily consumable and actionable competitive intelligence by aggregating and analyzing billions of publicly available data points on the Web to help businesses develop data-driven strategies and make smarter decisions.
Data Science@DataWeave
We the Data Science team at DataWeave (called Semantics internally) build the core machine learning backend and structured domain knowledge needed to deliver insights through our data products. Our underpinnings are: innovation, business awareness, long term thinking, and pushing the envelope. We are a fast paced labs within the org applying the latest research in Computer Vision, Natural Language Processing, and Deep Learning to hard problems in different domains.
How we work?
It's hard to tell what we love more, problems or solutions! Every day, we choose to address some of the hardest data problems that there are. We are in the business of making sense of messy public data on the web. At serious scale!
What do we offer?
- Some of the most challenging research problems in NLP and Computer Vision. Huge text and image datasets that you can play with!
- Ability to see the impact of your work and the value you're adding to our customers almost immediately.
- Opportunity to work on different problems and explore a wide variety of tools to figure out what really excites you.
- A culture of openness. Fun work environment. A flat hierarchy. Organization wide visibility. Flexible working hours.
- Learning opportunities with courses and tech conferences. Mentorship from seniors in the team.
- Last but not the least, competitive salary packages and fast paced growth opportunities.
Who are we looking for?
The ideal candidate is a strong software developer or a researcher with experience building and shipping production grade data science applications at scale. Such a candidate has keen interest in liaising with the business and product teams to understand a business problem, and translate that into a data science problem. You are also expected to develop capabilities that open up new business productization opportunities.
We are looking for someone with 6+ years of relevant experience working on problems in NLP or Computer Vision with a Master's degree (PhD preferred).
Key problem areas
- Preprocessing and feature extraction noisy and unstructured data -- both text as well as images.
- Keyphrase extraction, sequence labeling, entity relationship mining from texts in different domains.
- Document clustering, attribute tagging, data normalization, classification, summarization, sentiment analysis.
- Image based clustering and classification, segmentation, object detection, extracting text from images, generative models, recommender systems.
- Ensemble approaches for all the above problems using multiple text and image based techniques.
Relevant set of skills
- Have a strong grasp of concepts in computer science, probability and statistics, linear algebra, calculus, optimization, algorithms and complexity.
- Background in one or more of information retrieval, data mining, statistical techniques, natural language processing, and computer vision.
- Excellent coding skills on multiple programming languages with experience building production grade systems. Prior experience with Python is a bonus.
- Experience building and shipping machine learning models that solve real world engineering problems. Prior experience with deep learning is a bonus.
- Experience building robust clustering and classification models on unstructured data (text, images, etc). Experience working with Retail domain data is a bonus.
- Ability to process noisy and unstructured data to enrich it and extract meaningful relationships.
- Experience working with a variety of tools and libraries for machine learning and visualization, including numpy, matplotlib, scikit-learn, Keras, PyTorch, Tensorflow.
- Use the command line like a pro. Be proficient in Git and other essential software development tools.
- Working knowledge of large-scale computational models such as MapReduce and Spark is a bonus.
- Be a self-starter—someone who thrives in fast paced environments with minimal ‘management’.
- It's a huge bonus if you have some personal projects (including open source contributions) that you work on during your spare time. Show off some of your projects you have hosted on GitHub.
Role and responsibilities
- Understand the business problems we are solving. Build data science capability that align with our product strategy.
- Conduct research. Do experiments. Quickly build throw away prototypes to solve problems pertaining to the Retail domain.
- Build robust clustering and classification models in an iterative manner that can be used in production.
- Constantly think scale, think automation. Measure everything. Optimize proactively.
- Take end to end ownership of the projects you are working on. Work with minimal supervision.
- Help scale our delivery, customer success, and data quality teams with constant algorithmic improvements and automation.
- Take initiatives to build new capabilities. Develop business awareness. Explore productization opportunities.
- Be a tech thought leader. Add passion and vibrance to the team. Push the envelope. Be a mentor to junior members of the team.
- Stay on top of latest research in deep learning, NLP, Computer Vision, and other relevant areas.