REQUIREMENT:
- Previous experience of working in large scale data engineering
- 4+ years of experience in data engineering and/or backend technologies, with experience on any cloud platform, is mandatory.
- Previous experience of architecting and designing backend for large scale data processing.
- Familiarity with and experience of different technologies related to data engineering: different database technologies, Hadoop, Spark, Storm, Hive, etc.
- Hands-on, with the ability to contribute a key portion of the data engineering backend.
- Self-inspired and motivated to drive for exceptional results.
- Familiarity and experience working with different stages of data engineering – data acquisition, data refining, large scale data processing, efficient data storage for business analysis.
- Familiarity and experience working with different DB technologies and how to scale them.
RESPONSIBILITY:
- End-to-end responsibility for the data engineering architecture, design, development and implementation.
- Build data engineering workflow for large scale data processing.
- Discover opportunities in data acquisition.
- Bring industry best practices for data engineering workflow.
- Develop data set processes for data modelling, mining and production.
- Take additional tech responsibilities for driving an initiative to completion.
- Recommend ways to improve data reliability, efficiency and quality.
- Goes out of their way to reduce complexity.
- Humble and outgoing - engineering cheerleaders.
Responsibilities
Research, develop and maintain machine learning and statistical models for business requirements
Work across the spectrum of statistical modelling, including supervised, unsupervised and deep learning techniques, to apply the right level of solution to the right problem
Coordinate with different functional teams to monitor outcomes and refine/improve the machine learning models
Implement models to uncover patterns and predictions, creating business value and innovation
Identify unexplored data opportunities for the business to unlock and maximize the potential of digital data within the organization
Develop NLP concepts and algorithms to classify and summarize structured/unstructured text data
Qualifications
3+ years of experience solving complex business problems using machine learning.
Fluency in a programming language such as Python, along with NLP techniques and BERT, is a must
Strong analytical and critical thinking skills
Experience in building production quality models using state-of-the-art technologies
Familiarity with databases is desirable.
Ability to collaborate on projects and work independently when required.
Previous experience in Fintech/payments domain is a bonus
You should have a Bachelor’s or Master’s degree in Computer Science, Statistics, Mathematics or another quantitative field from a top-tier institute
About Us:
6sense is a Predictive Intelligence Engine that is reimagining how B2B companies do
sales and marketing. It works with big data at scale, advanced machine learning and
predictive modelling to find buyers and predict what they will purchase, when and
how much.
6sense helps B2B marketing and sales organizations fully understand the complex ABM
buyer journey. By combining intent signals from every channel with the industry’s most
advanced AI predictive capabilities, it is finally possible to predict account demand and
optimize demand generation in an ABM world. Equipped with the power of AI and the
6sense Demand Platform™, marketing and sales professionals can uncover, prioritize,
and engage buyers to drive more revenue.
6sense is seeking a Staff Software Engineer (Data) to become part of a team
designing, developing, and deploying its customer-centric applications.
We’ve more than doubled our revenue in the past five years and completed our Series
E funding of $200M last year, giving us a stable foundation for growth.
Responsibilities:
1. Own critical datasets and data pipelines for product & business, and work
towards direct business goals of increased data coverage, data match rates, data
quality, data freshness
2. Create more value from various datasets with creative solutions, unlock
more value from existing data, and help build a data moat for the company
3. Design, develop, test, deploy and maintain optimal data pipelines, and assemble
large, complex data sets that meet functional and non-functional business
requirements
4. Improve our current data pipelines, i.e. improve their performance and SLAs,
remove redundancies, and figure out a way to test before vs. after rollout
5. Identify, design, and implement process improvements in data flow across
multiple stages, in collaboration with multiple cross-functional teams, e.g.
automating manual processes, optimising data delivery, hand-off processes, etc.
6. Work with cross-functional stakeholders, including the Product, Data Analytics
and Customer Support teams, to enable data access and related goals
7. Build for security, privacy, scalability, reliability and compliance
8. Mentor and coach other team members on scalable and extensible solution
design and coding best practices
9. Help build a team and cultivate innovation by driving cross-collaboration and
execution of projects across multiple teams
Requirements:
8-10+ years of overall work experience as a Data Engineer
Excellent analytical and problem-solving skills
Strong experience with Big Data technologies like Apache Spark. Experience with
Hadoop, Hive and Presto would be a plus
Strong experience in writing complex, optimized SQL queries across large data
sets. Experience with optimizing queries and underlying storage
Experience with Python/Scala
Experience with Apache Airflow or other orchestration tools
Experience with writing Hive / Presto UDFs in Java
Experience working on AWS cloud platform and services.
Experience with Key Value stores or NoSQL databases would be a plus.
Comfortable with Unix / Linux command line
Interpersonal Skills:
You can work independently as well as part of a team.
You take ownership of projects and drive them to conclusion.
You’re a good communicator and are capable of not just doing the work, but also
teaching others and explaining the “why” behind complicated technical
decisions.
You aren’t afraid to roll up your sleeves: This role will evolve over time, and we’ll
want you to evolve with it
About Quadratyx:
We are a global, product-centric insight & automation services company. We help the world’s organizations make better & faster decisions using the power of insight & intelligent automation. We build and operationalize their next-gen strategy through Big Data, Artificial Intelligence, Machine Learning, Unstructured Data Processing and Advanced Analytics. Quadratyx can boast more extensive experience in data sciences & analytics than most other companies in India.
We firmly believe in Excellence Everywhere.
Job Description
Purpose of the Job/ Role:
• As a Technical Lead, your work is a combination of hands-on contribution, customer engagement and technical team management. Overall, you’ll design, architect, deploy and maintain big data solutions.
Key Requisites:
• Expertise in Data structures and algorithms.
• Technical management across the full life cycle of big data (Hadoop) projects from requirement gathering and analysis to platform selection, design of the architecture and deployment.
• Scaling of cloud-based infrastructure.
• Collaborating with business consultants, data scientists, engineers and developers to develop data solutions.
• Experience leading and mentoring a team of data engineers.
• Hands-on experience in test-driven development (TDD).
• Expertise in NoSQL databases such as MongoDB and Cassandra (MongoDB preferred), and strong knowledge of relational databases.
• Good knowledge of Kafka and Spark Streaming internal architecture.
• Good knowledge of at least one application server.
• Extensive knowledge of big data platforms such as Hadoop, Hortonworks, etc.
• Knowledge of data ingestion and integration on cloud services such as AWS, Google Cloud, Azure, etc.
Skills/ Competencies Required
Technical Skills
• Strong expertise (9 or more out of 10) in at least one modern programming language, such as Python or Java.
• Clear end-to-end experience in designing, programming, and implementing large software systems.
• Passion and analytical abilities to solve complex problems.
Soft Skills
• Always speaking your mind freely.
• Communicating ideas clearly in talking and writing, integrity to never copy or plagiarize intellectual property of others.
• Exercising discretion and independent judgment where needed in performing duties; not needing micro-management, maintaining high professional standards.
Academic Qualifications & Experience Required
Required Educational Qualification & Relevant Experience
• Bachelor’s or Master’s in Computer Science, Computer Engineering, or related discipline from a well-known institute.
• Minimum 7-10 years of work experience as a developer in an IT organization (preferably with an Analytics / Big Data / Data Science / AI background).
We have an urgent requirement for Big Data Developer profiles in our reputed MNC.
Location: Pune/Bangalore/Hyderabad/Nagpur
Experience: 4-9yrs
Skills: PySpark, AWS
or Spark, Scala, AWS
or Python, AWS
• S/he possesses wide exposure to the complete data lifecycle, from creation to consumption
• S/he has in the past built repeatable tools / data-models to solve specific business problems
• S/he should have hands-on experience of having worked on projects (either as a consultant or within a company) that needed them to
o Provide consultation to senior client personnel
o Implement and enhance data warehouses or data lakes
o Work with business teams, or be part of a team, implementing process re-engineering driven by data analytics/insights
• Should have deep appreciation of how data can be used in decision-making
• Should have perspective on newer ways of solving business problems. E.g. external data, innovative techniques, newer technology
• S/he must have a solution-creation mindset.
• Ability to design and enhance scalable data platforms to address the business need
• Working experience with data engineering tools on one or more cloud platforms: Snowflake, AWS/Azure/GCP
• Engage with technology teams from Tredence and Clients to create last mile connectivity of the solutions
o Should have experience of working with technology teams
• Demonstrated ability in thought leadership – Articles/White Papers/Interviews
Mandatory Skills: Program Management, Data Warehouse, Data Lake, Analytics, Cloud Platform
- Building and operationalizing large scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions: Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hubs/IoT Hub.
- Experience in migrating on-premise data warehouses to data platforms on the Azure cloud.
- Designing and implementing data engineering, ingestion, and transformation functions
- Azure Synapse or Azure SQL Data Warehouse
- Spark on Azure, as available in HDInsight and Databricks
- Data pre-processing, data transformation, data analysis, and feature engineering
- Performance optimization of scripts (code) and Productionizing of code (SQL, Pandas, Python or PySpark, etc.)
- Required skills:
- Bachelor's in Computer Science, Data Science, Computer Engineering, IT or equivalent
- Fluency in Python (Pandas), PySpark, SQL, or similar
- Azure Data Factory experience (min 12 months)
- Able to write efficient code using traditional, OO concepts, modular programming following the SDLC process.
- Experience in production optimization and end-to-end performance tracing (technical root cause analysis)
- Ability to work independently with demonstrated experience in project or program management
- Azure experience, with the ability to translate data scientists' Python code and make it efficient (production-ready) for cloud deployment
along with metrics to track their progress
Managing available resources such as hardware, data, and personnel so that deadlines
are met
Analysing the ML algorithms that could be used to solve a given problem and ranking
them by their success probability
Exploring and visualizing data to gain an understanding of it, then identifying
differences in data distribution that could affect performance when deploying the model
in the real world
Verifying data quality, and/or ensuring it via data cleaning
Supervising the data acquisition process if more data is needed
Defining validation strategies
Defining the pre-processing or feature engineering to be done on a given dataset
Defining data augmentation pipelines
Training models and tuning their hyperparameters
Analysing the errors of the model and designing strategies to overcome them
Deploying models to production