50+ Hadoop Jobs in India
Experience: 12-15 Years
Key Responsibilities:
- Client Engagement & Requirements Gathering: Independently engage with client stakeholders to understand data landscapes and requirements, translating them into functional and technical specifications.
- Data Architecture & Solution Design: Architect and implement Hadoop-based Cloudera CDP solutions, including data integration, data warehousing, and data lakes.
- Data Processes & Governance: Develop data ingestion and ETL/ELT frameworks, ensuring robust data governance and quality practices.
- Performance Optimization: Provide SQL expertise and optimize Hadoop ecosystems (HDFS, Ozone, Kudu, Spark Streaming, etc.) for maximum performance.
- Coding & Development: Hands-on coding in relevant technologies and frameworks, ensuring project deliverables meet stringent quality and performance standards.
- API & Database Management: Integrate APIs and manage databases (e.g., PostgreSQL, Oracle) to support seamless data flows.
- Leadership & Mentoring: Guide and mentor a team of data engineers and analysts, fostering collaboration and technical excellence.
Skills Required:
- a. Technical Proficiency:
- • Extensive experience with Hadoop ecosystem tools and services (HDFS, YARN, Cloudera Manager, Impala, Kudu, Hive, Spark Streaming, etc.).
- • Proficiency in Spark and in programming languages such as Python and Scala, and a strong grasp of SQL performance tuning.
- • ETL tool expertise (e.g., Informatica, Talend, Apache NiFi) and data modelling knowledge.
- • API integration skills for effective data flow management.
- b. Project Management & Communication:
- • Proven ability to lead large-scale data projects and manage project timelines.
- • Excellent communication, presentation, and critical thinking skills.
- c. Client & Team Leadership:
- • Engage effectively with clients and partners, leading onsite and offshore teams.
ML Data Engineer, Railofy
About the Team & Role:
Railofy is solving the largest problems for Indian Railway passengers (90% of Indians travel by trains) - waitlisted tickets and quality of food in trains. We are a young team of technologists who have made a significant breakthrough in this field using Machine Learning, in order to create a major impact in the lives of Indians.
ML Data Engineers at Railofy lead all the processes from data collection, cleaning, and preprocessing, to managing the infra and deploying models to production. The ideal candidate will be passionate about big data and stay up-to-date with the latest developments in the field.
The Data Science team works on proprietary data sets (tabular) building classical machine learning models & recommendation systems.
What will you get to do here?
- Understanding and building expertise in the domain
- Managing available resources such as hardware, data, and personnel so that deadlines are met
- Exploring and visualizing data to gain an understanding of it, then identifying differences in data distribution that could affect performance when deploying the model in the real world
- Verifying data quality, and/or ensuring it via data cleaning
- Supervising the data acquisition process if more data is needed
- Defining validation strategies
- Defining the preprocessing or feature engineering to be done on a given dataset
- Defining data augmentation pipelines
- Training models and tuning their hyperparameters
- Deploying models to production and monitoring production systems
What qualities are we looking for?
- Proficiency with Python and libraries for machine learning such as scikit-learn, pandas, NumPy, Matplotlib, seaborn and Keras/TensorFlow
- Proficiency with PostgreSQL, Elasticsearch, Kibana, Redis and Big Data technologies
- Proficiency with Cloud computing on AWS using services like EC2, S3, ECS, Lambda and RDS
- Expertise in visualizing and manipulating big datasets
- Ability to select hardware to run an ML model with the required latency at scale
- Computer Science or IT Engineering background with solid understanding of basics of Data Structures and Algorithms
- 5-10 years of data science experience
- You love coding like a hobby, strive to code in a clean and structured manner with time and space complexities in mind and are up for a challenge!
- Motivation to join an early stage startup should go beyond compensation
Location: Mumbai/Remote
Secondary Skills: Streaming, Archiving, AWS/Azure/Cloud
Role:
· Should have strong programming and support experience in Java, J2EE technologies
· Should have good experience in Core Java, JSP, Servlets, JDBC
· Good exposure to Hadoop development (HDFS, MapReduce, Hive, HBase, Spark)
· Should have 2+ years of Java experience and 1+ years of experience in Hadoop
· Should possess good communication skills
About the company
DCB Bank is a new-generation private sector bank with 442 branches across India. It is a scheduled commercial bank regulated by the Reserve Bank of India. DCB Bank’s business segments are Retail banking, Micro SME, SME, mid-Corporate, Agriculture, Government, Public Sector, Indian Banks, Co-operative Banks and Non-Banking Finance Companies.
Job Description
Department: Risk Analytics
CTC: Max 18 Lacs
Grade: Sr Manager/AVP
Experience: Min 4 years of relevant experience
We are looking for a Data Scientist to join our growing team of Data Science experts and manage the processes and people responsible for accurate data collection, processing, modelling, analysis, implementation, and maintenance.
Responsibilities
- Understand, monitor and maintain existing financial scorecards (ML Based) and make changes to the model when required.
- Perform Statistical analysis in R and assist IT team with deployment of ML model and analytical frameworks in Python.
- Should be able to handle multiple tasks and must know how to prioritize the work.
- Lead cross-functional projects using advanced data modelling and analysis techniques to discover insights that will guide strategic decisions and uncover optimization opportunities.
- Develop clear, concise and actionable solutions and recommendations for the client’s business needs; actively explore the client’s business and formulate ideas that can help the client cut costs efficiently or reach growth, revenue or profitability targets faster.
- Build, develop and maintain data models, reporting systems, data automation systems, dashboards and performance metrics that support key business decisions.
- Design and build technical processes to address business issues.
- Oversee the design and delivery of reports and insights that analyse business functions and key operations and performance metrics.
- Manage and optimize processes for data intake, validation, mining, and engineering as well as modelling, visualization, and communication deliverables.
- Communicate results and business impacts of insight initiatives to the Management of the company.
Requirements
- Industry knowledge
- 4 years or more of experience in financial services industry particularly retail credit industry is a must.
- Candidate should have either worked in banking sector (banks/ HFC/ NBFC) or consulting organizations serving these clients.
- Experience in credit risk model building such as application scorecards, behaviour scorecards, and/ or collection scorecards.
- Experience in portfolio monitoring, model monitoring, model calibration
- Knowledge of ECL/ Basel preferred.
- Educational qualification: Advanced degree in finance, mathematics, econometrics, or engineering.
- Technical knowledge: Strong data handling skills in databases such as SQL and Hadoop. Knowledge of data visualization tools, such as SAS VI/Tableau/PowerBI, is preferred.
- Expertise in either R or Python; SAS knowledge will be a plus.
Soft skills:
- Ability to quickly adapt to the analytical tools and development approaches used within DCB Bank
- Ability to multitask, with good communication and teamwork skills.
- Ability to manage day-to-day written and verbal communication with relevant stakeholders.
- Ability to think strategically and make changes to data when required.
Key Responsibilities:
• Install, configure, and maintain Hadoop clusters.
• Monitor cluster performance and ensure high availability.
• Manage Hadoop ecosystem components (HDFS, YARN, Ozone, Spark, Kudu, Hive).
• Perform routine cluster maintenance and troubleshooting.
• Implement and manage security and data governance.
• Monitor systems health and optimize performance.
• Collaborate with cross-functional teams to support big data applications.
• Perform Linux administration tasks and manage system configurations.
• Ensure data integrity and backup procedures.
The Sr. Analytics Engineer would provide technical expertise in needs identification, data modeling, data movement, and transformation mapping (source to target), automation and testing strategies, translating business needs into technical solutions with adherence to established data guidelines and approaches from a business unit or project perspective.
Understands and leverages best-fit technologies (e.g., traditional star schema structures, cloud, Hadoop, NoSQL, etc.) and approaches to address business and environmental challenges.
Provides data understanding and coordinates data-related activities with other data management groups such as master data management, data governance, and metadata management.
Actively participates with other consultants in problem-solving and approach development.
Responsibilities :
Provide a consultative approach with business users, asking questions to understand the business need and deriving the data flow, conceptual, logical, and physical data models based on those needs.
Perform data analysis to validate data models and to confirm the ability to meet business needs.
Assist with and support setting the data architecture direction, ensuring data architecture deliverables are developed, ensuring compliance to standards and guidelines, implementing the data architecture, and supporting technical developers at a project or business unit level.
Coordinate and consult with the Data Architect, project manager, client business staff, client technical staff and project developers in data architecture best practices and anything else that is data related at the project or business unit levels.
Work closely with Business Analysts and Solution Architects to design the data model satisfying the business needs and adhering to Enterprise Architecture.
Coordinate with Data Architects, Program Managers and participate in recurring meetings.
Help and mentor team members to understand the data model and subject areas.
Ensure that the team adheres to best practices and guidelines.
Requirements :
- At least 3 years of strong working knowledge of Spark, Java/Scala/PySpark, Kafka, Git, Unix/Linux, and ETL pipeline design.
- Experience with Spark optimization/tuning/resource allocations
- Excellent understanding of in-memory distributed computing frameworks like Spark, including parameter tuning and writing optimized workflow sequences.
- Experience with relational databases (e.g., PostgreSQL, MySQL), cloud data warehouses (e.g., Redshift, BigQuery), and NoSQL databases (e.g., Cassandra).
- Familiarity with Docker, Kubernetes, Azure Data Lake/Blob storage, AWS S3, Google Cloud storage, etc.
- Have a deep understanding of the various stacks and components of the Big Data ecosystem.
- Hands-on experience with Python is a huge plus
TVARIT GmbH develops and delivers artificial intelligence (AI) solutions for the manufacturing, automotive, and process industries. With its software products, TVARIT enables its customers to make intelligent, well-founded decisions, e.g., in predictive maintenance, increasing OEE, and predictive quality. Renowned reference customers, competent technology, a strong research team from renowned universities, and a renowned AI prize (e.g., EU Horizon 2020) make TVARIT one of the most innovative AI companies in Germany and Europe.
We are looking for a self-motivated person with a positive "can-do" attitude and excellent oral and written communication skills in English.
We are seeking a skilled and motivated Senior Data Engineer from the manufacturing industry with over four years of experience to join our team. The Senior Data Engineer will oversee the department’s data infrastructure, including developing a data model, integrating large amounts of data from different systems, building and enhancing a data lakehouse and the subsequent analytics environment, and writing scripts to facilitate data analysis. The ideal candidate will have a strong foundation in ETL pipelines and Python; additional experience in Azure and Terraform is a plus. This role requires a proactive individual who can contribute to our data infrastructure and support our analytics and data science initiatives.
Skills Required:
- Experience in the manufacturing industry (metal industry is a plus)
- 4+ years of experience as a Data Engineer
- Experience in data cleaning & structuring and data manipulation
- Architect and optimize complex data pipelines, leading the design and implementation of scalable data infrastructure, and ensuring data quality and reliability at scale
- ETL Pipelines: Proven experience in designing, building, and maintaining ETL pipelines.
- Python: Strong proficiency in Python programming for data manipulation, transformation, and automation.
- Experience in SQL and data structures
- Knowledge in big data technologies such as Spark, Flink, Hadoop, Apache, and NoSQL databases.
- Knowledge of cloud technologies (at least one) such as AWS, Azure, and Google Cloud Platform.
- Proficient in data management and data governance
- Strong analytical experience & skills that can extract actionable insights from raw data to help improve the business.
- Strong analytical and problem-solving skills.
- Excellent communication and teamwork abilities.
Nice To Have:
- Azure: Experience with Azure data services (e.g., Azure Data Factory, Azure Databricks, Azure SQL Database).
- Terraform: Knowledge of Terraform for infrastructure as code (IaC) to manage cloud infrastructure.
- Bachelor’s degree in computer science, Information Technology, Engineering, or a related field from top-tier Indian Institutes of Information Technology (IIITs).
Benefits And Perks
- A culture that fosters innovation, creativity, continuous learning, and resilience
- Progressive leave policy promoting work-life balance
- Mentorship opportunities with highly qualified internal resources and industry-driven programs
- Multicultural peer groups and supportive workplace policies
- Annual workcation program allowing you to work from various scenic locations
- Experience the unique environment of a dynamic start-up
Why should you join TVARIT?
Working at TVARIT, a deep-tech German IT startup, offers a unique blend of innovation, collaboration, and growth opportunities. We seek individuals eager to adapt and thrive in a rapidly evolving environment.
If this opportunity excites you and aligns with your career aspirations, we encourage you to apply today!
- Architectural Leadership:
- Design and architect robust, scalable, and high-performance Hadoop solutions.
- Define and implement data architecture strategies, standards, and processes.
- Collaborate with senior leadership to align data strategies with business goals.
- Technical Expertise:
- Develop and maintain complex data processing systems using Hadoop and its ecosystem (HDFS, YARN, MapReduce, Hive, HBase, Pig, etc.).
- Ensure optimal performance and scalability of Hadoop clusters.
- Oversee the integration of Hadoop solutions with existing data systems and third-party applications.
- Strategic Planning:
- Develop long-term plans for data architecture, considering emerging technologies and future trends.
- Evaluate and recommend new technologies and tools to enhance the Hadoop ecosystem.
- Lead the adoption of big data best practices and methodologies.
- Team Leadership and Collaboration:
- Mentor and guide data engineers and developers, fostering a culture of continuous improvement.
- Work closely with data scientists, analysts, and other stakeholders to understand requirements and deliver high-quality solutions.
- Ensure effective communication and collaboration across all teams involved in data projects.
- Project Management:
- Lead large-scale data projects from inception to completion, ensuring timely delivery and high quality.
- Manage project resources, budgets, and timelines effectively.
- Monitor project progress and address any issues or risks promptly.
- Data Governance and Security:
- Implement robust data governance policies and procedures to ensure data quality and compliance.
- Ensure data security and privacy by implementing appropriate measures and controls.
- Conduct regular audits and reviews of data systems to ensure compliance with industry standards and regulations.
Responsibilities:
• Build customer-facing solutions for the Data Observability product to monitor data pipelines.
• Work on POCs to build new data pipeline monitoring capabilities.
• Build next-generation scalable, reliable, flexible, high-performance data pipeline capabilities for ingesting data from multiple sources containing complex datasets.
• Continuously improve the services you own, making them more performant and using resources as efficiently as possible.
• Collaborate closely with the engineering, data science, and product teams to propose an optimal solution for a given problem statement.
• Work closely with the DevOps team on performance monitoring and MLOps.
Required Skills:
• 3+ Years of Data related technology experience.
• Good understanding of distributed computing principles
• Experience in Apache Spark
• Hands on programming with Python
• Knowledge of Hadoop v2, Map Reduce, HDFS
• Experience with building stream-processing systems, using technologies such as Apache Storm, Spark-Streaming or Flink
• Experience with messaging systems, such as Kafka or RabbitMQ
• Good understanding of Big Data querying tools, such as Hive
• Experience with integration of data from multiple data sources
• Good understanding of SQL queries, joins, stored procedures, relational schemas
• Experience with NoSQL databases, such as HBase, Cassandra/Scylla, MongoDB
• Knowledge of ETL techniques and frameworks
• Performance tuning of Spark Jobs
• General understanding of Data Quality is a plus point
• Experience with Databricks, Snowflake, BigQuery, or similar lakehouse platforms would be a big plus
• Nice to have some knowledge in DevOps
Job Description: Data Engineer
Experience: Over 4 years
Responsibilities:
- Design, develop, and maintain scalable data pipelines for efficient data extraction, transformation, and loading (ETL) processes.
- Architect and implement data storage solutions, including data warehouses, data lakes, and data marts, aligned with business needs.
- Implement robust data quality checks and data cleansing techniques to ensure data accuracy and consistency.
- Optimize data pipelines for performance, scalability, and cost-effectiveness.
- Collaborate with data analysts and data scientists to understand data requirements and translate them into technical solutions.
- Develop and maintain data security measures to ensure data privacy and regulatory compliance.
- Automate data processing tasks using scripting languages (Python, Bash) and big data frameworks (Spark, Hadoop).
- Monitor data pipelines and infrastructure for performance and troubleshoot any issues.
- Stay up to date with the latest trends and technologies in data engineering, including cloud platforms (AWS, Azure, GCP).
- Document data pipelines, processes, and data models for maintainability and knowledge sharing.
- Contribute to the overall data governance strategy and best practices.
Qualifications:
- Strong understanding of data architectures, data modelling principles, and ETL processes.
- Proficiency in SQL (e.g., MySQL, PostgreSQL) and experience with big data querying languages (e.g., Hive, Spark SQL).
- Experience with scripting languages (Python, Bash) for data manipulation and automation.
- Experience with distributed data processing frameworks (Spark, Hadoop) (preferred).
- Familiarity with cloud platforms (AWS, Azure, GCP) for data storage and processing (a plus).
- Experience with data quality tools and techniques.
- Excellent problem-solving, analytical, and critical thinking skills.
- Strong communication, collaboration, and teamwork abilities.
Must have skills
3 to 6 years
Data Science
SQL, Excel, BigQuery - mandatory, 3+ years
Python/ML, Hadoop, Spark - 2+ years
Requirements
• 3+ years prior experience as a data analyst
• Detail oriented, structural thinking and analytical mindset.
• Proven analytic skills, including data analysis and data validation.
• Technical writing experience in relevant areas, including queries, reports, and presentations.
• Strong SQL and Excel skills with the ability to learn other analytic tools
• Good communication skills (being precise and clear)
• Good to have prior knowledge of python and ML algorithms
Sigmoid works with a variety of clients, from start-ups to Fortune 500 companies. We are looking for a detail-oriented self-starter to assist our engineering and analytics teams in various roles as a Software Development Engineer.
This position will be part of a growing team working towards building world-class, large-scale Big Data architectures. This individual should have a sound understanding of programming principles and experience programming in Java, Python, or similar languages, and can expect to spend a majority of their time coding.
Location - Bengaluru and Hyderabad
Responsibilities:
● Good development practices
○ Hands-on coder with good experience in programming languages like Java or Python.
○ Hands-on experience with the Big Data stack, such as PySpark, HBase, Hadoop, MapReduce, and Elasticsearch.
○ Good understanding of programming principles and development practices like check-in policy, unit testing, and code deployment.
○ Self-starter able to grasp new concepts and technologies and translate them into large-scale engineering developments.
○ Excellent experience in Application development and support, integration development and data management.
● Align Sigmoid with key Client initiatives
○ Interface daily with customers across leading Fortune 500 companies to understand strategic requirements
● Stay up-to-date on the latest technology to ensure the greatest ROI for customer & Sigmoid
○ Hands-on coder with a good understanding of enterprise-level code
○ Design and implement APIs, abstractions and integration patterns to solve challenging distributed computing problems
○ Experience in defining technical requirements, data extraction, data transformation, automating jobs, productionizing jobs, and exploring new big data technologies within a parallel processing environment
● Culture
○ Must be a strategic thinker with the ability to think unconventionally / out of the box.
○ Analytical and data driven orientation.
○ Raw intellect, talent and energy are critical.
○ Entrepreneurial and Agile : understands the demands of a private, high growth company.
○ Ability to be both a leader and hands on "doer".
Qualifications:
- Years of track record of relevant work experience and a degree in Computer Science or a related technical discipline is required
- Experience with functional and object-oriented programming; Java is a must.
- Hands-on knowledge of MapReduce, Hadoop, PySpark, HBase and Elasticsearch.
- Effective communication skills (both written and verbal)
- Ability to collaborate with a diverse set of engineers, data scientists and product managers
- Comfort in a fast-paced start-up environment
Preferred Qualification:
- Technical knowledge of MapReduce, Hadoop and the GCS stack is a plus.
- Experience in agile methodology
- Experience with database modeling and development, data mining and warehousing.
- Experience in architecture and delivery of Enterprise scale applications and capable in developing framework, design patterns etc. Should be able to understand and tackle technical challenges, propose comprehensive solutions and guide junior staff
- Experience working with large, complex data sets from a variety of sources
Job Title Big Data Developer
Job Description
Bachelor's degree in Engineering or Computer Science or equivalent OR Master's in Computer Applications or equivalent.
Solid software development experience, including leading teams of engineers and scrum teams.
4+ years of hands-on experience working with MapReduce, Hive, Spark (core, SQL and PySpark).
Solid data warehousing concepts.
Knowledge of the financial reporting ecosystem will be a plus.
4+ years of experience in Data Engineering/Data Warehousing using Big Data technologies will be an added advantage.
Expertise in the distributed ecosystem.
Hands-on experience with programming using Core Java or Python/Scala
Expertise in Hadoop and Spark architecture and their working principles
Hands-on experience writing and understanding complex SQL (Hive/PySpark DataFrames) and optimizing joins while processing large volumes of data.
Experience in UNIX shell scripting.
Roles & Responsibilities
Ability to design and develop optimized Data pipelines for batch and real time data processing
Should have experience in analysis, design, development, testing, and implementation of system applications
Demonstrated ability to develop and document technical and functional specifications and analyze software and system processing flows.
Excellent technical and analytical aptitude
Good communication skills.
Excellent Project management skills.
Results driven Approach.
Mandatory Skills: Big Data, PySpark, Hive
Looking for freelance?
We are seeking a freelance Data Engineer with 7+ years of experience
Skills Required: Deep knowledge of any cloud (AWS, Azure, Google Cloud), Databricks, data lakes, data warehousing, Python/Scala, SQL, BI, and other analytics systems
What we are looking for
We are seeking an experienced Senior Data Engineer with experience in architecture, design, and development of highly scalable data integration and data engineering processes
- The Senior Consultant must have a strong understanding and experience with data & analytics solution architecture, including data warehousing, data lakes, ETL/ELT workload patterns, and related BI & analytics systems
- Strong in scripting languages like Python, Scala
- 5+ years of hands-on experience with one or more of these data integration/ETL tools.
- Experience building on-prem data warehousing solutions.
- Experience with designing and developing ETLs, Data Marts, Star Schema
- Designing a data warehouse solution using Synapse or Azure SQL DB
- Experience building pipelines using Synapse or Azure Data Factory to ingest data from various sources
- Understanding of integration run times available in Azure.
- Advanced working SQL knowledge and experience working with relational databases and query authoring (SQL), as well as working familiarity with a variety of databases
Experience:
Should have a minimum of 10-12 years of Experience.
Should have Product Development/Maintenance/Production Support experience in a support organization
Should have a good understanding of the services business for Fortune 1000 companies from the operations point of view
Ability to read, understand and communicate complex technical information
Ability to express ideas in an organized, articulate and concise manner
Ability to face stressful situations with a positive attitude
Any certification related to support services will be an added advantage
Education: BE, B.Tech (CS), MCA
Location: India
Primary Skills:
Hands-on experience with the OpenStack framework. Ability to set up a private cloud using an OpenStack environment. Awareness of various OpenStack services and modules
Strong experience with OpenStack services like Neutron, Cinder, Keystone, etc.
Proficiency in programming languages such as Python, Ruby, or Go.
Strong knowledge of Linux systems administration and networking.
Familiarity with virtualization technologies like KVM or VMware.
Experience with configuration management and IaC tools like Ansible, Terraform.
Subject matter expertise in OpenStack security
Solid experience with Linux and shell scripting
Sound knowledge of cloud computing concepts & technologies, such as Docker, Kubernetes, AWS, GCP, Azure etc.
Ability to configure the OpenStack environment for optimum resource usage
Good knowledge of security and operations in an OpenStack environment
Strong knowledge of Linux internals, networking, storage, security
Strong knowledge of VMware Enterprise products (ESX, vCenter)
Hands on experience with HEAT orchestration
Experience with CI/CD, monitoring, operational aspects
Strong experience working with REST APIs and JSON
Exposure to Big Data technologies (messaging queues, Hadoop/MPP, NoSQL databases)
Hands on experience with open source monitoring tools like Grafana/Prometheus/Nagios/Ganglia/Zabbix etc.
Strong verbal and written communication skills are mandatory
Excellent analytical and problem solving skills are mandatory
Role & Responsibilities
Advise customers and colleagues on cloud and virtualization topics
Work with the architecture team on cloud design projects using OpenStack
Collaborate with product, customer success, and presales on customer projects
Participate in onsite assessments and workshops when requested
Provide subject matter expertise and mentor colleagues
Set up OpenStack environments for projects
Design, deploy, and maintain OpenStack infrastructure.
Collaborate with cross-functional chapters to integrate OpenStack with other services (k8s, DBaaS)
Develop automation scripts and tools to streamline OpenStack operations.
Troubleshoot and resolve issues related to OpenStack services.
Monitor and optimize the performance and scalability of OpenStack components.
Stay updated with the latest OpenStack releases and contribute to the OpenStack community.
Work closely with Architects and Product Management to understand requirements
Should be capable of working independently and be responsible for end-to-end implementation
Should work with complete ownership and handle all issues without missing SLAs
Work closely with engineering team and support team
Should be able to debug the issues and report appropriately in the ticketing system
Contribute to improve the efficiency of the assignment by quality improvements & innovative suggestions
Should be able to debug/create scripts for automation
Should be able to configure monitoring utilities & set up alerts
Should be hands on in setting up OS, applications, databases and have passion to learn new technologies
Should be able to scan logs, errors and exceptions and get to the root cause of the issue
Contribute to developing a knowledge base in collaboration with other team members
Maintain customer loyalty through Integrity and accountability
Groom and mentor team members on project technologies and work
- Big data developer with 8+ years of professional IT experience with expertise in Hadoop ecosystem components in ingestion, Data modeling, querying, processing, storage, analysis, Data Integration and Implementing enterprise level systems spanning Big Data.
- A skilled developer with strong problem solving, debugging and analytical capabilities, who actively engages in understanding customer requirements.
- Expertise in Apache Hadoop ecosystem components like Spark, Hadoop Distributed File System (HDFS), MapReduce, Hive, Sqoop, HBase, Zookeeper, YARN, Flume, Pig, Nifi, Scala and Oozie.
- Hands on experience in creating real-time data streaming solutions using Apache Spark core, Spark SQL & DataFrames, Kafka, Spark streaming and Apache Storm.
- Excellent knowledge of Hadoop architecture and the daemons of Hadoop clusters, which include NameNode, DataNode, ResourceManager, NodeManager and Job History Server.
- Worked on both Cloudera and Hortonworks Hadoop distributions. Experience in managing Hadoop clusters using the Cloudera Manager tool.
- Well versed in installation, Configuration, Managing of Big Data and underlying infrastructure of Hadoop Cluster.
- Hands on experience in coding MapReduce/Yarn Programs using Java, Scala and Python for analyzing Big Data.
- Exposure to Cloudera development environment and management using Cloudera Manager.
- Extensively worked on Spark using Scala on a cluster for computational analytics, installed it on top of Hadoop, and built advanced analytical applications using Spark with Hive and SQL/Oracle.
- Implemented Spark using Python, utilizing DataFrames and the Spark SQL API for faster processing of data; handled importing data from different data sources into HDFS using Sqoop, performing transformations using Hive and MapReduce, and then loading data into HDFS.
- Used Spark Data Frames API over Cloudera platform to perform analytics on Hive data.
- Hands on experience with Spark MLlib, used for predictive intelligence, customer segmentation and smooth maintenance in Spark streaming.
- Experience in using Flume to load log files into HDFS and Oozie for workflow design and scheduling.
- Experience in optimizing MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Working on creating data pipelines for different events of ingestion and aggregation, and loading consumer response data into Hive external tables in an HDFS location to serve as a feed for Tableau dashboards.
- Hands on experience in using Sqoop to import data into HDFS from RDBMS and vice-versa.
- In-depth Understanding of Oozie to schedule all Hive/Sqoop/HBase jobs.
- Hands on expertise in real time analytics with Apache Spark.
- Experience in converting Hive/SQL queries into RDD transformations using Apache Spark, Scala and Python.
- Extensive experience in working with different ETL tool environments like SSIS, Informatica and reporting tool environments like SQL Server Reporting Services (SSRS).
- Experience in Microsoft cloud and in setting up clusters in Amazon EC2 & S3, including automating the setup and extension of clusters in the AWS cloud.
- Extensively worked on Spark using Python on a cluster for computational analytics, installed it on top of Hadoop, and built advanced analytical applications using Spark with Hive and SQL.
- Strong experience and knowledge of real time data analytics using Spark Streaming, Kafka and Flume.
- Knowledge in installation, configuration, supporting and managing Hadoop Clusters using Apache, Cloudera (CDH3, CDH4) distributions and on Amazon web services (AWS).
- Experienced in writing Ad Hoc queries using Cloudera Impala, also used Impala analytical functions.
- Experience in creating Data frames using PySpark and performing operation on the Data frames using Python.
- In depth understanding/knowledge of Hadoop Architecture and various components such as HDFS and MapReduce Programming Paradigm, High Availability and YARN architecture.
- Establishing multiple connections to different Redshift clusters (Bank Prod, Card Prod, SBBDA Cluster) and provide the access for pulling the information we need for analysis.
- Generated various kinds of knowledge reports using Power BI based on Business specification.
- Developed interactive Tableau dashboards to provide a clear understanding of industry specific KPIs using quick filters and parameters to handle them more efficiently.
- Well experienced in projects using JIRA, testing, and the Maven and Jenkins build tools.
- Experienced in designing, building, deploying and utilizing almost all of the AWS stack (including EC2 and S3), focusing on high availability, fault tolerance, and auto-scaling.
- Good experience with use-case development, with Software methodologies like Agile and Waterfall.
- Working knowledge of Amazon's Elastic Compute Cloud (EC2) infrastructure for computational tasks and Simple Storage Service (S3) as a storage mechanism.
- Good working experience in importing data using Sqoop and SFTP from various sources like RDBMS, Teradata, Mainframes, Oracle and Netezza to HDFS, and performing transformations on it using Hive, Pig and Spark.
- Extensive experience in Text Analytics, developing different Statistical Machine Learning solutions to various business problems and generating data visualizations using Python and R.
- Proficient in NoSQL databases including HBase, Cassandra and MongoDB, and their integration with Hadoop clusters.
- Hands on experience in Hadoop Big data technology working on MapReduce, Pig, Hive as Analysis tool, Sqoop and Flume data import/export tools.
Job Title: Data Engineer
Job Summary: As a Data Engineer, you will be responsible for designing, building, and maintaining the infrastructure and tools necessary for data collection, storage, processing, and analysis. You will work closely with data scientists and analysts to ensure that data is available, accessible, and in a format that can be easily consumed for business insights.
Responsibilities:
- Design, build, and maintain data pipelines to collect, store, and process data from various sources.
- Create and manage data warehousing and data lake solutions.
- Develop and maintain data processing and data integration tools.
- Collaborate with data scientists and analysts to design and implement data models and algorithms for data analysis.
- Optimize and scale existing data infrastructure to ensure it meets the needs of the business.
- Ensure data quality and integrity across all data sources.
- Develop and implement best practices for data governance, security, and privacy.
- Monitor data pipeline performance / Errors and troubleshoot issues as needed.
- Stay up-to-date with emerging data technologies and best practices.
Requirements:
Bachelor's degree in Computer Science, Information Systems, or a related field.
Experience with ETL tools like Matillion, SSIS, Informatica
Experience with SQL and relational databases such as SQL server, MySQL, PostgreSQL, or Oracle.
Experience in writing complex SQL queries
Strong programming skills in languages such as Python, Java, or Scala.
Experience with data modeling, data warehousing, and data integration.
Strong problem-solving skills and ability to work independently.
Excellent communication and collaboration skills.
Familiarity with big data technologies such as Hadoop, Spark, or Kafka.
Familiarity with data warehouse/Data lake technologies like Snowflake or Databricks
Familiarity with cloud computing platforms such as AWS, Azure, or GCP.
Familiarity with Reporting tools
Teamwork/ growth contribution
- Helping the team in taking the Interviews and identifying right candidates
- Adhering to timelines
- Timely status communication and upfront communication of any risks
- Teach, train, share knowledge with peers.
- Good Communication skills
- Proven abilities to take initiative and be innovative
- Analytical mind with a problem-solving aptitude
Good to have :
Master's degree in Computer Science, Information Systems, or a related field.
Experience with NoSQL databases such as MongoDB or Cassandra.
Familiarity with data visualization and business intelligence tools such as Tableau or Power BI.
Knowledge of machine learning and statistical modeling techniques.
If you are passionate about data and want to work with a dynamic team of data scientists and analysts, we encourage you to apply for this position.
- KSQL
- Data Engineering spectrum (Java/Spark)
- Spark Scala / Kafka Streaming
- Confluent Kafka components
- Basic understanding of Hadoop
Role: Principal Software Engineer
We are looking for a passionate Principal Engineer - Analytics to build data products that extract valuable business insights for efficiency and customer experience. This role will require managing, processing and analyzing large amounts of raw information in scalable databases. It will also involve developing unique data structures and writing algorithms for an entirely new set of products. The candidate will be required to have critical thinking and problem-solving skills, must be experienced in software development with advanced algorithms, and must be able to handle large volumes of data. Exposure to statistics and machine learning algorithms is a big plus. The candidate should have some exposure to cloud environments, continuous integration and agile scrum processes.
Responsibilities:
• Lead projects both as a principal investigator and project manager, responsible for meeting project requirements on schedule
• Software Development that creates data driven intelligence in the products which deals with Big Data backends
• Exploratory analysis of the data to be able to come up with efficient data structures and algorithms for given requirements
• The system may or may not involve machine learning models and pipelines but will require advanced algorithm development
• Managing data in large-scale data stores (such as NoSQL DBs, time series DBs, Geospatial DBs etc.)
• Creating metrics and evaluation of algorithm for better accuracy and recall
• Ensuring efficient access and usage of data through the means of indexing, clustering etc.
• Collaborate with engineering and product development teams.
Requirements:
• Master’s or Bachelor’s degree in Engineering in one of these domains - Computer Science, Information Technology, Information Systems, or related field from top-tier school
• OR Master’s degree or higher in Statistics, Mathematics, with hands on background in software development.
• 8 to 10 years of experience in product development, including algorithmic work
• 5+ years of experience working with large data sets or doing large-scale quantitative analysis
• Understanding of SaaS based products and services.
• Strong algorithmic problem-solving skills
• Able to mentor and manage team and take responsibilities of team deadline.
Skill set required:
• In-depth knowledge of the Python programming language
• Understanding of software architecture and software design
• Must have fully managed a project with a team
• Having worked with Agile project management practices
• Experience with data processing analytics and visualization tools in Python (such as pandas, matplotlib, Scipy, etc.)
• Strong understanding of SQL and of querying NoSQL databases (e.g. MongoDB, Cassandra, Redis)
About Telstra
Telstra is Australia’s leading telecommunications and technology company, with operations in more than 20 countries, including India, where we’re building a new Innovation and Capability Centre (ICC) in Bangalore.
We’re growing, fast, and for you that means many exciting opportunities to develop your career at Telstra. Join us on this exciting journey, and together, we’ll reimagine the future.
Why Telstra?
- We're an iconic Australian company with a rich heritage that's been built over 100 years. Telstra is Australia's leading Telecommunications and Technology Company. We've been operating internationally for more than 70 years.
- International presence spanning over 20 countries.
- We are one of the 20 largest telecommunications providers globally
- At Telstra, the work is complex and stimulating, but with that comes a great sense of achievement. We are shaping tomorrow's modes of communication with our innovation-driven teams.
Telstra offers an opportunity to make a difference to lives of millions of people by providing the choice of flexibility in work and a rewarding career that you will be proud of!
About the team
Being part of Networks & IT means you'll be part of a team that focuses on extending our network superiority to enable the continued execution of our digital strategy.
With us, you'll be working with world-leading technology and change the way we do IT to ensure business needs drive priorities, accelerating our digitisation programme.
Focus of the role
New engineers joining the data chapter will mostly develop reusable data processing and storage frameworks that can be used across the data platform.
About you
To be successful in the role, you'll bring skills and experience in:-
Essential
- Hands-on experience in Spark Core, Spark SQL, SQL/Hive/Impala, Git/SVN/Any other VCS and Data warehousing
- Skilled in the Hadoop Ecosystem (HDP/Cloudera/MapR/EMR etc.)
- Azure data factory/Airflow/control-M/Luigi
- PL/SQL
- Exposure to NoSQL (HBase/Cassandra/GraphDB (Neo4J)/MongoDB)
- File formats (Parquet/ORC/AVRO/Delta/Hudi etc.)
- Kafka/Kinesis/Eventhub
Highly Desirable
Experience and knowledgeable on the following:
- Spark Streaming
- Cloud exposure (Azure/AWS/GCP)
- Azure data offerings - ADF, ADLS2, Azure Databricks, Azure Synapse, Eventhubs, CosmosDB etc.
- Presto/Athena
- Azure DevOps
- Jenkins/ Bamboo/Any similar build tools
- Power BI
- Prior experience in building, or working in a team building, reusable frameworks.
- Data modelling.
- Data Architecture and design principles. (Delta/Kappa/Lambda architecture)
- Exposure to CI/CD
- Code Quality - Static and Dynamic code scans
- Agile SDLC
If you've got a passion to innovate, succeed as part of a great team, and looking for the next step in your career, we'd welcome you to apply!
___________________________
We’re committed to building a diverse and inclusive workforce in all its forms. We encourage applicants from diverse gender, cultural and linguistic backgrounds and applicants who may be living with a disability. We also offer flexibility in all our roles, to ensure everyone can participate.
To learn more about how we support our people, including accessibility adjustments we can provide you through the recruitment process, visit tel.st/thrive.
Job Responsibilities:
- Identify valuable data sources and automate collection processes
- Undertake preprocessing of structured and unstructured data.
- Analyze large amounts of information to discover trends and patterns
- Helping develop reports and analysis.
- Present information using data visualization techniques.
- Assessing tests and implementing new or upgraded software and assisting with strategic decisions on new systems.
- Evaluating changes and updates to source production systems.
- Develop, implement, and maintain leading-edge analytic systems, taking complicated problems and building simple frameworks
- Providing technical expertise in data storage structures, data mining, and data cleansing.
- Propose solutions and strategies to business challenges
Desired Skills and Experience:
- At least 1 year of experience in Data Analysis
- Complete understanding of Operations Research, Data Modelling, ML, and AI concepts.
- Knowledge of Python is mandatory, familiarity with MySQL, SQL, Scala, Java or C++ is an asset
- Experience using visualization tools (e.g. Jupyter Notebook) and data frameworks (e.g. Hadoop)
- Analytical mind and business acumen
- Strong math skills (e.g. statistics, algebra)
- Problem-solving aptitude
- Excellent communication and presentation skills.
- Bachelor’s / Master's Degree in Computer Science, Engineering, Data Science or other quantitative or relevant field is preferred
one of the world's leading multinational investment bank
good exposure to concepts and/or technology across the broader spectrum. Enterprise Risk Technology
covers a variety of existing systems and green-field projects.
A Full stack Hadoop development experience with Scala development
A Full stack Java development experience covering Core Java (including JDK 1.8) and good understanding
of design patterns.
Requirements:-
• Strong hands-on development in Java technologies.
• Strong hands-on development in Hadoop technologies like Spark, Scala and experience on Avro.
• Participation in product feature design and documentation
• Requirement break-up, ownership and implementation.
• Product BAU deliveries and Level 3 production defects fixes.
Qualifications & Experience
• Degree holder in numerate subject
• Hands on Experience on Hadoop, Spark, Scala, Impala, Avro and messaging like Kafka
• Experience across a core compiled language – Java
• Proficiency in Java-related frameworks like Spring, Hibernate, JPA
• Hands on experience in JDK 1.8 and strong skillset covering Collections, Multithreading with
experience working on Distributed applications.
• Strong hands-on development track record with end-to-end development cycle involvement
• Good exposure to computational concepts
• Good communication and interpersonal skills
• Working knowledge of risk and derivatives pricing (optional)
• Proficiency in SQL (PL/SQL), data modelling.
• Understanding of Hadoop architecture and Scala program language is a good to have.
Multinational Company providing energy & Automation digital
Roles and Responsibilities
Skills
at Play Games24x7
• B. E. /B. Tech. in Computer Science or MCA from a reputed university.
• 3.5 plus years of experience in software development, with emphasis on JAVA/J2EE Server side
programming.
• Hands on experience in core Java, multithreading, RMI, socket programming, JDBC, NIO, webservices
and design patterns.
• Knowledge of distributed system, distributed caching, messaging frameworks, ESB etc.
• Experience in Linux operating system and PostgreSQL/MySQL/MongoDB/Cassandra database.
• Additionally, knowledge of HBase, Hadoop and Hive is desirable.
• Familiarity with message queue systems and AMQP and Kafka is desirable.
• Experience as a participant in agile methodologies.
• Excellent written and verbal communication skills and presentation skills.
• This is not a fullstack requirement, we are looking for a purely backend expert.
Duration : Full Time
Location : Vishakhapatnam, Bangalore, Chennai
years of experience : 3+ years
Job Description :
- 3+ Years of working as a Data Engineer with thorough understanding of data frameworks that collect, manage, transform and store data that can derive business insights.
- Strong communications (written and verbal) along with being a good team player.
- 2+ years of experience within the Big Data ecosystem (Hadoop, Sqoop, Hive, Spark, Pig, etc.)
- 2+ years of strong experience with SQL and Python (Data Engineering focused).
- Experience with GCP Data Services such as BigQuery, Dataflow, Dataproc, etc. is an added advantage and preferred.
- Any prior experience in ETL tools such as DataStage, Informatica, DBT, Talend, etc. is an added advantage for the role.
Urgent Openings with one of our clients
Experience : 3 to 7 Years
Number of Positions : 20
Job Location : Hyderabad
Notice : 30 Days
1. Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> QuickSight
2. Experience in developing lambda functions with AWS Lambda
3. Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark
4. Should be able to code in Python and Scala.
5. Snowflake experience will be a plus
Hadoop and Hive are good to have; a basic understanding is enough.
Data Engineer
- Highly skilled and proficient in the Azure Data Engineering tech stack (ADF, Databricks)
- Should be well experienced in the design and development of Big Data integration platforms (Kafka, Hadoop).
- Highly skilled and experienced in building medium to complex data integration pipelines for data at rest and streaming data using Spark.
- Strong knowledge of R/Python.
- Advanced proficiency in solution design and implementation through Azure Data Lake, SQL and NoSQL databases.
- Strong in Data Warehousing concepts
- Expertise in SQL, SQL tuning, Data Management (Data Security), schema design, Python and ETL processes
- Highly motivated self-starter and quick learner
- Must have good knowledge of data modelling and understanding of data analytics
- Exposure to statistical procedures, experiments and machine learning techniques is an added advantage.
- Experience in leading a small team of 6/7 Data Engineers.
- Excellent written and verbal communication skills
WHAT YOU WILL DO:
● Create and maintain optimal data pipeline architecture.
● Assemble large, complex data sets that meet functional / non-functional business requirements.
● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
● Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Spark, Hadoop and AWS 'big data' technologies (EC2, EMR, S3, Athena).
● Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
● Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
● Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
● Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
● Work with data and analytics experts to strive for greater functionality in our data systems.
REQUIRED SKILLS & QUALIFICATIONS:
● 5+ years of experience in a Data Engineer role.
● Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
● Experience building and optimizing 'big data' data pipelines, architectures and data sets.
● Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
● Strong analytic skills related to working with unstructured datasets.
● Build processes supporting data transformation, data structures, metadata, dependency and workload management.
● A successful history of manipulating, processing and extracting value from large disconnected datasets.
● Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores.
● Strong project management and organizational skills.
● Experience supporting and working with cross-functional teams in a dynamic environment
● Experience with big data tools: Hadoop, Spark, Pig, Vertica, etc.
● Experience with AWS cloud services: EC2, EMR, S3, Athena
● Experience with Linux
● Experience with object-oriented/object function scripting languages: Python, Java, Shell, Scala, etc.
PREFERRED SKILLS & QUALIFICATIONS:
● Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
Our company is seeking to hire a skilled software developer to help with the development of our AI/ML platform.
Your duties will primarily revolve around building Platform by writing code in Scala, as well as modifying platform
to fix errors, work on distributed computing, adapt it to new cloud services, improve its performance, or upgrade
interfaces. To be successful in this role, you will need extensive knowledge of programming languages and the
software development life-cycle.
Responsibilities:
Analyze, design, develop, troubleshoot and debug the platform.
Write code, guide other team members on best practices, and perform testing and debugging of
applications.
Specify, design and implement minor changes to existing software architecture. Build highly complex
enhancements and resolve complex bugs. Build and execute unit tests and unit plans.
Duties and tasks are varied and complex, needing independent judgment. Fully competent in own area of
expertise
Experience:
The candidate should have about 2+ years of experience with design and development in Java/Scala. Experience in
algorithm, Distributed System, Data-structure, database and architectures of distributed System is mandatory.
Required Skills:
1. In-depth knowledge of Hadoop, Spark architecture and its components such as HDFS, YARN and executor, cores and memory param
2. Knowledge of Scala/Java.
3. Extensive experience in developing Spark jobs. Should possess good OOP knowledge and be aware of
enterprise application design patterns.
4. Good knowledge of Unix/Linux.
5. Experience working on large-scale software projects
6. Keep an eye out for technological trends, open-source projects that can be used.
7. Knows common programming languages and frameworks
Number Theory is looking for an experienced software/data engineer who will focus on owning and rearchitecting dynamic pricing engineering systems.
Job Responsibilities:
Evaluate and recommend Big Data technology stack best suited for NT AI at scale Platform
and other products
Lead the team for defining proper Big Data Architecture Design.
Design and implement features on NT AI at scale platform using Spark and other Hadoop
Stack components.
Drive significant technology initiatives end to end and across multiple layers of architecture
Provides strong technical leadership in adopting and contributing to open source technologies related to Big Data across multiple engagements
Designing /architecting complex, highly available, distributed, failsafe compute systems dealing with considerable scalable amount of data
Identify and work upon incorporating Non-functional requirements into the solution (Performance, scalability, monitoring etc.)
Requirements:
A successful candidate will have 8+ years of experience implementing a high-end software product.
Provides technical leadership in Big Data space (Spark and Hadoop Stack like MapReduce,
HDFS, Hive, HBase, Flume, Sqoop etc. NoSQL stores like Cassandra, HBase etc) across
Engagements and contributes to open-source Big Data technologies.
Rich hands on in Spark and worked on Spark at a larger scale.
Visualize and evangelize next generation infrastructure in Big Data space (Batch, Near
Real-time, Realtime technologies).
Passionate for continuous learning, experimenting, applying and contributing towards
cutting edge open-source technologies and software paradigms
Expert-level proficiency in Java and Scala.
Strong understanding and experience in distributed computing frameworks, particularly
Apache Hadoop 2.0 (YARN, MR & HDFS) and associated technologies, one or more of Hive,
Sqoop, Avro, Flume, Oozie, Zookeeper, etc. Hands-on experience with Apache Spark and its
components (Streaming, SQL, MLLib)
Operating knowledge of cloud computing platforms (AWS,Azure) –
Good to have:
Operating knowledge of different enterprise hadoop distribution (C) –
Good Knowledge of Design Patterns
Experience working within a Linux computing environment, and use of command line tools
including knowledge of shell/Python scripting for automating common tasks.
Role / Purpose - Lead Developer - API and Microservices
- Must have a strong hands-on development track record building integrations utilizing a variety of integration products, tools, protocols, technologies, and patterns.
- Must have an in-depth understanding of SOA/EAI/ESB concepts, SOA Governance, Event-Driven Architecture, message-based architectures, file sharing, and exchange platforms, data virtualization and caching strategies, J2EE design patterns, frameworks
- Should possess experience with at least one of middleware technologies (Application Servers, BPMS, BRMS, ESB & Message Brokers), Programming languages (e.g. Java/J2EE, JavaScript, COBOL, C), Operating Systems (e.g. Windows, Linux, MVS), and Databases (DB2, MySQL, No SQL Databases like MongoDB, Cassandra, Hadoop, etc.)
- Must have experience implementing API Service architectures (SOAP, REST) using any of the market-leading API Management tools such as Apigee and frameworks such as Spring Boot for Microservices
- Should have Advanced skills in implementing API Service architectures (SOAP, REST) using any of the market-leading API Management tools such as Apigee or similar frameworks such as Spring Boot for Microservices
- Appetite to manage large-scale projects and multiple tracks
- Experience and knowhow of the e-commerce domain and retail experience are preferred
- Good communication & people managerial skills
Job Description
- Solid technical skills with a proven and successful history working with data at scale and empowering organizations through data
- Big data processing frameworks: Spark, Scala, Hadoop, Hive, Kafka, EMR with Python
- Advanced experience and hands-on architecture and administration experience on big data platforms
The thrill of working at a start-up that is starting to scale massively is something else. Simpl (FinTech startup of the year - 2020) was formed in 2015 by Nitya Sharma, an investment banker from Wall Street and Chaitra Chidanand, a tech executive from the Valley, when they teamed up with a very clear mission - to make money simple so that people can live well and do amazing things. Simpl is the payment platform for the mobile-first world, and we’re backed by some of the best names in fintech globally (folks who have invested in Visa, Square and Transferwise), and
has Joe Saunders, Ex Chairman and CEO of Visa as a board member.
Everyone at Simpl is an internal entrepreneur who is given a lot of bandwidth and resources to create the next breakthrough towards the long term vision of “making money Simpl”. Our first product is a payment platform that lets people buy instantly, anywhere online, and pay later. In
the background, Simpl uses big data for credit underwriting, risk and fraud modelling, all without any paperwork, and enables Banks and Non-Bank Financial Companies to access a whole new consumer market.
In place of traditional forms of identification and authentication, Simpl integrates deeply into merchant apps via SDKs and APIs. This allows for more sophisticated forms of authentication that take full advantage of smartphone data and processing power
Skillset:
Workflow manager/scheduler like Airflow, Luigi, Oozie
Good handle on Python
ETL Experience
Batch processing frameworks like Spark, MR/PIG
File formats: Parquet, JSON, XML, Thrift, Avro, Protobuf
Rule engine (drools - business rule management system)
Distributed file systems like HDFS, NFS, AWS S3 and equivalent
Built/configured dashboards
Nice to have:
Data platform experience, e.g. building data lakes, working with near-real-time
applications/frameworks like Storm, Flink, Spark.
AWS
File encoding types: Thrift, Avro, Protobuf, Parquet, JSON, XML
HIVE, HBASE
XpressBees – a logistics company started in 2015 – is amongst the fastest growing
companies of its sector. While we started off rather humbly in the space of
ecommerce B2C logistics, the last 5 years have seen us steadily progress towards
expanding our presence. Our vision to evolve into a strong full-service logistics
organization reflects itself in our new lines of business like 3PL, B2B Xpress and cross
border operations. Our strong domain expertise and constant focus on meaningful
innovation have helped us rapidly evolve as the most trusted logistics partner of
India. We have progressively carved our way towards best-in-class technology
platforms, an extensive network reach, and a seamless last mile management
system. While on this aggressive growth path, we seek to become the one-stop-shop
for end-to-end logistics solutions. Our big focus areas for the very near future
include strengthening our presence as service providers of choice and leveraging the
power of technology to improve efficiencies for our clients.
Job Profile
As a Lead Data Engineer in the Data Platform Team at XpressBees, you will build the data platform
and infrastructure to support high quality and agile decision-making in our supply chain and logistics
workflows.
You will define the way we collect and operationalize data (structured / unstructured), and
build production pipelines for our machine learning models, and (RT, NRT, Batch) reporting &
dashboarding requirements. As a Senior Data Engineer in the XB Data Platform Team, you will use
your experience with modern cloud and data frameworks to build products (with storage and serving
systems)
that drive optimisation and resilience in the supply chain via data visibility, intelligent decision making,
insights, anomaly detection and prediction.
What You Will Do
• Design and develop data platform and data pipelines for reporting, dashboarding and
machine learning models. These pipelines would productionize machine learning models
and integrate with agent review tools.
• Meet the data completeness, correctness and freshness requirements.
• Evaluate and identify the data store and data streaming technology choices.
• Lead the design of the logical model and implement the physical model to support
business needs. Come up with logical and physical database design across platforms (MPP,
MR, Hive/PIG) which are optimal physical designs for different use cases (structured/semi
structured). Envision & implement the optimal data modelling, physical design,
performance optimization technique/approach required for the problem.
• Support your colleagues by reviewing code and designs.
• Diagnose and solve issues in our existing data pipelines and envision and build their
successors.
Qualifications & Experience relevant for the role
• A bachelor's degree in Computer Science or related field with 6 to 9 years of technology
experience.
• Knowledge of Relational and NoSQL data stores, stream processing and micro-batching to
make technology & design choices.
• Strong experience in System Integration, Application Development, ETL, Data-Platform
projects. Talented across technologies used in the enterprise space.
• Software development experience using:
• Expertise in relational and dimensional modelling
• Exposure across all SDLC processes
• Experience in cloud architecture (AWS)
• Proven track record in keeping existing technical skills and developing new ones, so that
you can make strong contributions to deep architecture discussions around systems and
applications in the cloud (AWS).
• Characteristics of a forward thinker and self-starter that flourishes with new challenges
and adapts quickly to learning new knowledge
• Ability to work with cross-functional teams of consulting professionals across multiple
projects.
• Knack for helping an organization to understand application architectures and integration
approaches, to architect advanced cloud-based solutions, and to help launch the build-out
of those systems
• Passion for educating, training, designing, and building end-to-end systems.
● Our Infrastructure team is looking for an excellent Big Data Engineer to join a core group that
designs the industry’s leading Micro-Engagement Platform. This role involves design and
implementation of architectures and frameworks of big data for industry’s leading intelligent
workflow automation platform. As a specialist in Ushur Engineering team, your responsibilities will
be to:
● Use your in-depth understanding to architect and optimize databases and data ingestion pipelines
● Develop HA strategies, including replica sets and sharding, for highly available clusters
● Recommend and implement solutions to improve performance, resource consumption, and
resiliency
● On an ongoing basis, identify bottlenecks in databases in development and production
environments and propose solutions
● Help DevOps team with your deep knowledge in the area of database performance, scaling,
tuning, migration & version upgrades
● Provide verifiable technical solutions to support operations at scale and with high availability
● Recommend appropriate data processing toolset and big data ecosystems to adopt
● Design and scale databases and pipelines across multiple physical locations on cloud
● Conduct Root-cause analysis of data issues
● Be self-driven, constantly research and suggest latest technologies
The experience you need:
● Engineering degree in Computer Science or related field
● 10+ years of experience working with databases, most of which should have been around
NoSQL technologies
● Expertise in implementing and maintaining distributed, Big data pipelines and ETL
processes
● Solid experience in one of the following cloud-native data platforms (AWS Redshift/ Google
BigQuery/ SnowFlake)
● Exposure to real time processing techniques like Apache Kafka and CDC tools
(Debezium, Qlik Replicate)
● Strong experience in Linux Operating System
● Solid knowledge of database concepts, MongoDB, SQL, and NoSQL internals
● Experience with backup and recovery for production and non-production environments
● Experience in security principles and its implementation
● Exceptionally passionate about always keeping the product quality bar at an extremely
high level
Nice-to-haves
● Proficient with one or more of Python/Node.js/Java/similar languages
Why you want to Work with Us:
● Great Company Culture. We pride ourselves on having a values-based culture that
is welcoming, intentional, and respectful. Our internal NPS of over 65 speaks for
itself - employees recommend Ushur as a great place to work!
● Bring your whole self to work. We are focused on building a diverse culture, with
innovative ideas where you and your ideas are valued. We are a start-up and know
that every person has a significant impact!
● Rest and Relaxation. 13 paid leaves, wellness Friday offs (aka a day off to care
for yourself, every last Friday of the month), 12 paid sick leaves, and more!
● Health Benefits. Preventive health checkups, Medical Insurance covering the
dependents, wellness sessions, and health talks at the office
● Keep learning. One of our core values is Growth Mindset - we believe in lifelong
learning. Certification courses are reimbursed. Ushur Community offers wide
resources for our employees to learn and grow.
● Flexible Work. In-office or hybrid working model, depending on position and
location. We seek to create an environment for all our employees where they can
thrive in both their profession and personal life.
Technical/Core skills
- Minimum 3 yrs of exp in Informatica Big Data Developer (BDM) in a Hadoop environment.
- Knowledge of Informatica PowerExchange (PWX).
- Minimum 3 yrs of exp in big data querying tools like Hive and Impala.
- Ability to design and develop complex mappings using Informatica Big Data Developer.
- Create and manage Informatica PowerExchange and CDC real-time implementations.
- Strong Unix skills for writing shell scripts and troubleshooting existing scripts.
- Good knowledge of big data platforms and their frameworks.
- Good to have experience in Cloudera Data Platform (CDP).
- Experience building stream processing systems using Kafka and Spark.
- Excellent SQL knowledge
Soft skills :
- Ability to work independently
- Strong analytical and problem solving skills
- Attitude of learning new technology
- Regular interaction with vendors, partners and stakeholders
The Cloudera Data Warehouse Hive team is looking for a passionate senior developer to join our growing engineering team. This group is targeting the biggest enterprises wanting to utilize Cloudera’s services in a private and public cloud environment. Our product is built on open source technologies like Hive, Impala, Hadoop, Kudu, Spark and many more, providing unlimited learning opportunities.
A Day in the Life
Over the past 10+ years, Cloudera has experienced tremendous growth making us the leading contributor to Big Data platforms and ecosystems and a leading provider for enterprise solutions based on Apache Hadoop. You will work with some of the best engineers in the industry who are tackling challenges that will continue to shape the Big Data revolution. We foster an engaging, supportive, and productive work environment where you can do your best work. The team culture values engineering excellence, technical depth, grassroots innovation, teamwork, and collaboration.
You will manage product development for our CDP components, develop engineering tools and scalable services to enable efficient development, testing, and release operations. You will be immersed in many exciting, cutting-edge technologies and projects, including collaboration with developers, testers, product, field engineers, and our external partners, both software and hardware vendors.
Opportunity:
Cloudera is a leader in the fast-growing big data platforms market. This is a rare chance to make a name for yourself in the industry and in the Open Source world. The candidate will be responsible for Apache Hive and CDW projects. We are looking for a candidate who would like to work on these projects upstream and downstream. If you are curious about the project and code quality you can check the project and the code at the following link. You can start the development before you join. This is one of the beauties of the OSS world.
Apache Hive
Responsibilities:
•Build robust and scalable data infrastructure software
•Design and create services and system architecture for your projects
•Improve code quality through writing unit tests, automation, and code reviews
•The candidate would write Java code and/or build several services in the Cloudera Data Warehouse.
•Work with a team of engineers who review each other's code/designs and hold each other to an extremely high bar for the quality of code/designs
•The candidate has to understand the basics of Kubernetes.
•Build out the production and test infrastructure.
•Develop automation frameworks to reproduce issues and prevent regressions.
•Work closely with other developers providing services to our system.
•Help to analyze and to understand how customers use the product and improve it where necessary.
Qualifications:
•Deep familiarity with Java programming language.
•Hands-on experience with distributed systems.
•Knowledge of database concepts, RDBMS internals.
•Knowledge of the Hadoop stack, containers, or Kubernetes is a strong plus.
•Has experience working in a distributed team.
•Has 3+ years of experience in software development.
Senior Software Engineer - 221254.
We (the Software Engineer team) are looking for a motivated, experienced person with a data driven approach to join our Distribution Team in Budapest or Szeged to help design, execute and improve our test sets and infrastructure for producing high-quality Hadoop software.
A Day in the life
You will be part of a team that makes sure our releases are predictable and deliver high value to the customer. This team is responsible for automating and maintaining our test harness, and making test results reliable and repeatable.
You will…
•work on making our distributed software stack more resilient to high-scale endurance runs and customer simulations
•provide valuable fixes to our product development teams to the issues you’ve found during exhaustive test runs
•work with product and field teams to make sure our customer simulations match the expectations and can provide valuable feedback to our customers
•work with amazing people - We are a fun & smart team, including many of the top luminaries in Hadoop and related open source communities. We frequently interact with the research community, collaborate with engineers at other top companies & host cutting edge researchers for tech talks.
•do innovative work - Cloudera pushes the frontier of big data & distributed computing, as our track record shows. We work on high-profile open source projects, interacting daily with engineers at other exciting companies, speaking at meet-ups, etc.
•be a part of a great culture - Transparent and open meritocracy. Everybody is always thinking of better ways to do things, and coming up with ideas that make a difference. We build our culture to be the best workplace in our careers.
You have...
•strong knowledge in at least 1 of the following languages: Java / Python / Scala / C++ / C#
•hands-on experience with at least 1 of the following configuration management tools: Ansible, Chef, Puppet, Salt
•confidence with Linux environments
•ability to identify critical weak spots in distributed software systems
•experience in developing automated test cases and test plans
•ability to deal with distributed systems
•solid interpersonal skills conducive to a distributed environment
•ability to work independently on multiple tasks
•self-driven & motivated, with a strong work ethic and a passion for problem solving
•innovate and automate and break the code
The right person in this role has an opportunity to make a huge impact at Cloudera and add value to our future decisions. If this position has piqued your interest and you have what we described - we invite you to apply! An adventure in data awaits.
Hi,
Enterprise Minds is looking for a Data Architect for the Pune location.
Req Skills:
Python, PySpark, Hadoop, Java, Scala
- Produce clean code and automated tests
- Align with enterprise architecture frameworks and standards
- Be the role-model for all engineers in the team in terms of technical competency
- Research, assess and adopt new technologies as required
- Be a guide and mentor to the team members and help in ramping up the overall skill-base of the team.
- Produce detailed estimates and optimized work plans for requirements and changes
- Ensure that features are delivered on time and that they meet the business needs
- Strive for quality of performance, usability, reliability, maintainability, and extensibility
- Identify opportunities for process and tool improvements
- Use analytical rigor to produce effective solutions to poorly defined problems
- Follow Build to Ship mantra in practice with full Dev Ops implementation
- 10+ years of core software development and product creation experience in CPaaS.
- Working knowledge in VoIP, communication APIs, J2EE, JMS/Kafka, Web Services, Hadoop, React, Node.js, GoLang.
- Working knowledge in Various CPaaS channels - SMS, voice, WhatsApp, RCS, Email.
- Working knowledge of DevOps, automation testing, test driven development, behavior driven development, server-less or micro-services
- Experience with AWS / Azure deployments
- Solid background in large scale software development.
- Full stack understanding of web/mobile/API/database development concepts and patterns
- Exposure to Microservices, IaaS, PaaS, service mesh, SaaS and cloud native application development.
- Understanding of Agile Scrum and SDLC principles.
- Containerization and orchestration: Docker, Kubernetes, OpenShift, Consul, etc.
- Knowledge of NFV (OpenStack, vSphere, vCloud, etc.)
- Experience in Data Analytics/AI/ML or Marketing Tech domain is an added advantage
We are hiring for a Tier 1 MNC for a software developer with good knowledge of Spark, Hadoop and Scala.
• Project Planning and Management
o Take end-to-end ownership of multiple projects / project tracks
o Create and maintain project plans and other related documentation for project
objectives, scope, schedule and delivery milestones
o Lead and participate across all the phases of software engineering, right from
requirements gathering to GO LIVE
o Lead internal team meetings on solution architecture, effort estimation, manpower
planning and resource (software/hardware/licensing) planning
o Manage RIDA (Risks, Impediments, Dependencies, Assumptions) for projects by
developing effective mitigation plans
• Team Management
o Act as the Scrum Master
o Conduct SCRUM ceremonies like Sprint Planning, Daily Standup, Sprint Retrospective
o Set clear objectives for the project and roles/responsibilities for each team member
o Train and mentor the team on their job responsibilities and SCRUM principles
o Make the team accountable for their tasks and help the team in achieving them
o Identify the requirements and come up with a plan for Skill Development for all team
members
• Communication
o Be the Single Point of Contact for the client in terms of day-to-day communication
o Periodically communicate project status to all the stakeholders (internal/external)
• Process Management and Improvement
o Create and document processes across all disciplines of software engineering
o Identify gaps and continuously improve processes within the team
o Encourage team members to contribute towards process improvement
o Develop a culture of quality and efficiency within the team
Must have:
• Minimum 08 years of experience (hands-on as well as leadership) in software / data engineering
across multiple job functions like Business Analysis, Development, Solutioning, QA, DevOps and
Project Management
• Hands-on as well as leadership experience in Big Data Engineering projects
• Experience developing or managing cloud solutions using Azure or other cloud provider
• Demonstrable knowledge on Hadoop, Hive, Spark, NoSQL DBs, SQL, Data Warehousing, ETL/ELT,
DevOps tools
• Strong project management and communication skills
• Strong analytical and problem-solving skills
• Strong systems level critical thinking skills
• Strong collaboration and influencing skills
Good to have:
• Knowledge on PySpark, Azure Data Factory, Azure Data Lake Storage, Synapse Dedicated SQL
Pool, Databricks, PowerBI, Machine Learning, Cloud Infrastructure
• Background in BFSI with focus on core banking
• Willingness to travel
Work Environment
• Customer Office (Mumbai) / Remote Work
Education
• UG: B. Tech - Computers / B. E. – Computers / BCA / B.Sc. Computer Science
at 6sense
The Company:
It’s no surprise that 6sense is named a top workplace year after year: we have industry-leading technology developed and taken to market by a world-class team. 6sense is Top Rated on Glassdoor with a 4.9/5 and our CEO Jason Zintak was recognized as the #1 CEO in the small & medium business category by Glassdoor’s 2021 Top CEO Employees Choice Awards (https://www.glassdoor.com/Award/Top-CEOs-at-SMBs-LST_KQ0%2C16.htm).
In 2021, the company was recognized for having the Best Company for Diversity, Best Company for Women, Best CEO, Best Company Culture, Best Company Perks & Benefits and Happiest Employees from the employee feedback platform Comparably. In addition, 6sense has also won several accolades that demonstrate its reputation as an employer of choice including the Glassdoor Best Place to Work (2022), TrustRadius Tech Cares (2021) and Inc. Best Workplaces (2022, 2021, 2020, 2019).
6sense reinvents the way organizations create, manage, and convert pipeline to revenue. The 6sense Revenue AI captures anonymous buying signals, predicts the right accounts to target at the ideal time, and recommends the channels and messages to boost revenue performance. Removing guesswork, friction and wasted sales effort, 6sense empowers sales, marketing, and customer success teams to significantly improve pipeline quality, accelerate sales velocity, increase conversion rates, and grow revenue predictably.
Senior Software Engineer - Infrastructure, Cloud
Responsibilities:
Develop and deploy services to improve the availability, ease of use/management, and visibility of 6sense systems
Building and scaling out our services and infrastructure
Learning and adopting technologies that may aid in solving our challenges
Own our critical underlying systems like AWS, Kubernetes, Mesos, infrastructure deployment, and compute cluster architecture (which services frameworks and engines like Hadoop/Hive/Presto)
Write/review/debug production code, develop documentation and capacity plans, and debug live production problems
Contributing back to open-source projects if we need to add or patch functionality
Support the overall Software Engineering team to resolve any issues they encounter
Minimum Qualifications:
5+ years of experience with Linux/Unix system administration and networking fundamentals
3+ years in a Software Engineering role or equivalent experience
4+ years of working with AWS
4+ years of experience working with Kubernetes, Docker.
Strong skills in reading code as well as writing clean, maintainable, and scalable code
Good knowledge of Python
Experience designing, building, and maintaining scalable services and/or service-oriented architecture
Experience with high-availability
Experience with modern configuration management tools (e.g. Ansible/AWX, Chef, Puppet, Pulumi) and idempotency
Bonus Requirements:
Knowledge of standard security practices
Knowledge of the Hadoop ecosystem (e.g. Hadoop, Hive, Presto) including deployment, scaling, and maintenance
Experience with operating and maintaining VPN/SSH/ZeroTrust access infrastructure
Experience with CDNs such as CloudFront and Akamai
Good knowledge of Javascript, Java, Golang
Exposure to modern build systems such as Bazel, Buck, or Pants
Every person in every role at 6sense owns a part of defining the future of our industry-leading technology. You’ll join a team where curiosity is prized, no one’s satisfied with the status quo, and everyone’s all-in on the collective good. 6sense is a place where difference-makers roll up their sleeves, take risks, act with integrity, and measure success by the value we create for our customers.
We want 6sense to be the best chapter of your career.
Feel part of something
You’ll be part of building tomorrow’s tech, revolutionizing how marketing and sales teams create, manage, and convert pipeline to revenue. And you’ll be seen and appreciated by co-workers who challenge you, cheer you on, and always have your back.
At 6sense, you’ll experience the passion from customers and colleagues alike for our market-leading vision, and you're entrusted with applying your unique talents to help bring that vision to life.
Build a career
As part of a company on a rocketship trajectory, there’s no way around it: You’re going to experience unparalleled career growth. With colleagues as humble and hungry as you are, and a leadership philosophy grounded in trust, transparency, and empowerment, every day is a chance to improve on the one before.
Enjoy access to our Udemy Training Library with 5,000+ courses, give and get recognition from your coworkers, and spend time with our executive team every two weeks in our All Hands gathering to connect, learn and ask leaders about whatever is on your mind.
Enjoy work, and your life
This is a place where you’ll do your best work and inspire others to do theirs — where you’re guaranteed to make real connections, for life, along the way.
We want to help you prioritize health and wellness, today and tomorrow. Take advantage of family medical coverage; a monthly stipend to support your physical, mental, and financial wellness; and generous paid parental leave benefits. Plus, we have an open time-off policy, so you can take the time you need.
Set for success
A vision as big as ours only comes to life when we’re all winning together.
We’ll make sure you have the equipment you need to work at home or in one of our offices. And have the right snacks, pens or lighting with our work-from-home expense reimbursement allowance. We also partner with WeWork to make sure that if your choice is a hybrid of home and office, we have you covered in the locations they’re offered.
That’s the commitment we make to every one of our employees. If this sounds like a place where you'll thrive as you take your success to the next level, let’s chat!
Company Name: Intraedge Technologies Ltd (https://intraedge.com/)
Type: Permanent, Full time
Location: Any
A Bachelor’s degree in computer science, computer engineering, other technical discipline, or equivalent work experience
- 4+ years of software development experience
- 4+ years exp in programming languages and tools: Python, Spark, Scala, Hadoop, Hive
- Demonstrated experience with Agile or other rapid application development methods
- Demonstrated experience with object-oriented design and coding.
Please mail your resume to poornimakattherateintraedgedotcom along with NP, how soon you can join, ECTC, availability for interview, and location.
Responsibilities:
- Should act as a technical resource for the Data Science team and be involved in creating and implementing current and future Analytics projects like data lake design, data warehouse design, etc.
- Analysis and design of ETL solutions to store/fetch data from multiple systems like Google Analytics, CleverTap, CRM systems etc.
- Developing and maintaining data pipelines for real time analytics as well as batch analytics use cases.
- Collaborate with data scientists and actively work in the feature engineering and data preparation phase of model building
- Collaborate with product development and dev ops teams in implementing the data collection and aggregation solutions
- Ensure quality and consistency of the data in Data warehouse and follow best data governance practices
- Analyse large amounts of information to discover trends and patterns
- Mine and analyse data from company databases to drive optimization and improvement of product development, marketing techniques and business strategies.
Requirements
- Bachelor’s or Master’s degree in a highly numerate discipline such as Engineering, Science or Economics
- 2-6 years of proven experience working as a Data Engineer preferably in ecommerce/web based or consumer technologies company
- Hands on experience of working with different big data tools like Hadoop, Spark , Flink, Kafka and so on
- Good understanding of AWS ecosystem for big data analytics
- Hands on experience in creating data pipelines either using tools or by independently writing scripts
- Hands on experience in scripting languages like Python, Scala, Unix Shell scripting and so on
- Strong problem solving skills with an emphasis on product development.
- Experience using business intelligence tools e.g. Tableau, Power BI would be an added advantage (not mandatory)
About Slintel (a 6sense company) :
Slintel, a 6sense company, the leader in capturing technographics-powered buying intent, helps companies uncover the 3% of active buyers in their target market. Slintel evaluates over 100 billion data points and analyzes factors such as buyer journeys, technology adoption patterns, and other digital footprints to deliver market & sales intelligence.
Slintel's customers have access to the buying patterns and contact information of more than 17 million companies and 250 million decision makers across the world.
Slintel is a fast growing B2B SaaS company in the sales and marketing tech space. We are funded by top tier VCs, and going after a billion dollar opportunity. At Slintel, we are building a sales development automation platform that can significantly improve outcomes for sales teams, while reducing the number of hours spent on research and outreach.
We are a big data company and perform deep analysis on technology buying patterns, buyer pain points to understand where buyers are in their journey. Over 100 billion data points are analyzed every week to derive recommendations on where companies should focus their marketing and sales efforts on. Third party intent signals are then clubbed with first party data from CRMs to derive meaningful recommendations on whom to target on any given day.
6sense is headquartered in San Francisco, CA and has 8 office locations across 4 countries.
6sense, an account engagement platform, secured $200 million in a Series E funding round, bringing its total valuation to $5.2 billion 10 months after its $125 million Series D round. The investment was co-led by Blue Owl and MSD Partners, among other new and existing investors.
LinkedIn (Slintel) : https://www.linkedin.com/company/slintel/
Industry : Software Development
Company size : 51-200 employees (189 on LinkedIn)
Headquarters : Mountain View, California
Founded : 2016
Specialties : Technographics, lead intelligence, Sales Intelligence, Company Data, and Lead Data.
Website (Slintel) : https://www.slintel.com/slintel
LinkedIn (6sense) : https://www.linkedin.com/company/6sense/
Industry : Software Development
Company size : 501-1,000 employees (937 on LinkedIn)
Headquarters : San Francisco, California
Founded : 2013
Specialties : Predictive intelligence, Predictive marketing, B2B marketing, and Predictive sales
Website (6sense) : https://6sense.com/
Acquisition News :
https://inc42.com/buzz/us-based-based-6sense-acquires-b2b-buyer-intelligence-startup-slintel/
Funding Details & News :
Slintel funding : https://www.crunchbase.com/organization/slintel
6sense funding : https://www.crunchbase.com/organization/6sense
https://www.nasdaq.com/articles/ai-software-firm-6sense-valued-at-%245.2-bln-after-softbank-joins-funding-round
https://www.bloomberg.com/news/articles/2022-01-20/6sense-reaches-5-2-billion-value-with-softbank-joining-round
https://xipometer.com/en/company/6sense
Slintel & 6sense Customers :
https://www.featuredcustomers.com/vendor/slintel/customers
https://www.featuredcustomers.com/vendor/6sense/customers
About the job
Responsibilities
- Work in collaboration with the application team and integration team to design, create, and maintain optimal data pipeline architecture and data structures for Data Lake/Data Warehouse
- Work with stakeholders including the Sales, Product, and Customer Support teams to assist with data-related technical issues and support their data analytics needs
- Assemble large, complex data sets from third-party vendors to meet business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimising data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Elasticsearch, MongoDB, and AWS technology
- Streamline existing and introduce enhanced reporting and analysis solutions that leverage complex data sources derived from multiple internal systems
Requirements
- 3+ years of experience in a Data Engineer role
- Proficiency in Linux
- Must have SQL knowledge and experience working with relational databases, query authoring (SQL) as well as familiarity with databases including MySQL, MongoDB, Cassandra, and Athena
- Must have experience with Python/ Scala
- Must have experience with Big Data technologies like Apache Spark
- Must have experience with Apache Airflow
- Experience with data pipeline and ETL tools like AWS Glue
- Experience working with AWS cloud services (EC2, S3, RDS, Redshift) and other data solutions, e.g. Databricks, Snowflake
Desired Skills and Experience
Python, SQL, Scala, Spark, ETL
Job Description:
We are looking for a Big Data Engineer who has worked across the entire ETL stack: someone who has ingested data in batch and live-stream formats, transformed large volumes of data daily, built data warehouses to store the transformed data, and integrated different visualization dashboards and applications with the data stores. The primary focus will be on choosing optimal solutions to use for these purposes, then implementing, maintaining, and monitoring them.
Responsibilities:
- Develop, test, and implement data solutions based on functional / non-functional business requirements.
- You would be required to code in Scala and PySpark daily on Cloud as well as on-prem infrastructure
- Build Data Models to store the data in a most optimized manner
- Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Implementing the ETL process and optimal data pipeline architecture
- Monitoring performance and advising any necessary infrastructure changes.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Proactively identify potential production issues and recommend and implement solutions
- Must be able to write quality code and build secure, highly available systems.
- Create design documents that describe the functionality, capacity, architecture, and process.
- Review peers' code and pipelines for optimization issues and adherence to code standards before deploying to production
Skill Sets:
- Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
- Proficient understanding of distributed computing principles
- Experience working with batch processing/real-time systems using various open-source technologies like NoSQL, Spark, Pig, Hive, and Apache Airflow
- Experience implementing complex projects dealing with considerable data sizes (petabyte scale)
- Optimization techniques (performance, scalability, monitoring, etc.)
- Experience with integration of data from multiple data sources
- Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.
- Knowledge of various ETL techniques and frameworks, such as Flume
- Experience with various messaging systems, such as Kafka or RabbitMQ
- Creation of DAGs for data engineering
- Expert at Python/Scala programming, especially for data engineering/ETL purposes
Senior Software Engineer - Data Team
We are seeking a highly motivated Senior Software Engineer with hands-on experience building scalable, extensible data solutions, identifying and addressing performance bottlenecks, collaborating with other team members, and implementing best practices for data engineering. Our engineering process is fully agile and has a very fast release cycle, which keeps our environment energetic and fun.
What you'll do:
Design and development of scalable applications.
Work with Product Management teams to get maximum value out of existing data.
Contribute to continual improvement by suggesting improvements to the software system.
Ensure high scalability and performance
You will advocate for good, clean, well documented and performing code; follow standards and best practices.
We'd love for you to have:
Education: Bachelor's/Master's degree in Computer Science.
Experience: 3-5 years of relevant experience in BI/DW with hands-on coding experience.
Mandatory Skills
Strong in problem-solving
Strong experience with Big Data technologies: Hive, Hadoop, Impala, HBase, Kafka, Spark
Strong experience with orchestration frameworks like Apache Oozie and Airflow
Strong experience in data engineering
Strong experience with database and data warehousing technologies, and the ability to understand complex designs and system architecture
Experience with the full software development lifecycle: design, development, review, debugging, documentation, and delivery (especially in a multi-location organization)
Good knowledge of Java
Desired Skills
Experience with Python
Experience with reporting tools like Tableau, QlikView
Experience with Git and CI/CD pipelines
Awareness of cloud platforms, e.g., AWS
Excellent communication skills with team members, business owners, and across teams
Be able to work in a challenging, dynamic environment and meet tight deadlines
Location: Bangalore/Pune/Hyderabad/Nagpur
4-5 years of overall experience in software development.
- Experience with Hadoop (Apache/Cloudera/Hortonworks) and/or other MapReduce platforms
- Experience with Hive, Pig, Sqoop, Flume and/or Mahout
- Experience with NoSQL databases: HBase, Cassandra, MongoDB
- Hands-on experience with Spark development; knowledge of Storm, Kafka, Scala
- Good knowledge of Java
- Good background in configuration management/ticketing systems like Maven, Ant, JIRA, etc.
- Knowledge of any data integration and/or EDW tools is a plus
- Good to have knowledge of using Python/Perl/Shell
Please note: HBase, Hive, and Spark are a must.