Senior Big Data Engineer
Note: Notice Period : 45 days
Banyan Data Services (BDS) is a US-based data-focused Company that specializes in comprehensive data solutions and services, headquartered in San Jose, California, USA.
We are looking for a Senior Hadoop Bigdata Engineer who has expertise in solving complex data problems across a big data platform. You will be a part of our development team based out of Bangalore. This team focuses on the most innovative and emerging data infrastructure software and services to support highly scalable and available infrastructure.
It's a once-in-a-lifetime opportunity to join our rocket ship startup run by a world-class executive team. We are looking for candidates that aspire to be a part of the cutting-edge solutions and services we offer that address next-gen data evolution challenges.
· 5+ years of experience working with Java and Spring technologies
· At least 3 years of programming experience working with Spark on big data; including experience with data profiling and building transformations
· Knowledge of microservices architecture is plus
· Experience with any NoSQL databases such as HBase, MongoDB, or Cassandra
· Experience with Kafka or any streaming tools
· Knowledge of Scala would be preferable
· Experience with agile application development
· Exposure of any Cloud Technologies including containers and Kubernetes
· Demonstrated experience of performing DevOps for platforms
· Strong Skillsets in Data Structures & Algorithm in using efficient way of code complexity
· Exposure to Graph databases
· Passion for learning new technologies and the ability to do so quickly
· A Bachelor's degree in a computer-related field or equivalent professional experience is required
· Scope and deliver solutions with the ability to design solutions independently based on high-level architecture
· Design and develop the big data-focused micro-Services
· Involve in big data infrastructure, distributed systems, data modeling, and query processing
· Build software with cutting-edge technologies on cloud
· Willing to learn new technologies and research-orientated projects
· Proven interpersonal skills while contributing to team effort by accomplishing related results as needed
About Banyan Data Services
We're hell-bent on making this the most enjoyable job you've ever had. Send your resume to [email protected]
We foster a positive leadership culture and ensure that employees at all levels feel comfortable collaborating with one another.
Grow & Learn
Our employees are being groomed by instilling a startup culture in them, as well as providing them with tech-savvy mentors and a passionate team to drive the highest quality of work.
The success and pleasure of employees are top concerns. No matter their level, employees feel valued in all aspects of their lives, including both their professional and personal aspirations.
We strive to create a diverse and inclusive workplace in which everyone, regardless of who they are or what they do for the company, feels equally involved and supported in all aspects of the workplace.
What is the role?
You will be responsible for building and maintaining highly scalable data infrastructure for our cloud-hosted SAAS product. You will work closely with the Product Managers and Technical team to define and implement data pipelines for customer-facing and internal reports.
- Design and develop resilient data pipelines.
- Write efficient queries to fetch data from the report database.
- Work closely with application backend engineers on data requirements for their stories.
- Designing and developing report APIs for the front end to consume.
- Focus on building highly available, fault-tolerant report systems.
- Constantly improve the architecture of the application by clearing the technical backlog.
- Adopt a culture of learning and development to constantly keep pace with and adopt new technolgies.
What are we looking for?
An enthusiastic individual with the following skills. Please do not hesitate to apply if you do not match all of it. We are open to promising candidates who are passionate about their work and are team players.
- Education - BE/MCA or equivalent
- Overall 8+ years of experience
- Expert level understanding of database concepts and BI.
- Well verse in databases such as MySQL, MongoDB and hands on experience in creating data models.
- Must have designed and implemented low latency data warehouse systems.
- Must have strong understanding of Kafka and related systems.
- Experience in clickhouse database preferred.
- Must have good knowledge of APIs and should be able to build interfaces for frontend engineers.
- Should be innovative and communicative in approach
- Will be responsible for functional/technical track of a project
Whom will you work with?
You will work with a top-notch tech team, working closely with the CTO and product team.
What can you look for?
A wholesome opportunity in a fast-paced environment that will enable you to juggle between concepts, yet maintain the quality on content, interact and share your ideas and have loads of learning while at work. Work with a team of highly talented young professionals and enjoy the benefits.
A fast-growing SaaS commerce company based in Bangalore with offices in Delhi, Mumbai, SF, Dubai, Singapore and Dublin. We have three products in our portfolio: Plum, Empuls and Compass. Works with over 1000 global clients. We help our clients in engaging and motivating their employees, sales teams, channel partners or consumers for better business results.
We are looking for candidates who have demonstrated both a strong business sense and deep understanding of the quantitative foundations of modelling.
• Excellent analytical and problem-solving skills, including the ability to disaggregate issues, identify root causes and recommend solutions
• Statistical programming software experience in SPSS and comfortable working with large data sets.
• R, Python, SAS & SQL are preferred but not a mandate
• Excellent time management skills
• Good written and verbal communication skills; understanding of both written and spoken English
• Strong interpersonal skills
• Ability to act autonomously, bringing structure and organization to work
• Creative and action-oriented mindset
• Ability to interact in a fluid, demanding and unstructured environment where priorities evolve constantly, and methodologies are regularly challenged
• Ability to work under pressure and deliver on tight deadlines
Qualifications and Experience:
• Graduate degree in: Statistics/Economics/Econometrics/Computer
Science/Engineering/Mathematics/MBA (with a strong quantitative background) or
• Strong track record work experience in the field of business intelligence, market
research, and/or Advanced Analytics
• Knowledge of data collection methods (focus groups, surveys, etc.)
• Knowledge of statistical packages (SPSS, SAS, R, Python, or similar), databases,
and MS Office (Excel, PowerPoint, Word)
• Strong analytical and critical thinking skills
• Industry experience in Consumer Experience/Healthcare a plus
Roles and Responsibilities
- Build High level technical design both for Streaming and batch processing systems
- Design and build reusable components, frameworks and libraries at scale to support analytics data products
- Perform POCs on new technology, architecture patterns
- Design and implement product features in collaboration with business and Technology stakeholders
- Anticipate, identify and solve issues concerning data management to improve data quality
- Clean, prepare and optimize data at scale for ingestion and consumption
- Drive the implementation of new data management projects and re-structure of the current data architecture
- Implement complex automated workflows and routines using workflow scheduling tools
- Build continuous integration, test-driven development and production deployment frameworks
- Drive collaborative reviews of design, code, test plans and dataset implementation performed by other data engineers in support of maintaining data engineering standards
- Analyze and profile data for the purpose of designing scalable solutions
- Troubleshoot complex data issues and perform root cause analysis to proactively resolve product and operational issues
- Lead, Mentor and develop other Sr Data Engineers and Data engineers in adopting best practices and deliver data products.
- Partner closely with product management to understand business requirements, breakdown Epics,
- Partner with Engineering Managers to define technology roadmaps, align on design, architecture and enterprise strategy
- Expert level expertise in building big data solutions
- Hands-on experience building cloud scalable, real time and high-performance data lake solutions using AWS, EMR, S3, Hive & Spark, Athena
- Hands-on experience in delivering batch and streaming jobs
- Expertise in an agile and iterative model
- Expert level expertise relational SQL
- Experience with scripting languages such as Shell, Python
- Experience with source control tools such as GitHub and related dev process
- Experience with workflow sc
- Scheduling tools like Airflow
- In-depth understanding of micro services architecture
- Strong understanding of developing complex data solutions
- Experience working on end-to-end solution design
- Able to lead others in solving complex Data and Analytics problems
- Strong understanding of data structures and algorithms
- Strong hands-on experience in solution and technical design
- Has a strong problem solving and analytical mindset
- Able to influence and communicate effectively, both verbally and written, with team members and business stakeholders
- Able to quickly pick up new programming languages, technologies, and frameworks
Tiger Analytics is a global AI & analytics consulting firm. With data and technology at the core of our solutions, we are solving some of the toughest problems out there. Our culture is modeled around expertise and mutual respect with a team first mindset. Working at Tiger, you’ll be at the heart of this AI revolution. You’ll work with teams that push the boundaries of what-is-possible and build solutions that energize and inspire.
We are headquartered in the Silicon Valley and have our delivery centres across the globe. The below role is for our Chennai or Bangalore office, or you can choose to work remotely.
About the Role:
As an Associate Director - Data Science at Tiger Analytics, you will lead data science aspects of endto-end client AI & analytics programs. Your role will be a combination of hands-on contribution, technical team management, and client interaction.
• Work closely with internal teams and client stakeholders to design analytical approaches to
solve business problems
• Develop and enhance a broad range of cutting-edge data analytics and machine learning
problems across a variety of industries.
• Work on various aspects of the ML ecosystem – model building, ML pipelines, logging &
versioning, documentation, scaling, deployment, monitoring and maintenance etc.
• Lead a team of data scientists and engineers to embed AI and analytics into the client
business decision processes.
• High level of proficiency in a structured programming language, e.g. Python, R.
• Experience designing data science solutions to business problems
• Deep understanding of ML algorithms for common use cases in both structured and
unstructured data ecosystems.
• Comfortable with large scale data processing and distributed computing
• Excellent written and verbal communication skills
• 10+ years exp of which 8 years of relevant data science experience including hands-on
Designation will be commensurate with expertise/experience. Compensation packages among the best in the industry.
- Should have good hands-on experience in Informatica MDM Customer 360, Data Integration(ETL) using PowerCenter, Data Quality.
- Must have strong skills in Data Analysis, Data Mapping for ETL processes, and Data Modeling.
- Experience with the SIF framework including real-time integration
- Should have experience in building C360 Insights using Informatica
- Should have good experience in creating performant design using Mapplets, Mappings, Workflows for Data Quality(cleansing), ETL.
- Should have experience in building different data warehouse architecture like Enterprise,
- Federated, and Multi-Tier architecture.
- Should have experience in configuring Informatica Data Director in reference to the Data
- Governance of users, IT Managers, and Data Stewards.
- Should have good knowledge in developing complex PL/SQL queries.
- Should have working experience on UNIX and shell scripting to run the Informatica workflows and to control the ETL flow.
- Should know about Informatica Server installation and knowledge on the Administration console.
- Working experience with Developer with Administration is added knowledge.
- Working experience in Amazon Web Services (AWS) is an added advantage. Particularly on AWS S3, Data pipeline, Lambda, Kinesis, DynamoDB, and EMR.
- Should be responsible for the creation of automated BI solutions, including requirements, design,development, testing, and deployment
Data Warehouse and Analytics solutions that aggregate data across diverse sources and data types
including text, video and audio through to live stream and IoT in an agile project delivery
environment with a focus on DataOps and Data Observability. You will work with Azure SQL
Databases, Synapse Analytics, Azure Data Factory, Azure Datalake Gen2, Azure Databricks, Azure
Machine Learning, Azure Service Bus, Azure Serverless (LogicApps, FunctionApps), Azure Data
Catalogue and Purview among other tools, gaining opportunities to learn some of the most
advanced and innovative techniques in the cloud data space.
You will be building Power BI based analytics solutions to provide actionable insights into customer
data, and to measure operational efficiencies and other key business performance metrics.
You will be involved in the development, build, deployment, and testing of customer solutions, with
responsibility for the design, implementation and documentation of the technical aspects, including
integration to ensure the solution meets customer requirements. You will be working closely with
fellow architects, engineers, analysts, and team leads and project managers to plan, build and roll
out data driven solutions
Proven expertise in developing data solutions with Azure SQL Server and Azure SQL Data Warehouse (now
Demonstrated expertise of data modelling and data warehouse methodologies and best practices.
Ability to write efficient data pipelines for ETL using Azure Data Factory or equivalent tools.
Integration of data feeds utilising both structured (ex XML/JSON) and flat schemas (ex CSV,TXT,XLSX)
across a wide range of electronic delivery mechanisms (API/SFTP/etc )
Azure DevOps knowledge essential for CI/CD of data ingestion pipelines and integrations.
Scala, etc is required.
Expertise in creating technical and Architecture documentation (ex: HLD/LLD) is a must.
Proven ability to rapidly analyse and design solution architecture in client proposals is an added advantage.
Expertise with big data tools: Hadoop, Spark, Kafka, NoSQL databases, stream-processing systems is a plus.
5 or more years of hands-on experience in a data architect role with the development of ingestion,
integration, data auditing, reporting, and testing with Azure SQL tech stack.
full data and analytics project lifecycle experience (including costing and cost management of data
solutions) in Azure PaaS environment is essential.
Microsoft Azure and Data Certifications, at least fundamentals, are a must.
Experience using agile development methodologies, version control systems and repositories is a must.
A good, applied understanding of the end-to-end data process development life cycle.
A good working knowledge of data warehouse methodology using Azure SQL.
A good working knowledge of the Azure platform, it’s components, and the ability to leverage it’s
resources to implement solutions is a must.
Experience working in the Public sector or in an organisation servicing Public sector is a must,
Ability to work to demanding deadlines, keep momentum and deal with conflicting priorities in an
environment undergoing a programme of transformational change.
The ability to contribute and adhere to standards, have excellent attention to detail and be strongly driven
Experience with AWS or google cloud platforms will be an added advantage.
Experience with Azure ML services will be an added advantage Personal Attributes
Articulated and clear in communications to mixed audiences- in writing, through presentations and one-toone.
Ability to present highly technical concepts and ideas in a business-friendly language.
Ability to effectively prioritise and execute tasks in a high-pressure environment.
Calm and adaptable in the face of ambiguity and in a fast-paced, quick-changing environment
Extensive experience working in a team-oriented, collaborative environment as well as working
Comfortable with multi project multi-tasking consulting Data Architect lifestyle
Excellent interpersonal skills with teams and building trust with clients
Ability to support and work with cross-functional teams in a dynamic environment.
A passion for achieving business transformation; the ability to energise and excite those you work with
Initiative; the ability to work flexibly in a team, working comfortably without direct supervision.
We’re looking to hire someone to help scale Machine Learning and NLP efforts at Episource. You’ll work with the team that develops the models powering Episource’s product focused on NLP driven medical coding. Some of the problems include improving our ICD code recommendations , clinical named entity recognition and information extraction from clinical notes.
This is a role for highly technical machine learning & data engineers who combine outstanding oral and written communication skills, and the ability to code up prototypes and productionalize using a large range of tools, algorithms, and languages. Most importantly they need to have the ability to autonomously plan and organize their work assignments based on high-level team goals.
You will be responsible for setting an agenda to develop and ship machine learning models that positively impact the business, working with partners across the company including operations and engineering. You will use research results to shape strategy for the company, and help build a foundation of tools and practices used by quantitative staff across the company.
What you will achieve:
Define the research vision for data science, and oversee planning, staffing, and prioritization to make sure the team is advancing that roadmap
Invest in your team’s skills, tools, and processes to improve their velocity, including working with engineering counterparts to shape the roadmap for machine learning needs
Hire, retain, and develop talented and diverse staff through ownership of our data science hiring processes, brand, and functional leadership of data scientists
Evangelise machine learning and AI internally and externally, including attending conferences and being a thought leader in the space
Partner with the executive team and other business leaders to deliver cross-functional research work and models
Strong background in classical machine learning and machine learning deployments is a must and preferably with 4-8 years of experience
Knowledge of deep learning & NLP
Hands-on experience in TensorFlow/PyTorch, Scikit-Learn, Python, Apache Spark & Big Data platforms to manipulate large-scale structured and unstructured datasets.
Experience with GPU computing is a plus.
Professional experience as a data science leader, setting the vision for how to most effectively use data in your organization. This could be through technical leadership with ownership over a research agenda, or developing a team as a personnel manager in a new area at a larger company.
Expert-level experience with a wide range of quantitative methods that can be applied to business problems.
Evidence you’ve successfully been able to scope, deliver and sell your own research in a way that shifts the agenda of a large organization.
Excellent written and verbal communication skills on quantitative topics for a variety of audiences: product managers, designers, engineers, and business leaders.
Fluent in data fundamentals: SQL, data manipulation using a procedural language, statistics, experimentation, and modeling
Professional experience as a data science leader, setting the vision for how to most effectively use data in your organization
Expert-level experience with machine learning that can be applied to business problems
Evidence you’ve successfully been able to scope, deliver and sell your own work in a way that shifts the agenda of a large organization
Fluent in data fundamentals: SQL, data manipulation using a procedural language, statistics, experimentation, and modeling
Degree in a field that has very applicable use of data science / statistics techniques (e.g. statistics, applied math, computer science, OR a science field with direct statistics application)
5+ years of industry experience in data science and machine learning, preferably at a software product company
3+ years of experience managing data science teams, incl. managing/grooming managers beneath you
3+ years of experience partnering with executive staff on data topics