Real-time media streaming jobs in Pune
Mid / Senior Big Data Engineer
Job Description:
Role: Big Data Engineer
Number of open positions: 5
Location: Pune
At Clairvoyant, we're building a thriving big data practice to help enterprises enable and accelerate the adoption of big data and cloud services. In the big data space, we lead and serve as innovators, troubleshooters, and enablers. The big data practice at Clairvoyant focuses on solving our customers' business problems by delivering products designed with best-in-class engineering practices and a commitment to keeping the total cost of ownership to a minimum.
Must Have:
- 4-10 years of experience in software development.
- At least 2 years of relevant work experience on large scale Data applications.
- Strong coding experience in Java is mandatory
- Good aptitude, strong problem-solving and analytical skills, and the ability to take ownership as appropriate
- Should be able to handle coding, debugging, performance tuning, and deploying applications to production.
- Should have good working experience with:
  - Hadoop ecosystem (HDFS, Hive, YARN, file formats like Avro/Parquet)
  - Kafka
  - J2EE frameworks (Spring/Hibernate/REST)
  - Spark Streaming or another streaming technology
- Ability to take sprint stories to completion, including unit test case coverage.
- Experience working in Agile Methodology
- Excellent communication and coordination skills
- Knowledge of (preferably hands-on experience with) UNIX environments and various continuous integration tools.
- Must be able to integrate quickly into the team and work independently towards team goals
- Take complete responsibility for the execution of sprint stories
- Be accountable for delivering tasks within the defined timelines and with good quality.
- Follow the processes for project execution and delivery.
- Follow agile methodology
- Work closely with the team lead and contribute to the smooth delivery of the project.
- Understand/define the architecture and discuss its pros and cons with the team
- Participate in brainstorming sessions and suggest improvements to the architecture/design.
- Work with other team leads to get the architecture/design reviewed.
- Work with the clients and counterparts (in the US) of the project.
- Keep all stakeholders updated on project/task status, risks, and issues, if any.
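The streaming work described above centers on aggregating keyed events arriving from a feed such as Kafka. As a hedged illustration in plain Python (not actual Spark code; the micro-batch and reduce-by-key shapes merely mirror Spark Streaming concepts, and the sample feed is invented):

```python
from collections import Counter
from itertools import islice

def micro_batches(events, batch_size):
    """Split an event stream into fixed-size micro-batches,
    the way Spark Streaming discretizes a continuous feed."""
    it = iter(events)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def reduce_by_key(batch):
    """Count occurrences per key within one micro-batch
    (loosely analogous to Spark's reduceByKey on a DStream)."""
    return Counter(key for key, _value in batch)

# Simulated (key, value) event feed -- a stand-in for a Kafka topic.
feed = [("click", 1), ("view", 1), ("click", 1), ("view", 1), ("buy", 1)]
counts = [reduce_by_key(b) for b in micro_batches(feed, 3)]
# counts holds one per-key Counter per micro-batch
```

In a real Spark Streaming job, the batching is driven by a time interval rather than a count, but the per-batch keyed aggregation follows the same pattern.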
Experience: 4 to 9 years
Keywords: java, scala, spark, software development, hadoop, hive
Locations: Pune
We are hiring for a Tier 1 MNC: a software developer with good knowledge of Spark, Hadoop, and Scala.
- Minimum 1 year of relevant experience in PySpark (mandatory)
- Hands-on experience developing, testing, deploying, maintaining, and improving data integration pipelines in an AWS cloud environment is an added plus
- Ability to play a lead role and independently manage a 3-5 member PySpark development team
- EMR, Python, and PySpark are mandatory.
- Knowledge of and experience working with AWS cloud technologies like Apache Spark, Glue, Kafka, Kinesis, and Lambda, alongside S3, Redshift, and RDS
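The data integration pipelines named above are, at their core, extract-transform-load steps. A hedged sketch in plain Python of that shape (a PySpark/Glue job would express the same filter-and-map logic on DataFrames; the field names `amount` and `is_large` here are illustrative assumptions, not from any real schema):

```python
import json

def transform(record):
    """One illustrative transform step: cast a field and derive a
    new one (in PySpark this would be a withColumn/map over rows)."""
    out = dict(record)
    out["amount"] = float(record["amount"])   # cast string -> float
    out["is_large"] = out["amount"] >= 100.0  # derived flag
    return out

def run_pipeline(raw_lines):
    """Extract (parse JSON lines), transform, and skip malformed
    records -- the shape of a Glue/PySpark batch job in miniature."""
    results = []
    for line in raw_lines:
        try:
            record = json.loads(line)
            results.append(transform(record))
        except (ValueError, KeyError):
            continue  # drop bad input instead of failing the whole job
    return results

raw = ['{"id": 1, "amount": "250.0"}', 'not json', '{"id": 2, "amount": "40"}']
rows = run_pipeline(raw)
```

Skipping malformed records rather than aborting mirrors a common design choice in production pipelines, where bad rows are usually routed to a dead-letter location instead of stopping the run.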
Good exposure to concepts and/or technologies across the broader spectrum. Enterprise Risk Technology covers a variety of existing systems and green-field projects.
Full-stack Hadoop development experience with Scala development.
Full-stack Java development experience covering Core Java (including JDK 1.8) and a good understanding of design patterns.
Requirements:
• Strong hands-on development in Java technologies.
• Strong hands-on development in Hadoop technologies like Spark and Scala, and experience with Avro.
• Participation in product feature design and documentation
• Requirement break-up, ownership, and implementation.
• Product BAU deliveries and Level 3 production defects fixes.
Qualifications & Experience
• Degree holder in a numerate subject
• Hands on Experience on Hadoop, Spark, Scala, Impala, Avro and messaging like Kafka
• Experience across a core compiled language – Java
• Proficiency in Java-related frameworks like Spring, Hibernate, JPA
• Hands-on experience with JDK 1.8 and a strong skill set covering Collections and Multithreading, with experience working on distributed applications.
• Strong hands-on development track record with end-to-end development cycle involvement
• Good exposure to computational concepts
• Good communication and interpersonal skills
• Working knowledge of risk and derivatives pricing (optional)
• Proficiency in SQL (PL/SQL), data modelling.
• Understanding of Hadoop architecture and the Scala programming language is good to have.
We are looking for a skilled Senior/Lead Big Data Engineer to join our team. The role is part of the research and development team, where, with enthusiasm and knowledge, you will be our technical evangelist for the development of our inspection technology and products.
At Elop we are developing product lines for sustainable infrastructure management using our own patented ultrasound scanner technology, combining it with other sources to provide a holistic overview of concrete structures. At Elop we will provide you with world-class colleagues who are highly motivated to position the company as an international standard in structural health monitoring. With the right character, you will be professionally challenged and developed.
This position requires travel to Norway.
Elop is a sister company of Simplifai, and the two are co-located in all geographic locations.
Roles and Responsibilities
- Define technical scope and objectives through research and participation in requirements gathering and definition of processes
- Ingest and process data from data sources (Elop Scanner) in raw format into the Big Data ecosystem
- Real-time data feed processing using the Big Data ecosystem
- Design, review, implement and optimize data transformation processes in Big Data ecosystem
- Test and prototype new data integration/processing tools, techniques and methodologies
- Conversion of MATLAB code into Python/C/C++.
- Participate in overall test planning for the application integrations, functional areas and projects.
- Work with cross functional teams in an Agile/Scrum environment to ensure a quality product is delivered.
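One responsibility listed above is converting MATLAB code into Python/C/C++. As a hedged, minimal example of what such a port looks like (the MATLAB snippet in the comment is invented for illustration, not from Elop's codebase):

```python
def moving_average(signal, window):
    """Python port of a MATLAB-style causal moving average, e.g.:
        % MATLAB (illustrative):
        % y = filter(ones(1,w)/w, 1, x);
    MATLAB's filter() is causal: each output averages the current
    sample and the w-1 preceding ones (implicit zeros before the start),
    so early outputs are still divided by the full window length."""
    out = []
    for i in range(len(signal)):
        lo = max(0, i - window + 1)
        chunk = signal[lo:i + 1]
        # divide by `window` even when fewer samples exist yet,
        # matching MATLAB filter()'s zero initial conditions
        out.append(sum(chunk) / window)
    return out

y = moving_average([1.0, 2.0, 3.0, 4.0], 2)
# y == [0.5, 1.5, 2.5, 3.5]
```

Matching MATLAB's boundary behavior exactly (here, the implicit leading zeros) is usually the subtle part of these ports; a naive average over available samples would disagree at the start of the signal.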
Desired Candidate Profile
- Bachelor's degree in Statistics, Computer Science, or equivalent
- 7+ years of experience in Big Data ecosystem, especially Spark, Kafka, Hadoop, HBase.
- 7+ years of hands-on experience in Python/Scala is a must.
- Experience in architecting the big data application is needed.
- Excellent analytical and problem solving skills
- Strong understanding of data analytics and data visualization, and must be able to help development team with visualization of data.
- Experience with signal processing is a plus.
- Experience working on client-server architecture is a plus.
- Knowledge about database technologies like RDBMS, Graph DB, Document DB, Apache Cassandra, OpenTSDB
- Good communication skills, written and oral, in English
We can Offer
- An everyday life with exciting and challenging tasks with the development of socially beneficial solutions
- Be a part of the company's Research and Development team to create unique and innovative products
- Colleagues with world-class expertise, and an organization that has ambitions and is highly motivated to position the company as an international player in maintenance support and monitoring of critical infrastructure!
- Good working environment with skilled and committed colleagues in an organization with short decision paths.
- Professional challenges and development
We have a requirement for Collibra Developer
Experience required: 5-12 years
- Experience in Data Governance and Data Quality management
- Experience in at least one of Java, Scala, or Python
- Experience in Big Data technologies (Hadoop/Spark/Hive/Presto/…) and streaming platforms (Kafka/NiFi/Storm)
- Experience in distributed search (Solr/Elasticsearch); in-memory data grids (Redis/Ignite), cloud-native apps, and Kubernetes are a plus
- Experience in building REST services and APIs following best practices of service abstraction and microservices; experience in orchestration frameworks is a plus
- Experience in Agile methodology and CI/CD: tool integration, automation, configuration management
- Being a committer in one of the open-source Big Data technologies (Spark, Hive, Kafka, YARN, Hadoop/HDFS) is an added advantage
Job Role : Associate Manager (Database Development)
Key Responsibilities:
- Optimizing the performance of many stored procedures and SQL queries to deliver large amounts of data in under a few seconds.
- Designing and developing numerous complex queries, views, functions, and stored procedures to work seamlessly with the Application/Development teams' data needs.
- Responsible for providing solutions to all data-related needs to support existing and new applications.
- Creating scalable structures to cater to large user bases and manage high workloads
- Involved in every step of a project, from requirement gathering to implementation and maintenance.
- Developing custom stored procedures and packages to support new enhancement needs.
- Working with multiple teams to design, develop and deliver early warning systems.
- Reviewing query performance and optimizing code
- Writing queries used for front-end applications
- Designing and coding database tables to store the application data
- Data modelling to visualize database structure
- Working with application developers to create optimized queries
- Maintaining database performance by troubleshooting problems.
- Accomplishing platform upgrades and improvements by supervising system programming.
- Securing database by developing policies, procedures, and controls.
- Designing and managing deep statistical systems.
Desired Skills and Experience :
- 7+ years of experience in database development
- Minimum 4+ years of experience in PostgreSQL is a must
- Experience and in-depth knowledge in PL/SQL
- Ability to come up with multiple possible ways of solving a problem and to decide on the approach best suited to the use case
- Knowledge of database administration, with the ability and experience to use CLI tools for administration
- Experience in Big Data technologies is an added advantage
- Secondary platforms: MS SQL 2005/2008, Oracle, MySQL
- Ability to take ownership of tasks, with the flexibility to work individually or in a team
- Ability to communicate with teams and clients across time zones and global regions
- Good communication skills and self-motivation
- Should have the ability to work under pressure
- Knowledge of NoSQL and Cloud Architecture will be an advantage
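Query optimization of the kind this role describes usually starts with checking whether a filter can use an index. A hedged sketch using Python's stdlib sqlite3 for portability (the same workflow applies to PostgreSQL via EXPLAIN; the `orders` table and `customer_id` column are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 50, float(i)) for i in range(1000)],
)

def plan(sql):
    """Return the query plan as text (PostgreSQL equivalent: EXPLAIN)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(str(r) for r in rows)

query = "SELECT * FROM orders WHERE customer_id = 7"
before = plan(query)   # full table scan: no index on customer_id yet

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)    # planner now searches via idx_orders_customer
```

Comparing the plan before and after adding the index makes the optimization visible; in PostgreSQL one would additionally look at EXPLAIN ANALYZE timings and row estimates.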
Develop complex queries, pipelines and software programs to solve analytics and data mining problems
Interact with other data scientists, product managers, and engineers to understand business problems and technical requirements, and deliver predictive and smart data solutions
Prototype new applications or data systems
Lead data investigations to troubleshoot data issues that arise along the data pipelines
Collaborate with different product owners to incorporate data science solutions
Maintain and improve data science platform
Must Have
BS/MS/PhD in Computer Science, Electrical Engineering or related disciplines
Strong fundamentals: data structures, algorithms, database
5+ years of software industry experience with 2+ years in analytics, data mining, and/or data warehouse
Fluency with Python
Experience developing web services using REST approaches.
Proficiency with SQL/Unix/Shell
Experience in DevOps (CI/CD, Docker, Kubernetes)
Self-driven, challenge-loving, detail oriented, teamwork spirit, excellent communication skills, ability to multi-task and manage expectations
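Developing web services using REST approaches, as required above, can be sketched with nothing but the standard library. A hedged minimal example (the `/health` route is an invented endpoint, not part of any framework's API):

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # One illustrative REST resource returning JSON.
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/health"
with urllib.request.urlopen(url) as resp:
    payload = json.load(resp)
server.shutdown()
```

In practice a framework (Flask, FastAPI, Spring) would replace the hand-rolled handler, but the request/route/JSON-response cycle is the same.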
Preferred
Industry experience with big data processing technologies such as Spark and Kafka
Experience with machine learning algorithms and/or R a plus
Experience in Java/Scala a plus
Experience with an MPP analytics engine like Vertica
Experience with data integration tools like Pentaho/SAP Analytics Cloud
empower healthcare payers, providers and members to quickly process medical data to make informed decisions and reduce health care costs. You will be focusing on research, development, strategy, operations, people management, and being a thought leader for team members based out of India. You should have professional healthcare experience using both structured and unstructured data to build applications. These applications include but are not limited to machine learning, artificial intelligence, optical character recognition, natural language processing, and integrating processes into the overall AI pipeline to mine healthcare and medical information with high recall and other relevant metrics. The results will be used dually for real-time operational processes, with both automated and human-based decision making, and will contribute to reducing healthcare administrative costs. We work with all major cloud and big data vendors' offerings (Azure, AWS, Google, IBM, etc.) to achieve our goals in healthcare and support
The Director, Data Science will have the opportunity to build a team, shape team culture
and operating norms as a result of the fast-paced nature of a new, high-growth
organization.
• Strong communication and presentation skills to convey progress to a diverse group of stakeholders
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real-time streaming applications, DevOps and product delivery
• Experience building stakeholder trust and confidence in deployed models, especially via attention to algorithmic bias, interpretable machine learning, data integrity, data quality, reproducible research, and reliable engineering (24x7x365 product availability and scalability)
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• Provide mentoring and career development support to data scientists and machine learning engineers
• Meet regularly with project team members on individual needs related to project/product deliverables
• Provide training and guidance for team members when required
• Provide performance feedback when required by leadership
The Experience You’ll Need (Required):
• MS/M.Tech degree or PhD in Computer Science, Mathematics, Physics or related STEM fields
• Significant healthcare data experience including but not limited to usage of claims data
• Delivered multiple data science and machine learning projects over 8+ years, with values of $10 million or more, on platforms covering more than 10 million member lives
• 9+ years of industry experience in data science, machine learning, and artificial intelligence
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real time streaming applications, DevOps, and product delivery
• Knows how to solve and launch real artificial intelligence and data science problems and products, while managing and coordinating business process change and IT/cloud operations and meeting production-level code standards
• Ownership of key workflows in the data science life cycle, such as data acquisition, data quality, and results
• Experience building stakeholder trust and confidence in deployed models, especially via attention to algorithmic bias, interpretable machine learning, data integrity, data quality, reproducible research, and reliable engineering (24x7x365 product availability and scalability)
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, DevOps support
• 3+ years of experience directly managing five (5) or more senior-level data scientists and machine learning engineers with advanced degrees, having directly made staffing decisions
• Very strong understanding of mathematical concepts, including but not limited to linear algebra, advanced calculus, partial differential equations, and statistics, including Bayesian approaches, at master's degree level and above
• 6+ years of programming experience in C++ or Java or Scala and data science programming languages like Python and R including strong understanding of
concepts like data structures, algorithms, compression techniques, high performance computing, distributed computing, and various computer architecture
• Very strong understanding of and experience with traditional data science approaches like sampling techniques, feature engineering, classification and regression, SVM, trees, and model evaluation, with several projects over 3+ years
• Very strong understanding of and experience in natural language processing, reasoning and understanding, information retrieval, text mining, and search, with 3+ years of hands-on experience
• Experience with developing and deploying several products in production with
experience in two or more of the following languages (Python, C++, Java, Scala)
• Strong Unix/Linux background and experience with at least one of the
following cloud vendors like AWS, Azure, and Google
• Three plus (3+) years of hands-on experience with the MapR/Cloudera/Databricks Big Data platform with Spark, Hive, Kafka, etc.
• Three plus (3+) years of experience with high-performance computing like
Dask, CUDA distributed GPU, TPU etc.
• Presented at major conferences and/or published materials
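The traditional classification approaches listed in these requirements can be illustrated in miniature. A hedged sketch of a nearest-centroid classifier in plain Python (a deliberately simpler stand-in for the SVM/tree methods named above, with made-up 2-D data):

```python
def centroid(points):
    """Mean vector of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def nearest_centroid_fit(labeled):
    """'Train' by computing one centroid per class label."""
    by_label = {}
    for label, point in labeled:
        by_label.setdefault(label, []).append(point)
    return {label: centroid(pts) for label, pts in by_label.items()}

def predict(model, point):
    """Classify a point by its closest class centroid (squared distance)."""
    def sqdist(c):
        return (c[0] - point[0]) ** 2 + (c[1] - point[1]) ** 2
    return min(model, key=lambda label: sqdist(model[label]))

train = [("a", (0.0, 0.0)), ("a", (1.0, 1.0)),
         ("b", (9.0, 9.0)), ("b", (10.0, 10.0))]
model = nearest_centroid_fit(train)
label = predict(model, (8.0, 8.5))
# label == "b": the point lies near the "b" cluster
```

Model evaluation, also listed above, would then compare such predictions against held-out labels; in practice this whole pipeline would be a few lines of scikit-learn rather than hand-rolled code.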
This will include:
- Scorecards
- Strategies
- MIS
The verticals included are:
- Risk
- Marketing
- Product