About Simplilearn Solutions
Simplilearn is the most popular online boot camp for digital skills, helping learners build the abilities they need to flourish in today's digital economy. It provides intensive online training in areas such as data science, cloud computing, cyber security, digital marketing, and project management. In other words, Simplilearn focuses on niches characterized by rapidly advancing technology and standards of practice, and by a significant gap between the demand for and supply of qualified professionals.
Simplilearn offers comprehensive certification programs, individual courses, and partnerships with some of the most prestigious universities in the world. Through these offerings, it helps millions of professionals develop the work-ready skills they need to excel in their careers, and thousands of organizations meet their employee upskilling and corporate training needs. 85 percent of Simplilearn's learners have either advanced in their current jobs or found new ones as a direct result of the program's hands-on, practical approach.
Data Engineer - Senior
Cubera is a data company revolutionizing big data analytics and AdTech through data-share-value principles, wherein users entrust their data to us. We refine the art of understanding, processing, extracting, and evaluating the data entrusted to us. We are a gateway for brands to increase their lead efficiency as the world moves towards Web3.
What are you going to do?
Design and develop high-performance, scalable solutions that meet the needs of our customers.
Work closely with Product Management, Architects, and cross-functional teams.
Build and deploy large-scale systems in Java/Python.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Create data tools for analytics and data scientist team members that assist them in building and optimizing their algorithms.
Follow best practices that can be adopted in the Big Data stack.
Use your engineering experience and technical skills to drive features and mentor engineers.
What are we looking for (Competencies):
Bachelor’s degree in computer science, computer engineering, or related technical discipline.
Overall 5 to 8 years of programming experience in Java and Python, including object-oriented design.
Data handling frameworks: Should have a working knowledge of one or more data handling frameworks like Hive, Spark, Storm, Flink, Beam, Airflow, NiFi, etc.
Data Infrastructure: Should have experience in building, deploying, and maintaining applications on popular cloud infrastructure like AWS, GCP, etc.
Data Store: Must have expertise in one of the general-purpose NoSQL data stores like Elasticsearch, MongoDB, Redis, Redshift, etc.
Strong sense of ownership, focus on quality, responsiveness, efficiency, and innovation.
Ability to work with distributed teams in a collaborative and productive manner.
Competitive salary packages and benefits.
A collaborative, lively, and upbeat work environment with young professionals.
Job Category: Development
Job Type: Full Time
Job Location: Bangalore
- Must be able to write quality code and build secure, highly available systems.
- Assemble large, complex datasets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc., with guidance.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Monitor performance and advise on any necessary infrastructure changes.
- Define data retention policies.
- Implement the ETL process and optimal data pipeline architecture.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
- Create design documents that describe the functionality, capacity, architecture, and process.
- Develop, test, and implement data solutions based on finalized design documents.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Proactively identify potential production issues and recommend and implement solutions
- Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Proficient understanding of distributed computing principles
- Experience working with batch-processing/real-time systems using various open-source technologies like NoSQL, Spark, Pig, Hive, Apache Airflow.
- Implemented complex projects dealing with considerable data sizes (petabyte scale).
- Optimization techniques (performance, scalability, monitoring, etc.)
- Experience with integration of data from multiple data sources
- Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.
- Knowledge of various ETL techniques and frameworks, such as Flume
- Experience with various messaging systems, such as Kafka or RabbitMQ
- Good understanding of Lambda Architecture, along with its advantages and drawbacks
- Creation of DAGs for data engineering
- Expert at Python/Scala programming, especially for data engineering/ETL purposes
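The ETL responsibilities listed above (extraction, transformation, loading, data-quality rules) reduce to a simple repeating pattern. The following is a minimal, library-free sketch; the source rows, field names, and cleansing rules are all invented for illustration, and a production pipeline would run steps like these as tasks in an orchestrator such as Airflow:

```python
# Minimal ETL sketch: the extract -> transform -> load shape of the
# pipelines described above. All data and rules here are hypothetical.

def extract(source_rows):
    """Pull raw records from a source (here: an in-memory list)."""
    return list(source_rows)

def transform(rows):
    """Cleanse and reshape: drop incomplete rows, normalize fields."""
    cleaned = []
    for row in rows:
        if row.get("user_id") is None:
            continue  # data-quality rule: skip rows missing the key
        cleaned.append({"user_id": row["user_id"],
                        "country": row.get("country", "unknown").lower()})
    return cleaned

def load(rows, sink):
    """Write transformed rows to a destination (here: a list as a table)."""
    sink.extend(rows)
    return len(rows)

raw = [{"user_id": 1, "country": "IN"}, {"user_id": None}, {"user_id": 2}]
table = []
loaded = load(transform(extract(raw)), table)
print(loaded)  # 2 rows survive cleansing
```

Keeping each stage a pure function makes the pipeline easy to test in isolation and to wrap in a DAG later.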
We are looking for an exceptional Software Developer for our Data Engineering India team who can contribute to building a world-class big data engineering stack that will fuel our Analytics and Machine Learning products. This person will contribute to the architecture, operation, and enhancement of:
- Our petabyte-scale data platform, with a key focus on finding solutions that can support the Analytics and Machine Learning product roadmap. Every day, terabytes of ingested data need to be processed and made available for querying and insight extraction across various use cases.
About the Organisation:
- It provides a dynamic, fun workplace filled with passionate individuals. We are at the cutting edge of advertising technology and there is never a dull moment at work.
- We have a truly global footprint, with our headquarters in Singapore and offices in Australia, United States, Germany, United Kingdom, and India.
- You will gain work experience in a global environment: our staff speak over 20 different languages, come from more than 16 different nationalities, and over 42% are multilingual.
Software Developer, Data Engineering team
Location: Pune (initially 100% remote due to Covid-19 for the coming year)
- Our bespoke Machine Learning pipelines. This will also provide opportunities to
contribute to the prototyping, building, and deployment of Machine Learning models.
- At least 4 years' experience.
- Deep technical understanding of Java or Golang.
- Production experience with Python is a big plus and an extremely valuable supporting skill.
- Exposure to modern Big Data tech: Cassandra/Scylla, Kafka, Ceph, the Hadoop stack, Spark, Flume, Hive, Druid, etc., while at the same time understanding that certain problems may require completely novel solutions.
- Exposure to one or more modern ML tech stacks: Spark ML-Lib, TensorFlow, Keras,
GCP ML Stack, AWS Sagemaker - is a plus.
- Experience includes working in Agile/Lean model
- Experience with supporting and troubleshooting large systems
- Exposure to configuration management tools such as Ansible or Salt
- Exposure to IaaS platforms such as AWS, GCP, Azure, etc.
- Good addition: experience working with large-scale data
- Good addition: experience architecting, developing, and operating data warehouses, big data analytics platforms, and high-velocity data pipelines
**** Not looking for a Big Data Developer / Hadoop Developer
Responsibilities:
- Write and maintain production-level code in Python for deploying machine learning models
- Create and maintain deployment pipelines through CI/CD tools (preferably GitLab CI)
- Implement alerts and monitoring for prediction accuracy and data drift detection
- Implement automated pipelines for training and replacing models
- Work closely with the data science team to deploy new models to production
Required Qualifications:
- Degree in Computer Science, Data Science, IT or a related discipline
- 2+ years of experience in software engineering or data engineering
- Programming experience in Python
- Experience in data profiling, ETL development, testing and implementation
- Experience in deploying machine learning models
Good to have:
- Experience with AWS resources for ML and data engineering (SageMaker, Glue, Athena, Redshift, S3)
- Experience in deploying TensorFlow models
- Experience in deploying and managing MLflow
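One way to picture the data-drift monitoring responsibility above: compare a live feature's distribution against its training baseline and alert on large shifts. This is a hedged, dependency-free sketch; the z-score threshold and feature values are invented for the example, and production systems typically use richer tests (e.g. PSI or Kolmogorov-Smirnov):

```python
# Hypothetical data-drift check: flag when a live feature's mean moves
# far from the training baseline, measured in baseline std deviations.
import math

def summarize(values):
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(var)

def drifted(baseline, live, z_threshold=3.0):
    """True when the live mean is more than z_threshold baseline
    standard deviations away from the baseline mean."""
    base_mean, base_std = summarize(baseline)
    live_mean, _ = summarize(live)
    if base_std == 0:
        return live_mean != base_mean
    return abs(live_mean - base_mean) / base_std > z_threshold

train_ages = [30, 32, 31, 29, 33, 30, 31]  # training baseline
live_ages = [52, 55, 50, 54, 53, 51, 56]   # population has shifted
print(drifted(train_ages, live_ages))      # True: alert and retrain
```

A check like this would run on a schedule inside the CI/CD or monitoring pipeline and feed the alerting described above.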
- Improve the robustness of Leena AI's current NLP stack
- Increase the zero-shot learning capability of Leena AI's current NLP stack
- Opportunity to add/build new NLP architectures based on requirements
- Manage the end-to-end lifecycle of data in the system until it achieves more than 90% accuracy
- Manage an NLP team
- Strong understanding of linear algebra, optimisation, probability, statistics
- Experience in the data science methodology, from exploratory data analysis, feature engineering, and model selection to deployment of the model at scale and model evaluation
- Experience in deploying NLP architectures in production
- Understanding of latest NLP architectures like transformers is good to have
- Experience in adversarial attacks/robustness of DNN is good to have
- Experience with a Python web framework (Django) and analytics and machine learning frameworks like TensorFlow/Keras/PyTorch.
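Since the list above calls out transformer architectures, here is the operation at their core, scaled dot-product attention, as a tiny plain-Python sketch. The matrices are toy values chosen for illustration; a real NLP stack would express this with PyTorch or TensorFlow tensors:

```python
# Scaled dot-product attention, the core of transformer architectures:
# each query attends to the keys, and the resulting softmax weights
# mix the value vectors. No ML framework needed for the arithmetic.
import math

def softmax(xs):
    m = max(xs)                       # subtract max for stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """softmax(QK^T / sqrt(d)) V, computed row by row."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]                      # one query vector
K = [[1.0, 0.0], [0.0, 1.0]]          # two keys
V = [[1.0, 2.0], [3.0, 4.0]]          # two values
print(attention(Q, K, V))             # ≈ [[1.66, 2.66]]
```

The query aligns more with the first key, so the output leans toward the first value vector; stacking this with learned projections and multiple heads gives the transformer layer.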
· 10+ years of Information Technology experience, preferably with Telecom / wireless service providers.
· Experience in designing data solutions following Agile practices (SAFe methodology); designing for testability, deployability, and releasability; rapid prototyping, data modeling, and decentralized innovation
· Able to demonstrate an understanding, and ideally use, of at least one recognised architecture framework or standard, e.g. TOGAF, the Zachman Architecture Framework, etc.
· The ability to apply data, research, and professional judgment and experience to ensure our products are making the biggest difference to consumers
· Demonstrated ability to work collaboratively
· Excellent written, verbal and social skills - You will be interacting with all types of people (user experience designers, developers, managers, marketers, etc.)
· Ability to work in a fast paced, multiple project environment on an independent basis and with minimal supervision
· Technologies: .NET, AWS, Azure; Azure Synapse, NiFi, RDS, Apache Kafka, Azure Databricks, Azure Data Lake Storage, Power BI, Reporting Analytics, QlikView, on-prem SQL data warehouse; BSS, OSS & Enterprise Support Systems
SQL, Python, NumPy, Pandas; knowledge of Hive and data warehousing concepts will be a plus.
- Strong analytical skills with the ability to collect, organise, analyse and interpret trends or patterns in complex data sets and provide reports & visualisations.
- Work with management to prioritise business KPIs and information needs.
- Locate and define new process improvement opportunities.
- Technical expertise with data models, database design and development, data mining and segmentation techniques
- Proven success in a collaborative, team-oriented environment
- Working experience with geospatial data will be a plus.
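A minimal illustration of the trend-analysis work described above: grouping records by a dimension and computing a per-group summary is the basic building block of KPI reporting. The sales data, field names, and numbers below are made up; in practice this is a one-liner with Pandas `groupby`:

```python
# Group records by a key column and compute per-group averages,
# the core of the KPI/trend reporting described above.
from collections import defaultdict

def average_by(records, key, value):
    """Return {group: mean of `value`} across `records`."""
    totals = defaultdict(lambda: [0.0, 0])
    for rec in records:
        totals[rec[key]][0] += rec[value]
        totals[rec[key]][1] += 1
    return {k: s / n for k, (s, n) in totals.items()}

sales = [{"region": "north", "revenue": 100},
         {"region": "north", "revenue": 300},
         {"region": "south", "revenue": 200}]
print(average_by(sales, "region", "revenue"))
# {'north': 200.0, 'south': 200.0}
```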
• Excellent understanding of machine learning techniques and algorithms, such as SVM, Decision Forests, k-NN, Naive Bayes etc.
• Experience in selecting features, building and optimizing classifiers using machine learning techniques.
• Prior experience with data visualization tools, such as D3.js, ggplot, etc.
• Good knowledge of statistics, such as distributions, statistical testing, regression, etc.
• Adequate presentation and communication skills to explain results and methodologies to non-technical stakeholders.
• A basic understanding of the banking industry is a value-add.
• Develop, process, cleanse and enhance data collection procedures from multiple data sources.
• Conduct & deliver experiments and proof of concepts to validate business ideas and potential value.
• Test, troubleshoot and enhance the developed models in distributed environments to improve their accuracy.
• Work closely with product teams to implement algorithms with Python and/or R.
• Design and implement scalable predictive models and classifiers leveraging machine learning and regression.
• Facilitate integration with enterprise applications using APIs to enrich implementations
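As a concrete instance of the classifier-building skills listed above, here is a tiny k-nearest-neighbours classifier, one of the algorithms the posting names. The 2-D training points and labels are invented for illustration; a real implementation would use scikit-learn's `KNeighborsClassifier`:

```python
# A minimal k-nearest-neighbours classifier in plain Python:
# predict the majority label among the k closest training points.
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of ((x, y), label) pairs. Returns the majority
    label of the k points nearest to query (squared Euclidean)."""
    by_dist = sorted(train, key=lambda p: (p[0][0] - query[0]) ** 2
                                        + (p[0][1] - query[1]) ** 2)
    labels = [label for _, label in by_dist[:k]]
    return Counter(labels).most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(knn_predict(train, (0.5, 0.5)))  # "a": nearest cluster wins
print(knn_predict(train, (5.5, 5.5)))  # "b"
```

Feature scaling matters for distance-based methods like this, which is one reason the feature-selection and preprocessing experience listed above goes hand in hand with the algorithm itself.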