Publicis Sapient Overview:
Job Summary:
As Senior Associate L2 in Data Engineering, you will translate client requirements into technical designs and implement components for data engineering solutions. You will apply a deep understanding of data integration and big data design principles to create custom solutions or implement packaged solutions, and will independently drive design discussions to ensure the health of the overall solution.
The role requires a hands-on technologist with a strong programming background in Java, Scala, or Python; experience in data ingestion, integration, wrangling, computation, and analytics pipelines; exposure to Hadoop ecosystem components; and hands-on knowledge of at least one of the AWS, GCP, or Azure cloud platforms.
Role & Responsibilities:
Your role is focused on the design, development, and delivery of solutions involving:
• Data Integration, Processing & Governance
• Data Storage and Computation Frameworks, Performance Optimizations
• Analytics & Visualizations
• Infrastructure & Cloud Computing
• Data Management Platforms
• Implement scalable architectural models for data processing and storage
• Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time mode
• Build functionality for data analytics, search and aggregation
Experience Guidelines:
Mandatory Experience and Competencies:
1. Overall 5+ years of IT experience, with 3+ years in data-related technologies
2. Minimum 2.5 years of experience in Big Data technologies and working exposure to related data services on at least one cloud platform (AWS / Azure / GCP)
3. Hands-on experience with the Hadoop stack – HDFS, Sqoop, Kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, Hive, Oozie, Airflow, and the other components required to build end-to-end data pipelines
4. Strong experience in at least one of Java, Scala, or Python; Java preferred
5. Hands-on working knowledge of NoSQL and MPP data platforms such as HBase, MongoDB, Cassandra, AWS Redshift, Azure SQL DW, GCP BigQuery, etc.
6. Working knowledge of data platform services, IAM, and data security on at least one cloud platform
Preferred Experience and Knowledge (Good to Have):
1. Good knowledge of traditional ETL tools (Informatica, Talend, etc.) and database technologies (Oracle, MySQL, SQL Server, Postgres), with hands-on experience
2. Knowledge of data governance processes (security, lineage, catalog) and tools such as Collibra, Alation, etc.
3. Knowledge of distributed messaging frameworks such as ActiveMQ / RabbitMQ / Solace, search and indexing, and microservices architectures
4.Performance tuning and optimization of data pipelines
5.CI/CD – Infra provisioning on cloud, auto build & deployment pipelines, code quality
6.Cloud data specialty and other related Big data technology certifications
Personal Attributes:
• Strong written and verbal communication skills
• Articulation skills
• Good team player
• Self-starter who requires minimal oversight
• Ability to prioritize and manage multiple tasks
• Process orientation and the ability to define and set up processes
- Minimum 2.5 years of experience as a Python Developer.
- Minimum 2.5 years of experience in a framework such as Django/Flask/FastAPI.
- Minimum 2.5 years of experience in SQL/Postgres.
- Minimum 2.5 years of experience in Git/GitLab/Bitbucket.
- Minimum 2+ years of experience in deployment (CI/CD with Jenkins).
- Minimum 2.5 years of experience in a cloud platform such as AWS/GCP/Azure.
A proficient, independent contributor who assists in the technical design, development, implementation, and support of data pipelines, and who is beginning to invest in less-experienced engineers.
Responsibilities:
- Design, create, and maintain on-premise and cloud-based data integration pipelines.
- Assemble large, complex data sets that meet functional and non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources.
- Build analytics tools that utilize the data pipeline to provide actionable insights into key business performance metrics.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Create data pipelines that enable the BI, Analytics, and Data Science teams to build and optimize their systems.
- Assist in the onboarding, training, and development of team members.
- Review code changes and pull requests for standardization and best practices.
- Evolve existing development to be automated, scalable, resilient, self-serve platforms
- Assist the team in the design and requirements gathering for technical and non technical work to drive the direction of projects
Technical & Business Expertise:
- Hands-on integration experience in SSIS/Mulesoft
- Hands-on experience with Azure Synapse
- Proven advanced experience writing database code in SQL Server
- Proven advanced understanding of Data Lake concepts
- Proven intermediate proficiency writing Python or a similar programming language
- Intermediate understanding of cloud platforms (GCP)
- Intermediate understanding of data warehousing
- Advanced understanding of source control (GitHub)
Role : Senior Customer Scientist
Experience : 6-8 Years
Location : Chennai (Hybrid)
Who are we?
A young, fast-growing AI and big data company with an ambitious vision to simplify the world's choices. Our clients are top-tier enterprises in the banking, e-commerce, and travel spaces. They use our core AI-based choice engine, maya.ai, to deliver personal digital experiences centered around taste. The maya.ai platform now touches over 125M customers globally. You'll find Crayon Boxes in Chennai and Singapore. But you'll find Crayons in every corner of the world, especially where our client projects are – UAE, India, SE Asia, and pretty soon the US.
Life in the Crayon Box is a little chaotic, largely dynamic and keeps us on our toes! Crayons are a diverse and passionate bunch. Challenges excite us. Our mission drives us. And good food, caffeine (for the most part) and youthful energy fuel us. Over the last year alone, Crayon has seen a growth rate of 3x, and we believe this is just the start.
We’re looking for young and young-at-heart professionals with a relentless drive to help Crayon double its growth. Leaders, doers, innovators, dreamers, implementers and eccentric visionaries, we have a place for you all.
Can you say “Yes, I have!” to the below?
- Experience with exploratory analysis, statistical analysis, and model development
- Knowledge of advanced analytics techniques, including predictive modelling (logistic regression), segmentation, forecasting, data mining, and optimization
- Knowledge of software packages such as SAS, R, and RapidMiner for analytical modelling and data management
- Strong experience in SQL, Python, or R, working efficiently at scale with large data sets
- Experience using Business Intelligence tools such as Power BI, Tableau, and Metabase for business applications
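The predictive-modelling item above (logistic regression) can be sketched in a few lines of plain Python. This is a minimal illustration only, not any company's production approach; the toy data, learning rate, and epoch count are invented for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit w, b for P(y=1|x) = sigmoid(w*x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log-loss w.r.t. the logit
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

# Toy, linearly separable data: small x -> class 0, large x -> class 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
```

In practice this is what packages such as SAS, R, or scikit-learn do under the hood (with better optimizers and regularization); the sketch just makes the gradient step explicit.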
Can you say “Yes, I will!” to the below?
- Drive clarity and solve ambiguous, challenging business problems using data-driven approaches. Propose and own data analysis (including modelling, coding, analytics) to drive business insight and facilitate decisions.
- Develop creative solutions and build prototypes to business problems using algorithms based on machine learning, statistics, and optimisation, and work with engineering to deploy those algorithms and create impact in production.
- Perform time-series analyses, hypothesis testing, and causal analyses to statistically assess the relative impact and extract trends
- Coordinate individual teams to fulfil client requirements and manage deliverables
- Communicate and present complex concepts to business audiences
- Travel to client locations when necessary
Crayon is an equal opportunity employer. Employment is based on a person's merit, qualifications, and professional competence. Crayon does not discriminate against any employee or applicant because of race, creed, color, religion, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, marital status, pregnancy, or related conditions.
More about Crayon: https://www.crayondata.com/
More about maya.ai: https://maya.ai/
Location: Pune/Nagpur, Goa, Hyderabad
Job Requirements:
- 9+ years of total experience, preferably in the big data space.
- Experience creating Spark applications in Scala to process data.
- Experience scheduling and troubleshooting/debugging Spark jobs run in steps.
- Experience in Spark job performance tuning and optimization.
- Experience processing data using Kafka/Python.
- Experience and understanding in configuring Kafka topics to optimize performance.
- Proficiency in writing SQL queries to process data in a data warehouse.
- Hands-on experience working with Linux commands to troubleshoot/debug issues and creating shell scripts to automate tasks.
- Experience with AWS services such as EMR.
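The warehouse-SQL requirement above amounts to writing aggregation queries against fact-style tables. A minimal, hedged sketch using Python's built-in sqlite3 as a stand-in warehouse (the `sales` table and its columns are invented for illustration):

```python
import sqlite3

# In-memory database standing in for a data warehouse (schema is illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 100.0), ("north", 50.0), ("south", 75.0)],
)

# Typical warehouse-style aggregation: total sales per region.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
```

Against Redshift, Hive, or another real warehouse, the same `GROUP BY` pattern applies; only the connection layer and scale differ.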
1. Use Python Scrapy to crawl the website
2. Work on dynamic websites and solve crawling challenges
3. Work in a fast-paced startup environment
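The crawling work described above centers on fetching pages and extracting links to follow. Scrapy provides spiders and selectors for this; as a library-free illustration of the link-extraction step only, here is a sketch using the standard-library HTML parser (the sample page is invented; no network fetch is assumed):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects href targets from anchor tags - the core step a crawler repeats."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# A static page stands in for a fetched response.
page = '<html><body><a href="/jobs">Jobs</a><a href="/about">About</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
```

Dynamic sites (the second point above) typically render links via JavaScript, which is why that work needs a headless browser or API reverse-engineering rather than plain HTML parsing.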
Knowledge of Hadoop ecosystem installation, initial-configuration and performance tuning.
Expert with Apache Ambari, Spark, Unix Shell scripting, Kubernetes and Docker
Knowledge of Python is desirable.
Experience with HDP Manager/clients and various dashboards.
Understanding of Hadoop security (Kerberos, Ranger, and Knox), encryption, and data masking.
Experience with automation/configuration management using Chef, Ansible or an equivalent.
Strong experience with any Linux distribution.
Basic understanding of network technologies, CPU, memory and storage.
Database administration experience is a plus.
Qualifications and Education Requirements
2 to 4 years of experience with, and detailed knowledge of, core Hadoop components, solutions, and dashboards running on big data technologies such as Hadoop/Spark.
Bachelor's degree or equivalent in Computer Science, Information Technology, or related fields.
Primary Responsibilities
- Understand current state architecture, including pain points.
- Create and document future state architectural options to address specific issues or initiatives using Machine Learning.
- Innovate and scale architectural best practices around building and operating ML workloads by collaborating with stakeholders across the organization.
- Develop CI/CD & ML pipelines that help to achieve end-to-end ML model development lifecycle from data preparation and feature engineering to model deployment and retraining.
- Provide recommendations around security, cost, performance, reliability, and operational efficiency and implement them
- Provide thought leadership around the use of industry standard tools and models (including commercially available models and tools) by leveraging experience and current industry trends.
- Collaborate with the Enterprise Architect, consulting partners and client IT team as warranted to establish and implement strategic initiatives.
- Make recommendations and assess proposals for optimization.
- Identify operational issues and recommend and implement strategies to resolve problems.
Must have:
- 3+ years of experience in developing CI/CD & ML pipelines for end-to-end ML model/workloads development
- Strong knowledge of ML operations and of DevOps workflows and tools such as Git, AWS CodeBuild & CodePipeline, Jenkins, AWS CloudFormation, and others
- Background in ML algorithm development, AI/ML Platforms, Deep Learning, ML Operations in the cloud environment.
- Strong programming skillset with high proficiency in Python, R, etc.
- Strong knowledge of AWS cloud and its technologies such as S3, Redshift, Athena, Glue, SageMaker etc.
- Working knowledge of databases, data warehouses, data preparation and integration tools, along with big data parallel processing layers such as Apache Spark or Hadoop
- Knowledge of pure and applied math, ML and DL frameworks, and ML techniques, such as random forest and neural networks
- Ability to collaborate with data scientists, data engineers, leaders, and other IT teams
- Ability to work with multiple projects and work streams at one time. Must be able to deliver results based upon project deadlines.
- Willing to flex daily work schedule to allow for time-zone differences for global team communications
- Strong interpersonal and communication skills
ETL Developer – Talend
Job Duties:
- The ETL Developer is responsible for the design and development of ETL jobs that follow standards and best practices and are maintainable, modular, and reusable.
- Proficiency with Talend or Pentaho Data Integration / Kettle.
- The ETL Developer will analyze and review complex object and data models and the metadata repository in order to structure the processes and data for better management and more efficient access.
- Working on multiple projects, and delegating work to Junior Analysts to deliver projects on time.
- Training and mentoring Junior Analysts and building their proficiency in the ETL process.
- Preparing mapping documents for extracting, transforming, and loading data, ensuring compatibility with all tables and requirement specifications.
- Experience in ETL system design and development with Talend / Pentaho PDI is essential.
- Create quality rules in Talend.
- Tune Talend / Pentaho jobs for performance optimization.
- Write relational (SQL) and multidimensional (MDX) database queries.
- Functional knowledge of Talend Administration Center / Pentaho Data Integrator, job servers and load-balancing setup, and all their administrative functions.
- Develop, maintain, and enhance unit test suites to verify the accuracy of ETL processes,
dimensional data, OLAP cubes and various forms of BI content including reports, dashboards,
and analytical models.
- Exposure to MapReduce components of Talend / Pentaho PDI.
- Comprehensive understanding and working knowledge in Data Warehouse loading, tuning, and
maintenance.
- Working knowledge of relational database theory and dimensional database models.
- Creating and deploying Talend / Pentaho custom components is an add-on advantage.
- Java knowledge is nice to have.
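The unit-testing duty above (verifying the accuracy of ETL processes) boils down to asserting row counts, filters, and totals after each transform. A hedged, tool-agnostic sketch in plain Python; the toy transform and its field names are invented, and a real Talend or Pentaho job would be exercised through its own test harness rather than a function call:

```python
def transform(rows):
    """Illustrative ETL step: drop rows with null amounts, convert currency to cents."""
    return [
        {"id": r["id"], "cents": int(round(r["amount"] * 100))}
        for r in rows
        if r["amount"] is not None
    ]

source = [
    {"id": 1, "amount": 10.50},
    {"id": 2, "amount": None},  # should be filtered out by the transform
    {"id": 3, "amount": 2.25},
]
loaded = transform(source)
```

Checks like "output row count equals input minus rejects" and "totals reconcile to the source" are the kind of assertions an ETL test suite runs against dimensional tables and cubes as well.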
Skills and Qualification:
- BE, B.Tech / MS Degree in Computer Science, Engineering or a related subject.
- 3+ years of experience.
- Proficiency with Talend or Pentaho Data Integration / Kettle.
- Ability to work independently.
- Ability to handle a team.
- Good written and oral communication skills.