• Problem Solving:. Resolving production issues to fix service P1-4 issues. Problems relating to
introducing new technology, and resolving major issues in the platform and/or service.
• Software Development Concepts: Understands and is experienced with the use of a wide range of
programming concepts and is also aware of and has applied a range of algorithms.
• Commercial & Risk Awareness: Able to understand & evaluate both obvious and subtle commercial
risks, especially in relation to a programme.
Experience you would be expected to have
• Cloud: experience with one of the following cloud vendors: AWS, Azure or GCP
• GCP : Experience prefered, but learning essential.
• Big Data: Experience with Big Data methodology and technologies
• Programming : Python or Java worked with Data (ETL)
• DevOps: Understand how to work in a Dev Ops and agile way / Versioning / Automation / Defect
Management – Mandatory
• Agile methodology - knowledge of Jira
Similar jobs
Data Engineering : Senior Engineer / Manager
As Senior Engineer/ Manager in Data Engineering, you will translate client requirements into technical design, and implement components for a data engineering solutions. Utilize a deep understanding of data integration and big data design principles in creating custom solutions or implementing package solutions. You will independently drive design discussions to insure the necessary health of the overall solution.
Must Have skills :
1. GCP
2. Spark streaming : Live data streaming experience is desired.
3. Any 1 coding language: Java/Pyhton /Scala
Skills & Experience :
- Overall experience of MINIMUM 5+ years with Minimum 4 years of relevant experience in Big Data technologies
- Hands-on experience with the Hadoop stack - HDFS, sqoop, kafka, Pulsar, NiFi, Spark, Spark Streaming, Flink, Storm, hive, oozie, airflow and other components required in building end to end data pipeline. Working knowledge on real-time data pipelines is added advantage.
- Strong experience in at least of the programming language Java, Scala, Python. Java preferable
- Hands-on working knowledge of NoSQL and MPP data platforms like Hbase, MongoDb, Cassandra, AWS Redshift, Azure SQLDW, GCP BigQuery etc.
- Well-versed and working knowledge with data platform related services on GCP
- Bachelor's degree and year of work experience of 6 to 12 years or any combination of education, training and/or experience that demonstrates the ability to perform the duties of the position
Your Impact :
- Data Ingestion, Integration and Transformation
- Data Storage and Computation Frameworks, Performance Optimizations
- Analytics & Visualizations
- Infrastructure & Cloud Computing
- Data Management Platforms
- Build functionality for data ingestion from multiple heterogeneous sources in batch & real-time
- Build functionality for data analytics, search and aggregation
● Create and maintain optimal data pipeline architecture.
● Assemble large, complex data sets that meet functional / non-functional
business requirements.
● Building and optimizing ‘big data’ data pipelines, architectures and data sets.
● Maintain, organize & automate data processes for various use cases.
● Identifying trends, doing follow-up analysis, preparing visualizations.
● Creating daily, weekly and monthly reports of product KPIs.
● Create informative, actionable and repeatable reporting that highlights
relevant business trends and opportunities for improvement.
Required Skills And Experience:
● 2-5 years of work experience in data analytics- including analyzing large data sets.
● BTech in Mathematics/Computer Science
● Strong analytical, quantitative and data interpretation skills.
● Hands-on experience with Python, Apache Spark, Hadoop, NoSQL
databases(MongoDB preferred), Linux is a must.
● Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
● Experience with Google Cloud Data Analytics Products such as BigQuery, Dataflow, Dataproc etc. (or similar cloud-based platforms).
● Experience working within a Linux computing environment, and use of
command-line tools including knowledge of shell/Python scripting for
automating common tasks.
● Previous experience working at startups and/or in fast-paced environments.
● Previous experience as a data engineer or in a similar role.
From building entire infrastructures or platforms to solving complex IT challenges, Cambridge Technology helps businesses accelerate their digital transformation and become AI-first businesses. With over 20 years of expertise as a technology services company, we enable our customers to stay ahead of the curve by helping them figure out the perfect approach, solutions, and ecosystem for their business. Our experts help customers leverage the right AI, big data, cloud solutions, and intelligent platforms that will help them become and stay relevant in a rapidly changing world.
No Of Positions: 1
Skills required:
- The ideal candidate will have a bachelor’s degree in data science, statistics, or a related discipline with 4-6 years of experience, or a master’s degree with 4-6 years of experience. A strong candidate will also possess many of the following characteristics:
- Strong problem-solving skills with an emphasis on achieving proof-of-concept
- Knowledge of statistical techniques and concepts (regression, statistical tests, etc.)
- Knowledge of machine learning and deep learning fundamentals
- Experience with Python implementations to build ML and deep learning algorithms (e.g., pandas, numpy, sci-kit-learn, Stats Models, Keras, PyTorch, etc.)
- Experience writing and debugging code in an IDE
- Experience using managed web services (e.g., AWS, GCP, etc.)
- Strong analytical and communication skills
- Curiosity, flexibility, creativity, and a strong tolerance for ambiguity
- Ability to learn new tools from documentation and internet resources.
Roles and responsibilities :
- You will work on a small, core team alongside other engineers and business leaders throughout Cambridge with the following responsibilities:
- Collaborate with client-facing teams to design and build operational AI solutions for client engagements.
- Identify relevant data sources for data wrangling and EDA
- Identify model architectures to use for client business needs.
- Build full-stack data science solutions up to MVP that can be deployed into existing client business processes or scaled up based on clear documentation.
- Present findings to teammates and key stakeholders in a clear and repeatable manner.
Experience :
2 - 14 Yrs
• Solid technical / data-mining skills and ability to work with large volumes of data; extract
and manipulate large datasets using common tools such as Python and SQL other
programming/scripting languages to translate data into business decisions/results
• Be data-driven and outcome-focused
• Must have good business judgment with demonstrated ability to think creatively and
strategically
• Must be an intuitive, organized analytical thinker, with the ability to perform detailed
analysis
• Takes personal ownership; Self-starter; Ability to drive projects with minimal guidance
and focus on high impact work
• Learns continuously; Seeks out knowledge, ideas and feedback.
• Looks for opportunities to build owns skills, knowledge and expertise.
• Experience with big data and cloud computing viz. Spark, Hadoop (MapReduce, PIG,
HIVE)
• Experience in risk and credit score domains preferred
• Comfortable with ambiguity and frequent context-switching in a fast-paced
environment
This profile will include the following responsibilities:
- Develop Parsers for XML and JSON Data sources/feeds
- Write Automation Scripts for product development
- Build API Integrations for 3rd Party product integration
- Perform Data Analysis
- Research on Machine learning algorithms
- Understand AWS cloud architecture and work with 3 party vendors for deployments
- Resolve issues in AWS environmentWe are looking for candidates with:
Qualification: BE/BTech/Bsc-IT/MCA
Programming Language: Python
Web Development: Basic understanding of Web Development. Working knowledge of Python Flask is desirable
Database & Platform: AWS/Docker/MySQL/MongoDB
Basic Understanding of Machine Learning Models & AWS Fundamentals is recommended.
Job Description:
We are looking for a Big Data Engineer who have worked across the entire ETL stack. Someone who has ingested data in a batch and live stream format, transformed large volumes of daily and built Data-warehouse to store the transformed data and has integrated different visualization dashboards and applications with the data stores. The primary focus will be on choosing optimal solutions to use for these purposes, then maintaining, implementing, and monitoring them.
Responsibilities:
- Develop, test, and implement data solutions based on functional / non-functional business requirements.
- You would be required to code in Scala and PySpark daily on Cloud as well as on-prem infrastructure
- Build Data Models to store the data in a most optimized manner
- Identify, design, and implement process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Implementing the ETL process and optimal data pipeline architecture
- Monitoring performance and advising any necessary infrastructure changes.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Proactively identify potential production issues and recommend and implement solutions
- Must be able to write quality code and build secure, highly available systems.
- Create design documents that describe the functionality, capacity, architecture, and process.
- Review peer-codes and pipelines before deploying to Production for optimization issues and code standards
Skill Sets:
- Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
- Proficient understanding of distributed computing principles
- Experience in working with batch processing/ real-time systems using various open-source technologies like NoSQL, Spark, Pig, Hive, Apache Airflow.
- Implemented complex projects dealing with the considerable data size (PB).
- Optimization techniques (performance, scalability, monitoring, etc.)
- Experience with integration of data from multiple data sources
- Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.,
- Knowledge of various ETL techniques and frameworks, such as Flume
- Experience with various messaging systems, such as Kafka or RabbitMQ
- Creation of DAGs for data engineering
- Expert at Python /Scala programming, especially for data engineering/ ETL purposes
Basic Qualifications
- Need to have a working knowledge of AWS Redshift.
- Minimum 1 year of designing and implementing a fully operational production-grade large-scale data solution on Snowflake Data Warehouse.
- 3 years of hands-on experience with building productized data ingestion and processing pipelines using Spark, Scala, Python
- 2 years of hands-on experience designing and implementing production-grade data warehousing solutions
- Expertise and excellent understanding of Snowflake Internals and integration of Snowflake with other data processing and reporting technologies
- Excellent presentation and communication skills, both written and verbal
- Ability to problem-solve and architect in an environment with unclear requirements
We at Datametica Solutions Private Limited are looking for an SQL Lead / Architect who has a passion for the cloud with knowledge of different on-premises and cloud Data implementation in the field of Big Data and Analytics including and not limiting to Teradata, Netezza, Exadata, Oracle, Cloudera, Hortonworks and alike.
Ideal candidates should have technical experience in migrations and the ability to help customers get value from Datametica's tools and accelerators.
Job Description :
Experience: 6+ Years
Work Location: Pune / Hyderabad
Technical Skills :
- Good programming experience as an Oracle PL/SQL, MySQL, and SQL Server Developer
- Knowledge of database performance tuning techniques
- Rich experience in a database development
- Experience in Designing and Implementation Business Applications using the Oracle Relational Database Management System
- Experience in developing complex database objects like Stored Procedures, Functions, Packages and Triggers using SQL and PL/SQL
Required Candidate Profile :
- Excellent communication, interpersonal, analytical skills and strong ability to drive teams
- Analyzes data requirements and data dictionary for moderate to complex projects • Leads data model related analysis discussions while collaborating with Application Development teams, Business Analysts, and Data Analysts during joint requirements analysis sessions
- Translate business requirements into technical specifications with an emphasis on highly available and scalable global solutions
- Stakeholder management and client engagement skills
- Strong communication skills (written and verbal)
About Us!
A global leader in the Data Warehouse Migration and Modernization to the Cloud, we empower businesses by migrating their Data/Workload/ETL/Analytics to the Cloud by leveraging Automation.
We have expertise in transforming legacy Teradata, Oracle, Hadoop, Netezza, Vertica, Greenplum along with ETLs like Informatica, Datastage, AbInitio & others, to cloud-based data warehousing with other capabilities in data engineering, advanced analytics solutions, data management, data lake and cloud optimization.
Datametica is a key partner of the major cloud service providers - Google, Microsoft, Amazon, Snowflake.
We have our own products!
Eagle Data warehouse Assessment & Migration Planning Product
Raven Automated Workload Conversion Product
Pelican Automated Data Validation Product, which helps automate and accelerate data migration to the cloud.
Why join us!
Datametica is a place to innovate, bring new ideas to live, and learn new things. We believe in building a culture of innovation, growth, and belonging. Our people and their dedication over these years are the key factors in achieving our success.
Benefits we Provide!
Working with Highly Technical and Passionate, mission-driven people
Subsidized Meals & Snacks
Flexible Schedule
Approachable leadership
Access to various learning tools and programs
Pet Friendly
Certification Reimbursement Policy
Check out more about us on our website below!
www.datametica.com
We are looking for a savvy Data Engineer to join our growing team of analytics experts.
The hire will be responsible for:
- Expanding and optimizing our data and data pipeline architecture
- Optimizing data flow and collection for cross functional teams.
- Will support our software developers, database architects, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects.
- Must be self-directed and comfortable supporting the data needs of multiple teams, systems and products.
- Experience with Azure : ADLS, Databricks, Stream Analytics, SQL DW, COSMOS DB, Analysis Services, Azure Functions, Serverless Architecture, ARM Templates
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Experience with object-oriented/object function scripting languages: Python, SQL, Scala, Spark-SQL etc.
Nice to have experience with :
- Big data tools: Hadoop, Spark and Kafka
- Data pipeline and workflow management tools: Azkaban, Luigi, Airflow
- Stream-processing systems: Storm
Database : SQL DB
Programming languages : PL/SQL, Spark SQL
Looking for candidates with Data Warehousing experience, strong domain knowledge & experience working as a Technical lead.
The right candidate will be excited by the prospect of optimizing or even re-designing our company's data architecture to support our next generation of products and data initiatives.
- Hands-on programming expertise in Java OR Python
- Strong production experience with Spark (Minimum of 1-2 years)
- Experience in data pipelines using Big Data technologies (Hadoop, Spark, Kafka, etc.,) on large scale unstructured data sets
- Working experience and good understanding of public cloud environments (AWS OR Azure OR Google Cloud)
- Experience with IAM policy and role management is a plus