Pipelines should be optimised to handle real-time data, batch-update data and historical data.
Establish scalable, efficient, automated processes for complex, large-scale data analysis.
Write high-quality code to gather and manage large data sets (both real-time and batch data) from multiple sources, perform ETL and store the results in a data warehouse.
Manipulate and analyse complex, high-volume, high-dimensional data from varying sources using a variety of tools and data analysis techniques.
Participate in data pipeline health monitoring and performance optimisation, as well as quality documentation.
Interact with end users/clients and translate business language into technical requirements.
Act independently to expose and resolve problems.
Job Requirements:
2+ years of experience in software development and data pipeline development for enterprise analytics.
2+ years of experience working with Python, with exposure to various warehousing tools.
In-depth experience with any of the commercial tools such as AWS Glue, Talend, Informatica, DataStage, etc.
Experience with various relational databases like MySQL, MSSql, Oracle etc. is a must.
Experience with analytics and reporting tools (Tableau, Power BI, SSRS, SSAS).
Experience in various DevOps practices helping the client to deploy and scale the systems as per requirement.
Strong verbal and written communication skills for working with other developers and business clients.
Knowledge of Logistics and/or Transportation Domain is a plus.
Hands-on experience with traditional databases and ERP systems like Sybase and PeopleSoft.
About Data ToBiz
- 3+ years of experience in the practical implementation and deployment of ML-based systems preferred.
- BE/B Tech or M Tech (preferred) in CS/Engineering with strong mathematical/statistical background
- Strong mathematical and analytical skills, especially statistical and ML techniques, with familiarity with different supervised and unsupervised learning algorithms
- Implementation experience and deep knowledge of Classification, Time Series Analysis, Pattern Recognition, Reinforcement Learning, Deep Learning, Dynamic Programming and Optimisation
- Experience in modeling graph structures related to spatiotemporal systems
- Programming skills in Python
- Experience in developing and deploying on cloud (AWS or Google or Azure)
- Good verbal and written communication skills
- Familiarity with well-known ML libraries and frameworks such as Pandas, Keras, TensorFlow
TOP 3 SKILLS
Python (Language)
Spark Framework
Spark Streaming
Docker/Jenkins/Spinnaker
AWS
Hive Queries
The candidate should be a good coder.
Preferred: Airflow
Must-have experience:
Python
Spark framework and streaming
Exposure to the machine learning lifecycle is mandatory.
Project:
This is a search-domain project. For any search activity that happens on the website, the team creates the corresponding model: the data scientists build sorting/scoring models for each search. The team works mostly on the streaming side of data, so the candidate will work extensively on Spark Streaming, and there will be a lot of work in machine learning.
INTERVIEW INFORMATION
3-4 rounds.
1st round based on data engineering batching experience.
2nd round based on data engineering streaming experience.
3rd round based on ML lifecycle (the 3rd round can be a techno-functional round based on previous feedback; otherwise a 4th, functional round will be held if required).
Title: Data Engineer – Snowflake
Location: Mysore (Hybrid model)
Experience: 2-8 years
Type: Full Time
Walk-in date: 25th Jan 2023 @Mysore
Job Role: We are looking for an experienced Snowflake developer to join our team as a Data Engineer who will work as part of a team to help design and develop data-driven solutions that deliver insights to the business. The ideal candidate is a data pipeline builder and data wrangler who enjoys building data-driven systems from the ground up to drive analytical solutions. You will be responsible for building and optimizing our data as well as building automated processes for production jobs. You will support our software developers, database architects, data analysts and data scientists on data initiatives.
Key Roles & Responsibilities:
- Use advanced Snowflake, Python and SQL to extract data from source systems for ingestion into a data pipeline.
- Design, develop and deploy scalable and efficient data pipelines.
- Analyze and assemble large, complex datasets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements. For example: automating manual processes, optimizing data delivery, re-designing data platform infrastructure for greater scalability.
- Build required infrastructure for optimal extraction, loading, and transformation (ELT) of data from various data sources using AWS and Snowflake leveraging Python or SQL technologies.
- Monitor cloud-based systems and components for availability, performance, reliability, security and efficiency
- Create and configure appropriate cloud resources to meet the needs of the end users.
- As needed, document topology, processes, and solution architecture.
- Share your passion for staying on top of tech trends, experimenting with and learning new technologies
Qualifications & Experience Requirements:
- Bachelor's degree in computer science, computer engineering, or a related field.
- 2-8 years of experience working with Snowflake
- 2+ years of experience with the AWS services.
- Candidate should be able to write stored procedures and functions in Snowflake.
- At least 2 years' experience as a Snowflake developer.
- Strong SQL knowledge.
- Data ingestion into Snowflake using Snowflake procedures.
- ETL experience is a must (any tool).
- Candidate should be familiar with Snowflake architecture.
- Experience working on a migration project.
- Data warehouse (DW) concepts (optional).
- Experience with cloud data storage and compute components including Lambda functions, EC2 instances and containers.
- Experience with data pipeline and workflow management tools: Airflow, etc.
- Experience cleaning, testing, and evaluating data quality from a wide variety of ingestible data sources
- Experience working with Linux and UNIX environments.
- Experience with profiling data, with and without data definition documentation
- Familiar with Git
- Familiar with issue tracking systems like JIRA (Project Management Tool) or Trello.
- Experience working in an agile environment.
Desired Skills:
- Experience in Snowflake. Must be willing to be Snowflake certified in the first 3 months of employment.
- Experience with a stream-processing system: Snowpipe
- Working knowledge of AWS or Azure
- Experience in migrating from on-prem to cloud systems
Designation: Principal Data Engineer
Experience: Experienced
Position Type: Full Time Position
Location: Hyderabad
Office Timings: 9AM to 6PM
Compensation: As Per Industry standards
About Monarch:
At Monarch, we’re leading the digital transformation of farming. Monarch Tractor augments both muscle and mind with fully loaded hardware, software, and service machinery that will spur future generations of farming technologies. With our farmer-first mentality, we are building a smart tractor that will enhance (not replace) the existing farm ecosystem, alleviate labor availability, and cost issues, and provide an avenue for competitive organic and beyond farming by providing mechanical solutions to replace harmful chemical solutions. Despite all the cutting-edge technology we will incorporate, our tractor will still plow, till, and haul better than any other tractor in its class. We have all the necessary ingredients to develop, build and scale the Monarch Tractor and digitally transform farming around the world.
Description:
Monarch Tractor would like to invite an experienced Python data engineer to lead our internal data engineering team in India. This is a unique opportunity to work on computer vision AI data pipelines for electric tractors. You will be dealing with data from a farm environment such as videos, images, tractor logs, GPS coordinates and map polygons. You will be responsible for collecting data for research and development. For example, this includes setting up ETL data pipelines to extract data from tractors, loading this data into the cloud and recording AI training results.
This role includes, but is not limited to, the following tasks:
● Lead data engineering team
● Own and contribute to more than 50% of the data engineering code base
● Scope out new project requirements
● Cost data pipeline solutions
● Create data engineering tooling
● Design custom data structures for efficient processing of data
Data engineering skills we are looking for:
● Able to work with large amounts of text log data, image data, and video data
● Fluent in AWS cloud solutions like S3, Lambda, and EC2
● Able to work with data from Robot Operating System
Required Experience:
● 3 to 5 years of experience using Python
● 3 to 5 years of experience using PostgreSQL
● 3 to 5 years of experience using AWS EC2, S3, Lambda
● 3 to 5 years of experience using Ubuntu OS or WSL
Good to have experience:
● Ray
● Robot Operating System
What you will get:
At Monarch Tractor, you’ll play a key role on a capable, dedicated, high-performing team of rock stars. Our compensation package includes a competitive salary, excellent health, dental and vision benefits, and company equity commensurate with the role you’ll play in our success.
- Proficient with SQL Server/T-SQL programming in creation and optimization of complex Stored Procedures, UDF, CTE and Triggers
- Overall Experience should be between 4 to 7 years
- Experience working in a data warehouse environment and a strong understanding of dimensional data modeling concepts. Experience in SQL server, DW principles and SSIS.
- Should have strong experience in building data transformations with SSIS including importing data from files, and moving data from source to destination.
- Creating new SSIS packages or modifying existing SSIS packages using SQL server
- Debug and fine-tune SSIS processes to ensure accurate and efficient movement of data. Experience with ETL testing & data validation.
- 1+ years of experience with Azure services like Azure Data Factory, Data Flow, Azure Blob Storage, etc.
- 1+ years of experience with developing Azure Data Factory Objects - ADF pipeline, configuration, parameters, variables, Integration services runtime.
- Must be able to build Business Intelligence solutions in a collaborative, agile development environment.
- Reporting experience with Power BI or SSRS is a plus.
- Experience working on an Agile/Scrum team preferred.
- Proven strong problem-solving skills, troubleshooting, and root cause analysis.
- Excellent written and verbal communication skills.
- Collaborate with the business teams to understand the data environment in the organization; develop and lead the Data Scientists team to test and scale new algorithms through pilots and subsequent scaling up of the solutions
- Influence, build and maintain the large-scale data infrastructure required for the AI projects, and integrate with external IT infrastructure/service
- Act as the single point source for all data related queries; strong understanding of internal and external data sources; provide inputs in deciding data-schemas
- Design, develop and maintain the framework for the analytics solutions pipeline
- Provide inputs to the organization’s initiatives on data quality and help implement frameworks and tools for the various related initiatives
- Work in cross-functional teams of software/machine learning engineers, data scientists, product managers, and others to build the AI ecosystem
- Collaborate with the external organizations including vendors, where required, in respect of all data-related queries as well as implementation initiatives
In 2018-19, the mobile games market in India generated over $600 million in revenues. With close to 450 people in its Mumbai and Bangalore offices, Games24x7 is India’s largest mobile games business today and is very well positioned to become the 800-pound gorilla of what will be a $2 billion market by 2022. While Games24x7 continues to invest aggressively in its India centric mobile games, it is also diversifying its business by investing in international gaming and other tech opportunities.
Summary of Role
Position/Role Description:
The candidate will be part of a team managing databases (MySQL, MongoDB, Cassandra) and will be involved in designing, configuring and maintaining databases.
Job Responsibilities:
• Complete involvement in the database requirement starting from the design phase for every project.
• Deploying required database assets on production (DDL, DML)
• Good understanding of MySQL Replication (Master-slave, Master-Master, GTID-based)
• Understanding of MySQL partitioning.
• Good understanding of MySQL logs and configuration.
• Knowledge of ways to schedule backups and restoration.
• Good understanding of MySQL versions and their features.
• Good understanding of the InnoDB engine.
• Exploring ways to optimize the current environment and also lay a good platform for new projects.
• Able to understand and resolve any database related production outages.
Job Requirements:
• BE/B.Tech from a reputed institute
• Experience in Python scripting.
• Experience in shell scripting.
• General understanding of system hardware.
• Experience in MySQL is a must.
• Experience in MongoDB, Cassandra and graph databases is preferred.
• Experience with Percona MySQL tools.
• 6 - 8 years of experience.
Job Location: Bengaluru
Role : Talend developer
Location : Coimbatore
Experience : 4+Years
Skills : Talend, any DB
Notice period : Immediate to 15 Days
Your mission is to help lead the team towards creating solutions that improve the way our business is run. Your knowledge of design, development, coding, testing and application programming will help your team raise their game, meeting your standards as well as satisfying both business and functional requirements. Your expertise in various technology domains will be counted on to set strategic direction and solve complex and mission-critical problems, internally and externally. Your quest to embrace leading-edge technologies and methodologies inspires your team to follow suit.
Responsibilities and Duties :
- As a Data Engineer you will be responsible for the development of data pipelines for numerous applications handling all kinds of data: structured, semi-structured and unstructured. Big data knowledge, especially in Spark and Hive, is highly preferred.
- Work in a team and provide proactive technical oversight; advise development teams, fostering re-use, design for scale, stability, and operational efficiency of data/analytical solutions
Education level :
- Bachelor's degree in Computer Science or equivalent
Experience :
- Minimum 5 years of relevant experience working on production-grade projects, with hands-on, end-to-end software development experience
- Expertise in application, data and infrastructure architecture disciplines
- Expert in designing data integrations using ETL and other data integration patterns
- Advanced knowledge of architecture, design and business processes
Proficiency in :
- Modern programming languages like Java, Python, Scala
- Big Data technologies such as Hadoop, Spark, Hive, Kafka
- Writing decently optimized SQL queries
- Orchestration and deployment tools like Airflow & Jenkins for CI/CD (Optional)
- Responsible for design and development of integration solutions with Hadoop/HDFS, Real-Time Systems, Data Warehouses, and Analytics solutions
- Knowledge of system development lifecycle methodologies, such as Waterfall and Agile.
- An understanding of data architecture and modeling practices and concepts including entity-relationship diagrams, normalization, abstraction, denormalization, dimensional modeling, and metadata modeling practices.
- Experience generating physical data models and the associated DDL from logical data models.
- Experience developing data models for operational, transactional, and operational reporting, including the development of or interfacing with data analysis, data mapping, and data rationalization artifacts.
- Experience enforcing data modeling standards and procedures.
- Knowledge of web technologies, application programming languages, OLTP/OLAP technologies, data strategy disciplines, relational databases, data warehouse development and Big Data solutions.
- Ability to work collaboratively in teams and develop meaningful relationships to achieve common goals
Skills :
Must Know :
- Core big-data concepts
- Spark - PySpark/Scala
- Data integration tool like Pentaho, NiFi, SSIS, etc. (at least 1)
- Handling of various file formats
- Cloud platform - AWS/Azure/GCP
- Orchestration tool - Airflow