Designation: Principal Data Engineer
Experience: Experienced
Position Type: Full Time Position
Location: Hyderabad
Office Timings: 9AM to 6PM
Compensation: As Per Industry standards
About Monarch:
At Monarch, we’re leading the digital transformation of farming. Monarch Tractor augments both muscle and mind with fully loaded hardware, software, and service machinery that will spur future generations of farming technologies. With our farmer-first mentality, we are building a smart tractor that will enhance (not replace) the existing farm ecosystem, alleviate labor availability, and cost issues, and provide an avenue for competitive organic and beyond farming by providing mechanical solutions to replace harmful chemical solutions. Despite all the cutting-edge technology we will incorporate, our tractor will still plow, till, and haul better than any other tractor in its class. We have all the necessary ingredients to develop, build and scale the Monarch Tractor and digitally transform farming around the world.
Description:
Monarch Tractor likes to invite an experience Python data engineer to lead our internal data engineering team in India. This is a unique opportunity to work on computer vision AI data pipelines for electric tractors. You will be dealing with data from a farm environment like videos, images, tractor logs, GPS coordinates and map polygons. You will be responsible for collecting data for research and development. For example, this includes setting up ETL data pipelines to extract data from tractors, loading these data into the cloud and recording AI training results.
This role includes, but not limited to, the following tasks:
● Lead data engineering team
● Own and contribute to more than 50% of the data engineering code base
● Scope out new project requirements
● Costing data pipeline solutions
● Create data engineering tooling
● Design custom data structures for efficient processing of data
Data engineering skills we are looking for:
● Able to work with large amounts of text log data, image data, and video data
● Fluently use AWS cloud solutions like S3, Lambda, and EC2
● Able to work with data from Robot Operating System
Required Experience:
● 3 to 5 years of experience using Python
● 3 to 5 years of experience using PostgreSQL
● 3 to 5 years of experience using AWS EC2, S3, Lambda
● 3 to 5 years of experience using Ubuntu OS or WSL
Good to have experience:
● Ray
● Robot Operating System
What you will get:
At Monarch Tractor, you’ll play a key role on a capable, dedicated, high-performing team of rock stars. Our compensation package includes a competitive salary, excellent health, dental and vision benefits, and company equity commensurate with the role you’ll play in our success.
About Monarch Tractors India
The Monarch Mission
"It is difficult to go green, when you are in the red."™
Monarch Tractor is a mission driven company that is committed to elevating farming practices to enable clean, efficient, and economically viable solutions for today’s farmers and the generations of farmers to come.
We are collaborative partners with a farmer first approach to innovation.
✔️100% Electric
✔️Driver Optional
✔️Data-Driven
Similar jobs
- Experience with Cloud native Data tools/Services such as AWS Athena, AWS Glue, Redshift Spectrum, AWS EMR, AWS Aurora, Big Query, Big Table, S3, etc.
- Strong programming skills in at least one of the following languages: Java, Scala, C++.
- Familiarity with a scripting language like Python as well as Unix/Linux shells.
- Comfortable with multiple AWS components including RDS, AWS Lambda, AWS Glue, AWS Athena, EMR. Equivalent tools in the GCP stack will also suffice.
- Strong analytical skills and advanced SQL knowledge, indexing, query optimization techniques.
- Experience implementing software around data processing, metadata management, and ETL pipeline tools like Airflow.
Experience with the following software/tools is highly desired:
- Apache Spark, Kafka, Hive, etc.
- SQL and NoSQL databases like MySQL, Postgres, DynamoDB.
- Workflow management tools like Airflow.
- AWS cloud services: RDS, AWS Lambda, AWS Glue, AWS Athena, EMR.
- Familiarity with Spark programming paradigms (batch and stream-processing).
- RESTful API services.
TOP 3 SKILLS
Python (Language)
Spark Framework
Spark Streaming
Docker/Jenkins/ Spinakar
AWS
Hive Queries
He/She should be good coder.
Preff: - Airflow
Must have experience: -
Python
Spark framework and streaming
exposure to Machine Learning Lifecycle is mandatory.
Project:
This is searching domain project. Any searching activity which is happening on website this team create the model for the same, they create sorting/scored model for any search. This is done by the data
scientist This team is working more on the streaming side of data, the candidate would work extensively on Spark streaming and there will be a lot of work in Machine Learning.
INTERVIEW INFORMATION
3-4 rounds.
1st round based on data engineering batching experience.
2nd round based on data engineering streaming experience.
3rd round based on ML lifecycle (3rd round can be a techno-functional round based on previous
feedbacks otherwise 4th round will be a functional round if required.
Job Location: Hyderabad/Bangalore/ Chennai/Pune/Nagpur
Notice period: Immediate - 15 days
1. Python Developer with Snowflake
Job Description :
- 5.5+ years of Strong Python Development Experience with Snowflake.
- Strong hands of experience with SQL ability to write complex queries.
- Strong understanding of how to connect to Snowflake using Python, should be able to handle any type of files
- Development of Data Analysis, Data Processing engines using Python
- Good Experience in Data Transformation using Python.
- Experience in Snowflake data load using Python.
- Experience in creating user-defined functions in Snowflake.
- Snowsql implementation.
- Knowledge of query performance tuning will be added advantage.
- Good understanding of Datawarehouse (DWH) concepts.
- Interpret/analyze business requirements & functional specification
- Good to have DBT, FiveTran, and AWS Knowledge.
- 3+ years of Experience majoring in applying AI/ML/ NLP / deep learning / data-driven statistical analysis & modelling solutions.
- Programming skills in Python, knowledge in Statistics.
- Hands-on experience developing supervised and unsupervised machine learning algorithms (regression, decision trees/random forest, neural networks, feature selection/reduction, clustering, parameter tuning, etc.). Familiarity with reinforcement learning is highly desirable.
- Experience in the financial domain and familiarity with financial models are highly desirable.
- Experience in image processing and computer vision.
- Experience working with building data pipelines.
- Good understanding of Data preparation, Model planning, Model training, Model validation, Model deployment and performance tuning.
- Should have hands on experience with some of these methods: Regression, Decision Trees,CART, Random Forest, Boosting, Evolutionary Programming, Neural Networks, Support Vector Machines, Ensemble Methods, Association Rules, Principal Component Analysis, Clustering, ArtificiAl Intelligence
- Should have experience in using larger data sets using Postgres Database.
Job Location: Chennai
Job Summary
The Engineering team is seeking a Data Architect. As a Data Architect, you will drive a
Data Architecture strategy across various Data Lake platforms. You will help develop
reference architecture and roadmaps to build highly available, scalable and distributed
data platforms using cloud based solutions to process high volume, high velocity and
wide variety of structured and unstructured data. This role is also responsible for driving
innovation, prototyping, and recommending solutions. Above all, you will influence how
users interact with Conde Nast’s industry-leading journalism.
Primary Responsibilities
Data Architect is responsible for
• Demonstrated technology and personal leadership experience in architecting,
designing, and building highly scalable solutions and products.
• Enterprise scale expertise in data management best practices such as data integration,
data security, data warehousing, metadata management and data quality.
• Extensive knowledge and experience in architecting modern data integration
frameworks, highly scalable distributed systems using open source and emerging data
architecture designs/patterns.
• Experience building external cloud (e.g. GCP, AWS) data applications and capabilities is
highly desirable.
• Expert ability to evaluate, prototype and recommend data solutions and vendor
technologies and platforms.
• Proven experience in relational, NoSQL, ELT/ETL technologies and in-memory
databases.
• Experience with DevOps, Continuous Integration and Continuous Delivery technologies
is desirable.
• This role requires 15+ years of data solution architecture, design and development
delivery experience.
• Solid experience in Agile methodologies (Kanban and SCRUM)
Required Skills
• Very Strong Experience in building Large Scale High Performance Data Platforms.
• Passionate about technology and delivering solutions for difficult and intricate
problems. Current on Relational Databases and No sql databases on cloud.
• Proven leadership skills, demonstrated ability to mentor, influence and partner with
cross teams to deliver scalable robust solutions..
• Mastery of relational database, NoSQL, ETL (such as Informatica, Datastage etc) /ELT
and data integration technologies.
• Experience in any one of Object Oriented Programming (Java, Scala, Python) and
Spark.
• Creative view of markets and technologies combined with a passion to create the
future.
• Knowledge on cloud based Distributed/Hybrid data-warehousing solutions and Data
Lake knowledge is mandate.
• Good understanding of emerging technologies and its applications.
• Understanding of code versioning tools such as GitHub, SVN, CVS etc.
• Understanding of Hadoop Architecture and Hive SQL
• Knowledge in any one of the workflow orchestration
• Understanding of Agile framework and delivery
•
Preferred Skills:
● Experience in AWS and EMR would be a plus
● Exposure in Workflow Orchestration like Airflow is a plus
● Exposure in any one of the NoSQL database would be a plus
● Experience in Databricks along with PySpark/Spark SQL would be a plus
● Experience with the Digital Media and Publishing domain would be a
plus
● Understanding of Digital web events, ad streams, context models
About Condé Nast
CONDÉ NAST INDIA (DATA)
Over the years, Condé Nast successfully expanded and diversified into digital, TV, and social
platforms - in other words, a staggering amount of user data. Condé Nast made the right
move to invest heavily in understanding this data and formed a whole new Data team
entirely dedicated to data processing, engineering, analytics, and visualization. This team
helps drive engagement, fuel process innovation, further content enrichment, and increase
market revenue. The Data team aimed to create a company culture where data was the
common language and facilitate an environment where insights shared in real-time could
improve performance.
The Global Data team operates out of Los Angeles, New York, Chennai, and London. The
team at Condé Nast Chennai works extensively with data to amplify its brands' digital
capabilities and boost online revenue. We are broadly divided into four groups, Data
Intelligence, Data Engineering, Data Science, and Operations (including Product and
Marketing Ops, Client Services) along with Data Strategy and monetization. The teams built
capabilities and products to create data-driven solutions for better audience engagement.
What we look forward to:
We want to welcome bright, new minds into our midst and work together to create diverse
forms of self-expression. At Condé Nast, we encourage the imaginative and celebrate the
extraordinary. We are a media company for the future, with a remarkable past. We are
Condé Nast, and It Starts Here.
Roles and
Responsibilities
Seeking AWS Cloud Engineer /Data Warehouse Developer for our Data CoE team to
help us in configure and develop new AWS environments for our Enterprise Data Lake,
migrate the on-premise traditional workloads to cloud. Must have a sound
understanding of BI best practices, relational structures, dimensional data modelling,
structured query language (SQL) skills, data warehouse and reporting techniques.
Extensive experience in providing AWS Cloud solutions to various business
use cases.
Creating star schema data models, performing ETLs and validating results with
business representatives
Supporting implemented BI solutions by: monitoring and tuning queries and
data loads, addressing user questions concerning data integrity, monitoring
performance and communicating functional and technical issues.
Job Description: -
This position is responsible for the successful delivery of business intelligence
information to the entire organization and is experienced in BI development and
implementations, data architecture and data warehousing.
Requisite Qualification
Essential
-
AWS Certified Database Specialty or -
AWS Certified Data Analytics
Preferred
Any other Data Engineer Certification
Requisite Experience
Essential 4 -7 yrs of experience
Preferred 2+ yrs of experience in ETL & data pipelines
Skills Required
Special Skills Required
AWS: S3, DMS, Redshift, EC2, VPC, Lambda, Delta Lake, CloudWatch etc.
Bigdata: Databricks, Spark, Glue and Athena
Expertise in Lake Formation, Python programming, Spark, Shell scripting
Minimum Bachelor’s degree with 5+ years of experience in designing, building,
and maintaining AWS data components
3+ years of experience in data component configuration, related roles and
access setup
Expertise in Python programming
Knowledge in all aspects of DevOps (source control, continuous integration,
deployments, etc.)
Comfortable working with DevOps: Jenkins, Bitbucket, CI/CD
Hands on ETL development experience, preferably using or SSIS
SQL Server experience required
Strong analytical skills to solve and model complex business requirements
Sound understanding of BI Best Practices/Methodologies, relational structures,
dimensional data modelling, structured query language (SQL) skills, data
warehouse and reporting techniques
Preferred Skills
Required
Experience working in the SCRUM Environment.
Experience in Administration (Windows/Unix/Network/
plus.
Experience in SQL Server, SSIS, SSAS, SSRS
Comfortable with creating data models and visualization using Power BI
Hands on experience in relational and multi-dimensional data modelling,
including multiple source systems from databases and flat files, and the use of
standard data modelling tools
Ability to collaborate on a team with infrastructure, BI report development and
business analyst resources, and clearly communicate solutions to both
technical and non-technical team members
Key deliverables for the Data Science Engineer would be to help us discover the information hidden in vast amounts of data, and help us make smarter decisions to deliver even better products. Your primary focus will be on applying data mining techniques, doing statistical analysis, and building high-quality prediction systems integrated with our products.
What will you do?
- You will be building and deploying ML models to solve specific business problems related to NLP, computer vision, and fraud detection.
- You will be constantly assessing and improving the model using techniques like Transfer learning
- You will identify valuable data sources and automate collection processes along with undertaking pre-processing of structured and unstructured data
- You will own the complete ML pipeline - data gathering/labeling, cleaning, storage, modeling, training/testing, and deployment.
- Assessing the effectiveness and accuracy of new data sources and data gathering techniques.
- Building predictive models and machine-learning algorithms to apply to data sets.
- Coordinate with different functional teams to implement models and monitor outcomes.
- Presenting information using data visualization techniques and proposing solutions and strategies to business challenges
We would love to hear from you if :
- You have 2+ years of experience as a software engineer at a SaaS or technology company
- Demonstrable hands-on programming experience with Python/R Data Science Stack
- Ability to design and implement workflows of Linear and Logistic Regression, Ensemble Models (Random Forest, Boosting) using R/Python
- Familiarity with Big Data Platforms (Databricks, Hadoop, Hive), AWS Services (AWS, Sagemaker, IAM, S3, Lambda Functions, Redshift, Elasticsearch)
- Experience in Probability and Statistics, ability to use ideas of Data Distributions, Hypothesis Testing and other Statistical Tests.
- Demonstrable competency in Data Visualisation using the Python/R Data Science Stack.
- Preferable Experience Experienced in web crawling and data scraping
- Strong experience in NLP. Worked on libraries such as NLTK, Spacy, Pattern, Gensim etc.
- Experience with text mining, pattern matching and fuzzy matching
Why Tartan?
- Brand new Macbook
- Stock Options
- Health Insurance
- Unlimited Sick Leaves
- Passion Fund (Invest in yourself or your passion project)
- Wind Down
We are actively seeking a Senior Data Engineer experienced in building data pipelines and integrations from 3rd party data sources by writing custom automated ETL jobs using Python. The role will work in partnership with other members of the Business Analytics team to support the development and implementation of new and existing data warehouse solutions for our clients. This includes designing database import/export processes used to generate client data warehouse deliverables.
- 2+ Years experience as an ETL developer with strong data architecture knowledge around data warehousing concepts, SQL development and optimization, and operational support models.
- Experience using Python to automate ETL/Data Processes jobs.
- Design and develop ETL and data processing solutions using data integration tools, python scripts, and AWS / Azure / On-Premise Environment.
- Experience / Willingness to learn AWS Glue / AWS Data Pipeline / Azure Data Factory for Data Integration.
- Develop and create transformation queries, views, and stored procedures for ETL processes, and process automation.
- Document data mappings, data dictionaries, processes, programs, and solutions as per established standards for data governance.
- Work with the data analytics team to assess and troubleshoot potential data quality issues at key intake points such as validating control totals at intake and then upon transformation, and transparently build lessons learned into future data quality assessments
- Solid experience with data modeling, business logic, and RESTful APIs.
- Solid experience in the Linux environment.
- Experience with NoSQL / PostgreSQL preferred
- Experience working with databases such as MySQL, NoSQL, and Postgres, and enterprise-level connectivity experience (such as connecting over TLS and through proxies).
- Experience with NGINX and SSL.
- Performance tune data processes and SQL queries, and recommend and implement data process optimization and query tuning techniques.
What's the role?
Your role as a Principal Engineer will involve working with various team. As a principal engineer, will need full knowledge of the software development lifecycle and Agile methodologies. You will demonstrate multi-tasking skills under tight deadlines and constraints. You will regularly contribute to the development of work products (including analyzing, designing, programming, debugging, and documenting software) and may work with customers to resolve challenges and respond to suggestions for improvements and enhancements. You will setup the standard and principal for the product he/she drives.
- Setup coding practice, guidelines & quality of the software delivered.
- Determines operational feasibility by evaluating analysis, problem definition, requirements, solution development, and proposed solutions.
- Documents and demonstrates solutions by developing documentation, flowcharts, layouts, diagrams, charts, code comments and clear code.
- Prepares and installs solutions by determining and designing system specifications, standards, and programming.
- Improves operations by conducting systems analysis; recommending changes in policies and procedures.
- Updates job knowledge by studying state-of-the-art development tools, programming techniques, and computing equipment; participating in educational opportunities; reading professional publications; maintaining personal networks; participating in professional organizations.
- Protects operations by keeping information confidential.
- Develops software solutions by studying information needs; conferring with users; studying systems flow, data usage, and work processes; investigating problem areas; following the software development lifecycle. Who are you? You are a go-getter, with an eye for detail, strong problem-solving and debugging skills, and having a degree in BE/MCA/M.E./ M Tech degree or equivalent degree from reputed college/university.
Essential Skills / Experience:
- 10+ years of engineering experience
- Experience in designing and developing high volume web-services using API protocols and data formats
- Proficient in API modelling languages and annotation
- Proficient in Java programming
- Experience with Scala programming
- Experience with ETL systems
- Experience with Agile methodologies
- Experience with Cloud service & storage
- Proficient in Unix/Linux operating systems
- Excellent oral and written communication skills Preferred:
- Functional programming languages (Scala, etc)
- Scripting languages (bash, Perl, Python, etc)
- Amazon Web Services (Redshift, ECS etc)