The Cloudera Data Warehouse Hive team is looking for a passionate senior developer to join our growing engineering team. This group targets the biggest enterprises that want to use Cloudera’s services in private and public cloud environments. Our product is built on open source technologies such as Hive, Impala, Hadoop, Kudu, Spark, and many more, providing unlimited learning opportunities.
A Day in the Life
Over the past 10+ years, Cloudera has experienced tremendous growth, making us the leading contributor to Big Data platforms and ecosystems and a leading provider of enterprise solutions based on Apache Hadoop. You will work with some of the best engineers in the industry, who are tackling challenges that will continue to shape the Big Data revolution. We foster an engaging, supportive, and productive work environment where you can do your best work. The team culture values engineering excellence, technical depth, grassroots innovation, teamwork, and collaboration.
You will manage product development for our CDP components and develop engineering tools and scalable services that enable efficient development, testing, and release operations. You will be immersed in many exciting, cutting-edge technologies and projects, including collaboration with developers, testers, product, field engineers, and our external partners, both software and hardware vendors.
Opportunity:
Cloudera is a leader in the fast-growing big data platforms market. This is a rare chance to make a name for yourself in the industry and in the Open Source world. The candidate will be responsible for Apache Hive and CDW projects. We are looking for a candidate who would like to work on these projects upstream and downstream. If you are curious about the project and code quality, you can check the project and the code at the following link. You can start the development before you join. This is one of the beauties of the OSS world.
Apache Hive
•Build robust and scalable data infrastructure software
•Design and create services and system architecture for your projects
•Improve code quality through writing unit tests, automation, and code reviews
•The candidate would write Java code and/or build several services in the Cloudera Data Warehouse.
•Work with a team of engineers who review each other's code/designs and hold each other to an extremely high bar for the quality of code/designs
•The candidate has to understand the basics of Kubernetes.
•Build out the production and test infrastructure.
•Develop automation frameworks to reproduce issues and prevent regressions.
•Work closely with other developers providing services to our system.
•Help to analyze and to understand how customers use the product and improve it where necessary.
•Deep familiarity with Java programming language.
•Hands-on experience with distributed systems.
•Knowledge of database concepts, RDBMS internals.
•Knowledge of the Hadoop stack, containers, or Kubernetes is a strong plus.
•Has experience working in a distributed team.
•Has 3+ years of experience in software development.
At Anarock Tech, we are building a modern technology platform with automated analytics and reporting tools. This offers timely solutions to our real estate clients while delivering financially favourable and efficient results.
If it excites you to drive innovation, create industry-first solutions, build new capabilities from the ground up, and work with multiple new technologies, ANAROCK is the place for you.
We are looking for a Machine Learning (ML) Engineer to help us create artificial intelligence products. A Machine Learning Engineer’s responsibilities include creating machine learning models and retraining systems. To do this job successfully, you need exceptional skills in statistics and programming. If you also have knowledge of data science and software engineering, we’d like to meet you. Your ultimate goal will be to shape and build efficient self-learning applications.
Key job responsibilities
- Designing and developing machine learning and deep learning systems
- Running machine learning tests and experiments
- Implementing appropriate ML algorithms
- Good understanding of Algorithms and data structures.
- Experience in designing scalable ML systems is a plus
- Expert programming experience in at least one general programming language such as Ruby, Python, Java, Elixir, or C/C++ (strong OO skills preferred)
- Experience with ML/DL frameworks
- Experience with pandas, numpy etc.
Skills that will help you build a success story with us
- Worked in a start-up environment with high levels of ownership and full dedication.
- Experience in NoSQL datastores like Redis, MongoDB, CouchDB etc., with an understanding of underlying sharding and scaling techniques
- Experience in building highly scalable business applications, which involve implementing large complex business flows and dealing with a huge amount of data
Experience: 1 - 4 years
Locations: Bangalore or Mumbai or Gurgaon
Anarock Ethos - Values Over Value:
Our assurance of consistent ethical dealing with clients and partners reflects our motto - Values Over Value.
We value diversity within ANAROCK Group and are committed to offering equal opportunities in employment. We do not discriminate against any team member or applicant for employment based on nationality, race, color, religion, caste, gender identity / expression, sexual orientation, disability, social origin and status, indigenous status, political opinion, age, marital status or any other personal characteristics or status. ANAROCK Group values all talent and will do its utmost to hire, nurture and grow it.
● Create and maintain optimal data pipeline architecture.
● Assemble large, complex data sets that meet functional / non-functional business requirements.
● Build and optimize ‘big data’ data pipelines, architectures and data sets.
● Maintain, organize & automate data processes for various use cases.
● Identify trends, do follow-up analysis, and prepare visualizations.
● Create daily, weekly and monthly reports of product KPIs.
● Create informative, actionable and repeatable reporting that highlights relevant business trends and opportunities for improvement.
Required Skills And Experience:
● 2-5 years of work experience in data analytics, including analyzing large data sets.
● BTech in Mathematics/Computer Science
● Strong analytical, quantitative and data interpretation skills.
● Hands-on experience with Python, Apache Spark, Hadoop, NoSQL databases (MongoDB preferred), and Linux is a must.
● Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
● Experience with Google Cloud Data Analytics Products such as BigQuery, Dataflow, Dataproc etc. (or similar cloud-based platforms).
● Experience working within a Linux computing environment, and use of
command-line tools including knowledge of shell/Python scripting for
automating common tasks.
● Previous experience working at startups and/or in fast-paced environments.
● Previous experience as a data engineer or in a similar role.
Who Are We?
Vahak (https://www.vahak.in) is India’s largest and most trusted online transport marketplace and directory for road transport businesses and individual commercial vehicle (Trucks, Trailers, Containers, Hyva, LCVs) owners. It supports online truck and load booking, transport business branding, and transport business network expansion. Lorry owners can find intercity and intracity loads from all over India and connect with other businesses to find trusted transporters and the best deals in the Indian logistics services market. With the Vahak app, users can book loads and lorries from a live transport marketplace with over 7 lakh transporters and lorry owners in more than 10,000 locations for daily transport requirements.
Vahak has raised over $5 million in a Pre-Series A round from RTP Global, with participation from Luxor Capital and Leo Capital. Other marquee angel investors include Kunal Shah, Founder and CEO, CRED; Jitendra Gupta, Founder and CEO, Jupiter; Vidit Aatrey and Sanjeev Barnwal, Co-founders, Meesho; Mohd Farid, Co-founder, Sharechat; Amrish Rau, CEO, Pine Labs; Harsimarbir Singh, Co-founder, Pristyn Care; Rohit and Kunal Bahl, Co-founders, Snapdeal; and Ravish Naresh, Co-founder and CEO, Khatabook.
Lead Data Engineer:
We at Vahak are looking for an enthusiastic and passionate Data Engineering lead to join our young & diverse team. You will play a key role in the data science group, working with state-of-the-art big data technologies, building pipelines for various data sources, and developing the organization’s data lake and data warehouse.
Our goal as a group is to drive powerful big data analytics products with scalable results. We love people who are humble and collaborative, with a hunger for excellence.
- Should act as a technical resource for the Data Science team and be involved in creating and implementing current and future Analytics projects like data lake design, data warehouse design, etc.
- Analysis and design of ETL solutions to store/fetch data from multiple systems like Google Analytics, CleverTap, CRM systems etc.
- Developing and maintaining data pipelines for real time analytics as well as batch analytics use cases.
- Collaborate with data scientists and actively work in the feature engineering and data preparation phase of model building
- Collaborate with product development and dev ops teams in implementing the data collection and aggregation solutions
- Ensure quality and consistency of the data in Data warehouse and follow best data governance practices
- Analyze large amounts of information to discover trends and patterns
- Mine and analyze data from company databases to drive optimization and improvement of product development, marketing techniques and business strategies.
- Bachelor’s or Master’s degree in a highly numerate discipline such as Engineering, Science, or Economics
- 5+ years of proven experience working as a Data Engineer, preferably at an e-commerce, web-based, or consumer technology company
- Hands-on experience working with big data tools like Hadoop, Spark, Flink, Kafka and so on
- Good understanding of the AWS ecosystem for big data analytics
- Hands-on experience in creating data pipelines, either using tools or by independently writing scripts
- Hands-on experience in scripting languages like Python, Scala, Unix shell and so on
- Strong problem solving skills with an emphasis on product development.
- Experience using business intelligence tools e.g. Tableau, Power BI would be an added advantage (not mandatory)
CustomerGlu is a low code interactive user engagement platform. We're backed by Techstars and top-notch VCs from the US like Better Capital and SmartStart.
As we begin building repeatability in our core product offering at CustomerGlu, building high-quality data infrastructure and applications is emerging as a key requirement to drive more ROI from our interactive engagement programs and to generate ideas for new campaigns.
Hence we are adding more team members to our existing data team and looking for a Data Engineer.
- Design and build a high-performing data platform that is responsible for the extraction, transformation, and loading of data.
- Develop low-latency real-time data analytics and segmentation applications.
- Set up infrastructure for easily building data products on top of the data platform.
- Be responsible for logging, monitoring, and error recovery of data pipelines.
- Build workflows for automated scheduling of data transformation processes.
- Able to lead a team
- 3+ years of experience and ability to manage a team
- Experience working with databases like MongoDB and DynamoDB.
- Knowledge of building batch data processing applications using Apache Spark.
- Understanding of how backend services like HTTP APIs and Queues work.
- Write good quality, maintainable code in one or more programming languages like Python, Scala, and Java.
- Working knowledge of version control systems like Git.
- Experience in real-time data processing using Apache Kafka or AWS Kinesis.
- Experience with AWS tools like Lambda and Glue.
Who Are We
Orbo is a research-oriented company with expertise in computer vision and artificial intelligence. At its core, Orbo is a comprehensive platform offering an AI-based visual enhancement stack. Companies can find a product suited to their needs, where deep-learning-powered technology automatically improves their imagery.
ORBO's solutions help the BFSI, beauty and personal care digital transformation, and e-commerce image retouching industries in multiple ways.
- Join a top AI company
- Grow with your best companions
- Continuous pursuit of excellence, equality, respect
- Competitive compensation and benefits
You'll be a part of the core team and will be working directly with the founders in building and iterating upon the core products that make cameras intelligent and images more informative.
We are looking for a computer vision engineer to lead our team in developing a factory floor analytics SaaS product. This would be a fast-paced role and the person will get an opportunity to develop an industrial grade solution from concept to deployment.
- Research and develop computer vision solutions for industries (BFSI, Beauty and personal care, E-commerce, Defence etc.)
- Lead a team of ML engineers in developing an industrial AI product from scratch
- Set up an end-to-end deep learning pipeline for data ingestion, preparation, model training, validation and deployment
- Tune the models to achieve high accuracy rates and minimum latency
- Deploying developed computer vision models on edge devices after optimization to meet customer requirements
- Bachelor’s degree
- Understanding of the depth and breadth of computer vision and deep learning algorithms.
- Experience in taking an AI product from scratch to commercial deployment.
- Experience in Image enhancement, object detection, image segmentation, image classification algorithms
- Experience in deployment with OpenVINO, ONNX Runtime and TensorRT
- Experience in deploying computer vision solutions on edge devices such as Intel Movidius and Nvidia Jetson
- Experience with machine/deep learning frameworks like TensorFlow and PyTorch.
- Proficient understanding of code versioning tools, such as Git
Our perfect candidate is someone who:
- is proactive and an independent problem solver
- is a constant learner. We are a fast growing start-up. We want you to grow with us!
- is a team player and good communicator
What We Offer:
- You will have fun working with a fast-paced team on a product that can impact the business model of E-commerce and BFSI industries. As the team is small, you will easily be able to see a direct impact of what you build on our customers (Trust us - it is extremely fulfilling!)
- You will be in charge of what you build and be an integral part of the product development process
- Technical and financial growth!
- Around 6 to 8.5 years of experience and around 4+ years in the AI / machine learning space
- Extensive experience in designing large-scale machine learning solutions for ML use cases, large-scale deployments, and establishing a continuous automated improvement / retraining framework.
- Strong experience in Python and Java is required.
- Hands-on experience with Scikit-learn, Pandas, NLTK
- Experience in handling time-series data and associated techniques like Prophet, LSTM
- Experience in Regression, Clustering, classification algorithms
- Extensive experience in building traditional machine learning models (SVM, XGBoost, decision trees) and deep neural network models (RNN, feedforward) is required.
- Experience with AutoML tools such as TPOT
- Must have strong hands on experience in Deep learning frameworks like Keras, TensorFlow or PyTorch
- Knowledge of capsule networks, reinforcement learning, or SageMaker is a desirable skill
- Understanding of the financial domain is a desirable skill
- Design and implementation of solutions for ML Use cases
- Productionize systems and maintain them
- Lead and implement data acquisition process for ML work
- Learn new methods and models quickly and apply them in solving use cases
Location: Chennai- Guindy Industrial Estate
Duration: Full time role
Company: Mobile Programming (https://www.mobileprogramming.com/)
Client Name: Samsung
We are looking for a Data Engineer to join our growing team of analytics experts. The hire will be
responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing
data flow and collection for cross functional teams. The ideal candidate is an experienced data pipeline
builder and data wrangler who enjoys optimizing data systems and building them from the ground up.
The Data Engineer will support our software developers, database architects, data analysts and data
scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout
ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple
teams, systems and products.
Responsibilities for Data Engineer
Create and maintain optimal data pipeline architecture.
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes,
optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data
from a wide variety of data sources using SQL and AWS big data technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer
acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with
data-related technical issues and support their data infrastructure needs.
Create data tools for analytics and data scientist team members that assist them in building and
optimizing our product into an innovative industry leader.
Work with data and analytics experts to strive for greater functionality in our data systems.
Qualifications for Data Engineer
Experience building and optimizing big data ETL pipelines, architectures and data sets.
Advanced working SQL knowledge and experience working with relational databases, query
authoring (SQL) as well as working familiarity with a variety of databases.
Experience performing root cause analysis on internal and external data and processes to
answer specific business questions and identify opportunities for improvement.
Strong analytic skills related to working with unstructured datasets.
Build processes supporting data transformation, data structures, metadata, dependency and
A successful history of manipulating, processing and extracting value from large disconnected
Working knowledge of message queuing, stream processing and highly scalable ‘big data’ data
Strong project management and organizational skills.
Experience supporting and working with cross-functional teams in a dynamic environment.
We are looking for a candidate with 3-6 years of experience in a Data Engineer role, who has
attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:
Experience with big data tools: Spark, Kafka, HBase, Hive etc.
Experience with relational SQL and NoSQL databases
Experience with AWS cloud services: EC2, EMR, RDS, Redshift
Experience with stream-processing systems: Storm, Spark-Streaming, etc.
Experience with object-oriented/object function scripting languages: Python, Java, Scala, etc.
Skills: Big Data, AWS, Hive, Spark, Python, SQL