About Indix
About The Company
The client is a 17-year-old multinational company headquartered in Whitefield, Bangalore, with another delivery center in Hinjewadi, Pune. It also has offices in the US and Germany, works with several OEMs and product companies in about 12 countries, and has a 200+ strong team worldwide.
The Role
Power BI front-end developer in the Data Domain (Manufacturing, Sales & Marketing, Purchasing, Logistics, …). Responsible for the Power BI front-end design, development, and delivery of highly visible data-driven applications in Compressor Technique. You take a quality-first approach, ensuring data is visualized in a clear, accurate, and user-friendly manner. You ensure standards and best practices are followed and that documentation is created and maintained. Where needed, you take initiative and make recommendations to drive improvements. In this role you will also be involved in the tracking, monitoring, and performance analysis of production issues, and in the implementation of bug fixes and enhancements.
Skills & Experience
• The ideal candidate has a degree in Computer Science or Information Technology, or equivalent through experience.
• Strong knowledge of BI development principles, time intelligence functions, dimensional modeling, and data visualization is required.
• Advanced knowledge and 5-10 years of experience with professional BI development and data visualization is preferred.
• You are familiar with data warehouse concepts.
• Knowledge of MS Azure (Data Lake, Databricks, SQL) is considered a plus.
• Experience with scripting languages such as PowerShell and Python to set up and automate Power BI platform-related activities is an asset.
• Good knowledge (oral and written) of English is required.
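The scripting bullet above mentions using Python to automate Power BI platform activities. As one hedged illustration, a standard-library sketch that builds a request against the documented dataset-refresh REST endpoint might look like this (the dataset ID and access token are placeholders, and a real script would obtain the token via Azure AD):

```python
import json
import urllib.request

def build_refresh_request(dataset_id: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that triggers a Power BI
    dataset refresh. dataset_id and access_token are placeholders here."""
    url = f"https://api.powerbi.com/v1.0/myorg/datasets/{dataset_id}/refreshes"
    body = json.dumps({"notifyOption": "MailOnFailure"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )

# Constructing the request does not call the service; urlopen(req) would.
req = build_refresh_request("00000000-0000-0000-0000-000000000000", "<token>")
```

In practice such a script would be scheduled (e.g., via a pipeline or cron) and would handle authentication, retries, and error reporting around the call.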
Role:
We're seeking a proactive leader to head a team and tackle complex business-logic problems. This role requires an individual who can contribute independently, providing valuable insights rather than needing constant guidance. We value depth of knowledge and hands-on experience over spoon-fed solutions.
Technical Requirements:
• Proficiency in data processing technologies, including SQL and PostgreSQL, with the ability to write complex queries.
• Strong coding skills in Python, with hands-on experience developing solutions.
• Knowledge of Scala or Julia is advantageous.
• Experience leading teams and working on projects from inception.
• Must be familiar with ML frameworks such as TensorFlow, OpenAI, LLP, and BUD.
• Some exposure to Airflow is preferred.
• Must have knowledge and hands-on experience with DevOps coding and migration.
• Must have knowledge of Flask or FastAPI.
• Experience working with both Linux and Windows environments.
Experience: 5+ years.
(Note: the candidate should have hands-on experience with all the must-have skills mentioned above.)
- Fix issues with plugins for our Python-based ETL pipelines
- Help with automation of standard workflows
- Deliver Python microservices for provisioning and managing cloud infrastructure
- Take responsibility for refactoring code where needed
- Effectively manage the challenges of handling large volumes of data under tight deadlines
- Manage expectations with internal stakeholders and context-switch in a fast-paced environment
- Thrive in an environment that uses AWS and Elasticsearch extensively
- Keep abreast of technology and contribute to the engineering strategy
- Champion best development practices and provide mentorship to others
- First and foremost, you are a Python developer experienced with the Python data stack
- You love and care about data
- Your code is an artistic manifesto reflecting how elegant you are in what you do
- You feel sparks of joy when a new abstraction or pattern arises from your code
- You follow the DRY (Don't Repeat Yourself) and KISS (Keep It Short and Simple) principles
- You are a continuous learner
- You have a natural willingness to automate tasks
- You think critically and have an eye for detail
- Excellent ability and experience working to tight deadlines
- Sharp analytical and problem-solving skills
- Strong sense of ownership and accountability for your work and delivery
- Excellent written and oral communication skills
- Mature collaboration and mentoring abilities
- We are keen to know your digital footprint (community talks, blog posts, certifications, courses you have taken or are keen to take, personal projects, and any contributions to open-source communities)
- Experience delivering complex software, ideally in a FinTech setting
- Experience with CI/CD tools such as Jenkins and CircleCI
- Experience with version control (Git, Mercurial, Subversion)
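The first responsibility above mentions plugins for Python-based ETL pipelines. One common way such pipelines are structured is a registry of named transforms; this minimal sketch is illustrative only (all names are invented, not this team's actual design):

```python
from typing import Callable, Dict, Iterable, List

# Registry mapping plugin names to row-level transform functions
# (hypothetical design for illustration).
TRANSFORMS: Dict[str, Callable[[dict], dict]] = {}

def transform(name: str):
    """Decorator that registers a transform under a name."""
    def register(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        TRANSFORMS[name] = fn
        return fn
    return register

@transform("normalize_email")
def normalize_email(row: dict) -> dict:
    # Copy the row so the original input is never mutated.
    row = dict(row)
    row["email"] = row["email"].strip().lower()
    return row

def run_pipeline(rows: Iterable[dict], steps: List[str]) -> List[dict]:
    """Apply the named transforms to every row, in order."""
    out = list(rows)
    for step in steps:
        out = [TRANSFORMS[step](r) for r in out]
    return out

rows = [{"email": "  Alice@Example.COM "}]
cleaned = run_pipeline(rows, ["normalize_email"])
# cleaned[0]["email"] == "alice@example.com"
```

The registry pattern keeps each plugin independently testable, which is exactly what makes "fix issues with plugins" a tractable task.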
- Big Data developer with 8+ years of professional IT experience and expertise in Hadoop ecosystem components for ingestion, data modeling, querying, processing, storage, analysis, and data integration, and in implementing enterprise-level Big Data systems.
- A skilled developer with strong problem-solving, debugging, and analytical capabilities, who actively engages in understanding customer requirements.
- Expertise in Apache Hadoop ecosystem components like Spark, Hadoop Distributed File System (HDFS), MapReduce, Hive, Sqoop, HBase, ZooKeeper, YARN, Flume, Pig, NiFi, Scala, and Oozie.
- Hands-on experience creating real-time data streaming solutions using Apache Spark Core, Spark SQL & DataFrames, Kafka, Spark Streaming, and Apache Storm.
- Excellent knowledge of Hadoop architecture and the daemons of Hadoop clusters, including NameNode, DataNode, ResourceManager, NodeManager, and Job History Server.
- Worked with both Cloudera and Hortonworks Hadoop distributions. Experience managing Hadoop clusters using the Cloudera Manager tool.
- Well versed in the installation, configuration, and management of Big Data and the underlying infrastructure of a Hadoop cluster.
- Hands-on experience coding MapReduce/YARN programs using Java, Scala, and Python for analyzing Big Data.
- Exposure to the Cloudera development environment and management using Cloudera Manager.
- Extensively worked on Spark using Scala on clusters for analytics; installed it on top of Hadoop and built advanced analytical applications using Spark with Hive and SQL/Oracle.
- Implemented Spark using Python, utilizing the DataFrames and Spark SQL APIs for faster data processing; handled importing data from different sources into HDFS using Sqoop and performing transformations using Hive and MapReduce before loading the data into HDFS.
- Used Spark Data Frames API over Cloudera platform to perform analytics on Hive data.
- Hands-on experience with Spark MLlib, used for predictive intelligence, customer segmentation, and smooth maintenance in Spark Streaming.
- Experience in using Flume to load log files into HDFS and Oozie for workflow design and scheduling.
- Experience in optimizing MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Working on creating data pipelines for different ingestion and aggregation events, loading consumer response data into Hive external tables in HDFS to serve as a feed for Tableau dashboards.
- Hands on experience in using Sqoop to import data into HDFS from RDBMS and vice-versa.
- In-depth Understanding of Oozie to schedule all Hive/Sqoop/HBase jobs.
- Hands on expertise in real time analytics with Apache Spark.
- Experience in converting Hive/SQL queries into RDD transformations using Apache Spark, Scala and Python.
- Extensive experience in working with different ETL tool environments like SSIS, Informatica and reporting tool environments like SQL Server Reporting Services (SSRS).
- Experience with Microsoft cloud and setting up clusters on Amazon EC2 & S3, including automating the setup and extension of clusters in the AWS cloud.
- Extensively worked on Spark using Python on clusters for analytics; installed it on top of Hadoop and built advanced analytical applications using Spark with Hive and SQL.
- Strong experience and knowledge of real time data analytics using Spark Streaming, Kafka and Flume.
- Knowledge in installation, configuration, supporting and managing Hadoop Clusters using Apache, Cloudera (CDH3, CDH4) distributions and on Amazon web services (AWS).
- Experienced in writing ad hoc queries using Cloudera Impala, including Impala analytical functions.
- Experience creating DataFrames using PySpark and performing operations on them using Python.
- In depth understanding/knowledge of Hadoop Architecture and various components such as HDFS and MapReduce Programming Paradigm, High Availability and YARN architecture.
- Established multiple connections to different Redshift clusters (Bank Prod, Card Prod, SBBDA Cluster) and provided access for pulling the information needed for analysis.
- Generated various knowledge reports using Power BI based on business specifications.
- Developed interactive Tableau dashboards to provide a clear understanding of industry specific KPIs using quick filters and parameters to handle them more efficiently.
- Well experienced in projects using JIRA, testing, Maven, and Jenkins build tools.
- Experienced in designing, building, deploying, and utilizing almost the entire AWS stack (including EC2 and S3), focusing on high availability, fault tolerance, and auto-scaling.
- Good experience with use-case development and software methodologies like Agile and Waterfall.
- Working knowledge of Amazon's Elastic Compute Cloud (EC2) infrastructure for computational tasks and Simple Storage Service (S3) as a storage mechanism.
- Good working experience importing data using Sqoop and SFTP from various sources like RDBMS, Teradata, Mainframes, Oracle, and Netezza into HDFS, and performing transformations on it using Hive, Pig, and Spark.
- Extensive experience in Text Analytics, developing different Statistical Machine Learning solutions to various business problems and generating data visualizations using Python and R.
- Proficient in NoSQL databases including HBase, Cassandra, and MongoDB, and their integration with Hadoop clusters.
- Hands on experience in Hadoop Big data technology working on MapReduce, Pig, Hive as Analysis tool, Sqoop and Flume data import/export tools.
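Several bullets above describe converting Hive/SQL queries into RDD transformations. The shape of that conversion can be sketched locally with plain Python, no Spark required; this is purely illustrative (real code would use pyspark's `filter`/`map`/`reduceByKey` on a distributed RDD, and the sample data is invented):

```python
from collections import defaultdict

# A Hive query like:
#   SELECT category, SUM(amount) FROM sales WHERE amount > 0 GROUP BY category
# maps onto an RDD-style filter -> map -> reduceByKey chain. Here we mimic
# each stage on an in-memory list of (category, amount) records.
sales = [("books", 12.0), ("books", -3.0), ("toys", 5.0), ("toys", 7.5)]

filtered = [row for row in sales if row[1] > 0]    # WHERE amount > 0
keyed = [(cat, amt) for cat, amt in filtered]      # map to (key, value) pairs

totals = defaultdict(float)                        # reduceByKey(operator.add)
for cat, amt in keyed:
    totals[cat] += amt
```

In Spark the same chain runs partition-by-partition across the cluster, with a shuffle at the reduceByKey stage; the logic, however, is exactly this.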
About Quadratyx:
We are a global product-centric insight & automation services company. We help the world's organizations make better and faster decisions using the power of insight and intelligent automation. We build and operationalize their next-gen strategy through Big Data, Artificial Intelligence, Machine Learning, Unstructured Data Processing, and Advanced Analytics. Quadratyx can boast more extensive experience in data sciences & analytics than most other companies in India.
We firmly believe in Excellence Everywhere.
Job Description
Purpose of the Job/ Role:
• As a Technical Lead, your work is a combination of hands-on contribution, customer engagement and technical team management. Overall, you’ll design, architect, deploy and maintain big data solutions.
Key Requisites:
• Expertise in Data structures and algorithms.
• Technical management across the full life cycle of big data (Hadoop) projects from requirement gathering and analysis to platform selection, design of the architecture and deployment.
• Scaling of cloud-based infrastructure.
• Collaborating with business consultants, data scientists, engineers and developers to develop data solutions.
• Leading and mentoring a team of data engineers.
• Hands-on experience in test-driven development (TDD).
• Expertise in NoSQL databases like MongoDB, Cassandra, etc. (MongoDB preferred), and strong knowledge of relational databases.
• Good knowledge of Kafka and Spark Streaming internal architecture.
• Good knowledge of any Application Servers.
• Extensive knowledge of big data platforms like Hadoop and distributions such as Hortonworks.
• Knowledge of data ingestion and integration on cloud services such as AWS, Google Cloud, and Azure.
Skills/ Competencies Required
Technical Skills
• Strong expertise (9 or more out of 10) in at least one modern programming language, like Python, or Java.
• Clear end-to-end experience in designing, programming, and implementing large software systems.
• Passion and analytical abilities to solve complex problems.
Soft Skills
• Always speaking your mind freely.
• Communicating ideas clearly in talking and writing, integrity to never copy or plagiarize intellectual property of others.
• Exercising discretion and independent judgment where needed in performing duties; not needing micro-management, maintaining high professional standards.
Academic Qualifications & Experience Required
Required Educational Qualification & Relevant Experience
• Bachelor’s or Master’s in Computer Science, Computer Engineering, or related discipline from a well-known institute.
• Minimum 7-10 years of work experience as a developer in an IT organization (preferably with an Analytics / Big Data / Data Science / AI background).
o Strong Python development skills, with 7+ years of experience with SQL.
o A bachelor's or master's degree in Computer Science or related areas.
o 5+ years of experience in data integration and pipeline development.
o Experience implementing Databricks Delta Lake and data lakes.
o Expertise designing and implementing data pipelines using modern data engineering approaches and tools: SQL, Python, Delta Lake, Databricks, Snowflake, Spark.
o Experience working with multiple file formats (Parquet, Avro, Delta Lake) and APIs.
o Experience with AWS Cloud for data integration with S3.
o Hands-on development experience with Python and/or Scala.
o Experience with SQL and NoSQL databases.
o Experience using data modeling techniques and tools (focused on dimensional design).
o Experience with microservice architecture using Docker and Kubernetes.
o Experience working with one or more of the public cloud providers, i.e. AWS, Azure, or GCP.
o Experience effectively presenting and summarizing complex data to diverse audiences through visualizations and other means.
o Excellent verbal and written communication skills and strong leadership capabilities.
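One bullet above asks for dimensional design experience. The idea can be shown with a tiny star schema; this sqlite3 sketch is illustrative only (the table and column names are invented for the example, not taken from any real system):

```python
import sqlite3

# Minimal star schema: one fact table keyed to one dimension table.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        category     TEXT
    );
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity    INTEGER,
        revenue     REAL
    );
""")
con.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware")])
con.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                [(1, 3, 30.0), (2, 1, 15.0), (1, 2, 20.0)])

# The typical dimensional query: aggregate the fact table,
# grouped by an attribute of the dimension.
row = con.execute("""
    SELECT d.category, SUM(f.revenue)
    FROM fact_sales f JOIN dim_product d USING (product_key)
    GROUP BY d.category
""").fetchone()
```

The same fact/dimension split scales to Delta Lake or Snowflake tables: facts stay narrow and append-only, while descriptive attributes live in the dimensions.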
Skills:
ML
Modelling
Python
SQL
Azure Data Lake, Data Factory, Databricks, Delta Lake
Basic Qualifications:
∙ Bachelor's in Computer Science/Mathematics + research (Machine Learning, Deep Learning, Statistics, Data Mining, Game Theory, or core mathematical areas) from Tier-1 tech institutes.
∙ 3+ years of relevant experience building large-scale machine learning or deep learning models and/or systems.
∙ 1 year or more of experience specifically with deep learning (CNN, RNN, LSTM, RBM, etc.).
∙ Strong working knowledge of deep learning, machine learning, and statistics.
∙ Deep domain understanding of Personalization, Search, and Visual.
∙ Strong math skills with statistical modeling / machine learning.
∙ Hands-on experience building models with deep learning frameworks like MXNet or TensorFlow.
∙ Experience using Python and statistical/machine learning libraries.
∙ Ability to think creatively and solve problems.
∙ Data presentation skills.
Preferred:
∙ MS/Ph.D. (Machine Learning, Deep Learning, Statistics, Data Mining, Game Theory, or core mathematical areas) from IISc or other top global universities.
∙ Or, publications in highly accredited journals (if available, please share links to your published work).
∙ Or, a history of scaling ML/deep learning algorithms at massively large scale.