Data cleansing Jobs in Chennai

11+ Data cleansing Jobs in Chennai | Data cleansing Job openings in Chennai

Apply to 11+ Data cleansing Jobs in Chennai on CutShort.io. Explore the latest Data cleansing Job opportunities across top companies like Google, Amazon & Adobe.

Kaleidofin


3 recruiters
Posted by Poornima B
Chennai, Bengaluru (Bangalore)
2 - 4 yrs
Best in industry
Machine Learning (ML)
Python
SQL
Customer Acquisition
Big Data
+2 more
Responsibility
  • Partnering with internal business owners (product, marketing, edit, etc.) to understand needs and develop custom analysis to optimize for user engagement and retention
  • Good understanding of the underlying business and workings of cross functional teams for successful execution
  • Design and develop analyses based on business requirement needs and challenges.
  • Leveraging statistical analysis on consumer research and data mining projects, including segmentation, clustering, factor analysis, multivariate regression, predictive modeling, etc.
  • Providing statistical analysis on custom research projects and consult on A/B testing and other statistical analysis as needed. Other reports and custom analysis as required.
  • Identify and use appropriate investigative and analytical technologies to interpret and verify results.
  • Apply and learn a wide variety of tools and languages to achieve results
  • Use best practices to develop statistical and/or machine learning techniques to build models that address business needs.

Requirements
  • 2 - 4 years of relevant experience in data science.
  • Preferred education: Bachelor's degree in a technical field or equivalent experience.
  • Experience in advanced analytics, model building, statistical modeling, optimization, and machine learning algorithms.
  • Machine Learning Algorithms: Crystal-clear understanding of, and experience coding, implementing, analyzing errors in, and tuning, Linear Regression, Logistic Regression, SVM, shallow Neural Networks, clustering, Decision Trees, Random Forest, XGBoost, Recommender Systems, ARIMA and Anomaly Detection. Feature selection, hyperparameter tuning, model selection and error analysis, boosting and ensemble methods.
  • Strong with programming languages like Python and data processing using SQL or equivalent and ability to experiment with newer open source tools.
  • Experience in normalizing data to ensure it is homogeneous and consistently formatted to enable sorting, query and analysis.
  • Experience designing, developing, implementing and maintaining a database and programs to manage data analysis efforts.
  • Experience with big data and cloud computing viz. Spark, Hadoop (MapReduce, PIG, HIVE).
  • Experience in risk and credit score domains preferred.
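
As a rough illustration of the data-normalization requirement above (making records homogeneous and consistently formatted so they can be sorted and queried), here is a minimal Python sketch. The field names (`name`, `phone`, `joined`), the accepted date formats, and the `normalize_record` helper are all hypothetical and are not part of the job description.

```python
import re
from datetime import datetime

# Assumed input date formats; a real dataset would need its own list.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def normalize_record(record):
    """Return a copy of a raw record with consistently formatted fields."""
    # Collapse repeated whitespace and use title case for names.
    name = " ".join(record["name"].split()).title()

    # Keep only the digits of the phone number.
    phone = re.sub(r"\D", "", record["phone"])

    # Convert any accepted date format to ISO 8601; None if nothing matches.
    joined = None
    for fmt in DATE_FORMATS:
        try:
            joined = datetime.strptime(record["joined"].strip(), fmt).date().isoformat()
            break
        except ValueError:
            continue

    return {"name": name, "phone": phone, "joined": joined}


raw = {"name": "  asha   KUMAR ", "phone": "+91 98765-43210", "joined": "05/03/2021"}
clean = normalize_record(raw)
print(clean)
```

Once every record has the same shape, sorting by `joined` or grouping by `phone` works without per-row special cases, which is the point of the requirement.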
Ganit Business Solutions


3 recruiters
Posted by Viswanath Subramanian
Chennai, Bengaluru (Bangalore), Mumbai
4 - 6 yrs
₹7L - ₹15L / yr
SQL
Amazon Web Services (AWS)
Data Warehouse (DWH)
Informatica
ETL
+1 more

Responsibilities:

  • Must be able to write quality code and build secure, highly available systems.
  • Assemble large, complex datasets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc., with guidance.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Monitoring performance and advising any necessary infrastructure changes.
  • Defining data retention policies.
  • Implementing the ETL process and optimal data pipeline architecture.
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
  • Create design documents that describe the functionality, capacity, architecture, and process.
  • Develop, test, and implement data solutions based on finalized design documents.
  • Work with data and analytics experts to strive for greater functionality in our data systems.
  • Proactively identify potential production issues and recommend and implement solutions

Skillsets:

  • Good understanding of optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
  • Proficient understanding of distributed computing principles
  • Experience in working with batch processing / real-time systems using various open-source technologies like NoSQL, Spark, Pig, Hive, Apache Airflow.
  • Implemented complex projects dealing with considerable data sizes (PB).
  • Optimization techniques (performance, scalability, monitoring, etc.)
  • Experience with integration of data from multiple data sources
  • Experience with NoSQL databases, such as HBase, Cassandra, MongoDB, etc.
  • Knowledge of various ETL techniques and frameworks, such as Flume
  • Experience with various messaging systems, such as Kafka or RabbitMQ
  • Good understanding of Lambda Architecture, along with its advantages and drawbacks
  • Creation of DAGs for data engineering
  • Expert at Python/Scala programming, especially for data engineering / ETL purposes
Agiletech Info Solutions pvt ltd
Chennai
4 - 8 yrs
₹4L - ₹15L / yr
ETL
Informatica
Data Warehouse (DWH)
Spark
SQL
+1 more
We are looking for a Data Engineer to join our growing team of analytics experts. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross-functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up.

The Data Engineer will support our software developers, database architects, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products.
Responsibilities for Data Engineer
• Create and maintain optimal data pipeline architecture.
• Assemble large, complex data sets that meet functional / non-functional business requirements.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
• Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS big data technologies.
• Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
• Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
• Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
• Work with data and analytics experts to strive for greater functionality in our data systems.
Qualifications for Data Engineer
• Experience building and optimizing big data ETL pipelines, architectures and data sets.
• Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
• Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
• Strong analytic skills related to working with unstructured datasets.
• Build processes supporting data transformation, data structures, metadata, dependency and workload management.
• A successful history of manipulating, processing and extracting value from large disconnected datasets.
vThink Global Technologies
Posted by Balasubramanian Ramaiyar
Chennai
4 - 7 yrs
₹8L - ₹15L / yr
SQL
ETL
Informatica
Data Warehouse (DWH)
Stored Procedures
+1 more
We are looking for a strong SQL Developer well versed and hands-on in SQL, Stored Procedures, Joins and ETL. A data-savvy individual with advanced SQL skills.
Chennai, Hyderabad
5 - 10 yrs
₹10L - ₹25L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+2 more

Big data with cloud:

Experience: 5-10 years
Location: Hyderabad/Chennai
Notice period: 15-20 days max

1.  Expertise in building AWS data engineering pipelines with AWS Glue -> Athena -> QuickSight
2.  Experience in developing Lambda functions with AWS Lambda
3.  Expertise with Spark/PySpark – the candidate should be hands-on with PySpark code and able to do transformations with Spark
4.  Should be able to code in Python and Scala.
5.  Snowflake experience will be a plus

Agiletech Info Solutions pvt ltd
Posted by Kalaithendral Nagarajan
Chennai
4 - 9 yrs
₹4L - ₹12L / yr
Data Analytics
Data Visualization
PowerBI
Tableau
Qlikview
+5 more

4 - 8 years of overall experience.

  • 1-2 years’ experience in Azure Data Factory: scheduling jobs in Flows and ADF Pipelines, performance tuning, error logging, etc.
  • 1+ years of experience with Power BI: designing and developing reports, dashboards, metrics and visualizations in Power BI.
  • (Required) Participate in video conferencing calls: daily stand-up meetings and all-day working with team members on cloud migration planning, development, and support.
  • Proficiency in relational database concepts and design using star schema, Azure Data Warehouse, and data vault.
  • Requires 2-3 years of experience with SQL scripting (merge, joins, and stored procedures) and best practices.
  • Knowledge of deploying and running SSIS packages in Azure.
  • Knowledge of Azure Databricks.
  • Ability to write and execute complex SQL queries and stored procedures.
Amazon India

58 recruiters
Posted by Tanya Thakur
Chennai
5 - 12 yrs
₹10L - ₹22L / yr
Spotfire
Qlikview
Tableau
PowerBI
Data Visualization
+5 more

BASIC QUALIFICATIONS

  • 2+ years' experience in program or project management
  • Project handling experience using Six Sigma/Lean processes
  • Experience interpreting data to make business recommendations
  • Bachelor’s degree or higher in Operations, Business, Project Management, Engineering
  • 5-10 years' experience in project / Customer Satisfaction, with a proven success record
  • Understand basic and systematic approaches to manage projects/programs
  • Structured problem-solving approach to identify and fix problems
  • Open-minded, creative and proactive thinking
  • Pioneer to invent and make differences
  • Understanding of customer experience, listening to customers' voice and working backwards to improve business process and operations
  • Certification in Six Sigma

PREFERRED QUALIFICATIONS

  • Automation skills with experience in advanced SQL, Python, Tableau
DFCS Technologies
Agency job
via dfcs Technologies by SheikDawood Ali
Remote, Chennai, Anywhere India
1 - 5 yrs
₹9L - ₹14L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
+5 more
  • Create and maintain optimal data pipeline architecture.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
  • Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
  • Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
  • Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
  • Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
  • Work with data and analytics experts to strive for greater functionality in our data systems.

  • Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
  • Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Strong analytic skills related to working with unstructured datasets.
  • Build processes supporting data transformation, data structures, metadata, dependency and workload management.
  • A successful history of manipulating, processing and extracting value from large disconnected datasets.
  • Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
  • Strong project management and organizational skills.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • We are looking for a candidate with 5+ years of experience in a Data Engineer role, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:
    • Experience with big data tools: Hadoop, Spark, Kafka, etc.
    • Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
    • Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
    • Experience with AWS cloud services: EC2, EMR, RDS, Redshift
    • Experience with stream-processing systems: Storm, Spark-Streaming, etc.
    • Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.
Bungee Tech India
Posted by Abigail David
Remote, NCR (Delhi | Gurgaon | Noida), Chennai
5 - 10 yrs
₹10L - ₹30L / yr
Big Data
Hadoop
Apache Hive
Spark
ETL
+3 more

Company Description

At Bungee Tech, we help retailers and brands meet customers everywhere and on every occasion they are in. We believe that accurate, high-quality data matched with compelling market insights empowers retailers and brands to keep their customers at the center of all the innovation and value they deliver.

We provide retailers and brands with a clear and complete omnichannel picture of their competitive landscape. We collect billions of data points every day, multiple times a day, from publicly available sources. Using high-quality extraction, we uncover detailed information on products or services, which we automatically match and then proactively track for price, promotion, and availability. Plus, anything we do not match helps to identify a new assortment opportunity.

Empowered with this unrivalled intelligence, we unlock compelling analytics and insights that, once blended with verified partner data from trusted sources such as Nielsen, paint a complete, consolidated picture of the competitive landscape.

We are looking for a Big Data Engineer who will work on the collecting, storing, processing, and analyzing of huge sets of data. The primary focus will be on choosing optimal solutions to use for these purposes, then maintaining, implementing, and monitoring them.

You will also be responsible for integrating them with the architecture used in the company.

 

We're working on the future. If you are seeking an environment where you can drive innovation, if you want to apply state-of-the-art software technologies to solve real-world problems, if you want the satisfaction of providing visible benefit to end users in an iterative, fast-paced environment, this is your opportunity.

Responsibilities

As an experienced member of the team, in this role, you will:

  • Contribute to evolving the technical direction of analytical systems and play a critical role in their design and development.
  • Research, design and code, troubleshoot and support. What you create is also what you own.
  • Develop the next generation of automation tools for monitoring and measuring data quality, with associated user interfaces.
  • Be able to broaden your technical skills and work in an environment that thrives on creativity, efficient execution, and product innovation.

BASIC QUALIFICATIONS

  • Bachelor’s degree or higher in an analytical area such as Computer Science, Physics, Mathematics, Statistics, Engineering or similar.
  • 5+ years relevant professional experience in Data Engineering and Business Intelligence
  • 5+ years of experience with advanced SQL (analytical functions), ETL, Data Warehousing.
  • Strong knowledge of data warehousing concepts, including data warehouse technical architectures, infrastructure components, ETL/ELT and reporting/analytic tools and environments, data structures, data modeling and performance tuning.
  • Ability to effectively communicate with both business and technical teams.
  • Excellent coding skills in Java, Python, C++, or equivalent object-oriented programming language
  • Understanding of relational and non-relational databases and basic SQL
  • Proficiency with at least one of these scripting languages: Perl / Python / Ruby / shell script

PREFERRED QUALIFICATIONS

  • Experience with building data pipelines from application databases.
  • Experience with AWS services: S3, Redshift, Spectrum, EMR, Glue, Athena, ELK, etc.
  • Experience working with Data Lakes.
  • Experience providing technical leadership and mentoring other engineers on best practices in the data engineering space
  • Sharp problem-solving skills and ability to resolve ambiguous requirements
  • Experience working with Big Data
  • Knowledge of and experience working with Hive and the Hadoop ecosystem
  • Knowledge of Spark
  • Experience working with Data Science teams
LatentView Analytics
Bengaluru (Bangalore), Chennai
9 - 14 yrs
₹9L - ₹14L / yr
Data Structures
Business Development
Data Analytics
Regression Testing
Machine Learning (ML)
+4 more
Required Skill Set:
  • 5+ years of hands-on experience in delivering results-driven analytics solutions with proven business value
  • Great consulting and quantitative skills, a detail-oriented approach, and proven expertise in developing solutions using SQL, R, Python or similar tools
  • A background in Statistics / Econometrics / Applied Math / Operations Research would be considered a plus
  • Exposure to working with globally dispersed teams based out of India or other offshore locations

Role Description / Responsibilities:
  • Be the face of LatentView in the client's organization and help define analytics-driven consulting solutions to business problems
  • Translate business problems into analytic solution requirements and work with the LatentView team to develop high-quality solutions
  • Communicate effectively with the client / offshore team to manage client expectations and ensure timeliness and quality of insights
  • Develop expertise in the client's business and help translate that into increasingly high value-added advisory solutions for the client
  • Oversee project delivery to ensure the team meets the quality, productivity and SLA objectives
  • Grow the account in terms of revenue and the size of the team

You should apply if you want to:
  • Change the world with Math and Models: At the core, we believe that analytics can help drive business transformation and lasting competitive advantage. We work with a heavy mix of algorithms, analysis, large databases and ROI to positively transform many a client's business performance
  • Make a direct impact on business: Your contribution to delivering results-driven solutions can potentially lead to millions of dollars of additional revenue or profit for our clients
  • Thrive in a fast-paced environment: You work in small teams, in an entrepreneurial environment, and a meritocratic culture that values speed, growth, diversity and contribution
  • Work with great people: Our selection process ensures that we hire only the very best, while more than 50% of our analysts and 90% of our managers are alumni of prestigious global institutions
GeakMinds Technologies Pvt Ltd
Posted by John Richardson
Chennai
1 - 5 yrs
₹1L - ₹6L / yr
Hadoop
Big Data
HDFS
Apache Sqoop
Apache Flume
+2 more
  • Looking for a Big Data Engineer with 3+ years of experience.
  • Hands-on experience with MapReduce-based platforms, like Pig, Spark, Shark.
  • Hands-on experience with data pipeline tools like Kafka, Storm, Spark Streaming.
  • Store and query data with Sqoop, Hive, MySQL, HBase, Cassandra, MongoDB, Drill, Phoenix, and Presto.
  • Hands-on experience in managing Big Data on a cluster with HDFS and MapReduce.
  • Handle streaming data in real time with Kafka, Flume, Spark Streaming, Flink, and Storm.
  • Experience with Azure cloud, Cognitive Services, Databricks is preferred.