11+ Data Cleansing Jobs in Chennai
Apply to 11+ data cleansing jobs in Chennai on CutShort.io. Explore the latest data cleansing job opportunities across top companies like Google, Amazon & Adobe.
- Partnering with internal business owners (product, marketing, edit, etc.) to understand needs and develop custom analyses to optimize for user engagement and retention
- Good understanding of the underlying business and the workings of cross-functional teams for successful execution
- Design and develop analyses based on business requirements and challenges.
- Leveraging statistical analysis on consumer research and data mining projects, including segmentation, clustering, factor analysis, multivariate regression, predictive modeling, etc.
- Providing statistical analysis on custom research projects, consulting on A/B testing and other statistical analyses as needed, and producing other reports and custom analyses as required.
- Identify and use appropriate investigative and analytical technologies to interpret and verify results.
- Apply and learn a wide variety of tools and languages to achieve results.
- Use best practices to develop statistical and/or machine learning techniques to build models that address business needs.
Requirements
- 2-4 years of relevant experience in data science.
- Preferred education: Bachelor's degree in a technical field or equivalent experience.
- Experience in advanced analytics, model building, statistical modeling, optimization, and machine learning algorithms.
- Machine Learning Algorithms: a clear understanding of, and coding, implementation, error-analysis, and model-tuning knowledge for, Linear Regression, Logistic Regression, SVM, shallow Neural Networks, clustering, Decision Trees, Random Forests, XGBoost, Recommender Systems, ARIMA, and Anomaly Detection; also feature selection, hyperparameter tuning, model selection and error analysis, and boosting and ensemble methods (see the model-tuning sketch after this list).
- Strong programming skills in Python, data processing using SQL or equivalent, and the ability to experiment with newer open-source tools.
- Experience in normalizing data to ensure it is homogeneous and consistently formatted to enable sorting, querying, and analysis.
- Experience designing, developing, implementing, and maintaining databases and programs to manage data analysis efforts.
- Experience with big data and cloud computing, e.g., Spark and Hadoop (MapReduce, Pig, Hive).
- Experience in the risk and credit scoring domains preferred.
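As an illustration of the model-selection and hyperparameter-tuning knowledge this list asks for, here is a minimal scikit-learn sketch; the dataset, parameter grid, and scoring choice are illustrative assumptions, not part of the posting:

```python
# Minimal model-selection sketch: tune a random forest with cross-validation.
# The dataset and parameter grid are illustrative placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import classification_report

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameter grid to search over (kept small for brevity).
param_grid = {"n_estimators": [100, 300], "max_depth": [5, 10, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=42), param_grid, cv=5, scoring="f1"
)
search.fit(X_train, y_train)

# Error analysis on held-out data with the best model found.
print("Best params:", search.best_params_)
print(classification_report(y_test, search.best_estimator_.predict(X_test)))
```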
Responsibilities:
- Must be able to write quality code and build secure, highly available systems.
- Assemble large, complex datasets that meet functional/non-functional business requirements.
- Identify, design, and implement internal process improvements with guidance: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Monitor performance and advise on any necessary infrastructure changes.
- Define data retention policies.
- Implement the ETL process and an optimal data pipeline architecture (a minimal ETL sketch follows this list).
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
- Create design documents that describe the functionality, capacity, architecture, and process.
- Develop, test, and implement data solutions based on finalized design documents.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Proactively identify potential production issues and recommend and implement solutions.
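To make the ETL responsibility above concrete, here is a minimal Python sketch of one extract-transform-load step using pandas and SQLite; the file path, table name, and transformations are hypothetical:

```python
# Minimal ETL sketch: extract from CSV, normalize, load to a SQLite table.
# File path, table name, and columns are hypothetical placeholders.
import sqlite3
import pandas as pd

def run_etl(source_csv: str, db_path: str) -> None:
    # Extract: read the raw source data.
    df = pd.read_csv(source_csv)

    # Transform: normalize column names and drop duplicate records
    # so the data is homogeneous and consistently formatted.
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    df = df.drop_duplicates()

    # Load: write the cleaned data into an analytics table.
    conn = sqlite3.connect(db_path)
    df.to_sql("cleaned_events", conn, if_exists="replace", index=False)
    conn.close()

run_etl("raw_events.csv", "analytics.db")
```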
Skillsets:
- Good understanding of the optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Proficient understanding of distributed computing principles.
- Experience working with batch-processing and real-time systems using various open-source technologies such as NoSQL stores, Spark, Pig, Hive, and Apache Airflow.
- Experience implementing complex projects dealing with considerable data sizes (petabyte scale).
- Optimization techniques (performance, scalability, monitoring, etc.).
- Experience with the integration of data from multiple data sources.
- Experience with NoSQL databases such as HBase, Cassandra, MongoDB, etc.
- Knowledge of various ETL techniques and frameworks, such as Flume.
- Experience with various messaging systems, such as Kafka or RabbitMQ.
- Good understanding of Lambda Architecture, along with its advantages and drawbacks.
- Creation of DAGs for data engineering (a minimal Airflow sketch follows this list).
- Expert-level Python/Scala programming, especially for data engineering/ETL purposes.
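Since the list above calls for creating DAGs for data engineering, here is a minimal Apache Airflow sketch of a two-task ETL dependency; the DAG id, schedule, and task bodies are placeholders:

```python
# Minimal Airflow DAG sketch: a daily extract -> transform dependency.
# Task bodies and schedule are illustrative placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw data from the source system")

def transform():
    print("clean and reshape the extracted data")

with DAG(
    dag_id="example_etl",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_extract >> t_transform  # transform runs only after extract succeeds
```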
The Data Engineer will support our software developers, database architects, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products.
Responsibilities for Data Engineer
• Create and maintain optimal data pipeline architecture.
• Assemble large, complex data sets that meet functional/non-functional business requirements.
• Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
• Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS big data technologies.
• Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
• Work with stakeholders including the Executive, Product, Data, and Design teams to assist with data-related technical issues and support their data infrastructure needs.
• Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
• Work with data and analytics experts to strive for greater functionality in our data systems.
Qualifications for Data Engineer
• Experience building and optimizing big data ETL pipelines, architectures, and data sets.
• Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL), as well as working familiarity with a variety of databases.
• Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
• Strong analytical skills related to working with unstructured datasets.
• Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
• A successful history of manipulating, processing, and extracting value from large, disconnected datasets.
At vThink Global Technologies
At Altimetrik
Big Data with Cloud:
Experience: 5-10 years
Location: Hyderabad/Chennai
Notice period: 15-20 days max
1. Expertise in building AWS data engineering pipelines with AWS Glue -> Athena -> QuickSight
2. Experience in developing Lambda functions with AWS Lambda
3. Expertise with Spark/PySpark: the candidate should be hands-on with PySpark code and able to write transformations in Spark (see the PySpark sketch after this list)
4. Should be able to code in Python and Scala
5. Snowflake experience will be a plus
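As a concrete example of the PySpark transformations item 3 refers to, here is a minimal sketch; the sample data, column names, and conversion rate are illustrative assumptions:

```python
# Minimal PySpark transformation sketch: filter, derive a column, aggregate.
# The sample data and column names are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("transform-sketch").getOrCreate()

df = spark.createDataFrame(
    [("chennai", 120.0), ("hyderabad", 80.0), ("chennai", 45.0)],
    ["city", "amount"],
)

result = (
    df.filter(F.col("amount") > 50)                    # drop small transactions
      .withColumn("amount_inr", F.col("amount") * 83)  # derived column (assumed rate)
      .groupBy("city")
      .agg(F.sum("amount_inr").alias("total_inr"))     # aggregate per city
)
result.show()
```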
4-8 years of overall experience.
- 1-2 years' experience with Azure Data Factory: scheduling jobs in flows and ADF pipelines, performance tuning, error logging, etc.
- 1+ years of experience with Power BI: designing and developing reports, dashboards, metrics, and visualizations in Power BI.
- (Required) Participate in video conferencing calls: daily stand-up meetings and all-day work with team members on cloud migration planning, development, and support.
- Proficiency in relational database concepts and design using star schema, Azure Data Warehouse, and data vault modeling.
- Requires 2-3 years of experience with SQL scripting (MERGE, joins, and stored procedures) and best practices (a MERGE sketch follows this list).
- Knowledge of deploying and running SSIS packages in Azure.
- Knowledge of Azure Databricks.
- Ability to write and execute complex SQL queries and stored procedures.
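To illustrate the MERGE scripting mentioned above, here is a minimal sketch that runs a T-SQL upsert from Python via pyodbc; the DSN, table names, and columns are hypothetical:

```python
# Minimal sketch: execute a T-SQL MERGE (upsert) from Python via pyodbc.
# Connection string, table, and column names are hypothetical placeholders.
import pyodbc

MERGE_SQL = """
MERGE INTO dim_customer AS target
USING staging_customer AS source
    ON target.customer_id = source.customer_id
WHEN MATCHED THEN
    UPDATE SET target.email = source.email
WHEN NOT MATCHED THEN
    INSERT (customer_id, email) VALUES (source.customer_id, source.email);
"""

conn = pyodbc.connect("DSN=azure_sql;UID=etl_user;PWD=...")  # placeholder DSN
cursor = conn.cursor()
cursor.execute(MERGE_SQL)
conn.commit()  # commit to persist the upsert
conn.close()
```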
BASIC QUALIFICATIONS
- 2+ years of experience in program or project management
- Project-handling experience using Six Sigma/Lean processes
- Experience interpreting data to make business recommendations
- Bachelor's degree or higher in Operations, Business, Project Management, or Engineering
- 5-10 years of experience in project management/customer satisfaction, with a proven success record
- Understanding of basic and systematic approaches to managing projects/programs
- A structured problem-solving approach to identify and fix problems
- Open-minded, creative, and proactive thinking
- A pioneering mindset to invent and make a difference
- Understanding of customer experience: listening to the voice of the customer and working backwards to improve business processes and operations
- Six Sigma certification
PREFERRED QUALIFICATIONS
- Automation skills with experience in advanced SQL, Python, and Tableau
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
- Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured datasets.
- Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
- A successful history of manipulating, processing and extracting value from large disconnected datasets.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- We are looking for a candidate with 5+ years of experience in a Data Engineer role, who has attained a graduate degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field. They should also have experience using the following software/tools:
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift
- Experience with stream-processing systems: Storm, Spark Streaming, etc. (a minimal streaming sketch follows this list).
- Experience with object-oriented and functional scripting languages: Python, Java, C++, Scala, etc.
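For the stream-processing qualification above, here is a minimal Spark Structured Streaming sketch that consumes a Kafka topic; the broker address and topic name are placeholders, and the spark-sql-kafka connector package is assumed to be on the classpath:

```python
# Minimal Spark Structured Streaming sketch: consume a Kafka topic.
# Broker address and topic name are illustrative placeholders.
# (Requires the spark-sql-kafka connector package on the classpath.)
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "clickstream")                # placeholder topic
    .load()
)

# Kafka delivers bytes; cast the value column to a readable string.
decoded = events.select(F.col("value").cast("string").alias("payload"))

# Write the running stream to the console for inspection.
query = decoded.writeStream.format("console").outputMode("append").start()
query.awaitTermination()
```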
Company Description
At Bungee Tech, we help retailers and brands meet customers everywhere, on every occasion they are in. We believe that accurate, high-quality data matched with compelling market insights empowers retailers and brands to keep their customers at the center of all the innovation and value they are delivering.
We provide a clear and complete omnichannel picture of their competitive landscape to retailers and brands. We collect billions of data points every day and multiple times in a day from publicly available sources. Using high-quality extraction, we uncover detailed information on products or services, which we automatically match, and then proactively track for price, promotion, and availability. Plus, anything we do not match helps to identify a new assortment opportunity.
Empowered with this unrivalled intelligence, we unlock compelling analytics and insights that, once blended with verified partner data from trusted sources such as Nielsen, paint a complete, consolidated picture of the competitive landscape.
We are looking for a Big Data Engineer who will work on collecting, storing, processing, and analyzing huge sets of data. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them.
You will also be responsible for integrating them with the architecture used in the company.
We're working on the future. If you are seeking an environment where you can drive innovation, if you want to apply state-of-the-art software technologies to solve real-world problems, and if you want the satisfaction of providing visible benefit to end users in an iterative, fast-paced environment, this is your opportunity.
Responsibilities
As an experienced member of the team, in this role, you will:
- Contribute to evolving the technical direction of analytical systems and play a critical role in their design and development.
- Research, design, code, troubleshoot, and support. What you create is also what you own.
- Develop the next generation of automation tools for monitoring and measuring data quality, with associated user interfaces.
- Be able to broaden your technical skills and work in an environment that thrives on creativity, efficient execution, and product innovation.
BASIC QUALIFICATIONS
- Bachelor's degree or higher in an analytical area such as Computer Science, Physics, Mathematics, Statistics, Engineering, or similar.
- 5+ years of relevant professional experience in data engineering and business intelligence.
- 5+ years with advanced SQL (analytical functions), ETL, and data warehousing (a window-function sketch follows this list).
- Strong knowledge of data warehousing concepts, including data warehouse technical architectures, infrastructure components, ETL/ELT and reporting/analytic tools and environments, data structures, data modeling, and performance tuning.
- Ability to effectively communicate with both business and technical teams.
- Excellent coding skills in Java, Python, C++, or an equivalent object-oriented programming language.
- Understanding of relational and non-relational databases and basic SQL.
- Proficiency with at least one of these scripting languages: Perl, Python, Ruby, or shell script.
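As an illustration of the analytical (window) functions listed above, here is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25+); the table and data are placeholders:

```python
# Minimal analytical-SQL sketch: rank rows per group with a window function.
# Table and data are illustrative placeholders; needs SQLite 3.25+.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, amount REAL);
INSERT INTO sales VALUES ('south', 100), ('south', 250), ('north', 175);
""")

rows = conn.execute("""
SELECT region, amount,
       ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
FROM sales
""").fetchall()

for region, amount, rnk in rows:
    print(region, amount, rnk)
conn.close()
```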
PREFERRED QUALIFICATIONS
- Experience building data pipelines from application databases.
- Experience with AWS services: S3, Redshift, Spectrum, EMR, Glue, Athena, ELK, etc. (a boto3 sketch follows this list).
- Experience working with data lakes.
- Experience providing technical leadership and mentoring other engineers on best practices in the data engineering space.
- Sharp problem-solving skills and the ability to resolve ambiguous requirements.
- Experience working with big data.
- Knowledge of and experience working with Hive and the Hadoop ecosystem.
- Knowledge of Spark.
- Experience working with data science teams.
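For the AWS services mentioned in this list, here is a minimal boto3 sketch that lists data-lake objects under an S3 prefix; the bucket and prefix names are hypothetical:

```python
# Minimal boto3 sketch: list data-lake objects under an S3 prefix.
# Bucket and prefix are hypothetical placeholders; credentials come from
# the standard AWS environment/config chain.
import boto3

s3 = boto3.client("s3")
response = s3.list_objects_v2(Bucket="example-data-lake", Prefix="raw/events/")

for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```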