Working closely with the Product group and other teams, the Data Engineer is responsible for the development, deployment and maintenance of our data infrastructure and applications. With a focus on quality, error-free data delivery, the Data Engineer works to ensure our data is appropriately available and fully supporting various constituencies across our organization. This is a multifaceted opportunity to work with a small, talented team on impactful projects that are essential to ACUE’s higher education success. As an early member of our tech group, you’ll have the unique opportunity to build critical systems and features while helping shape the direction of the team and the product.
About InEvolution
Founded in 2009, InEvolution stands as a beacon of excellence in providing back-office, operations, and customer support services globally. Our team, comprising highly skilled professionals, is committed to delivering top-notch quality services while ensuring cost efficiency. At InEvolution, we value innovation, quality, and our team's growth and development.
About the Role
- Build, process and transfer data and dashboards from the existing Domo platform to the Power BI platform.
- Audit existing data systems and deployments and identify errors or areas for improvement.
- Utilize Power BI to build interactive and visually appealing dashboards and reports.
- Build data documentation explaining the parameters, filters, models and relationships used in the dashboards.
- Review existing SQL data sources to improve them and to connect and integrate them seamlessly with Power BI.
- Create, test and deploy Power BI scripts, as well as execute efficient migration practices.
- Work closely with the current analytics team to define requirements and migration steps, and communicate openly and transparently with the team when reviewing the migrated reports and data sources to ensure successful outcomes.
- Ensure all dashboards and data sources are thoroughly reviewed by the team before publishing to the production environment.
- Convert business needs into technical specifications and establish a timeline for job completion.
Requirements & Skills:
- 2+ years of experience in using Power BI to run DAX queries and other advanced interactive functions.
- 2+ years of experience with Data Analysis and Data Visualization tools.
- 1+ years of experience working with Relational Databases and building SQL queries.
- Familiarity with Data Collection, Cleaning and Transformation processes.
- Attention to detail and the ability to work with complex datasets.
Responsibilities
Data associates play a critical role and work on various data or content-focused projects across the organisation. The role is work-from-home, with a monthly in-person meetup with other team members in the local region.
- Analyze large sets of unstructured data, extract insights and store them in data management systems.
- Research, gather and write informational news articles and stories.
- Categorize entities based on a set of rules and gained knowledge and experience, dashboard information, and validation with proprietary algorithms and data-driven heuristics.
- Analyze the market (including competitors) and ensure a rich level of data quality across all products and platforms.
- Complete ad hoc data retrieval and analysis using relational databases, Excel and other data management systems.
- Monitor existing metrics as well as develop and propose new metrics to make actionable intelligence available to business stakeholders.
- Support cross-functional teams on the day-to-day execution of projects and initiatives.
- Communicate insights to key stakeholders.
Requirements
- Preferred: Bachelors in Engineering or Science.
- GPA of 8+ or an overall score of 80%+.
- Location: Coimbatore / Remote
- Strong analytical and problem solving skills with focus on quality and detail orientation.
- Strong proficiency in reading, writing and communicating in English.
- Strong work ethic and personal initiative, reliable self-starter that is capable of working with a high degree of autonomy.
- Ability to work across global cross-office teams and in a team environment.
- Excellent organizational and task management skills.
- Strong verbal and written communication skills with the ability to articulate results and issues to internal and client teams.
- Process management skills, a focus on improvement, and a willingness to learn cutting-edge tools and technologies.
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS
- Ingest data from different data sources that expose data using different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
- Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform
- Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Key participant in regular Scrum ceremonies with the agile teams
- Proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring best industry practices
QUALIFICATIONS
- 5-7+ years’ experience as a data engineer in consumer finance or an equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science or related discipline
- Advanced knowledge of at least one of the following languages: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with:
  - Data mining/programming tools (e.g. SAS, SQL, R, Python)
  - Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
  - Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools.
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques.
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lake
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices.
- Scripting languages: Python and PySpark
Description
About us
Welcome to Decision Foundry!
We are both a high growth startup and one of the longest tenured Salesforce Marketing Cloud Implementation Partners in the ecosystem. Forged from a 19-year-old web analytics company, Decision Foundry is the leader in Salesforce intelligence solutions.
We win as an organization through our core tenets. They include:
- One Team. One Theme.
- We sign it. We deliver it.
- Be Accountable and Expect Accountability.
- Raise Your Hand or Be Willing to Extend it
Requirements
• Strong understanding of data management principles and practices (Preferred experience: AWS Redshift).
• Experience with Tableau server administration, including user management and permissions (preferred, not mandatory).
• Ability to monitor alerts and application logs for data processing issues and troubleshooting.
• Ability to handle and monitor support tickets queues and act accordingly based on SLAs and priority.
• Ability to work collaboratively with cross-functional teams, including Data Engineers and BI team.
• Strong analytical and problem-solving skills.
• Familiar with data warehousing concepts and ETL processes.
• Experience with SQL, dbt and database technologies such as Redshift, Postgres, MongoDB, etc.
• Familiar with data integration tools such as Fivetran or Funnel.io
• Familiar with programming languages such as Python.
• Familiar with cloud-based data technologies such as AWS.
• Experience with data ingestion and orchestration tools such as AWS Glue.
• Excellent communication and interpersonal skills.
• 2+ years of relevant experience.
The Cloudera Data Warehouse Hive team is looking for a passionate senior developer to join our growing engineering team. This group is targeting the biggest enterprises wanting to utilize Cloudera’s services in private and public cloud environments. Our product is built on open source technologies like Hive, Impala, Hadoop, Kudu, Spark and many more, providing unlimited learning opportunities.
A Day in the Life
Over the past 10+ years, Cloudera has experienced tremendous growth, making us the leading contributor to Big Data platforms and ecosystems and a leading provider of enterprise solutions based on Apache Hadoop. You will work with some of the best engineers in the industry who are tackling challenges that will continue to shape the Big Data revolution. We foster an engaging, supportive, and productive work environment where you can do your best work. The team culture values engineering excellence, technical depth, grassroots innovation, teamwork, and collaboration.
You will manage product development for our CDP components and develop engineering tools and scalable services to enable efficient development, testing, and release operations. You will be immersed in many exciting, cutting-edge technologies and projects, including collaboration with developers, testers, product, field engineers, and our external partners, both software and hardware vendors.
Opportunity:
Cloudera is a leader in the fast-growing big data platforms market. This is a rare chance to make a name for yourself in the industry and in the Open Source world. The candidate will be responsible for Apache Hive and CDW projects. We are looking for a candidate who would like to work on these projects upstream and downstream. If you are curious about the project and code quality, you can check the project and the code at the following link. You can start the development before you join. This is one of the beauties of the OSS world.
Apache Hive
Responsibilities:
• Build robust and scalable data infrastructure software.
• Design and create services and system architecture for your projects.
• Improve code quality through writing unit tests, automation, and code reviews.
• Write Java code and/or build several services in the Cloudera Data Warehouse.
• Work with a team of engineers who review each other's code and designs and hold each other to an extremely high bar for quality.
• Understand the basics of Kubernetes.
• Build out the production and test infrastructure.
• Develop automation frameworks to reproduce issues and prevent regressions.
• Work closely with other developers providing services to our system.
• Help to analyze and understand how customers use the product and improve it where necessary.
Qualifications:
• Deep familiarity with Java programming language.
• Hands-on experience with distributed systems.
• Knowledge of database concepts, RDBMS internals.
• Knowledge of the Hadoop stack, containers, or Kubernetes is a strong plus.
• Has experience working in a distributed team.
• Has 3+ years of experience in software development.
WHAT YOU WILL DO:
● Create and maintain optimal data pipeline architecture.
● Assemble large, complex data sets that meet functional / non-functional business requirements.
● Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
● Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Spark, Hadoop and AWS 'big data' technologies (EC2, EMR, S3, Athena).
● Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
● Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
● Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
● Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
● Work with data and analytics experts to strive for greater functionality in our data systems.
REQUIRED SKILLS & QUALIFICATIONS:
● 5+ years of experience in a Data Engineer role.
● Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
● Experience building and optimizing 'big data' data pipelines, architectures and data sets.
● Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
● Strong analytic skills related to working with unstructured datasets.
● Build processes supporting data transformation, data structures, metadata, dependency and workload management.
● A successful history of manipulating, processing and extracting value from large disconnected datasets.
● Working knowledge of message queuing, stream processing, and highly scalable 'big data' data stores.
● Strong project management and organizational skills.
● Experience supporting and working with cross-functional teams in a dynamic environment.
● Experience with big data tools: Hadoop, Spark, Pig, Vertica, etc.
● Experience with AWS cloud services: EC2, EMR, S3, Athena
● Experience with Linux
● Experience with object-oriented/object function scripting languages: Python, Java, Shell, Scala, etc.
PREFERRED SKILLS & QUALIFICATIONS:
● Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
Roles and Responsibilities
- Managing available resources such as hardware, data, and personnel so that deadlines are met.
- Analyzing the ML and Deep Learning algorithms that could be used to solve a given problem and ranking them by their success probabilities
- Exploring data to gain an understanding of it, then identifying differences in data distribution that could affect performance when deploying the model in the real world
- Defining a validation framework and establishing a process to ensure acceptable data quality criteria are met
- Supervising the data acquisition and partnership roadmaps to create a stronger product for our customers.
- Defining feature engineering process to ensure usage of meaningful features given the business constraints which may vary by market
- Devising self-learning strategies through analysis of errors from the models
- Understand business issues and context, devise a framework for solving unstructured problems and articulate clear and actionable solutions underpinned by analytics.
- Manage multiple projects simultaneously while demonstrating business leadership to collaborate & coordinate with different functions to deliver the solutions in a timely, efficient and effective manner.
- Manage project resources optimally to deliver projects on time; drive innovation using residual resources to create a strong solution pipeline; provide direction, coaching, training and feedback to project team members to enhance performance, support development and encourage value-aligned behaviour; provide inputs for periodic performance appraisals of project team members.
Preferred Technical & Professional expertise
- Undergraduate Degree in Computer Science / Engineering / Mathematics / Statistics / Economics or other quantitative fields
- At least 2 years of experience managing Data Science projects with a specialization in Machine Learning
- In-depth knowledge of cloud analytics tools.
- Able to drive Python code optimization; ability to review code and provide input to improve code quality
- Ability to evaluate hardware selection for running ML models for optimal performance
- Up to date with Python libraries and versions for machine learning; Extensive hands-on experience with Regressors; Experience working with data pipelines.
- Deep knowledge of math, probability, statistics and algorithms; Working knowledge of Supervised Learning, Adversarial Learning and Unsupervised learning
- Deep analytical thinking with excellent problem-solving abilities
- Strong verbal and written communication skills with a proven ability to work with all levels of management; effective interpersonal and influencing skills.
- Ability to manage a project team through effective allocation of tasks, anticipating risks and setting realistic timelines for managing the expectations of key stakeholders
- Strong organizational skills and an ability to balance and handle multiple concurrent tasks and/or issues simultaneously.
- Ensure that the project team understands and abides by the compliance framework for policies, data, systems etc. as per group, region and local standards
Job Description:
Roles & Responsibilities:
· You will be involved in every part of the project lifecycle, right from identifying the business problem and proposing a solution, to data collection, cleaning, and preprocessing, to training and optimizing ML/DL models and deploying them to production.
· You will often be required to design and execute proof-of-concept projects that can demonstrate business value and build confidence with CloudMoyo’s clients.
· You will be involved in designing and delivering data visualizations that utilize the ML models to generate insights and intuitively deliver business value to CXOs.
Desired Skill Set:
· Candidates should have strong Python coding skills and be comfortable working with various ML/DL frameworks and libraries.
· Hands-on skills and industry experience in one or more of the following areas are necessary:
1) Deep Learning (CNNs/RNNs, Reinforcement Learning, VAEs/GANs)
2) Machine Learning (Regression, Random Forests, SVMs, K-means, ensemble methods)
3) Natural Language Processing
4) Graph Databases (Neo4j, Apache Giraph)
5) Azure Bot Service
6) Azure ML Studio / Azure Cognitive Services
7) Log Analytics with NLP/ML/DL
· Previous experience with data visualization, C# or Azure Cloud platform and services will be a plus.
· Candidates should have excellent communication skills and be highly technical, with the ability to discuss ideas at any level from executive to developer.
· Creative problem-solving, unconventional approaches and a hacker mindset are highly desired.