Technologies & Languages
- SQL Server
- Data Cleaning
- Azure DevOps
- Intermediate Python/PySpark
- Intermediate SQL
- Beginner-level knowledge of, or willingness to learn, Spotfire
- Data Ingestion
- Familiarity with CI/CD or Agile
- Azure – VM, Data Lake, Databricks, Data Factory, Azure DevOps
- Python/Spark (PySpark)
Good to have:
The candidate should have a good understanding of:
- How to build pipelines – ETL and ingestion
- Data Warehousing
Must be able to write quality code and build secure, highly available systems.
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements, with guidance: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
Monitoring performance and advising any necessary infrastructure changes.
Defining data retention policies.
Implementing the ETL process and optimal data pipeline architecture.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
Create design documents that describe the functionality, capacity, architecture, and process.
Develop, test, and implement data solutions based on finalized design documents.
Work with data and analytics experts to strive for greater functionality in our data systems.
Proactively identify potential production issues, and recommend and implement solutions.
About Ganit Business Solutions
- Design, plan and control the implementation of business solutions requests/demands.
- Apply best practices in design and coding, and guide the rest of the team in following them.
- Gather requirements and specifications to understand client needs in detail and translate them into system requirements
- Drive complex technical projects from planning through execution
- Perform code review and manage technical debt
- Handling release deployments and production issues
- Coordinate stress tests, stability evaluations, and support for the concurrent processing of specific solutions
- Participate in project estimation, provide inputs for solution delivery, conduct technical risk planning, perform code reviews and unit test plan reviews
- Degree in Informatics Engineering, Computer Science, or similar areas
- Minimum of 5 years' work experience in similar roles
- Expert knowledge in developing cloud-based applications with Java, Spring Boot, Spring REST, Spring JPA, and Spring Cloud
- Strong understanding of Azure Data Services
- Strong working knowledge of SQL Server, SQL Azure Database, NoSQL, Data Modeling, Azure AD, ADFS, Identity & Access Management.
- Hands-on experience in ThingWorx platform (Application development, Mashups creation, Installation of ThingWorx and ThingWorx components)
- Strong knowledge of IoT Platform
- Development experience with microservices architecture best practices, Docker, and Kubernetes
- Experience designing/maintaining/tuning high-performance code to ensure optimal performance
- Strong knowledge of web security practices
- Experience working in Agile Development
- Knowledge of Google Cloud Platform and Kubernetes
- Good understanding of Git, source control procedures, and feature branching
- Fluent in English - written and spoken (mandatory)
Power BI Developer (Azure Developer)
Senior visualization engineer with an understanding of Azure Data Factory & Databricks, to develop and deliver solutions that enable delivery of information to audiences in support of key business processes.
Ensure code and design quality through execution of test plans and assist in development of standards & guidelines working closely with internal and external design, business and technical counterparts.
- Strong grasp of data visualization design concepts centered on the business user, and a knack for communicating insights visually.
- Ability to produce any of the charting methods available with drill-down options and action-based reporting. This includes use of the right graphs for the underlying data with company themes and objects.
- Publishing reports & dashboards on the reporting server and providing role-based access to users.
- Ability to create wireframes on any tool for communicating the reporting design.
- Creation of ad-hoc reports & dashboards to visually communicate data hub metrics (metadata information) for top management understanding.
- Should be able to handle huge volumes of data from databases such as SQL Server, Synapse, Delta Lake, or flat files and create high-performance dashboards.
- Should be good at Power BI development
- Expertise in 2 or more BI (Visualization) tools in building reports and dashboards.
- Understanding of Azure components like Azure Data Factory, Data Lake Store, SQL Database, Azure Databricks
- Strong knowledge of SQL queries
- Must have worked in full life-cycle development from functional design to deployment
- Intermediate understanding of how to format, process, and transform data
- Should have working knowledge of Git and SVN
- Good experience in establishing connections with heterogeneous sources such as Hadoop, Hive, Amazon, Azure, Salesforce, SAP, HANA, APIs, and various databases.
- Basic understanding of data modelling and ability to combine data from multiple sources to create integrated reports
- Bachelor's degree in Computer Science or Technology
- Proven success in contributing to a team-oriented environment
- 5+ years of industry experience in administering (including setting up, managing, and monitoring) data processing pipelines (both streaming and batch) using frameworks such as Kafka Streams and PySpark, and streaming databases such as Druid or equivalents such as Hive
- Strong industry expertise with containerization technologies including Kubernetes (EKS/AKS) and Kubeflow
- Experience with cloud platform services such as AWS, Azure, or GCP, especially with EKS and managed Kafka
- 5+ years of industry experience in Python
- Experience with popular modern web frameworks such as Spring Boot, Play Framework, or Django
- Experience with scripting languages. Python experience highly desirable. Experience in API development using Swagger
- Implementing automated testing platforms and unit tests
- Proficient understanding of code versioning tools, such as Git
- Familiarity with continuous integration, Jenkins
- Architect, design, and implement large-scale data processing pipelines using Kafka Streams, PySpark, Fluentd, and Druid
- Create custom Operators for Kubernetes, Kubeflow
- Develop data ingestion processes and ETLs
- Assist in DevOps operations
- Design and implement APIs
- Identify performance bottlenecks and bugs, and devise solutions to these problems
- Help maintain code quality, organization, and documentation
- Communicate with stakeholders regarding various aspects of the solution.
- Mentor team members on best practices
Should be able to use the transformation components to transform the data
Should possess knowledge of incremental load, full load, etc.
Should design, build, and deploy effective packages
Should be able to schedule these packages through task schedulers
Implement stored procedures and effectively query a database
Translate requirements from the business and analyst into technical code
Identify and test for bugs and bottlenecks in the ETL solution
Ensure the best possible performance and quality in the packages
Provide support and fix issues in the packages
Writes advanced SQL including some query tuning
Experience in the identification of data quality issues
Some database design experience is helpful
Experience designing and building complete ETL/SSIS processes moving and transforming data for ODS, Staging, and Data Warehousing
Designation: Specialist - Cloud Service Developer (ABL_SS_600)
- The person would be primarily responsible for developing solutions using AWS services, e.g., Fargate, Lambda, ECS, ALB, NLB, and S3.
- Apply advanced troubleshooting techniques to provide solutions to issues pertaining to service availability, performance, and resiliency
- Monitor and optimize performance using AWS dashboards and logs
- Partner with Engineering leaders and peers in delivering technology solutions that meet the business requirements
- Work with the cloud team using an agile approach and develop cost-optimized solutions
- Develop solutions using AWS services including Fargate, Lambda, ECS, ALB, NLB, S3, etc.
- Reporting Designation: Head - Big Data Engineering and Cloud Development (ABL_SS_414)
- Reporting Department: Application Development (2487)
- AWS certification would be preferred
- Good understanding of monitoring (CloudWatch, alarms, logs, custom metrics, Trust SNS configuration)
- Good experience with Fargate, Lambda, ECS, ALB, NLB, S3, Glue, Aurora and other AWS services.
- Knowledge of storage (S3, lifecycle management, event configuration) is preferred
- Good at data structures and programming (PySpark / Python / Golang / Scala)
Datametica is looking for talented SQL engineers who would get training & the opportunity to work on Cloud and Big Data Analytics.
- Strong in SQL development
- Hands-on experience with at least one scripting language, preferably shell scripting
- Development experience in Data warehouse projects
- Selected candidates will be provided training opportunities on one or more of the following: Google Cloud, AWS, DevOps Tools, Big Data technologies like Hadoop, Pig, Hive, Spark, Sqoop, Flume, and Kafka
- Would get a chance to be part of the enterprise-grade implementation of Cloud and Big Data systems
- Will play an active role in setting up the Modern data platform based on Cloud and Big Data
- Would be part of teams with rich experience in various aspects of distributed systems and computing
Responsibilities for Data Engineer
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
Qualifications for Data Engineer
- Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases.
- Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytic skills related to working with unstructured datasets.
- Build processes supporting data transformation, data structures, metadata, dependency and workload management.
- A successful history of manipulating, processing and extracting value from large disconnected datasets.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Strong project management and organizational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- We are looking for a candidate with 5+ years of experience in a Data Engineer role, who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools:
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Experience with AWS cloud services: EC2, EMR, RDS, Redshift
- Experience with stream-processing systems: Storm, Spark-Streaming, etc.
- Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.
- Partnering with internal business owners (product, marketing, edit, etc.) to understand needs and develop custom analysis to optimize for user engagement and retention
- Good understanding of the underlying business and workings of cross functional teams for successful execution
- Design and develop analyses based on business requirement needs and challenges.
- Leveraging statistical analysis on consumer research and data mining projects, including segmentation, clustering, factor analysis, multivariate regression, predictive modeling, etc.
- Providing statistical analysis on custom research projects and consulting on A/B testing and other statistical analysis as needed. Other reports and custom analysis as required.
- Identify and use appropriate investigative and analytical technologies to interpret and verify results.
- Apply and learn a wide variety of tools and languages to achieve results
- Use best practices to develop statistical and/or machine learning techniques to build models that address business needs.
- 2 - 4 years of relevant experience in Data science.
- Preferred education: Bachelor's degree in a technical field or equivalent experience.
- Experience in advanced analytics, model building, statistical modeling, optimization, and machine learning algorithms.
- Machine Learning Algorithms: Crystal-clear understanding, coding, implementation, error analysis, and model tuning knowledge of Linear Regression, Logistic Regression, SVM, shallow Neural Networks, clustering, Decision Trees, Random Forest, XGBoost, Recommender Systems, ARIMA, and Anomaly Detection. Feature selection, hyperparameter tuning, model selection and error analysis, boosting and ensemble methods.
- Strong with programming languages like Python and data processing using SQL or equivalent and ability to experiment with newer open source tools.
- Experience in normalizing data to ensure it is homogeneous and consistently formatted to enable sorting, query and analysis.
- Experience designing, developing, implementing and maintaining a database and programs to manage data analysis efforts.
- Experience with big data and cloud computing, viz. Spark, Hadoop (MapReduce, Pig, Hive).
- Experience in risk and credit score domains preferred.