5-7 years of experience in Data Engineering with solid experience in design, development and implementation of end-to-end data ingestion and data processing system in AWS platform.
2-3 years of experience in AWS Glue, Lambda, Appflow, EventBridge, Python, PySpark, Lake House, S3, Redshift, Postgres, API Gateway, CloudFormation, Kinesis, Athena, KMS, IAM.
Experience in modern data architecture, Lake House, Enterprise Data Lake, Data Warehouse, API interfaces, solution patterns, standards and optimizing data ingestion.
Experience in build of data pipelines from source systems like SAP Concur, Veeva Vault, Azure Cost, various social media platforms or similar source systems.
Expertise in analyzing source data and designing a robust and scalable data ingestion framework and pipelines adhering to client Enterprise Data Architecture guidelines.
Proficient in design and development of solutions for real-time (or near real time) stream data processing as well as batch processing on the AWS platform.
Work closely with business analysts, data architects, data engineers, and data analysts to ensure that the data ingestion solutions meet the needs of the business.
Troubleshoot and provide support for issues related to data quality and data ingestion solutions. This may involve debugging data pipeline processes, optimizing queries, or troubleshooting application performance issues.
Experience in working in Agile/Scrum methodologies, CI/CD tools and practices, coding standards, code reviews, source management (GITHUB), JIRA, JIRA Xray and Confluence.
Experience or exposure to design and development using Full Stack tools.
Strong analytical and problem-solving skills, excellent communication (written and oral), and interpersonal skills.
Bachelor's or master's degree in computer science or related field.
About one-to-one, one-to-many, and many-to-many
Similar jobs
Position Overview: We are seeking a talented Data Engineer with expertise in Power BI to join our team. The ideal candidate will be responsible for designing and implementing data pipelines, as well as developing insightful visualizations and reports using Power BI. Additionally, the candidate should have strong skills in Python, data analytics, PySpark, and Databricks. This role requires a blend of technical expertise, analytical thinking, and effective communication skills.
Key Responsibilities:
- Design, develop, and maintain data pipelines and architectures using PySpark and Databricks.
- Implement ETL processes to extract, transform, and load data from various sources into data warehouses or data lakes.
- Collaborate with data analysts and business stakeholders to understand data requirements and translate them into actionable insights.
- Develop interactive dashboards, reports, and visualizations using Power BI to communicate key metrics and trends.
- Optimize and tune data pipelines for performance, scalability, and reliability.
- Monitor and troubleshoot data infrastructure to ensure data quality, integrity, and availability.
- Implement security measures and best practices to protect sensitive data.
- Stay updated with emerging technologies and best practices in data engineering and data visualization.
- Document processes, workflows, and configurations to maintain a comprehensive knowledge base.
Requirements:
- Bachelor’s degree in Computer Science, Engineering, or related field. (Master’s degree preferred)
- Proven experience as a Data Engineer with expertise in Power BI, Python, PySpark, and Databricks.
- Strong proficiency in Power BI, including data modeling, DAX calculations, and creating interactive reports and dashboards.
- Solid understanding of data analytics concepts and techniques.
- Experience working with Big Data technologies such as Hadoop, Spark, or Kafka.
- Proficiency in programming languages such as Python and SQL.
- Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud.
- Excellent analytical and problem-solving skills with attention to detail.
- Strong communication and collaboration skills to work effectively with cross-functional teams.
- Ability to work independently and manage multiple tasks simultaneously in a fast-paced environment.
Preferred Qualifications:
- Advanced degree in Computer Science, Engineering, or related field.
- Certifications in Power BI or related technologies.
- Experience with data visualization tools other than Power BI (e.g., Tableau, QlikView).
- Knowledge of machine learning concepts and frameworks.
Responsibilities include:
- Convert the machine learning models into application program interfaces (APIs) so that other applications can use it
- Build AI models from scratch and help the different components of the organization (such as product managers and stakeholders) understand what results they gain from the model
- Build data ingestion and data transformation infrastructure
- Automate infrastructure that the data science team uses
- Perform statistical analysis and tune the results so that the organization can make better-informed decisions
- Set up and manage AI development and product infrastructure
- Be a good team player, as coordinating with others is a must
ADF Developer with top Conglomerates for Kochi location_ Air India
conducting F2F Interviews on 22nd April 2023
Experience - 2-12 years.
Location - Kochi only (work from the office only)
Notice period - 1 month only.
If you are interested, please share the following information at your earliest
Senior Big Data Engineer
Note: Notice Period : 45 days
Banyan Data Services (BDS) is a US-based data-focused Company that specializes in comprehensive data solutions and services, headquartered in San Jose, California, USA.
We are looking for a Senior Hadoop Bigdata Engineer who has expertise in solving complex data problems across a big data platform. You will be a part of our development team based out of Bangalore. This team focuses on the most innovative and emerging data infrastructure software and services to support highly scalable and available infrastructure.
It's a once-in-a-lifetime opportunity to join our rocket ship startup run by a world-class executive team. We are looking for candidates that aspire to be a part of the cutting-edge solutions and services we offer that address next-gen data evolution challenges.
Key Qualifications
· 5+ years of experience working with Java and Spring technologies
· At least 3 years of programming experience working with Spark on big data; including experience with data profiling and building transformations
· Knowledge of microservices architecture is plus
· Experience with any NoSQL databases such as HBase, MongoDB, or Cassandra
· Experience with Kafka or any streaming tools
· Knowledge of Scala would be preferable
· Experience with agile application development
· Exposure of any Cloud Technologies including containers and Kubernetes
· Demonstrated experience of performing DevOps for platforms
· Strong Skillsets in Data Structures & Algorithm in using efficient way of code complexity
· Exposure to Graph databases
· Passion for learning new technologies and the ability to do so quickly
· A Bachelor's degree in a computer-related field or equivalent professional experience is required
Key Responsibilities
· Scope and deliver solutions with the ability to design solutions independently based on high-level architecture
· Design and develop the big data-focused micro-Services
· Involve in big data infrastructure, distributed systems, data modeling, and query processing
· Build software with cutting-edge technologies on cloud
· Willing to learn new technologies and research-orientated projects
· Proven interpersonal skills while contributing to team effort by accomplishing related results as needed
- Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
- Experience in migrating on-premise data warehouses to data platforms on AZURE cloud.
- Designing and implementing data engineering, ingestion, and transformation functions
-
Azure Synapse or Azure SQL data warehouse
-
Spark on Azure is available in HD insights and data bricks
- Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
- Experience in migrating on-premise data warehouses to data platforms on AZURE cloud.
- Designing and implementing data engineering, ingestion, and transformation functions
- Experience with Azure Analysis Services
- Experience in Power BI
- Experience with third-party solutions like Attunity/Stream sets, Informatica
- Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
- Capacity Planning and Performance Tuning on Azure Stack and Spark.
- Creating, designing and developing data models
- Prepare plans for all ETL (Extract/Transformation/Load) procedures and architectures
- Validating results and creating business reports
- Monitoring and tuning data loads and queries
- Develop and prepare a schedule for a new data warehouse
- Analyze large databases and recommend appropriate optimization for the same
- Administer all requirements and design various functional specifications for data
- Provide support to the Software Development Life cycle
- Prepare various code designs and ensure efficient implementation of the same
- Evaluate all codes and ensure the quality of all project deliverables
- Monitor data warehouse work and provide subject matter expertise
- Hands-on BI practices, data structures, data modeling, SQL skills
- Minimum 1 year experience in Pyspark
GREETINGS FROM CODEMANTRA !!!
EXCELLENT OPPORTUNITY FOR DATA SCIENCE/AI AND ML ARCHITECT !!!
Skills and Qualifications
*Strong Hands-on experience in Python Programming
*** Working experience with Computer Vision models - Object Detection Model, Image Classification
* Good experience in feature extraction, feature selection techniques and transfer learning
* Working Experience in building deep learning NLP Models for text classification, image analytics-CNN,RNN,LSTM.
* Working Experience in any of the AWS/GCP cloud platforms, exposure in fetching data from various sources.
* Good experience in exploratory data analysis, data visualisation, and other data pre-processing techniques.
* Knowledge in any one of the DL frameworks like Tensorflow, Pytorch, Keras, Caffe Good knowledge in statistics, distribution of data and in supervised and unsupervised machine learning algorithms.
* Exposure to OpenCV Familiarity with GPUs + CUDA Experience with NVIDIA software for cluster management and provisioning such as nvsm, dcgm and DeepOps.
* We are looking for a candidate with 9+ years of relevant experience , who has attained a Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field. They should also have experience using the following software/tools: *Experience with big data tools: Hadoop, Spark, Kafka, etc.
*Experience with AWS cloud services: EC2, RDS, AWS-Sagemaker(Added advantage)
*Experience with object-oriented/object function scripting languages in any: Python, Java, C++, Scala, etc.
Responsibilities
*Selecting features, building and optimizing classifiers using machine learning techniques
*Data mining using state-of-the-art methods
*Enhancing data collection procedures to include information that is relevant for building analytic systems
*Processing, cleansing, and verifying the integrity of data used for analysis
*Creating automated anomaly detection systems and constant tracking of its performance
*Assemble large, complex data sets that meet functional / non-functional business requirements.
*Secure and manage when needed GPU cluster resources for events
*Write comprehensive internal feedback reports and find opportunities for improvements
*Manage GPU instances/machines to increase the performance and efficiency of the ML/DL model
Regards
Ranjith PR