11+ MapReduce Jobs in Hyderabad | MapReduce Job openings in Hyderabad
Apply to 11+ MapReduce Jobs in Hyderabad on CutShort.io. Explore the latest MapReduce Job opportunities across top companies like Google, Amazon & Adobe.
Ideal candidates should have technical experience in migrations and the ability to help customers get value from Datametica's tools and accelerators.
Job Description
Experience : 7+ years
Location : Pune / Hyderabad
Skills :
- Drive and participate in requirements gathering workshops, estimation discussions, design meetings and status review meetings
- Participate and contribute in Solution Design and Solution Architecture for implementing Big Data Projects on-premise and on cloud
- Technical Hands on experience in design, coding, development and managing Large Hadoop implementation
- Proficient in SQL, Hive, PIG, Spark SQL, Shell Scripting, Kafka, Flume, Scoop with large Big Data and Data Warehousing projects with either Java, Python or Scala based Hadoop programming background
- Proficient with various development methodologies like waterfall, agile/scrum and iterative
- Good Interpersonal skills and excellent communication skills for US and UK based clients
About Us!
A global Leader in the Data Warehouse Migration and Modernization to the Cloud, we empower businesses by migrating their Data/Workload/ETL/Analytics to the Cloud by leveraging Automation.
We have expertise in transforming legacy Teradata, Oracle, Hadoop, Netezza, Vertica, Greenplum along with ETLs like Informatica, Datastage, AbInitio & others, to cloud-based data warehousing with other capabilities in data engineering, advanced analytics solutions, data management, data lake and cloud optimization.
Datametica is a key partner of the major cloud service providers - Google, Microsoft, Amazon, Snowflake.
We have our own products!
Eagle – Data warehouse Assessment & Migration Planning Product
Raven – Automated Workload Conversion Product
Pelican - Automated Data Validation Product, which helps automate and accelerate data migration to the cloud.
Why join us!
Datametica is a place to innovate, bring new ideas to live and learn new things. We believe in building a culture of innovation, growth and belonging. Our people and their dedication over these years are the key factors in achieving our success.
Benefits we Provide!
Working with Highly Technical and Passionate, mission-driven people
Subsidized Meals & Snacks
Flexible Schedule
Approachable leadership
Access to various learning tools and programs
Pet Friendly
Certification Reimbursement Policy
Check out more about us on our website below!
www.datametica.com
Skills
- Data Engineer
Required skill set: AWS GLUE, AWS LAMBDA, AWS SNS/SQS, AWS ATHENA, SPARK, SNOWFLAKE, PYTHON
Mandatory Requirements
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lake
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices.
- Scripting language - Python & pyspark
CORE RESPONSIBILITIES
- Create and manage cloud resources in AWS
- Data ingestion from different data sources which exposes data using different technologies, such as: RDBMS, REST HTTP API, flat files, Streams, and Time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
- Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform
- Develop automated data quality check to make sure right data enters the platform and verifying the results of the calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Key participant in regular Scrum ceremonies with the agile teams
- Proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring best industry practices
QUALIFICATIONS
- 5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science or related discipline
- Advanced knowledge one of language: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with
- Data mining/programming tools (e.g. SAS, SQL, R, Python)
- Database technologies (e.g. PostgreSQL, Redshift, Snowflake. and Greenplum)
- Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools.
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques.
Familiarity and experience in the following is a plus:
- AWS certification
- Spark Streaming
- Kafka Streaming / Kafka Connect
- ELK Stack
- Cassandra / MongoDB
- CI/CD: Jenkins, GitLab, Jira, Confluence other related tools
Roles and Responsibilities
Should design and operate data pipe lines.
Build and manage analytics platform using Elastic search, Redshift, Mongo db.
Strong programming fundamentals in Datastructures and algorithms.
at Altimetrik
Bigdata with cloud:
Experience : 5-10 years
Location : Hyderabad/Chennai
Notice period : 15-20 days Max
1. Expertise in building AWS Data Engineering pipelines with AWS Glue -> Athena -> Quick sight
2. Experience in developing lambda functions with AWS Lambda
3. Expertise with Spark/PySpark – Candidate should be hands on with PySpark code and should be able to do transformations with Spark
4. Should be able to code in Python and Scala.
5. Snowflake experience will be a plus
• Experience with Advanced SQL
• Experience with Azure data factory, data bricks,
• Experience with Azure IOT, Cosmos DB, BLOB Storage
• API management, FHIR API development,
• Proficient with Git and CI/CD best practices
• Experience working with Snowflake is a plus
- Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
- Experience in migrating on-premise data warehouses to data platforms on AZURE cloud.
- Designing and implementing data engineering, ingestion, and transformation functions
- Experience with Azure Analysis Services
- Experience in Power BI
- Experience with third-party solutions like Attunity/Stream sets, Informatica
- Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
- Capacity Planning and Performance Tuning on Azure Stack and Spark.
Summary
Our Kafka developer has a combination of technical skills, communication skills and business knowledge. The developer should be able to work on multiple medium to large projects. The successful candidate will have excellent technical skills of Apache/Confluent Kafka, Enterprise Data WareHouse preferable GCP BigQuery or any equivalent Cloud EDW and also will be able to take oral and written business requirements and develop efficient code to meet set deliverables.
Must Have Skills
- Participate in the development, enhancement and maintenance of data applications both as an individual contributor and as a lead.
- Leading in the identification, isolation, resolution and communication of problems within the production environment.
- Leading developer and applying technical skills Apache/Confluent Kafka (Preferred) AWS Kinesis (Optional), Cloud Enterprise Data Warehouse Google BigQuery (Preferred) or AWS RedShift or SnowFlakes (Optional)
- Design recommending best approach suited for data movement from different sources to Cloud EDW using Apache/Confluent Kafka
- Performs independent functional and technical analysis for major projects supporting several corporate initiatives.
- Communicate and Work with IT partners and user community with various levels from Sr Management to detailed developer to business SME for project definition .
- Works on multiple platforms and multiple projects concurrently.
- Performs code and unit testing for complex scope modules, and projects
- Provide expertise and hands on experience working on Kafka connect using schema registry in a very high volume environment (~900 Million messages)
- Provide expertise in Kafka brokers, zookeepers, KSQL, KStream and Kafka Control center.
- Provide expertise and hands on experience working on AvroConverters, JsonConverters, and StringConverters.
- Provide expertise and hands on experience working on Kafka connectors such as MQ connectors, Elastic Search connectors, JDBC connectors, File stream connector, JMS source connectors, Tasks, Workers, converters, Transforms.
- Provide expertise and hands on experience on custom connectors using the Kafka core concepts and API.
- Working knowledge on Kafka Rest proxy.
- Ensure optimum performance, high availability and stability of solutions.
- Create topics, setup redundancy cluster, deploy monitoring tools, alerts and has good knowledge of best practices.
- Create stubs for producers, consumers and consumer groups for helping onboard applications from different languages/platforms. Leverage Hadoop ecosystem knowledge to design, and develop capabilities to deliver our solutions using Spark, Scala, Python, Hive, Kafka and other things in the Hadoop ecosystem.
- Use automation tools like provisioning using Jenkins, Udeploy or relevant technologies
- Ability to perform data related benchmarking, performance analysis and tuning.
- Strong skills in In-memory applications, Database Design, Data Integration.
at TransPacks Technologies IIT Kanpur
Responsibilities
- Build and mentor the computer vision team at TransPacks
- Drive to productionize algorithms (industrial level) developed through hard-core research
- Own the design, development, testing, deployment, and craftsmanship of the team’s infrastructure and systems capable of handling massive amounts of requests with high reliability and scalability
- Leverage the deep and broad technical expertise to mentor engineers and provide leadership on resolving complex technology issues
- Entrepreneurial and out-of-box thinking essential for a technology startup
- Guide the team for unit-test code for robustness, including edge cases, usability, and general reliability
Eligibility
- Tech in Computer Science and Engineering/Electronics/Electrical Engineering, with demonstrated interest in Image Processing/Computer vision (courses, projects etc) and 6-8 years of experience
- Tech in Computer Science and Engineering/Electronics/Electrical Engineering, with demonstrated interest in Image Processing/Computer vision (Thesis work) and 4-7 years of experience
- D in Computer Science and Engineering/Electronics/Electrical Engineering, with demonstrated interest in Image Processing/Computer vision (Ph. D. Dissertation) and inclination to working in Industry to provide innovative solutions to practical problems
Requirements
- In-depth understanding of image processing algorithms, pattern recognition methods, and rule-based classifiers
- Experience in feature extraction, object recognition and tracking, image registration, noise reduction, image calibration, and correction
- Ability to understand, optimize and debug imaging algorithms
- Understating and experience in openCV library
- Fundamental understanding of mathematical techniques involved in ML and DL schemas (Instance-based methods, Boosting methods, PGM, Neural Networks etc.)
- Thorough understanding of state-of-the-art DL concepts (Sequence modeling, Attention, Convolution etc.) along with knack to imagine new schemas that work for the given data.
- Understanding of engineering principles and a clear understanding of data structures and algorithms
- Experience in writing production level codes using either C++ or Java
- Experience with technologies/libraries such as python pandas, numpy, scipy
- Experience with tensorflow and scikit.
- 5+ years of experience in a Data Engineer role
- Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases such as Cassandra.
- Experience with AWS cloud services: EC2, EMR, Athena
- Experience with object-oriented/object function scripting languages: Python, Java, C++, Scala, etc.
- Advanced SQL knowledge and experience working with relational databases, query authoring (SQL) as well as familiarity with unstructured datasets.
- Deep problem-solving skills to perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.