11+ Apache Pig Jobs in Bangalore (Bengaluru)
Apply to 11+ Apache Pig Jobs in Bangalore (Bengaluru) on CutShort.io. Explore the latest Apache Pig Job opportunities across top companies like Google, Amazon & Adobe.
Must Have Skills:
- Solid knowledge of DWH, ETL and Big Data concepts
- Excellent SQL skills (with knowledge of SQL analytic functions)
- Working experience with any ETL tool, e.g. SSIS / Informatica
- Working experience with any Azure or AWS Big Data tools
- Experience implementing data jobs (batch / real-time streaming)
- Excellent written and verbal communication skills in English; self-motivated, with a strong sense of ownership and ready to learn new tools and technologies
Preferred Skills:
- Experience with PySpark / Spark SQL
- AWS Data Tools (AWS Glue, AWS Athena)
- Azure Data Tools (Azure Databricks, Azure Data Factory)
Other Skills:
- Knowledge of Azure Blob, Azure File Storage, AWS S3, Elasticsearch / Redis Search
- Knowledge of the domain/function (across pricing, promotions and assortment)
- Implementation experience with a schema and data validator framework (Python / Java / SQL)
- Knowledge of DQS and MDM
Key Responsibilities:
- Independently work on ETL / DWH / Big data Projects
- Gather and process raw data at scale.
- Design and develop data applications using selected tools and frameworks as required and requested.
- Read, extract, transform, stage and load data to selected tools and frameworks as required and requested.
- Perform tasks such as writing scripts, web scraping, calling APIs, writing SQL queries, etc.
- Work closely with the engineering team to integrate your work into our production systems.
- Process unstructured data into a form suitable for analysis.
- Analyse processed data.
- Support business decisions with ad hoc analysis as needed.
- Monitor data performance and modify infrastructure as needed.
Responsibility: Smart Resource, having excellent communication skills
Position: ETL Developer
Location: Mumbai
Exp.Level: 4+ Yrs
Required Skills:
* Strong scripting knowledge, e.g. Python and Shell
* Strong relational database skills especially with DB2/Sybase
* Create high quality and optimized stored procedures and queries
* Strong in scripting languages such as Python and Unix / K-Shell
* Strong knowledge base of relational database performance and tuning such as: proper use of indices, database statistics/reorgs, de-normalization concepts.
* Familiarity with the lifecycle of a trade and the flows of data in an investment banking operation is a plus.
* Experienced in Agile development process
* Java Knowledge is a big plus but not essential
* Experience in delivery of metrics / reporting in an enterprise environment (e.g. demonstrated experience in BI tools such as Business Objects, Tableau, report design & delivery) is a plus
* Experience on ETL processes and tools such as Informatica is a plus. Real time message processing experience is a big plus.
* Good team player; Integrity & ownership
Key Responsibilities :
- Development of proprietary processes and procedures designed to process various data streams around critical databases in the org
- Manage technical resources around data technologies, including relational databases, NoSQL DBs, business intelligence databases, scripting languages, big data tools and technologies, and visualization tools.
- Creation of a project plan including timelines and critical milestones to success in support of the project
- Identification of the vital skill sets/staff required to complete the project
- Identification of crucial sources of the data needed to achieve the objective.
Skill Requirement :
- Experience with data pipeline processes and tools
- Well versed in the Data domains (Data Warehousing, Data Governance, MDM, Data Quality, Data Catalog, Analytics, BI, Operational Data Store, Metadata, Unstructured Data, ETL, ESB)
- Experience with an existing ETL tool, e.g. Informatica or Ab Initio
- Deep understanding of big data systems like Hadoop, Spark, YARN, Hive, Ranger, Ambari
- Deep knowledge of the Qlik ecosystem, including QlikView, Qlik Sense, and NPrinting
- Python, or a similar programming language
- Exposure to data science and machine learning
- Comfort working in a fast-paced environment
Soft attributes :
- Independence: Must have the ability to work on his/her own without constant direction or supervision. He/she must be self-motivated and possess a strong work ethic to strive to put forth extra effort continually
- Creativity: Must be able to generate imaginative, innovative solutions that meet the needs of the organization. Must be a strategic thinker/solution seller, able to think of integrated solutions (with field force apps, customer apps, CCT solutions etc.) and to approach each unique situation/challenge in a different way using the same tools.
- Resilience: Must remain effective in high-pressure situations, using both positive and negative outcomes as an incentive to move forward toward fulfilling commitments to achieving personal and team goals.
Experience – 3 – 12 yrs
Budget - Open
Location - PAN India (Noida/Bengaluru/Hyderabad/Chennai)
Presto Developer (4)
Understanding of distributed SQL query engine running on Hadoop
Design and develop core components for Presto
Contribute to the ongoing Presto development by implementing new features, bug fixes, and other improvements
Develop new and extend existing Presto connectors to various data sources
Lead complex and technically challenging projects from concept to completion
Write tests and contribute to ongoing automation infrastructure development
Run and analyze software performance metrics
Collaborate with teams globally across multiple time zones and operate in an Agile development environment
Hands-on experience and interest with Hadoop
data domain.
● Should have experience in architecting data ecosystem for streaming data and analytical platforms.
● Expert level experience in building fault-tolerant & scalable big-data platforms and big-data solutions primarily based on the Hadoop ecosystem.
● Expert level experience with Java, Python or Scala programming.
● Expert level experience designing high throughput data services.
● Familiarity with machine learning and AI.
● Experience with Big-Data Technologies (Hive, HBase, Spark, Kafka, Storm, MapReduce, HDFS, ZooKeeper, Scylla, Cassandra, YARN), understands the concepts and technology ecosystem around both real-time and batch processing in Hadoop.
● Strong spoken and written communication skills.
● B.E/B.Tech/MS in Computer Science (or equivalent).
● Effective listening skills and strong collaboration.
AI-enabled SaaS organisation
• 2+ years of experience in data engineering & strong understanding of data engineering principles using big data technologies
• Excellent programming skills in Python are mandatory
• Expertise in relational databases (MSSQL/MySQL/Postgres) and expertise in SQL. Exposure to NoSQL databases such as Cassandra or MongoDB will be a plus.
• Exposure to deploying ETL pipelines using tools such as Airflow, Docker containers & Lambda functions
• Experience in AWS cloud services such as AWS CLI, Glue, Kinesis etc.
• Experience using Tableau for data visualization is a plus
• Ability to demonstrate a portfolio of projects (GitHub, papers, etc.) is a plus
• Motivated, can-do attitude and desire to make a change is a must
• Excellent communication skills
Role & Responsibilities
This role will be leading analytics at WarehouseNow. You will be expected to drive data driven decision making with senior management and own product metrics
- Develop an in-depth understanding of user journeys on WarehouseNow platforms and generate data-driven insights & recommendations to support meticulous decision making in the product business.
- Take end-to-end ownership of key metrics; work with the respective product owners to understand the areas we need to measure and ensure the needle is moving in the right direction.
- Develop strong hypotheses, execute A/B experiments and identify areas of opportunity with a high confidence level.
- Work cross-functionally to define problem statements, collect data, build analytical models and make recommendations
- Identify and implement streamlined processes for data reporting, dashboarding and communication
- Collaborate with Product on data tracking and the implementation of tools like Google Analytics, Firebase, etc.
Key Competencies
- 3+ years of experience in core business analytics/product analytics.
- Excellent SQL and Excel abilities, with working knowledge of scripting (Python, R, shell etc.)
- Background in Engineering / Statistics / Operations Research
- Bonus: Prior experience in a B2C Company
Roles and responsibilities:
- Responsible for development and maintenance of applications with technologies involving Enterprise Java and Distributed technologies.
- Experience in Hadoop, Kafka, Spark, Elasticsearch, SQL, Kibana and Python, plus experience with machine learning, analytics etc.
- Collaborate with developers, product managers, business analysts and business users in conceptualizing, estimating and developing new software applications and enhancements.
- Collaborate with QA team to define test cases, metrics, and resolve questions about test results.
- Assist in the design and implementation process for new products, research and create POC for possible solutions.
- Develop components based on business and/or application requirements
- Create unit tests in accordance with team policies & procedures
- Advise, and mentor team members in specialized technical areas as well as fulfill administrative duties as defined by support process
- Work with cross-functional teams during crisis to address and resolve complex incidents and problems in addition to assessment, analysis, and resolution of cross-functional issues.
Data Platform engineering at Uber is looking for a strong Technical Lead (Level 5a Engineer) who has built high quality platforms and services that can operate at scale. A 5a Engineer at Uber exhibits the following qualities:
- Demonstrate tech expertise: Demonstrate technical skills to go very deep or broad in solving classes of problems or creating broadly leverageable solutions.
- Execute large scale projects: Define, plan and execute complex and impactful projects. You communicate the vision to peers and stakeholders.
- Collaborate across teams: Act as a domain resource to engineers outside your team and help them leverage the right solutions. Facilitate technical discussions and drive to a consensus.
- Coach engineers: Coach and mentor less experienced engineers and deeply invest in their learning and success. You give and solicit feedback, both positive and negative, to others you work with to help improve the entire team.
- Tech leadership: Lead the effort to define the best practices in your immediate team, and help the broader organization establish better technical or business processes.
What You’ll Do
- Build a scalable, reliable, operable and performant data analytics platform for Uber’s engineers, data scientists, products and operations teams.
- Work alongside the pioneers of big data systems such as Hive, Yarn, Spark, Presto, Kafka, Flink to build out a highly reliable, performant, easy to use software system for Uber’s planet scale of data.
- Become proficient in the multi-tenancy, resource isolation, abuse prevention and self-serve debuggability aspects of a high-performance, large-scale service while building these capabilities for Uber's engineers and operations folks.
What You’ll Need
- 7+ years experience in building large scale products, data platforms, distributed systems in a high caliber environment.
- Architecture: Identify and solve major architectural problems by going deep in your field or broad across different teams. Extend, improve, or, when needed, build solutions to address architectural gaps or technical debt.
- Software Engineering/Programming: Create frameworks and abstractions that are reliable and reusable. You have advanced knowledge of at least one programming language, and are happy to learn more. Our core languages are Java, Python, Go, and Scala.
- Data Engineering: Expertise in one of the big data analytics technologies we currently use such as Apache Hadoop (HDFS and YARN), Apache Hive, Impala, Drill, Spark, Tez, Presto, Calcite, Parquet, Arrow etc. Under-the-hood experience with similar systems such as Vertica, Apache Impala, Drill, Google Borg, Google BigQuery, Amazon EMR, Amazon Redshift, Docker, Kubernetes, Mesos etc.
- Execution & Results: You tackle large technical projects/problems that are not clearly defined. You anticipate roadblocks and have strategies to de-risk timelines. You orchestrate work that spans multiple teams and keep your stakeholders informed.
- A team player: You believe that you can achieve more on a team, and that the whole is greater than the sum of its parts. You rely on others’ candid feedback for continuous improvement.
- Business acumen: You understand requirements beyond the written word. Whether you’re working on an API used by other developers, an internal tool consumed by our operations teams, or a feature used by millions of customers, your attention to detail leads to a delightful user experience.
The candidate,
1. Must have 3+ years of very good hands-on technical experience with Java or Python
2. Working experience and good understanding of AWS Cloud; Advanced experience with IAM policy and role management
3. Infrastructure Operations: 5+ years supporting systems infrastructure operations, upgrades, deployments using Terraform, and monitoring
4. Hadoop: Experience with Hadoop (Hive, Spark, Sqoop) and / or AWS EMR
5. Knowledge of PostgreSQL/MySQL/DynamoDB backend operations
6. DevOps: Experience with DevOps automation - Orchestration/Configuration Management and CI/CD tools (Jenkins)
7. Version Control: Working experience with one or more version control platforms like GitHub or GitLab
8. Knowledge of AWS QuickSight reporting
9. Monitoring: Hands-on experience with monitoring tools such as AWS CloudWatch, AWS CloudTrail, Datadog and Elasticsearch
10. Networking: Working knowledge of TCP/IP networking, SMTP, HTTP, load-balancers (ELB) and high availability architecture
11. Security: Experience implementing role-based security, including AD integration, security policies, and auditing in a Linux/Hadoop/AWS environment. Familiar with penetration testing and scan tools for remediation of security vulnerabilities.
12. Demonstrated successful experience learning new technologies quickly
WHAT WILL BE THE ROLES AND RESPONSIBILITIES?
1. Create procedures/run books for operational and security aspects of AWS platform
2. Improve AWS infrastructure by developing and enhancing automation methods
3. Provide advanced business and engineering support services to end users
4. Lead other admins and platform engineers through design and implementation decisions to achieve balance between strategic design and tactical needs
5. Research and deploy new tools and frameworks to build a sustainable big data platform
6. Assist with creating programs for training and onboarding for new end users
7. Lead Agile/Kanban workflows and team process work
8. Troubleshoot issues to resolve problems
9. Provide status updates to Operations product owner and stakeholders
10. Track all details in the issue tracking system (JIRA)
11. Provide issue review and triage problems for new service/support requests
12. Use DevOps automation tools, including Jenkins build jobs
13. Fulfil any ad-hoc data or report request queries from different functional groups