11+ Apache Drill Jobs in Bangalore (Bengaluru) | Apache Drill Job openings in Bangalore (Bengaluru)
Apply to 11+ Apache Drill Jobs in Bangalore (Bengaluru) on CutShort.io. Explore the latest Apache Drill Job opportunities across top companies like Google, Amazon & Adobe.
Data Platform engineering at Uber is looking for a strong Technical Lead (Level 5a Engineer) who has built high-quality platforms and services that can operate at scale. A 5a Engineer at Uber exhibits the following qualities:
- Demonstrate tech expertise › Demonstrate technical skills to go very deep or broad in solving classes of problems or creating broadly leverageable solutions.
- Execute large scale projects › Define, plan and execute complex and impactful projects. You communicate the vision to peers and stakeholders.
- Collaborate across teams › Serve as a domain resource to engineers outside your team and help them leverage the right solutions. Facilitate technical discussions and drive to a consensus.
- Coach engineers › Coach and mentor less experienced engineers and deeply invest in their learning and success. You give and solicit feedback, both positive and negative, to others you work with to help improve the entire team.
- Tech leadership › Lead the effort to define the best practices in your immediate team, and help the broader organization establish better technical or business processes.
What You’ll Do
- Build a scalable, reliable, operable and performant data analytics platform for Uber’s engineers, data scientists, products and operations teams.
- Work alongside the pioneers of big data systems such as Hive, YARN, Spark, Presto, Kafka, and Flink to build out a highly reliable, performant, easy-to-use software system for Uber’s planet-scale data.
- Become proficient in the multi-tenancy, resource isolation, abuse prevention, and self-serve debuggability aspects of a high-performance, large-scale service while building these capabilities for Uber's engineers and operations staff.
What You’ll Need
- 7+ years experience in building large scale products, data platforms, distributed systems in a high caliber environment.
- Architecture: Identify and solve major architectural problems by going deep in your field or broad across different teams. Extend, improve, or, when needed, build solutions to address architectural gaps or technical debt.
- Software Engineering/Programming: Create frameworks and abstractions that are reliable and reusable. You have advanced knowledge of at least one programming language and are happy to learn more. Our core languages are Java, Python, Go, and Scala.
- Data Engineering: Expertise in one of the big data analytics technologies we currently use such as Apache Hadoop (HDFS and YARN), Apache Hive, Impala, Drill, Spark, Tez, Presto, Calcite, Parquet, Arrow etc. Under the hood experience with similar systems such as Vertica, Apache Impala, Drill, Google Borg, Google BigQuery, Amazon EMR, Amazon RedShift, Docker, Kubernetes, Mesos etc.
- Execution & Results: You tackle large technical projects/problems that are not clearly defined. You anticipate roadblocks and have strategies to de-risk timelines. You orchestrate work that spans multiple teams and keep your stakeholders informed.
- A team player: You believe that you can achieve more on a team and that the whole is greater than the sum of its parts. You rely on others’ candid feedback for continuous improvement.
- Business acumen: You understand requirements beyond the written word. Whether you’re working on an API used by other developers, an internal tool consumed by our operations teams, or a feature used by millions of customers, your attention to detail leads to a delightful user experience.
Responsibilities
- Work with large and complex blockchain data sets and derive investment relevant metrics in close partnership with financial analysts and blockchain engineers.
- Apply knowledge of statistics, programming, data modeling, simulation, and advanced mathematics to recognize patterns, identify opportunities, pose business questions, and make valuable discoveries leading to the development of fundamental metrics needed to evaluate various crypto assets.
- Build a strong understanding of existing metrics used to value various decentralized applications and protocols.
- Build customer facing metrics and dashboards.
- Work closely with analysts, engineers, Product Managers and provide feedback as we develop our data analytics and research platform.
Qualifications
- Bachelor's degree in Mathematics, Statistics, or a relevant technical field, or equivalent practical experience; or a degree in an analytical field (e.g. Computer Science, Engineering, Mathematics, Statistics, Operations Research, Management Science)
- 3+ years experience with data analysis and metrics development
- 3+ years experience analyzing and interpreting data, drawing conclusions, defining recommended actions, and reporting results across stakeholders
- 2+ years experience writing SQL queries
- 2+ years experience scripting in Python
- Demonstrated curiosity in and excitement for Web3/blockchain technologies
Job Description:
As an Azure Data Engineer, your role will involve designing, developing, and maintaining data solutions on the Azure platform. You will be responsible for building and optimizing data pipelines, ensuring data quality and reliability, and implementing data processing and transformation logic. Your expertise in Azure Databricks, Python, SQL, Azure Data Factory (ADF), PySpark, and Scala will be essential for performing the following key responsibilities:
Designing and developing data pipelines: You will design and implement scalable and efficient data pipelines using Azure Databricks, PySpark, and Scala. This includes data ingestion, data transformation, and data loading processes.
Data modeling and database design: You will design and implement data models to support efficient data storage, retrieval, and analysis. This may involve working with relational databases, data lakes, or other storage solutions on the Azure platform.
Data integration and orchestration: You will leverage Azure Data Factory (ADF) to orchestrate data integration workflows and manage data movement across various data sources and targets. This includes scheduling and monitoring data pipelines.
Data quality and governance: You will implement data quality checks, validation rules, and data governance processes to ensure data accuracy, consistency, and compliance with relevant regulations and standards.
Performance optimization: You will optimize data pipelines and queries to improve overall system performance and reduce processing time. This may involve tuning SQL queries, optimizing data transformation logic, and leveraging caching techniques.
Monitoring and troubleshooting: You will monitor data pipelines, identify performance bottlenecks, and troubleshoot issues related to data ingestion, processing, and transformation. You will work closely with cross-functional teams to resolve data-related problems.
Documentation and collaboration: You will document data pipelines, data flows, and data transformation processes. You will collaborate with data scientists, analysts, and other stakeholders to understand their data requirements and provide data engineering support.
Skills and Qualifications:
Strong experience with Azure Databricks, Python, SQL, ADF, PySpark, and Scala.
Proficiency in designing and developing data pipelines and ETL processes.
Solid understanding of data modeling concepts and database design principles.
Familiarity with data integration and orchestration using Azure Data Factory.
Knowledge of data quality management and data governance practices.
Experience with performance tuning and optimization of data pipelines.
Strong problem-solving and troubleshooting skills related to data engineering.
Excellent collaboration and communication skills to work effectively in cross-functional teams.
Understanding of cloud computing principles and experience with Azure services.
Ganit has flipped the data science value chain: we do not start with a technique; for us, consumption comes first. With this philosophy, we have successfully scaled from a small start-up to a 200-person company with clients in the US, Singapore, Africa, UAE, and India.
We are looking for experienced data enthusiasts who can make the data talk to them.
You will:
- Understand business problems and translate business requirements into technical requirements.
- Conduct complex data analysis to ensure data quality & reliability i.e., make the data talk by extracting, preparing, and transforming it.
- Identify, develop and implement statistical techniques and algorithms to address business challenges and add value to the organization.
- Gather requirements and communicate findings in the form of a meaningful story with the stakeholders
- Build & implement data models using predictive modelling techniques. Interact with clients and provide support for queries and delivery adoption.
- Lead and mentor data analysts.
We are looking for someone who has:
- Apart from your love for data and your ability to code even while sleeping, you will need the following.
- Minimum of 2 years of experience in designing and delivering data science solutions.
- You should have successful retail/BFSI/FMCG/Manufacturing/QSR projects in your kitty to show off.
- Deep understanding of various statistical techniques, mathematical models, and algorithms to start the conversation with the data in hand.
- Ability to choose the right model for the data and translate that into code using R, Python, VBA, SQL, etc.
- Bachelor's/Master's degree in Engineering/Technology, or MBA from a Tier-1 B-School, or MSc in Statistics or Mathematics
Skillset Required:
- Regression
- Classification
- Predictive Modelling
- Prescriptive Modelling
- Python
- R
- Descriptive Modelling
- Time Series
- Clustering
What is in it for you:
- Be a part of building the biggest brand in Data science.
- An opportunity to be a part of a young and energetic team with a strong pedigree.
- Work on awesome projects across industries and learn from the best in the industry, while growing at a hyper rate.
Please Note:
At Ganit, we are looking for people who love problem solving. You are encouraged to apply even if your experience does not precisely match the job description above. Your passion and skills will stand out and set you apart—especially if your career has taken some extraordinary twists and turns over the years. We welcome diverse perspectives, people who think rigorously and are not afraid to challenge assumptions in a problem. Join us and punch above your weight!
Ganit is an equal opportunity employer and is committed to providing a work environment that is free from harassment and discrimination.
All recruitment, selection procedures and decisions will reflect Ganit’s commitment to providing equal opportunity. All potential candidates will be assessed according to their skills, knowledge, qualifications, and capabilities. No regard will be given to factors such as age, gender, marital status, race, religion, physical impairment, or political opinions.
Key Responsibilities (Data Developer: Python, Spark)
Experience: 2 to 9 years
Development of data platforms, integration frameworks, processes, and code.
Develop and deliver APIs in Python or Scala for Business Intelligence applications built using a range of web languages
Develop comprehensive automated tests for features via end-to-end integration tests, performance tests, acceptance tests and unit tests.
Elaborate stories in a collaborative agile environment (SCRUM or Kanban)
Familiarity with cloud platforms like GCP, AWS or Azure.
Experience with large data volumes.
Familiarity with writing REST-based services.
Experience with distributed processing and systems
Experience with Hadoop / Spark toolsets
Experience with relational database management systems (RDBMS)
Experience with Data Flow development
Knowledge of Agile and associated development techniques.
Responsibilities
- Responsible for implementation and ongoing administration of Hadoop infrastructure.
- Aligning with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
- Working with data delivery teams to set up new Hadoop users. This job includes setting up Linux users, setting up Kerberos principals and testing HDFS, Hive, Pig and MapReduce access for the new users.
- Cluster maintenance as well as creation and removal of nodes using tools like Ganglia, Nagios, Cloudera Manager Enterprise, Dell Open Manage and other tools.
- Performance tuning of Hadoop clusters and Hadoop MapReduce routines.
- Screen Hadoop cluster job performance and capacity planning.
- Monitor Hadoop cluster connectivity and security.
- Manage and review Hadoop log files.
- File system management and monitoring.
- Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
- Collaboration with application teams to install operating system and Hadoop updates, patches, version upgrades when required.
Qualifications
- Bachelor's degree in Information Technology, Computer Science or other relevant fields
- General operational expertise such as good troubleshooting skills, understanding of systems capacity, bottlenecks, basics of memory, CPU, OS, storage, and networks.
- Hadoop skills like HBase, Hive, Pig, Mahout
- Ability to deploy a Hadoop cluster, add and remove nodes, keep track of jobs, monitor critical parts of the cluster, configure name node high availability, schedule and configure it and take backups.
- Good knowledge of Linux as Hadoop runs on Linux.
- Familiarity with open source configuration management and deployment tools such as Puppet or Chef and Linux scripting.
Nice to Have
- Knowledge of troubleshooting core Java applications is a plus.
We are looking for a savvy Data Engineer to join our growing team of analytics experts.
The hire will be responsible for:
- Expanding and optimizing our data and data pipeline architecture
- Optimizing data flow and collection for cross-functional teams.
- Supporting our software developers, database architects, data analysts and data scientists on data initiatives and ensuring that the optimal data delivery architecture is consistent throughout ongoing projects.
- Must be self-directed and comfortable supporting the data needs of multiple teams, systems and products.
- Experience with Azure: ADLS, Databricks, Stream Analytics, SQL DW, Cosmos DB, Analysis Services, Azure Functions, Serverless Architecture, ARM Templates
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Experience with object-oriented/object function scripting languages: Python, SQL, Scala, Spark-SQL etc.
Nice to have experience with:
- Big data tools: Hadoop, Spark and Kafka
- Data pipeline and workflow management tools: Azkaban, Luigi, Airflow
- Stream-processing systems: Storm
Database: SQL DB
Programming languages: PL/SQL, Spark SQL
Looking for candidates with Data Warehousing experience, strong domain knowledge & experience working as a Technical lead.
The right candidate will be excited by the prospect of optimizing or even re-designing our company's data architecture to support our next generation of products and data initiatives.
Object-oriented languages (e.g. Python, PySpark, Java, C#, C++ ) and frameworks (e.g. J2EE or .NET)
SQL, Python, NumPy, Pandas. Knowledge of Hive and data warehousing concepts will be a plus.
JD
- Strong analytical skills with the ability to collect, organise, analyse and interpret trends or patterns in complex data sets and provide reports & visualisations.
- Work with management to prioritise business KPIs and information needs. Locate and define new process improvement opportunities.
- Technical expertise with data models, database design and development, data mining and segmentation techniques
- Proven success in a collaborative, team-oriented environment
- Working experience with geospatial data will be a plus.