11+ Apache Drill Jobs in India
Apply to 11+ Apache Drill Jobs on CutShort.io. Find your next job, effortlessly. Browse Apache Drill Jobs and apply today!
Data Platform engineering at Uber is looking for a strong Technical Lead (Level 5a Engineer) who has built high-quality platforms and services that can operate at scale. A 5a Engineer at Uber exhibits the following qualities:
- Demonstrate tech expertise › Demonstrate technical skills to go very deep or broad in solving classes of problems or creating broadly leverageable solutions.
- Execute large scale projects › Define, plan and execute complex and impactful projects. You communicate the vision to peers and stakeholders.
- Collaborate across teams › Act as a domain resource to engineers outside your team and help them leverage the right solutions. Facilitate technical discussions and drive to a consensus.
- Coach engineers › Coach and mentor less experienced engineers and deeply invest in their learning and success. You give and solicit feedback, both positive and negative, to others you work with to help improve the entire team.
- Tech leadership › Lead the effort to define the best practices in your immediate team, and help the broader organization establish better technical or business processes.
What You’ll Do
- Build a scalable, reliable, operable and performant data analytics platform for Uber’s engineers, data scientists, products and operations teams.
- Work alongside the pioneers of big data systems such as Hive, Yarn, Spark, Presto, Kafka, Flink to build out a highly reliable, performant, easy to use software system for Uber’s planet scale of data.
- Become proficient in the multi-tenancy, resource isolation, abuse prevention, and self-serve debuggability aspects of a high-performance, large-scale service while building these capabilities for Uber's engineers and operations folks.
What You’ll Need
- 7+ years experience in building large scale products, data platforms, distributed systems in a high caliber environment.
- Architecture: Identify and solve major architectural problems by going deep in your field or broad across different teams. Extend, improve, or, when needed, build solutions to address architectural gaps or technical debt.
- Software Engineering/Programming: Create frameworks and abstractions that are reliable and reusable. You have advanced knowledge of at least one programming language and are happy to learn more. Our core languages are Java, Python, Go, and Scala.
- Data Engineering: Expertise in one of the big data analytics technologies we currently use such as Apache Hadoop (HDFS and YARN), Apache Hive, Impala, Drill, Spark, Tez, Presto, Calcite, Parquet, Arrow etc. Under the hood experience with similar systems such as Vertica, Apache Impala, Drill, Google Borg, Google BigQuery, Amazon EMR, Amazon RedShift, Docker, Kubernetes, Mesos etc.
- Execution & Results: You tackle large technical projects/problems that are not clearly defined. You anticipate roadblocks and have strategies to de-risk timelines. You orchestrate work that spans multiple teams and keep your stakeholders informed.
- A team player: You believe that you can achieve more on a team and that the whole is greater than the sum of its parts. You rely on others’ candid feedback for continuous improvement.
- Business acumen: You understand requirements beyond the written word. Whether you’re working on an API used by other developers, an internal tool consumed by our operation teams, or a feature used by millions of customers, your attention to details leads to a delightful user experience.
Analytics Job Description
We are hiring an Analytics Engineer to help drive our Business Intelligence efforts. You will
partner closely with leaders across the organization, working together to understand the how
and why of people, team and company challenges, workflows and culture. The team is
responsible for delivering data and insights that drive decision-making, execution, and
investments for our product initiatives.
You will work cross-functionally with product, marketing, sales, engineering, finance, and our
customer-facing teams enabling them with data and narratives about the customer journey.
You’ll also work closely with other data teams, such as data engineering and product analytics,
to ensure we are creating a strong data culture at Blend that enables our cross-functional partners
to be more data-informed.
Role: Data Engineer
Please find below the JD for the Data Engineer role.
Location: Guindy, Chennai
How you’ll contribute:
• Develop objectives and metrics, ensure priorities are data-driven, and balance short-
term and long-term goals
• Develop deep analytical insights to inform and influence product roadmaps and
business decisions and help improve the consumer experience
• Work closely with GTM and supporting operations teams to author and develop core
data sets that empower analyses
• Deeply understand the business and proactively spot risks and opportunities
• Develop dashboards and define metrics that drive key business decisions
• Build and maintain scalable ETL pipelines via solutions such as Fivetran, Hightouch,
and Workato
• Design our Analytics and Business Intelligence architecture, assessing and
implementing new technologies that fit
• Work with our engineering teams to continually make our data pipelines and tooling
more resilient
Who you are:
• Bachelor’s degree or equivalent required from an accredited institution with a
quantitative focus such as Economics, Operations Research, Statistics, Computer Science OR 1-3 Years of Experience as a Data Analyst, Data Engineer, Data Scientist
• Must have strong SQL and data modeling skills, with experience applying skills to
thoughtfully create data models in a warehouse environment.
• A proven track record of using analysis to drive key decisions and influence change
• Strong storyteller and ability to communicate effectively with managers and
executives
• Demonstrated ability to define metrics for product areas, understand the right
questions to ask and push back on stakeholders in the face of ambiguous, complex
problems, and work with diverse teams with different goals
• A passion for documentation.
• A solution-oriented growth mindset. You’ll need to be a self-starter and thrive in a
dynamic environment.
• A bias towards communication and collaboration with business and technical
stakeholders.
• Quantitative rigor and systems thinking.
• Prior startup experience is preferred, but not required.
• Interest or experience in machine learning techniques (such as clustering, decision
tree, and segmentation)
• Familiarity with a scientific computing language, such as Python, for data wrangling
and statistical analysis
• Experience with a SQL focused data transformation framework such as dbt
• Experience with a Business Intelligence Tool such as Mode/Tableau
Mandatory Skillset:
-Very Strong in SQL
-Spark OR pyspark OR Python
-Shell Scripting
Should have a passion for learning and adapting to new technologies and for understanding
and troubleshooting issues and risks, along with the ability to make informed decisions and
lead projects.
Your Qualifications
- 2-5 Years’ Experience with functional programming
- Experience with functional programming using Scala with Spark framework.
- Strong understanding of Object-oriented programming, data structures and algorithms
- Good experience with any of the cloud platforms (Azure, AWS, GCP, etc.)
- Experience with distributed (multi-tiered) systems, relational databases and NoSQL storage solutions
- Desire to learn new technologies and languages
- Participation in software design, development, and code reviews
- High level of proficiency with Computer Science/Software Engineering knowledge and contribution to the technical skills growth of other team members
Your Responsibility
- Design, build and configure applications to meet business process and application requirements
- Proactively identify and communicate potential issues and concerns and recommend/implement alternative solutions as appropriate.
- Troubleshoot and optimize existing solutions
- Provide advice on technical design to ensure solutions are forward looking and flexible for potential future requirements and business needs.
XpressBees – a logistics company started in 2015 – is amongst the fastest growing
companies of its sector. While we started off rather humbly in the space of
ecommerce B2C logistics, the last 5 years have seen us steadily progress towards
expanding our presence. Our vision to evolve into a strong full-service logistics
organization reflects itself in our new lines of business like 3PL, B2B Xpress and cross
border operations. Our strong domain expertise and constant focus on meaningful
innovation have helped us rapidly evolve as the most trusted logistics partner of
India. We have progressively carved our way towards best-in-class technology
platforms, an extensive network reach, and a seamless last mile management
system. While on this aggressive growth path, we seek to become the one-stop-shop
for end-to-end logistics solutions. Our big focus areas for the very near future
include strengthening our presence as service providers of choice and leveraging the
power of technology to improve efficiencies for our clients.
Job Profile
As a Lead Data Engineer in the Data Platform Team at XpressBees, you will build the data platform
and infrastructure to support high quality and agile decision-making in our supply chain and logistics
workflows.
You will define the way we collect and operationalize data (structured / unstructured), and
build production pipelines for our machine learning models, and (RT, NRT, Batch) reporting &
dashboarding requirements. As a Senior Data Engineer in the XB Data Platform Team, you will use
your experience with modern cloud and data frameworks to build products (with storage and serving
systems) that drive optimisation and resilience in the supply chain via data visibility, intelligent decision making,
insights, anomaly detection and prediction.
What You Will Do
• Design and develop data platform and data pipelines for reporting, dashboarding and
machine learning models. These pipelines would productionize machine learning models
and integrate with agent review tools.
• Meet the data completeness, correctness and freshness requirements.
• Evaluate and identify the data store and data streaming technology choices.
• Lead the design of the logical model and implement the physical model to support
business needs. Produce logical and physical database designs across platforms (MPP,
MR, Hive/Pig) that are optimal for different use cases (structured/semi-structured).
Envision and implement the optimal data modelling, physical design, and
performance optimization techniques required for the problem.
• Support your colleagues by reviewing code and designs.
• Diagnose and solve issues in our existing data pipelines and envision and build their
successors.
Qualifications & Experience relevant for the role
• A bachelor's degree in Computer Science or related field with 6 to 9 years of technology
experience.
• Knowledge of Relational and NoSQL data stores, stream processing and micro-batching to
make technology & design choices.
• Strong experience in System Integration, Application Development, ETL, Data-Platform
projects. Talented across technologies used in the enterprise space.
• Software development experience using:
• Expertise in relational and dimensional modelling
• Exposure to all phases of the SDLC
• Experience in cloud architecture (AWS)
• Proven track record in keeping existing technical skills and developing new ones, so that
you can make strong contributions to deep architecture discussions around systems and
applications in the cloud (AWS).
• Characteristics of a forward thinker and self-starter that flourishes with new challenges
and adapts quickly to learning new knowledge
• Ability to work with cross-functional teams of consulting professionals across multiple
projects.
• Knack for helping an organization to understand application architectures and integration
approaches, to architect advanced cloud-based solutions, and to help launch the build-out
of those systems
• Passion for educating, training, designing, and building end-to-end systems.
Responsibilities:
* 3+ years of Data Engineering Experience - Design, develop, deliver and maintain data infrastructures.
* SQL Specialist – Strong knowledge and Seasoned experience with SQL Queries
* Languages: Python
* Good communicator, shows initiative, works well with stakeholders.
* Experience working closely with Data Analysts, providing the data they need and guiding them on issues.
* Solid ETL experience with Hadoop/Hive/PySpark/Presto/SparkSQL
* Solid communication and articulation skills
* Able to handle stakeholders independently, with minimal intervention from the reporting manager.
* Develop strategies to solve problems in logical yet creative ways.
* Create custom reports and presentations accompanied by strong data visualization and storytelling
We would be excited if you have:
* Excellent communication and interpersonal skills
* Ability to meet deadlines and manage project delivery
* Excellent report-writing and presentation skills
* Critical thinking and problem-solving capabilities
- Expert software implementation and automated testing
- Promoting development standards, code reviews, mentoring, knowledge sharing
- Improving our Agile methodology maturity
- Product and feature design, scrum story writing
- Build, release, and deployment automation
- Product support & troubleshooting
Who we have in mind:
- Demonstrated experience as a Java
- Should have a deep understanding of Enterprise/Distributed Architecture patterns and should be able to demonstrate the relevant usage of the same
- Turn high-level project requirements into application-level architecture and collaborate with the team members to implement the solution
- Strong experience and knowledge in Spring boot framework and microservice architecture
- Experience in working with Apache Spark
- Solid demonstrated object-oriented software development experience with Java, SQL, Maven, relational/NoSQL databases and testing frameworks
- Strong working experience with developing RESTful services
- Should have experience working on Application frameworks such as Spring, Spring Boot, AOP
- Exposure to tools – Jira, Bamboo, Git, Confluence would be an added advantage
- Excellent grasp of the current technology landscape, trends and emerging technologies
Role Summary
We are looking for an analytically inclined, insights-driven Product Analyst to make our organisation more data driven. In this role you will be responsible for creating dashboards to drive insights for product and business teams. Whether it is day-to-day decisions, long-term impact assessment, or measuring the efficacy of different products or teams, you'll be empowering each of them. The growing nature of the team will require you to be in touch with all of the teams at upgrad. Are you the "go-to" person everyone looks to for data? Then this role is for you.
Roles & Responsibilities
- Lead and own the analysis of highly complex data sources, identifying trends and patterns in data and provide insights/recommendations based on analysis results
- Build, maintain, own and communicate detailed reports to assist Marketing, Growth/Learning Experience and Other Business/Executive Teams
- Own the design, development, and maintenance of ongoing metrics, reports, analyses, dashboards, etc. to drive key business decisions.
- Analyze data and generate insights in the form of user analysis, user segmentation, performance reports, etc.
- Facilitate review sessions with management, business users and other team members
- Design and create visualizations to present actionable insights related to data sets and business questions at hand
- Develop intelligent models around channel performance, user profiling, and personalization
Skills Required
- 4-6 years of hands-on experience with product-related analytics and reporting
- Experience with building dashboards in Tableau or other data visualization tools such as D3
- Strong data, statistics, and analytical skills with a good grasp of SQL.
- Programming experience in Python is a must
- Comfortable managing large data sets
- Good Excel/data management skills
Tiger Analytics is a global AI & analytics consulting firm. With data and technology at the core of our solutions, we are solving some of the toughest problems out there. Our culture is modeled around expertise and mutual respect with a team first mindset. Working at Tiger, you’ll be at the heart of this AI revolution. You’ll work with teams that push the boundaries of what-is-possible and build solutions that energize and inspire.
We are headquartered in Silicon Valley and have delivery centres across the globe. The role below is for our Chennai or Bangalore office, or you can choose to work remotely.
About the Role:
As an Associate Director - Data Science at Tiger Analytics, you will lead the data science aspects of end-to-end client AI & analytics programs. Your role will be a combination of hands-on contribution, technical team management, and client interaction.
• Work closely with internal teams and client stakeholders to design analytical approaches to
solve business problems
• Develop and enhance a broad range of cutting-edge data analytics and machine learning
problems across a variety of industries.
• Work on various aspects of the ML ecosystem – model building, ML pipelines, logging &
versioning, documentation, scaling, deployment, monitoring and maintenance etc.
• Lead a team of data scientists and engineers to embed AI and analytics into the client
business decision processes.
Desired Skills:
• High level of proficiency in a structured programming language, e.g. Python, R.
• Experience designing data science solutions to business problems
• Deep understanding of ML algorithms for common use cases in both structured and
unstructured data ecosystems.
• Comfortable with large scale data processing and distributed computing
• Excellent written and verbal communication skills
• 10+ years of experience, of which 8 years are relevant data science experience including
hands-on programming.
Designation will be commensurate with expertise/experience. Compensation packages are among the best in the industry.
About the Company:
It is a Data as a Service company that helps businesses harness the power of data. Our technology fuels some of the most interesting big data projects in the world. We are a small bunch of people working towards shaping the imminent data-driven future by solving some of its fundamental and toughest challenges.
Role: We are looking for an experienced team lead to drive data acquisition projects end to end. In this role, you will be working in the web scraping team with data engineers, helping them solve complex web problems and mentor them along the way. You’ll be adept at delivering large-scale web crawling projects, breaking down barriers for your team and planning at a higher level, and getting into the detail to make things happen when needed.
Responsibilities
- Interface with clients and sales team to translate functional requirements into technical requirements
- Plan and estimate tasks with your team, in collaboration with the delivery managers
- Engineer complex data acquisition projects
- Guide and mentor your team of engineers
- Anticipate issues that might arise and proactively account for them in the design
- Perform code reviews and suggest design changes
Prerequisites
- Between 5-8 years of relevant experience
- Fluent programming skills and well-versed with scripting languages like Python or Ruby
- Solid foundation in data structures and algorithms
- Excellent tech troubleshooting skills
- Good understanding of web data landscape
- Prior exposure to DOM and XPath, and hands-on experience with Selenium/automated testing, is a plus
Skills and competencies
- Prior experience with team handling and people management is mandatory
- Work independently with little to no supervision
- Extremely high attention to detail
- Ability to juggle between multiple projects
culture and operating norms as a result of the fast-paced nature of a new, high-growth
organization.
• 7+ years of Industry experience primarily related to Unstructured Text Data and NLP
(PhD work and internships will be considered if they are related to unstructured text
in lieu of industry experience but not more than 2 years will be accounted towards
industry experience)
• Develop Natural Language Medical/Healthcare documents comprehension related
products to support Health business objectives, products and improve
processing efficiency, reducing overall healthcare costs
• Gather external data sets; build synthetic data and label data sets as per the needs
for NLP/NLR/NLU
• Apply expert software engineering skills to build Natural Language products to
improve automation and improve user experiences leveraging unstructured data storage, Entity Recognition, POS Tagging, ontologies, taxonomies, data mining,
information retrieval techniques, machine learning approaches, distributed and cloud
computing platforms
• Own the Natural Language and Text Mining products, from platforms to systems
for model training, versioning, deployment, storage and testing, creating
real-time feedback loops for fully automated services
• Work closely and collaborate with Data Scientists, Machine Learning engineers, IT
teams and Business stakeholders spread out across various locations in US and India
to achieve business goals
• Provide mentoring to other Data Scientists and Machine Learning Engineers
• Strong understanding of mathematical concepts including but not limited to linear
algebra, Advanced calculus, partial differential equations and statistics including
Bayesian approaches
• Strong programming experience including understanding of concepts in data
structures, algorithms, compression techniques, high performance computing,
distributed computing, and various computer architectures
• Good understanding and experience with traditional data science approaches like
sampling techniques, feature engineering, classification and regressions, SVM, trees,
model evaluations
• Additional course work, projects, research participation and/or publications in
Natural Language processing, reasoning and understanding, information retrieval,
text mining, search, computational linguistics, ontologies, semantics
• Experience with developing and deploying products in production with experience
in two or more of the following languages (Python, C++, Java, Scala)
• Strong Unix/Linux background and experience with at least one of the following
cloud vendors like AWS, Azure, and Google for 2+ years
• Hands on experience with one or more of high-performance computing and
distributed computing like Spark, Dask, Hadoop, CUDA distributed GPU (2+ years)
• Thorough understanding of deep learning architectures and hands on experience
with one or more frameworks like TensorFlow, PyTorch, Keras (2+ years)
• Hands on experience with libraries and tools like spaCy, NLTK, Stanford CoreNLP,
Gensim, John Snow Labs for 5+ years
• Understand business use cases and be able to translate them for the team with a
vision of how to implement them
• Identify enhancements and build best practices that can help to improve the
productivity of the team.