



Location: Bangalore/Pune/Hyderabad/Nagpur
4-5 years of overall experience in software development.
- Experience with Hadoop (Apache/Cloudera/Hortonworks) and/or other MapReduce platforms
- Experience with Hive, Pig, Sqoop, Flume and/or Mahout
- Experience with NoSQL databases: HBase, Cassandra, MongoDB
- Hands-on experience with Spark development; knowledge of Storm, Kafka, and Scala
- Good knowledge of Java
- Good background in build, configuration management, and ticketing tools such as Maven, Ant, and JIRA
- Knowledge of any data integration and/or EDW tools is a plus
- Good to have knowledge of using Python/Perl/Shell
Please note: HBase, Hive, and Spark are a must.

Similar jobs


We are seeking a skilled and innovative Developer with strong expertise in Scala, Java/Python and Spark/Hadoop to join our dynamic team.
Key Responsibilities:
• Design, develop, and maintain robust and scalable backend systems using Scala, Spark, and Hadoop, drawing on expertise in Python/Java.
• Build and deploy a highly efficient, modular, and maintainable microservices architecture for enterprise-level applications.
• Write and optimize algorithms to enhance application performance and scalability.
Required Skills:
• Programming: Expert in Scala and object-oriented programming.
• Frameworks: Hands-on experience with Spark and Hadoop
• Databases: Experience with relational databases (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB).
• Location: Mumbai
• Employment Type: Full-time
Roles and Responsibilities
Requirements
• Extensive and expert programming experience in at least one general programming language (e.g., Java, C, C++) and tech stack to write maintainable, scalable, unit-tested code.
• Experience with multi-threading and concurrency programming.
• Extensive object-oriented design experience, knowledge of design patterns, and a strong passion for and ability to design intuitive modules and class-level interfaces.
• Excellent coding skills - should be able to convert design into code fluently.
• Knowledge of Test Driven Development.
• Good understanding of relational databases (e.g., MySQL) and NoSQL stores (e.g., HBase, Elasticsearch, Aerospike).
• Strong desire to solve complex and interesting real world problems.
• Experience with full life cycle development in any programming language on a Linux platform.
• Go-getter attitude that reflects in energy and intent behind assigned tasks.
• Worked in a startup-like environment with high levels of ownership and commitment.
• BTech, MTech, or Ph.D. in Computer Science or a related technical discipline (or equivalent).
• Experience in building highly scalable business applications that involve implementing large, complex business flows and dealing with huge amounts of data.
• 3+ years of experience in the art of writing code and solving problems on a large scale.
• Open communicator who shares thoughts and opinions frequently, listens intently, and takes constructive feedback.
The Sr. Analytics Engineer would provide technical expertise in needs identification, data modeling, data movement, and transformation mapping (source to target), automation and testing strategies, translating business needs into technical solutions with adherence to established data guidelines and approaches from a business unit or project perspective.
Understands and leverages best-fit technologies (e.g., traditional star schema structures, cloud, Hadoop, NoSQL, etc.) and approaches to address business and environmental challenges.
Provides data understanding and coordinates data-related activities with other data management groups such as master data management, data governance, and metadata management.
Actively participates with other consultants in problem-solving and approach development.
Responsibilities:
Provide a consultative approach with business users, asking questions to understand the business need and deriving the data flow, conceptual, logical, and physical data models based on those needs.
Perform data analysis to validate data models and to confirm the ability to meet business needs.
Assist with and support setting the data architecture direction: ensure data architecture deliverables are developed, ensure compliance with standards and guidelines, implement the data architecture, and support technical developers at a project or business unit level.
Coordinate and consult with the Data Architect, project manager, client business staff, client technical staff and project developers in data architecture best practices and anything else that is data related at the project or business unit levels.
Work closely with Business Analysts and Solution Architects to design the data model satisfying the business needs and adhering to Enterprise Architecture.
Coordinate with Data Architects, Program Managers and participate in recurring meetings.
Help and mentor team members to understand the data model and subject areas.
Ensure that the team adheres to best practices and guidelines.
Requirements:
- At least 3 years of strong working knowledge of Spark, Java/Scala/PySpark, Kafka, Git, Unix/Linux, and ETL pipeline design.
- Experience with Spark optimization, tuning, and resource allocation
- Excellent understanding of in-memory distributed computing frameworks like Spark, including parameter tuning and writing optimized workflow sequences.
- Experience with relational databases (e.g., PostgreSQL, MySQL), NoSQL databases (e.g., Cassandra), and cloud data warehouses (e.g., Redshift, BigQuery).
- Familiarity with Docker, Kubernetes, Azure Data Lake/Blob storage, AWS S3, Google Cloud storage, etc.
- Have a deep understanding of the various stacks and components of the Big Data ecosystem.
- Hands-on experience with Python is a huge plus
· Core responsibilities include analyzing business requirements and designs for accuracy and completeness, and developing and maintaining the relevant product.
· BlueYonder is seeking a Senior/Principal Architect in the Data Services department (under Luminate Platform) to act as one of the key technology leaders to build and manage BlueYonder’s technology assets in the Data Platform and Services.
· This individual will act as a trusted technical advisor and strategic thought leader to the Data Services department. The successful candidate will have the opportunity to lead, participate, guide, and mentor other people in the team on architecture and design in a hands-on manner. You are responsible for the technical direction of the Data Platform. This position reports to the Global Head, Data Services and will be based in Bangalore, India.
· Core responsibilities include architecting and designing (along with counterparts and distinguished Architects) a ground-up, cloud-native (we use Azure) SaaS product in order management and micro-fulfillment.
· The team currently comprises 60+ global associates across the US, India (COE), and the UK and is expected to grow rapidly. The incumbent will need leadership qualities to mentor junior and mid-level software associates in our team. This person will lead the Data Platform architecture: streaming and bulk, with Snowflake, Elasticsearch, and other tools.
Our current technical environment:
· Software: Java, Spring Boot, Gradle, Git, Hibernate, REST API, OAuth, Snowflake
· Application Architecture: Scalable, resilient, event-driven, secure multi-tenant microservices architecture
· Cloud Architecture: MS Azure (ARM templates, AKS, HDInsight, Application Gateway, Virtual Networks, Event Hub, Azure AD)
· Frameworks/Others: Kubernetes, Kafka, Elasticsearch, Spark, NoSQL, RDBMS, Spring Boot, Gradle, Git, Ignite
- 3+ years of SDE work experience at product-based companies
- Experience in Java, Spring Boot, MySQL, Kafka, HBase, AWS
- Experience in multithreading, distributed systems, coding best practices, and scaling
Your Opportunity
- Own business features and translate them into technical requirements
- Design and develop large-scale, real-time server-side systems
- Quickly create quality prototypes
- Stay updated on emerging technologies
- Ensure that all deliverables adhere to our world-class standards
- Promote coding best practices
- Mentor and develop junior developers in the team
Required Experience:
- 4+ years of relevant experience as described below
- Excellent grasp of Core Java, multithreading, and OO design patterns
- Experience with Scala, functional and reactive programming, and Akka/Play is a plus
- Excellent understanding of data structures and algorithms
- Solid grasp of large-scale, distributed, real-time systems
- Prior experience building scalable and resilient microservices
- Solid understanding of relational databases, NoSQL databases and Caching systems
- Good understanding of Big Data technologies such as Spark and Hadoop is a plus
- Experience with one of AWS, Azure, or GCP
Who you are:
- You have excellent and effective communication and collaborative skills
- You love problem solving
- You stay up to date with the latest technologies and then apply them in real life
- You love paying attention to detail
- You thrive in meeting tight deadlines and prioritising workloads
- You can collaborate across multiple functions
Education:
Bachelor’s degree in Engineering or equivalent experience within the field

Be Part Of Building The Future
Dremio is the Data Lake Engine company. Our mission is to reshape the world of analytics to deliver on the promise of data with a fundamentally new architecture, purpose-built for the exploding trend towards cloud data lake storage such as AWS S3 and Microsoft ADLS. We dramatically reduce and even eliminate the need for the complex and expensive workarounds that have been in use for decades, such as data warehouses (whether on-premise or cloud-native), structural data prep, ETL, cubes, and extracts. We do this by enabling lightning-fast queries directly against data lake storage, combined with full self-service for data users and full governance and control for IT. The results for enterprises are extremely compelling: 100X faster time to insight; 10X greater efficiency; zero data copies; and game-changing simplicity. And equally compelling is the market opportunity for Dremio, as we are well on our way to disrupting a $25BN+ market.
About the Role
The Dremio India team owns the Data Lake Engine along with the cloud infrastructure and services that power it. With a focus on next-generation data analytics supporting modern table formats like Iceberg and Delta Lake, open source initiatives such as Apache Arrow and Project Nessie, and hybrid-cloud infrastructure, this team provides various opportunities to learn, deliver, and grow in your career. We are looking for innovative minds with experience in leading and building high-quality distributed systems at massive scale and solving complex problems.
Responsibilities & ownership
- Lead, build, deliver and ensure customer success of next-generation features related to scalability, reliability, robustness, usability, security, and performance of the product.
- Work on distributed systems for data processing with efficient protocols and communication, locking and consensus, schedulers, resource management, low latency access to distributed storage, auto scaling, and self healing.
- Understand and reason about concurrency and parallelization to deliver scalability and performance in a multithreaded and distributed environment.
- Lead the team to solve complex and unknown problems
- Solve technical problems and customer issues with technical expertise
- Design and deliver architectures that run optimally on public clouds like GCP, AWS, and Azure
- Mentor other team members for high quality and design
- Collaborate with Product Management to deliver on customer requirements and innovation
- Collaborate with Support and field teams to ensure that customers are successful with Dremio
Requirements
- B.S./M.S. in Computer Science or a related technical field, or equivalent experience
- Fluency in Java/C++ with 8+ years of experience developing production-level software
- Strong foundation in data structures, algorithms, multi-threaded and asynchronous programming models, and their use in developing distributed and scalable systems
- 5+ years experience in developing complex and scalable distributed systems and delivering, deploying, and managing microservices successfully
- Hands-on experience in query processing or optimization, distributed systems, concurrency control, data replication, code generation, networking, and storage systems
- Passion for quality, zero downtime upgrades, availability, resiliency, and uptime of the platform
- Passion for learning and delivering using latest technologies
- Ability to solve ambiguous, unexplored, and cross-team problems effectively
- Hands-on experience working on projects on AWS, Azure, and Google Cloud Platform
- Experience with containers and Kubernetes for orchestration and container management in private and public clouds (AWS, Azure, and Google Cloud)
- Understanding of distributed storage systems such as S3, ADLS, or HDFS
- Excellent communication skills and affinity for collaboration and teamwork
- Ability to work individually and collaboratively with other team members
- Ability to scope and plan solutions for big problems and mentor others in doing the same
- Interested and motivated to be part of a fast-moving startup with a fun and accomplished team

