We have an urgent requirement for Big Data Developer profiles at our reputed MNC.
Location: Pune/Bangalore/Hyderabad/Nagpur
Experience: 4-9 years
Skills: PySpark, AWS
or Spark, Scala, AWS
or Python, AWS
About Persistent Systems
Power BI Developer (Azure Developer)
Job Description:
Senior visualization engineer with an understanding of Azure Data Factory & Databricks to develop and deliver solutions that enable delivery of information to audiences in support of key business processes.
Ensure code and design quality through execution of test plans and assist in development of standards & guidelines working closely with internal and external design, business and technical counterparts.
Desired Competencies:
- Strong data visualization design concepts centered on the business user, and a knack for communicating insights visually.
- Ability to produce any of the charting methods available with drill-down options and action-based reporting. This includes use of the right graphs for the underlying data with company themes and objects.
- Publishing reports & dashboards on reporting server and providing role-based access to users.
- Ability to create wireframes on any tool for communicating the reporting design.
- Creation of ad-hoc reports & dashboards to visually communicate data hub metrics (metadata information) for top management understanding.
- Should be able to handle huge volumes of data from databases such as SQL Server, Synapse, Delta Lake or flat files and create high-performance dashboards.
- Should be proficient in Power BI development
- Expertise in 2 or more BI (Visualization) tools in building reports and dashboards.
- Understanding of Azure components like Azure Data Factory, Data Lake Store, SQL Database, Azure Databricks
- Strong knowledge of SQL queries
- Must have worked in full life-cycle development from functional design to deployment
- Intermediate understanding of how to format, process and transform data
- Should have working knowledge of Git and SVN
- Good experience in establishing connections to heterogeneous sources like Hadoop, Hive, Amazon, Azure, Salesforce, SAP, HANA, APIs, various databases etc.
- Basic understanding of data modelling and ability to combine data from multiple sources to create integrated reports
Preferred Qualifications:
- Bachelor's degree in Computer Science or Technology
- Proven success in contributing to a team-oriented environment
About antuit.ai
Antuit.ai is the leader in AI-powered SaaS solutions for Demand Forecasting & Planning, Merchandising and Pricing. We have the industry’s first solution portfolio – powered by Artificial Intelligence and Machine Learning – that can help you digitally transform your Forecasting, Assortment, Pricing, and Personalization solutions. World-class retailers and consumer goods manufacturers leverage antuit.ai solutions, at scale, to drive outsized business results globally with higher sales, margin and sell-through.
Antuit.ai’s executives, comprised of industry leaders from McKinsey, Accenture, IBM, and SAS, and our team of Ph.Ds., data scientists, technologists, and domain experts, are passionate about delivering real value to our clients. Antuit.ai is funded by Goldman Sachs and Zodius Capital.
The Role:
Antuit.ai is interested in hiring a Principal Data Scientist. This person will facilitate standing up a standardization and automation ecosystem for ML product delivery, and will also actively participate in managing the implementation, design and tuning of the product to meet business needs.
Responsibilities:
Responsibilities include, but are not limited to, the following:
- Manage and provide technical expertise to the delivery team. This includes recommending solution alternatives, identifying risks and managing business expectations.
- Design and build reliable and scalable automated processes for large-scale machine learning.
- Use engineering expertise to help design solutions to novel problems in software development, data engineering, and machine learning.
- Collaborate with Business, Technology and Product teams to stand up the MLOps process.
- Apply your experience in making intelligent, forward-thinking technical decisions to deliver the ML ecosystem, including implementing new standards, architecture design, and workflow tools.
- Deep dive into complex algorithmic and product issues in production.
- Own metrics and reporting for delivery team.
- Set a clear vision for the team members and work cohesively to attain it.
- Mentor and coach team members.
Qualifications and Skills:
Requirements
- Engineering degree in any stream
- At least 7 years of prior experience in building ML-driven products/solutions.
- Excellent programming skills in one of the following languages: C++, Python or Java.
- Hands-on experience with open-source libraries and frameworks: TensorFlow, PyTorch, MLflow, Kubeflow, etc.
- Has developed and productized large-scale models/algorithms in prior roles.
- Can drive fast prototypes/proofs of concept to evaluate various technologies, frameworks and performance benchmarks.
- Familiar with software development practices/pipelines (DevOps: Kubernetes, Docker containers, CI/CD tools).
- Good verbal, written and presentation skills.
- Ability to learn new skills and technologies.
- 3+ years working with retail or CPG preferred.
- Experience in forecasting and optimization problems, particularly in the CPG / Retail industry preferred.
Information Security Responsibilities
- Understand and adhere to Information Security policies, guidelines and procedures, and practice them to protect organizational data and information systems.
- Take part in Information Security training and act accordingly while handling information.
- Report all suspected security and policy breaches to the Infosec team or the appropriate authority (CISO).
EEOC
Antuit.ai is an at-will, equal opportunity employer. We consider applicants for all positions without regard to race, color, religion, national origin or ancestry, gender identity, sex, age (40+), marital status, disability, veteran status, or any other legally protected status under local, state, or federal law.
Minimum of 8 years of experience, of which 4 years should be applied data mining experience in disciplines such as Call Centre Metrics.
Strong experience in advanced statistics and analytics including segmentation, modelling, regression, forecasting etc.
Experience with leading and managing large teams.
Demonstrated pattern of success in using advanced quantitative analytic methods to solve business problems.
Demonstrated experience with Business Intelligence/Data Mining tools to work with data, investigate anomalies, construct data sets, and build models.
Critical to share details on projects undertaken (preferably in the telecom industry), specifically through analysis from CRM.
Good exposure to concepts and/or technology across the broader spectrum. Enterprise Risk Technology covers a variety of existing systems and green-field projects.
Full-stack Hadoop development experience with Scala development.
Full-stack Java development experience covering Core Java (including JDK 1.8) and a good understanding of design patterns.
Requirements:
• Strong hands-on development in Java technologies.
• Strong hands-on development in Hadoop technologies like Spark and Scala, and experience with Avro.
• Participation in product feature design and documentation
• Requirement breakdown, ownership and implementation.
• Product BAU deliveries and Level 3 production defect fixes.
Qualifications & Experience
• Degree holder in a numerate subject
• Hands-on experience with Hadoop, Spark, Scala, Impala, Avro and messaging systems like Kafka
• Experience across a core compiled language – Java
• Proficiency in Java-related frameworks like Spring, Hibernate, JPA
• Hands-on experience in JDK 1.8 and a strong skill set covering Collections and Multithreading, with experience working on distributed applications.
• Strong hands-on development track record with end-to-end development cycle involvement
• Good exposure to computational concepts
• Good communication and interpersonal skills
• Working knowledge of risk and derivatives pricing (optional)
• Proficiency in SQL (PL/SQL), data modelling.
• Understanding of Hadoop architecture and the Scala programming language is good to have.
Experience Range | 2 Years - 10 Years |
Function | Information Technology |
Desired Skills |
Must Have Skills:
• Good experience in PySpark, including DataFrame core functions and Spark SQL
• Good experience in SQL databases, with the ability to write queries of fair complexity.
• Should have excellent experience in Big Data programming for data transformation and aggregations
• Good at ELT architecture. Business rules processing and data extraction from Data Lake into data streams for business consumption.
• Good customer communication.
• Good Analytical skills
|
Education Type | Engineering |
Degree / Diploma | Bachelor of Engineering, Bachelor of Computer Applications, Any Engineering |
Specialization / Subject | Any Specialisation |
Job Type | Full Time |
Job ID | 000018 |
Department | Software Development |
About the Company
Blue Sky Analytics is a Climate Tech startup that combines the power of AI & Satellite data to aid in the creation of a global environmental data stack. Our funders include Beenext and Rainmatter. Over the next 12 months, we aim to expand to 10 environmental data-sets spanning water, land, heat, and more!
We are looking for a Data Lead - someone who works at the intersection of data science, GIS, and engineering. We want a leader who not only understands environmental data but can also quickly assemble large-scale datasets that are crucial to the well-being of our planet. Come save the planet with us!
Your Role
Manage: This is a leadership position that requires long-term strategic thinking. You will be in charge of the daily operations of the data team. This includes running team standups, planning the execution of data generation and ensuring the algorithms are put into production. You will also be the person in charge of dumbing down the data science for the rest of us who do not know what it means.
Love and Live Data: You will also be taking all the responsibility of ensuring that the data we generate is accurate, clean, and is ready to use for our clients. This would entail that you understand what the market needs, calculate feasibilities and build data pipelines. You should understand the algorithms that we use or need to use and take decisions on what would serve the needs of our clients well. We also want our Data Lead to be constantly probing for newer and optimized ways of generating datasets. It would help if they were abreast of all the latest developments in the data science and environmental worlds. The Data Lead also has to be able to work with our Platform team on integrating the data on our platform and API portal.
Collaboration: We use Clubhouse to track and manage our projects across our organization - this will require you to collaborate with the team and follow up with members on a regular basis. About 50% of the work involves being the pulse of the platform team. You'll collaborate closely with peers from other functions (Design, Product, Marketing, Sales, and Support, to name a few) on our overall product roadmap, on product launches, and on ongoing operations. You will find yourself working with the product management team to define and execute the feature roadmap. You will be expected to work closely with the CTO, reporting on daily operations and development. We don't believe in a top-down hierarchical approach and are transparent with everyone. This means honest and mutual feedback and the ability to adapt.
Teaching: Not exactly in the traditional sense. You'll recruit, coach, and develop engineers while ensuring that they are regularly receiving feedback and making rapid progress on personal and professional goals.
Humble and cool: Look, we will be upfront with you about one thing - our team is fairly young and is always buzzing with work. In this fast-paced setting, we are looking for someone who can stay cool, is humble, and is willing to learn. You are adaptable, can skill up fast, and are fearless at trying new methods. After all, you're in the business of saving the planet!
Requirements
- A minimum of 5 years of industry experience.
- Hyper-curious!
- Exceptional at Remote Sensing Data, GIS, Data Science.
- Must have big data & data analytics experience
- Very good at documentation & speccing datasets
- Experience with AWS Cloud, Linux, Infra as Code & Docker (containers) is a must
- Coordinate with cross-functional teams (DevOps, QA, Design etc.) on planning and execution
- Lead, mentor and manage the deliverables of a talented and highly motivated team of developers
- Must have experience in building, managing, growing & hiring data teams. Has built large-scale datasets from scratch
- Manage work on the team's Clubhouse & follow up with the team. About 50% of the work involves being the pulse of the platform team
- Exceptional communication skills & ability to abstract away problems & build systems. Should be able to explain to the management anything & everything
- Quality control - you'll be responsible for maintaining a high quality bar for everything your team ships. This includes documentation and data quality
- Experience of having led smaller teams would be a plus.
Benefits
- Work from anywhere: Work by the beach or from the mountains.
- Open source at heart: We are building a community that you can use, contribute to and collaborate with.
- Own a slice of the pie: Possibility of becoming an owner by investing in ESOPs.
- Flexible timings: Fit your work around your lifestyle.
- Comprehensive health cover: Health cover for you and your dependents to keep you tension-free.
- Work Machine of choice: Buy a device and own it after completing a year at BSA.
- Quarterly Retreats: Yes, there's work, but then there's all the non-work + fun aspect, aka the retreat!
- Yearly vacations: Take time off to rest and get ready for the next big assignment by using your paid leave.
We are actively seeking a Senior Data Engineer experienced in building data pipelines and integrations from 3rd party data sources by writing custom automated ETL jobs using Python. The role will work in partnership with other members of the Business Analytics team to support the development and implementation of new and existing data warehouse solutions for our clients. This includes designing database import/export processes used to generate client data warehouse deliverables.
- 2+ years of experience as an ETL developer with strong data architecture knowledge around data warehousing concepts, SQL development and optimization, and operational support models.
- Experience using Python to automate ETL/data processing jobs.
- Design and develop ETL and data processing solutions using data integration tools, Python scripts, and AWS / Azure / on-premises environments.
- Experience / Willingness to learn AWS Glue / AWS Data Pipeline / Azure Data Factory for Data Integration.
- Develop and create transformation queries, views, and stored procedures for ETL processes, and process automation.
- Document data mappings, data dictionaries, processes, programs, and solutions as per established standards for data governance.
- Work with the data analytics team to assess and troubleshoot potential data quality issues at key intake points such as validating control totals at intake and then upon transformation, and transparently build lessons learned into future data quality assessments
- Solid experience with data modeling, business logic, and RESTful APIs.
- Solid experience in the Linux environment.
- Experience with NoSQL / PostgreSQL preferred
- Experience working with databases such as MySQL, NoSQL, and Postgres, and enterprise-level connectivity experience (such as connecting over TLS and through proxies).
- Experience with NGINX and SSL.
- Performance tune data processes and SQL queries, and recommend and implement data process optimization and query tuning techniques.