Strong experience in Scala/Spark
End client: Sapient
Mode of Hiring: FTE
Notice period should be less than 30 days
About Mirafra Technologies
Position Overview: We are seeking a talented Data Engineer with expertise in Power BI to join our team. The ideal candidate will be responsible for designing and implementing data pipelines, as well as developing insightful visualizations and reports using Power BI. Additionally, the candidate should have strong skills in Python, data analytics, PySpark, and Databricks. This role requires a blend of technical expertise, analytical thinking, and effective communication skills.
Key Responsibilities:
- Design, develop, and maintain data pipelines and architectures using PySpark and Databricks.
- Implement ETL processes to extract, transform, and load data from various sources into data warehouses or data lakes.
- Collaborate with data analysts and business stakeholders to understand data requirements and translate them into actionable insights.
- Develop interactive dashboards, reports, and visualizations using Power BI to communicate key metrics and trends.
- Optimize and tune data pipelines for performance, scalability, and reliability.
- Monitor and troubleshoot data infrastructure to ensure data quality, integrity, and availability.
- Implement security measures and best practices to protect sensitive data.
- Stay updated with emerging technologies and best practices in data engineering and data visualization.
- Document processes, workflows, and configurations to maintain a comprehensive knowledge base.
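A minimal, stdlib-only Python sketch of the extract-transform-load pattern the responsibilities above describe (the sample data, table name, and helper names are all hypothetical; a production pipeline would use PySpark/Databricks as noted):

```python
import sqlite3

# --- Extract: a real pipeline would read from source systems;
# here the raw rows are inlined as hypothetical sample data.
raw_orders = [
    {"order_id": "1", "amount": "120.50", "region": "EU"},
    {"order_id": "2", "amount": "", "region": "US"},   # missing amount -> dropped
    {"order_id": "3", "amount": "75.00", "region": "US"},
]

def transform(rows):
    """Drop rows with missing amounts and cast fields to typed values."""
    out = []
    for r in rows:
        if not r["amount"]:
            continue
        out.append((int(r["order_id"]), float(r["amount"]), r["region"]))
    return out

def load(rows, conn):
    """Load transformed rows into a warehouse-style table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL, region TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(raw_orders), conn)
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(total)  # 195.5
```

The same extract/transform/load split maps directly onto PySpark jobs, with DataFrames replacing the Python lists.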
Requirements:
- Bachelor’s degree in Computer Science, Engineering, or related field. (Master’s degree preferred)
- Proven experience as a Data Engineer with expertise in Power BI, Python, PySpark, and Databricks.
- Strong proficiency in Power BI, including data modeling, DAX calculations, and creating interactive reports and dashboards.
- Solid understanding of data analytics concepts and techniques.
- Experience working with Big Data technologies such as Hadoop, Spark, or Kafka.
- Proficiency in programming languages such as Python and SQL.
- Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud.
- Excellent analytical and problem-solving skills with attention to detail.
- Strong communication and collaboration skills to work effectively with cross-functional teams.
- Ability to work independently and manage multiple tasks simultaneously in a fast-paced environment.
Preferred Qualifications:
- Advanced degree in Computer Science, Engineering, or related field.
- Certifications in Power BI or related technologies.
- Experience with data visualization tools other than Power BI (e.g., Tableau, QlikView).
- Knowledge of machine learning concepts and frameworks.
About Us:
Cognitio Analytics is an award-winning, niche service provider that offers digital transformation solutions powered by AI and machine learning. We help clients realize the full potential of their data assets and the investments made in related technologies, be it analytics and big data platforms, digital technologies, or their people. Our solutions include Health Analytics powered by Cognitio’s Health Data Factory, which drives better care outcomes and higher ROI on care spending. We specialize in Data Governance solutions that help clients use data effectively, consistently, and compliantly. Additionally, our smart intelligence solutions enable a deeper understanding of operations through data science and advanced techniques like process mining. We have offices in New Jersey and Connecticut in the USA and in Gurgaon in India.
What we're looking for:
- Ability to model data and to design, build, and deploy DW/BI systems for Insurance, Health Care, Banking, etc.
- Tune the performance of ETL processes and SQL queries, and recommend and implement ETL and query tuning techniques.
- Develop transformation queries, views, and stored procedures for ETL processes and process automation.
- Translate business needs into technical specifications, evaluate, and improve existing BI systems.
- Use business intelligence and visualization software (e.g., Tableau, Qlik Sense, Power BI etc.) to empower customers to drive their analytics and reporting
- Develop and update technical documentation for BI systems.
Key Technical Skills:
- Hands-on experience in MS SQL Server & MSBI (SSIS/SSRS/SSAS), with an understanding of database concepts, star schema, SQL tuning, OLAP, Databricks, Hadoop, Spark, and cloud technologies.
- Experience in designing and building complete ETL / SSIS processes and transforming data for ODS, Staging, and Data Warehouse
- Experience building self-service reporting solutions using business intelligence software (e.g., Tableau, Qlik Sense, Power BI etc.)
- Graduate+ in Mathematics, Statistics, Computer Science, Economics, Business, Engineering or equivalent work experience.
- Total experience of 5+ years, with at least 2 years managing data quality for high-scale data platforms.
- Good knowledge of SQL querying.
- Strong skills in analysing data and uncovering patterns using SQL or Python.
- Excellent understanding of data warehouse/big data concepts such as data extraction, data transformation, and data loading (the ETL process).
- Strong background in automation and building automated testing frameworks for data ingestion and transformation jobs.
- Experience in big data technologies a big plus.
- Experience in machine learning, especially in data quality applications a big plus.
- Experience in building data quality automation frameworks a big plus.
- Strong experience working with an Agile development team with rapid iterations.
- Very strong verbal and written communication, and presentation skills.
- Ability to quickly understand business rules.
- Ability to work well with others in a geographically distributed team.
- Keen observation skills to analyse data, highly detail oriented.
- Excellent judgment, critical-thinking, and decision-making skills; can balance attention to detail with swift execution.
- Able to identify stakeholders, build relationships, and influence others to get work done.
- Self-directed and self-motivated individual who takes complete ownership of the product and its outcome.
About Kloud9:
Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers, and developers helps retailers launch a successful cloud initiative so they can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce cloud adoption time and effort, so clients benefit directly from lower migration costs.
Kloud9 was founded with the vision of bridging the gap between e-commerce and the cloud. E-commerce in any industry is constrained by the significant cost of physical data infrastructure.
At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.
Our sole focus is to provide cloud expertise to the retail industry, empowering our clients to take their business to the next level. Our team of proficient architects, engineers, and developers have been designing, building, and implementing solutions for retailers for an average of more than 20 years.
We are a cloud vendor that is both platform and technology independent. Our vendor independence not only gives us a unique perspective on the cloud market but also ensures that we deliver the cloud solutions that best meet our clients' requirements.
What we are looking for:
● 3+ years’ experience developing Big Data & Analytic solutions
● Experience building data lake solutions leveraging Google Data Products (e.g. Dataproc, AI Building Blocks, Looker, Cloud Data Fusion, Dataprep, etc.), Hive, Spark
● Experience with relational (SQL) and NoSQL databases
● Experience with Spark (Scala/Python/Java) and Kafka
● Work experience with using Databricks (Data Engineering and Delta Lake components)
● Experience with source control tools such as GitHub and related dev process
● Experience with workflow scheduling tools such as Airflow
● In-depth knowledge of any scalable cloud vendor (GCP preferred)
● Has a passion for data solutions
● Strong understanding of data structures and algorithms
● Strong understanding of solution and technical design
● Has a strong problem solving and analytical mindset
● Experience working with Agile Teams.
● Able to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders
● Able to quickly pick up new programming languages, technologies, and frameworks
● Bachelor’s degree in Computer Science
Why Explore a Career at Kloud9:
With job opportunities in prime locations in the US, London, Poland, and Bengaluru, we help you build a career path in the cutting-edge technologies of AI, Machine Learning, and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with its creativity and innovative solutions. Our vested interest in our employees translates into delivering the best products and solutions to our customers!
About RARA NOW:
- RaRa Now is revolutionizing instant delivery for e-commerce in Indonesia through data-driven logistics.
- RaRa Now is making instant and same-day deliveries scalable and cost-effective by leveraging a differentiated operating model and real-time optimization technology. RaRa makes it possible for anyone, anywhere to get same-day delivery in Indonesia. While others are focusing on "one-to-one" deliveries, the company has developed proprietary, real-time batching tech to do "many-to-many" deliveries within a few hours. RaRa is already in partnership with some of the top eCommerce players in Indonesia like Blibli, Sayurbox, Kopi Kenangan, and many more.
- We are a distributed team with the company headquartered in Singapore, core operations in Indonesia, and a technology team based out of India.
Future of eCommerce Logistics:
- A data-driven logistics company bringing a same-day delivery revolution to Indonesia
- Revolutionizing delivery as an experience
- Empowering D2C sellers with logistics as the core technology
About the Role :
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Advanced working SQL knowledge: experience with relational databases and query authoring (SQL), as well as working familiarity with a variety of databases.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Prior experience working with BigQuery, Redshift, or other data warehouses
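The message-queuing and stream-processing knowledge listed above centers on patterns like windowed aggregation; the following is a stdlib-only Python sketch of that idea (the window size and event values are illustrative, not from any particular framework):

```python
from collections import deque

WINDOW = 3  # hypothetical window size: number of most recent events

def windowed_averages(events, window=WINDOW):
    """Consume a stream of numeric events and yield the rolling mean
    over the last `window` events -- the core idea behind windowed
    aggregation in stream processors."""
    buf = deque(maxlen=window)  # old events fall out automatically
    for value in events:
        buf.append(value)
        yield sum(buf) / len(buf)

stream = [10, 20, 30, 40]
print(list(windowed_averages(stream)))  # [10.0, 15.0, 20.0, 30.0]
```

Real stream processors (Kafka Streams, Spark Structured Streaming) apply the same rolling-state idea across partitioned, distributed event streams.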
Work Timing: 5 Days A Week
Responsibilities include:
• Ensure the right stakeholders get the right information at the right time
• Gather requirements with stakeholders to understand their data needs
• Creating and deploying reports
• Participate actively in datamarts design discussions
• Work on both RDBMS as well as Big Data for designing BI Solutions
• Write code (queries/procedures) in SQL / Hive / Drill that is both functional and elegant, following appropriate design patterns
• Design and plan BI solutions to automate regular reporting
• Debugging, monitoring and troubleshooting BI solutions
• Creating and deploying datamarts
• Writing relational and multidimensional database queries
• Integrate heterogeneous data sources into BI solutions
• Ensure Data Integrity of data flowing from heterogeneous data sources into BI solutions.
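As an illustration of the relational and multidimensional queries behind such BI solutions, here is a small self-contained sketch using Python's built-in sqlite3 (the star-schema tables and figures are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical datamart: one dimension table and one fact table (star schema).
conn.executescript("""
CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (sale_id INTEGER, region_id INTEGER, amount REAL);
INSERT INTO dim_region VALUES (1, 'North'), (2, 'South');
INSERT INTO fact_sales VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0);
""")

# The kind of aggregation query a BI report would issue:
# join the fact table to its dimension and roll up sales per region.
rows = conn.execute("""
    SELECT d.name, SUM(f.amount) AS total_sales
    FROM fact_sales AS f
    JOIN dim_region AS d ON d.region_id = f.region_id
    GROUP BY d.name
    ORDER BY total_sales DESC
""").fetchall()
print(rows)  # [('North', 150.0), ('South', 75.0)]
```

The same fact/dimension join pattern carries over unchanged to Hive or Drill over big-data stores.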
Minimum Job Qualifications:
• BE/B.Tech in Computer Science/IT from Top Colleges
• 1-5 years of experience in data warehousing and SQL
• Excellent Analytical Knowledge
• Excellent technical as well as communication skills
• Attention to even the smallest detail is mandatory
• Knowledge of SQL query writing and performance tuning
• Knowledge of Big Data technologies like Apache Hadoop, Apache Hive, Apache Drill
• Knowledge of fundamentals of Business Intelligence
• In-depth knowledge of RDBMS, data warehousing, and datamarts
• Smart, motivated and team oriented
Desirable Requirements
• Sound knowledge of software development (preferably in Java)
• Knowledge of the software development lifecycle (SDLC) and models
- Expert software implementation and automated testing
- Promoting development standards, code reviews, mentoring, knowledge sharing
- Improving our Agile methodology maturity
- Product and feature design, scrum story writing
- Build, release, and deployment automation
- Product support & troubleshooting
Who we have in mind:
- Demonstrated experience as a Java developer
- Should have a deep understanding of Enterprise/Distributed Architecture patterns and should be able to demonstrate the relevant usage of the same
- Turn high-level project requirements into application-level architecture and collaborate with the team members to implement the solution
- Strong experience and knowledge in Spring boot framework and microservice architecture
- Experience in working with Apache Spark
- Solid demonstrated object-oriented software development experience with Java, SQL, Maven, relational/NoSQL databases and testing frameworks
- Strong working experience with developing RESTful services
- Should have experience working on Application frameworks such as Spring, Spring Boot, AOP
- Exposure to tools such as Jira, Bamboo, Git, and Confluence would be an added advantage
- Excellent grasp of the current technology landscape, trends and emerging technologies
- Hands-on programming expertise in Java OR Python
- Strong production experience with Spark (Minimum of 1-2 years)
- Experience building data pipelines using Big Data technologies (Hadoop, Spark, Kafka, etc.) on large-scale unstructured data sets
- Working experience and good understanding of public cloud environments (AWS OR Azure OR Google Cloud)
- Experience with IAM policy and role management is a plus
Responsibilities for Data Scientist/NLP Engineer
• Work with customers to identify opportunities for leveraging their data to drive business solutions.
• Develop custom data models and algorithms to apply to data sets.
• Basic data cleaning and annotation for any incoming raw data.
• Use predictive modeling to increase and optimize customer experiences, revenue generation, ad targeting, and other business outcomes.
• Develop the company’s A/B testing framework and test model quality.
• Deploy ML models in production.
Qualifications for Junior Data Scientist/ NLP Engineer
• BS, MS in Computer Science, Engineering, or related discipline.
• 3+ Years of experience in Data Science/Machine Learning.
• Experience with programming language Python.
• Familiar with at least one database query language, such as SQL
• Knowledge of Text Classification & Clustering, Question Answering & Query Understanding, Search Indexing & Fuzzy Matching.
• Excellent written and verbal communication skills for coordinating across teams.
• Willing to learn and master new technologies and techniques.
• Knowledge and experience in statistical and data mining techniques: GLM/regression, random forest, boosting, trees, text mining, NLP, etc.
• Experience with chatbots would be a bonus but is not required.
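As a small illustration of the fuzzy matching named in the qualifications above, here is a stdlib-only Python sketch using difflib (the catalogue entries and query are hypothetical):

```python
from difflib import get_close_matches

# Hypothetical index of known entries to match noisy user queries against.
catalogue = ["data engineer", "data scientist", "nlp engineer", "bi developer"]

def fuzzy_lookup(query, choices, cutoff=0.6):
    """Return the catalogue entries closest to a possibly misspelled query,
    ranked by similarity ratio; entries below `cutoff` are excluded."""
    return get_close_matches(query.lower(), choices, n=2, cutoff=cutoff)

# A misspelled query still resolves to the intended catalogue entry.
print(fuzzy_lookup("dat scientst", catalogue))
```

Search systems typically combine this kind of edit-distance matching with an inverted index so that only a small candidate set is scored per query.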