Job description
Role: Lead Architect (Spark, Scala, Big Data/Hadoop, Java)
Primary Location: India (Pune, Hyderabad)
Experience : 7 - 12 Years
Management Level: 7
Joining Time: Immediate Joiners are preferred
- Attend requirements gathering workshops, estimation discussions, design meetings and status review meetings
- Experience in solution design and solution architecture for data engineering, building and implementing Big Data projects on-premises and in the cloud
- Align architecture with business requirements and stabilize the developed solution
- Ability to build prototypes to demonstrate the technical feasibility of your vision
- Professional experience facilitating and leading solution design, architecture and delivery planning activities for data intensive and high throughput platforms and applications
- Benchmark systems, analyse system bottlenecks and propose solutions to eliminate them
- Help programmers and project managers with the design, planning and governance of project implementation
- Develop, construct, test and maintain architectures and run Sprints for development and rollout of functionalities
- Data analysis and code development experience, ideally in Big Data technologies: Spark, Hive, Hadoop, Java, Python, PySpark
- Execute projects of various types, i.e. design, development, implementation and migration of functional analytics models/business logic, across architecture approaches
- Work closely with Business Analysts to understand the core business problems and deliver efficient IT solutions for the product
- Deploy sophisticated analytics code on any cloud platform
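The benchmarking responsibility above usually starts with a reproducible micro-benchmark before touching cluster configuration. A minimal stdlib sketch of the pattern, assuming two candidate implementations of the same aggregation (function and data names are illustrative, not from any specific project):

```python
import timeit

# Two hypothetical implementations of the same aggregation, used only to
# frame a bottleneck comparison; real workloads would be profiled the same way.
def sum_with_loop(rows):
    total = 0
    for r in rows:
        total += r
    return total

def sum_builtin(rows):
    return sum(rows)

rows = list(range(100_000))

# timeit runs each callable repeatedly and returns total elapsed seconds.
loop_time = timeit.timeit(lambda: sum_with_loop(rows), number=50)
builtin_time = timeit.timeit(lambda: sum_builtin(rows), number=50)

print(f"loop:    {loop_time:.4f}s")
print(f"builtin: {builtin_time:.4f}s")
```

The same habit, measure first, then change, carries over to Spark jobs, where the Spark UI and stage metrics play the role `timeit` plays here.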
Perks and Benefits we Provide!
- Work with highly technical, passionate, mission-driven people
- Subsidized Meals & Snacks
- Flexible Schedule
- Approachable leadership
- Access to various learning tools and programs
- Pet Friendly
- Certification Reimbursement Policy
- Check out more about us on our website below!
www.datametica.com
Hi,
We are hiring for Data Scientist for Bangalore.
Req Skills:
- NLP
- ML programming
- Spark
- Model Deployment
- Experience processing unstructured data and building NLP models
- Experience with big data tools such as PySpark
- Pipeline orchestration using Airflow and model deployment experience is preferred
About Kloud9:
Kloud9 exists with the sole purpose of providing cloud expertise to the retail industry. Our team of cloud architects, engineers and developers help retailers launch a successful cloud initiative so you can quickly realise the benefits of cloud technology. Our standardised, proven cloud adoption methodologies reduce the cloud adoption time and effort so you can directly benefit from lower migration costs.
Kloud9 was founded with the vision of bridging the gap between E-commerce and cloud. E-commerce infrastructure in any industry is limiting and poses a huge challenge in terms of the money spent on physical data infrastructure.
At Kloud9, we know migrating to the cloud is the single most significant technology shift your company faces today. We are your trusted advisors in transformation and are determined to build a deep partnership along the way. Our cloud and retail experts will ease your transition to the cloud.
Our sole focus is to provide cloud expertise to the retail industry, giving our clients the empowerment that will take their business to the next level. Our team of proficient architects, engineers and developers has been designing, building and implementing solutions for retailers for an average of more than 20 years.
We are a cloud vendor that is both platform and technology independent. Our vendor independence not only provides us with a unique perspective on the cloud market but also ensures that we deliver the available cloud solutions that best meet our clients' requirements.
Responsibilities:
● Studying, transforming, and converting data science prototypes
● Deploying models to production
● Training and retraining models as needed
● Analyzing the ML algorithms that could be used to solve a given problem and ranking them by their respective scores
● Analyzing the errors of the model and designing strategies to overcome them
● Identifying differences in data distribution that could affect model performance in real-world situations
● Performing statistical analysis and using results to improve models
● Supervising the data acquisition process if more data is needed
● Defining data augmentation pipelines
● Defining the pre-processing or feature engineering to be done on a given dataset
● Extending and enriching existing ML frameworks and libraries
● Understanding when the findings can be applied to business decisions
● Documenting machine learning processes
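The prototype-to-production handoff described in these responsibilities can be sketched with a toy model. A minimal stdlib example, using a hand-rolled least-squares line fit as a stand-in for a real data science prototype and `pickle` as a stand-in for a real model store (all data and names are illustrative):

```python
import pickle
import statistics

# Toy "prototype": a least-squares line fit on a tiny dataset.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.9]

mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
den = sum((x - mean_x) ** 2 for x in xs)
slope = num / den
intercept = mean_y - slope * mean_x
model = {"slope": slope, "intercept": intercept}

# "Deployment": serialize the trained artifact so a serving process,
# which never sees the training data, can load it and predict.
blob = pickle.dumps(model)

loaded = pickle.loads(blob)
def predict(x):
    return loaded["slope"] * x + loaded["intercept"]

print(round(predict(6.0), 2))  # → 11.98
```

In practice the serialized artifact would go to object storage or a model registry, and retraining (another responsibility above) is just re-running the fit on fresh data and publishing a new artifact.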
Basic requirements:
● 4+ years of IT experience, of which at least 2 years of relevant experience, primarily in converting data science prototypes and deploying models to production
● Proficiency with Python and machine learning libraries such as scikit-learn, matplotlib, seaborn and pandas
● Knowledge of Big Data frameworks like Hadoop, Spark, Pig, Hive, Flume, etc
● Experience in working with ML frameworks like TensorFlow, Keras, OpenCV
● Strong written and verbal communications
● Excellent interpersonal and collaboration skills.
● Expertise in visualizing and manipulating big datasets
● Familiarity with Linux
● Ability to select hardware to run an ML model with the required latency
● Robust data modelling and data architecture skills.
● Advanced degree in Computer Science/Math/Statistics or a related discipline.
● Advanced Math and Statistics skills (linear algebra, calculus, Bayesian statistics, mean, median, variance, etc.)
Nice to have
● Familiarity with writing Java and R code.
● Exploring and visualizing data to gain an understanding of it, then identifying differences in data distribution that could affect performance when deploying the model in the real world
● Verifying data quality, and/or ensuring it via data cleaning
● Supervising the data acquisition process if more data is needed
● Finding available datasets online that could be used for training
Why Explore a Career at Kloud9:
With job opportunities in prime locations in the US, London, Poland and Bengaluru, we help build your career path in the cutting-edge technologies of AI, Machine Learning and Data Science. Be part of an inclusive and diverse workforce that's changing the face of retail technology with its creativity and innovative solutions. Our vested interest in our employees translates into delivering the best products and solutions to our customers.
Senior Data Engineer
Responsibilities:
● Clean, prepare and optimize data at scale for ingestion and consumption by machine learning models
● Drive the implementation of new data management projects and re-structure of the current data architecture
● Implement complex automated workflows and routines using workflow scheduling tools
● Build continuous integration, test-driven development and production deployment frameworks
● Drive collaborative reviews of design, code, test plans and dataset implementation performed by other data engineers in support of maintaining data engineering standards
● Anticipate, identify and solve issues concerning data management to improve data quality
● Design and build reusable components, frameworks and libraries at scale to support machine learning products
● Design and implement product features in collaboration with business and Technology stakeholders
● Analyze and profile data for the purpose of designing scalable solutions
● Troubleshoot complex data issues and perform root cause analysis to proactively resolve product and operational issues
● Mentor and develop other data engineers in adopting best practices
● Ability to influence and communicate effectively, both verbally and in writing, with team members and business stakeholders
Qualifications:
● 8+ years of experience developing scalable Big Data applications or solutions on distributed platforms
● Experience with Google Cloud Platform (GCP); experience with other cloud platforms is good to have
● Experience working with data warehousing tools, including DynamoDB, SQL, and Snowflake
● Experience architecting data products on streaming, serverless and microservices architectures and platforms
● Experience with Spark (Scala/Python/Java) and Kafka
● Work experience with Databricks (Data Engineering and Delta Lake components)
● Experience working with Big Data platforms, including Dataproc, Databricks, etc.
● Experience working with distributed technology tools, including Spark, Presto, Databricks, Airflow
● Working knowledge of Data warehousing, Data modeling
● Experience working in Agile and Scrum development process
● Bachelor's degree in Computer Science, Information Systems, Business, or other relevant subject area
Role: Senior Data Engineer
Total No. of Years: 8+ years of relevant experience
To be onboarded by: Immediate
Notice Period:
| Skills | Mandatory / Desirable | Min years (Project Exp) | Max years (Project Exp) |
| --- | --- | --- | --- |
| GCP Exposure | Mandatory | 3 | 7 |
| BigQuery, Dataflow, Dataproc, AI Building Blocks, Looker, Cloud Data Fusion, Dataprep, Spark and PySpark | Mandatory | 5 | 9 |
| Relational SQL | Mandatory | 4 | 8 |
| Shell scripting language | Mandatory | 4 | 8 |
| Python/Scala language | Mandatory | 4 | 8 |
| Airflow/Kubeflow workflow scheduling tool | Mandatory | 3 | 7 |
| Kubernetes | Desirable | 1 | 6 |
| Scala | Mandatory | 2 | 6 |
| Databricks | Desirable | 1 | 6 |
| Google Cloud Functions | Mandatory | 2 | 6 |
| GitHub source control tool | Mandatory | 4 | 8 |
| Machine Learning | Desirable | 1 | 6 |
| Deep Learning | Desirable | 1 | 6 |
| Data structures and algorithms | Mandatory | 4 | 8 |
Function: Sr. DB Developer
Location: India (Gurgaon/Tamil Nadu)
>> THE INDIVIDUAL
- Have a strong background in data platform creation and management.
- Possess in-depth knowledge of Data Management, Data Modelling, Ingestion - Able to develop data models and ingestion frameworks based on client requirements and advise on system optimization.
- Hands-on experience in SQL database (PostgreSQL) and No-SQL database (MongoDB)
- Hands-on experience in database performance tuning
- Good to have: knowledge of database setup on cluster nodes
- Should be well versed in data security aspects and data governance frameworks
- Hands-on experience in Spark, Airflow, ELK.
- Good to have: knowledge of a data cleansing tool such as Apache Griffin
- Preferably involved during project implementation, so as to have background on the business as well as the technical requirements
- Strong analytical and problem-solving skills; exposure to data analytics and knowledge of advanced analytical tools will be an advantage
- Strong written and verbal communication skills (presentation skills).
- Certifications in the above technologies are preferred.
>> Qualification
- B.Tech/B.E./MCA/M.Tech from a reputed institute.
More than 4 years of experience in Data Management, Data Modelling and Ingestion; 8-10 years of total experience.
You will work on Data Warehouse and Analytics solutions that aggregate data across diverse sources and data types, including text, video and audio through to live streams and IoT, in an agile project delivery environment with a focus on DataOps and Data Observability. You will work with Azure SQL Databases, Synapse Analytics, Azure Data Factory, Azure Data Lake Gen2, Azure Databricks, Azure Machine Learning, Azure Service Bus, Azure Serverless (LogicApps, FunctionApps), Azure Data Catalogue and Purview, among other tools, gaining opportunities to learn some of the most advanced and innovative techniques in the cloud data space.
You will be building Power BI based analytics solutions to provide actionable insights into customer data, and to measure operational efficiencies and other key business performance metrics.
You will be involved in the development, build, deployment, and testing of customer solutions, with responsibility for the design, implementation and documentation of the technical aspects, including integration, to ensure the solution meets customer requirements. You will be working closely with fellow architects, engineers, analysts, team leads and project managers to plan, build and roll out data-driven solutions.
Expertise:
Proven expertise in developing data solutions with Azure SQL Server and Azure SQL Data Warehouse (now Synapse Analytics).
Demonstrated expertise in data modelling and data warehouse methodologies and best practices.
Ability to write efficient data pipelines for ETL using Azure Data Factory or equivalent tools.
Integration of data feeds utilising both structured (e.g. XML/JSON) and flat (e.g. CSV, TXT, XLSX) schemas across a wide range of electronic delivery mechanisms (API/SFTP/etc.).
Azure DevOps knowledge is essential for CI/CD of data ingestion pipelines and integrations.
Experience with object-oriented/object function scripting languages such as Python, Java, JavaScript, C#, Scala, etc. is required.
Expertise in creating technical and architecture documentation (e.g. HLD/LLD) is a must.
Proven ability to rapidly analyse and design solution architecture in client proposals is an added advantage.
Expertise with big data tools (Hadoop, Spark, Kafka, NoSQL databases, stream-processing systems) is a plus.
Essential Experience:
5 or more years of hands-on experience in a data architect role, covering the development of ingestion, integration, data auditing, reporting, and testing with the Azure SQL tech stack.
Full data and analytics project lifecycle experience (including costing and cost management of data solutions) in an Azure PaaS environment is essential.
Microsoft Azure and Data certifications, at least at fundamentals level, are a must.
Experience using agile development methodologies, version control systems and repositories is a must.
A good, applied understanding of the end-to-end data process development life cycle.
A good working knowledge of data warehouse methodology using Azure SQL.
A good working knowledge of the Azure platform, its components, and the ability to leverage its resources to implement solutions is a must.
Experience working in the Public sector, or in an organisation servicing the Public sector, is a must.
Ability to work to demanding deadlines, keep momentum and deal with conflicting priorities in an environment undergoing a programme of transformational change.
The ability to contribute and adhere to standards, have excellent attention to detail and be strongly driven by quality.
Desirables:
Experience with AWS or Google Cloud platforms will be an added advantage.
Experience with Azure ML services will be an added advantage.
Personal Attributes:
Articulate and clear in communications to mixed audiences: in writing, through presentations and one-to-one.
Ability to present highly technical concepts and ideas in business-friendly language.
Ability to effectively prioritise and execute tasks in a high-pressure environment.
Calm and adaptable in the face of ambiguity and in a fast-paced, quick-changing environment.
Extensive experience working in a team-oriented, collaborative environment as well as working independently.
Comfortable with the multi-project, multi-tasking consulting Data Architect lifestyle.
Excellent interpersonal skills, working with teams and building trust with clients.
Ability to support and work with cross-functional teams in a dynamic environment.
A passion for achieving business transformation; the ability to energise and excite those you work with.
Initiative; the ability to work flexibly in a team, working comfortably without direct supervision.
We’re hiring a talented Data Engineer and Big Data enthusiast to work on our platform to help ensure that our data quality is flawless. As a company, we have millions of new data points every day that come into our system. You will be working with a passionate team of engineers to solve challenging problems and ensure that we can deliver the best data to our customers, on time. You will be using the latest cloud data warehouse technology to build robust and reliable data pipelines. Duties/Responsibilities Include:
Requirements:
Exceptional candidates will have:
We are an early stage start-up, building new fintech products for small businesses. Founders are IIT-IIM alumni, with prior experience across management consulting, venture capital and fintech startups. We are driven by the vision to empower small business owners with technology and dramatically improve their access to financial services. To start with, we are building a simple, yet powerful solution to address a deep pain point for these owners: cash flow management. Over time, we will also add digital banking and 1-click financing to our suite of offerings.
We have developed an MVP which is being tested in the market. We have closed our seed funding from marquee global investors and are now actively building a world class tech team. We are a young, passionate team with a strong grip on this space and are looking to on-board enthusiastic, entrepreneurial individuals to partner with us in this exciting journey. We offer a high degree of autonomy, a collaborative fast-paced work environment and most importantly, a chance to create unparalleled impact using technology.
Reach out if you want to get in on the ground floor of something which can turbocharge SME banking in India!
Technology stack at Velocity comprises a wide variety of cutting-edge technologies like NodeJS, Ruby on Rails, Reactive Programming, Kubernetes, AWS, Python, ReactJS, Redux (Saga), Redis, Lambda, etc.
Key Responsibilities
- Responsible for building data and analytical engineering pipelines with standard ELT patterns, implementing data compaction pipelines, data modelling and overseeing overall data quality
- Work with the Office of the CTO as an active member of our architecture guild
- Writing pipelines to consume the data from multiple sources
- Writing a data transformation layer using DBT to transform millions of rows of data for the data warehouse
- Implement data warehouse entities with common, re-usable data model designs, with automation and data quality capabilities
- Identify downstream implications of data loads/migration (e.g., data quality, regulatory)
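The ELT pattern named in the responsibilities above, land raw data first and then transform it inside the warehouse with SQL (the same shape a DBT model encodes), can be sketched with stdlib SQLite standing in for the warehouse; table and column names are illustrative:

```python
import sqlite3

# In-memory SQLite plays the role of the warehouse for this sketch.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# "EL": land raw events as-is in a staging table, no transformation yet.
cur.execute("CREATE TABLE raw_events (user_id INTEGER, amount REAL)")
cur.executemany(
    "INSERT INTO raw_events VALUES (?, ?)",
    [(1, 10.0), (1, 5.5), (2, 7.25)],
)

# "T": transform in-warehouse with SQL; a DBT model is essentially this
# SELECT, with materialization and dependencies managed for you.
cur.execute("""
    CREATE TABLE user_totals AS
    SELECT user_id, SUM(amount) AS total_amount
    FROM raw_events
    GROUP BY user_id
""")

print(cur.execute(
    "SELECT user_id, total_amount FROM user_totals ORDER BY user_id"
).fetchall())  # → [(1, 15.5), (2, 7.25)]
```

An orchestrator such as Airflow would schedule the load step and then trigger the transformation run, which is the pipeline shape these responsibilities describe.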
What To Bring
- 3+ years of software development experience; startup experience is a plus
- Past experience working with Airflow and DBT is preferred
- 2+ years of experience working in any backend programming language
- Strong first-hand experience with data pipelines and relational databases such as Oracle, Postgres, SQL Server or MySQL
- Experience with DevOps tools (GitHub, Travis CI, and JIRA) and methodologies (Lean, Agile, Scrum, Test Driven Development)
- Experienced with the formulation of ideas, building proofs-of-concept (POC) and converting them to production-ready projects
- Experience building and deploying applications on on-premises infrastructure and on AWS or Google Cloud
- Basic understanding of Kubernetes and Docker is a must
- Experience in data processing (ETL, ELT) and/or cloud-based platforms
- Working proficiency and communication skills in verbal and written English
This will include:
- Scorecards
- Strategies
- MIS
The verticals included are:
- Risk
- Marketing
- Product