1. Understand the business problem and translate it into data services and solutions.
2. Work on cloud application designs, cloud approval plans, and the systems required to manage cloud storage.
3. Explore new technologies and learn new techniques to solve business problems.
4. Collaborate with different teams (engineering and business) to build better data solutions.
5. Regularly evaluate cloud applications, hardware, and software.
6. Respond to technical issues in a professional and timely manner.
7. Identify the top cloud architecture solutions to successfully meet the strategic needs of the company.
8. Offer guidance on infrastructure migration techniques, including bulk application transfers into the cloud.
9. Manage the team and handle delivery of 2-3 projects.
JD | Data Architect 24-Aug-2021
Is education overrated? Yes, we believe so. But there is no other way to find you, so we will look for at least a degree in computer science, computer engineering, information technology, or a relevant field, along with:
1. 4-6 years of experience in data handling
2. Hands-on experience with at least one programming language (Python, Java, Scala)
3. An understanding of SQL is a must
4. Big data (Hadoop, Hive, YARN, Sqoop)
5. MPP platforms (Spark, Presto)
6. Data-pipeline & scheduler tools (Oozie, Airflow, NiFi)
7. Streaming engines (Kafka, Storm, Spark Streaming)
8. Experience with any relational database or data warehouse
9. Experience with any ETL tool
10. Hands-on experience in pipeline design, ETL, and application development
11. Hands-on experience with cloud platforms like AWS, GCP, etc.
12. Good communication skills and strong analytical skills
13. Experience in team handling and project delivery
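To ground the pipeline and ETL skills listed above (items 6-10), here is a minimal extract-transform-load step as a dependency-free sketch. It is not tied to any specific tool from the list; the function names and sample records are invented for illustration.

```python
# Minimal ETL sketch: extract raw records, clean them, and load the result.
# All names and sample data here are illustrative, not from a real system.

def extract():
    # In practice this would read from a source system (files, Kafka, a DB).
    return [
        {"user_id": "1", "amount": "10.50"},
        {"user_id": "2", "amount": ""},        # dirty record
        {"user_id": "3", "amount": "7.25"},
    ]

def transform(records):
    # Drop records with missing amounts and cast strings to proper types.
    return [
        {"user_id": int(r["user_id"]), "amount": float(r["amount"])}
        for r in records
        if r["amount"]
    ]

def load(records, sink):
    # In practice this would write to a warehouse table; here, a list.
    sink.extend(records)
    return len(records)

warehouse = []
loaded = load(transform(extract()), warehouse)
print(loaded)  # 2 clean rows loaded
```

A scheduler such as Oozie or Airflow would then run a chain of such steps on a cadence, with each step made idempotent so failed runs can be retried safely.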
About Searce Inc
Searce is a cloud, automation & analytics led process improvement company helping futurify businesses. Searce is a premier Google Cloud partner for all products and services, and the largest cloud systems integrator for enterprises, with the largest number of enterprise Google Cloud clients in India.
Searce specializes in helping businesses move to the cloud, build on the next-generation cloud, and adopt SaaS - helping reimagine the ‘why’ & redefine ‘what’s next’ for workflows, automation, machine learning & related futuristic use cases. Searce has been recognized by Google as one of its top partners for the years 2015 and 2016.
Searce's organizational culture encourages making mistakes and questioning the status quo, which allows us to specialize in simplifying complex business processes and in using a technology-agnostic approach to create, improve, and deliver.
1+ years of proven experience in ML/AI with Python
Work with the manager through the entire analytical and machine learning model life cycle:
⮚ Define the problem statement
⮚ Build and clean datasets
⮚ Exploratory data analysis
⮚ Feature engineering
⮚ Apply ML algorithms and assess the performance
⮚ Codify for deployment
⮚ Test and troubleshoot the code
⮚ Communicate analysis to stakeholders
⮚ Proven experience with Python and SQL
⮚ Excellent programming and statistics skills
⮚ Working knowledge of tools and utilities - AWS, DevOps with Git, Selenium, Postman, Airflow, PySpark
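The model life-cycle steps listed above can be compressed into a toy, dependency-free sketch. The dataset and the deliberately trivial "model" (a mean threshold) are invented for illustration; real work would use libraries such as scikit-learn.

```python
# A compressed walk through the model life cycle: problem statement, dataset,
# exploratory pass, feature engineering, and performance assessment.

# 1. Problem statement: predict whether a session converts from its duration.
raw = [(5, 0), (40, 1), (3, 0), (55, 1), (10, 1), (2, 0)]  # (minutes, converted)

# 2-3. Build/clean the dataset and do a quick exploratory pass.
durations = [d for d, _ in raw]
mean_duration = sum(durations) / len(durations)

# 4. Feature engineering: a single boolean feature from the raw duration.
features = [(d > mean_duration, label) for d, label in raw]

# 5. "Apply" the trivial model and assess its performance (accuracy).
correct = sum(1 for feat, label in features if int(feat) == label)
accuracy = correct / len(features)
print(f"mean={mean_duration:.1f}, accuracy={accuracy:.2f}")  # mean=19.2, accuracy=0.83
```

The remaining life-cycle steps (codify for deployment, test, communicate) would wrap this logic in a versioned package, add unit tests, and report the metric to stakeholders.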
- Minimum 1 year of relevant experience in PySpark (mandatory)
- Hands-on experience in developing, testing, deploying, maintaining, and improving data integration pipelines in an AWS cloud environment is an added plus
- Ability to play a lead role and independently manage a 3-5 member PySpark development team
- EMR, Python, and PySpark are mandatory
- Knowledge of and experience working with AWS cloud technologies like Apache Spark, Glue, Kafka, Kinesis, and Lambda, alongside S3, Redshift, and RDS
- Minimum of 2 years of experience in Google BigQuery and Google Cloud Platform
- Design and develop the ETL framework using BigQuery
- Expertise in BigQuery concepts like nested queries, clustering, partitioning, etc.
- Working experience with clickstream databases and Google Analytics/Adobe Analytics
- Should be able to automate data loads from BigQuery using APIs or a scripting language
- Good experience with advanced SQL concepts
- Good experience with Adobe Launch web, mobile & e-commerce tag implementation
- Identify complex, fuzzy problems, break them down into smaller parts, and implement creative, data-driven solutions
- Responsible for defining, analyzing, and communicating key metrics and business trends to management teams
- Identify opportunities to improve conversion & user experience through data; influence product & feature roadmaps
- Must have a passion for data quality and constantly look to improve the system; drive data-driven decision making with stakeholders & drive change management
- Understand requirements to translate business and technical problems into analytics problems
- Effective storyboarding and presentation of the solution to the client and leadership
- Client engagement & management
- Ability to interface effectively with multiple levels of management and functional disciplines
- Assist in developing/coaching individuals technically as well as on soft skills, during the project and as part of the client project's training program
- 2 to 3 years of working experience in Google BigQuery & Google Cloud Platform
- Relevant experience in the Consumer Tech/CPG/Retail industries
- Bachelor's in Engineering, Computer Science, Math, Statistics, or a related discipline
- Strong problem-solving and web analytics skills; acute attention to detail
- Experience in analyzing large, complex, multi-dimensional data sets
- Experience in one or more roles in an online eCommerce or online support environment
- Expertise in Google BigQuery & Google Cloud Platform
- Experience in advanced SQL and a scripting language (Python/R)
- Hands-on experience with BI tools (Tableau, Power BI)
- Working experience with and understanding of Adobe Analytics or Google Analytics
- Experience in creating and debugging website & app tracking (Omnibug, dataslayer, GA Debugger, etc.)
- Excellent analytical thinking, analysis, and problem-solving skills
- Knowledge of other GCP services is a plus
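The BigQuery partitioning and clustering concepts called out in the requirements above typically show up as table DDL. Below is a sketch that builds such a statement as a string; the project, dataset, table, and column names are hypothetical, while the `PARTITION BY` / `CLUSTER BY` clauses follow BigQuery's standard-SQL DDL syntax.

```python
# Sketch of a partitioned + clustered BigQuery table definition.
# Table and column names are made up; the clause syntax is BigQuery's.

def clickstream_ddl(project="my-project", dataset="analytics"):
    return f"""
    CREATE TABLE `{project}.{dataset}.page_views`
    (
      event_ts   TIMESTAMP,
      user_id    STRING,
      page_path  STRING
    )
    PARTITION BY DATE(event_ts)   -- prunes scans to only the dates queried
    CLUSTER BY user_id, page_path -- co-locates rows for selective filters
    """

ddl = clickstream_ddl()
print("PARTITION BY DATE(event_ts)" in ddl)  # True
```

Partitioning limits how much data a query scans (and therefore costs), while clustering sorts data within each partition so filters on the clustered columns read fewer blocks.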
Are you passionate about handling large & complex data problems, do you want to make an impact, and do you have the desire to work on ground-breaking big data technologies? Then we are looking for you.
At Amagi, great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. Would you like to work in a fast-paced environment where your technical abilities will be challenged on a day-to-day basis? If so, Amagi's Data Engineering and Business Intelligence team is looking for passionate, detail-oriented, technically savvy, energetic team members who like to think outside the box.
Amagi’s Data warehouse team deals with petabytes of data catering to a wide variety of real-time, near real-time and batch analytical solutions. These solutions are an integral part of business functions such as Sales/Revenue, Operations, Finance, Marketing and Engineering, enabling critical business decisions. Designing, developing, scaling and running these big data technologies using native technologies of AWS and GCP are a core part of our daily job.
- Experience in building highly cost-optimised data analytics solutions
- Experience in designing and building dimensional data models to improve the accessibility, efficiency, and quality of data
- Hands-on experience in building high-quality ETL applications, data pipelines, and analytics solutions while ensuring data privacy and regulatory compliance
- Experience in working with AWS or GCP
- Experience with relational and NoSQL databases
- Experience in full-stack web development (preferably Python)
- Expertise with data visualisation systems such as Tableau and QuickSight
- Proficiency in writing advanced SQL queries, with expertise in performance tuning over large data volumes
- Familiarity with ML/AI technologies is a plus
- Demonstrated strong understanding of development processes and agile methodologies
- Strong analytical and communication skills; should be self-driven, highly motivated, and able to learn quickly
Data Analytics is at the core of our work, and you will have the opportunity to:
- Design data-warehousing solutions on Amazon S3 with Athena, Redshift, GCP Bigtable, etc.
- Lead quick prototypes by integrating data from multiple sources
- Do advanced business analytics through ad-hoc SQL queries
- Work on Sales and Finance reporting solutions using Tableau, HTML5, and React applications
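The dimensional data models mentioned in the requirements above usually take the form of a star schema: a central fact table keyed to small dimension tables. Here is a toy in-memory version; the schema and figures are invented for illustration, not Amagi's actual model.

```python
# A toy star schema: one fact table keyed to two dimension tables, plus a
# typical rollup query. All data here is invented for illustration.

dim_date = {1: {"date": "2024-01-01", "quarter": "Q1"},
            2: {"date": "2024-04-01", "quarter": "Q2"}}
dim_product = {10: {"name": "Basic"}, 20: {"name": "Pro"}}

fact_sales = [  # (date_key, product_key, revenue)
    (1, 10, 100.0), (1, 20, 250.0), (2, 20, 300.0),
]

# Rollup: total revenue by quarter, resolved through the date dimension.
revenue_by_quarter = {}
for date_key, _, revenue in fact_sales:
    q = dim_date[date_key]["quarter"]
    revenue_by_quarter[q] = revenue_by_quarter.get(q, 0.0) + revenue

print(revenue_by_quarter)  # {'Q1': 350.0, 'Q2': 300.0}
```

In a warehouse the same shape lets BI tools join a narrow fact table to descriptive dimensions, keeping queries simple and scans cheap.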
We build amazing experiences and create depth in knowledge for our internal teams and our leadership. Our team is a friendly bunch of people that help each other grow and have a passion for technology, R&D, modern tools and data science.
Our work relies on a deep understanding of the company's needs and an ability to go through vast amounts of internal data such as sales, KPIs, forecasts, inventory, etc. Key expectations of this role include data analytics, building data lakes, and end-to-end reporting solutions. If you have a passion for cost-optimised analytics and data engineering and are eager to learn advanced data analytics at a large scale, this might just be the job for you.
Education & Experience
A bachelor's/master's degree in Computer Science with 5 to 7 years of experience; previous experience in data engineering is a plus.
The role will empower healthcare payers, providers, and members to quickly process medical data to make informed decisions and reduce healthcare costs. You will be focusing on research, development, strategy, operations, people management, and being a thought leader for team members based out of India. You should have professional healthcare experience using both structured and unstructured data to build applications. These applications include, but are not limited to, machine learning, artificial intelligence, optical character recognition, and natural language processing, integrating these processes into the overall AI pipeline to mine healthcare and medical information with high recall and other relevant metrics. The results will be used both for real-time operational processes, with automated and human-based decision making, and to contribute to reducing healthcare administrative costs. We work with the offerings of all major cloud and big data vendors (Azure, AWS, Google, IBM, etc.) to achieve our goals in healthcare.
The Director, Data Science will have the opportunity to build a team and shape team culture and operating norms, given the fast-paced nature of a new, high-growth organization.
• Strong communication and presentation skills to convey progress to a diverse group of stakeholders
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real-time streaming applications, DevOps, and product delivery
• Experience building stakeholder trust and confidence in deployed models, especially via attention to algorithmic bias, interpretable machine learning, data integrity, data quality, reproducible research, and reliable engineering (24x7x365 product availability and scalability)
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, and DevOps support
• Provide mentoring and career development guidance to data scientists and machine learning engineers
• Meet regularly with project team members on individual needs related to project/product deliverables
• Provide training and guidance for team members when required
• Provide performance feedback when required by leadership
The Experience You’ll Need (Required):
• MS/M.Tech degree or PhD in Computer Science, Mathematics, Physics or related STEM fields
• Significant healthcare data experience, including but not limited to the use of claims data
• Delivered multiple data science and machine learning projects over 8+ years, with values exceeding $10 million, on platforms covering more than 10 million member lives
• 9+ years of industry experience in data science, machine learning, and artificial intelligence
• Strong expertise in data science, data engineering, software engineering, cloud vendors, big data technologies, real-time streaming applications, DevOps, and product delivery
• Knows how to solve and launch real artificial intelligence and data science problems and products, along with managing and coordinating business process change and IT/cloud operations, while meeting production-level code standards
• Ownership of key workflows in the data science life cycle, such as data acquisition, data quality, and results
• Experience building stakeholder trust and confidence in deployed models, especially via attention to algorithmic bias, interpretable machine learning, data integrity, data quality, reproducible research, and reliable engineering (24x7x365 product availability and scalability)
• Expertise in healthcare privacy, federated learning, continuous integration and deployment, and DevOps support
• 3+ years of experience directly managing five (5) or more senior-level data scientists and machine learning engineers with advanced degrees, including making direct staffing decisions
• Very strong understanding of mathematical concepts, including but not limited to linear algebra, advanced calculus, partial differential equations, and statistics (including Bayesian approaches) at master's level and above
• 6+ years of programming experience in C++, Java, or Scala and in data science programming languages like Python and R, including a strong understanding of concepts like data structures, algorithms, compression techniques, high-performance computing, distributed computing, and computer architecture
• Very strong understanding of and experience with traditional data science approaches like sampling techniques, feature engineering, classification, regression, SVMs, trees, and model evaluation, with several projects over 3+ years
• Very strong understanding of and experience in natural language processing, reasoning and understanding, information retrieval, text mining, and search, with 3+ years of hands-on experience
• Experience developing and deploying several products in production, with experience in two or more of the following languages: Python, C++, Java, Scala
• Strong Unix/Linux background and experience with at least one of the following cloud vendors: AWS, Azure, or Google
• Three plus (3+) years of hands-on experience with the MapR/Cloudera/Databricks big data platforms, with Spark, Hive, Kafka, etc.
• Three plus (3+) years of experience with high-performance computing, e.g. Dask, CUDA distributed GPU, TPU, etc.
• Presented at major conferences and/or published materials
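The "Bayesian approaches" named in the statistics requirement above reduce, at their core, to Bayes' rule for updating a belief about model parameters $\theta$ after observing data $D$:

```latex
P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)},
\qquad
P(D) = \int P(D \mid \theta)\, P(\theta)\, d\theta
```

The prior $P(\theta)$ encodes what is believed before seeing data, the likelihood $P(D \mid \theta)$ measures how well each parameter setting explains the data, and the posterior $P(\theta \mid D)$ is the updated belief used for inference and prediction.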
Skills: Hadoop, HDFS, Kafka, Spark, Hive
Overall experience: 8 to 12 years
Relevant big data experience: 3+ years within the above
Salary: up to 20 LPA
Job location: Chennai / Bangalore
Notice period: immediate joiner / 15-to-20 days max
The Responsibilities of The Senior Data Engineer Are:
- Requirements gathering and assessment
- Break down complexity and translate requirements into specification artifacts and storyboards to build towards, using a test-driven approach
- Engineer scalable data pipelines using big data technologies including but not limited to Hadoop, HDFS, Kafka, HBase, and Elastic
- Implement the pipelines using execution frameworks including but not limited to MapReduce, Spark, and Hive, using Java/Scala/Python for application design
- Mentor juniors in a dynamic team setting
- Manage stakeholders with proactive communication, upholding TheDataTeam's brand and values
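The test-driven approach named in the responsibilities above can be sketched on a small pipeline transform: write the expected behaviour as assertions first, then implement against them. The sessionization example and its event format are hypothetical.

```python
# Test-driven sketch: the behaviour of a small pipeline transform is pinned
# down by assertions that were written before the implementation.

def sessionize(events, gap_seconds=1800):
    """Group a user's sorted event timestamps into sessions split by gaps."""
    sessions = []
    for ts in events:
        if sessions and ts - sessions[-1][-1] <= gap_seconds:
            sessions[-1].append(ts)   # continue the current session
        else:
            sessions.append([ts])     # gap exceeded: start a new session
    return sessions

# Specification-as-tests:
assert sessionize([]) == []
assert sessionize([0, 100, 200]) == [[0, 100, 200]]      # one session
assert sessionize([0, 100, 5000]) == [[0, 100], [5000]]  # gap splits it
print("all specs pass")
```

In a real pipeline the same transform would run inside a Spark or MapReduce job; keeping its core logic a pure function is what makes it testable this way.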
A Candidate Must Have the Following Skills:
- Strong problem-solving ability
- Excellent software design and implementation ability
- Exposure and commitment to agile methodologies
- Detail oriented with willingness to proactively own software tasks as well as management tasks, and see them to completion with minimal guidance
- Minimum 8 years of experience
- Should have experience in the full life cycle of at least one big data application
- Strong understanding of various storage formats (ORC/Parquet/Avro)
- Should have hands-on experience with one of the Hadoop distributions (Hortonworks/Cloudera/MapR)
- Experience in at least one cloud environment (GCP/AWS/Azure)
- Should be well versed with at least one database (MySQL/Oracle/MongoDB/Postgres)
- Bachelor's in Computer Science, and preferably a Master's as well
- Should have good code review and debugging skills
Additional skills (Good to have):
- Experience in containerization (Docker/Heroku)
- Exposure to microservices
- Exposure to DevOps practices
- Experience in performance tuning of big data applications
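The storage formats in the must-have list above split into row-oriented (Avro) and column-oriented (ORC, Parquet) designs. The distinction can be illustrated with plain Python structures; the records here are invented.

```python
# Row-oriented vs column-oriented layout, the core difference behind
# Avro (rows) versus ORC/Parquet (columns). Sample records are invented.

rows = [  # row layout: one whole record after another (Avro-style)
    {"id": 1, "country": "IN", "amount": 10},
    {"id": 2, "country": "IN", "amount": 20},
    {"id": 3, "country": "US", "amount": 30},
]

# Column layout (ORC/Parquet-style): each field stored contiguously, which
# lets a scan read only the columns it needs and compress repeated values.
columns = {key: [r[key] for r in rows] for key in rows[0]}

# Aggregating one column touches a single contiguous list, not every record.
total = sum(columns["amount"])
print(columns["country"], total)  # ['IN', 'IN', 'US'] 60
```

This is why analytical engines prefer ORC/Parquet (column pruning, run-length compression of values like the repeated `'IN'`), while row formats like Avro suit record-at-a-time write paths.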
To be considered for a Senior Data Engineer position, a candidate must have a proven track record of architecting data solutions on current and advanced technical platforms. They must have the leadership ability to lead a team providing data-centric solutions with best practices and modern technologies in mind. They build collaborative relationships across all levels of the business and the IT organization. They possess analytical and problem-solving skills, along with the ability to research and synthesize complex information and extract business value from it. They have the intellectual curiosity and ability to deliver solutions with creativity and quality, work effectively with the business and customers to obtain business value from the requested work, and can communicate technical results to both technical and non-technical users using effective storytelling techniques and visualizations. They have demonstrated the ability to perform high-quality work, with innovation, both independently and collaboratively.
The candidate will be deployed in a financial captive organization in Pune (Kharadi).
Below are the job details:
Experience 10 to 18 years
Mandatory skills:
- Data migration
- Data flow
The ideal candidate for this role will have the below experience and qualifications:
- Experience building a range of services with a cloud service provider (ideally GCP)
- Hands-on design and development on Google Cloud Platform (GCP) across a wide range of GCP services, including hands-on experience with GCP storage & database technologies
- Hands-on experience in architecting, designing, or implementing solutions on GCP, K8s, and other Google technologies, including security and compliance, e.g. IAM and cloud compliance/auditing/monitoring tools
- Desired skills within the GCP stack: Cloud Run, GKE, serverless, Cloud Functions, Vision API, DLP, Dataflow, Data Fusion
- Prior experience migrating on-prem applications to cloud environments; knowledge of and hands-on experience with Stackdriver, Pub/Sub, VPCs, subnets, route tables, load balancers, and firewalls, both on premise and in GCP
- Integrate, configure, deploy, and manage centrally provided common cloud services (e.g. IAM, networking, logging, operating systems, containers)
- Manage SDN in GCP; knowledge and experience of DevOps technologies for continuous integration & delivery in GCP using Jenkins
- Hands-on experience with Terraform, Kubernetes, Docker, and Stackdriver
- Knowledge of or experience in DevOps tooling such as Jenkins, Git, Ansible, Splunk, Jira or Confluence, AppD, Docker, and Kubernetes
- Act as a consultant and subject matter expert for internal teams to resolve technical deployment obstacles and improve the product vision; ensure compliance with centrally defined security policies
- Financial services experience is preferred
- Ability to learn new technologies and rapidly prototype newer concepts
- Top-down thinker, excellent communicator, and great problem solver
Experience: 10 to 18 years
The candidate must have experience in the below:
- GCP Data Platform
- Data processing: Dataflow, Dataprep, Data Fusion
- Data storage: BigQuery, Cloud SQL
- Pub/Sub, GCS buckets
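The Pub/Sub item above refers to Google Cloud Pub/Sub; the underlying publish-subscribe pattern can be sketched with nothing but in-process Python. The broker, topic, and message names below are invented, and the real service adds durability, acknowledgements, and fan-out across machines.

```python
# A minimal in-process publish-subscribe sketch of the pattern behind
# Cloud Pub/Sub: publishers push to a topic, every subscriber gets a copy.
from collections import defaultdict

class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to every subscriber of the topic, in order.
        for callback in self.subscribers[topic]:
            callback(message)

broker = Broker()
received = []
broker.subscribe("page-views", received.append)
broker.subscribe("page-views", lambda m: received.append(m.upper()))
broker.publish("page-views", "event-1")
print(received)  # ['event-1', 'EVENT-1']
```

The key property, which carries over to the managed service, is that publishers and subscribers never reference each other directly; they are decoupled by the topic.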