To be considered as a candidate for a Senior Data Engineer position, a person must have a proven track record of architecting data solutions on current and advanced technical platforms. They must have leadership abilities to lead a team providing data centric solutions with best practices and modern technologies in mind. They look to build collaborative relationships across all levels of the business and the IT organization. They possess analytic and problem-solving skills and have the ability to research and provide appropriate guidance for synthesizing complex information and extract business value. Have the intellectual curiosity and ability to deliver solutions with creativity and quality. Effectively work with business and customers to obtain business value for the requested work. Able to communicate technical results to both technical and non-technical users using effective story telling techniques and visualizations. Demonstrated ability to perform high quality work with innovation both independently and collaboratively.
Similar jobs
Qualifications & Experience:
▪ 2 - 4 years overall experience in ETLs, data pipeline, Data Warehouse development and database design
▪ Software solution development using Hadoop Technologies such as MapReduce, Hive, Spark, Kafka, Yarn/Mesos etc.
▪ Expert in SQL, worked on advanced SQL for at least 2+ years
▪ Good development skills in Java, Python or other languages
▪ Experience with EMR, S3
▪ Knowledge and exposure to BI applications, e.g. Tableau, Qlikview
▪ Comfortable working in an agile environment
About DeepIntent:
DeepIntent is a marketing technology company that helps healthcare brands strengthen communication with patients and healthcare professionals by enabling highly effective and performant digital advertising campaigns. Our healthcare technology platform, MarketMatch™, connects advertisers, data providers, and publishers to operate the first unified, programmatic marketplace for healthcare marketers. The platform’s built-in identity solution matches digital IDs with clinical, behavioural, and contextual data in real-time so marketers can qualify 1.6M+ verified HCPs and 225M+ patients to find their most clinically-relevant audiences and message them on a one-to-one basis in a privacy-compliant way. Healthcare marketers use MarketMatch to plan, activate, and measure digital campaigns in ways that best suit their business, from managed service engagements to technical integration or self-service solutions. DeepIntent was founded by Memorial Sloan Kettering alumni in 2016 and acquired by Propel Media, Inc. in 2017. We proudly serve major pharmaceutical and Fortune 500 companies out of our offices in New York, Bosnia and India.
What You’ll Do:
- Establish formal data practice for the organisation.
- Build & operate scalable and robust data architectures.
- Create pipelines for the self-service introduction and usage of new data
- Implement DataOps practices
- Design, Develop, and operate Data Pipelines which support Data scientists and machine learning
- Engineers.
- Build simple, highly reliable Data storage, ingestion, and transformation solutions which are easy
- to deploy and manage.
- Collaborate with various business stakeholders, software engineers, machine learning
- engineers, and analysts.
Who You Are:
- Experience in designing, developing and operating configurable Data pipelines serving high
- volume and velocity data.
- Experience working with public clouds like GCP/AWS.
- Good understanding of software engineering, DataOps, data architecture, Agile and
- DevOps methodologies.
- Experience building Data architectures that optimize performance and cost, whether the
- components are prepackaged or homegrown
- Proficient with SQL, Java, Spring boot, Python or JVM-based language, Bash
- Experience with any of Apache open source projects such as Spark, Druid, Beam, Airflow
- etc. and big data databases like BigQuery, Clickhouse, etc
- Good communication skills with the ability to collaborate with both technical and non-technical
- people.
- Ability to Think Big, take bets and innovate, Dive Deep, Bias for Action, Hire and Develop the Best, Learn and be Curious
Required skills and experience: · Solid experience working in Big Data ETL environments with Spark and Java/Scala/Python · Strong experience with AWS cloud technologies (EC2, EMR, S3, Kinesis, etc) · Experience building monitoring/alerting frameworks with tools like Newrelic and escalations with slack/email/dashboard integrations, etc · Executive-level communication, prioritization, and team leadership skills
Hi,
We are hiring for Data Scientist for Bangalore.
Req Skills:
- NLP
- ML programming
- Spark
- Model Deployment
- Experience processing unstructured data and building NLP models
- Experience with big data tools pyspark
- Pipeline orchestration using Airflow and model deployment experience is preferred
•3+ years of experience in big data & data warehousing technologies
•Experience in processing and organizing large data sets
•Experience with big data tool sets such Airflow and Oozie
•Experience working with BigQuery, Snowflake or MPP, Kafka, Azure, GCP and AWS
•Experience developing in programming languages such as SQL, Python, Java or Scala
•Experience in pulling data from variety of databases systems like SQL Server, maria DB, Cassandra
NOSQL databases
•Experience working with retail, advertising or media data at large scale
•Experience working with data science engineering, advanced data insights development
•Strong quality proponent and thrives to impress with his/her work
•Strong problem-solving skills and ability to navigate complicated database relationships
•Good written and verbal communication skills , Demonstrated ability to work with product
management and/or business users to understand their needs.
* Formulates and recommends standards for achieving maximum performance
and efficiency of the DW ecosystem.
* Participates in the Pre-sales activities for solutions of various customer
problem-statement/situations.
* Develop business cases and ROI for the customer/clients.
* Interview stakeholders and develop BI roadmap for success given project
prioritization
* Evangelize self-service BI and visual discovery while helping to automate any
manual process at the client site.
* Work closely with the Engineering Manager to ensure prioritization of
customer deliverables.
* Champion data quality, integrity, and reliability throughout the organization by
designing and promoting best practices.
*Implementation 20%
* Help DW/DE team members with issues needing technical expertise or
complex systems and/or programming knowledge.
* Provide on-the-job training for new or less experienced team members.
* Develop a technical excellence team
Requirements
- experience designing business intelligence solutions
- experience with ETL Process, Data warehouse architecture
- experience with Azure Data services i.e., ADF, ADLS Gen 2, Azure SQL dB,
Synapse, Azure Databricks, and Power BI
- Good analytical and problem-solving skills
- Fluent in relational database concepts and flat file processing concepts
- Must be knowledgeable in software development lifecycles/methodologies
• 2+ years of experience in data engineering & strong understanding of data engineering principles using big data technologies
• Excellent programming skills in Python is mandatory
• Expertise in relational databases (MSSQL/MySQL/Postgres) and expertise in SQL. Exposure to NoSQL such as Cassandra. MongoDB will be a plus.
• Exposure to deploying ETL pipelines such as AirFlow, Docker containers & Lambda functions
• Experience in AWS loud services such as AWS CLI, Glue, Kinesis etc
• Experience using Tableau for data visualization is a plus
• Ability to demonstrate a portfolio of projects (GitHub, papers, etc.) is a plus
• Motivated, can-do attitude and desire to make a change is a must
• Excellent communication skills
- Building and operationalizing large scale enterprise data solutions and applications using one or more of AZURE data and analytics services in combination with custom solutions - Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsights, Databricks, CosmosDB, EventHub/IOTHub.
- Experience in migrating on-premise data warehouses to data platforms on AZURE cloud.
- Designing and implementing data engineering, ingestion, and transformation functions
- Experience with Azure Analysis Services
- Experience in Power BI
- Experience with third-party solutions like Attunity/Stream sets, Informatica
- Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
- Capacity Planning and Performance Tuning on Azure Stack and Spark.
Develop complex queries, pipelines and software programs to solve analytics and data mining problems
Interact with other data scientists, product managers, and engineers to understand business problems, technical requirements to deliver predictive and smart data solutions
Prototype new applications or data systems
Lead data investigations to troubleshoot data issues that arise along the data pipelines
Collaborate with different product owners to incorporate data science solutions
Maintain and improve data science platform
Must Have
BS/MS/PhD in Computer Science, Electrical Engineering or related disciplines
Strong fundamentals: data structures, algorithms, database
5+ years of software industry experience with 2+ years in analytics, data mining, and/or data warehouse
Fluency with Python
Experience developing web services using REST approaches.
Proficiency with SQL/Unix/Shell
Experience in DevOps (CI/CD, Docker, Kubernetes)
Self-driven, challenge-loving, detail oriented, teamwork spirit, excellent communication skills, ability to multi-task and manage expectations
Preferred
Industry experience with big data processing technologies such as Spark and Kafka
Experience with machine learning algorithms and/or R a plus
Experience in Java/Scala a plus
Experience with any MPP analytics engines like Vertica
Experience with data integration tools like Pentaho/SAP Analytics Cloud
- Create and maintain optimal data pipeline architecture
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Author data services using a variety of programming languages
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using Snowflake Cloud Datawarehouse as well as SQL and Azure ‘big data’ technologies
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centers and Azure regions.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Work in an Agile environment with Scrum teams.
- Ensure data quality and help in achieving data governance.
Basic Qualifications
- 3+ years of experience in a Data Engineer or Software Engineer role
- Undergraduate degree required (Graduate degree preferred) in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
- Experience using the following software/tools:
- Experience with “Snowflake Cloud Datawarehouse”
- Experience with Azure cloud services: ADLS, ADF, ADLA, AAS
- Experience with data pipeline and workflow management tools
- Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases
- Understanding of Datawarehouse (DWH) systems, and migration from DWH to data lakes/Snowflake
- Understanding of ELT and ETL patterns and when to use each. Understanding of data models and transforming data into the models
- Strong analytic skills related to working with unstructured datasets
- Build processes supporting data transformation, data structures, metadata, dependency and workload management
- Experience supporting and working with cross-functional teams in a dynamic environment.