● Able to contribute to gathering functional requirements, developing technical
specifications, and project and test planning
● Demonstrate technical expertise and solve challenging programming and design
problems
● Roughly 80% hands-on coding
● Generate technical documentation and PowerPoint presentations to communicate
architectural and design options, and educate development teams and business users
● Resolve defects/bugs during QA testing, pre-production, production, and post-release
patches
● Work cross-functionally with various Bidgely teams, including product management,
QA/QE, various product lines, and/or business units, to drive results forward
Requirements
● BS/MS in computer science or equivalent work experience
● 2-4 years’ experience designing and developing applications in Data Engineering
● Hands-on experience with big data ecosystems: Hadoop, HDFS, MapReduce, YARN,
AWS Cloud, EMR, S3, Spark, Cassandra, Kafka, ZooKeeper
● Expertise with any of the following object-oriented languages: Java/J2EE, Scala,
Python
● Strong leadership experience: Leading meetings, presenting if required
● Excellent communication skills: Demonstrated ability to explain complex technical
issues to both technical and non-technical audiences
● Expertise in the software design/architecture process
● Expertise with unit testing and Test-Driven Development (TDD); a minimal sketch
follows this list
● Experience with cloud platforms, preferably AWS
● A good understanding of, and the ability to develop, software, prototypes, and proofs
of concept (POCs) for various data engineering requirements
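As a hedged illustration of the unit-testing/TDD requirement above, here is a minimal pytest-style sketch; the dedupe_events function is hypothetical, invented purely for the example:

    # test_dedupe.py -- minimal TDD-style example (hypothetical function, not from the posting).
    # In TDD the tests are written first and fail until dedupe_events() is implemented.

    def dedupe_events(events):
        """Return events with duplicate IDs removed, keeping the first occurrence."""
        seen = set()
        result = []
        for event in events:
            if event["id"] not in seen:
                seen.add(event["id"])
                result.append(event)
        return result

    def test_dedupe_events_keeps_first_occurrence():
        events = [{"id": 1, "v": "a"}, {"id": 1, "v": "b"}, {"id": 2, "v": "c"}]
        assert dedupe_events(events) == [{"id": 1, "v": "a"}, {"id": 2, "v": "c"}]

    def test_dedupe_events_empty_input():
        assert dedupe_events([]) == []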
Carsome’s Data Department is on the lookout for a Data Scientist/Senior Data Scientist with a strong passion for building data-powered products.
The Data Science function under the Data Department is responsible for standardising methods (including code libraries and documentation), mentoring data scientists and interns, assuring the quality of outputs, and applying modeling techniques and statistics, leveraging a variety of technologies, open-source languages, and cloud computing platforms.
You will get to lead & implement projects such as price optimization/prediction, enabling iconic personalization experiences for our customers, inventory optimization, etc.
Job Description
- Identify and integrate datasets that can be leveraged through our product, and work closely with the data engineering team to develop data products.
- Execute analytical experiments methodically to help solve various problems and make a true impact across functions such as operations, finance, logistics, and marketing.
- Identify, prioritize, and design testing opportunities that will inform algorithm enhancements.
- Devise and utilize algorithms and models to mine big data stores, perform data and error analysis to improve models, and clean and validate data for uniformity and accuracy.
- Unlock insights by analyzing large amounts of complex website traffic and transactional data.
- Implement analytical models into production by collaborating with data analytics engineers.
Technical Requirements
- Expertise in model design, training, evaluation, and implementation. ML algorithm expertise: k-nearest neighbors, random forests, naive Bayes, and regression models. Experience with PyTorch, TensorFlow, Keras, deep learning, t-SNE, gradient boosting, and regression implementation. Python, PySpark, SQL, R, AWS SageMaker/Personalize, etc. (a minimal training/evaluation sketch follows this posting).
- Machine Learning / Data Science Certification
Experience & Education
- Bachelor’s in Engineering / Master’s in Data Science / Postgraduate Certificate in Data Science.
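As a small, hedged illustration of the model design/training/evaluation cycle listed under Technical Requirements, here is a minimal scikit-learn sketch; the synthetic dataset and parameters are stand-ins, not anything from the role:

    # Minimal train/evaluate sketch with scikit-learn (illustrative only).
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    # Synthetic stand-in for a real dataset (e.g., pricing or personalization features).
    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))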
Responsibilities:
- Write and maintain production-level code in Python for deploying machine learning models
- Create and maintain deployment pipelines through CI/CD tools (preferably GitLab CI)
- Implement alerts and monitoring for prediction accuracy and data drift detection (a drift-check sketch follows this posting)
- Implement automated pipelines for training and replacing models
- Work closely with the data science team to deploy new models to production
Required Qualifications:
- Degree in Computer Science, Data Science, IT or a related discipline.
- 2+ years of experience in software engineering or data engineering.
- Programming experience in Python
- Experience in data profiling, ETL development, testing and implementation
- Experience in deploying machine learning models
Good to have:
- Experience in AWS resources for ML and data engineering (SageMaker, Glue, Athena, Redshift, S3)
- Experience in deploying TensorFlow models
- Experience in deploying and managing MLflow
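As a hedged sketch of the data-drift monitoring responsibility above, a univariate drift check with a two-sample Kolmogorov-Smirnov test might look like this; the significance threshold and synthetic windows are illustrative assumptions:

    # Univariate data-drift check via a two-sample KS test (illustrative sketch).
    import numpy as np
    from scipy.stats import ks_2samp

    def drifted(reference: np.ndarray, live: np.ndarray, alpha: float = 0.01) -> bool:
        """Flag drift when the live feature distribution differs from the reference."""
        statistic, p_value = ks_2samp(reference, live)
        return p_value < alpha

    # Example: reference window from training data vs. a recent serving window.
    rng = np.random.default_rng(0)
    reference = rng.normal(0.0, 1.0, size=5000)
    live = rng.normal(0.3, 1.0, size=5000)  # shifted mean -> should be flagged
    print("drift detected:", drifted(reference, live))

In practice a check like this would run per feature on a schedule, with a flagged result feeding the alerting mentioned in the responsibilities.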
ROLE & RESPONSIBILITIES:
- Coordinate with the sales team to deliver proposals, solutions, etc. (including requests for proposals, requests for information, idea and solution generation, roadmap creation, and more)
- Coordinate with the tech team to generate appropriate project estimates and to come up with execution timelines and milestones.
- Where relevant, support the sales team by presenting solutions to prospects and clients.
- Ensure that the proposed (technical) solution and its associated architecture are suitable for development, and establish the timelines required to deliver the proposed scope.
- Build a knowledge repository for further business development.
- Prepare case studies of all projects.
- Stay updated on the latest industry and technology trends, and conduct market research.
- Conduct account specific research on key accounts.
- Create solution frameworks for the industries/domains we choose to play in.
- Provide support to sales team for any presentations and collaterals on our service offerings.
- Create and maintain a catalogue of service offerings along with necessary collateral in terms of presentations, documents etc.
QUALITIES & QUALIFICATIONS:
- Strong communication skills, written as well as oral
- Good sense of design
- Strong analytical and problem-solving capabilities
- Familiarity with digital technologies (trends, key players, and more) as well as with the software development process, methodologies, standard practices and conventions, and so on.
- MBA from a reputed institution
- Strong research skills, especially with regard to scanning financial reports, industry data trends,
and so on.
You will be responsible for designing, building, and maintaining data pipelines that handle real-world data (RWD) at Compile. You will be handling both inbound and outbound data deliveries at Compile for datasets including claims, remittances, EHR, SDOH, etc.
You will
- Work on building and maintaining data pipelines (specifically RWD).
- Build, enhance, and maintain existing pipelines in PySpark and Python, and help build analytical insights and datasets.
- Schedule and maintain pipeline jobs for RWD.
- Develop, test, and implement data solutions based on the design.
- Design and implement quality checks on existing and new data pipelines (see the sketch after this list).
- Ensure adherence to the security and compliance requirements for the products.
- Maintain relationships with various data vendors and track changes and issues across vendors and deliveries.
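As a hedged illustration of the quality checks mentioned in the list above, a minimal PySpark sketch follows; the S3 path, column name, and failure conditions are hypothetical, not from the posting:

    # Minimal PySpark data-quality check for a claims feed (hypothetical schema/path).
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("rwd-quality-check").getOrCreate()

    claims = spark.read.parquet("s3://example-bucket/claims/")  # hypothetical path

    # Reject the batch if key fields are missing or the row count collapses.
    null_claim_ids = claims.filter(F.col("claim_id").isNull()).count()
    row_count = claims.count()

    if null_claim_ids > 0 or row_count == 0:
        raise ValueError(
            f"quality check failed: rows={row_count}, null claim_ids={null_claim_ids}"
        )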
You have
- Hands-on experience with ETL processes (minimum of 5 years).
- Excellent communication skills and ability to work with multiple vendors.
- High proficiency with Spark, SQL.
- Proficiency in Data modeling, validation, quality check, and data engineering concepts.
- Experience working with big-data processing technologies such as Databricks, dbt, S3, Delta Lake, Deequ, Griffin, Snowflake, and BigQuery.
- Familiarity with version control technologies, and CI/CD systems.
- Understanding of scheduling tools like Airflow/Prefect (a minimal Airflow sketch follows this list).
- Minimum of 3 years of experience managing data warehouses.
- Familiarity with healthcare datasets is a plus.
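For the scheduling-tools line above, a daily pipeline run in Airflow might look roughly like the following minimal DAG; the DAG id, schedule, and task body are placeholder assumptions (Airflow 2.x-style API):

    # Minimal Airflow DAG sketch for a daily RWD pipeline run (placeholder task).
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def run_rwd_pipeline():
        print("running RWD pipeline...")  # stand-in for the real pipeline entry point

    with DAG(
        dag_id="rwd_daily_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        PythonOperator(task_id="run_pipeline", python_callable=run_rwd_pipeline)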
Compile embraces diversity and equal opportunity in a serious way. We are committed to building a team of people from many backgrounds, perspectives, and skills. We know the more inclusive we are, the better our work will be.
TOP 3 SKILLS
Python (Language)
Spark Framework
Spark Streaming
Docker/Jenkins/Spinnaker
AWS
Hive Queries
The candidate should be a good coder.
Preferred: Airflow
Must-have experience:
Python
Spark framework and streaming
Exposure to the machine learning lifecycle is mandatory.
Project:
This is a search-domain project. For any search activity happening on the website, this team builds the model: the data scientists create sorting/scoring models for each search. The team works mostly on the streaming side of the data, so the candidate would work extensively with Spark Streaming, and there will be a lot of machine learning work (a minimal streaming sketch follows).
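As a rough, hedged sketch of the streaming setup described above (not the team's actual code), a Spark Structured Streaming job reading search events from Kafka might look like this; the topic, schema, and scoring logic are hypothetical stand-ins, and the spark-sql-kafka package is assumed to be available:

    # Spark Structured Streaming sketch: read search events from Kafka and attach a score.
    # Topic name, schema, and scoring logic are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, IntegerType

    spark = SparkSession.builder.appName("search-scoring-stream").getOrCreate()

    schema = StructType([
        StructField("query", StringType()),
        StructField("clicks", IntegerType()),
    ])

    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")
        .option("subscribe", "search-events")
        .load()
        .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
        .select("e.*")
    )

    # Placeholder score; a real job would call the data scientists' ranking model here.
    scored = events.withColumn("score", F.col("clicks") * 1.0)

    scored.writeStream.format("console").outputMode("append").start().awaitTermination()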
INTERVIEW INFORMATION
3-4 rounds.
1st round based on data engineering batching experience.
2nd round based on data engineering streaming experience.
3rd round based on the ML lifecycle (the 3rd round can be a techno-functional round depending on previous feedback; otherwise, a 4th, functional round will be held if required).
Position: ETL Developer
Location: Mumbai
Experience Level: 4+ years
Required Skills:
* Strong scripting skills, especially in Python and Unix shell (e.g., K-shell)
* Strong relational database skills, especially with DB2/Sybase
* Ability to create high-quality, optimized stored procedures and queries
* Strong knowledge of relational database performance and tuning, such as proper use of indices, database statistics/reorgs, and de-normalization concepts.
* Familiarity with the lifecycle of a trade and the flows of data in an investment banking operation is a plus.
* Experienced in Agile development process
* Java Knowledge is a big plus but not essential
* Experience in delivery of metrics / reporting in an enterprise environment (e.g. demonstrated experience in BI tools such as Business Objects, Tableau, report design & delivery) is a plus
* Experience on ETL processes and tools such as Informatica is a plus. Real time message processing experience is a big plus.
* Good team player; Integrity & ownership
Job Description
- Solid technical skills with a proven and successful history working with data at scale and empowering organizations through data
- Big data processing frameworks: Spark, Scala, Hadoop, Hive, Kafka, EMR with Python
- Advanced, hands-on architecture and administration experience on big data platforms
We are currently looking for a Junior Data Scientist to join our growing Data Science team in Panchkula. As a Jr. Data Scientist, you will work closely with the Head of Data Science and a variety of cross-functional teams to identify opportunities to enhance the customer journey, reduce churn, improve user retention, and drive revenue.
Experience Required
- Medium to Expert level proficiency in either R or Python.
- Expert level proficiency in SQL scripting for RDBMS and NoSQL DBs (especially MongoDB)
- Experience tracking and generating insights on key metrics around user journey, user retention, churn modelling and prediction, etc.
- Medium-to-highly skilled in data structures and ML algorithms, with the ability to create efficient solutions to complex problems.
- Experience of working on an end-to-end data science pipeline: problem scoping, data gathering, EDA, modeling, insights, visualizations, monitoring and maintenance.
- Medium-to-Proficient in creating beautiful Tableau dashboards.
- Problem-solving: Ability to break the problem into small parts and apply relevant techniques to drive the required outcomes.
- Intermediate to advanced knowledge of machine learning, probability theory, statistics, and algorithms. You will be required to discuss and use various algorithms and approaches on a daily basis.
- Proficient in at least a few of the following: regression, Bayesian methods, tree-based learners, SVMs, random forests (RF), XGBoost, time-series modelling, GLMs, GLMMs, clustering, deep learning, etc. (a churn-model sketch follows this list).
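As a hedged illustration of the churn-modelling and gradient-boosting items above, a minimal scikit-learn sketch follows; the synthetic, imbalanced features are stand-ins for real user-journey data:

    # Churn-prediction sketch: gradient boosting evaluated by ROC AUC (synthetic data).
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    # Synthetic stand-in for user-journey features (sessions, recency, spend, ...),
    # with an 80/20 class imbalance to mimic a typical churn label.
    X, y = make_classification(n_samples=2000, n_features=15, weights=[0.8], random_state=7)
    X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=7)

    model = GradientBoostingClassifier(random_state=7)
    model.fit(X_train, y_train)

    val_scores = model.predict_proba(X_val)[:, 1]
    print("validation AUC:", roc_auc_score(y_val, val_scores))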
Good to Have
- Experience in one of the upcoming technologies like deep learning, recommender systems, etc.
- Experience of working in the Gaming domain
- Marketing analytics, cross-sell, up-sell, campaign analytics, fraud detection
- Experience in building and maintaining Data Warehouses in AWS would be a big plus!
Benefits
- PF and gratuity
- Working 5 days a week
- Paid leaves (CL, SL, EL, ML) and holidays
- Parties, festivals, birthday celebrations, etc
- Equitability: absence of favouritism in hiring & promotion
- Building and operationalizing large-scale enterprise data solutions and applications using one or more Azure data and analytics services in combination with custom solutions: Azure Synapse/Azure SQL DWH, Azure Data Lake, Azure Blob Storage, Spark, HDInsight, Databricks, Cosmos DB, Event Hub/IoT Hub.
- Experience in migrating on-premises data warehouses to data platforms on the Azure cloud.
- Designing and implementing data engineering, ingestion, and transformation functions (see the sketch after this list)
- Experience with Azure Analysis Services
- Experience in Power BI
- Experience with third-party solutions like Attunity/StreamSets, Informatica
- Experience with PreSales activities (Responding to RFPs, Executing Quick POCs)
- Capacity planning and performance tuning on the Azure stack and Spark.
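As a rough, hedged sketch of an ingestion-and-transformation step on the Azure stack listed above, a Databricks-style PySpark job might read raw data from ADLS Gen2 and write a curated Delta table; the storage account, container, and column names are hypothetical, and Delta Lake support is assumed:

    # PySpark sketch: ingest raw events from ADLS Gen2 and write a curated Delta table.
    # Storage account, container, and column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("azure-ingest").getOrCreate()

    raw = spark.read.json("abfss://raw@examplestorage.dfs.core.windows.net/events/")

    curated = (
        raw.filter(F.col("event_type").isNotNull())
           .withColumn("event_date", F.to_date("event_ts"))
    )

    curated.write.format("delta").mode("overwrite").partitionBy("event_date") \
        .save("abfss://curated@examplestorage.dfs.core.windows.net/events/")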