Role : Talend developer
Location : Coimbatore
Experience : 4+Years
Skills : Talend, any DB
Notice period : Immediate to 15 Days
About Product based company
Similar jobs
Proficiency in Linux.
Must have SQL knowledge and experience working with relational databases,
query authoring (SQL) as well as familiarity with databases including Mysql,
Mongo, Cassandra, and Athena.
Must have experience with Python/Scala.
Must have experience with Big Data technologies like Apache Spark.
Must have experience with Apache Airflow.
Experience with data pipeline and ETL tools like AWS Glue.
Experience working with AWS cloud services: EC2, S3, RDS, Redshift.
DATA SCIENTIST-MACHINE LEARNING
GormalOne LLP. Mumbai IN
Job Description
GormalOne is a social impact Agri tech enterprise focused on farmer-centric projects. Our vision is to make farming highly profitable for the smallest farmer, thereby ensuring India's “Nutrition security”. Our mission is driven by the use of advanced technology. Our technology will be highly user-friendly, for the majority of farmers, who are digitally naive. We are looking for people, who are keen to use their skills to transform farmers' lives. You will join a highly energized and competent team that is working on advanced global technologies such as OCR, facial recognition, and AI-led disease prediction amongst others.
GormalOne is looking for a machine learning engineer to join. This collaborative yet dynamic, role is suited for candidates who enjoy the challenge of building, testing, and deploying end-to-end ML pipelines and incorporating ML Ops best practices across different technology stacks supporting a variety of use cases. We seek candidates who are curious not only about furthering their own knowledge of ML Ops best practices through hands-on experience but can simultaneously help uplift the knowledge of their colleagues.
Location: Bangalore
Roles & Responsibilities
- Individual contributor
- Developing and maintaining an end-to-end data science project
- Deploying scalable applications on different platform
- Ability to analyze and enhance the efficiency of existing products
What are we looking for?
- 3 to 5 Years of experience as a Data Scientist
- Skilled in Data Analysis, EDA, Model Building, and Analysis.
- Basic coding skills in Python
- Decent knowledge of Statistics
- Creating pipelines for ETL and ML models.
- Experience in the operationalization of ML models
- Good exposure to Deep Learning, ANN, DNN, CNN, RNN, and LSTM.
- Hands-on experience in Keras, PyTorch or Tensorflow
Basic Qualifications
- Tech/BE in Computer Science or Information Technology
- Certification in AI, ML, or Data Science is preferred.
- Master/Ph.D. in a relevant field is preferred.
Preferred Requirements
- Exp in tools and packages like Tensorflow, MLFlow, Airflow
- Exp in object detection techniques like YOLO
- Exposure to cloud technologies
- Operationalization of ML models
- Good understanding and exposure to MLOps
Kindly note: Salary shall be commensurate with qualifications and experience
Job Sector: IT, Software
Job Type: Permanent
Location: Chennai
Experience: 10 - 20 Years
Salary: 12 – 40 LPA
Education: Any Graduate
Notice Period: Immediate
Key Skills: Python, Spark, AWS, SQL, PySpark
Contact at triple eight two zero nine four two double seven
Job Description:
Requirements
- Minimum 12 years experience
- In depth understanding and knowledge on distributed computing with spark.
- Deep understanding of Spark Architecture and internals
- Proven experience in data ingestion, data integration and data analytics with spark, preferably PySpark.
- Expertise in ETL processes, data warehousing and data lakes.
- Hands on with python for Big data and analytics.
- Hands on in agile scrum model is an added advantage.
- Knowledge on CI/CD and orchestration tools is desirable.
- AWS S3, Redshift, Lambda knowledge is preferred
You will be responsible for designing, building, and maintaining data pipelines that handle Real-world data at Compile. You will be handling both inbound and outbound data deliveries at Compile for datasets including Claims, Remittances, EHR, SDOH, etc.
You will
- Work on building and maintaining data pipelines (specifically RWD).
- Build, enhance and maintain existing pipelines in pyspark, python and help build analytical insights and datasets.
- Scheduling and maintaining pipeline jobs for RWD.
- Develop, test, and implement data solutions based on the design.
- Design and implement quality checks on existing and new data pipelines.
- Ensure adherence to security and compliance that is required for the products.
- Maintain relationships with various data vendors and track changes and issues across vendors and deliveries.
You have
- Hands-on experience with ETL process (min of 5 years).
- Excellent communication skills and ability to work with multiple vendors.
- High proficiency with Spark, SQL.
- Proficiency in Data modeling, validation, quality check, and data engineering concepts.
- Experience in working with big-data processing technologies using - databricks, dbt, S3, Delta lake, Deequ, Griffin, Snowflake, BigQuery.
- Familiarity with version control technologies, and CI/CD systems.
- Understanding of scheduling tools like Airflow/Prefect.
- Min of 3 years of experience managing data warehouses.
- Familiarity with healthcare datasets is a plus.
Compile embraces diversity and equal opportunity in a serious way. We are committed to building a team of people from many backgrounds, perspectives, and skills. We know the more inclusive we are, the better our work will be.
Responsibilities :
- Involve in planning, design, development and maintenance of large-scale data repositories, pipelines, analytical solutions and knowledge management strategy
- Build and maintain optimal data pipeline architecture to ensure scalability, connect operational systems data for analytics and business intelligence (BI) systems
- Build data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader
- Reporting and obtaining insights from large data chunks on import/export and communicating relevant pointers for helping in decision-making
- Preparation, analysis, and presentation of reports to the management for further developmental activities
- Anticipate, identify and solve issues concerning data management to improve data quality
Requirements :
- Ability to build and maintain ETL pipelines
- Technical Business Analysis experience and hands-on experience developing functional spec
- Good understanding of Data Engineering principles including data modeling methodologies
- Sound understanding of PostgreSQL
- Strong analytical and interpersonal skills as well as reporting capabilities
- We are looking for a Data Engineer with 3-5 years experience in Python, SQL, AWS (EC2, S3, Elastic Beanstalk, API Gateway), and Java.
- The applicant must be able to perform Data Mapping (data type conversion, schema harmonization) using Python, SQL, and Java.
- The applicant must be familiar with and have programmed ETL interfaces (OAUTH, REST API, ODBC) using the same languages.
- The company is looking for someone who shows an eagerness to learn and who asks concise questions when communicating with teammates.
2. Assemble large, complex data sets that meet business requirements
3. Identify, design, and implement internal process improvements
4. Optimize data delivery and re-design infrastructure for greater scalability
5. Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS technologies
6. Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics
7. Work with internal and external stakeholders to assist with data-related technical issues and support data infrastructure needs
8. Create data tools for analytics and data scientist team members
Skills Required:
1. Working knowledge of ETL on any cloud (Azure / AWS / GCP)
2. Proficient in Python (Programming / Scripting)
3. Good understanding of any of the data warehousing concepts (Snowflake / AWS Redshift / Azure Synapse Analytics / Google Big Query / Hive)
4. In-depth understanding of principles of database structure
5. Good understanding of any of the ETL technologies (Informatica PowerCenter / AWS Glue / Data Factory / SSIS / Spark / Matillion / Talend / Azure)
6. Proficient in SQL (query solving)
7. Knowledge in Change case Management / Version Control – (VSS / DevOps / TFS / GitHub, Bit bucket, CICD Jenkin)
Minimum 2 years of work experience on Snowflake and Azure storage.
Minimum 3 years of development experience in ETL Tool Experience.
Strong SQL database skills in other databases like Oracle, SQL Server, DB2 and Teradata
Good to have Hadoop and Spark experience.
Good conceptual knowledge on Data-Warehouse and various methodologies.
Working knowledge in any of the scripting like UNIX / Shell
Good Presentation and communication skills.
Should be flexible with the overlapping working hours.
Should be able to work independently and be proactive.
Good understanding of Agile development cycle.
We are looking for a Data Engineer that will be responsible for collecting, storing, processing, and analyzing huge sets of data that is coming from different sources.
Responsibilities
Working with Big Data tools and frameworks to provide requested capabilities Identify development needs in order to improve and streamline operations Develop and manage BI solutions Implementing ETL process and Data Warehousing Monitoring performance and managing infrastructure
Skills
Proficient understanding of distributed computing principles Proficiency with Hadoop and Spark Experience with building stream-processing systems, using solutions such as Kafka and Spark-Streaming Good knowledge of Data querying tools SQL and Hive Knowledge of various ETL techniques and frameworks Experience with Python/Java/Scala (at least one) Experience with cloud services such as AWS or GCP Experience with NoSQL databases, such as DynamoDB,MongoDB will be an advantage Excellent written and verbal communication skills