Event & Unstructured Data
5 - 7 yrs
₹20L - ₹25L / yr
Mumbai
Skills
AWS Kinesis
Data engineering
AWS Lambda
DynamoDB
data pipeline
Data governance
Data processing
Amazon Web Services (AWS)
athena
Audio
Linux/Unix
Python
SQL
Weblogs
Kinesis
Lambda
  • Design and develop data pipelines for real-time data integration and processing, executing models where required and exposing output via MQ, API, or NoSQL DB for consumption (see the sketch after this list)
  • Provide technical expertise to design efficient data ingestion solutions to store and process unstructured data such as documents, audio, images, weblogs, etc.
  • Develop API services to provide data as a service
  • Prototype solutions for complex data processing problems using AWS cloud-native services
  • Implement automated audit and quality assurance checks in data pipelines
  • Document and maintain data lineage from various sources to enable data governance
  • Coordinate with BIU, IT, and other stakeholders to provide best-in-class data pipeline solutions, exposing data via APIs and loading into downstream systems, NoSQL databases, etc.
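
A minimal sketch of the kind of real-time pipeline described above, assuming a Kinesis stream triggers a Lambda function that writes processed events to DynamoDB; the table name, event fields, and enrichment step are illustrative placeholders, not details from this posting.

```python
# Minimal sketch (hypothetical table and field names): a Lambda handler triggered
# by an AWS Kinesis stream that decodes each record, applies a trivial enrichment
# step, and persists the result to DynamoDB.
import base64
import json

import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("events")  # hypothetical table name


def handler(event, context):
    """Entry point for a Kinesis-triggered Lambda function."""
    for record in event["Records"]:
        # Kinesis delivers each payload base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))

        # Placeholder enrichment; a real pipeline might score a model here.
        payload["processed"] = True

        table.put_item(Item=payload)

    return {"records_processed": len(event["Records"])}
```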

Skills

  • Programming experience using Python & SQL
  • Extensive working experience in data engineering projects, using AWS Kinesis, AWS S3, DynamoDB, EMR, Lambda, Athena, etc. for event processing
  • Experience and expertise in implementing complex data pipelines
  • Strong familiarity with the AWS toolset for storage and processing; able to recommend the right tools/solutions for specific data processing problems
  • Hands-on experience in processing unstructured data (audio, images, documents, weblogs, etc.)
  • Good analytical skills with the ability to synthesize data to design and deliver meaningful information
  • Know-how of any NoSQL DB (DynamoDB, MongoDB, CosmosDB, etc.) is an advantage
  • Ability to understand business functionality, processes, and flows
  • Good combination of technical and interpersonal skills with strong written and verbal communication; detail-oriented with the ability to work independently

Functional knowledge

  • Real-time Event Processing
  • Data Governance & Quality assurance
  • Containerized deployment
  • Linux
  • Unstructured Data Processing
  • AWS Toolsets for Storage & Processing
  • Data Security

 

About the company: They provide both wholesale and retail funding.

Similar jobs

OJCommerce
Posted by Rajalakshmi N
Chennai
2 - 5 yrs
₹7L - ₹12L / yr
Beautiful Soup
Web Scraping
Python
Selenium

Role : Web Scraping Engineer

Experience : 2 to 3 Years

Job Location : Chennai

About OJ Commerce: 


OJ Commerce (OJC), a rapidly expanding and profitable online retailer, is headquartered in Florida, USA, with a fully-functional office in Chennai, India. We deliver exceptional value to our customers by harnessing cutting-edge technology, fostering innovation, and establishing strategic brand partnerships to enable a seamless, enjoyable shopping experience featuring high-quality products at unbeatable prices. Our advanced, data-driven system streamlines operations with minimal human intervention.

Our extensive product portfolio encompasses over a million SKUs and more than 2,500 brands across eight primary categories. With a robust presence on major platforms such as Amazon, Walmart, Wayfair, Home Depot, and eBay, we directly serve consumers in the United States.

As we continue to forge new partner relationships, our flagship website, www.ojcommerce.com, has rapidly emerged as a top-performing e-commerce channel, catering to millions of customers annually.

Job Summary:

We are seeking a Web Scraping Engineer and Data Extraction Specialist who will play a crucial role in our data acquisition and management processes. The ideal candidate will be proficient in developing and maintaining efficient web crawlers capable of extracting data from large websites and storing it in a database. Strong expertise in Python, web crawling, and data extraction, along with familiarity with popular crawling tools and modules, is essential. Additionally, the candidate should demonstrate the ability to effectively utilize API tools for testing and retrieving data from various sources. Join our team and contribute to our data-driven success!


Responsibilities:


  • Develop and maintain web crawlers in Python.
  • Crawl large websites and extract data (a minimal sketch follows this list).
  • Store data in a database.
  • Analyze and report on data.
  • Work with other engineers to develop and improve our web crawling infrastructure.
  • Stay up to date on the latest crawling tools and techniques.
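
A minimal sketch of that crawl-extract-store loop, assuming Requests, Beautiful Soup, and a local SQLite table; the URL, CSS selector, and schema are placeholders rather than anything specified in this posting.

```python
# Minimal sketch: fetch a page, extract product names with Beautiful Soup,
# and store them in SQLite. URL, selector, and schema are placeholders.
import sqlite3

import requests
from bs4 import BeautifulSoup

URL = "https://example.com/products"  # placeholder target

conn = sqlite3.connect("scraped.db")
conn.execute("CREATE TABLE IF NOT EXISTS products (name TEXT)")

response = requests.get(URL, timeout=30)
response.raise_for_status()

soup = BeautifulSoup(response.text, "html.parser")
names = [tag.get_text(strip=True) for tag in soup.select(".product-name")]

conn.executemany("INSERT INTO products (name) VALUES (?)", [(n,) for n in names])
conn.commit()
conn.close()
```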



Required Skills and Qualifications:


  • Bachelor's degree in computer science or a related field.
  • 2-3 years of experience with Python and web crawling.
  • Familiarity with tools/modules such as Scrapy, Selenium, Requests, Beautiful Soup, etc.
  • API tools such as Postman or equivalent. 
  • Working knowledge of SQL.
  • Experience with web crawling and data extraction.
  • Strong problem-solving and analytical skills.
  • Ability to work independently and as part of a team.
  • Excellent communication and documentation skills.


What we Offer

• Competitive salary

• Medical Benefits/Accident Cover

• Flexi Office Working Hours

• Fast-paced start-up

Chennai
4 - 6 yrs
₹18L - ₹23L / yr
Machine Learning (ML)
Data Science
Natural Language Processing (NLP)
Computer Vision
recommendation algorithm

Roles & Responsibilities:


- Adopt novel and breakthrough deep learning/machine learning technology to solve real-world problems across industries.
- Develop prototypes of machine learning models based on existing research papers.
- Utilize published/existing models to meet business requirements; tweak existing implementations to improve efficiency and adapt to use-case variations.
- Optimize machine learning model training and inference time.
- Work closely with development and QA teams in transitioning prototypes to commercial products.
- Independently work end to end, from data collection and preparation/annotation to validation of outcomes.
- Define and develop ML infrastructure to improve the efficiency of ML development workflows.


Must Have:

- Experience in productizing and deploying ML solutions.
- AI/ML expertise areas: computer vision with deep learning; experience with object detection, classification, and recognition; document layout and understanding tasks; OCR/ICR.
- Thorough understanding of the full ML pipeline, from data collection to model building to inference.
- Experience with Python, OpenCV, and at least a few frameworks/libraries (TensorFlow, Keras, PyTorch, spaCy, fastText, scikit-learn, etc.); a minimal inference sketch follows this section.
- 5+ years of relevant experience.
- Experience or knowledge of MLOps.

Good to Have: NLP (text classification, entity extraction, content summarization), AWS, Docker.
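
A minimal classification-inference sketch in line with the must-haves above, assuming PyTorch and torchvision (one of the frameworks listed); the pretrained model choice and image path are illustrative placeholders.

```python
# Minimal sketch: load a pretrained image classifier and run inference on one image.
# Framework choice (PyTorch/torchvision >= 0.13 weights API) and the image path are
# illustrative placeholders.
import torch
from PIL import Image
from torchvision import models, transforms

# Pretrained ResNet-18 as a stand-in for whatever model the team productizes.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = Image.open("sample.jpg").convert("RGB")  # placeholder input
batch = preprocess(image).unsqueeze(0)

with torch.no_grad():
    logits = model(batch)

print("Predicted class index:", logits.argmax(dim=1).item())
```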

CarWale
Posted by Vanita Acharya
Navi Mumbai, Mumbai
3 - 5 yrs
₹10L - ₹15L / yr
Data Science
Data Scientist
R Programming
Python
Machine Learning (ML)

About CarWale: CarWale's mission is to bring delight to car buying. We offer a bouquet of reliable tools and services to help car consumers decide on buying the right car, at the right price and from the right partner. CarWale has always strived to serve car buyers and owners in the most comprehensive and convenient way possible. We provide a platform where car buyers and owners can research, buy, sell and come together to discuss and talk about their cars. We aim to empower Indian consumers to make informed car buying and ownership decisions with exhaustive and unbiased information on cars through our expert reviews, owner reviews, detailed specifications and comparisons. We understand that a car is, by and large, the second-most expensive asset a consumer associates their lifestyle with. Together with CarTrade & BikeWale, we are the market leaders in the personal mobility media space.

About the Team: We are a bunch of enthusiastic analysts assisting all business functions with their data needs. We deal with huge but diverse datasets to find relationships, patterns and meaningful insights. Our goal is to help drive growth across the organization by creating a data-driven culture.

We are looking for an experienced Data Scientist who likes to explore opportunities and know their way around data to build world class solutions making a real impact on the business. 

 

Skills / Requirements –

  • 3-5 years of experience working on data science projects
  • Experience in statistical modelling of big data sets
  • Expert in Python and R, with deep knowledge of ML packages
  • Expert at fetching data with SQL (a minimal sketch follows this list)
  • Ability to present and explain data to management
  • Knowledge of AWS would be beneficial
  • Demonstrated structural and analytical thinking
  • Ability to structure and execute data science projects end to end
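
A minimal sketch of the fetch-from-SQL-then-model workflow implied above, assuming pandas, SQLAlchemy, and scikit-learn; the connection string, table, and column names are placeholders.

```python
# Minimal sketch: pull a training set from SQL and fit a simple model.
# Connection string, table, and column names are placeholders.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:pass@host/db")  # placeholder DSN
df = pd.read_sql("SELECT feature_a, feature_b, label FROM training_data", engine)

X = df[["feature_a", "feature_b"]]
y = df["label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print("Holdout accuracy:", model.score(X_test, y_test))
```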

 

Education –

Bachelor's degree in a quantitative field (Maths, Statistics, Computer Science); a Master's degree is preferred.

 

Hy-Vee
Bengaluru (Bangalore)
5 - 10 yrs
₹15L - ₹33L / yr
ETL
Informatica
Data Warehouse (DWH)
Python
Git

Technical & Business Expertise:

- Hands-on integration experience with SSIS/MuleSoft
- Hands-on experience with Azure Synapse
- Proven advanced-level database development experience in SQL Server
- Proven advanced-level understanding of Data Lake concepts
- Proven intermediate-level proficiency in Python or a similar programming language
- Intermediate understanding of cloud platforms (GCP)
- Intermediate understanding of data warehousing
- Advanced understanding of source control (GitHub)

Internshala
Posted by Sarvari Juneja
Gurugram
3 - 5 yrs
₹15L - ₹19L / yr
Spark
Hadoop
Big Data
Data engineering
PySpark

Internshala is a dot com business with the heart of dot org.

We are a technology company on a mission to equip students with relevant skills & practical exposure through internships, fresher jobs, and online trainings. Imagine a world full of freedom and possibilities. A world where you can discover your passion and turn it into your career. A world where your practical skills matter more than your university degree. A world where you do not have to wait till 21 to taste your first work experience (and get a rude shock that it is nothing like you had imagined it to be). A world where you graduate fully assured, fully confident, and fully prepared to stake a claim on your place in the world.

At Internshala, we are making this dream a reality!

👩🏻‍💻 Your responsibilities would include-

  • Designing, implementing, testing, deploying, and maintaining stable, secure, and scalable data engineering solutions and pipelines in support of data and analytics projects, including integrating new sources of data into our central data warehouse, and moving data out to applications and affiliates
  • Developing analytical tools and programs that help in analyzing and organizing raw data
  • Evaluating business needs and objectives
  • Conducting complex data analysis and reporting on results
  • Collaborating with data scientists and architects on several projects
  • Maintaining reliability of the system and being on-call for mission-critical systems
  • Performing infrastructure cost analysis and optimization
  • Generating architecture recommendations and implementing them
  • Designing, building, and maintaining data architecture and warehousing using AWS services.
  • ETL optimization: designing, coding, and tuning big data processes using Apache Spark, R, Python, C#, and/or similar technologies (a minimal Spark sketch follows this list)
  • Disaster recovery planning and implementation when it comes to ETL and data-related services
  • Define actionable KPIs and configure monitoring/alerting
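
A minimal PySpark sketch of the kind of ETL step named above, assuming raw JSON events land in S3 and are written back as partitioned Parquet; the bucket paths and column names are placeholders.

```python
# Minimal sketch: read raw JSON events from S3, apply a light transformation,
# and write partitioned Parquet back out. Paths and column names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events-etl").getOrCreate()

raw = spark.read.json("s3a://raw-bucket/events/")  # placeholder input path

cleaned = (
    raw.filter(F.col("event_type").isNotNull())
       .withColumn("event_date", F.to_date("event_timestamp"))
)

(cleaned.write
        .mode("overwrite")
        .partitionBy("event_date")
        .parquet("s3a://warehouse-bucket/events/"))  # placeholder output path

spark.stop()
```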

🍒 You will get-

  • A chance to build and lead an awesome team working on one of the best recruitment and online training products in the world, impacting millions of lives for the better
  • Awesome colleagues & a great work environment
  • Loads of autonomy and freedom in your work

💯 You fit the bill if-

  • You have the zeal to build something from scratch
  • You have experience in Data engineering and infrastructure work for analytical and machine learning processes.
  • You have experience in a Linux environment and familiarity with writing scripts in shell, Python, or another scripting language
  • You have 3-5 years of experience as a Data Engineer or similar software engineering role
Navi Mumbai
3 - 5 yrs
₹7L - ₹18L / yr
PySpark
Data engineering
Big Data
Hadoop
Spark
  • Proficiency in shell scripting
  • Proficiency in automation of tasks
  • Proficiency in PySpark/Python
  • Proficiency in writing and understanding Sqoop jobs
  • Understanding of Cloudera Manager
  • Good understanding of RDBMS
  • Good understanding of Excel

 

Dataweave Pvt Ltd
Posted by Megha M
Bengaluru (Bangalore)
3 - 7 yrs
Best in industry
Python
Data Structures
Algorithms
Web Scraping
Relevant set of skills
● Good communication and collaboration skills, with 4-7 years of experience.
● Ability to code and script, with a strong grasp of CS fundamentals and excellent problem-solving abilities.
● Comfort with frequent, incremental code testing and deployment; data management skills.
● Good understanding of RDBMS.
● Experience in building data pipelines and processing large datasets.
● Knowledge of building web scraping and data mining is a plus.
● Working knowledge of open-source data stores such as MySQL, Solr, Elasticsearch, and Cassandra would be a plus.
● Expert in Python programming.
Role and responsibilities
● Inclined towards working in a start-up environment.
● Comfort with frequent, incremental code testing and deployment; data management skills.
● Design and build robust, scalable data engineering solutions for structured and unstructured data to deliver business insights, reporting and analytics.
● Expertise in troubleshooting, debugging, data completeness and quality issues, and scaling overall system performance.
● Build robust APIs that power our delivery points (dashboards, visualizations and other integrations); a minimal sketch follows this list.
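
A minimal sketch of a data-serving API of the kind described in the last point, assuming Flask; the endpoint name and payload shape are hypothetical stand-ins for a query against the pipeline's data store.

```python
# Minimal sketch: a small Flask endpoint that serves aggregated data to a dashboard.
# Endpoint name and payload shape are placeholders.
from flask import Flask, jsonify

app = Flask(__name__)


def fetch_price_summary():
    # Stand-in for a query against the pipeline's data store.
    return {"products_tracked": 1200, "avg_price_change_pct": -1.7}


@app.route("/api/price-summary")
def price_summary():
    return jsonify(fetch_price_summary())


if __name__ == "__main__":
    app.run(port=5000)
```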
Japan Based Leading Company
Bengaluru (Bangalore)
3 - 10 yrs
₹0L - ₹20L / yr
Big Data
Amazon Web Services (AWS)
Java
Python
MySQL
We are looking for a data engineer with AWS Cloud infrastructure experience to join our Big Data Operations team. This role will provide advanced operations support, contribute to automation and system improvements, and work directly with enterprise customers to provide excellent customer service.
The candidate,
1. Must have very good hands-on technical experience of 3+ years with Java or Python
2. Working experience and good understanding of AWS Cloud; Advanced experience with IAM policy and role management
3. Infrastructure Operations: 5+ years supporting systems infrastructure operations, upgrades, deployments using Terraform, and monitoring
4. Hadoop: Experience with Hadoop (Hive, Spark, Sqoop) and/or AWS EMR
5. Knowledge of PostgreSQL/MySQL/DynamoDB backend operations
6. DevOps: Experience with DevOps automation - Orchestration/Configuration Management and CI/CD tools (Jenkins)
7. Version Control: Working experience with one or more version control platforms like GitHub or GitLab
8. Knowledge of Amazon QuickSight reporting
9. Monitoring: Hands-on experience with monitoring tools such as AWS CloudWatch, AWS CloudTrail, Datadog, and Elasticsearch (a minimal CloudWatch sketch follows this list)
10. Networking: Working knowledge of TCP/IP networking, SMTP, HTTP, load-balancers (ELB) and high availability architecture
11. Security: Experience implementing role-based security, including AD integration, security policies, and auditing in a Linux/Hadoop/AWS environment. Familiar with penetration testing and scan tools for remediation of security vulnerabilities.
12. Demonstrated successful experience learning new technologies quickly
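
A minimal monitoring sketch in the spirit of item 9 above, assuming boto3 and CloudWatch; the alarm name, EMR cluster id, threshold, and SNS topic ARN are placeholders.

```python
# Minimal sketch: create a CloudWatch alarm on an EMR cluster metric with boto3.
# Alarm name, cluster id, threshold, and SNS topic ARN are placeholders.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="emr-apps-pending-high",  # placeholder alarm name
    Namespace="AWS/ElasticMapReduce",
    MetricName="AppsPending",
    Dimensions=[{"Name": "JobFlowId", "Value": "j-EXAMPLE"}],  # placeholder cluster id
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=10.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],  # placeholder topic
)
```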
WHAT WILL BE THE ROLES AND RESPONSIBILITIES?
1. Create procedures/run books for operational and security aspects of AWS platform
2. Improve AWS infrastructure by developing and enhancing automation methods
3. Provide advanced business and engineering support services to end users
4. Lead other admins and platform engineers through design and implementation decisions to achieve balance between strategic design and tactical needs
5. Research and deploy new tools and frameworks to build a sustainable big data platform
6. Assist with creating programs for training and onboarding for new end users
7. Lead Agile/Kanban workflows and team process work
8. Troubleshoot issues to resolve problems
9. Provide status updates to Operations product owner and stakeholders
10. Track all details in the issue tracking system (JIRA)
11. Provide issue review and triage problems for new service/support requests
12. Use DevOps automation tools, including Jenkins build jobs
13. Fulfil any ad-hoc data or report request queries from different functional groups
LatentView Analytics
Posted by Kannikanti madhuri
Chennai
5 - 8 yrs
₹5L - ₹8L / yr
Data Science
Analytics
Data Analytics
Data modeling
Data mining
Job Overview: We are looking for an experienced Data Science professional to join our Product team and lead the data analytics team, managing the processes and people responsible for accurate data collection, processing, modelling and analysis. The ideal candidate has a knack for seeing solutions in sprawling data sets and the business mindset to convert insights into strategic opportunities for our clients. The incumbent will work closely with leaders across product, sales, and marketing to support and implement high-quality, data-driven decisions. They will ensure data accuracy and consistent reporting by designing and creating optimal processes and procedures for analytics employees to follow. They will use advanced data modelling, predictive modelling, natural language processing and analytical techniques to interpret key findings.

Responsibilities for Analytics Manager:

- Build, develop and maintain data models, reporting systems, data automation systems, dashboards and performance metrics that support key business decisions.
- Design and build technical processes to address business issues.
- Manage and optimize processes for data intake, validation, mining and engineering, as well as modelling, visualization and communication deliverables.
- Examine, interpret and report results to stakeholders in leadership, technology, sales, marketing and product teams.
- Develop and implement quality controls and standards.
- Anticipate future demands of initiatives related to people, technology, budget and business within your department, and design/implement solutions to meet these needs.
- Communicate results and business impacts of insight initiatives to stakeholders within and outside the company.
- Lead cross-functional projects using advanced data modelling and analysis techniques to discover insights that will guide strategic decisions and uncover optimization opportunities.

Qualifications for Analytics Manager:

- Working knowledge of data mining principles: predictive analytics, mapping, collecting data from multiple cloud-based data sources.
- Strong SQL skills and the ability to perform effective querying.
- Understanding of, and experience using, analytical concepts and statistical techniques: hypothesis development, designing tests/experiments, analysing data, drawing conclusions, and developing actionable recommendations for business units.
- Experience and knowledge of statistical modelling techniques: GLM multiple regression, logistic regression, log-linear regression, variable selection, etc. (a minimal sketch follows this section).
- Experience working with and creating databases and dashboards, using all relevant data to inform decisions.
- Strong problem-solving, quantitative and analytical abilities.
- Strong ability to plan and manage numerous processes, people and projects simultaneously.
- Excellent communication, collaboration and delegation skills.
- We're looking for someone with at least 5 years of experience in a position monitoring, managing and drawing insights from data, and at least 3 years of experience leading a team. The right candidate will also be proficient and experienced with the following tools/programs:
- Strong programming skills with querying languages: R, Python, etc.
- Experience with big data tools like Hadoop.
- Experience with data visualization tools: Tableau, d3.js, etc.
- Experience with Excel, Word, and PowerPoint.
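
A minimal sketch of the GLM/logistic-regression modelling named in the qualifications above, assuming statsmodels; the toy data frame and column names are placeholders.

```python
# Minimal sketch: fit a logistic regression as a GLM with statsmodels.
# The toy data frame and column names are placeholders.
import pandas as pd
import statsmodels.api as sm

df = pd.DataFrame({
    "spend": [120, 80, 300, 45, 210, 95, 400, 60],
    "visits": [3, 1, 7, 1, 5, 2, 9, 1],
    "converted": [1, 0, 1, 1, 0, 0, 1, 0],
})

X = sm.add_constant(df[["spend", "visits"]])
y = df["converted"]

model = sm.GLM(y, X, family=sm.families.Binomial()).fit()
print(model.summary())
```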
Atyeti Inc
Posted by Yash G
Pune
5 - 8 yrs
₹8L - ₹16L / yr
Data Science
Machine Learning (ML)
Natural Language Processing (NLP)
Python
R Programming
• Exposure to Deep Learning, Neural Networks, or related fields, and a strong interest and desire to pursue them.
• Experience in Natural Language Processing, Computer Vision, Machine Learning or Machine Intelligence (Artificial Intelligence).
• Programming experience in Python.
• Knowledge of machine learning frameworks like TensorFlow.
• Experience with software version control systems like GitHub.
• Understanding of Big Data technologies such as Hadoop, MongoDB and Apache Spark.