
We are seeking an experienced Web Scraping Engineer to lead data extraction efforts for our enterprise clients. In this role, you will create and maintain robust, large-scale scraping systems for gathering structured data.
Responsibilities:
Develop and optimize custom web scraping tools and workflows.
Integrate scraping systems with data storage solutions like SQL and NoSQL databases.
Troubleshoot and resolve scraping challenges, including CAPTCHAs, rate limiting, and IP blocking.
Provide technical guidance on scraping best practices and standards.
Skills Required:
Expert in Python and scraping libraries such as Scrapy and BeautifulSoup (see the sketch after this list).
Deep understanding of web scraping techniques and challenges (CAPTCHAs, anti-bot measures).
Experience with cloud platforms (AWS, Google Cloud).
Strong background in databases and data storage systems (SQL, MongoDB).
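For illustration only, here is a minimal sketch of the kind of polite scraping loop the skills above imply, using requests and BeautifulSoup with a simple retry/backoff against rate limiting; the URL, headers, and CSS selectors are hypothetical placeholders, not a real client site.

    import time
    import requests
    from bs4 import BeautifulSoup

    # Hypothetical target and selectors -- placeholders, not a real client site.
    URL = "https://example.com/products?page={page}"
    HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; data-collector)"}

    def fetch(url, retries=3, backoff=5):
        """GET a page, backing off when the server rate-limits us (HTTP 429)."""
        for attempt in range(retries):
            resp = requests.get(url, headers=HEADERS, timeout=30)
            if resp.status_code == 429:          # rate limited -- wait and retry
                time.sleep(backoff * (attempt + 1))
                continue
            resp.raise_for_status()
            return resp.text
        raise RuntimeError(f"Gave up on {url} after {retries} attempts")

    def parse(html):
        """Pull structured rows out of the page with BeautifulSoup."""
        soup = BeautifulSoup(html, "html.parser")
        for item in soup.select("div.product"):   # assumed markup
            yield {
                "name": item.select_one("h2").get_text(strip=True),
                "price": item.select_one(".price").get_text(strip=True),
            }

    if __name__ == "__main__":
        for page in range(1, 4):
            for row in parse(fetch(URL.format(page=page))):
                print(row)
            time.sleep(1)  # be polite between pages
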

Similar jobs
Position: Lead Python Developer
Location: Ahmedabad, Gujarat
The Client company includes a team of experienced information services professionals who are passionate about growing and enhancing the value of information services businesses. They provide support with the talent, technology, tools, infrastructure, and expertise required to deliver across the data ecosystem.
Position Summary
We are seeking a skilled and experienced Backend Developer with strong expertise in TypeScript, Python, and web scraping. You will be responsible for designing, developing, and maintaining scalable backend services and APIs that power our data-driven products. Your role will involve collaborating with cross-functional teams, optimizing system performance, ensuring data integrity, and contributing to the design of efficient and secure architectures.
Job Responsibility
● Design, develop, and maintain backend systems and services using Python and TypeScript.
● Develop and maintain web scraping solutions to extract, process, and manage large-scale data from multiple sources.
● Work with relational and non-relational databases, ensuring high availability, scalability, and performance.
● Implement authentication, authorization, and security best practices across services.
● Write clean, maintainable, and testable code following best practices and coding standards.
● Collaborate with frontend engineers, data engineers, and DevOps teams to deliver robust solutions and troubleshoot, debug, and upgrade existing applications.
● Stay updated with backend development trends, tools, and frameworks to continuously improve processes.
● Utilize core crawling experience to design efficient strategies for scraping data from different websites and applications.
● Collaborate with technology and data collection teams to build end-to-end, technology-enabled ecosystems, and partner on research projects to analyze massive data inputs.
● Design and develop web crawlers, independently solving the various problems encountered during development.
● Stay updated with the latest web scraping techniques, tools, and industry trends to continuously improve the scraping processes.
Job Requirements
● 4+ years of professional experience in backend development with TypeScript and Python.
● Strong understanding of TypeScript-based server-side frameworks (e.g., Node.js, NestJS, Express) and Python frameworks (e.g., FastAPI, Django, Flask).
● Experience with tools and libraries for web scraping (e.g., Scrapy, BeautifulSoup, Selenium, Puppeteer).
● Hands-on experience with Temporal for creating and orchestrating workflows.
● Proven hands-on experience in web scraping, including crawling, data extraction, deduplication, and handling dynamic websites.
● Proficient in implementing proxy solutions and handling bot-detection challenges (e.g., Cloudflare); see the proxy-rotation sketch after this list.
● Experience working with Docker, containerized deployments, and cloud environments (GCP or Azure).
● Proficiency with database systems such as MongoDB and Elastic Search.
● Hands-on experience with designing and maintaining scalable APIs.
● Knowledge of software testing practices (unit, integration, end-to-end).
● Familiarity with CI/CD pipelines and version control systems (Git).
● Strong problem-solving skills, attention to detail, and ability to work in agile environments.
● Strong communication skills and the ability to navigate ambiguous situations.
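As one hedged illustration of the proxy-rotation requirement above (not this team's actual setup), a Scrapy spider can attach a proxy to each request via request.meta and rely on AutoThrottle for rate control; the proxy pool and target site below are invented placeholders.

    import random
    import scrapy

    # Placeholder proxy pool -- in practice this would come from a proxy provider.
    PROXIES = [
        "http://proxy1.example.net:8000",
        "http://proxy2.example.net:8000",
    ]

    class ListingSpider(scrapy.Spider):
        name = "listings"
        start_urls = ["https://example.com/listings"]  # hypothetical target

        custom_settings = {
            "AUTOTHROTTLE_ENABLED": True,     # adapt delay to server response times
            "DOWNLOAD_DELAY": 1.0,
            "RETRY_HTTP_CODES": [429, 403, 503],
        }

        def start_requests(self):
            for url in self.start_urls:
                # Rotate proxies by attaching one to each request's meta.
                yield scrapy.Request(url, meta={"proxy": random.choice(PROXIES)})

        def parse(self, response):
            for row in response.css("div.listing"):   # assumed markup
                yield {
                    "title": row.css("h3::text").get(),
                    "url": response.urljoin(row.css("a::attr(href)").get()),
                }
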
Job Exposure:
● Opportunity to apply creative methods to acquire and filter North American government and agency data from various websites and sources
● In-depth industry exposure to data harvesting techniques to build and scale robust, sustainable models using open-source applications
● Effective collaboration with the IT team to design tailor-made solutions based on clients' requirements
● Unique opportunity to research various agencies, vendors, and products, as well as technology tools, to compose a solution
Web Scraping Engineer
The Python Web Scraping Engineer will be responsible for efficient web scraping/web crawling and parsing. The candidate should have demonstrated experience in web scraping and data extraction, along with the ability to communicate effectively and adhere to set deadlines.
Responsibilities:
- Develop and maintain a service that extracts website data using scrapers and APIs across multiple sophisticated websites.
- Extract structured and unstructured data and manipulate it through text processing, image processing, regular expressions, etc. (see the sketch after the requirements below).
- Write reusable, testable, and efficient code.
- Develop and maintain web scraping solutions using BeautifulSoup, Scrapy, and Selenium.
- Handle dynamic content, proxies, CAPTCHAs, data extraction, and optimization, and ensure data accuracy.
- Implement and maintain robust, full-stack applications for web crawlers.
- Troubleshoot, debug, and improve existing web crawlers and data extraction systems.
- Utilize tools such as Scrapy and the Spider tool to enhance data crawling capabilities.
Requirements:
- 0.5-2 years of work experience in Python-based web scraping
- Sound knowledge of Python and good experience with web crawling tools such as requests, Scrapy, BeautifulSoup, Selenium, etc.
- Strong interpersonal, verbal, and written communication skills in English
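A small, hedged sketch of the regex-driven text processing mentioned in the responsibilities above; the sample text and patterns are illustrative only and would be tuned per source.

    import re

    # Illustrative raw text as it might come back from a scraped page.
    raw = """
    Contact: sales@example.com | Phone: +1 (555) 010-2030
    Price: $1,499.00  Updated: 2024-11-05
    """

    # Simple field patterns -- placeholders, adjusted per source in real projects.
    patterns = {
        "email": r"[\w.+-]+@[\w-]+\.[\w.]+",
        "price": r"\$[\d,]+(?:\.\d{2})?",
        "date": r"\d{4}-\d{2}-\d{2}",
    }

    record = {field: (m.group(0) if (m := re.search(p, raw)) else None)
              for field, p in patterns.items()}
    print(record)  # {'email': 'sales@example.com', 'price': '$1,499.00', 'date': '2024-11-05'}
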
Responsibilities
· Develop Python-based APIs using FastAPI and Flask frameworks.
· Develop Python-based Automation scripts and Libraries.
· Develop Front End Components using VueJS and ReactJS.
· Write and modify Dockerfiles for the back-end and front-end components.
· Integrate CI/CD pipelines for automation and code quality checks.
· Write complex ORM mappings using SQLAlchemy (see the sketch after the skills list).
Required Skills:
· Strong experience in Python development in a full-stack environment, including NodeJS, VueJS/Vuex, Flask, etc.
· Experience with SQLAlchemy or similar ORM frameworks.
· Experience working with Geolocation APIs (e.g., Google Maps, Mapbox).
· Experience using Elasticsearch and Airflow is a plus.
· Strong knowledge of SQL, comfortable working with MySQL and/or PostgreSQL databases.
· Understand concepts of Data Modeling.
· Experience with REST.
· Experience with Git, GitFlow, and code review process.
· Good understanding of basic UI and UX principles.
· Excellent problem-solving and communication skills.
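A compact, hedged sketch combining the FastAPI and SQLAlchemy responsibilities listed above; the Place model, the SQLite connection string, and the endpoint are invented for illustration, not part of this role's actual codebase.

    from fastapi import FastAPI
    from sqlalchemy import create_engine, Column, Integer, String
    from sqlalchemy.orm import declarative_base, sessionmaker

    # Invented example model -- not a real schema from this job.
    Base = declarative_base()

    class Place(Base):
        __tablename__ = "places"
        id = Column(Integer, primary_key=True)
        name = Column(String(120), nullable=False)
        lat = Column(String(32))
        lng = Column(String(32))

    # Placeholder connection string; check_same_thread=False is the usual SQLite-with-FastAPI setting.
    engine = create_engine("sqlite:///places.db", connect_args={"check_same_thread": False})
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)

    app = FastAPI()

    @app.get("/places/{place_id}")
    def read_place(place_id: int):
        """Return one row as JSON, or a simple not-found payload."""
        with Session() as session:
            place = session.get(Place, place_id)
            if place is None:
                return {"error": "not found"}
            return {"id": place.id, "name": place.name, "lat": place.lat, "lng": place.lng}
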
JD / Skills Sets
1. Good knowledge of Python
2. Good knowledge of MySQL and MongoDB
3. Design Pattern
4. OOPs
5. Automation
6. Web scraping
7. Redis queue (see the sketch after this list)
8. Basic knowledge of the finance domain will be beneficial
9. Git
10. AWS (EC2, RDS, S3)
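A hedged sketch of the Redis-queue pattern from item 7 above, using redis-py with a producer and a blocking consumer; the queue name and local Redis instance are placeholders.

    import json
    import redis

    # Placeholder queue name and local Redis instance.
    r = redis.Redis(host="localhost", port=6379, db=0)
    QUEUE = "scrape:urls"

    def enqueue(urls):
        """Producer: push URLs onto the queue."""
        for url in urls:
            r.rpush(QUEUE, json.dumps({"url": url}))

    def worker():
        """Consumer: block until a job arrives, then process it."""
        while True:
            _, payload = r.blpop(QUEUE)           # blocking pop from the left
            job = json.loads(payload)
            print("would scrape", job["url"])     # real scraping logic goes here

    if __name__ == "__main__":
        enqueue(["https://example.com/a", "https://example.com/b"])
        # worker()  # run in a separate process in practice
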
1: Proficient in Python, Flask, pandas, GitHub, and AWS
2: Good knowledge of databases, both SQL and NoSQL
3: Strong experience with REST and SOAP APIs
4: Experience working on scalable, interactive web applications
5: Basic knowledge of JavaScript and HTML
6: Automation and crawling tools and modules
7: Multithreading and multiprocessing (see the sketch after this list)
8: Good understanding of test-driven development
9: Preferred exposure to the finance domain
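For item 7 above (multithreading and multiprocessing), a minimal hedged sketch of I/O-bound fetching with a thread pool; the URLs are placeholders, and CPU-bound parsing would typically move to a ProcessPoolExecutor instead.

    from concurrent.futures import ThreadPoolExecutor, as_completed
    import requests

    URLS = [f"https://example.com/page/{i}" for i in range(1, 11)]  # placeholders

    def fetch(url):
        resp = requests.get(url, timeout=30)
        return url, resp.status_code, len(resp.content)

    # Threads suit I/O-bound downloads; swap in ProcessPoolExecutor for CPU-bound parsing.
    with ThreadPoolExecutor(max_workers=5) as pool:
        futures = [pool.submit(fetch, url) for url in URLS]
        for future in as_completed(futures):
            url, status, size = future.result()
            print(f"{url}: {status} ({size} bytes)")
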
Your Responsibilities would be to:
- Architect new and optimize existing software codebases and systems used to crawl, launch, run, and monitor the Anakin family of app crawlers
- Deeply own the lifecycle of software, including rolling out to operations, managing configurations, maintaining and upgrading, and supporting end-users
- Configure and optimize the automated testing and deployment systems used to maintain more than 1,000 crawlers across the company
- Analyze data and bugs that require in-depth investigations
- Interface directly with external customers including managing relationships and steering requirements
Basic Qualifications:
- Extremely effective, self-driven builder
- 2+ years of experience as a backend software engineer
- 2+ years of experience with Python
- 2+ years of experience with AWS services such as EC2, S3, Lambda, etc. (see the sketch after this list)
- Should have managed a team of software engineers
- Deep experience with network debugging across all OSI layers (Wireshark)
- Knowledge of networks or/and cybersecurity
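A hedged sketch of shipping crawler output to S3 with boto3, reflecting the AWS experience listed above; the bucket name, key layout, and record shape are invented for illustration.

    import json
    import gzip
    from datetime import datetime, timezone
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "example-crawler-output"   # placeholder bucket name

    def upload_results(records, crawler_name):
        """Gzip a batch of scraped records and store it under a dated key."""
        key = f"{crawler_name}/{datetime.now(timezone.utc):%Y/%m/%d/%H%M%S}.json.gz"
        body = gzip.compress(json.dumps(records).encode("utf-8"))
        s3.put_object(Bucket=BUCKET, Key=key, Body=body,
                      ContentType="application/json", ContentEncoding="gzip")
        return key

    if __name__ == "__main__":
        print(upload_results([{"title": "sample", "price": "9.99"}], "demo-crawler"))
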
Preferred Skills and Experience
- Broad understanding of the landscape of software engineering design patterns and principles
- Ability to work quickly and accurately in a high-stress environment, removing runtime bugs within minutes
- Excellent communicator, both written and verbal
Additional Requirements
- Must be available to work extended hours and weekends when needed to meet critical deadlines
- Must have an aversion to politics and BS; should let their work speak for itself.
- Must be comfortable with uncertainty; in almost all cases, your job will be to figure it out.
- Must not be bound to a comfort zone; you will often need to challenge yourself to go above and beyond.
Roles and Responsibilities
- Apply knowledge set to fetch data from multiple online sources, cleanse it and build APIs on top of it
- Develop a deep understanding of our vast data sources on the web and know exactly how, when, and which data to scrape, parse, and store
- We're looking for people who will naturally take ownership of data products and who can bring a project all the way from a fast prototype to production.
- Integrating and maintaining Python services
- Developing robust microservices and applications
Desired Candidate Profile
- At least 2-3 years of strong, relevant experience.
- Strong coding experience in Python3.
- Should have good experience with Django and NodeJS.
- Proficient in modelling applications on both RDBMS and NoSQL databases.
- Should have experience in web scraping
- Good understanding of and hands-on experience with scheduling and managing tasks with cron (see the sketch after this list)
- Should have experience with microservice architecture
- Compile and analyze data, processes, and codes to troubleshoot problems and identify areas for improvement
- Should have experience in shell scripting, GIT and docker
- Write unit tests targeting 100% code coverage.
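A hedged sketch of the cron-driven scheduling mentioned above: a small script entry point plus (in the comment) the crontab line that might run it nightly; all paths and names are placeholders.

    #!/usr/bin/env python3
    # Example crontab entry (placeholder paths), running the job at 02:30 every night:
    #   30 2 * * * /usr/bin/python3 /opt/scrapers/run_job.py >> /var/log/scrapers/run_job.log 2>&1
    import logging
    import sys

    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")

    def run():
        """Placeholder for the real scrape-and-store logic."""
        logging.info("scrape started")
        # ... fetch, parse, write to the database ...
        logging.info("scrape finished")

    if __name__ == "__main__":
        try:
            run()
        except Exception:
            logging.exception("scrape failed")
            sys.exit(1)
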
Company Introduction –
- Series A-funded information security & data analytics company.
- Working with cutting-edge technologies, using AI for predictive intelligence and facial biometrics.
- Among the top 5 cyber excellence companies globally (Holger Schulze awards).
- Bronze award for best startup of the year (Indian Express IT awards); the only cybersecurity company in the top 3.
- More than 100 clients in India.
Job Description:
Job Title: Python Developer
Key Requirements:
- Mine data from structured and unstructured data sources.
- Extract data (text, images, and videos) from multiple documents in different formats.
- Extract information and intelligence from data.
- Extract data based on regular expressions.
- Collect data from structured RDBMS databases (see the sketch after this list).
- Work closely with Project/Business/Research teams to provide mined data/intelligence for analysis.
- Should have strong exposure to core Python skills such as multiprocessing, multithreading, and file handling, and to data structures such as JSON, data frames, and user-defined data structures.
- Should have excellent knowledge of classes, file handling, and memory manipulation.
- Strong knowledge of Python.
- Strong exposure to front-end technologies such as CSS, JavaScript, Ajax, etc.
- Should have exposure to requests, Frontera, scrapy-cluster, Elasticsearch, and distributed computing tools such as Kafka, HBase, Redis, ZooKeeper, and REST APIs.
- Should be familiar with *nix development environment.
- Knowledge of Django will be an added advantage.
- Excellent knowledge of web crawling/web scraping.
- Should have used scraping modules such as Selenium, Scrapy, and BeautifulSoup.
- Experience with text processing.
- Basics of databases. Good troubleshooting and debugging skills.
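A hedged sketch of collecting data from a structured RDBMS, as in the requirement above; sqlite3 stands in for whatever database the project actually uses, and the table and query are invented.

    import sqlite3
    import json

    # Placeholder database and table -- stands in for the project's actual RDBMS.
    conn = sqlite3.connect("documents.db")
    conn.execute("CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
    conn.execute("INSERT INTO docs (title, body) VALUES (?, ?)", ("sample", "hello world"))
    conn.commit()

    def collect(query, params=()):
        """Run a parameterised query and return rows as JSON-friendly dicts."""
        cur = conn.execute(query, params)
        cols = [c[0] for c in cur.description]
        return [dict(zip(cols, row)) for row in cur.fetchall()]

    rows = collect("SELECT id, title, body FROM docs WHERE title = ?", ("sample",))
    print(json.dumps(rows, indent=2))
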
Experience: 1-4 years
Education
B.Tech, MCA, Computer Engineering