Senior Data Engineer

Merit Data and Technology

Chennai, Tamil Nadu, India · Full-time

Experience: Any
Salary: INR 900,000 – INR 1,200,000 / year
Vacancies: 1
Posted: More than 12 hours ago
Work mode: In office
Education: Any graduate
Eligibility: Any graduate can apply. Candidates with degrees in Computer Science, Engineering, or related disciplines, or those with equivalent practical experience, are eligible.
Resume: Required to apply

About the Company

Merit Data and Technology is a London-headquartered AI-led technology company with engineering hubs in Chennai, Mumbai, and London. The business focuses on collecting, enriching, and engineering data, and it supports trusted B2B brands through proprietary data management systems and data solutions. Its work spans resilient, scalable cloud and on-premise products, from straightforward web apps to large enterprise-grade data systems.

Role Overview

This position is for a Senior Data Engineer focused on scraping and large-scale data harvesting. The role is responsible for building dependable pipelines that gather, parse, refine, and deliver high-quality information from web sources and APIs. The engineer will work with modern scraping tools, manage anti-bot restrictions, process data at scale, and coordinate end-to-end workflows alongside the DataHarvest team.

Key Responsibilities

The selected candidate will design and support scalable scraping and data-harvesting pipelines, develop scrapers using Python-based tools, handle JavaScript-heavy sites, and work around anti-bot controls such as proxy rotation, user-agent rotation, rate limiting, and CAPTCHA challenges. The role also includes ETL development, large-scale data processing, workflow orchestration, storage management across SQL and NoSQL systems, and ensuring robust monitoring, logging, retries, and error handling. Compliance with robots.txt, site terms, and privacy rules is also part of the role, along with collaboration with technical and downstream stakeholders to meet quality and delivery expectations.

Technical Scope

  • Build and maintain web scraping and harvesting workflows that can operate reliably at scale.
  • Use Python scraping libraries and browser automation tools to extract data from static and dynamic web sources.
  • Work with REST and GraphQL APIs, including reverse-engineering internal endpoints when required.
  • Process and transform structured and semi-structured data using ETL methods and distributed computing.
  • Schedule and orchestrate pipelines using tools such as Apache Airflow, Dagster, Prefect, or similar systems.
  • Store, manage, and move data across PostgreSQL, MySQL, MongoDB, and standard file formats such as CSV, JSON, XML, and Parquet.
  • Set up monitoring, alerting, retries, and fail-safes to keep pipelines stable and recoverable.
  • Follow legal, policy, and data privacy requirements while collecting information from external sources.

Requirements

  • Strong Python development skills; Node.js or JavaScript knowledge is an added advantage.
  • Practical experience with scraping stacks such as Scrapy, BeautifulSoup, lxml, requests/httpx, Selenium, Playwright, or Puppeteer.
  • Good command of web technologies including HTML, CSS, DOM structure, XPath, CSS selectors, and HTTP concepts such as headers, cookies, sessions, and status codes.
  • Experience with JSON, XML, HTML parsing, ETL workflows, and common data formats.
  • Hands-on exposure to PySpark or similar distributed data-processing approaches.
  • Familiarity with Apache Airflow for orchestration; experience with Dagster, Prefect, or Luigi is a plus.
  • Working knowledge of SQL and NoSQL databases.
  • Understanding of concurrency, asynchronous programming, and distributed scraping for high-volume workloads.
  • Comfort with Git, Docker, cloud environments, and pipeline monitoring/alerting practices.
  • Awareness of compliance and legal considerations related to scraping and data collection.
  • Education in Computer Science, Engineering, or a related discipline, or equivalent hands-on experience.

Eligibility

Any graduate may apply. Candidates with a bachelor’s or master’s degree in Computer Science, Engineering, or a related area are preferred, though equivalent practical experience is also acceptable.

Compensation

The salary range offered for this role is INR 9,00,000 to INR 12,00,000 per year.

Additional Information

This position is based in Chennai, India, and is an in-office role. The listing shows one vacancy; no joining date or notice-period details are specified.
