Responsibilities
- Responsible for implementation and ongoing administration of Hadoop infrastructure.
- Aligning with the systems engineering team to propose and deploy new hardware and software environments required for Hadoop and to expand existing environments.
- Working with data delivery teams to set up new Hadoop users. This job includes setting up Linux users, setting up Kerberos principals and testing HDFS, Hive, Pig and MapReduce access for the new users.
- Cluster maintenance as well as creation and removal of nodes using tools like Ganglia, Nagios, Cloudera Manager Enterprise, Dell OpenManage and other tools.
- Performance tuning of Hadoop clusters and Hadoop MapReduce routines.
- Screen Hadoop cluster job performance and do capacity planning.
- Monitor Hadoop cluster connectivity and security.
- Manage and review Hadoop log files.
- File system management and monitoring.
- Diligently teaming with the infrastructure, network, database, application and business intelligence teams to guarantee high data quality and availability.
- Collaboration with application teams to install operating system and Hadoop updates, patches and version upgrades when required.

Qualifications
- Bachelor's Degree in Information Technology, Computer Science or other relevant fields.
- General operational expertise such as good troubleshooting skills and an understanding of system capacity, bottlenecks, and the basics of memory, CPU, OS, storage and networks.
- Hadoop skills like HBase, Hive, Pig, Mahout.
- Ability to deploy a Hadoop cluster, add and remove nodes, keep track of jobs, monitor critical parts of the cluster, configure NameNode high availability, schedule and configure it, and take backups.
- Good knowledge of Linux, as Hadoop runs on Linux.
- Familiarity with open source configuration management and deployment tools such as Puppet or Chef, and Linux scripting.

Nice to Have
- Knowledge of troubleshooting core Java applications is a plus.
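The user-onboarding responsibility above (Linux account, Kerberos principal, HDFS access test) can be sketched as a short script. This is a minimal illustration, not a definitive procedure: the realm, keytab path and quota are hypothetical, and the exact commands vary by distribution.

```python
# Sketch of the new-Hadoop-user onboarding steps described above, assuming a
# Kerberos-enabled cluster. Realm, keytab path and quota are hypothetical.

def onboarding_commands(username, realm="EXAMPLE.COM", quota="100g"):
    """Return the shell commands an admin would typically run for a new user."""
    return [
        # 1. Create the Linux account on the gateway/edge node
        f"useradd -m {username}",
        # 2. Create a Kerberos principal and export its keytab
        f"kadmin.local -q 'addprinc -randkey {username}@{realm}'",
        f"kadmin.local -q 'xst -k /etc/security/keytabs/{username}.keytab {username}@{realm}'",
        # 3. Create the user's HDFS home directory, set ownership and a space quota
        f"hdfs dfs -mkdir /user/{username}",
        f"hdfs dfs -chown {username}:{username} /user/{username}",
        f"hdfs dfsadmin -setSpaceQuota {quota} /user/{username}",
        # 4. Smoke-test HDFS access as the new user (Hive/Pig/MapReduce tests follow)
        f"sudo -u {username} hdfs dfs -put /etc/hosts /user/{username}/smoke_test",
    ]

for cmd in onboarding_commands("alice"):
    print(cmd)
```

Generating the command list first (rather than executing directly) also makes it easy to review or log the steps before running them.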
We are looking for an outstanding Big Data Engineer with experience setting up and maintaining data warehouses and data lakes for an organization. This role will collaborate closely with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.

Roles and Responsibilities:
- Develop and maintain scalable data pipelines and build out new integrations and processes required for optimal extraction, transformation and loading of data from a wide variety of data sources using 'Big Data' technologies.
- Develop programs in Scala and Python as part of data cleaning and processing.
- Assemble large, complex data sets that meet functional/non-functional business requirements, fostering data-driven decision making across the organization.
- Design and develop distributed, high-volume, high-velocity multi-threaded event processing systems.
- Implement processes and systems to validate data and monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes that depend on it.
- Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Provide high operational excellence, guaranteeing high availability and platform stability.
- Closely collaborate with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.

Skills:
- Experience with Big Data pipelines, Big Data analytics and data warehousing.
- Experience with SQL/NoSQL, schema design and dimensional data modeling.
- Strong understanding of Hadoop architecture and the HDFS ecosystem, and experience with a Big Data technology stack such as HBase, Hadoop, Hive, MapReduce.
- Experience in designing systems that process structured as well as unstructured data at large scale.
- Experience in AWS/Spark/Java/Scala/Python development.
- Strong skills in PySpark (Python and Spark); ability to create, manage and manipulate Spark DataFrames.
- Expertise in Spark query tuning and performance optimization.
- Experience in developing efficient software code/frameworks for multiple use cases leveraging Python and big data technologies.
- Prior exposure to streaming data sources such as Kafka.
- Knowledge of shell scripting and Python scripting.
- High proficiency in database skills (e.g., complex SQL) for data preparation, cleaning and data wrangling/munging, with the ability to write advanced queries and create stored procedures.
- Experience with NoSQL databases such as Cassandra/MongoDB.
- Solid experience in all phases of the software development lifecycle: plan, design, develop, test, release, maintain and support, decommission.
- Experience with DevOps tools (GitHub, Travis CI and JIRA) and methodologies (Lean, Agile, Scrum, Test-Driven Development).
- Experience building and deploying applications on on-premise and cloud-based infrastructure.
- A good understanding of the machine learning landscape and concepts.

Qualifications and Experience: Engineering and postgraduate candidates, preferably in Computer Science from premier institutions, with 3-5 years of proven work experience as a Big Data Engineer or in a similar role.

Certifications (good to have at least one):
- AZ-900 - Azure Fundamentals
- DP-200, DP-201, DP-203, AZ-204 - Data Engineering
- AZ-400 - DevOps Certification
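The SQL data preparation and wrangling skills listed in this role can be illustrated with a small self-contained sketch using the stdlib sqlite3 module as a stand-in for a production warehouse; the table and column names are invented for the example.

```python
# A minimal sketch of SQL-based data wrangling: trim messy strings, default
# missing values, and deduplicate, keeping the most recently loaded row.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_orders (order_id INT, customer TEXT, amount REAL, loaded_at INT);
INSERT INTO raw_orders VALUES
  (1, ' alice ', 10.0, 1),
  (1, 'alice',   12.5, 2),   -- later reload of the same order wins
  (2, 'bob',     NULL, 1);   -- missing amount to be defaulted
""")

# Cleaning query: trim names, default NULL amounts to 0.0,
# and keep only the most recently loaded row per order_id.
rows = conn.execute("""
SELECT order_id,
       TRIM(customer)        AS customer,
       COALESCE(amount, 0.0) AS amount
FROM raw_orders r
WHERE loaded_at = (SELECT MAX(loaded_at)
                   FROM raw_orders
                   WHERE order_id = r.order_id)
ORDER BY order_id
""").fetchall()
print(rows)  # [(1, 'alice', 12.5), (2, 'bob', 0.0)]
```

The same dedupe-by-latest-load pattern appears frequently in pipeline landing zones, whatever the database engine.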
Must be a Looker certified LookML Developer (it is very important).

As a Looker Specialist at Searce, you will:
- Work with the business team to identify the best technical solution for a given problem or scenario in the Looker platform
- Design and develop LookML models, explores, dashboards and workflows in the Looker platform
- Utilize established processes and best practices in designing solutions, or blaze new trails and drive organization-wide data quality methodologies
- Help create and maintain coding standards (style guides, etc.) and perform code reviews and technical analysis
- Write SQL, build derived tables, and work with data engineering to ensure queries are performant and maintainable
- Ensure the project:model:view structure is optimized, and re-architect as needed for growing business use cases
- Build and maintain monitoring dashboards for the Looker platform
- Optimize performance through re-architecting as needed
- Partner with data engineering on larger projects to optimize the data warehouse layer for Looker performance

Sounds like you? What we are looking for:
- Looker certification is a must-have (it is very important)
- Minimum 1-2 years of experience using Looker and developing LookML
- Experience in dashboard building, including building LookML from scratch
- Excellent SQL skills and advanced proficiency in LookML
- Previous experience building scalable BI platforms, including the use of code review processes, automated testing, staging and production environments, and release schedules, is preferred
- Experience with cloud database platforms preferred
- Working knowledge of scripting languages to allow scheduling of BI dashboards
- Knowledge of software development processes and best practices

What You Can Expect From Us
You'll join an entrepreneurial, inclusive culture. One where we succeed together, across the desk and around the globe.
Where like-minded people work naturally together to achieve great things. Our Total Rewards program reflects our commitment to helping you achieve your ambitions in career, recognition, well-being, benefits and pay. Join us to develop your strengths and enjoy a fulfilling career full of varied experiences. Keep those ambitions in sight and imagine where Searce can take you. Apply today!
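For context on the LookML development work this role describes, a small view definition looks like the following. This is a hypothetical sketch with invented table and field names, not a Searce artifact.

```lookml
# Hypothetical LookML view: a dimension, a time dimension group, and a measure.
view: orders {
  sql_table_name: analytics.orders ;;

  dimension: order_id {
    primary_key: yes
    type: number
    sql: ${TABLE}.order_id ;;
  }

  dimension_group: created {
    type: time
    timeframes: [date, week, month]
    sql: ${TABLE}.created_at ;;
  }

  measure: total_revenue {
    type: sum
    sql: ${TABLE}.amount ;;
    value_format_name: usd
  }
}
```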
- 3+ years of experience in practical implementation and deployment of ML-based systems preferred
- BE/B.Tech or M.Tech (preferred) in CS/Engineering with a strong mathematical/statistical background
- Strong mathematical and analytical skills, especially statistical and ML techniques, with familiarity with different supervised and unsupervised learning algorithms
- Implementation experience and deep knowledge of classification, time series analysis, pattern recognition, reinforcement learning, deep learning, dynamic programming and optimisation
- Experience working on modeling graph structures related to spatiotemporal systems
- Programming skills in Python
- Experience in developing and deploying on cloud (AWS, Google or Azure)
- Good verbal and written communication skills
- Familiarity with well-known ML frameworks and libraries such as Pandas, Keras, TensorFlow
BitClass is looking to hire Senior Data Engineers to drive a data-driven decision-making culture in the organization by helping set up data pipelines to organize and monitor data across all systems. Right from interactive live classes to the content feed and community chat, there are hundreds of moving parts on the BitClass platform contributing to hundreds of thousands of data points, which need to be channelled and organized to derive insights for business decisions.

Data Engineer Job Responsibilities:
- Develops and maintains scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity.
- Collaborates with analytics and business teams to improve data models that feed business intelligence tools, increasing data accessibility and fostering data-driven decision making across the organization.
- Implements processes and systems to monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes that depend on it.
- Writes unit/integration tests, contributes to the engineering wiki, and documents work.
- Performs data analysis required to troubleshoot data-related issues and assists in their resolution.
- Works closely with a team of frontend and backend engineers, product managers, and analysts.
- Defines company data assets (data models) and the jobs that populate them.
- Designs data integrations and the data quality framework.
- Designs and evaluates open source and vendor tools for data lineage.
- Works closely with all business units and engineering teams to develop a strategy for long-term data platform architecture.
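The data-quality monitoring responsibility above usually starts with rule-based validation of incoming records. A minimal sketch, with illustrative field names and rules (the event types echo the live-class/feed/chat parts mentioned in the posting, but are invented):

```python
# Validate incoming event records against simple rules before they reach
# production tables. Field names and allowed event types are hypothetical.

ALLOWED_EVENTS = {"class_start", "chat_message", "feed_view"}

def validate_record(record):
    """Return a list of rule violations for one event record (empty = clean)."""
    errors = []
    if not record.get("user_id"):
        errors.append("missing user_id")
    if record.get("event_type") not in ALLOWED_EVENTS:
        errors.append(f"unknown event_type: {record.get('event_type')}")
    if not isinstance(record.get("ts"), int) or record["ts"] <= 0:
        errors.append("invalid timestamp")
    return errors

batch = [
    {"user_id": "u1", "event_type": "class_start", "ts": 1700000000},
    {"user_id": "",   "event_type": "feed_view",   "ts": 1700000001},
    {"user_id": "u2", "event_type": "typo_event",  "ts": -5},
]
# Map batch index -> violations, keeping only failing records.
bad = {i: validate_record(r) for i, r in enumerate(batch) if validate_record(r)}
print(bad)
```

In production such checks typically run inside the pipeline and feed alerting rather than a print statement.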
Required Skills:
- BS or MS degree in Computer Science or a related technical field
- 3+ years of Python or Java development experience
- 3+ years of SQL experience (NoSQL experience is a plus)
- 3+ years of experience with schema design and dimensional data modeling
- Ability to manage and communicate data warehouse plans to internal clients
- Experience designing, building, and maintaining data processing systems
- Experience working with either a MapReduce or an MPP system at any size/scale
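The dimensional data modeling requirement above refers to star-schema design: a fact table of measures keyed into descriptive dimension tables. A small sketch using stdlib sqlite3, with a generic invented schema:

```python
# Star schema in miniature: two dimensions, one fact table, one analytic query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "who/when" of each event.
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE dim_date     (date_key INTEGER PRIMARY KEY, day TEXT, month TEXT);
-- The fact table holds measures plus foreign keys into the dimensions.
CREATE TABLE fact_sales (
  customer_key INTEGER REFERENCES dim_customer(customer_key),
  date_key     INTEGER REFERENCES dim_date(date_key),
  amount       REAL
);
INSERT INTO dim_customer VALUES (1, 'alice', 'EU'), (2, 'bob', 'US');
INSERT INTO dim_date VALUES (10, '2024-01-01', '2024-01'), (11, '2024-01-02', '2024-01');
INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 11, 50.0), (2, 10, 75.0);
""")

# A typical analytic query: join facts to a dimension and aggregate.
result = conn.execute("""
SELECT c.region, SUM(f.amount)
FROM fact_sales f
JOIN dim_customer c USING (customer_key)
GROUP BY c.region
ORDER BY c.region
""").fetchall()
print(result)  # [('EU', 150.0), ('US', 75.0)]
```

The same shape scales up: dimensions stay comparatively small and descriptive, while the fact table grows with event volume.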
We are looking to add a Lead Big Data Engineer to our team. We would love to connect, tell you more about what we're building, and learn more about what interests you.

Company Name: PayU - kindly visit our company site for more details.
Location: Bangalore, Mumbai & Gurgaon

Role and Background Information: As a Lead/Manager - Data Engineering, you will execute the strategy and roadmap for data engineering at PayU. You will define, evolve and mature PayU's existing data platform and processing framework that handles terabytes of data from our various properties. You will work with a team of data engineers, analytics and data science teams to identify needs and contribute towards building the next-generation data ecosystem. You will engage with analysts and leaders to research and develop new data engineering capabilities.
- Responsible for ingesting data from files, streams and databases, processing it with Python and PySpark, and storing it in a time-series database.
- Develop programs in Python as part of data extraction, data cleaning, transformation and processing.
- Develop and maintain scalable data pipelines.
- REST API development.

What you'd need to bring to the table:
- Overall 5-10 years of experience as a Data Engineer.
- Advanced working SQL knowledge to create complex queries.
- Experience working with time-series databases and relational databases, as well as working familiarity with a variety of databases (structured and unstructured).
- Hands-on experience with visualization tools like Grafana & Power BI.
- Experience in designing and implementing scalable architecture.
- Good experience doing object-oriented programming in Python; very strong in Object-Oriented Analysis and Design (OOAD).
- Strong knowledge of REST APIs.
- Experience working with Azure Cloud services (IaaS/PaaS).
- Hands-on experience working with Microsoft Azure services like ADLS/Blob Storage solutions, Event Hubs, Service Bus, scale sets, Load Balancers, Azure Functions, Databricks.
- Hands-on experience working with Kafka.
- Knowledge of continuous integration/continuous deployment.
- Experience with data migration and deployment from on-prem to cloud environments and vice versa.
- Individual contributor role.

Education: Bachelor of Engineering (IIT/NIT/BITS preferred)
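The ingest-then-store-as-time-series step described in this role typically involves bucketing raw event timestamps into fixed windows before writing to the time-series store. A minimal pure-Python sketch, with illustrative data and a 60-second window:

```python
# Aggregate (unix_ts, value) event pairs into per-window sums, the shape most
# time-series databases expect. Window size and sample data are illustrative.
from collections import defaultdict

def bucket_by_window(events, window_s=60):
    """Sum values into fixed-size time windows keyed by window start."""
    buckets = defaultdict(float)
    for ts, value in events:
        buckets[ts - ts % window_s] += value  # round ts down to window start
    return dict(sorted(buckets.items()))

events = [(5, 1.0), (42, 2.0), (65, 3.0)]
print(bucket_by_window(events))  # {0: 3.0, 60: 3.0}
```

In a real pipeline the same aggregation would usually be expressed in PySpark or in the time-series database's own downsampling layer; the windowing logic is identical.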
Data Engineer
• Drive the data engineering implementation
• Strong experience in building data pipelines
• AWS stack experience is a must
• Deliver conceptual, logical and physical data models for the implementation teams
• SQL stronghold is a must: advanced SQL working knowledge and experience working with a variety of relational databases, SQL query authoring
• AWS Cloud data pipeline experience is a must: data pipelines and data-centric applications using distributed storage platforms like S3 and distributed processing platforms like Spark, Airflow, Kafka
• Working knowledge of AWS technologies such as S3, EC2, EMR, RDS, Lambda, Elasticsearch
• Ability to use a major programming language (e.g. Python/Java) to process data for modelling
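At its core, the Airflow-style pipeline orchestration mentioned above means running tasks in dependency order. A minimal sketch using the stdlib graphlib module, with hypothetical task names standing in for real S3/Spark steps:

```python
# Order pipeline tasks by their dependencies (what Airflow's scheduler does at
# DAG level). Task names are invented stand-ins for real pipeline steps.
from graphlib import TopologicalSorter

# task -> set of tasks it depends on
pipeline = {
    "extract_s3":      set(),
    "transform_spark": {"extract_s3"},
    "load_warehouse":  {"transform_spark"},
    "publish_report":  {"load_warehouse"},
}
order = list(TopologicalSorter(pipeline).static_order())
print(order)  # ['extract_s3', 'transform_spark', 'load_warehouse', 'publish_report']
```

An orchestrator adds retries, scheduling and parallelism on top, but the dependency graph is the underlying data structure.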
- Hands-on programming expertise in Java or Python
- Strong production experience with Spark (minimum of 1-2 years)
- Experience building data pipelines using Big Data technologies (Hadoop, Spark, Kafka, etc.) on large-scale unstructured data sets
- Working experience and good understanding of public cloud environments (AWS, Azure or Google Cloud)
- Experience with IAM policy and role management is a plus
- 5+ years of experience in a Data Engineer role
- Graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases such as Cassandra
- Experience with AWS cloud services: EC2, EMR, Athena
- Experience with object-oriented/object-function scripting languages: Python, Java, C++, Scala, etc.
- Advanced SQL knowledge and experience working with relational databases and query authoring (SQL), as well as familiarity with unstructured datasets
- Deep problem-solving skills to perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
A data engineer with AWS Cloud infrastructure experience to join our Big Data Operations team. This role will provide advanced operations support, contribute to automation and system improvements, and work directly with enterprise customers to provide excellent customer service.

The candidate:
1. Must have very good hands-on technical experience of 3+ years with Java or Python
2. Working experience and good understanding of AWS Cloud; advanced experience with IAM policy and role management
3. Infrastructure operations: 5+ years supporting systems infrastructure operations, upgrades, deployments using Terraform, and monitoring
4. Hadoop: experience with Hadoop (Hive, Spark, Sqoop) and/or AWS EMR
5. Knowledge of PostgreSQL/MySQL/DynamoDB backend operations
6. DevOps: experience with DevOps automation - orchestration/configuration management and CI/CD tools (Jenkins)
7. Version control: working experience with one or more version control platforms like GitHub or GitLab
8. Knowledge of AWS QuickSight reporting
9. Monitoring: hands-on experience with monitoring tools such as AWS CloudWatch, AWS CloudTrail, Datadog and Elasticsearch
10. Networking: working knowledge of TCP/IP networking, SMTP, HTTP, load balancers (ELB) and high-availability architecture
11. Security: experience implementing role-based security, including AD integration, security policies, and auditing in a Linux/Hadoop/AWS environment; familiar with penetration testing and scan tools for remediation of security vulnerabilities
12. Demonstrated successful experience learning new technologies quickly

WHAT WILL BE THE ROLES AND RESPONSIBILITIES?
1. Create procedures/runbooks for operational and security aspects of the AWS platform
2. Improve AWS infrastructure by developing and enhancing automation methods
3. Provide advanced business and engineering support services to end users
4. Lead other admins and platform engineers through design and implementation decisions to achieve balance between strategic design and tactical needs
5. Research and deploy new tools and frameworks to build a sustainable big data platform
6. Assist with creating programs for training and onboarding for new end users
7. Lead Agile/Kanban workflows and team process work
8. Troubleshoot issues to resolve problems
9. Provide status updates to the Operations product owner and stakeholders
10. Track all details in the issue tracking system (JIRA)
11. Provide issue review and triage problems for new service/support requests
12. Use DevOps automation tools, including Jenkins build jobs
13. Fulfil any ad-hoc data or report request queries from different functional groups