11+ AWS Simple Queuing Service (SQS) Jobs in Pune | AWS Simple Queuing Service (SQS) Job openings in Pune
Apply to 11+ AWS Simple Queuing Service (SQS) Jobs in Pune on CutShort.io. Explore the latest AWS Simple Queuing Service (SQS) Job opportunities across top companies like Google, Amazon & Adobe.
- Data Engineer
Required skill set: AWS GLUE, AWS LAMBDA, AWS SNS/SQS, AWS ATHENA, SPARK, SNOWFLAKE, PYTHON
- Experience in AWS Glue
- Experience in Apache Parquet
- Proficient in AWS S3 and data lake
- Knowledge of Snowflake
- Understanding of file-based ingestion best practices.
- Scripting language - Python & pyspark
- Create and manage cloud resources in AWS
- Data ingestion from different data sources that expose data using different technologies, such as RDBMS, REST HTTP APIs, flat files, streams, and time series data based on various proprietary systems. Implement data ingestion and processing with the help of Big Data technologies
- Data processing/transformation using various technologies such as Spark and Cloud Services. You will need to understand your part of business logic and implement it using the language supported by the base data platform
- Develop automated data quality checks to make sure the right data enters the platform and to verify the results of the calculations
- Develop an infrastructure to collect, transform, combine and publish/distribute customer data.
- Define process improvement opportunities to optimize data collection, insights and displays.
- Ensure data and results are accessible, scalable, efficient, accurate, complete and flexible
- Identify and interpret trends and patterns from complex data sets
- Construct a framework utilizing data visualization tools and techniques to present consolidated analytical and actionable results to relevant stakeholders.
- Key participant in regular Scrum ceremonies with the agile teams
- Proficient at developing queries, writing reports and presenting findings
- Mentor junior members and bring best industry practices
- 5-7+ years’ experience as data engineer in consumer finance or equivalent industry (consumer loans, collections, servicing, optional product, and insurance sales)
- Strong background in math, statistics, computer science, data science or related discipline
- Advanced knowledge of at least one language: Java, Scala, Python, C#
- Production experience with: HDFS, YARN, Hive, Spark, Kafka, Oozie / Airflow, Amazon Web Services (AWS), Docker / Kubernetes, Snowflake
- Proficient with
- Data mining/programming tools (e.g. SAS, SQL, R, Python)
- Database technologies (e.g. PostgreSQL, Redshift, Snowflake, and Greenplum)
- Data visualization (e.g. Tableau, Looker, MicroStrategy)
- Comfortable learning about and deploying new technologies and tools.
- Organizational skills and the ability to handle multiple projects and priorities simultaneously and meet established deadlines.
- Good written and oral communication skills and ability to present results to non-technical audiences
- Knowledge of business intelligence and analytical tools, technologies and techniques.
Familiarity and experience in the following is a plus:
- AWS certification
- Spark Streaming
- Kafka Streaming / Kafka Connect
- ELK Stack
- Cassandra / MongoDB
- CI/CD: Jenkins, GitLab, Jira, Confluence, and other related tools
- Design, build & test ETL processes using Python & SQL for the corporate data warehouse
- Inform, influence, support, and execute our product decisions
- Maintain advertising data integrity by working closely with R&D to organize and store data in a format that provides accurate data and allows the business to quickly identify issues.
- Evaluate and prototype new technologies in the area of data processing
- Think quickly, communicate clearly and work collaboratively with product, data, engineering, QA and operations teams
- High energy level, strong team player and good work ethic
- Data analysis, understanding of business requirements and translation into logical pipelines & processes
- Identification, analysis & resolution of production & development bugs
- Support the release process including completing & reviewing documentation
- Configure data mappings & transformations to orchestrate data integration & validation
- Provide subject matter expertise
- Document solutions, tools & processes
- Create & support test plans with hands-on testing
- Peer reviews of work developed by other data engineers within the team
- Establish good working relationships & communication channels with relevant departments
Skills and Qualifications we look for
- University degree 2.1 or higher (or equivalent) in a relevant subject. Master’s degree in any data subject will be a strong advantage.
- 4 - 6 years experience with data engineering.
- Strong coding ability and software development experience in Python.
- Strong hands-on experience with SQL and Data Processing.
- Google Cloud Platform (Cloud Composer, Dataflow, Cloud Functions, BigQuery, Cloud Storage, Dataproc)
- Good working experience in any one of the ETL tools (Airflow would be preferable).
- Should possess strong analytical and problem solving skills.
- Good to have skills - PySpark, CircleCI, Terraform
- Motivated, self-directed, able to work with ambiguity and interested in emerging technologies, agile and collaborative processes.
- Understanding & experience of agile / scrum delivery methodology
Graas is a “Growth-as-a-Service” technology solution provider that uses predictive AI to turbo-charge growth for eCommerce businesses. Graas integrates traditional data silos and applies a machine-learning AI engine, acting as an in-house data scientist to predict trends and give real-time insights and actionable recommendations for brands. The platform can also turn insights into action by seamlessly executing these recommendations across marketplace store fronts, brand.coms, social and conversational commerce, performance marketing, inventory management, warehousing, and last mile logistics, all of which impact a brand’s bottom line, driving profitable growth.
Roles & Responsibilities:
Work on implementation of real-time and batch data pipelines for disparate data sources.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS technologies.
- Build and maintain an analytics layer that utilizes the underlying data to generate dashboards and provide actionable insights.
- Identify improvement areas in the current data system and implement optimizations.
- Work on specific areas of data governance including metadata management and data quality management.
- Participate in discussions with Product Management and Business stakeholders to understand functional requirements and interact with other cross-functional teams as needed to develop, test, and release features.
- Develop Proof-of-Concepts to validate new technology solutions or advancements.
- Work in an Agile Scrum team and help with planning, scoping and creation of technical solutions for the new product capabilities, through to continuous delivery to production.
- Work on building intelligent systems using various AI/ML algorithms.
- Must have worked on Analytics Applications involving Data Lakes, Data Warehouses and Reporting Implementations.
- Experience with private and public cloud architectures with pros/cons.
- Ability to write robust code in Python and SQL for data processing. Experience in libraries such as Pandas is a must; knowledge of one of the frameworks such as Django or Flask is a plus.
- Experience in implementing data processing pipelines using AWS services: Kinesis, Lambda, Redshift/Snowflake, RDS.
- Knowledge of Kafka, Redis is preferred
- Experience on design and implementation of real-time and batch pipelines. Knowledge of Airflow is preferred.
- Familiarity with machine learning frameworks (like Keras or PyTorch) and libraries (like scikit-learn)
We are looking for a Snowflake developer for one of our premium clients for their PAN India location
- 9 years and above of total experience, preferably in the big data space.
- Creating spark applications using Scala to process data.
- Experience in scheduling and troubleshooting/debugging Spark jobs in steps.
- Experience in spark job performance tuning and optimizations.
- Should have experience in processing data using Kafka/Python.
- Individual should have experience and understanding in configuring Kafka topics to optimize the performance.
- Should be proficient in writing SQL queries to process data in Data Warehouse.
- Hands on experience in working with Linux commands to troubleshoot/debug issues and creating shell scripts to automate tasks.
- Experience on AWS services like EMR.
good exposure to concepts and/or technology across the broader spectrum. Enterprise Risk Technology covers a variety of existing systems and green-field projects.
Full-stack Hadoop development experience with Scala development
Full-stack Java development experience covering Core Java (including JDK 1.8) and a good understanding of design patterns.
• Strong hands-on development in Java technologies.
• Strong hands-on development in Hadoop technologies like Spark, Scala and experience on Avro.
• Participation in product feature design and documentation
• Requirement break-up, ownership and implementation.
• Product BAU deliveries and Level 3 production defects fixes.
Qualifications & Experience
• Degree holder in numerate subject
• Hands on Experience on Hadoop, Spark, Scala, Impala, Avro and messaging like Kafka
• Experience across a core compiled language – Java
• Proficiency in Java related frameworks like Spring, Hibernate, JPA
• Hands on experience in JDK 1.8 and strong skillset covering Collections, Multithreading with experience working on Distributed applications.
• Strong hands-on development track record with end-to-end development cycle involvement
• Good exposure to computational concepts
• Good communication and interpersonal skills
• Working knowledge of risk and derivatives pricing (optional)
• Proficiency in SQL (PL/SQL), data modelling.
• Understanding of Hadoop architecture and the Scala programming language is good to have.
We are hiring for Senior Data Architect for a reputed company
Experience required- 10-19 yrs
Skills required- Hands-on experience with Kafka, stored procedures, and Snowflake.
We’re hiring a talented Data Engineer and Big Data enthusiast to work in our platform to help ensure that our data quality is flawless. As a company, we have millions of new data points every day that come into our system. You will be working with a passionate team of engineers to solve challenging problems and ensure that we can deliver the best data to our customers, on-time. You will be using the latest cloud data warehouse technology to build robust and reliable data pipelines.
Exceptional candidates will have:
- Sr. Data Engineer:
Core Skills – Data Engineering, Big Data, Pyspark, Spark SQL and Python
Candidate with prior Palantir Cloud Foundry OR Clinical Trial Data Model background is preferred
- Responsible for Data Engineering, Foundry Data Pipeline Creation, Foundry Analysis & Reporting, Slate Application development, re-usable code development & management and Integrating Internal or External System with Foundry for data ingestion with high quality.
- Have a good understanding of the Foundry Platform landscape and its capabilities
- Performs data analysis required to troubleshoot data related issues and assist in the resolution of data issues.
- Defines company data assets (data models), Pyspark, spark SQL, jobs to populate data models.
- Designs data integrations and data quality framework.
- Design & Implement integration with Internal, External Systems, F1 AWS platform using Foundry Data Connector or Magritte Agent
- Collaboration with data scientists, data analysts and technology teams to document and leverage their understanding of the Foundry integration with different data sources
- Actively participate in agile work practices
- Coordinating with Quality Engineer to ensure that all quality controls, naming conventions & best practices have been followed
Desired Candidate Profile :
- Strong data engineering background
- Experience with Clinical Data Model is preferred
- Experience in
- SQL Server, Postgres, Cassandra, Hadoop, and Spark for distributed data storage and parallel computing
- Java and Groovy for our back-end applications and data integration tools
- Python for data processing and analysis
- Cloud infrastructure based on AWS EC2 and S3
- 7+ years IT experience, 2+ years’ experience in Palantir Foundry Platform, 4+ years’ experience in Big Data platform
- 5+ years of Python and Pyspark development experience
- Strong troubleshooting and problem solving skills
- BTech or master's degree in computer science or a related technical field
- Experience designing, building, and maintaining big data pipelines systems
- Hands-on experience on Palantir Foundry Platform and Foundry custom Apps development
- Able to design and implement data integration between Palantir Foundry and external Apps based on Foundry data connector framework
- Hands-on in programming languages primarily Python, R, Java, Unix shell scripts
- Hands-on experience in AWS / Azure cloud platform and stack
- Strong in API based architecture and concept, able to do quick PoC using API integration and development
- Knowledge of machine learning and AI
- Skill and comfort working in a rapidly changing environment with dynamic objectives and iteration with users.
- Demonstrated ability to continuously learn, work independently, and make decisions with minimal supervision
- Data Steward :
The Data Steward will collaborate and work closely within the group software engineering and business division. The Data Steward has overall accountability for the group's/division's overall data and reporting posture by responsibly managing data assets, data lineage, and data access, supporting sound data analysis. This role requires focus on data strategy, execution, and support for projects, programs, application enhancements, and production data fixes. Makes well-thought-out decisions on complex or ambiguous data issues and establishes the data stewardship and information management strategy and direction for the group. Effectively communicates to individuals at various levels of the technical and business communities. This individual will become part of the corporate Data Quality and Data management/entity resolution team supporting various systems across the board.
- Responsible for data quality and data accuracy across all group/division delivery initiatives.
- Responsible for data analysis, data profiling, data modeling, and data mapping capabilities.
- Responsible for reviewing and governing data queries and DML.
- Accountable for the assessment, delivery, quality, accuracy, and tracking of any production data fixes.
- Accountable for the performance, quality, and alignment to requirements for all data query design and development.
- Responsible for defining standards and best practices for data analysis, modeling, and queries.
- Responsible for understanding end-to-end data flows and identifying data dependencies in support of delivery, release, and change management.
- Responsible for the development and maintenance of an enterprise data dictionary that is aligned to data assets and the business glossary for the group.
- Responsible for the definition and maintenance of the group's data landscape including overlays with the technology landscape, end-to-end data flow/transformations, and data lineage.
- Responsible for rationalizing the group's reporting posture through the definition and maintenance of a reporting strategy and roadmap.
- Partners with the data governance team to ensure data solutions adhere to the organization’s data principles and guidelines.
- Owns group's data assets including reports, data warehouse, etc.
- Understand customer business use cases and be able to translate them to technical specifications and vision on how to implement a solution.
- Accountable for defining the performance tuning needs for all group data assets and managing the implementation of those requirements within the context of group initiatives as well as steady-state production.
- Partners with others in test data management and masking strategies and the creation of a reusable test data repository.
- Responsible for solving data-related issues and communicating resolutions with other solution domains.
- Actively and consistently support all efforts to simplify and enhance the Clinical Trial Prediction use cases.
- Apply knowledge in analytic and statistical algorithms to help customers explore methods to improve their business.
- Contribute toward analytical research projects through all stages including concept formulation, determination of appropriate statistical methodology, data manipulation, research evaluation, and final research report.
- Visualize and report data findings creatively in a variety of visual formats that appropriately provide insight to the stakeholders.
- Achieve defined project goals within customer deadlines; proactively communicate status and escalate issues as needed.
- Strong understanding of the Software Development Life Cycle (SDLC) with Agile Methodologies
- Knowledge and understanding of industry-standard/best practices requirements gathering methodologies.
- Knowledge and understanding of Information Technology systems and software development.
- Experience with data modeling and test data management tools.
- Experience in data integration projects
- Good problem solving & decision-making skills.
- Good communication skills within the team, site, and with the customer
Knowledge, Skills and Abilities
- Technical expertise in data architecture principles and design aspects of various DBMS and reporting concepts.
- Solid understanding of key DBMS platforms like SQL Server, Azure SQL
- Results-oriented, diligent, and works with a sense of urgency. Assertive, responsible for his/her own work (self-directed), have a strong affinity for defining work in deliverables, and be willing to commit to deadlines.
- Experience in MDM tools like MS DQ, SAS DM Studio, Tamr, Profisee, Reltio etc.
- Experience in Report and Dashboard development
- Statistical and Machine Learning models
- Python (sklearn, numpy, pandas, gensim)
- Nice to Have:
- 1yr of ETL experience
- Natural Language Processing
- Neural networks and Deep learning
- Experience in Keras, TensorFlow, spaCy, NLTK, and LightGBM Python libraries
Interaction : Frequently interacts with subordinate supervisors.
Education : Bachelor’s degree, preferably in Computer Science, B.E or other quantitative field related to the area of assignment. Professional certification related to the area of assignment may be required
Experience : 7 years of Pharmaceutical /Biotech/life sciences experience, 5 years of Clinical Trials experience and knowledge, Excellent Documentation, Communication, and Presentation Skills including PowerPoint
Role and Responsibilities
- Execute data mining projects, training and deploying models over a typical duration of 2 -12 months.
- The ideal candidate should be able to innovate, analyze the customer requirement, develop a solution in the time box of the project plan, execute and deploy the solution.
- Integrate the data mining projects as embedded data mining applications in the FogHorn platform (on Docker or Android).
Candidates must meet ALL of the following qualifications:
- Have analyzed, trained and deployed at least three data mining models in the past. If the candidate did not directly deploy their own models, they will have worked with others who have put their models into production. The models should have been validated as robust over at least an initial time period.
- Three years of industry work experience, developing data mining models which were deployed and used.
- Programming experience in Python is core using data mining related libraries like Scikit-Learn. Other relevant Python mining libraries include NumPy, SciPy and Pandas.
- Data mining algorithm experience in at least 3 algorithms across: prediction (statistical regression, neural nets, deep learning, decision trees, SVM, ensembles), clustering (k-means, DBSCAN or other) or Bayesian networks
Any of the following extra qualifications will make a candidate more competitive:
- Soft Skills
- Sets expectations, develops project plans and meets expectations.
- Experience adapting technical dialogue to the right level for the audience (i.e. executives) or specific jargon for a given vertical market and job function.
- Technical skills
- Commonly, candidates have a MS or Ph.D. in Computer Science, Math, Statistics or an engineering technical discipline. BS candidates with experience are considered.
- Have managed past models in production over their full life cycle until model replacement is needed. Have developed automated model refreshing on newer data. Have developed frameworks for model automation as a prototype for product.
- Training or experience in Deep Learning, such as TensorFlow, Keras, convolutional neural networks (CNN) or Long Short Term Memory (LSTM) neural network architectures. If you don’t have deep learning experience, we will train you on the job.
- Shrinking deep learning models, optimizing to speed up execution time of scoring or inference.
- OpenCV or other image processing tools or libraries
- Cloud computing: Google Cloud, Amazon AWS or Microsoft Azure. We have integration with Google Cloud and are working on other integrations.
- Experience with decision-tree ensembles like XGBoost or Random Forests is helpful.
- Complex Event Processing (CEP) or other streaming data as a data source for data mining analysis
- Time series algorithms from ARIMA to LSTM to Digital Signal Processing (DSP).
- Bayesian Networks (BN), a.k.a. Bayesian Belief Networks (BBN) or Graphical Belief Networks (GBN)
- Experience with PMML is of interest (see www.DMG.org).
- Vertical experience in Industrial Internet of Things (IoT) applications:
- Energy: Oil and Gas, Wind Turbines
- Manufacturing: Motors, chemical processes, tools, automotive
- Smart Cities: Elevators, cameras on population or cars, power grid
- Transportation: Cars, truck fleets, trains
About FogHorn Systems
FogHorn is a leading developer of “edge intelligence” software for industrial and commercial IoT application solutions. FogHorn’s Lightning software platform brings the power of advanced analytics and machine learning to the on-premise edge environment enabling a new class of applications for advanced monitoring and diagnostics, machine performance optimization, proactive maintenance and operational intelligence use cases. FogHorn’s technology is ideally suited for OEMs, systems integrators and end customers in manufacturing, power and water, oil and gas, renewable energy, mining, transportation, healthcare, retail, as well as Smart Grid, Smart City, Smart Building and connected vehicle applications.
- 2019 Edge Computing Company of the Year – Compass Intelligence
- 2019 Internet of Things 50: 10 Coolest Industrial IoT Companies – CRN
- 2018 IoT Platforms Leadership Award & Edge Computing Excellence – IoT Evolution World Magazine
- 2018 10 Hot IoT Startups to Watch – Network World. (Gartner estimated 20 billion connected things in use worldwide by 2020)
- 2018 Winner in Artificial Intelligence and Machine Learning – Globe Awards
- 2018 Ten Edge Computing Vendors to Watch – ZDNet & 451 Research
- 2018 The 10 Most Innovative AI Solution Providers – Insights Success
- 2018 The AI 100 – CB Insights
- 2017 Cool Vendor in IoT Edge Computing – Gartner
- 2017 20 Most Promising AI Service Providers – CIO Review
Our Series A round was for $15 million. Our Series B round was for $30 million in October 2017. Investors include: Saudi Aramco Energy Ventures, Intel Capital, GE, Dell, Bosch, Honeywell and The Hive.
About the Data Science Solutions team
In 2018, our Data Science Solutions team grew from 4 to 9. We are growing again from 11. We work on revenue generating projects for clients, such as predictive maintenance, time to failure, manufacturing defects. About half of our projects have been related to vision recognition or deep learning. We are not only working on consulting projects but developing vertical solution applications that run on our Lightning platform, with embedded data mining.
Our data scientists like our team because:
- We care about “best practices”
- Have a direct impact on the company’s revenue
- Give or receive mentoring as part of the collaborative process
- Questioning and challenging the status quo with data is safe
- Intellectual curiosity balanced with humility
- Present papers or projects in our “Thought Leadership” meeting series, to support continuous learning