Responsibilities
- Own the design, development, testing, deployment, and craftsmanship of the team’s infrastructure and systems capable of handling massive amounts of requests with high reliability and scalability
- Leverage deep and broad technical expertise to mentor engineers and provide leadership on resolving complex technology issues
- Bring the entrepreneurial, out-of-the-box thinking essential for a technology startup
- Guide the team in writing unit tests for robustness, covering edge cases, usability, and general reliability
Requirements
- In-depth understanding of image processing algorithms, pattern recognition methods, and rule-based classifiers
- Experience in feature extraction, object recognition and tracking, image registration, noise reduction, image calibration, and correction
- Ability to understand, optimize and debug imaging algorithms
- Understanding of and experience with the OpenCV library
- Fundamental understanding of mathematical techniques involved in ML and DL schemas (Instance-based methods, Boosting methods, PGM, Neural Networks etc.)
- Thorough understanding of state-of-the-art DL concepts (Sequence modeling, Attention, Convolution etc.) along with a knack for devising new schemas that work for the given data
- Understanding of engineering principles and a clear understanding of data structures and algorithms
- Experience in writing production-level code in either C++ or Java
- Experience with Python libraries such as pandas, NumPy, and SciPy
- Experience with TensorFlow and scikit
Job Description
We are looking for an experienced engineer to join our data science team and help us design, develop, and deploy machine learning models in production. You will develop robust models and prepare their deployment into production in a controlled manner, while providing appropriate means to monitor their performance and stability after deployment.
What you'll do includes (but is not limited to):
- Preparing datasets needed to train and validate our machine learning models
- Anticipating and building solutions for problems that interrupt availability, performance, and stability in our systems, services, and products at scale
- Defining and implementing metrics to evaluate the performance of the models, both for computing performance (such as CPU & memory usage) and for ML performance (such as precision, recall, and F1)
- Supporting the deployment of machine learning models on our infrastructure, including containerization, instrumentation, and versioning
- Supporting the whole lifecycle of our machine learning models, including gathering data for retraining, A/B testing, and redeployments
- Developing, testing, and evaluating tools for deploying, monitoring, and retraining machine learning models
- Working closely within a distributed team to analyze and apply innovative solutions over billions of documents
- Supporting solutions ranging from rule bases and classical ML techniques to the latest deep learning systems
- Partnering with cross-functional team members to bring large scale data engineering solutions to production
- Communicating your approach and results to a wider audience through presentations
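As an illustration of the ML performance metrics named in the list above (precision, recall, and F1), here is a minimal sketch for binary labels. The function name `precision_recall_f1` and the sample labels are hypothetical and not taken from this posting; a production setup would more likely rely on an established library.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

With these sample labels there are three true positives, one false positive, and one false negative, so all three metrics come out to 0.75.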
Your Qualifications:
- Demonstrated success with machine learning in a SaaS or Cloud environment, with hands-on knowledge of model creation and deployments in production at scale
- Good knowledge of traditional machine learning methods and neural networks
- Experience with practical machine learning modeling, especially on time-series forecasting, analysis, and causal inference.
- Experience with data mining algorithms and statistical modeling techniques for anomaly detection in time series such as clustering, classification, ARIMA, and decision trees is preferred.
- Ability to implement data import, cleansing and transformation functions at scale
- Fluency in Docker, Kubernetes
- Working knowledge of relational and dimensional data models with appropriate visualization techniques such as PCA.
- Solid English skills to effectively communicate with other team members
Due to the nature of the role, it would be nice if you also have:
- Experience with large datasets and distributed computing, especially with the Google Cloud Platform
- Fluency in at least one deep learning framework: PyTorch, TensorFlow / Keras
- Experience with NoSQL and Graph databases
- Experience working in a Colab, Jupyter, or Python notebook environment
- Some experience with monitoring, analysis, and alerting tools like New Relic, Prometheus, and the ELK stack
- Knowledge of the Java, Scala, or Go programming languages
- Familiarity with KubeFlow
- Experience with transformers, for example the Hugging Face libraries
- Experience with OpenCV
About Egnyte
In a content-critical age, Egnyte fuels business growth by enabling content-rich business processes, while also providing organizations with visibility and control over their content assets. Egnyte’s cloud-native content services platform leverages the industry’s leading content intelligence engine to deliver a simple, secure, and vendor-neutral foundation for managing enterprise content across business applications and storage repositories. More than 16,000 customers trust Egnyte to enhance employee productivity, automate data management, and reduce file-sharing cost and complexity. Investors include Google Ventures, Kleiner Perkins Caufield & Byers, and Goldman Sachs. For more information, visit www.egnyte.com
Concepts of RDBMS, Normalization techniques
Entity Relationship diagram/ ER-Model
Transaction, commit, rollback, ACID properties
Transaction log
Difference in behavior of the column if it is nullable
SQL Statements
Join Operations
DDL, DML, Data Modelling
Optimal query writing with aggregate functions, GROUP BY, HAVING, ORDER BY, etc.; should be hands-on with scenario-based query writing
Query optimizing technique, Indexing in depth
Understanding query plan
Batching
Locking schemes
Isolation levels
Concept of stored procedure, Cursor, trigger, View
Beginner-level PL/SQL: procedure and function writing skills
Spring JPA and Spring Data basics
Hibernate mappings
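To illustrate the "Transaction, commit, rollback, ACID properties" topic from the list above, here is a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the amounts are hypothetical and chosen only for illustration; they are not part of the role's actual stack.

```python
import sqlite3

# In-memory database with a CHECK constraint that forbids negative balances.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)]
)
conn.commit()


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> bool:
    """Move `amount` from src to dst atomically; return True if committed."""
    try:
        # As a context manager, the connection commits when the block succeeds
        # and rolls back when an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the earlier credit was undone too.
        return False


ok = transfer(conn, 1, 2, 30)       # both updates are committed together
failed = transfer(conn, 1, 2, 500)  # debit fails, so the whole transfer is rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer shows atomicity: the credit to account 2 runs first, but because the debit then fails, the rollback leaves both balances exactly as they were after the first transfer.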
UNIX
Basic Concepts on Unix
Commonly used Unix Commands with their options
Combining Unix commands using pipes, filters, etc.
Vi Editor & its different modes
Basic scripting and basic knowledge of how to execute JAR files from the host
Files and directory permissions
Application based scenarios.
Job Description
We are looking for a highly capable machine learning engineer to optimize our deep learning systems. You will be evaluating existing deep learning (DL) processes, tuning hyperparameters, performing statistical analysis (logging and evaluating models' performance) to resolve data set problems, and enhancing the accuracy of our AI software's predictive automation capabilities.
You will be working with technologies like AWS SageMaker, TensorFlow.js, and TensorFlow/Keras/TensorBoard to create deep learning backends that power our application.
To ensure success as a machine learning engineer, you should demonstrate solid data science knowledge and experience in a deep learning role. A first-class machine learning engineer will be someone whose expertise translates into the enhanced performance of predictive automation software. To do this job successfully, you need exceptional skills in DL and programming.
Responsibilities
- Consulting with managers to determine and refine machine learning objectives.
- Designing deep learning systems and self-running artificial intelligence (AI) software to automate predictive models.
- Transforming data science prototypes and applying appropriate ML algorithms and tools.
- Carry out data engineering subtasks such as defining data requirements, collecting, labeling, inspecting, cleaning, augmenting, and moving data.
- Carry out modeling subtasks such as training deep learning models, defining evaluation metrics, searching hyperparameters, and reading research papers.
- Carry out deployment subtasks such as converting prototyped code into production code, working in-depth with AWS services to set up cloud environment for training, improving response times and saving bandwidth.
- Ensuring that algorithms generate robust and accurate results.
- Running tests, performing analysis, and interpreting test results.
- Documenting machine learning processes.
- Keeping abreast of developments in machine learning.
Requirements
- Proven experience as a Machine Learning Engineer or similar role.
- In-depth knowledge of AWS SageMaker and related services (like S3).
- Extensive knowledge of ML frameworks, libraries, algorithms, data structures, data modeling, software architecture, and math & statistics.
- Ability to write robust code in Python & JavaScript (TensorFlow.js).
- Experience with Git and GitHub.
- Superb analytical and problem-solving abilities.
- Excellent troubleshooting skills.
- Good project management skills.
- Great communication and collaboration skills.
- Excellent time management and organizational abilities.
- Bachelor's degree in computer science, data science, mathematics, or a related field; Master’s degree is a plus.
About LodgIQ
LodgIQ is led by a team of experienced hospitality technology experts, data scientists and product domain experts. Seed funded by Highgate Ventures, a venture capital platform focused on early-stage technology investments in the hospitality industry, and Trilantic Capital Partners, a global private equity firm, LodgIQ has made a significant investment in advanced machine learning platforms and data science.
Title: Data Scientist
Job Description:
- Apply Data Science and Machine Learning to a REAL-LIFE problem - “Predict Guest Arrivals and Determine Best Prices for Hotels”
- Apply advanced analytics in a BIG Data environment: AWS, MongoDB, scikit-learn
- Help scale up the product in a global offering across 100+ global markets
Qualifications:
- Minimum 3 years of experience with advanced data analytic techniques, including data mining, machine learning, statistical analysis, and optimization. Student projects are acceptable.
- At least 1 year of experience with Python / NumPy / pandas / SciPy / Matplotlib / scikit-learn
- Experience in working with massive data sets, including structured and unstructured with at least 1 prior engagement involving data gathering, data cleaning, data mining, and data visualization
- Solid grasp of optimization techniques
- Master's or PhD degree in Business Analytics, Data Science, Statistics, or Mathematics
- Ability to show a track record of solving large, complex problems
Data Warehouse and Analytics solutions that aggregate data across diverse sources and data types including text, video and audio through to live stream and IoT in an agile project delivery environment with a focus on DataOps and Data Observability. You will work with Azure SQL Databases, Synapse Analytics, Azure Data Factory, Azure Data Lake Gen2, Azure Databricks, Azure Machine Learning, Azure Service Bus, Azure Serverless (LogicApps, FunctionApps), Azure Data Catalogue and Purview among other tools, gaining opportunities to learn some of the most advanced and innovative techniques in the cloud data space.
You will be building Power BI based analytics solutions to provide actionable insights into customer data, and to measure operational efficiencies and other key business performance metrics.
You will be involved in the development, build, deployment, and testing of customer solutions, with responsibility for the design, implementation and documentation of the technical aspects, including integration to ensure the solution meets customer requirements. You will be working closely with fellow architects, engineers, analysts, team leads and project managers to plan, build and roll out data-driven solutions.
Expertise:
Proven expertise in developing data solutions with Azure SQL Server and Azure SQL Data Warehouse (now Synapse Analytics)
Demonstrated expertise in data modelling and data warehouse methodologies and best practices.
Ability to write efficient data pipelines for ETL using Azure Data Factory or equivalent tools.
Integration of data feeds utilising both structured (e.g. XML/JSON) and flat schemas (e.g. CSV, TXT, XLSX) across a wide range of electronic delivery mechanisms (API, SFTP, etc.)
Azure DevOps knowledge essential for CI/CD of data ingestion pipelines and integrations.
Experience with object-oriented/object function scripting languages such as Python, Java, JavaScript, C#, Scala, etc. is required.
Expertise in creating technical and architecture documentation (e.g. HLD/LLD) is a must.
Proven ability to rapidly analyse and design solution architecture in client proposals is an added advantage.
Expertise with big data tools: Hadoop, Spark, Kafka, NoSQL databases, stream-processing systems is a plus.
Essential Experience:
5 or more years of hands-on experience in a data architect role with the development of ingestion, integration, data auditing, reporting, and testing with Azure SQL tech stack.
Full data and analytics project lifecycle experience (including costing and cost management of data solutions) in Azure PaaS environment is essential.
Microsoft Azure and Data Certifications, at least fundamentals, are a must.
Experience using agile development methodologies, version control systems and repositories is a must.
A good, applied understanding of the end-to-end data process development life cycle.
A good working knowledge of data warehouse methodology using Azure SQL.
A good working knowledge of the Azure platform, its components, and the ability to leverage its resources to implement solutions is a must.
Experience working in the Public sector or in an organisation servicing Public sector is a must.
Ability to work to demanding deadlines, keep momentum and deal with conflicting priorities in an environment undergoing a programme of transformational change.
The ability to contribute and adhere to standards, have excellent attention to detail and be strongly driven by quality.
Desirables:
Experience with AWS or Google Cloud Platform will be an added advantage.
Experience with Azure ML services will be an added advantage.
Personal Attributes:
Articulate and clear in communications to mixed audiences: in writing, through presentations, and one-to-one.
Ability to present highly technical concepts and ideas in a business-friendly language.
Ability to effectively prioritise and execute tasks in a high-pressure environment.
Calm and adaptable in the face of ambiguity and in a fast-paced, quick-changing environment
Extensive experience working in a team-oriented, collaborative environment as well as working independently.
Comfortable with the multi-project, multi-tasking lifestyle of a consulting Data Architect
Excellent interpersonal skills with teams and building trust with clients
Ability to support and work with cross-functional teams in a dynamic environment.
A passion for achieving business transformation; the ability to energise and excite those you work with
Initiative; the ability to work flexibly in a team, working comfortably without direct supervision.
Hi, we are looking for a Data Science / AI Professional for our KPHB, Hyderabad branch.
Requirements:
• Minimum 6 months to 1 year of experience in Data Science and AI
• Must have proven experience, with a good GitHub profile and projects
• Must be good with Data Science and AI concepts
• Must be good with Python, ML, statistics, deep learning, NLP, OpenCV, etc.
• Good Communication and presentation skills
We are looking for an outstanding Big Data Engineer with experience setting up and maintaining data warehouses and data lakes for an organization. This role would closely collaborate with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.
Roles and Responsibilities:
- Develop and maintain scalable data pipelines and build out new integrations and processes required for optimal extraction, transformation, and loading of data from a wide variety of data sources using 'Big Data' technologies.
- Develop programs in Scala and Python as part of data cleaning and processing.
- Assemble large, complex data sets that meet functional / non-functional business requirements and foster data-driven decision making across the organization.
- Design and develop distributed, high-volume, high-velocity, multi-threaded event processing systems.
- Implement processes and systems to validate data and monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes that depend on it.
- Perform root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Maintain operational excellence, guaranteeing high availability and platform stability.
- Closely collaborate with the Data Science team and help the team build and deploy machine learning and deep learning models on big data analytics platforms.
Skills:
- Experience with Big Data pipeline, Big Data analytics, Data warehousing.
- Experience with SQL/No-SQL, schema design and dimensional data modeling.
- Strong understanding of Hadoop architecture and the HDFS ecosystem, and experience with Big Data technology stack such as HBase, Hadoop, Hive, MapReduce.
- Experience in designing systems that process structured as well as unstructured data at large scale.
- Experience in AWS/Spark/Java/Scala/Python development.
- Strong skills in PySpark (Python and Spark). Ability to create, manage, and manipulate Spark DataFrames. Expertise in Spark query tuning and performance optimization.
- Experience in developing efficient software code/frameworks for multiple use cases leveraging Python and big data technologies.
- Prior exposure to streaming data sources such as Kafka.
- Knowledge of shell scripting and Python scripting.
- High proficiency in database skills (e.g., Complex SQL), for data preparation, cleaning, and data wrangling/munging, with the ability to write advanced queries and create stored procedures.
- Experience with NoSQL databases such as Cassandra / MongoDB.
- Solid experience in all phases of Software Development Lifecycle - plan, design, develop, test, release, maintain and support, decommission.
- Experience with DevOps tools (GitHub, Travis CI, and JIRA) and methodologies (Lean, Agile, Scrum, Test Driven Development).
- Experience building and deploying applications on on-premise and cloud-based infrastructure.
- Good understanding of the machine learning landscape and concepts.
Qualifications and Experience:
Engineering graduates and postgraduates, preferably in Computer Science, from premier institutions, with 3-5 years of proven work experience as a Big Data Engineer or in a similar role.
Certifications:
Good to have at least one of the certifications listed here:
AZ-900 - Azure Fundamentals
DP-200, DP-201, DP-203, AZ-204 - Data Engineering
AZ-400 - DevOps Certification